Comparing fcab61bdcd...304ed02f74 - neon

rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-03-10 03:40:37 +00:00

Author	SHA1	Message	Date
Conrad Ludgate	304ed02f74	Proxy release 2025-07-03 10:07 UTC	2025-07-03 11:07:21 +01:00
Conrad Ludgate	1a7c2450f5	fix redis credentials check (#12455 ) ## Problem `keep_connection` does not exit, so it was never setting `credentials_refreshed`. ## Summary of changes Set `credentials_refreshed` to true when we first establish a connection, and after we re-authenticate the connection.	2025-07-03 11:07:21 +01:00
Conrad Ludgate	d08748d4e4	Proxy release 2025-07-02 12:38 UTC	2025-07-02 13:38:03 +01:00
Ivan Efremov	fcf8127900	[proxy]: Fix redis IRSA expiration failure errors (#12430 ) Relates to the [#30688](https://github.com/neondatabase/cloud/issues/30688)	2025-07-02 13:38:03 +01:00
github-actions[bot]	c82d646598	Proxy release 2025-07-01 06:01 UTC	2025-07-01 06:01:46 +00:00
Suhas Thalanki	5f3532970e	[compute] fix: background worker that collects installed extension metrics now updates collection interval (#12277 ) ## Problem Previously, the background worker that collects the list of installed extensions across DBs had a timeout set to 1 hour. This caused a problem with computes that had a `suspend_timeout` > 1 hour, as this collection was treated as activity, preventing compute shutdown. Issue: https://github.com/neondatabase/cloud/issues/30147 ## Summary of changes Pass the `suspend_timeout` as part of the `ComputeSpec` so that the background worker takes any updates to it into account and updates its collection interval.	2025-06-30 22:12:37 +00:00
Arpad Müller	2e681e0ef8	detach_ancestor: delete the right layer when hardlink fails (#12397 ) If a hardlink operation inside `detach_ancestor` fails due to the layer already existing, we delete the layer to make sure the source is one we know about, and then retry. But we deleted the wrong file, namely, the one we wanted to use as the source of the hardlink. As a result, the follow up hard link operation failed. Our PR corrects this mistake.	2025-06-30 21:36:15 +00:00
Dmitrii Kovalkov	8e216a3a59	storcon: notify cplane on safekeeper membership change (#12390 ) ## Problem We don't notify cplane about safekeeper membership change yet. Without the notification the compute needs to know all the safekeepers on the cluster to be able to speak to them. Change notifications will allow us to avoid this. - Closes: https://github.com/neondatabase/neon/issues/12188 ## Summary of changes - Implement `notify_safekeepers` method in `ComputeHook` - Notify cplane about safekeepers in `safekeeper_migrate` handler. - Update the test to make sure notifications work. ## Out of scope - There is `cplane_notified_generation` field in `timelines` table in storcon's database. It's not needed now, so it's not updated in the PR. Probably we can remove it. - e2e tests to make sure it works with a production cplane	2025-06-30 14:09:50 +00:00
Erik Grinaker	d0a4ae3e8f	pageserver: add gRPC LSN lease support (#12384 ) ## Problem The gRPC API does not provide LSN leases. ## Summary of changes * Add LSN lease support to the gRPC API. * Use gRPC LSN leases for static computes with `grpc://` connstrings. * Move `PageserverProtocol` into the `compute_api::spec` module and reuse it.	2025-06-30 12:44:17 +00:00
Erik Grinaker	a384d7d501	pageserver: assert no changes to shard identity (#12379 ) ## Problem Location config changes can currently result in changes to the shard identity. Such changes will cause data corruption, as seen with #12217. Resolves #12227. Requires #12377. ## Summary of changes Assert that the shard identity does not change on location config updates and on (re)attach. This is currently asserted with `critical!`, in case it misfires in production. Later, we should reject such requests with an error and turn this into a proper assertion.	2025-06-30 12:36:45 +00:00
Christian Schwarz	66f53d9d34	refactor(pageserver): force explicit mapping to `CreateImageLayersError::Other` (#12382 ) Implicit mapping to an `anyhow::Error` when we do `?` is discouraged because tooling to find those places isn't great. As a drive-by, also make `SplitImageLayerWriter::new` infallible and sync. I think we should also make `ImageLayerWriter::new` completely lazy, then `BatchLayerWriter::new` infallible and async.	2025-06-30 11:03:48 +00:00
Busra Kugler	2af9380962	Revert "Replace step-security maintained actions" (#12386 ) Reverts neondatabase/neon#11663 and https://github.com/neondatabase/neon/pull/11265/ Step Security is not yet approved by Databricks team, in order to prevent issues during Github org migration, I'll revert this PR to use the previous action instead of Step Security maintained action.	2025-06-30 10:15:10 +00:00
Ivan Efremov	620d50432c	Fix path issue in the proxy-bench CI workflow (#12388 )	2025-06-30 09:33:57 +00:00
Erik Grinaker	1d43f3bee8	pageserver: fix stripe size persistence in legacy HTTP handlers (#12377 ) ## Problem Similarly to #12217, the following endpoints may result in a stripe size mismatch between the storage controller and Pageserver if an unsharded tenant has a different stripe size set than the default. This can lead to data corruption if the tenant is later manually split without specifying an explicit stripe size, since the storage controller and Pageserver will apply different defaults. This commonly happens with tenants that were created before the default stripe size was changed from 32k to 2k. * `PUT /v1/tenant/config` * `PATCH /v1/tenant/config` These endpoints are no longer in regular production use (they were used when cplane still managed Pageserver directly), but can still be called manually or by tests. ## Summary of changes Retain the current shard parameters when updating the location config in `PUT \| PATCH /v1/tenant/config`. Also opportunistically derive `Copy` for `ShardParameters`.	2025-06-30 09:08:44 +00:00
Dmitrii Kovalkov	c746678bbc	storcon: implement safekeeper_migrate handler (#11849 ) This PR implements a safekeeper migration algorithm from RFC-035 https://github.com/neondatabase/neon/blob/main/docs/rfcs/035-safekeeper-dynamic-membership-change.md#change-algorithm - Closes: https://github.com/neondatabase/neon/issues/11823 It is not production-ready yet, but I think it's good enough to commit and start testing. There are some known issues which will be addressed in later PRs: - https://github.com/neondatabase/neon/issues/12186 - https://github.com/neondatabase/neon/issues/12187 - https://github.com/neondatabase/neon/issues/12188 - https://github.com/neondatabase/neon/issues/12189 - https://github.com/neondatabase/neon/issues/12190 - https://github.com/neondatabase/neon/issues/12191 - https://github.com/neondatabase/neon/issues/12192 ## Summary of changes - Implement `tenant_timeline_safekeeper_migrate` handler to drive the migration - Add possibility to specify number of safekeepers per timeline in tests (`timeline_safekeeper_count`) - Add `term` and `flush_lsn` to `TimelineMembershipSwitchResponse` - Implement compare-and-swap (CAS) operation over timeline in DB for updating membership configuration safely. - Write simple test to verify that migration code works	2025-06-30 08:30:05 +00:00
Aleksandr Sarantsev	9bb4688c54	storcon: Remove testing feature from kick_secondary_downloads (#12383 ) ## Problem Some of the design decisions in PR #12256 were influenced by the requirements of consistency tests. These decisions introduced intermediate logic that is no longer needed and should be cleaned up. ## Summary of Changes - Remove the `feature("testing")` flag related to `kick_secondary_download`. - Set the default value of `kick_secondary_download` back to false, reflecting the intended production behavior. Co-authored-by: Aleksandr Sarantsev <aleksandr.sarantsev@databricks.com>	2025-06-30 05:41:05 +00:00
Dmitrii Kovalkov	47553dbaf9	neon_local: set timeline_safekeeper_count if we have less than 3 safekeepers (#12378 ) ## Problem - Closes: https://github.com/neondatabase/neon/issues/12298 ## Summary of changes - Set `timeline_safekeeper_count` in `neon_local` if we have less than 3 safekeepers - Remove `cfg!(feature = "testing")` code from `safekeepers_for_new_timeline` - Change `timeline_safekeeper_count` type to `usize`	2025-06-28 12:59:29 +00:00
Erik Grinaker	e50b914a8e	compute_tools: support gRPC base backups in `compute_ctl` (#12244 ) ## Problem `compute_ctl` should support gRPC base backups. Requires #12111. Requires #12243. Touches #11926. ## Summary of changes Support `grpc://` connstrings for `compute_ctl` base backups.	2025-06-27 16:39:00 +00:00
Christian Schwarz	e33e109403	fix(pageserver): buffered writer cancellation error handling (#12376 ) ## Problem The problem has been well described in already-committed PR #11853. tl;dr: BufferedWriter is sensitive to cancellation, which the previous approach was not. The write path was most affected (ingest & compaction), which was mostly fixed in #11853: it introduced `PutError` and mapped instances of `PutError` that were due to cancellation of underlying buffered writer into `CreateImageLayersError::Cancelled`. However, there is a long tail of remaining errors that weren't caught by #11853 that result in `CompactionError::Other`s, which we log with great noise. ## Solution The stack trace logging for `CompactionError::Other` added in #11853 allows us to chop away at that long tail using the following pattern: - look at the stack trace - from leaf up, identify the place where we incorrectly map from the distinguished variant X indicating cancellation to an `anyhow::Error` - follow that anyhow further up, ensuring it stays the same anyhow all the way up in the `CompactionError::Other` - since it stayed one anyhow chain all the way up, `root_cause()` will yield us X - so, in `log_compaction_error`, add an additional `downcast_ref` check for X This PR specifically adds checks for - the flush task cancelling (FlushTaskError, BlobWriterError) - opening of the layer writer (GateError) That should cover all the reports in issues - https://github.com/neondatabase/cloud/issues/29434 - https://github.com/neondatabase/neon/issues/12162 ## Refs - follow-up to #11853 - fixup of / fixes https://github.com/neondatabase/neon/issues/11762 - fixes https://github.com/neondatabase/neon/issues/12162 - refs https://github.com/neondatabase/cloud/issues/29434	2025-06-27 15:26:00 +00:00
Folke Behrens	0ee15002fc	proxy: Move client connection accept and handshake to pglb (#12380 ) * This must be a no-op. * Move proxy::task_main to pglb::task_main. * Move client accept, TLS and handshake to pglb. * Keep auth and wake in proxy.	2025-06-27 15:20:23 +00:00
Arpad Müller	4c7956fa56	Fix hang deleting offloaded timelines (#12366 ) We don't have cancellation support for timeline deletions. In other words, timeline deletion might still go on in an older generation while we are attaching it in a newer generation already, because the cancellation simply hasn't reached the deletion code. This has caused us to hit a situation with offloaded timelines in which the timeline was in an unrecoverable state: always returning an accepted response, but never a 404 like it should be. The detailed description can be found in [here](https://github.com/neondatabase/cloud/issues/30406#issuecomment-3008667859) (private repo link). TLDR: 1. we ask to delete timeline on old pageserver/generation, starts process in background 2. the storcon migrates the tenant to a different pageserver. - during attach, the pageserver still finds an index part, so it adds it to `offloaded_timelines` 4. the timeline deletion finishes, removing the index part in S3 5. there is a retry of the timeline deletion endpoint, sent to the new pageserver location. it is bound to fail however: - as the index part is gone, we print `Timeline already deleted in remote storage`. - the problem is that we then return an accepted response code, and not a 404. - this confuses the code calling us. it thinks the timeline is not deleted, so keeps retrying. - this state never gets recovered from until a reset/detach, because of the `offloaded_timelines` entry staying there. This is where this PR fixes things: if no index part can be found, we can safely assume that the timeline is gone in S3 (it's the last thing to be deleted), so we can remove it from `offloaded_timelines` and trigger a reupload of the manifest. Subsequent retries will pick that up. Why not improve the cancellation support? It is a more disruptive code change, that might have its own risks. So we don't do it for now. Fixes https://github.com/neondatabase/cloud/issues/30406	2025-06-27 15:14:55 +00:00
Heikki Linnakangas	5a82182c48	impr(ci): Refactor postgres Makefile targets to a separate makefile (#12363 ) Mainly for general readability. Some notable changes: - Postgres can be built without the rest of the repository, and in particular without any of the Rust bits. Some CI scripts took advantage of that, so let's make that more explicit by separating those parts. Also add an explicit comment about that in the new postgres.mk file. - Add a new PG_INSTALL_CACHED variable. If it's set, `make all` and other top-Makefile targets skip checking if Postgres is up-to-date. This is also to be used in CI scripts that build and cache Postgres as separate steps. (It is currently only used in the macos walproposer-lib rule, but stay tuned for more.) - Introduce a POSTGRES_VERSIONS variable that lists all supported PostgreSQL versions. Refactor a few Makefile rules to use that.	2025-06-27 14:49:52 +00:00
Arpad Müller	37e181af8a	Update rust to 1.88.0 (#12364 ) We keep the practice of keeping the compiler up to date, pointing to the latest release. This is done by many other projects in the Rust ecosystem as well. [Announcement blog post](https://blog.rust-lang.org/2025/06/26/Rust-1.88.0/) Prior update was in https://github.com/neondatabase/neon/pull/11938	2025-06-27 13:51:59 +00:00
Peter Bendel	6f4198c78a	treat strategy flag test_maintenance as boolean data type (#12373 ) ## Problem In large oltp test run https://github.com/neondatabase/neon/actions/runs/15905488707/job/44859116742 we see that the `Benchmark database maintenance` step is skipped in all 3 strategy variants; however, it should be executed in two. This is because the condition of the `Benchmark database maintenance` step compares the boolean `test_maintenance` strategy flag as a string. ## Summary of changes Use a boolean condition instead of a string comparison ## Test run from this pull request branch https://github.com/neondatabase/neon/actions/runs/15923605412	2025-06-27 13:49:26 +00:00
Vlad Lazar	cc1664ef93	pageserver: allow flush task cancelled error in sharding autosplit test (#12374 ) ## Problem Test is failing due to compaction shutdown noise (see https://github.com/neondatabase/neon/issues/12162). ## Summary of changes Allow list the noise.	2025-06-27 13:13:11 +00:00
Vlad Lazar	ebb6e26a64	pageserver: handle multiple attached children in shard resolution (#12336 ) ## Problem When resolving a shard during a split we might have multiple attached shards with the old shard count (i.e. not all of them are marked in progress and ignored). Hence, we can compute the desired shard number based on the old shard count and misroute the request. ## Summary of Changes Recompute the desired shard every time the shard count changes during the iteration	2025-06-27 12:46:18 +00:00
Mikhail	ebc12a388c	fix: endpoint_storage_addr as String (#12359 ) It's not a SocketAddr as we use k8s DNS https://github.com/neondatabase/cloud/issues/19011	2025-06-27 11:06:27 +00:00
Conrad Ludgate	abc1efd5a6	[proxy] fix connect_to_compute retry handling (#12351 ) # Problem In #12335 I moved the `authenticate` method outside of the `connect_to_compute` loop. This triggered [e2e tests to become flaky](https://github.com/neondatabase/cloud/pull/30533). This highlighted an edge case we forgot to consider with that change. When we connect to compute, the compute IP might be cached. This cache hit might however be stale. Because we can't validate the IP is associated with a specific compute-id☨, we will succeed the connect_to_compute operation and fail when it comes to password authentication☨☨. Before the change, we were invalidating the cache and triggering wake_compute if the authentication failed. Additionally, I noticed some faulty logic I introduced 1 year ago https://github.com/neondatabase/neon/pull/8141/files#diff-5491e3afe62d8c5c77178149c665603b29d88d3ec2e47fc1b3bb119a0a970afaL145-R147 ☨ We can when we roll out TLS, as the certificate common name includes the compute-id. ☨☨ Technically password authentication could pass for the wrong compute, but I think this would only happen in the very very rare event that the IP got reused and the compute's endpoint happened to be a branch/replica. # Solution 1. Fix the broken logic 2. Simplify cache invalidation (I don't know why it was so convoluted) 3. Add a loop around connect_to_compute + authenticate to re-introduce the wake_compute invalidation we accidentally removed. I went with this approach to try and avoid interfering with https://github.com/neondatabase/neon/compare/main...cloneable/proxy-pglb-connect-compute-split. The changes made in commit 3 will move into `handle_client_request` I suspect,	2025-06-27 10:36:27 +00:00
Dmitrii Kovalkov	6fa1562b57	pageserver: increase default max_size_entries limit for basebackup cache (#12343 ) ## Problem Some pageservers hit `max_size_entries` limit in staging with only ~25 MiB storage used by basebackup cache. The limit is too strict. It should be safe to relax it. - Part of https://github.com/neondatabase/cloud/issues/29353 ## Summary of changes - Increase the default `max_size_entries` from 1000 to 10000	2025-06-27 09:18:18 +00:00
Heikki Linnakangas	10afac87e7	impr(ci): Remove unnecessary 'make postgres-headers' build step (#12354 ) The 'make postgres' step includes installation of the headers, no need to do that separately.	2025-06-26 16:45:34 +00:00
Vlad Lazar	72b3c9cd11	pageserver: fix wal receiver hang on remote client shutdown (#12348 ) ## Problem During shard splits we shut down the remote client early and allow the parent shard to keep ingesting data. While ingesting data, the wal receiver task may wait for the current flush to complete in order to apply backpressure. Notifications are delivered via `Timeline::layer_flush_done_tx`. When the remote client was being shut down the flush loop exited without delivering a notification. This left `Timeline::wait_flush_completion` hanging indefinitely which blocked the shutdown of the wal receiver task, and, hence, the shard split. ## Summary of Changes Deliver a final notification when the flush loop is shutting down without the timeline cancel cancellation token having fired. I tried writing a test for this, but got stuck in failpoint hell and decided it's not worth it. `test_sharding_autosplit`, which reproduces this reliably in CI, passed with the proposed fix in https://github.com/neondatabase/neon/pull/12304. Closes https://github.com/neondatabase/neon/issues/12060	2025-06-26 16:35:34 +00:00
Arpad Müller	232f2447d4	Support pull_timeline of timelines without writes (#12028 ) Make the safekeeper `pull_timeline` endpoint support timelines that haven't had any writes yet. In the storcon managed sk timelines world, if a safekeeper goes down temporarily, the storcon will schedule a `pull_timeline` call. There is no guarantee however that by when the safekeeper is online again, there have been writes to the timeline yet. The `snapshot` endpoint gives an error if the timeline hasn't had writes, so we avoid calling it if `timeline_start_lsn` indicates a freshly created timeline. Fixes #11422 Part of #11670	2025-06-26 16:29:03 +00:00
Erik Grinaker	a2d2108e6a	pageserver: use base backup cache with gRPC (#12352 ) ## Problem gRPC base backups do not use the base backup cache. Touches https://github.com/neondatabase/neon/issues/11728. ## Summary of changes Integrate gRPC base backups with the base backup cache. Also fixes a bug where the base backup cache did not differentiate between primary/replica base backups (at least I think that's a bug?).	2025-06-26 15:52:15 +00:00
Alex Chi Z.	33c0d5e2f4	fix(pageserver): make posthog config parsing more robust (#12356 ) ## Problem In our infra config, we have to split server_api_key and other fields in two files: the former one in the sops file, and the latter one in the normal config. This creates a situation where we might misconfigure some regions so that only part of the fields are available, causing storcon/pageserver to refuse to start. ## Summary of changes Allow PostHog config to have part of the fields available. Parse it later. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-06-26 15:49:08 +00:00
Dmitrii Kovalkov	605fb04f89	pageserver: use bounded sender for basebackup cache (#12342 ) ## Problem Basebackup cache now uses unbounded channel for prepare requests. In theory it can grow large if the cache is hung and does not process the requests. - Part of https://github.com/neondatabase/cloud/issues/29353 ## Summary of changes - Replace an unbounded channel with a bounded one, the size is configurable. - Add `pageserver_basebackup_cache_prepare_queue_size` to observe the size of the queue. - Refactor a bit to move all metrics logic to `basebackup_cache.rs`	2025-06-26 13:26:24 +00:00
Conrad Ludgate	fd1e8ec257	[proxy] review and cleanup CLI args (#12167 ) I was looking at how we could expose our proxy config as toml again, and as I was writing out the schema format, I noticed some cruft in our CLI args that no longer seem to be in use. The redis change is the most complex, but I am pretty sure it's sound. Since https://github.com/neondatabase/cloud/pull/15613 cplane no longer publishes to the global redis instance.	2025-06-26 11:25:41 +00:00
Konstantin Knizhnik	be23eae3b6	Mark pages as available in LFC only after generation check (#12350 ) ## Problem If LFC generation is changed then `lfc_readv_select` will return -1 but pages are still marked as available in bitmap. ## Summary of changes Update bitmap after generation check. Co-authored-by: Kosntantin Knizhnik <konstantin.knizhnik@databricks.com>	2025-06-26 07:06:27 +00:00
Alex Chi Z.	6f70885e11	fix(pageserver): allow refresh_interval to be empty (#12349 ) ## Problem Fix for https://github.com/neondatabase/neon/pull/12324 ## Summary of changes Need `serde(default)` to allow this field to be absent from the config, otherwise there will be a config deserialization error. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-06-25 22:15:03 +00:00
Erik Grinaker	f755979102	pageserver: payload compression for gRPC base backups (#12346 ) ## Problem gRPC base backups use gRPC compression. However, this has two problems: * Base backup caching will cache compressed base backups (making gRPC compression pointless). * Tonic does not support varying the compression level, and zstd default level is 10% slower than gzip fastest level. Touches https://github.com/neondatabase/neon/issues/11728. Touches https://github.com/neondatabase/cloud/issues/29353. ## Summary of changes This patch adds a gRPC parameter `BaseBackupRequest::compression` specifying the compression algorithm. It also moves compression into `send_basebackup_tarball` to reduce code duplication. A follow-up PR will integrate the base backup cache with gRPC.	2025-06-25 18:16:23 +00:00
Matthias van de Meent	1d49eefbbb	RFC: Endpoint Persistent Unlogged Files Storage (#9661 ) ## Summary A design for a storage system that allows storage of files required to make Neon's Endpoints have a better experience at or after a reboot. ## Motivation Several systems inside PostgreSQL (and Neon) need some persistent storage for optimal workings across reboots and restarts, but still work without. Examples are the cumulative statistics file in `pg_stat/global.stat`, `pg_stat_statements`' `pg_stat/pg_stat_statements.stat`, and `pg_prewarm`'s `autoprewarm.blocks`. We need a storage system that can store and manage these files for each Endpoint. [GH rendered file](https://github.com/neondatabase/neon/blob/MMeent/rfc-unlogged-file/docs/rfcs/040-Endpoint-Persistent-Unlogged-Files-Storage.md) Part of https://github.com/neondatabase/cloud/issues/24225	2025-06-25 16:25:57 +00:00
Alex Chi Z.	6c77638ea1	feat(storcon): retrieve feature flag and pass to pageservers (#12324 ) ## Problem part of https://github.com/neondatabase/neon/issues/11813 ## Summary of changes It costs $$$ to directly retrieve the feature flags from the pageserver. Therefore, this patch adds new APIs to retrieve the spec from the storcon and update it via pageserver. * Storcon retrieves the feature flag and sends it to the pageservers. * If the feature flag gets updated outside of the normal refresh loop of the pageserver, pageserver won't fetch the flags on its own as long as the last updated time <= refresh_period. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-06-25 14:58:18 +00:00
Conrad Ludgate	9b26b7bde9	Proxy release 2025-06-25 14:35 UTC	2025-06-25 15:35:55 +01:00
Conrad Ludgate	93350f7018	[proxy]: BatchQueue::call is not cancel safe - make it directly cancellation aware (#12345 ) ## Problem https://github.com/neondatabase/cloud/issues/30539 If the current leader cancels the `call` function, then it has removed the jobs from the queue, but will never finish sending the responses. Because of this, it is not cancellation safe. ## Summary of changes Document these functions as not cancellation safe. Move cancellation of the queued jobs into the queue itself. ## Alternatives considered 1. We could spawn the task that runs the batch, since that won't get cancelled. * This requires `fn call(self: Arc<Self>)` or `fn call(&'static self)`. 2. We could add another scopeguard and return the requests back to the queue. * This requires that requests are always retry safe, and also requires requests to be `Clone`.	2025-06-25 15:35:55 +01:00
Conrad Ludgate	517a3d0d86	[proxy]: BatchQueue::call is not cancel safe - make it directly cancellation aware (#12345 ) ## Problem https://github.com/neondatabase/cloud/issues/30539 If the current leader cancels the `call` function, then it has removed the jobs from the queue, but will never finish sending the responses. Because of this, it is not cancellation safe. ## Summary of changes Document these functions as not cancellation safe. Move cancellation of the queued jobs into the queue itself. ## Alternatives considered 1. We could spawn the task that runs the batch, since that won't get cancelled. * This requires `fn call(self: Arc<Self>)` or `fn call(&'static self)`. 2. We could add another scopeguard and return the requests back to the queue. * This requires that requests are always retry safe, and also requires requests to be `Clone`.	2025-06-25 14:19:20 +00:00
Conrad Ludgate	27ca1e21be	[console_redirect_proxy]: fix channel binding (#12238 ) ## Problem While working more on TLS to compute, I realised that Console Redirect -> pg-sni-router -> compute would break if channel binding was set to prefer. This is because the channel binding data would differ between Console Redirect -> pg-sni-router vs pg-sni-router -> compute. I also noticed that I actually disabled channel binding in #12145, since `connect_raw` would think that the connection didn't support TLS. ## Summary of changes Make sure we specify the channel binding. Make sure that `connect_raw` can see if we have TLS support.	2025-06-25 13:41:30 +00:00
Arpad Müller	1dc01c9bed	Support cancellations of timelines with hanging ondemand downloads (#12330 ) In `test_layer_download_cancelled_by_config_location`, we simulate hung downloads via the `before-downloading-layer-stream-pausable` failpoint. Then, we cancel a timeline via the `location_config` endpoint. With the new default as of https://github.com/neondatabase/neon/pull/11712, we would be creating the timeline on safekeepers regardless of whether there have been writes or not, and it turns out the test relied on the timeline not existing on safekeepers, due to a cancellation bug: * as established before, the test makes the read path hang * the timeline cancellation function first cancels the walreceiver, and only then cancels the timeline's token * `WalIngest::new` is requesting a checkpoint, which hits the read path * at cancellation time, we'd be hanging inside the read, not seeing the cancellation of the walreceiver * the test would time out due to the hang This is probably also reproducible in the wild when there are S3 unavailabilities or bottlenecks. So we thought that it's worthwhile to fix the hang issue. The approach chosen in the end involves the `tokio::select` macro. In PR 11712, we originally punted on the test due to the hang and opted it out from the new default, but now we can use the new default. Part of https://github.com/neondatabase/neon/issues/12299	2025-06-25 13:40:38 +00:00
Heikki Linnakangas	7c4c36f5ac	Remove unnecessary separate installation of libpq (#12287 ) `make install` compiles and installs libpq. Remove redundant separate step to compile and install it.	2025-06-25 10:47:56 +00:00
Tristan Partin	a2d623696c	Update pgaudit to latest versions (#12328 ) These updates contain some bug fixes and are completely backwards compatible with what we currently support in Neon. Link: https://github.com/pgaudit/pgaudit/compare/1.6.2...1.6.3 Link: https://github.com/pgaudit/pgaudit/compare/1.7.0...1.7.1 Link: https://github.com/pgaudit/pgaudit/compare/16.0...16.1 Link: https://github.com/pgaudit/pgaudit/compare/17.0...17.1 Signed-off-by: Tristan Partin <tristan.partin@databricks.com> Signed-off-by: Tristan Partin <tristan.partin@databricks.com>	2025-06-25 09:03:02 +00:00
Tristan Partin	aa75722010	Set pgaudit.log=none for monitoring connections (#12137 ) pgaudit can spam logs due to all the monitoring that we do. Logs from these connections are not necessary for HIPAA compliance, so we can stop logging from those connections. Part-of: https://github.com/neondatabase/cloud/issues/29574 Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-06-24 17:42:23 +00:00
Matthias van de Meent	6c6de6382a	Use enum-typed PG versions (#12317 ) This makes it possible for the compiler to validate that a match block matched all PostgreSQL versions we support. ## Problem We did not have a complete picture about which places we had to test against PG versions, and what format these versions were: The full PG version ID format (Major/minor/bugfix `MMmmbb`) as transferred in protocol messages, or only the Major release version (`MM`). This meant type confusion was rampant. With this change, it becomes easier to develop new version-dependent features, by making type and niche confusion impossible. ## Summary of changes Every use of `pg_version` is now typed as either `PgVersionId` (u32, valued in decimal `MMmmbb`) or `PgMajorVersion` (an enum, with a value for every major version we support, serialized and stored like a u32 with the value of that major version) --------- Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2025-06-24 17:25:31 +00:00
Dmitry Savelev	158d84ea30	Switch the billing metrics storage format to ndjson. (#12338 ) ## Problem The billing team wants to change the billing events pipeline and use a common events format in S3 buckets across different event producers. ## Summary of changes Change the events storage format for billing events from JSON to NDJSON. Resolves: https://github.com/neondatabase/cloud/issues/29994	2025-06-24 15:36:36 +00:00
Conrad Ludgate	4dd9ca7b04	[proxy]: authenticate to compute after connect_to_compute (#12335 ) ## Problem PGLB will do the connect_to_compute logic, neonkeeper will do the session establishment logic. We should split it. ## Summary of changes Moves postgres authentication to compute to a separate routine that happens after connect_to_compute.	2025-06-24 14:15:36 +00:00
Arpad Müller	552249607d	apply clippy fixes for 1.88.0 beta (#12331 ) The 1.88.0 stable release is near (this Thursday). We'd like to fix most warnings beforehand so that the compiler upgrade doesn't require approval from too many teams. This is therefore a preparation PR (like similar PRs before it). There are a lot of changes for this release, mostly because the `uninlined_format_args` lint has been added to the `style` lint group. One can read more about the lint [here](https://rust-lang.github.io/rust-clippy/master/#/uninlined_format_args). The PR is the result of `cargo +beta clippy --fix` and `cargo fmt`. One remaining warning is left for the proxy team. --------- Co-authored-by: Conrad Ludgate <conrad@neon.tech>	2025-06-24 10:12:42 +00:00
Ivan Efremov	a29772bf6e	Create proxy-bench periodic run in CI (#12242 ) Currently run for test only via pushing to the test-proxy-bench branch. Relates to the #22681	2025-06-24 09:54:43 +00:00
github-actions[bot]	7f4f1785a8	Proxy release 2025-06-24 06:01 UTC	2025-06-24 06:01:22 +00:00
Arpad Müller	0efff1db26	Allow cancellation errors in tests that allow timeline deletion errors (#12315 ) After merging of PR https://github.com/neondatabase/neon/pull/11712 we saw some tests be flaky, with errors showing up about the timeline having been cancelled instead of having been deleted. This is an outcome that is inherently racy with the "has been deleted" error. In some instances, https://github.com/neondatabase/neon/pull/11712 has already added the error about the timeline having been cancelled. This PR adds them to the remaining instances of https://github.com/neondatabase/neon/pull/11712, fixing the flakiness.	2025-06-23 22:26:38 +00:00
Aleksandr Sarantsev	5eecde461d	storcon: Fix migration for Attached(0) tenants (#12256 ) ## Problem `Attached(0)` tenant migrations can get stuck if the heatmap file has not been uploaded. ## Summary of Changes - Added a test to reproduce the issue. - Introduced a `kick_secondary_downloads` config flag: - Enabled in testing environments. - Disabled in production (and in the new test). - Updated `Attached(0)` locations to consider the number of secondaries in their intent when deciding whether to download the heatmap.	2025-06-23 18:55:26 +00:00
Alex Chi Z.	85164422d0	feat(pageserver): support force overriding feature flags (#12233 ) ## Problem Part of #11813 ## Summary of changes Add a test API to make it easier to manipulate the feature flags within tests. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-06-23 17:31:53 +00:00
John Spray	6c3aba7c44	storcon: adjust AZ selection for heterogenous AZs (#12296 ) ## Problem The scheduler uses total shards per AZ to select the AZ for newly created or attached tenants. This makes bad decisions when we have different node counts per AZ -- we might have 2 very busy pageservers in one AZ, and 4 more lightly loaded pageservers in other AZs, and the scheduler picks the busy pageservers because the total shard count in their AZ is lower. ## Summary of changes - Divide the shard count by the number of nodes in the AZ when scoring in `get_az_for_new_tenant` --------- Co-authored-by: John Spray <john.spray@databricks.com>	2025-06-23 15:50:31 +00:00
Erik Grinaker	68a175d545	test_runner: fix `test_basebackup_with_high_slru_count` gzip param (#12319 ) The `--gzip-probability` parameter was removed in #12250. However, `test_basebackup_with_high_slru_count` still uses it, and keeps failing. This patch removes the use of the parameter (gzip is enabled by default).	2025-06-23 15:33:45 +00:00
Alex Chi Z.	5e2c444525	fix(pageserver): reduce default feature flag refresh interval (#12246 ) ## Problem Part of #11813 ## Summary of changes The current interval is 30s and it costs a lot of $$$. This patch reduced it to 600s refresh interval (which means that it takes 10min for feature flags to propagate from UI to the pageserver). In the future we can let storcon retrieve the feature flags and push it to pageservers. We can consider creating a new release or we can postpone this to the week after the next week. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-06-23 13:51:21 +00:00
Heikki Linnakangas	8d711229c1	ci: Fix bogus skipping of 'make all' step in CI (#12318 ) The 'make all' step must run always. PR #12311 accidentally left the condition in there to skip it if there were no changes in postgres v14 sources. That condition belonged to a whole different step that was removed altogether in PR#12311, and the condition should've been removed too. Per CI failure: https://github.com/neondatabase/neon/actions/runs/15820148967/job/44587394469	2025-06-23 13:23:33 +00:00
Vlad Lazar	0e490f3be7	pageserver: allow concurrent rw IO on in-mem layer (#12151 ) ## Problem Previously, we couldn't read from an in-memory layer while a batch was being written to it. Vice-versa, we couldn't write to it while there was an on-going read. ## Summary of Changes The goal of this change is to improve concurrency. Writes happened through a &mut self method so the enforcement was at the type system level. We attempt to improve by: 1. Adding interior mutability to EphemeralLayer. This involves wrapping the buffered writer in a read-write lock. 2. Minimise the time that the read lock is held for. Only hold the read lock while reading from the buffers (recently flushed or pending flush). If we need to read from the file, drop the lock and allow IO to be concurrent. The new benchmark variants with concurrent reads improve between 70 to 200 percent (against main). Benchmark results are in this [commit](`891f094ce6`). ## Future Changes We can push the interior mutability into the buffered writer. The mutable tail goes under a read lock, the flushed part goes into an ArcSwap and then we can read from anything that is flushed _without_ any locking.	2025-06-23 13:17:30 +00:00
Erik Grinaker	7e41ef1bec	pageserver: set gRPC basebackup chunk size to 256 KB (#12314 ) gRPC base backups send a stream of fixed-size 64KB chunks. pagebench basebackup with compression enabled shows this to reduce throughput: * 64 KB: 55 RPS * 128 KB: 69 RPS * 256 KB: 73 RPS * 1024 KB: 73 RPS This patch sets the base backup chunk size to 256 KB.	2025-06-23 12:41:11 +00:00
Heikki Linnakangas	7916aa26e0	Stop using build-tools image in compute image build (#12306 ) The build-tools image contains various build tools and dependencies, mostly Rust-related. The compute image build used it to build compute_ctl and a few other little rust binaries that are included in the compute image. However, for extensions built in Rust (pgrx), the build used a different layer which installed the rust toolchain using rustup. Switch to using the same rust toolchain for both pgrx-based extensions and compute_ctl et al. Since we don't need anything else from the build-tools image, I switched to using the toolchain installed with rustup, and eliminated the dependency to build-tools altogether. The compute image build no longer depends on build-tools. Note: We no longer use 'mold' for linking compute_ctl et al, since mold is not included in the build-deps-with-cargo layer. We could add it there, but it doesn't seem worth it. I proposed stopping using mold altogether in https://github.com/neondatabase/neon/pull/10735, but that was rejected because 'mold' is faster for incremental builds. That doesn't matter much for docker builds however, since they're not incremental, and the compute binaries are not as large as the storage server binaries anyway.	2025-06-23 09:11:05 +00:00
Heikki Linnakangas	52ab8f3e65	Use `make all` in the "Build and Test locally" CI workflow (#12311 ) To avoid duplicating the build logic. `make all` covers the separate `postgres-*` and `neon-pg-ext` steps, and also does `cargo build`. That's how you would typically do a full local build anyway.	2025-06-23 09:10:32 +00:00
Heikki Linnakangas	3d822dbbde	Refactor Makefile rules for building the extensions under pgxn/ (#12305 )	2025-06-22 19:43:14 +00:00
Heikki Linnakangas	af46b5286f	Avoid recompiling `postgres_ffi` when there has been no changes (#12292 ) Every time you run `make`, it runs `make install` on all the PostgreSQL sources, which copies the header files. That in turn triggers a rebuild of the `postgres_ffi` crate, and everything that depends on it. We had worked around this earlier (see #2458), by passing a custom INSTALL script to the Postgres makefiles, which refrains from updating the modification timestamp on headers when they have not been changed, but the v14 makefile didn't obey INSTALL for the header files. Backporting c0a1d7621b to v14 fixes that. This backports upstream PostgreSQL commit c0a1d7621b to v14. Corresponding PR in the 'postgres' repo: https://github.com/neondatabase/postgres/pull/660	2025-06-21 21:07:38 +00:00
Erik Grinaker	47f7efee06	pageserver: require stripe size (#12257 ) ## Problem In #12217, we began passing the stripe size in reattach responses, and persisting it in the on-disk state. This is necessary to ensure the storage controller and Pageserver have a consistent view of the intended stripe size of unsharded tenants, which will be used for splits that do not specify a stripe size. However, for backwards compatibility, these stripe sizes were optional. ## Summary of changes Make the stripe sizes required for reattach responses and on-disk location configs. These will always be provided by the previous (current) release.	2025-06-21 15:01:29 +00:00
Tristan Partin	868c38f522	Rename the compute_ctl admin scope to compute_ctl:admin (#12263 ) Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-06-20 22:49:05 +00:00
Tristan Partin	c8b2ac93cf	Allow the control plane to override any Postgres connection options (#12262 ) The previous behavior was for the compute to override control plane options if there was a conflict. We want to change the behavior so that the control plane has the absolute power on what is right. In the event that we need a new option passed to the compute as soon as possible, we can initially roll it out in the control plane, and then migrate the option to EXTRA_OPTIONS within the compute later, for instance. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-06-20 18:46:30 +00:00
Dmitrii Kovalkov	b2954d16ff	storcon, neon_local: add timeline_safekeeper_count (#12303 ) ## Problem We need to specify the number of safekeepers for neon_local without `testing` feature. Also we need this option for testing different configurations of safekeeper migration code. We cannot set it in `neon_fixtures.py` and in the default config of `neon_local` yet, because it will fail compatibility tests. I'll make a separate PR with removing `cfg!("testing")` completely and specifying this option in the config when this option reaches the release branch. - Part of https://github.com/neondatabase/neon/issues/12298 ## Summary of changes - Add `timeline_safekeeper_count` config option to storcon and neon_local	2025-06-20 16:03:17 +00:00
Alex Chi Z.	79485e7c3a	feat(pageserver): enable gc-compaction by default everywhere (#12105 ) Enable it across tests and set it as default. Marks the first milestone of https://github.com/neondatabase/neon/issues/9114. We already enabled it in all AWS regions and planning to enable it in all Azure regions next week. will merge after we roll out in all regions. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-06-20 15:35:11 +00:00
Heikki Linnakangas	eaf1ab21c4	Store intermediate build files in `build/` rather than `pg_install/build/` (#12295 ) This way, `pg_install` contains only the final build artifacts, not intermediate files like *.o files. Seems cleaner.	2025-06-20 14:50:03 +00:00
Vlad Lazar	6508f4e5c1	pageserver: revise gc layer map lock handling (#12290 ) ## Problem Timeline GC is very aggressive with regards to layer map locking. We've seen timelines with loads of layers in production that hold the write lock for the layer map for 30 minutes at a time. This blocks reads and the write path to some extent. ## Summary of changes Determining the set of layers to GC is done under the read lock. Applying the updates is done under the write lock. Previously, everything was done under write lock.	2025-06-20 11:57:30 +00:00
Conrad Ludgate	a298d2c29b	[proxy] replace the batch cancellation queue, shorten the TTL for cancel keys (#11943 ) See #11942 Idea: * if connections are short lived, they can get enqueued and then also remove themselves later if they never made it to redis. This reduces the load on the queue. * short lived connections (<10m, most?) will only issue 1 command, we remove the delete command and rely on ttl. * we can enqueue as many commands as we want, as we can always cancel the enqueue, thanks to the ~~intrusive linked lists~~ `BTreeMap`.	2025-06-20 11:48:01 +00:00
Arpad Müller	8b197de7ff	Increase upload timeout for test_tenant_s3_restore (#12297 ) Increase the upload timeout of the test to avoid hitting timeouts (which we sometimes do). Fixes https://github.com/neondatabase/neon/issues/12212	2025-06-20 10:33:11 +00:00
Erik Grinaker	15d079cd41	pagebench: improve `getpage-latest-lsn` gRPC support (#12293 ) This improves `pagebench getpage-latest-lsn` gRPC support by: * Using `page_api::Client`. * Removing `--protocol`, and using the `page-server-connstring` scheme instead. * Adding `--compression` to enable zstd compression.	2025-06-20 08:31:40 +00:00
Erik Grinaker	dc1625cd8e	pagebench: add `basebackup` gRPC support (#12250 ) ## Problem Pagebench does not support gRPC for `basebackup` benchmarks. Requires #12243. Touches #11728. ## Summary of changes Add gRPC support via gRPC connstrings, e.g. `pagebench basebackup --page-service-connstring grpc://localhost:51051`. Also change `--gzip-probability` to `--no-compression`, since this must be specified per-client for gRPC.	2025-06-19 15:40:57 +00:00
dependabot[bot]	a6d4de25cd	build(deps): bump urllib3 from 1.26.19 to 2.5.0 in the pip group across 1 directory (#12289 ) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-06-19 14:20:02 +00:00
Arpad Müller	ec1452a559	Switch on --timelines-onto-safekeepers in integration tests (#11712 ) Switch on the `--timelines-onto-safekeepers` param in integration tests. Some changes that were needed to enable this but which I put into other PRs to not clutter up this one: * #11786 * #11854 * #12129 * #12138 Further fixes that were needed for this: * https://github.com/neondatabase/neon/pull/11801 * https://github.com/neondatabase/neon/pull/12143 * https://github.com/neondatabase/neon/pull/12204 Not strictly needed, but helpful: * https://github.com/neondatabase/neon/pull/12155 Part of #11670 Closes #11424	2025-06-19 11:17:01 +00:00
Heikki Linnakangas	1950ccfe33	Eliminate dependency from pageserver_api to postgres_ffi (#12273 ) Introduce a separate `postgres_ffi_types` crate which contains a few types and functions that were used in the API. `postgres_ffi_types` is a much smaller crate than `postgres_ffi`, and it doesn't depend on bindgen or the Postgres C headers. Move NeonWalRecord and Value types to wal_decoder crate. They are only used in the pageserver-safekeeper "ingest" API. The rest of the ingest API types are defined in wal_decoder, so move these there as well.	2025-06-19 10:31:27 +00:00
Heikki Linnakangas	2ca6665f4a	Remove outdated 'clean' Makefile targets (#12288 ) We have been bad at keeping them up-to-date, several contrib modules and neon extensions were missing from the clean rules. Give up trying, and remove the targets altogether. In practice, it's straightforward to just do `rm -rf pg_install/build`, so the clean-targets are hardly worth the maintenance effort. I kept `make distclean` though. The rule for that is simple enough.	2025-06-19 10:24:09 +00:00
Heikki Linnakangas	fa954671b2	Remove unnecessary Postgres libs from the storage docker image (#12286 ) Since commit `87ad50c925`, storage_controller has used diesel_async, which in turn uses tokio-postgres as the Postgres client, which doesn't require libpq. Thus we no longer need libpq in the storage image.	2025-06-19 10:00:01 +00:00
Erik Grinaker	6f4ffdb48b	pageserver: add gRPC compression (#12280 ) ## Problem The gRPC page service should support compression. Requires #12111. Touches #11728. Touches https://github.com/neondatabase/cloud/issues/25679. ## Summary of changes Add support for gzip and zstd compression in the server, and a client parameter to enable compression. This will need further benchmarking under realistic network conditions.	2025-06-19 09:54:34 +00:00
Vlad Lazar	3f676df3d5	pageserver: fix initial layer visibility calculation (#12206 ) ## Problem GC info is an input to updating layer visibility. Currently, gc info is updated on timeline activation and visibility is computed on tenant attach, so we ignore branch points and compute visibility by taking all layers into account. Side note: gc info is also updated when timelines are created and dropped. That doesn't help because we create the timelines in topological order from the root. Hence the root timeline goes first, without context of where the branch points are. The impact of this in prod is that shards need to rehydrate layers after live migration since the non-visible ones were excluded from the heatmap. ## Summary of Changes Move the visibility calculation into tenant attachment instead of activation.	2025-06-19 09:53:18 +00:00
Suhas Thalanki	20f4febce1	fix: additional changes to terminate pgbouncer on compute suspend (#12153 ) (#12284 ) Addressed [retrospective comments ](https://github.com/neondatabase/neon/pull/12153#discussion_r2154197503) to https://github.com/neondatabase/neon/pull/12153	2025-06-18 19:31:22 +00:00
Mikhail	762905cf8d	endpoint storage: parse config with type:LocalFs\|AwsS3\|AzureContainer (#12282 ) https://github.com/neondatabase/cloud/issues/27195	2025-06-18 17:45:20 +00:00
Elizabeth Murray	830ef35ed3	Domain client for Pageserver GRPC. (#12111 ) Add domain client for new communicator GRPC types.	2025-06-18 15:51:49 +00:00
Erik Grinaker	d8d62fb7cb	test_runner: add gRPC support (#12279 ) ## Problem `test_runner` integration tests should support gRPC. Touches #11926. ## Summary of changes * Enable gRPC for Pageservers, with dynamic port allocations. * Add a `grpc` parameter for endpoint creation, plumbed through to `neon_local endpoint create`. No tests actually use gRPC yet, since computes don't support it yet.	2025-06-18 14:05:13 +00:00
Aleksandr Sarantsev	e6a404c66d	Fix flaky test_sharding_split_failures (#12199 ) ## Problem `test_sharding_failures` is flaky due to interference from the `background_reconcile` process. The details are in the issue https://github.com/neondatabase/neon/issues/12029. ## Summary of changes - Use `reconcile_until_idle` to ensure a stable state before running test assertions - Added error tolerance in `reconcile_until_idle` test function (Failure cases: 1, 3, 19, 20) - Ignore the `Keeping extra secondaries` warning message since it is retryable (Failure case: 2) - Deduplicated code in `assert_rolled_back` and `assert_split_done` - Added a log message before printing plenty of Node `X` seen on pageserver `Y`	2025-06-18 13:27:41 +00:00
Peter Bendel	7e711ede44	Increase tenant size for large tenant oltp workload (#12260 ) ## Problem - We run the large tenant oltp workload with a fixed size (larger than existing customers' workloads). Our customer's workloads are continuously growing and our testing should stay ahead of the customers' production workloads. - we want to touch all tables in the tenant's database (updates) so that we simulate a continuous change in layer files like in a real production workload - our current oltp benchmark uses a mixture of read and write transactions, however we also want a separate test run with read-only transactions only ## Summary of changes - modify the existing workload to have a separate run with pgbench custom scripts that are read-only - create a new workload that - grows all large tables in each run (for the reuse branch in the large oltp tenant's project) - updates a percentage of rows in all large tables in each run (to enforce table bloat and auto-vacuum runs and layer rebuild in pageservers) Each run of the new workflow increases the logical database size by about 16 GB. We start with 6 runs per day which will give us about 96-100 GB growth per day. --------- Co-authored-by: Alexander Lakhin <alexander.lakhin@neon.tech>	2025-06-18 12:40:25 +00:00
Mikhail	e95f2f9a67	compute_ctl: return LSN in /terminate (#12240 ) - Add optional `?mode=fast\|immediate` to `/terminate`, `fast` is default. Immediate avoids waiting 30 seconds before returning from `terminate`. - Add `TerminateMode` to `ComputeStatus::TerminationPending` - Use `/terminate?mode=immediate` in `neon_local` instead of `pg_ctl stop` for `test_replica_promotes`. - Change `test_replica_promotes` to check returned LSN - Annotate `finish_sync_safekeepers` as `noreturn`. https://github.com/neondatabase/cloud/issues/29807	2025-06-18 12:25:19 +00:00
Heikki Linnakangas	5a045e7d52	Move pagestream_api to separate module (#12272 ) For general readability.	2025-06-18 12:03:14 +00:00
Dimitri Fontaine	67fbc0582e	Validate safekeeper_connstrings when parsing compute specs. (#11906 ) This check API only checks the safekeeper_connstrings at the moment, and the validation is limited to checking we have at least one entry in there, and no duplicates. ## Problem If the compute_ctl service is started with an empty list of safekeepers, then hard-to-debug errors may happen at runtime, where it would be much easier to catch them early. ## Summary of changes Add an entry point in the compute_ctl API to validate the configuration for safekeeper_connstrings. --------- Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2025-06-18 10:01:05 +00:00
Heikki Linnakangas	3af6b3a2bf	Avoid redownloading rust toolchain on Postgres changes (#12265 ) Create a separate stage for downloading the Rust toolchain for pgrx, so that it can be cached independently of the pg-build layer. Before this, the 'pg-build-nonroot=with-cargo' layer was unnecessarily rebuilt every time there was a change in PostgreSQL sources. Furthermore, this allows using the same cached layer for building the compute images of all Postgres versions.	2025-06-18 09:49:42 +00:00
Erik Grinaker	04013929cb	pageserver: support full gRPC basebackups (#12269 ) ## Problem Full basebackups are used in tests, and may be useful for debugging as well, so we should support them in the gRPC API. Touches #11728. ## Summary of changes Add `GetBaseBackupRequest::full` to generate full base backups. The libpq implementation also allows specifying `prev_lsn` for full backups, i.e. the end LSN of the previous WAL record. This is omitted in the gRPC API, since it's not used by any tests, and presumably of limited value since it's autodetected. We can add it later if we find that we need it.	2025-06-18 06:48:39 +00:00
Suhas Thalanki	83069f6ca1	fix: terminate pgbouncer on compute suspend (#12153 ) ## Problem PgBouncer does not terminate connections on a suspend: https://github.com/neondatabase/cloud/issues/16282 ## Summary of changes 1. Adds a pid file to store the pid of PgBouncer 2. Terminates connections on a compute suspend --------- Co-authored-by: Alexey Kondratov <kondratov.aleksey@gmail.com>	2025-06-17 22:56:05 +00:00
Mikhail	7d4f662fbf	upgrade default neon version to 1.6 (#12185 ) Changes for 1.6 were merged and deployed two months ago https://github.com/neondatabase/neon/blob/main/pgxn/neon/neon--1.6--1.5.sql. In order to deploy https://github.com/neondatabase/neon/pull/12183, we need 1.6 to be default, otherwise we can't use prewarm API on read-only replica (`ALTER EXTENSION` won't work) and we need it for promotion	2025-06-17 17:46:35 +00:00
Alexander Bayandin	a5cac52e26	compute-image: add a patch for onnxruntime (#12274 ) ## Problem The checksum for eigen (a dependency for onnxruntime) has changed which breaks compute image build. ## Summary of changes - Add a patch for onnxruntime which backports changes from `f57db79743` (we keep the current version) Ref https://github.com/microsoft/onnxruntime/issues/24861	2025-06-17 16:35:20 +00:00
Konstantin Knizhnik	dfa055f4be	Support event trigger for Neon users (#10624 ) ## Problem https://github.com/neondatabase/neon/issues/7570 Event triggers are supported only for superusers. ## Summary of changes Temporarily switch to superuser when an event trigger is created and disable execution of user's event triggers under superuser. --------- Co-authored-by: Dimitri Fontaine <dim@tapoueh.org> Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-06-17 15:44:50 +00:00
Erik Grinaker	a4c76740c0	pageserver: emit gRPC GetPage errors as responses (#12255 ) ## Problem When converting `proto::GetPageRequest` into `page_api::GetPageRequest` and validating the request, errors are returned as `tonic::Status`. This will tear down the GetPage stream, which is disruptive and unnecessary. ## Summary of changes Emit invalid request errors as `GetPageResponse` with an appropriate `status_code` instead. Also move the conversion from `tonic::Status` to `GetPageResponse` out into the stream handler.	2025-06-17 15:41:17 +00:00
Dmitrii Kovalkov	f2e96b2323	tests: prepare test_compatibility.py for --timelines-onto-safekeepers (#12204 ) ## Problem Compatibility tests may be run against a compatibility snapshot generated with --timelines-onto-safekeepers=false. We need to start the compute without a generation (or with 0 generation) if the timeline is not storcon-managed, otherwise the compute will hang. - Follow up on https://github.com/neondatabase/neon/pull/12203 - Relates to https://github.com/neondatabase/neon/pull/11712 ## Summary of changes - Handle compatibility snapshot generated with no `--timelines-onto-safekeepers` properly	2025-06-17 15:16:07 +00:00
Dmitrii Kovalkov	dee73f0cb4	pageserver: implement max_total_size_bytes limit for basebackup cache (#12230 ) ## Problem The cache was introduced as a hackathon project and the only supported limit was the number of entries. The basebackup entry size may vary. We need to have more control over disk space usage to ship it to production. - Part of https://github.com/neondatabase/cloud/issues/29353 ## Summary of changes - Store the size of entries in the cache and use it to limit `max_total_size_bytes` - Add the size of the cache in bytes to metrics.	2025-06-17 15:08:59 +00:00
Erik Grinaker	edf51688bc	neon_local: support gRPC connstrings for endpoints (#12271 ) ## Problem `neon_local` should support endpoints using gRPC, by providing `grpc://` connstrings with the Pageservers' gRPC ports. Requires #12268. Touches #11926. ## Summary of changes * Add `--grpc` switch for `neon_local endpoint create`. * Generate `grpc://` connstrings for endpoints when enabled. Computes don't actually support `grpc://` connstrings yet, but will soon. gRPC is configured when the endpoint is created, not when it's started, such that it continues to use gRPC across restarts and reconfigurations. In particular, this is necessary for the storage controller's local notify hook, which can't easily plumb through gRPC configuration from the start/reconfigure commands but has access to the endpoint's configuration.	2025-06-17 14:39:42 +00:00
Aleksandr Sarantsev	4a8f3508f9	storcon: Add safekeeper request label group (#12239 ) ## Problem The metrics `storage_controller_safekeeper_request_error` and `storage_controller_safekeeper_request_latency` currently use `pageserver_id` as a label. This can be misleading, as the metrics are about safekeeper requests. We want to replace this with a more accurate label — either `safekeeper_id` or `node_id`. ## Summary of changes - Introduced `SafekeeperRequestLabelGroup` with `safekeeper_id`. - Updated the affected metrics to use the new label group. - Fixed incorrect metric usage in safekeeper_client.rs ## Follow-up - Review usage of these metrics in alerting rules and existing Grafana dashboards to ensure this change does not break something.	2025-06-17 13:33:01 +00:00
Erik Grinaker	48052477b4	storcon: register Pageserver gRPC address (#12268 ) ## Problem Pageservers now expose a gRPC API on a separate address and port. This must be registered with the storage controller such that it can be plumbed through to the compute via cplane. Touches #11926. ## Summary of changes This patch registers the gRPC address and port with the storage controller: * Add gRPC address to `nodes` database table and `NodePersistence`, with a Diesel migration. * Add gRPC address in `NodeMetadata`, `NodeRegisterRequest`, `NodeDescribeResponse`, and `TenantLocateResponseShard`. * Add gRPC address flags to `storcon_cli node-register`. These changes are backwards-compatible, since all structs will ignore unknown fields during deserialization.	2025-06-17 13:27:10 +00:00
Erik Grinaker	d81353b2d1	pageserver: gRPC base backup fixes (#12243 ) ## Problem The gRPC base backup implementation has a few issues: chunks are not properly bounded, and it's not possible to omit the LSN. Touches #11728. ## Summary of changes * Properly bound chunks by using a limited writer. * Use an `Option<Lsn>` rather than a `ReadLsn` (the latter requires an LSN).	2025-06-17 12:37:43 +00:00
Aleksandr Sarantsev	143500dc4f	storcon: Improve stably_attached readability (#12249 ) ## Problem The `stably_attached` function is hard to read due to deeply nested conditionals ## Summary of Changes - Refactored `stably_attached` to use early returns and the `?` operator for improved readability	2025-06-17 10:10:10 +00:00
Aleksandr Sarantsev	1a5f7ce6ad	storcon: Exclude another secondaries while optimizing secondary (#12251 ) ## Problem If the node intent includes more than one secondary, we can generate a replace optimization using a candidate node that is already a secondary location. ## Summary of changes - Exclude all other secondary nodes from the scoring process to ensure optimal candidate selection.	2025-06-17 10:09:55 +00:00
Alexander Lakhin	01ccb34118	Don't rerun failed tests in 'Build and Test with Sanitizers' workflow (#12259 ) ## Problem We could easily miss a sanitizer-detected defect, if it occurred due to some race condition, as we just rerun the test and if it succeeds, the overall test run is considered successful. It was more reasonable before, when we had much more unstable tests in main, but now we can track all test failures. ## Summary of changes Don't rerun failed tests.	2025-06-17 08:08:43 +00:00
github-actions[bot]	bf2a21567d	Proxy release 2025-06-17 06:01 UTC	2025-06-17 06:01:42 +00:00
Tristan Partin	f669e18477	Remove TODO comment related to default_transaction_read_only (#12261 ) This code has been deployed for a while, so let's remove the TODO, and remove the option passed from the control plane. Link: https://github.com/neondatabase/cloud/pull/30274 Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-06-16 19:38:26 +00:00
Suhas Thalanki	632cde7f13	schema and github workflow for validation of compute manifest (#12069 ) Adds a schema to validate the manifest.yaml described in [this RFC](https://github.com/neondatabase/neon/blob/main/docs/rfcs/038-independent-compute-release.md) and a github workflow to test this.	2025-06-16 19:30:41 +00:00
Alexander Lakhin	118e13438d	Add "Build and Test Fully" workflow (#11931 ) ## Problem We don't test debug builds for v14..v16 in the regular "Build and Test" runs to perform the testing faster, but it means we can't detect assertion failures in those versions. (See https://github.com/neondatabase/neon/issues/11891, https://github.com/neondatabase/neon/issues/11997) ## Summary of changes Add a new workflow to test all build types and all versions on all architectures.	2025-06-16 13:29:39 +00:00
Trung Dinh	fc136eec8f	pagectl: add dump layer local (#12245 ) ## Problem In our environment, we don't always have access to the pagectl tool on the pageserver. We have to download the page files to local env to introspect them. Hence, it'll be useful to be able to parse the local files using `pagectl`. ## Summary of changes * Add `dump-layer-local` to `pagectl` that takes a local path as argument and returns the layer content: ``` cargo run -p pagectl layer dump-layer-local ~/Desktop/000000067F000040490002800000FFFFFFFF-030000000000000000000000000000000002__00003E7A53EDE611-00003E7AF27BFD19-v1-00000001 ``` * Bonus: Fix a bug in `pageserver/ctl/src/draw_timeline_dir.rs` in which we don't filter out temporary files.	2025-06-16 10:29:42 +00:00
Erik Grinaker	818e5130f1	page_api: add a few derives (#12253 ) ## Problem The `page_api` domain types are missing a few derives. ## Summary of changes Add `Clone`, `Copy`, and `Debug` derives for all types where appropriate.	2025-06-16 09:45:50 +00:00
Alexander Sarantcev	c243521ae5	Fix reconcile_long_running metric comment (#12234 ) ## Problem Comment for `storage_controller_reconcile_long_running` metric was copy-pasted and not updated in #9207 ## Summary of changes - Fixed comment	2025-06-16 05:51:57 +00:00
Alexander Sarantcev	5303c71589	Move comment above metrics handler (#12236 ) ## Problem Comment is in incorrect place: `/metrics` code is above its description comment. ## Summary of changes - `/metrics` code is now below the comment	2025-06-13 18:18:51 +00:00
Alexander Sarantcev	d146897415	Fix reconciles metrics typo (#12235 ) ## Problem Need to fix naming `safkeeper` -> `safekeeper` ## Summary of changes - `storage_controller_safkeeper_reconciles_` renamed to `storage_controller_safekeeper_reconciles_`	2025-06-13 17:47:09 +00:00
Alexander Sarantcev	d63815fa40	Fix ChaosInjector shard eligibility bug (#12231 ) ## Problem ChaosInjector is intended to skip non-active scheduling policies, but the current logic skips active shards instead. ## Summary of changes - Fixed shard eligibility condition to correctly allow chaos injection for shards with an Active scheduling policy.	2025-06-13 13:34:29 +00:00
Dmitrii Kovalkov	385324ee8a	pageserver: fix post-merge PR comments on basebackup cache (#12216 ) ## Problem This PR addresses all but the direct IO post-merge comments on basebackup cache implementation. - Follow up on https://github.com/neondatabase/neon/pull/11989#pullrequestreview-2867966119 - Part of https://github.com/neondatabase/cloud/issues/29353 ## Summary of changes - Clean up the tmp directory by recreating it. - Recreate the tmp directory on startup. - Add comments why it's safe to not fsync the inode after renaming.	2025-06-13 08:49:31 +00:00
Alex Chi Z.	8a68d463f6	feat(pagectl): no max key limit if time travel recover locally (#12222 ) ## Problem We would easily hit this limit for a tenant that has been running for a long enough time. ## Summary of changes Remove the max key limit for time-travel recovery if the command is running locally. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-06-13 08:41:10 +00:00
Alex Chi Z.	3046c307da	feat(posthog_client): support feature flag secure API (#12201 ) ## Problem Part of #11813 PostHog has two endpoints to retrieve feature flags: the old project ID one that uses personal API token, and the new one using a special feature flag secure token that can only retrieve feature flag. The new API I added in this patch is not documented in the PostHog API doc but it's used in their Python SDK. ## Summary of changes Add support for "feature flag secure token API". The API has no way of providing a project ID so we verify if the retrieved spec is consistent with the project ID specified by comparing the `team_id` field. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-06-13 07:22:02 +00:00
Dmitrii Kovalkov	e83f1d8ba5	tests: prepare test_historic_storage_formats for --timelines-onto-safekeepers (#12214 ) ## Problem `test_historic_storage_formats` uses `/tenant_import` to import historic data. Tenant import does not create timelines onto safekeepers, because they might already exist on some safekeeper set. If it did, we could end up with two different quorums accepting WAL for the same timeline. If the tenant import is used in a real deployment, the administrator is responsible for finding the proper safekeeper set and migrating timelines into storcon-managed timelines. - Relates to https://github.com/neondatabase/neon/pull/11712 ## Summary of changes - Create timelines onto safekeepers manually after tenant import in `test_historic_storage_formats` - Add a note to tenant import that timelines will not be storcon-managed after the import.	2025-06-13 06:28:18 +00:00
Trung Dinh	8917676e86	Improve logging for gc-compaction (#12219 ) ## Problem * Inside `compact_with_gc_inner`, there is a similar log line: `db24ba95d1/pageserver/src/tenant/timeline/compaction.rs (L3181-L3187)` * Also, I think it would be useful when debugging to have the ability to select a particular sub-compaction job (e.g., `1/100`) to see all the logs for that job. ## Summary of changes * Attach a span to the `compact_with_gc_inner`. CC: @skyzh	2025-06-13 06:07:18 +00:00
Ivan Efremov	43acabd4c2	[proxy]: Improve backoff strategy for redis reconnection (#12218 ) Sometimes during a failed redis connection attempt at the init stage proxy pod can continuously restart. This, in turn, can aggravate the problem if redis is overloaded. Solves the #11114	2025-06-12 19:46:02 +00:00
Vlad Lazar	db24ba95d1	pagserver: always persist shard identity (#12217 ) ## Problem The location config (which includes the stripe size) is stored on pageserver disk. For unsharded tenants we [do not include the shard identity in the serialized description](`ad88ec9257/pageserver/src/tenant/config.rs (L64-L66)`). When the pageserver restarts, it reads that configuration and will use the stripe size from there and rely on storcon input from reattach for generation and mode. The default deserialization is ShardIdentity::unsharded. This has the new default stripe size of 2048. Hence, for unsharded tenants we can be running with a stripe size different from the one in the storcon observed state. This is not a problem until we shard split without specifying a stripe size (i.e. manual splits via the UI or storcon_cli). When that happens the new shards will use the 2048 stripe size until storcon realises and switches them back. At that point it's too late, since we've ingested data with the wrong stripe sizes. ## Summary of changes Ideally, we would always have the full shard identity on disk. To achieve this over two releases we do: 1. Always persist the shard identity in the location config on the PS. 2. Storage controller includes the stripe size to use in the re-attach response. After the first release, we will start persisting correct stripe sizes for any tenant shard that the storage controller explicitly sends a location_conf. After the second release, the re-attach change kicks in and we'll persist the shard identity for all shards.	2025-06-12 17:15:02 +00:00
Folke Behrens	1dce65308d	Update base64 to 0.22 (#12215 ) ## Problem Base64 0.13 is outdated. ## Summary of changes Update base64 to 0.22. Affects mostly proxy and proxy libs. Also upgrade serde_with to remove another dep on base64 0.13 from dep tree.	2025-06-12 16:12:47 +00:00
Alex Chi Z.	ad88ec9257	fix(pageserver): extend layer manager read guard threshold (#12211 ) ## Problem Follow up of https://github.com/neondatabase/neon/pull/12194 to make the benchmarks run without warnings. ## Summary of changes Extend read guard hold timeout to 30s. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-06-12 08:39:54 +00:00
Dmitrii Kovalkov	60dfdf39c7	tests: prepare test_tenant_delete_stale_shards for --timelines-onto-safekeepers (#12198 ) ## Problem The test creates an endpoint and deletes its tenant. The compute cannot stop gracefully because it tries to write a checkpoint shutdown record into the WAL, but the timeline had been already deleted from safekeepers. - Relates to https://github.com/neondatabase/neon/pull/11712 ## Summary of changes Stop the compute before deleting a tenant	2025-06-12 08:10:22 +00:00
Dmitrii Kovalkov	3d5e2bf685	storcon: add tenant_timeline_locate handler (#12203 ) ## Problem Compatibility tests may be run against a compatibility snapshot generated with `--timelines-onto-safekeepers=false`. We need to start the compute without a generation (or with 0 generation) if the timeline is not storcon-managed, otherwise the compute will hang. This handler is needed to check if the timeline is storcon-managed. It's also needed for better test coverage of safekeeper migration code. - Relates to https://github.com/neondatabase/neon/pull/11712 ## Summary of changes - Implement `tenant_timeline_locate` handler in storcon to get safekeeper info from storcon's DB	2025-06-12 08:09:57 +00:00
dependabot[bot]	54fdcfdfa8	build(deps): bump requests from 2.32.3 to 2.32.4 in the pip group across 1 directory (#12180 ) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-06-11 21:09:05 +00:00
Vlad Lazar	28e882a80f	pageserver: warn on long layer manager locking intervals (#12194 ) ## Problem We hold the layer map for too long on occasion. ## Summary of changes This should help us identify the places where it's happening from. Related https://github.com/neondatabase/neon/issues/12182	2025-06-11 16:16:30 +00:00
Konstantin Knizhnik	24038033bf	Remove default from DROP FUNCTION (#12202 ) ## Problem DROP FUNCTION doesn't allow specifying defaults for parameters. ## Summary of changes Remove DEFAULT clause from pgxn/neon/neon--1.6--1.5.sql Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-06-11 13:16:58 +00:00
Mikhail	1b935b1958	endpoint_storage: add ?from_endpoint= to /lfc/prewarm (#12195 ) Related: https://github.com/neondatabase/cloud/issues/24225 Add optional from_endpoint parameter to allow prewarming from other endpoint	2025-06-10 19:25:32 +00:00
a-masterov	3f16ca2c18	Respect limits for projects for the Random Operations test (#12184 ) ## Problem The project limits were not respected, resulting in errors. ## Summary of changes Now limits are checked before running an action, and if the action is not possible to run, another random action will be run. --------- Co-authored-by: Peter Bendel <peterbendel@neon.tech>	2025-06-10 15:59:51 +00:00
Conrad Ludgate	67b94c5992	[proxy] per endpoint configuration for rate limits (#12148 ) https://github.com/neondatabase/cloud/issues/28333 Adds a new `rate_limit` response type to EndpointAccessControl, uses it for rate limiting, and adds a generic invalidation for the cache.	2025-06-10 14:26:08 +00:00
Folke Behrens	e38193c530	proxy: Move connect_to_compute back to proxy (#12181 ) It's mostly responsible for waking, retrying, and caching. A new, thin wrapper around compute_once will be PGLB's entry point	2025-06-10 11:23:03 +00:00
Konstantin Knizhnik	21949137ed	Return last ring index instead of min_ring_index in prefetch_register_bufferv (#12039 ) ## Problem See https://github.com/neondatabase/neon/issues/12018 Currently `prefetch_register_bufferv` calculates the min_ring_index of all vector requests. But because of pumping the prefetch state or a connection failure, previous slots may already have been processed and reused. ## Summary of changes Instead of returning the minimal index, this function should return the last slot index. The result of this function is actually used in only two places. The first place uses it just for checking (and this check is redundant because the same check is done in `prefetch_register_bufferv` itself). In the second place, where the index of the filled slot is actually used, there is just one request. So fortunately this bug can only cause an assertion failure in debug builds. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-06-10 10:09:46 +00:00
Trung Dinh	02f94edb60	Remove global static TENANTS (#12169 ) ## Problem There is this TODO in code: https://github.com/neondatabase/neon/blob/main/pageserver/src/tenant/mgr.rs#L300-L302 This is an old TODO by @jcsp. ## Summary of changes This PR addresses the TODO. Specifically, it removes a global static `TENANTS`. Instead the `TenantManager` now directly manages the tenant map. Enhancing abstraction. Essentially, this PR moves all module-level methods to inside the implementation of `TenantManager`.	2025-06-10 09:26:40 +00:00
github-actions[bot]	24053ff4ca	Proxy release 2025-06-10 08:56 UTC	2025-06-10 08:56:51 +00:00
Conrad Ludgate	58327ef74d	[proxy] fix sql-over-http password setting (#12177 ) ## Problem Looks like our sql-over-http tests get to rely on "trust" authentication, so the path that made sure the authkeys data was set was never being hit. ## Summary of changes Slight refactor to WakeComputeBackends, as well as making sure auth keys are propagated. Fix tests to ensure passwords are tested.	2025-06-10 08:46:29 +00:00
Dmitrii Kovalkov	73be6bb736	fix(compute): use proper safekeeper in VotesCollectedMset (#12175 ) ## Problem `VotesCollectedMset` uses the wrong safekeeper to update truncateLsn. This led to some failed assert later in the code during running safekeeper migration tests. - Relates to https://github.com/neondatabase/neon/issues/11823 ## Summary of changes Use proper safekeeper to update truncateLsn in VotesCollectedMset	2025-06-10 07:16:42 +00:00
Alex Chi Z.	40d7583906	feat(pageserver): use hostname as feature flag resolver property (#12141 ) ## Problem part of https://github.com/neondatabase/neon/issues/11813 ## Summary of changes Collect pageserver hostname property so that we can use it in the PostHog UI. Not sure if this is the best way to do that -- open to suggestions. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-06-10 07:10:41 +00:00
Alex Chi Z.	7a68699abb	feat(pageserver): support azure time-travel recovery (in an okay way) (#12140 ) ## Problem part of https://github.com/neondatabase/neon/issues/7546 Add Azure time travel recovery support. The tricky thing is how Azure handles deletes in its blob version API. For the following sequence: ``` upload file_1 = a upload file_1 = b delete file_1 upload file_1 = c ``` The "delete file_1" won't be stored as a version (as AWS did). Therefore, we can never rollback to a state where file_1 is temporarily invisible. If we roll back to the time before file_1 gets created for the first time, it will be removed correctly. However, this is fine for pageservers, because (1) having extra files in the tenant storage is usually fine (2) for things like timelines/X/index_part-Y.json, it will only be deleted once, so it can always be recovered to a correct state. Therefore, I don't expect any issues when this functionality is used on pageserver recovery. TODO: unit tests for time-travel recovery. ## Summary of changes Add Azure blob storage time-travel recovery support. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-06-10 05:32:58 +00:00
Konstantin Knizhnik	f42d44342d	Increase statement timeout for test_pageserver_restarts_under_workload test (#12139 ) ## Problem See https://github.com/neondatabase/neon/issues/12119#issuecomment-2942586090 Pageserver restarts at 1-second intervals increase the vacuum time, especially if prefetch is enabled, and so cause test failures due to statement timeout expiration. ## Summary of changes Increase statement timeout to 360 seconds. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Alexander Lakhin <alexander.lakhin@neon.tech>	2025-06-10 05:32:03 +00:00
Konstantin Knizhnik	d759fcb8bd	Increase wait LFC prewarm timeout (#12174 ) ## Problem See https://github.com/neondatabase/neon/issues/12171 ## Summary of changes Increase LFC prewarm wait timeout to 1 minute Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-06-09 18:01:30 +00:00
Alex Chi Z.	76f95f06d8	feat(pageserver): add global timeline count metrics (#12159 ) ## Problem We are getting tenants with a lot of branches and num of timelines is a good indicator of pageserver loads. I added this metrics to help us better plan pageserver capacities. ## Summary of changes Add `pageserver_timeline_states_count` with two labels: active + offloaded. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-06-09 09:57:36 +00:00
Mikhail	7efd4554ab	endpoint_storage: allow bypassing s3 write check on startup (#12165 ) Related: https://github.com/neondatabase/cloud/issues/27195	2025-06-06 18:08:02 +00:00
Erik Grinaker	3c7235669a	pageserver: don't delete parent shard files until split is committed (#12146 ) ## Problem If a shard split fails and must roll back, the tenant may hit a cold start as the parent shard's files have already been removed from local disk. External contribution with minor adjustments, see https://neondb.slack.com/archives/C08TE3203RQ/p1748246398269309. ## Summary of changes Keep the parent shard's files on local disk until the split has been committed, such that they are available if the split is rolled back. If all else fails, the files will be removed on the next Pageserver restart. This should also be fine in a mixed version: * New storcon, old Pageserver: the Pageserver will delete the files during the split, storcon will log an error when the cleanup detach fails. * Old storcon, new Pageserver: the Pageserver will leave the parent's files around until the next Pageserver restart. The change looks good to me, but shard splits are delicate so I'd like some extra eyes on this.	2025-06-06 15:55:14 +00:00
Conrad Ludgate	6dd84041a1	refactor and simplify the invalidation notification structure (#12154 ) The current cache invalidation messages are far too specific. They should be more generic since it only ends up triggering a `GetEndpointAccessControl` message anyway. Mappings: * `/allowed_ips_updated`, `/block_public_or_vpc_access_updated`, and `/allowed_vpc_endpoints_updated_for_projects` -> `/project_settings_update`. * `/allowed_vpc_endpoints_updated_for_org` -> `/account_settings_update`. * `/password_updated` -> `/role_setting_update`. I've also introduced `/endpoint_settings_update`. All message types support singular or multiple entries, which allows us to simplify things both on our side and on cplane side. I'm opening a PR to cplane to apply the above mappings, but for now using the old phrases to allow both to roll out independently. This change is inspired by my need to add yet another cached entry to `GetEndpointAccessControl` for https://github.com/neondatabase/cloud/issues/28333	2025-06-06 12:49:29 +00:00
Arpad Müller	df7e301a54	safekeeper: special error if a timeline has been deleted (#12155 ) We might delete timelines on safekeepers before we are deleting them on pageservers. This should be an exceptional situation, but can occur. As the first step to improve behaviour here, emit a special error that is less scary/obscure than "was not found in global map". It is for example emitted when the pageserver tries to run `IDENTIFY_SYSTEM` on a timeline that has been deleted on the safekeeper. Found when analyzing the failure of `test_scrubber_physical_gc_timeline_deletion` when enabling `--timelines-onto-safekeepers` on the pytests. Due to safekeeper restarts, there is no hard guarantee that we will keep issuing this error, so we need to think of something better if we start encountering this in staging/prod. But I would say that the introduction of `--timelines-onto-safekeepers` in the pytests and into staging won't change much about this: we are already deleting timelines from there. In `test_scrubber_physical_gc_timeline_deletion`, we'd just be leaking the timeline before on the safekeepers. Part of #11712	2025-06-06 11:54:07 +00:00
Mikhail	470c7d5e0e	endpoint_storage: default listen port, allow inline config (#12152 ) Related: https://github.com/neondatabase/cloud/issues/27195	2025-06-06 11:48:01 +00:00
Conrad Ludgate	4d99b6ff4d	[proxy] separate compute connect from compute authentication (#12145 ) ## Problem PGLB/Neonkeeper needs to separate the concerns of connecting to compute, and authenticating to compute. Additionally, the code within `connect_to_compute` is rather messy, spending effort on recovering the authentication info after wake_compute. ## Summary of changes Split `ConnCfg` into `ConnectInfo` and `AuthInfo`. `wake_compute` only returns `ConnectInfo` and `AuthInfo` is determined separately from the `handshake`/`authenticate` process. Additionally, `ConnectInfo::connect_raw` is in charge of establishing the TLS connection, and the `postgres_client::Config::connect_raw` is configured to use `NoTls` which will force it to skip the TLS negotiation. This should just work.	2025-06-06 10:29:55 +00:00
Alexander Sarantcev	590301df08	storcon: Introduce deletion tombstones to support flaky node scenario (#12096 ) ## Problem Removed nodes can re-add themselves on restart if not properly tombstoned. We need a mechanism (e.g. soft-delete flag) to prevent this, especially in cases where the node is unreachable. More details there: #12036 ## Summary of changes - Introduced `NodeLifecycle` enum to represent node lifecycle states. - Added a string representation of `NodeLifecycle` to the `nodes` table. - Implemented node removal using a tombstone mechanism. - Introduced `/debug/v1/tombstone*` handlers to manage the tombstone state.	2025-06-06 10:16:55 +00:00
Erik Grinaker	c511786548	pageserver: move `spawn_grpc` to `GrpcPageServiceHandler::spawn` (#12147 ) Mechanical move, no logic changes.	2025-06-06 10:01:58 +00:00
Alex Chi Z.	fe31baf985	feat(build): add aws cli into the docker image (#12161 ) ## Problem Makes it easier to debug AWS permission issues (i.e., storage scrubber) ## Summary of changes Install awscliv2 into the docker image. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-06-06 09:38:58 +00:00
Alex Chi Z.	b23e75ebfe	test(pageserver): ensure offload cleans up metrics (#12127 ) Add a test to ensure timeline metrics are fully cleaned up after offloading. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-06-06 06:50:54 +00:00
Arpad Müller	24d7c37e6e	neon_local timeline import: create timelines on safekeepers (#12138 ) neon_local's timeline import subcommand creates timelines manually, but doesn't create them on the safekeepers. If a test then tries to open an endpoint to read from the timeline, it will error in the new world with `--timelines-onto-safekeepers`. Therefore, if that flag is enabled, create the timelines on the safekeepers. Note that this import functionality is different from the fast import feature (https://github.com/neondatabase/neon/issues/10188, #11801). Part of #11670 As well as part of #11712	2025-06-05 18:53:14 +00:00
a-masterov	f64eb0cbaf	Remove the Flaky Test computed-columns from postgis v16 (#12132 ) ## Problem The `computed_columns` test assumes that computed columns are always faster than the request itself. However, this is not always the case on Neon, which can lead to flaky results. ## Summary of changes The `computed_columns` test is excluded from the PostGIS test for PostgreSQL v16, accompanied by related patch refactoring.	2025-06-05 15:02:38 +00:00
Alexey Kondratov	6ae4b89000	feat(compute_ctl): Implement graceful compute monitor exit (#11911 ) ## Problem After introducing a naive downtime calculation for the Postgres process inside compute in https://github.com/neondatabase/neon/pull/11346, I noticed that some amount of computes regularly report short downtime. After checking some particular cases, it looks like all of them report downtime close to the end of the life of the compute, i.e., when the control plane calls a `/terminate` and we are waiting for Postgres to exit. Compute monitor also produces a lot of error logs because Postgres stops accepting connections, but it's expected during the termination process. ## Summary of changes Regularly check the compute status inside the main compute monitor loop and exit gracefully when the compute is in some terminal or soon-to-be-terminal state. --------- Co-authored-by: Tristan Partin <tristan@neon.tech>	2025-06-05 12:17:28 +00:00
Dmitrii Kovalkov	f7ec7668a2	pageserver, tests: prepare test_basebackup_cache for --timelines-onto-safekeepers (#12143 ) ## Problem - `test_basebackup_cache` fails in https://github.com/neondatabase/neon/pull/11712 because after the timelines on safekeepers are managed by storage controller, they do contain proper start_lsn and the compute_ctl tool sends the first basebackup request with this LSN. - `Failed to prepare basebackup` log messages during timeline initialization, because the timeline is not yet in the global timeline map. - Relates to https://github.com/neondatabase/cloud/issues/29353 ## Summary of changes - Account for `timeline_onto_safekeepers` storcon's option in the test. - Do not trigger basebackup prepare during the timeline initialization.	2025-06-05 12:04:37 +00:00
a-masterov	038e967daf	Configure the dynamic loader for the extension-tests image (#12142 ) ## Problem The same problem, fixed in https://github.com/neondatabase/neon/issues/11857, but for the image `neon-extesions-test` ## Summary of changes The config file was added to use our library	2025-06-05 12:03:51 +00:00
Erik Grinaker	6a43f23eca	pagebench: add batch support (#12133 ) ## Problem The new gRPC page service protocol supports client-side batches. The current libpq protocol only does best-effort server-side batching. To compare these approaches, Pagebench should support submitting contiguous page batches, similar to how Postgres will submit them (e.g. with prefetches or vectored reads). ## Summary of changes Add a `--batch-size` parameter specifying the size of contiguous page batches. One batch counts as 1 RPS and 1 queue depth. For the libpq protocol, a batch is submitted as individual requests and we rely on the server to batch them for us. This will give a realistic comparison of how these would be processed in the wild (e.g. when Postgres sends 100 prefetch requests). This patch also adds some basic validation of responses.	2025-06-05 11:52:52 +00:00
Vlad Lazar	868f194a3b	pageserver: remove handling of vanilla protocol (#12126 ) ## Problem We support two ingest protocols on the pageserver: vanilla and interpreted. Interpreted has been the only protocol in use for a long time. ## Summary of changes * Remove the ingest handling of the vanilla protocol * Remove tenant and pageserver configuration for it * Update all tests that tweaked the ingest protocol ## Compatibility Backward compatibility: * The new pageserver version can read the existing pageserver configuration and it will ignore the unknown field. * When the tenant config is read from the storcon db or from the pageserver disk, the extra field will be ignored. Forward compatibility: * Both the pageserver config and the tenant config map missing fields to their default value. I'm not aware of any tenant level override that was made for this knob.	2025-06-05 11:43:04 +00:00
Konstantin Knizhnik	9c6c780201	Replica promote (#12090 ) ## Problem This PR is part of larger computes support activity: https://www.notion.so/neondatabase/Larger-computes-114f189e00478080ba01e8651ab7da90 Epic: https://github.com/neondatabase/cloud/issues/19010 In case of a planned node restart, we are going to 1. create a new read-only replica 2. capture LFC state at the primary 3. use this state to prewarm the replica 4. stop the old primary 5. promote the replica to primary Steps 1-3 are currently implemented and supported on the compute side. This PR provides a compute-level implementation of replica promotion. Support replica promotion ## Summary of changes Right now replica promotion is done in three steps: 1. Set the safekeepers list (currently it is empty for a replica) 2. Call `pg_promote()` to promote the replica 3. Update the endpoint setting so that it is no longer treated as a replica. Maybe all three steps should be done by some function in compute_ctl, but right now this logic is only implemented in the test. Postgres submodules PRs: https://github.com/neondatabase/postgres/pull/648 https://github.com/neondatabase/postgres/pull/649 https://github.com/neondatabase/postgres/pull/650 https://github.com/neondatabase/postgres/pull/651 --------- Co-authored-by: Matthias van de Meent <matthias@neon.tech> Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-06-05 11:27:14 +00:00
Konstantin Knizhnik	6123fe2d5e	Add query execution time histogram (#10050 ) ## Problem It will be useful to understand what kinds of queries our clients are executing. One of the most important characteristics of a query is its execution time: at the least, it allows us to distinguish OLAP and OLTP queries. Monitoring query execution time can also help to detect performance problems (assuming that the workload is more or less stable). ## Summary of changes Add query execution time histogram. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-06-05 11:23:39 +00:00
Folke Behrens	1577665c20	proxy: Move PGLB-related modules into pglb root module. (#12144 ) Split the modules responsible for passing data and connecting to compute from auth and waking for PGLB. This PR just moves files. The waking is going to get removed from pglb after this.	2025-06-05 11:00:23 +00:00
Alex Chi Z.	d8ebd1d771	feat(pageserver): report tenant properties to posthog (#12113 ) ## Problem Part of https://github.com/neondatabase/neon/issues/11813 In PostHog UI, we need to create the properties before using them as a filter. We report all variants automatically when we start the pageserver. In the future, we can report all real tenants instead of fake tenants (we do that now to save money + we don't need real tenants in the UI). ## Summary of changes * Collect `region`, `availability_zone`, `pageserver_id` properties and use them in the feature evaluation. * Report 10 fake tenants on each pageserver startup. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-06-05 07:48:36 +00:00
Conrad Ludgate	c8a96cf722	update proxy protocol parsing to not a rw wrapper (#12035 ) ## Problem I believe in all environments we now specify either required/rejected for proxy-protocol V2 as required. We no longer rely on the supported flow. This means we no longer need to keep around read bytes in case they're not in a header. While I designed ChainRW to be fast (the hot path with an empty buffer is very easy to branch predict), it's still unnecessary. ## Summary of changes * Remove the ChainRW wrapper * Refactor how we read the proxy-protocol header using read_exact. Slightly worse perf but it's hardly significant. * Don't try to parse the header if it's rejected.	2025-06-05 07:12:00 +00:00
Konstantin Knizhnik	56d505bce6	Update online_advisor (#12045 ) ## Problem Investigate crash of online_advisor in image check ## Summary of changes --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-06-05 05:48:25 +00:00
Arpad Müller	dae203ef69	pgxn: support generations in safekeepers_cmp (#12129 ) `safekeepers_cmp` was added by #8840 to make changes of the safekeeper set order independent: a `sk1,sk2,sk3` specifier changed to `sk3,sk1,sk2` should not cause a walproposer restart. However, this check didn't support generations, in the sense that it would see the `g#123:` as part of the first safekeeper in the list, and if the first safekeeper changes, it would also restart the walproposer. Therefore, parse the generation properly and make it not be a part of the first safekeeper. This PR doesn't add a specific test, but I have confirmed locally that `test_safekeepers_reconfigure_reorder` is fixed with the changes of PR #11712 applied thanks to this PR. Part of https://github.com/neondatabase/neon/issues/11670	2025-06-04 23:02:31 +00:00
Conrad Ludgate	1fb1315aed	compute-ctl: add spec for enable_tls, separate from compute-ctl config (#12109 ) ## Problem In between adding the TLS config for compute-ctl, and adding the TLS config in controlplane, we switched from using a provision flag to a bind flag. This happened to work in all of my testing in preview regions as they have no VM pool, so each bind was also a provision. However, in staging I found that the TLS config is still only processed during provision, even though it's only sent on bind. ## Summary of changes * Add a new feature flag value, `tls_experimental`, which tells postgres/pgbouncer/local_proxy to use the TLS certificates on bind. * compute_ctl on provision will be told where the certificates are, instead of being told on bind.	2025-06-04 20:07:47 +00:00
Suhas Thalanki	838622c594	compute: Add manifest.yml for default Postgres configuration settings (#11820 ) Adds a `manifest.yml` file that contains the default settings for compute. Currently, it comes from cplane code [here](`0cda3d4b01/goapp/controlplane/internal/pkg/compute/computespec/pg_settings.go (L110)`). Related RFC: https://github.com/neondatabase/neon/blob/main/docs/rfcs/038-independent-compute-release.md Related Issue: https://github.com/neondatabase/cloud/issues/11698	2025-06-04 18:03:59 +00:00
Tristan Partin	3fd5a94a85	Use Url::join() when creating the final remote extension URL (#12121 ) Url::to_string() adds a trailing slash on the base URL, so when we did the format!(), we were adding a double forward slash. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-06-04 15:56:12 +00:00
Erik Grinaker	e7d6f525b3	pageserver: support `get_vectored_concurrent_io` with gRPC (#12131 ) ## Problem The gRPC page service doesn't respect `get_vectored_concurrent_io` and always uses sequential IO. ## Summary of changes Spawn a sidecar task for concurrent IO when enabled. Cancellation will be addressed separately.	2025-06-04 15:14:17 +00:00
a-masterov	e4ca3ac745	Fix codestyle for compute.sh for docker-compose (#12128 ) ## Problem The script `compute.sh` had an inconsistent coding style and didn't follow best practices for modern bash scripts. ## Summary of changes The coding style was fixed to follow best practices.	2025-06-04 15:07:48 +00:00
Vlad Lazar	b69d103b90	pageserver: make import job max byte range size configurable (#12117 ) ## Problem We want to repro an OOM situation, but large partial reads are required. ## Summary of Changes Make the max partial read size configurable for import jobs.	2025-06-04 10:44:23 +00:00
a-masterov	208cbd52d4	Add postgis to the test image (#11672 ) ## Problem We don't currently run tests for PostGIS in our test environment. ## Summary of Changes - Added PostGIS test support for PostgreSQL v16 and v17 - Configured different PostGIS versions based on PostgreSQL version: - PostgreSQL v17: PostGIS 3.5.0 - PostgreSQL v14/v15/v16: PostGIS 3.3.3 - Added necessary test scripts and configurations This ensures our PostgreSQL implementation remains compatible with this widely-used extension. --------- Co-authored-by: Alexander Bayandin <alexander@neon.tech> Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2025-06-04 09:57:31 +00:00
Alex Chi Z.	c567ed0de0	feat(pageserver): feature flag counter metrics (#12112 ) ## Problem Part of https://github.com/neondatabase/neon/issues/11813 ## Summary of changes Add a counter on the feature evaluation outcome and we will set up alerts for too many failed evaluations in the future. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-06-04 06:41:42 +00:00
Mikhail	c698cee19a	ComputeSpec: prewarm_lfc_on_startup -> autoprewarm (#12120 ) https://github.com/neondatabase/cloud/pull/29472 https://github.com/neondatabase/cloud/issues/26346	2025-06-04 05:38:03 +00:00
Tristan Partin	4a3f32bf4a	Clean up compute_tools::http::JsonResponse::invalid_status() (#12110 ) JsonResponse::error() properly logs an error message which can be read in the compute logs. invalid_status() was not going through that helper function, thus not logging anything. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-06-03 16:00:56 +00:00
Vlad Lazar	a963aab14b	pageserver: set default wal receiver proto to interpreted (#12100 ) ## Problem This is already the default in production and in our test suite. ## Summary of changes Set the default proto to interpreted to reduce friction when spinning up new regions or cells.	2025-06-03 14:57:36 +00:00
Erik Grinaker	5bdba70f7d	page_api: only validate Protobuf → domain type conversion (#12115 ) ## Problem Currently, `page_api` domain types validate message invariants both when converting Protobuf → domain and domain → Protobuf. This is annoying for clients, because they can't use stream combinators to convert streamed requests (needed for hot path performance), and also performs the validation twice in the common case. Blocks #12099. ## Summary of changes Only validate the Protobuf → domain type conversion, i.e. on the receiver side, and make domain → Protobuf infallible. This is where it matters -- the Protobuf types are less strict than the domain types, and receivers should expect all sorts of junk from senders (they're not required to validate anyway, and can just construct an invalid message manually). Also adds a missing `impl From<CheckRelExistsRequest> for proto::CheckRelExistsRequest`.	2025-06-03 13:50:41 +00:00
Trung Dinh	25fffd3a55	Validate max_batch_size against max_get_vectored_keys (#12052 ) ## Problem Setting `max_batch_size` to anything higher than `Timeline::MAX_GET_VECTORED_KEYS` will cause runtime error. We should rather fail fast at startup if this is the case. ## Summary of changes * Create `max_get_vectored_keys` as a new configuration (default to 32); * Validate `max_batch_size` against `max_get_vectored_keys` right at config parsing and validation. Closes https://github.com/neondatabase/neon/issues/11994	2025-06-03 13:37:11 +00:00
Erik Grinaker	e00fd45bba	page_api: remove smallvec (#12095 ) ## Problem The gRPC `page_api` domain types used smallvecs to avoid heap allocations in the common case where a single page is requested. However, this is pointless: the Protobuf types use a normal vec, and converting a smallvec into a vec always causes a heap allocation anyway. ## Summary of changes Use a normal `Vec` instead of a `SmallVec` in `page_api` domain types.	2025-06-03 12:20:34 +00:00
Vlad Lazar	3b8be98b67	pageserver: remove backtrace in info level log (#12108 ) ## Problem We print a backtrace in an info level log every 10 seconds while waiting for the import data to land in the bucket. ## Summary of changes The backtrace is not useful. Remove it.	2025-06-03 09:07:07 +00:00
a-masterov	3e72edede5	Use full hostname for ONNX URL (#12064 ) ## Problem We should use the full host name for computes, according to https://github.com/neondatabase/cloud/issues/26005 , but now a truncated host name is used. ## Summary of changes The URL for REMOTE_ONNX is rewritten using the FQDN.	2025-06-03 07:23:17 +00:00
Alex Chi Z.	a650f7f5af	fix(pageserver): only deserialize reldir key once during get_db_size (#12102 ) ## Problem Fixes https://github.com/neondatabase/neon/issues/12101; this is a quick hack and we need a better API in the future. In `get_db_size`, we call `get_reldir_size` for every relation. However, we deserialize the same reldir directory for every relation. This creates huge CPU overhead. ## Summary of changes Get and deserialize the reldir v1 key once and use it across all get_rel_size requests. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-06-03 05:00:34 +00:00
Erik Grinaker	fc3994eb71	pageserver: initial gRPC page service implementation (#12094 ) ## Problem We should expose the page service over gRPC. Requires #12093. Touches #11728. ## Summary of changes This patch adds an initial page service implementation over gRPC. It ties in with the existing `PageServerHandler` request logic, to avoid the implementations drifting apart for the core read path. This is just a bare-bones functional implementation. Several important aspects have been omitted, and will be addressed in follow-up PRs: * Limited observability: minimal tracing, no logging, limited metrics and timing, etc. * Rate limiting will currently block. * No performance optimization. * No cancellation handling. * No tests. I've only done rudimentary testing of this, but Pagebench passes at least.	2025-06-02 17:15:18 +00:00
Conrad Ludgate	781bf4945d	proxy: optimise future layout allocations (#12104 ) A smaller version of #12066 that is somewhat easier to review. Now that I've been using https://crates.io/crates/top-type-sizes I've found a lot more of the low hanging fruit that can be tweaked to reduce the memory usage. Some context for the optimisations: Rust's stack allocation in futures is quite naive. Stack variables, even if moved, often still end up taking space in the future. Rearranging the order in which variables are defined, and properly scoping them can go a long way. `async fn` and `async move {}` have a consequence that they always duplicate the "upvars" (aka captures). All captures are permanently allocated in the future, even if moved. We can be mindful when writing futures to only capture as little as possible. TlsStream is massive. Needs boxing so it doesn't contribute to the above issue. ## Measurements from `top-type-sizes`: ### Before ``` 10328 {async block@proxy::proxy::task_main::{closure#0}::{closure#0}} align=8 6120 {async fn body of proxy::proxy::handle_client<proxy::protocol2::ChainRW<tokio::net::TcpStream>>()} align=8 ``` ### After ``` 4040 {async block@proxy::proxy::task_main::{closure#0}::{closure#0}} 4704 {async fn body of proxy::proxy::handle_client<proxy::protocol2::ChainRW<tokio::net::TcpStream>>()} align=8 ```	2025-06-02 16:13:30 +00:00
Erik Grinaker	a21c1174ed	pagebench: add gRPC support for `get-page-latest-lsn` (#12077 ) ## Problem We need gRPC support in Pagebench to benchmark the new gRPC Pageserver implementation. Touches #11728. ## Summary of changes Adds a `Client` trait to make the client transport swappable, and a gRPC client via a `--protocol grpc` parameter. This must also specify the connstring with the gRPC port: ``` pagebench get-page-latest-lsn --protocol grpc --page-service-connstring grpc://localhost:51051 ``` The client is implemented using the raw Tonic-generated gRPC client, to minimize client overhead.	2025-06-02 14:50:49 +00:00
Erik Grinaker	8d7ed2a4ee	pageserver: add gRPC observability middleware (#12093 ) ## Problem The page service logic asserts that a tracing span is present with tenant/timeline/shard IDs. An initial gRPC page service implementation thus requires a tracing span. Touches https://github.com/neondatabase/neon/issues/11728. ## Summary of changes Adds an `ObservabilityLayer` middleware that generates a tracing span and decorates it with IDs from the gRPC metadata. This is a minimal implementation to address the tracing span assertion. It will be extended with additional observability in later PRs.	2025-06-02 11:46:50 +00:00
Vlad Lazar	5b62749c42	pageserver: reduce import memory utilization (#12086 ) ## Problem Imports can end up allocating too much. ## Summary of Changes Nerf them a bunch and add some logs.	2025-06-02 10:29:15 +00:00
Vlad Lazar	af5bb67f08	pageserver: more reactive wal receiver cancellation (#12076 ) ## Problem If the wal receiver is cancelled, there's a 50% chance that it will ingest yet more WAL. ## Summary of Changes Always check cancellation first.	2025-06-02 08:59:21 +00:00
Conrad Ludgate	589bfdfd02	proxy: Changes to rate limits and GetEndpointAccessControl caches. (#12048 ) Precursor to https://github.com/neondatabase/cloud/issues/28333. We want per-endpoint configuration for rate limits, which will be distributed via the `GetEndpointAccessControl` API. This lays some of the ground work. 1. Allow the endpoint rate limiter to accept a custom leaky bucket config on check. 2. Remove the unused auth rate limiter, as I don't want to think about how it fits into this. 3. Refactor the caching of `GetEndpointAccessControl`, as it adds friction for adding new cached data to the API. That third one was rather large. I couldn't find any way to split it up. The core idea is that there's now only 2 cache APIs. `get_endpoint_access_controls` and `get_role_access_controls`. I'm pretty sure the behaviour is unchanged, except I did a drive by change to fix #8989 because it felt harmless. The change in question is that when a password validation fails, we eagerly expire the role cache if the role was cached for 5 minutes. This is to allow for edge cases where a user tries to connect with a reset password, but the cache never expires the entry due to some redis related quirk (lag, or misconfiguration, or cplane error)	2025-06-02 08:38:35 +00:00
github-actions[bot]	b147439d6b	Proxy release 2025-06-02 06:12 UTC	2025-06-02 06:12:02 +00:00
Conrad Ludgate	87179e26b3	completely rewrite pq_proto (#12085 ) libs/pqproto is designed for safekeeper/pageserver with maximum throughput. proxy only needs it for handshakes/authentication where throughput is not a concern but memory efficiency is. For this reason, we switch to using read_exact and only allocating as much memory as we need to. All reads return a `&'a [u8]` instead of a `Bytes` because accidental sharing of bytes can cause fragmentation. Returning the reference enforces all callers only hold onto the bytes they absolutely need. For example, before this change, `pqproto` was allocating 8KiB for the initial read `BytesMut`, and proxy was holding the `Bytes` in the `StartupMessageParams` for the entire connection through to passthrough.	2025-06-01 18:41:45 +00:00
Shockingly Good	f05df409bd	impr(compute): Remove the deprecated CLI arg alias for remote-ext-config. (#12087 ) Also moves it from `String` to `Url`.	2025-05-30 17:45:24 +00:00
Alex Chi Z.	f6c0f6c4ec	fix(ci): install build tools with --locked (#12083 ) ## Problem The release pipeline is failing because some tools cannot be installed. ## Summary of changes Install with `--locked`. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-05-30 17:00:41 +00:00
Shockingly Good	62cd3b8d3d	fix(compute) Remove the hardcoded default value for PGXN HTTP URL. (#12030 ) Removes the hardcoded value for the Postgres Extensions HTTP gateway URL as it is always provided by the calling code.	2025-05-30 15:26:22 +00:00
Alexander Lakhin	8d26978ed9	Allow known pageserver errors in test_location_conf_churn (#12082 ) ## Problem While a pageserver in the unreadable state could not be accessed by postgres thanks to https://github.com/neondatabase/neon/pull/12059, it may still receive WAL records and bump into the "layer file download failed: No file found" error when trying to ingest them. Closes: https://github.com/neondatabase/neon/issues/11348 ## Summary of changes Allow errors from wal_connection_manager, which are considered expected. See https://github.com/neondatabase/neon/issues/11348.	2025-05-30 15:20:46 +00:00
Christian Schwarz	35372a8f12	adjust VirtualFile operation latency histogram buckets (#12075 ) The expected operating range for the production NVMe drives is in the range of 50 to 250us. The bucket boundaries before this PR were not well suited to reason about the utilization / queuing / latency variability of those devices. # Performance There was some concern about perf impact of having so many buckets, considering the impl does a linear search on each observe(). I added a benchmark and measured on relevant machines. In any way, the PR is 40 buckets, so, won't make a meaningful difference on production machines (im4gn.2xlarge), going from 30ns -> 35ns.	2025-05-30 13:22:53 +00:00
Vlad Lazar	6d95a3fe2d	pageserver: various import flow fixups (#12047 ) ## Problem There's a bunch of TODOs in the import code. ## Summary of changes 1. Bound max import byte range to 128MiB. This might still be too high, given the default job concurrency, but it needs to be balanced with going back and forth to S3. 2. Prevent unsigned overflow when determining key range splits for concurrent jobs 3. Use sharded ranges to estimate task size when splitting jobs 4. Bubble up errors that we might hit due to invalid data in the bucket back to the storage controller. 5. Tweak the import bucket S3 client configuration.	2025-05-30 12:30:11 +00:00
Vlad Lazar	99726495c7	test: allow list overly eager storcon finalization (#12055 ) ## Problem I noticed a small percentage of flakes on some import tests. They were all instances of the storage controller being too eager on the finalization. As a refresher: the pageserver notifies the storage controller that it's done from the import task and the storage controller has to call back into it in order to finalize the import. The pageserver checks that the import task is done before serving that request. Hence, we can get this race. In practice, this has no impact since the storage controller will simply retry. ## Summary of changes Allow list such cases	2025-05-30 12:14:36 +00:00
Christian Schwarz	4a4a457312	fix(pageserver): frozen->L0 flush failure causes data loss (#12043 ) This patch is a fixup for - https://github.com/neondatabase/neon/pull/6788 Background ---------- That PR 6788 added artificial advancement of `disk_consistent_lsn` and `remote_consistent_lsn` for shards that weren't written to while other shards _were_ written to. See the PR description for more context. At the time of that PR, Pageserver shards were doing WAL filtering. Nowadays, the WAL filtering happens in Safekeepers. Shards learn about the WAL gaps via `InterpretedWalRecords::next_record_lsn`. The Bug ------- That artificial advancement code also runs if the flush failed. So, we advance the disk_consistent_lsn / remote_consistent_lsn, without having added the corresponding L0 to the `index_part.json`. The frozen layer remains in the layer map until detach, so we continue to serve data correctly. We're not advancing the flush loop variable `flushed_to_lsn` either, so, subsequent flush requests will retry the flush and repair the situation if they succeed. But if there aren't any successful retries, eventually the tenant will be detached and when it is attached somewhere else, the `index_part.json` and therefore layer map... 1. ... does not contain the frozen layer that failed to flush and 2. ... won't re-ingest that WAL either because walreceiver starts up with the advanced disk_consistent_lsn/remote_consistent_lsn. The result is that the read path will have a gap in the reconstruct data for the keys whose modifications were lost, resulting in a) either walredo failure b) or an incorrect page@lsn image if walredo doesn't error. The Fix ------- The fix is to only do the artificial advancement if `result.is_ok()`. Misc ---- As an aside, I took some time to re-review the flush loop and its callers. I found one more bug related to error handling that I filed here: - https://github.com/neondatabase/neon/issues/12025	2025-05-30 11:22:37 +00:00
John Spray	e78d1e2ec6	tests: tighten readability rules in test_location_conf_churn (#12059 ) ## Problem Checking the most recent state of pageservers was insufficient to evaluate whether another pageserver may read in a particular generation, since the latest state might mask some earlier AttachedSingle state. Related: https://github.com/neondatabase/neon/issues/11348 ## Summary of changes - Maintain a history of all attachments - Write out explicit rules for when a pageserver may read	2025-05-30 11:18:01 +00:00
Alex Chi Z.	af429b4a62	feat(pageserver): observability for feature flags (#12034 ) ## Problem Part of #11813. This pull request adds misc observability improvements for the functionality. ## Summary of changes * Info span for the PostHog feature background loop. * New evaluate feature flag API. * Put the request error into the error message. * Log when feature flag gets updated. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-05-30 08:02:25 +00:00
Gleb Novikov	3b4d4eb535	fast_import.rs: log number of jobs for pg_dump/pg_restore (#12068 ) ## Problem I have a hypothesis that import might be using a lower number of jobs than the max for the VM where the job is running. This change will help find this out from the logs. ## Summary of changes Added logging of the number of jobs, which is passed into both `pg_dump` and `pg_restore`.	2025-05-29 18:25:42 +00:00
Arpad Müller	f060537a31	Add safekeeper reconciler metrics (#12062 ) Adds two metrics to the storcon that are related to the safekeeper reconciler: * `storage_controller_safkeeper_reconciles_queued` to indicate current queue depth * `storage_controller_safkeeper_reconciles_complete` to indicate the number of complete reconciles Both metrics operate on a per-safekeeper basis (as reconcilers run on a per-safekeeper basis too). These metrics mirror the `storage_controller_pending_reconciles` and `storage_controller_reconcile_complete` metrics, although those are not scoped on a per-pageserver basis but are global for the entire storage controller. Part of #11670	2025-05-29 14:07:33 +00:00
Vlad Lazar	8a6fc6fd8c	pageserver: hook importing timelines up into disk usage eviction (#12038 ) ## Problem Disk usage eviction isn't sensitive to layers of imported timelines. ## Summary of changes Hook importing timelines up into eviction and add a test for it. I don't think we need any special eviction logic for this. These layers will all be visible and their access time will be their creation time. Hence, we'll remove covered layers first and get to the imported layers if there's still disk pressure.	2025-05-29 13:01:10 +00:00
Vlad Lazar	51639cd6af	pageserver: allow for deletion of importing timelines (#12033 ) ## Problem Importing timelines can't currently be deleted. This is problematic because: 1. Cplane cannot delete failed imports and we leave the timeline behind. 2. The flow does not support user driven cancellation of the import ## Summary of changes On the pageserver: I've taken the path of least resistance, extended `TimelineOrOffloaded` with a new variant and added handling in the right places. I'm open to thoughts here, but I think it turned out better than I was envisioning. On the storage controller: Again, fairly simple business: when a DELETE timeline request is received, we remove the import from the DB and stop any finalization tasks/futures. In order to stop finalizations, we track them in-memory. For each finalizing import, we associate a gate and a cancellation token. Note that we delete the entry from the database before cancelling any finalizations. This is such that a concurrent request can't progress the import into finalize state and race with the deletion. This concern about deleting an import with on-going finalization is theoretical in the near future. We are only going to delete importing timelines after the storage controller reports the failure to cplane. Alas, the design works for user driven cancellation too. Closes https://github.com/neondatabase/neon/issues/11897	2025-05-29 11:13:52 +00:00
devin-ai-integration[bot]	529d661532	storcon: skip offline nodes in get_top_tenant_shards (#12057 ) ## Summary The optimiser background loop could get delayed a lot by waiting for timeouts trying to talk to offline nodes. Fixes: #12056 ## Solution - Skip offline nodes in `get_top_tenant_shards` Link to Devin run: https://app.devin.ai/sessions/065afd6756734d33bbd4d012428c4b6e Requested by: John Spray (john@neon.tech) Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com> Co-authored-by: John Spray <john@neon.tech>	2025-05-29 11:07:09 +00:00
Alex Chi Z.	9e4cf52949	pageserver: reduce concurrency for gc-compaction (#12054 ) ## Problem Temporarily reduce the concurrency of gc-compaction to 1 job at a time. We are going to roll out in the largest AWS region next week. Having one job running at a time makes it easier to identify which tenant causes problems if it's not running well and to pause gc-compaction for that specific tenant. (We can make this configurable via pageserver config in the future!) ## Summary of changes Reduce `CONCURRENT_GC_COMPACTION_TASKS` from 2 to 1. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-05-29 09:32:19 +00:00
Arpad Müller	831f2a4ba7	Fix flakiness of test_storcon_create_delete_sk_down (#12040 ) The `test_storcon_create_delete_sk_down` test is still flaky. This patch addresses two possible causes of flakiness. Both causes are related to deletion racing with `pull_timeline` which hasn't finished yet. * the first cause is timeline deletion racing with `pull_timeline`: * the first deletion attempt doesn't contain the line because the timeline doesn't exist yet * the subsequent deletion attempts don't contain it either, only a note that the timeline is already deleted. * so this patch adds the note that the timeline is already deleted to the regex * the second cause is about tenant deletion racing with `pull_timeline`: * there were no tenant specific tombstones so if a tenant was deleted, we only added tombstones for the specific timelines being deleted, not for the tenant itself. * This patch changes this, so we now have tenant specific tombstones as well as timeline specific ones, and creation of a timeline checks both. * we also don't see any retries of the tenant deletion in the logs. once it's done it's done. so extend the regex to contain the tenant deletion message as well. One could wonder why the regex and why not using the API to check whether the timeline is just "gone". The issue with the API is that it doesn't allow one to distinguish between "deleted" and "has never existed", and the latter case might race with `pull_timeline`. I.e. the second case flakiness helped in the discovery of a real bug (no tenant tombstones), so the more precise check was helpful. Before, I could easily reproduce 2-9 occurrences of flakiness when running the test with an additional `range(128)` parameter (i.e. 128 times 4 times). With this patch, I ran it three times, not a single failure. Fixes #11838	2025-05-28 18:20:38 +00:00
Vlad Lazar	eadabeddb8	pageserver: use the same job size throughout the import lifetime (#12026 ) ## Problem Import planning takes a job size limit as its input. Previously, the job size came from a pageserver config field. This field may change while imports are in progress. If this happens, plans will no longer be identical and the import would fail permanently. ## Summary of Changes Bake the job size into the import progress reported to the storage controller. For new imports, use the value from the pageserver config, and, for existing imports, use the value present in the shard progress. This value is identical for all shards, but we want it to be versioned since future versions of the planner might split the jobs up differently. Hence, it ends up in `ShardImportProgress`. Closes https://github.com/neondatabase/neon/issues/11983	2025-05-28 15:19:41 +00:00
Alex Chi Z.	67ddf1de28	feat(pageserver): create image layers at L0-L1 boundary (#12023 ) ## Problem Previous attempt https://github.com/neondatabase/neon/pull/10548 caused some issues in staging and we reverted it. This is a re-attempt to address https://github.com/neondatabase/neon/issues/11063. Currently we create image layers at latest record LSN. We would create "future image layers" (i.e., image layers with LSN larger than disk consistent LSN) that need special handling at startup. We also waste a lot of read operations to reconstruct from L0 layers while we could have compacted all of the L0 layers and operate on a flat level of historic layers. ## Summary of changes * Run repartition at L0-L1 boundary. * Roll out with feature flags. * Piggyback a change that downgrades "image layer creating below gc_cutoff" to debug level. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-05-28 07:00:52 +00:00
Nikita Kalyanov	541fcd8d2f	chore: expose new mark_invisible API in openAPI spec for use in cplane (#12032 ) ## Problem There is a new API that I plan to use. We generate client from the spec so it should be in the spec ## Summary of changes Document the existing API in openAPI format	2025-05-28 03:39:59 +00:00
Suhas Thalanki	e77961c1c6	background worker that collects installed extensions (#11939 ) ## Problem Currently, we collect metrics of what extensions are installed on computes at start up time. We do not have a mechanism that does this at runtime. ## Summary of changes Added a background thread that queries all DBs at regular intervals and collects a list of installed extensions.	2025-05-27 19:40:51 +00:00
Tristan Partin	cdfa06caad	Remove test-images compatibility hack for confirming library load paths (#11927 ) This hack was needed for compatibility tests, but is no longer needed after the compute release. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-05-27 17:33:16 +00:00
Alex Chi Z.	f0bb93a9c9	feat(pageserver): support evaluate boolean flags (#12024 ) ## Problem Part of https://github.com/neondatabase/neon/issues/11813 ## Summary of changes * Support evaluate boolean flags. * Add docs on how to handle errors. * Add test cases based on real PostHog config. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-05-27 14:29:15 +00:00
Vlad Lazar	30adf8e2bd	pageserver: add tracing spans for time spent in batch and flushing (#12012 ) ## Problem We have some gaps in our traces. This indicates missing spans. ## Summary of changes This PR adds two new spans: * WAIT_EXECUTOR: time a batched request spends in the batch waiting to be picked up * FLUSH_RESPONSE: time a get page request spends flushing the response to the compute ![image](https://github.com/user-attachments/assets/41b3ddb8-438d-4375-9da3-da341fc0916a)	2025-05-27 13:57:53 +00:00
Erik Grinaker	5d538a9503	page_api: tweak errors (#12019 ) ## Problem The page API gRPC errors need a few tweaks to integrate better with the GetPage machinery. Touches https://github.com/neondatabase/neon/issues/11728. ## Summary of changes * Add `GetPageStatus::InternalError` for internal server errors. * Rename `GetPageStatus::Invalid` to `InvalidRequest` for clarity. * Rename `status` and `GetPageStatus` to `status_code` and `GetPageStatusCode`. * Add an `Into<tonic::Status>` implementation for `ProtocolError`.	2025-05-27 12:06:51 +00:00
Arpad Müller	f3976e5c60	remove safekeeper_proto_version = 3 from tests (#12020 ) Some tests still explicitly specify version 3 of the safekeeper walproposer protocol. Remove the explicit opt in from the tests as v3 is the default now since #11518. We don't touch the places where a test exercises both v2 and v3. Those we leave for #12021. Part of https://github.com/neondatabase/neon/issues/10326	2025-05-27 11:32:15 +00:00
Vlad Lazar	9657fbc194	pageserver: add and stabilize import chaos test (#11982 ) ## Problem Test coverage of timeline imports is lacking. ## Summary of changes This PR adds a chaos import test. It runs an import while injecting various chaos events in the environment. All the commits that follow the test fix various issues that were surfaced by it. Closes https://github.com/neondatabase/neon/issues/10191	2025-05-27 09:52:59 +00:00
a-masterov	dd501554c9	add a script to run the test for online-advisor as a regular user. (#12017 ) ## Problem The regression test for the extension online_advisor fails on the staging instance due to a lack of permission to alter the database. ## Summary of changes A script was added to work around this problem. --------- Co-authored-by: Alexander Lakhin <alexander.lakhin@neon.tech>	2025-05-27 08:54:59 +00:00
github-actions[bot]	54433c0839	Proxy release 2025-05-27 06:01 UTC	2025-05-27 06:01:30 +00:00
Tristan Partin	fe1513ca57	Add neon.safekeeper_conninfo_options GUC (#11901 ) In order to enable TLS connections between computes and safekeepers, we need to provide the control plane with a way to configure the various libpq keyword parameters, sslmode and sslrootcert. neon.safekeepers is a comma separated list of safekeepers formatted as host:port, so isn't available for extension in the same way that neon.pageserver_connstring is. This could be remedied in a future PR. Part-of: https://github.com/neondatabase/cloud/issues/25823 Link: https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-PARAMKEYWORDS Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-05-27 02:21:24 +00:00
Arpad Müller	3e86008e66	read-only timelines (#12015 ) Support timeline creations on the storage controller to opt out from their creation on the safekeepers, introducing the read-only timelines concept. Read only timelines: * will never receive WAL of their own, so it's fine to not create them on the safekeepers * the property is non-transitive. children of read-only timelines aren't necessarily read-only themselves. This feature can be used for snapshots, to prevent the safekeepers from being overloaded by empty timelines that won't ever get written to. In the current world, this is not a problem, because timelines are created implicitly by the compute connecting to a safekeeper that doesn't have the timeline yet. In the future however, where the storage controller creates timelines eagerly, we should watch out for that. We represent read-only timelines in the storage controller database so that we ensure that they never touch the safekeepers at all. Especially we don't want them to cause a mess during the importing process of the timelines from the cplane to the storcon database. In a hypothetical future where we have a feature to detach timelines from safekeepers, we'll either need to find a way to distinguish the two, or if not, asking safekeepers to list the (empty) timeline prefix and delete everything from it isn't a big issue either. This patch will unconditionally hit the new safekeeper timeline creation path for read-only timelines, without them needing the `--timelines-onto-safekeepers` flag enabled. This is done because it's lower risk (no safekeepers or computes involved at all) and gives us some initial way to verify at least some parts of that code in prod. https://github.com/neondatabase/cloud/issues/29435 https://github.com/neondatabase/neon/issues/11670	2025-05-26 23:23:58 +00:00
Lassi Pölönen	23fc611461	Add metadata to pgaudit log logline (#11933 ) Previously we were using project-id/endpoint-id as SYSLOGTAG, which has a limit of 32 characters, so the endpoint-id got truncated. The output is now in RFC5424 format, where the message is json encoded with additional metadata `endpoint_id` and `project_id` Also as pgaudit logs multiline messages, we now detect this by parsing the timestamp in the specific format, and consider non-matching lines to belong in the previous log message. Using syslog structured-data would be an alternative, but leaning towards json due to being somewhat more generic.	2025-05-26 14:57:09 +00:00
Alex Chi Z.	dc953de85d	feat(pageserver): integrate PostHog with gc-compaction rollout (#11917 ) ## Problem part of https://github.com/neondatabase/neon/issues/11813 ## Summary of changes * Integrate feature store with tenant structure. * gc-compaction picks up the current strategy from the feature store. * We only log them for now for testing purposes. They will not be used until we have more patches to support different strategies defined in PostHog. * We don't support property-based evaluation for now; it will be implemented later. * Evaluating result of the feature flag is not cached -- it's not efficient and cannot be used on hot path right now. * We don't report the evaluation result back to PostHog right now. I plan to enable it in staging once we get the patch merged. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-05-26 13:09:37 +00:00
Alex Chi Z.	841517ee37	fix(pageserver): do not increase basebackup err counter when reconnect (#12016 ) ## Problem We see unexpected basebackup error alerts in the alert channel. https://github.com/neondatabase/neon/pull/11778 only fixed the alerts for shutdown errors. However, another path is that tenant shutting down while waiting LSN -> WaitLsnError::BadState -> QueryError::Reconnect. Therefore, the reconnect error should also be discarded from the ok/error counter. ## Summary of changes Do not increase ok/err counter for reconnect errors. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-05-26 11:31:27 +00:00
a-masterov	1369d73dcd	Add h3 to neon-extensions-test (#11946 ) ## Problem We didn't test the h3 extension in our test suite. ## Summary of changes Added tests for h3 and h3-postgis extensions Includes upgrade test for h3 --------- Co-authored-by: Tristan Partin <tristan@neon.tech>	2025-05-26 11:29:39 +00:00
Erik Grinaker	7cd0defaf0	page_api: add Rust domain types (#11999 ) ## Problem For the gRPC Pageserver API, we should convert the Protobuf types to stricter, canonical Rust types. Touches https://github.com/neondatabase/neon/issues/11728. ## Summary of changes Adds Rust domain types that mirror the Protobuf types, with conversion and validation.	2025-05-26 11:01:36 +00:00
Erik Grinaker	a082f9814a	pageserver: add gRPC authentication (#12010 ) ## Problem We need authentication for the gRPC server. Requires #11972. Touches #11728. ## Summary of changes Add two request interceptors that decode the tenant/timeline/shard metadata and authenticate the JWT token against them.	2025-05-26 10:24:45 +00:00
Erik Grinaker	ec991877f4	pageserver: add gRPC server (#11972 ) ## Problem We want to expose the page service over gRPC, for use with the communicator. Requires #11995. Touches #11728. ## Summary of changes This patch wires up a gRPC server in the Pageserver, using Tonic. It does not yet implement the actual page service. * Adds `listen_grpc_addr` and `grpc_auth_type` config options (disabled by default). * Enables gRPC by default with `neon_local`. * Stub implementation of `page_api.PageService`, returning unimplemented errors. * gRPC reflection service for use with e.g. `grpcurl`. Subsequent PRs will implement the actual page service, including authentication and observability. Notably, TLS support is not yet implemented. Certificate reloading requires us to reimplement the entire Tonic gRPC server.	2025-05-26 08:27:48 +00:00
Tristan Partin	abc6c84262	Update sql_exporter to 0.17.3 (#12013 ) Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-05-23 20:16:13 +00:00
Conrad Ludgate	6768a71c86	proxy(tokio-postgres): refactor typeinfo query to occur earlier (#11993 ) ## Problem For #11992 I realised we need to get the type info before executing the query. This is important to know how to decode rows with custom types, eg the following query: ```sql CREATE TYPE foo AS ENUM ('foo','bar','baz'); SELECT ARRAY['foo'::foo, 'bar'::foo, 'baz'::foo] AS data; ``` Getting that to work was harder than it seems. The original tokio-postgres setup has a split between `Client` and `Connection`, where messages are passed between. Because multiple clients were supported, each client message included a dedicated response channel. Each request would be terminated by the `ReadyForQuery` message. The flow I opted to use for parsing types early would not trigger a `ReadyForQuery`. The flow is as follows: ``` PARSE "" // parse the user provided query DESCRIBE "" // describe the query, returning param/result type oids FLUSH // force postgres to flush the responses early // wait for descriptions // check if we know the types, if we don't then // setup the typeinfo query and execute it against each OID: PARSE typeinfo // prepare our typeinfo query DESCRIBE typeinfo FLUSH // force postgres to flush the responses early // wait for typeinfo statement // for each OID we don't know: BIND typeinfo EXECUTE FLUSH // wait for type info, might reveal more OIDs to inspect // close the typeinfo query, we cache the OID->type map and this is kinder to pgbouncer. CLOSE typeinfo // finally once we know all the OIDs: BIND "" // bind the user provided query - already parsed - to the user provided params EXECUTE // run the user provided query SYNC // commit the transaction ``` ## Summary of changes Please review commit by commit. The main challenge was allowing one query to issue multiple sub-queries. To do this I first made sure that the client could fully own the connection, which required removing any shared client state. 
I then had to replace the way responses are sent to the client, by using only a single permanent channel. This required some additional effort to track which query is being processed. Lastly I had to modify the query/typeinfo functions to not issue `sync` commands, so it would fit into the desired flow above. To note: the flow above does force an extra roundtrip into each query. I don't know yet if this has a measurable latency overhead.	2025-05-23 19:41:12 +00:00
Peter Bendel	87fc0a0374	periodic pagebench on hetzner runners (#11963 ) ## Problem - Benchmark periodic pagebench had inconsistent benchmarking results even when run with the same commit hash. Hypothesis is this was due to running on dedicated but virtualized EC2 instance with varying CPU frequency. - the dedicated instance type used for the benchmark is quite "old" and we increasingly get `An error occurred (InsufficientInstanceCapacity) when calling the StartInstances operation (reached max retries: 2): Insufficient capacity.` - periodic pagebench uses a snapshot of pageserver timelines to have the same layer structure in each run and get consistent performance. Re-creating the snapshot was a painful manual process (see https://github.com/neondatabase/cloud/issues/27051 and https://github.com/neondatabase/cloud/issues/27653) ## Summary of changes - Run the periodic pagebench on a custom hetzner GitHub runner with large nvme disk and governor set to defined perf profile - provide a manual dispatch option for the workflow that allows creating a new snapshot - keep the manual dispatch option to specify a commit hash useful for bisecting regressions - always use the newest created snapshot (S3 bucket uses date suffix in S3 key, example `s3://neon-github-public-dev/performance/pagebench/shared-snapshots-2025-05-17/`) - `--ignore` `test_runner/performance/pageserver/pagebench/test_pageserver_max_throughput_getpage_at_latest_lsn.py` in regular benchmarks run for each commit - improve perf copying snapshot by using `cp` subprocess instead of traversing tree in python ## Example runs with code in this PR: - run which creates new snapshot https://github.com/neondatabase/neon/actions/runs/15083408849/job/42402986376#step:19:55 - run which uses latest snapshot - https://github.com/neondatabase/neon/actions/runs/15084907676/job/42406240745#step:11:65	2025-05-23 09:37:19 +00:00
Erik Grinaker	06ce704041	Cargo.toml: upgrade Tonic to 0.13.1 (#11995 ) ## Problem We're about to implement a gRPC interface for Pageserver. Let's upgrade Tonic first, to avoid a more painful migration later. It's currently only used by storage-broker. Touches #11728. ## Summary of changes Upgrade Tonic 0.12.3 → 0.13.1. Also opportunistically upgrade Prost 0.13.3 → 0.13.5. This transitively pulls in Indexmap 2.0.1 → 2.9.0, but it doesn't appear to be used in any particularly critical code paths.	2025-05-23 08:57:35 +00:00
Konstantin Knizhnik	d5023f2b89	Restrict pump prefetch state only to regular backends (#12000 ) ## Problem See https://github.com/neondatabase/neon/issues/11997 This guard prevents a race condition with pump prefetch state (initiated by timeout). An assert checks that prefetching is also done under the guard. But prewarm knows nothing about it. ## Summary of changes Pump prefetch state only in regular backends. Prewarming is done by background workers now. Also, it does not seem to make sense to pump prefetch state in any other background workers (parallel executors, vacuum, ...) because they are short-lived and cannot leave unconsumed responses in the socket. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-05-23 08:48:06 +00:00
Konstantin Knizhnik	8ff25dca8e	Add online_advisor extension (#11898 ) ## Problem Detect problems with Postgres optimiser: lack of indexes and statistics ## Summary of changes https://github.com/knizhnik/online_advisor Add online_advisor extension to docker image --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-05-23 05:08:32 +00:00
Alexey Kondratov	cf81330fbc	fix(compute_ctl): Wait for rsyslog longer and with backoff (#12002 ) ## Problem https://github.com/neondatabase/neon/pull/11988 waits only for max ~200ms, so we still see failures, which self-resolve after several operation retries. ## Summary of changes Change it to waiting for at least 5 seconds, starting with 2 ms sleep between iterations and x2 sleep on each next iteration. It could be that it's not a problem with a slow `rsyslog` start, but a longer wait won't hurt. If it won't start, we should debug why `inittab` doesn't start it, or maybe there is another problem.	2025-05-22 19:15:05 +00:00
Anastasia Lubennikova	e69ae739ff	fix(compute_ctl): fix rsyslogd restart race. (#11988 ) Add retry loop around waiting for rsyslog start ## Problem ## Summary of changes --------- Co-authored-by: Konstantin Knizhnik <knizhnik@garret.ru> Co-authored-by: Matthias van de Meent <matthias@neon.tech> Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-05-22 15:20:50 +00:00
Dmitrii Kovalkov	136eaeb74a	pageserver: basebackup cache (hackathon project) (#11989 ) ## Problem Basebackup cache is on the hot path of compute startup and is generated on every request (may be slow). - Issue: https://github.com/neondatabase/cloud/issues/29353 ## Summary of changes - Add `BasebackupCache` which stores basebackups on local disk. - Basebackup prepare requests are triggered by `XLOG_CHECKPOINT_SHUTDOWN` records in the log. - Limit the size of the cache by number of entries. - Add `basebackup_cache_enabled` feature flag to TenantConfig. - Write tests for the cache ## Not implemented yet - Limit the size of the cache by total size in bytes --------- Co-authored-by: Aleksandr Sarantsev <aleksandr@neon.tech>	2025-05-22 12:45:00 +00:00
Erik Grinaker	211b824d62	pageserver: add branch-local consumption metrics (#11852 ) ## Problem For billing, we'd like per-branch consumption metrics. Requires https://github.com/neondatabase/neon/pull/11984. Resolves https://github.com/neondatabase/cloud/issues/28155. ## Summary of changes This patch adds two new consumption metrics: * `written_size_since_parent`: `written_size - ancestor_lsn` * `pitr_history_size_since_parent`: `written_size - max(pitr_cutoff, ancestor_lsn)` Note that `pitr_history_size_since_parent` will not be emitted until the PITR cutoff has been computed, and may or may not increase ~immediately when a user increases their PITR window (depending on how much history we have available and whether the tenant is restarted/migrated).	2025-05-22 12:26:32 +00:00
Peter Bendel	f9fdbc9618	remove auth_endpoint password from log and command line for local proxy mode (#11991 ) ## Problem When testing local proxy the auth-endpoint password shows up in command line and log ```bash RUST_LOG=proxy LOGFMT=text cargo run --release --package proxy --bin proxy --features testing -- \ --auth-backend postgres \ --auth-endpoint 'postgresql://postgres:secret_password@127.0.0.1:5432/postgres' \ --tls-cert server.crt \ --tls-key server.key \ --wss 0.0.0.0:4444 ``` ## Summary of changes - Allow to set env variable PGPASSWORD - fall back to use PGPASSWORD env variable when auth-endpoint does not contain password - remove auth-endpoint password from logs in `--features testing` mode Example ```bash export PGPASSWORD=secret_password RUST_LOG=proxy LOGFMT=text cargo run --package proxy --bin proxy --features testing -- \ --auth-backend postgres \ --auth-endpoint 'postgresql://postgres@127.0.0.1:5432/postgres' \ --tls-cert server.crt \ --tls-key server.key \ --wss 0.0.0.0:4444 ```	2025-05-21 20:26:05 +00:00
Erik Grinaker	95a5f749c8	pageserver: use an `Option` for `GcCutoffs::time` (#11984 ) ## Problem It is not currently possible to disambiguate a timeline with an uninitialized PITR cutoff from one that was created within the PITR window -- both of these have `GcCutoffs::time == Lsn(0)`. For billing metrics, we need to disambiguate these to avoid accidentally billing the entire history when a tenant is initially loaded. Touches https://github.com/neondatabase/cloud/issues/28155. ## Summary of changes Make `GcCutoffs::time` an `Option<Lsn>`, and only set it to `Some` when initialized. A `pitr_interval` of 0 will yield `Some(last_record_lsn)`. This PR takes a conservative approach, and mostly retains the old behavior of consumers by using `unwrap_or_default()` to yield 0 when uninitialized, to avoid accidentally introducing bugs -- except in cases where there is high confidence that the change is beneficial (e.g. for the `pageserver_pitr_history_size` Prometheus metric and to return early during GC).	2025-05-21 15:42:11 +00:00
Konstantin Merenkov	5db20af8a7	Keep the conn info cache on max_client_conn from pgbouncer (#11986 ) ## Problem Hitting max_client_conn from pgbouncer would lead to invalidation of the conn info cache. Customers would hit the limit on wake_compute. ## Summary of changes `should_retry_wake_compute` detects this specific error from pgbouncer as non-retriable, meaning we won't try to wake up the compute again.	2025-05-21 15:27:30 +00:00
Arpad Müller	136cf1979b	Add metric for number of offloaded timelines (#11976 ) We want to keep track of the number of offloaded timelines. It's a per-tenant shard metric because each shard makes offloading decisions on its own.	2025-05-21 11:28:22 +00:00
Vlad Lazar	08bb72e516	pageserver: allow in-mem reads to be planned during writes (#11937 ) ## Problem Get page tracing revealed situations where planning an in-memory layer is taking around 150ms. Upon investigation, the culprit is the inner in-mem layer file lock. A batch being written holds the write lock and a read being planned wants the read lock. See [this trace](https://neonprod.grafana.net/explore?schemaVersion=1&panes=%7B%22j61%22:%7B%22datasource%22:%22JMfY_5TVz%22,%22queries%22:%5B%7B%22refId%22:%22traceId%22,%22queryType%22:%22traceql%22,%22query%22:%22412ec4522fe1750798aca54aec2680ac%22,%22datasource%22:%7B%22type%22:%22tempo%22,%22uid%22:%22JMfY_5TVz%22%7D,%22limit%22:20,%22tableType%22:%22traces%22,%22metricsQueryType%22:%22range%22%7D%5D,%22range%22:%7B%22to%22:%221746702606349%22,%22from%22:%221746681006349%22%7D,%22panelsState%22:%7B%22trace%22:%7B%22spanId%22:%2291e9f1879c9bccc0%22%7D%7D%7D,%226d0%22:%7B%22datasource%22:%22JMfY_5TVz%22,%22queries%22:%5B%7B%22refId%22:%22traceId%22,%22queryType%22:%22traceql%22,%22query%22:%2220a4757706b16af0e1fbab83f9d2e925%22,%22datasource%22:%7B%22type%22:%22tempo%22,%22uid%22:%22JMfY_5TVz%22%7D,%22limit%22:20,%22tableType%22:%22traces%22,%22metricsQueryType%22:%22range%22%7D%5D,%22range%22:%7B%22to%22:%221746702614807%22,%22from%22:%221746681014807%22%7D,%22panelsState%22:%7B%22trace%22:%7B%22spanId%22:%2260e7825512bc2a6b%22%7D%7D%7D%7D) for example. ## Summary of changes Lift the index into its own RwLock such that we can at least plan during write IO. I tried to be smarter in https://github.com/neondatabase/neon/pull/11866: arc swap + structurally shared datastructure and that killed ingest perf for small keys. ## Benchmarking * No statistically significant difference for rust ingest benchmarks when compared to main.	2025-05-21 11:08:49 +00:00
Alexander Sarantcev	6f4f3691a5	pageserver: Add tracing endpoint correctness check in config validation (#11970 ) ## Problem When using an incorrect endpoint string - `"localhost:4317"`, it's a runtime error, but it can be a config error - Closes: https://github.com/neondatabase/neon/issues/11394 ## Summary of changes Add config parse time check via `request::Url::parse` validation. --------- Co-authored-by: Aleksandr Sarantsev <ephemeralsad@gmail.com>	2025-05-21 09:03:26 +00:00
dependabot[bot]	a2b756843e	chore(deps): bump setuptools from 70.0.0 to 78.1.1 in the pip group across 1 directory (#11977 ) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-05-20 23:00:49 +00:00
Conrad Ludgate	f3c9d0adf4	proxy(logging): significant changes to json logging internals for performance. (#11974 ) #11962 Please review each commit separately. Each commit is rather small in goal. The overall goal of this PR is to keep the behaviour identical, but shave away small inefficiencies here and there.	2025-05-20 17:57:59 +00:00
Konstantin Knizhnik	2e3dc9a8c2	Add rel_size_replica_cache (#11889 ) ## Problem See Discussion: https://neondb.slack.com/archives/C033RQ5SPDH/p1746645666075799 Issue: https://github.com/neondatabase/cloud/issues/28609 Relation size cache is not correctly updated at PS in case of replicas. ## Summary of changes 1. Have two caches for relation size in timeline: `rel_size_primary_cache` and `rel_size_replica_cache`. 2. `rel_size_primary_cache` is actually what we have now. The only difference is that it is not updated in `get_rel_size`, only by WAL ingestion. 3. `rel_size_replica_cache` has limited size (LruCache) and its key is `(Lsn,RelTag)`. It is updated in `get_rel_size`. Only strict LSN matches are accepted as cache hits. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-05-20 15:38:27 +00:00
Konstantin Merenkov	568779fa8a	proxy/scram: avoid memory copy to improve performance (#11980 ) Touches #11941 ## Problem Performance of our PBKDF2 was worse than reference. ## Summary of changes Avoided memory copy when HMACing in a tight loop.	2025-05-20 15:23:54 +00:00
Alexey Kondratov	e94acbc816	fix(compute_ctl): Dollar escaping and tests (#11969 ) ## Problem In the escaping path we were checking that `${tag}$` or `${outer_tag}$` are present in the string, but that's not enough, as the original string surrounded by `$` can also form a 'tag', like `$x$xx$x$`, which is fine on its own, but cannot be used in the string escaped with `$xx$`. ## Summary of changes Remove `$` from the checks, just check if `{tag}` or `{outer_tag}` are present. Add more test cases and change the catalog test to stress the `drop_subscriptions_before_start: true` path as well. Fixes https://github.com/neondatabase/cloud/issues/29198	2025-05-20 09:03:36 +00:00
github-actions[bot]	40bb9ff62a	Proxy release 2025-05-20 06:01 UTC	2025-05-20 06:01:25 +00:00
Erik Grinaker	f4150614d0	pageserver: don't pass config to `PageHandler` (#11973 ) ## Problem The gRPC page service API will require decoupling the `PageHandler` from the libpq protocol implementation. As preparation for this, avoid passing in the entire server config to `PageHandler`, and instead explicitly pass in the relevant fields. Touches https://github.com/neondatabase/neon/issues/11728. ## Summary of changes * Change `PageHandler` to take a `GetVectoredConcurrentIo` instead of the entire config. * Change `IoConcurrency::spawn_from_conf` to take a `GetVectoredConcurrentIo`.	2025-05-19 15:47:40 +00:00
Erik Grinaker	38dbc5f67f	pageserver/page_api: add binary Protobuf descriptor (#11968 ) ## Problem A binary Protobuf schema descriptor can be used to expose an API reflection service, which in turn allows convenient usage of e.g. `grpcurl` against the gRPC server. Touches #11728. ## Summary of changes * Generate a binary schema descriptor as `pageserver_page_api::proto::FILE_DESCRIPTOR_SET`. * Opportunistically rename the Protobuf package from `page_service` to `page_api`.	2025-05-19 11:17:45 +00:00
Folke Behrens	3685ad606d	endpoint_storage: Fix metrics test by excluding assertion on macos (#11952 )	2025-05-19 10:56:03 +00:00
Ivan Efremov	76a7d37f7e	proxy: Drop cancellation ops if they don't fit into the queue (#11950 ) Add a redis ops batch size argument for proxy and remove timeouts by using try_send()	2025-05-19 10:10:55 +00:00
Erik Grinaker	cdb6479c8a	pageserver: add gRPC page service schema (#11815 ) ## Problem For the [communicator project](https://github.com/neondatabase/company_projects/issues/352), we want to move to gRPC for the page service protocol. Touches #11728. ## Summary of changes This patch adds an experimental gRPC Protobuf schema for the page service. It is equivalent to the current page service, but with several improvements, e.g.: * Connection multiplexing. * Reduced head-of-line blocking. * Client-side batching. * Explicit tenant shard routing. * GetPage request classification (normal vs. prefetch). * Explicit rate limiting ("slow down" response status). The API is exposed as a new `pageserver/page_api` package. This is separate from the `pageserver_api` package to reduce the dependency footprint for the communicator. The longer-term plan is to also split out e.g. the WAL ingestion service to a separate gRPC package, e.g. `pageserver/wal_api`. Subsequent PRs will: add Rust domain types for the Protobuf types, expose a gRPC server, and implement the page service. Preliminary prototype benchmarks of this gRPC API is within 10% of baseline libpq performance. We'll do further benchmarking and optimization as the implementation lands in `main` and is deployed to staging.	2025-05-19 09:03:06 +00:00
Konstantin Knizhnik	81c557d87e	Unlogged build get smgr (#11954 ) ## Problem See https://github.com/neondatabase/neon/issues/11910 and https://neondb.slack.com/archives/C04DGM6SMTM/p1747314649059129 ## Summary of changes Do not change persistence in `start_unlogged_build` Postgres PRs: https://github.com/neondatabase/postgres/pull/642 https://github.com/neondatabase/postgres/pull/641 https://github.com/neondatabase/postgres/pull/640 https://github.com/neondatabase/postgres/pull/639 --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-05-18 05:02:47 +00:00
Trung Dinh	e963129678	pagesteam_handle_batched_message -> pagestream_handle_batched_message (#11916 ) ## Problem Found a typo in code. ## Summary of changes Co-authored-by: Trung Dinh <tdinh@roblox.com> Co-authored-by: Erik Grinaker <erik@neon.tech>	2025-05-17 22:30:29 +00:00
dependabot[bot]	4f0a9fc569	chore(deps): bump flask-cors from 5.0.0 to 6.0.0 in the pip group across 1 directory (#11960 ) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-05-17 22:06:32 +00:00
Emmanuel Ferdman	81c6a5a796	Migrate to correct logger interface (#11956 ) ## Problem Currently the `logger` library throws annoying deprecation warnings: ```python DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead ``` ## Summary of changes This small PR resolves the annoying deprecation warnings by migrating to `.warning` as suggested. Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>	2025-05-17 21:12:01 +00:00
Konstantin Knizhnik	8e05639dbf	Invalidate LFC after unlogged build (#11951 ) ## Problem See https://neondb.slack.com/archives/C04DGM6SMTM/p1747391617951239 LFC is not always properly updated during unlogged build so it can contain stale content. ## Summary of changes Invalidate LFC content at the end of unlogged build Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-05-17 19:06:59 +00:00
Alexander Bayandin	deed46015d	CI(test-images): increase timeout from 20m to 60m (#11955 ) ## Problem For some reason (unknown yet) 20m timeout is not enough for `test-images` job on arm runners. Ref: https://github.com/neondatabase/neon/actions/runs/15075321681/job/42387530399?pr=11953 ## Summary of changes - Increase the timeout from 20m to 1h	2025-05-17 06:34:54 +00:00
Heikki Linnakangas	532d9b646e	Add simple facility for an extendable shared memory area (#11929 ) You still need to provide a max size up-front, but memory is only allocated for the portion that is in use. The module is currently unused, but will be used by the new compute communicator project, in the neon Postgres extension. See https://github.com/neondatabase/neon/issues/11729 --------- Co-authored-by: Erik Grinaker <erik@neon.tech>	2025-05-16 21:22:36 +00:00
Heikki Linnakangas	55f91cf10b	Update 'nix' package (#11948 ) There were some incompatible changes. Most churn was from switching from the now-deprecated fcntl:flock() function to fcntl::Flock::lock(). The new function returns a guard object, while with the old function, the lock was associated directly with the file descriptor. It's good to stay up-to-date in general, but the impetus to do this now is that in https://github.com/neondatabase/neon/pull/11929, I want to use some functions that were added only in the latest version of 'nix', and it's nice to not have to build multiple versions. (Although, different versions of 'nix' are still pulled in as indirect dependencies from other packages)	2025-05-16 14:45:08 +00:00
Folke Behrens	baafcc5d41	proxy: Fix misspelled flag value alias, swap names and aliases (#11949 ) ## Problem There's a misspelled flag value alias that's not really used anywhere. ## Summary of changes Fix the alias and make aliases the official flag values and keep old values as aliases. Also rename enum variant. No need for it to carry the version now.	2025-05-16 14:12:39 +00:00
Evan Fleming	aa22572d8c	safekeeper: refactor static remote storage usage to use Arc (#10179 ) Greetings! Please add `w=1` to github url when viewing diff (specifically `wal_backup.rs`) ## Problem This PR is aimed at addressing the remaining work of #8200. Namely, removing static usage of remote storage in favour of arc. I did not opt to pass `Arc<RemoteStorage>` directly since it is actually `Optional<RemoteStorage>` as it is not necessarily always configured. I wanted to avoid having to pass `Arc<Optional<RemoteStorage>>` everywhere with individual consuming functions likely needing to handle unwrapping. Instead I've added a `WalBackup` struct that holds `Optional<RemoteStorage>` and handles initialization/unwrapping RemoteStorage internally. wal_backup functions now take self and `Arc<WalBackup>` is passed as a dependency through the various consumers that need it. ## Summary of changes - Add `WalBackup` that holds `Optional<RemoteStorage>` and handles initialization and unwrapping - Modify wal_backup functions to take `WalBackup` as self (Add `w=1` to github url when viewing diff here) - Initialize `WalBackup` in safekeeper root - Store `Arc<WalBackup>` in `GlobalTimelineMap` and pass and store in each Timeline as loaded - use `WalBackup` through Timeline as needed ## Refs - task to remove global variables https://github.com/neondatabase/neon/issues/8200 - drive-by fixes https://github.com/neondatabase/neon/issues/11501 by turning the panic reported there into an error `remote storage not configured` --------- Co-authored-by: Christian Schwarz <christian@neon.tech>	2025-05-16 12:41:10 +00:00
Arpad Müller	2d247375b3	Update rust to 1.87.0 (#11938 ) We keep the practice of keeping the compiler up to date, pointing to the latest release. This is done by many other projects in the Rust ecosystem as well. The 1.87.0 release marks 10 years of Rust. [Announcement blog post](https://blog.rust-lang.org/2025/05/15/Rust-1.87.0/) Prior update was in #11431	2025-05-16 12:21:24 +00:00
Christian Schwarz	a7ce323949	benchmarking: extend `test_page_service_batching.py` to cover concurrent IO + batching under random reads (#10466 ) This PR commits the benchmarks I ran to qualify concurrent IO before we released it. Changes: - Add `l0stack` fixture; a reusable abstraction for creating a stack of L0 deltas each of which has 1 Value::Delta per page. - Such a stack of L0 deltas is a good and understandable demo for concurrent IO because to reconstruct any page, `$layer_stack_height` Values need to be read. Before concurrent IO, the reads were sequential. With concurrent IO, they are executed concurrently. - So, switch `test_latency` to use the l0stack. - Teach `pagebench`, which is used by `test_latency`, to limit itself to the blocks of the relation created by the l0stack abstraction. - Additional parametrization of `test_latency` over dimensions `ps_io_concurrency,l0_stack_height,queue_depth` - Use better names for the tests to reflect what they do, leave interpretation of the (now quite high-dimensional) results to the reader - `test_{throughput => postgres_seqscan}` - `test_{latency => random_reads}` - Cut down on permutations to those we use in production. Runtime is about 2min. Refs - concurrent IO epic https://github.com/neondatabase/neon/issues/9378 - batching task: fixes https://github.com/neondatabase/neon/issues/9837 --------- Co-authored-by: Peter Bendel <peterbendel@neon.tech>	2025-05-15 17:48:13 +00:00
Vlad Lazar	31026d5a3c	pageserver: support import schema evolution (#11935 ) ## Problem Imports don't support schema evolution nicely. If we want to change the stuff we keep in storcon, we'd have to carry the old cruft around. ## Summary of changes Version import progress. Note that the import progress version determines the version of the import job split and execution. This means that we can also use it as a mechanism for deploying new import implementations in the future.	2025-05-15 16:13:15 +00:00
Vlad Lazar	2621ce2daf	pageserver: checkpoint import progress in the storage controller (#11862 ) ## Problem Timeline imports do not have progress checkpointing. Any time that the tenant is shut-down, all progress is lost and the import restarts from the beginning when the tenant is re-attached. ## Summary of changes This PR adds progress checkpointing. ### Preliminaries The unit of work is a `ChunkProcessingJob`. Each `ChunkProcessingJob` deals with the import for a set of key ranges. The job split is done by using an estimation of how many pages each job will produce. The planning stage must be pure: given a fixed set of contents in the import bucket, it will always yield the same plan. This property is enforced by checking that the hash of the plan is identical when resuming from a checkpoint. The storage controller tracks the progress of each shard in the import in the database in the form of the latest job that has completed. ### Flow This is the high level flow for the happy path: 1. On the first run of the import task, the import task queries storcon for the progress and sees that none is recorded. 2. Execute the preparatory stage of the import 3. Import jobs start running concurrently in a `FuturesOrdered`. Every time the checkpointing threshold of jobs has been reached, notify the storage controller. 4. Tenant is detached and re-attached 5. Import task starts up again and gets the latest progress checkpoint from the storage controller in the form of a job index. 6. The plan is computed again and we check that the hash matches with the original plan. 7. Jobs are spawned from where the previous import task left off. Note that we will not report progress after the completion of each job, so some jobs might run twice. Closes https://github.com/neondatabase/neon/issues/11568 Closes https://github.com/neondatabase/neon/issues/11664	2025-05-15 13:18:22 +00:00
Vlad Lazar	a703cd342b	storage_controller: enforce generations in import upcalls (#11900 ) ## Problem Import up-calls did not enforce the usage of the latest generation. The import might have finished in a previous generation, but not in the latest one. Hence, the controller might try to activate a timeline before it is ready. In theory, that would be fine, but it's tricky to reason about. ## Summary of Changes Pageserver provides the current generation in the upcall to the storage controller and the latter validates the generation. If the generation is stale, we return an error which stops progress of the import job. Note that the import job will retry the upcall until the stale location is detached. I'll add some proper tests for this as part of the [checkpointing PR](https://github.com/neondatabase/neon/pull/11862). Closes https://github.com/neondatabase/neon/issues/11884	2025-05-15 10:02:11 +00:00
Alexander Bayandin	42e4cf18c9	CI(neon_extra_builds): fix workflow syntax (#11932 ) ## Problem ``` Error when evaluating 'strategy' for job 'build-pgxn'. neondatabase/neon/.github/workflows/build-macos.yml@7907a9e2bf898f3d22b98d9d4d2c6ffc4d480fc3 (Line: 45, Col: 27): Matrix vector 'postgres-version' does not contain any values ``` See https://github.com/neondatabase/neon/actions/runs/15039594216/job/42268015127?pr=11929 ## Summary of changes - Fix typo: `.chnages` -> `.changes` - Ensure JSON is JSON by moving step output to env variable	2025-05-15 09:53:59 +00:00
Alex Chi Z.	9e5a41a342	fix(scrubber): `remote_storage` error causes layers to be deleted as orphans (#11924 ) ## Problem close https://github.com/neondatabase/neon/issues/11159 ; we get occasional wrong deletions of layer files being used and errors in staging. This patch fixed it. Example errors: ``` Timeline metadata errors: ["index_part.json contains a layer .... (shard 0000) that is not present in remote storage (layer_is_l0: false) with error: Failed to download a remote file: s3 head object\n\nCaused by:\n 0: dispatch failure\n 1: timeout\n 2: error trying to connect: HTTP connect timeout occurred after 3.1s\n ``` This error should not be fired because the file could exist, but we cannot know if it exists due to head request failure. ## Summary of changes Only generate cannot find layer errors when the head_object return type is `NotFound`. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-05-15 07:02:16 +00:00
Konstantin Knizhnik	48b870bc07	Use unlogged build in GIST for storing root page (#11892 ) ## Problem See https://github.com/neondatabase/neon/issues/11891 Newly added assert is fired when the root page of a GIST index is written to disk as part of a sorted build. ## Summary of changes Wrap writing of root page in unlogged build. https://github.com/neondatabase/postgres/pull/632 https://github.com/neondatabase/postgres/pull/633 https://github.com/neondatabase/postgres/pull/634 --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-05-15 04:45:22 +00:00
Christian Schwarz	32a12783fd	pageserver: batching & concurrent IO: update binary-built-in defaults; reduce CI matrix (#11923 ) Use the current production config for batching & concurrent IO. Remove the permutation testing for unit tests from CI. (The pageserver unit test matrix takes ~10min for debug builds). Drive-by-fix use of `if cfg!(test)` inside crate `pageserver_api`. It is ineffective for early-enabling new defaults for pageserver unit tests only. The reason is that the `test` cfg is only set for the crate under test but not its dependencies. So, `cargo test -p pageserver` will build `pageserver_api` with `cfg!(test) == false`. Resort to checking for feature flag `testing` instead, since all our unit tests are run with `--feature testing`. refs - `scattered-lsn` batching has been implemented and rolled out in all envs, cf https://github.com/neondatabase/neon/issues/10765 - preliminary for https://github.com/neondatabase/neon/pull/10466 - epic https://github.com/neondatabase/neon/issues/9377 - epic https://github.com/neondatabase/neon/issues/9378 - drive-by fix https://neondb.slack.com/archives/C0277TKAJCA/p1746821515504219	2025-05-14 16:30:21 +00:00
a-masterov	68120cfa31	Fix Cloud Extensions Regression (#11907 ) ## Problem The regression test on extensions relied on the admin API to set the default endpoint settings, which is not stable and requires admin privileges. Specifically: - The workflow was using `default_endpoint_settings` to configure necessary PostgreSQL settings like `DateStyle`, `TimeZone`, and `neon.allow_unstable_extensions` - This approach was failing because the API endpoint for setting `default_endpoint_settings` was changed (referenced in a comment as issue #27108) - The admin API requires special privileges. ## Summary of changes We get rid of the admin API dependency and use ALTER DATABASE statements instead: Removed the default_endpoint_settings mechanism: - Removed the default_endpoint_settings input parameter from the neon-project-create action - Removed the API call that was attempting to set these settings at the project level - Completely removed the default_endpoint_settings configuration from the cloud-extensions workflow Added database-level settings: - Created a new `alter_db.sh` script that applies the same settings directly to each test database - Modified all extension test scripts to call this script after database creation	2025-05-14 13:19:53 +00:00
github-actions[bot]	4688b815b1	Proxy release 2025-05-14 09:39 UTC	2025-05-14 09:39:15 +00:00
Alex Chi Z.	a8e652d47e	rfc: add bottommost garbage-collection compaction (#8425 ) Add the RFC for bottommost garbage-collection compaction --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2025-05-14 09:25:57 +00:00
Alex Chi Z.	81fd652151	fix(pageserver): use better estimation for compaction memory usage (#11904 ) ## Problem Hopefully resolves `test_gc_feedback` flakiness. ## Summary of changes `accumulated_values` should not exceed 512MB to avoid OOM. Previously we only used the number of items, which is not a good estimate. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-05-14 08:32:55 +00:00
Elizabeth Murray	d47e88e353	Update the pgrag version in the compute dockerfile. (#11867 ) ## Problem The extension tests are hanging because of pgrag. The new version of pgrag contains a fix for the hang. ## Summary of changes	2025-05-14 07:00:59 +00:00
Vlad Lazar	045ae13e06	pageserver: make imports work with tenant shut downs (#11855 ) ## Problem Lifetime of imported timelines (and implicitly the import background task) has some shortcomings: 1. Timeline activation upon import completion is tricky. Previously, a timeline that finished importing after a tenant detach would not get activated and there are concerns about the safety of activating concurrently with shut-down. 2. Import jobs can prevent tenant shut down since they hold the tenant gate ## Summary of Changes Track the import tasks in memory and abort them explicitly on tenant shutdown. Integrate more closely with the storage controller: 1. When an import task has finished all of its jobs, it notifies the storage controller, but does not mark the import as done in the index_part. When all shards have finished importing, the storage controller will call the `/activate_post_import` idempotent endpoint for all of them. The handler marks the import complete in index part, resets the tenant if required and checks if the timeline is active yet. 2. Not directly related, but the import job now gets the starting state from the storage controller instead of the import bucket. This paves the way for progress checkpointing. Related: https://github.com/neondatabase/neon/issues/11568	2025-05-13 17:49:49 +00:00
Folke Behrens	234c882a07	proxy: Expose handlers for cpu and heap profiling (#11912 ) ## Problem It's difficult to understand where proxy spends most of cpu and memory. ## Summary of changes Expose cpu and heap profiling handlers for continuous profiling. neondatabase/cloud#22670	2025-05-13 14:58:37 +00:00
Konstantin Knizhnik	290369061f	Check prefetch result in DEBUG_COMPARE_LOCAL mode (#11502 ) ## Problem Prefetched and LFC results are not checked in DEBUG_COMPARE_LOCAL mode. ## Summary of changes Add checks for these results as well. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-05-13 14:13:42 +00:00
Anastasia Lubennikova	25ab16ee24	chore(compute): Postgres 17.5, 16.9, 15.13 and 14.18 (#11886 ) Bump all minor versions. The only conflict was in src/backend/storage/smgr/smgr.c in v17, where our smgr changes conflicted with `ee578921b6`, but it was trivial to resolve.	2025-05-13 13:30:09 +00:00
Vlad Lazar	cfbef4d586	safekeeper: downgrade stream from future WAL log (#11909 ) ## Problem 1. Safekeeper selection on the pageserver side isn't very dynamic. Once you connect to one safekeeper, you'll use that one for as long as the safekeeper keeps the connection alive. In principle, we could be more eager, since the wal receiver connection can be cancelled but we don't do that. We wait until the "session" is done and then we pick a new SK. 2. Picking a new SK is quite conservative. We will switch if: a. We haven't received anything from the SK within the last 10 seconds (wal_connect_timeout) or b. The candidate SK is 1GiB ahead or c. The candidate SK is in the same AZ as the PS or d. There's a candidate that is ahead and we've not had any WAL within the last 10 seconds (lagging_wal_timeout) Hence, we can end up with pageservers that are requesting WAL which their safekeeper hasn't seen yet. ## Summary of changes Downgrade warning log to info.	2025-05-13 13:02:25 +00:00
Alex Chi Z.	34a42b00ca	feat(pageserver): add PostHog lite client (#11821 ) ## Problem part of https://github.com/neondatabase/neon/issues/11813 ## Summary of changes Add a lite PostHog client that only uses the local flag evaluation functionality. Added a test case that parses an example feature flag and gets the evaluation result. TODO: support boolean flag, remote config; implement all operators in PostHog. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-05-13 09:49:14 +00:00
Alex Chi Z.	a9979620c5	fix(remote_storage): continue on Azure+AWS retryable error (#11903 ) ## Problem We implemented the retry logic in AWS S3 but not in Azure. Therefore, if there is an error during Azure listing, we will return an Err to the caller, and the stream will end without fetching more tenants. Part of https://github.com/neondatabase/neon/issues/11159 Without this fix, tenant listing will stop once we hit an error (could be network errors, which happen more frequently on Azure). If we happen to stop at a point where we have only listed part of the shards, we will hit the "missed shards" error or even remove layers being used. This bug (for Azure listing) was introduced as part of https://github.com/neondatabase/neon/pull/9840 There is also a bug that stops the stream for AWS when there's a timeout -- this is fixed along with this patch. ## Summary of changes Retry the request on error. In the future, we should make such streams return something like `Result<Result<T>>` where the outer result is the error that ends the stream and the inner one is the error that should be retried by the caller. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-05-13 08:53:35 +00:00
Conrad Ludgate	a113c48c43	proxy: fix redis batching support (#11905 ) ## Problem For `StoreCancelKey`, we were inserting 2 commands, but we were not inserting two replies. This mismatch leads to errors when decoding the response. ## Summary of changes Abstract the command + reply pipeline so that commands and replies are registered at the same time.	2025-05-13 08:33:53 +00:00
Tristan Partin	9971fba584	Properly configure the dynamic loader to load our compiled libraries (#11858 ) The first line in /etc/ld.so.conf is: /etc/ld.so.conf.d/* We want to control library load order so that our compiled binaries are picked up before others from system packages. The previous solution allowed the system libraries to load before ours. Part-of: https://github.com/neondatabase/neon/issues/11857 Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-05-12 17:36:07 +00:00
Conrad Ludgate	a77919f4b2	merge pg-sni-router into proxy (#11882 ) ## Problem We realised that pg-sni-router doesn't need to be separate from proxy; it only needs a separate port. ## Summary of changes Add pg-sni-router config to proxy and expose the service.	2025-05-12 15:48:48 +00:00
github-actions[bot]	0982ca4636	Proxy release 2025-05-12 14:35 UTC	2025-05-12 14:35:27 +00:00
Jakub Kołodziejczak	a618056770	chore(compute): skip audit logs for pg_session_jwt extension (#11883 ) references https://github.com/neondatabase/cloud/issues/28480#issuecomment-2866961124 related https://github.com/neondatabase/cloud/issues/28863 cc @MihaiBojin @conradludgate	2025-05-12 11:24:33 +00:00
Alex Chi Z.	307e1e64c8	fix(scrubber): more logs wrt relic timelines (#11895 ) ## Problem Further investigation on https://github.com/neondatabase/neon/issues/11159 reveals that the list_tenant function can find all the shards of the tenant, but then a shard goes missing during the gc timeline list blob. One reason could be that the timeline somehow gets recognized as a relic timeline. ## Summary of changes Add logging to help identify the issue. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-05-12 09:17:35 +00:00
Arpad Müller	a537b2ffd0	pull_timeline: check tombstones by default (#11873 ) Make `pull_timeline` check tombstones by default. Otherwise, we'd be recreating timelines if the order between creation and deletion got mixed up, as seen in #11838. Fixes #11838.	2025-05-12 07:25:54 +00:00
Christian Schwarz	64353b48db	direct+concurrent IO: retroactive RFC (#11788 ) refs - direct IO epic: https://github.com/neondatabase/neon/issues/8130 - concurrent IO epic https://github.com/neondatabase/neon/issues/9378 - obsoletes direct IO proposal RFC: https://github.com/neondatabase/neon/pull/8240 - discussion in https://neondb.slack.com/archives/C07BZ38E6SD/p1746028030574349	2025-05-10 15:06:06 +00:00
Christian Schwarz	79ddc803af	feat(direct IO): runtime alignment validation; support config flag on macOS; default to `DirectRw` (#11868 ) This PR adds a runtime validation mode to check adherence to alignment and size-multiple requirements at the VirtualFile level. This can help prevent alignment bugs from slipping into production because test systems may have more lax requirements than production. (This is not the case today, but it could change in the future). It also allows catching O_DIRECT bugs on systems that don't have O_DIRECT (macOS). Consequently, we can now accept `virtual_file_io_mode={direct,direct-rw}` on macOS now. This has the side benefit of removing some annoying conditional compilation around `IoMode`. A third benefit is that it helped weed out size-multiple requirement violation bugs in how the VirtualFile unit tests exercise read and write APIs. I seized the opportunity to trim these tests down to what actually matters, i.e., exercising of the `OpenFiles` file descriptor cache. Lastly, this PR flips the binary-built-in default to `DirectRw` so that when running Python regress tests and benchmarks without specifying `PAGESERVER_VIRTUAL_FILE_IO_MODE`, one gets the production behavior. Refs - fixes https://github.com/neondatabase/neon/issues/11676	2025-05-10 14:19:52 +00:00
Christian Schwarz	f5070f6aa4	fixup(direct IO): PR #11864 broke test suite parametrization (#11887 ) PR - github.com/neondatabase/neon/pull/11864 committed yesterday rendered the `PAGESERVER_VIRTUAL_FILE_IO_MODE` env-var-based parametrization ineffective. As a consequence, the tests and benchmarks in `test_runner/` were using the binary built-in-default, i.e., `buffered`.	2025-05-09 18:13:35 +00:00
Matthias van de Meent	3b7cc4234c	Fix PS connect attempt timeouts when facing interrupts (#11880 ) With the 50ms timeouts of pumping state in connector.c, we need to correctly handle these timeouts that also wake up pg_usleep. This new approach makes the connection attempts re-start the wait whenever it gets woken up early; and CHECK_FOR_INTERRUPTS() is called to make sure we don't miss query cancellations. ## Problem https://neondb.slack.com/archives/C04DGM6SMTM/p1746794528680269 ## Summary of changes Make sure we start sleeping again if pg_usleep got woken up ahead of time.	2025-05-09 17:02:24 +00:00
Arpad Müller	33abfc2b74	storcon: remove finished safekeeper reconciliations from in-memory hashmap (#11876 ) ## Problem Currently there is a memory leak, in that finished safekeeper reconciliations leave a cancellation token behind which is never cleaned up. ## Summary of changes The change adds cleanup after finishing of a reconciliation. In order to ensure we remove the correct cancellation token, and we haven't raced with another reconciliation, we introduce a `TokenId` counter to tell tokens apart. Part of https://github.com/neondatabase/neon/issues/11670	2025-05-09 13:34:22 +00:00
Alex Chi Z.	93b964f829	fix(pageserver): do not do image compaction if it's below gc cutoff (#11872 ) ## Problem We observe image compaction errors after gc-compaction finishes compacting below the gc_cutoff. This is because `repartition` returns an LSN below the gc horizon as we (likely) determined that `distance <= self.repartition_threshold`. I think it's better to keep the current behavior of when to trigger compaction but we should skip image compaction if the returned LSN is below the gc horizon. ## Summary of changes If the repartition returns an invalid LSN, skip image compaction. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-05-09 12:07:52 +00:00
Vlad Lazar	d0aaec2abb	storage_controller: create imported timelines on safekeepers (#11801 ) ## Problem SK timeline creations were skipped for imported timelines since we didn't know the correct start LSN of the timeline at that point. ## Summary of changes Created imported timelines on the SK as part of the import finalize step. We use the last record LSN of shard 0 as the start LSN for the safekeeper timeline. Closes https://github.com/neondatabase/neon/issues/11569	2025-05-09 10:55:26 +00:00
Alex Chi Z.	d0dc65da12	fix(pageserver): give up gc-compaction if one key has too long history (#11869 ) ## Problem The limitation we imposed last week https://github.com/neondatabase/neon/pull/11709 is not enough to protect against excessive memory usage. ## Summary of changes If a single key accumulated too much history, give up compaction. In the future, we can make the `generate_key_retention` function take a stream of keys instead of first accumulating them in memory, thus easily supporting such long key history cases. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-05-09 10:12:49 +00:00
Konstantin Knizhnik	03d635b916	Add more guards for prefetch_pump_state (#11859 ) ## Problem See https://neondb.slack.com/archives/C08PJ07BZ44/p1746566292750689 Looks like there are more cases where `prefetch_pump_state` can be called in an unexpected place and cause a core dump. ## Summary of changes Add more guards. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-05-09 09:07:08 +00:00
Conrad Ludgate	5cd7f936f9	fix(neon-rls): optimistically assume role grants are already assigned for replicas (#11811 ) ## Problem Read replicas cannot grant permissions for roles for Neon RLS. Usually the permission is already granted, so we can optimistically check. See INC-509 ## Summary of changes Perform a permission lookup prior to actually executing any grants.	2025-05-09 07:48:30 +00:00
Konstantin Knizhnik	101e115b38	Change prefetch logic in vacuum (#11650 ) ## Problem See https://neondb.slack.com/archives/C03QLRH7PPD/p1745003314183649 Vacuum doesn't use prefetch because of this strange logic in `lazy_scan_heap`: ``` /* And only up to the next unskippable block */ if (next_prefetch_block + prefetch_budget > vacrel->next_unskippable_block) prefetch_budget = vacrel->next_unskippable_block - next_prefetch_block; ``` ## Summary of changes Disable prefetch only if vacuum jumps to next skippable block (there is SKIP_PAGES_THRESHOLD, which cancels the seqscan and performs the jump only if the gap is large enough). Postgres PRs: https://github.com/neondatabase/postgres/pull/620 https://github.com/neondatabase/postgres/pull/621 https://github.com/neondatabase/postgres/pull/622 https://github.com/neondatabase/postgres/pull/623 --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-05-09 06:54:40 +00:00
Christian Schwarz	b37bb7d7ed	pageserver: timeline shutdown: fully quiesce ingest path before `freeze_and_flush` (#11851 ) # Problem Before this PR, timeline shutdown would - cancel the walreceiver cancellation token subtree (child token of Timeline::cancel) - call freeze_and_flush - Timeline::cancel.cancel() - ... bunch of waiting for things ... - Timeline::gate.close() As noted by the comment that is deleted by this PR, this left a window where, after freeze_and_flush, walreceiver could still be running and ingest data into a new InMemoryLayer. This presents a potential source of log noise during Timeline shutdown where the InMemoryLayer created after the freeze_and_flush observes that Timeline::cancel is cancelled, failing the ingest with some anyhow::Error wrapping (deeply) a `FlushTaskError::Cancelled` instance (`flush task cancelled` error message). # Solution It turns out that it is quite easy to shut down, not just cancel, walreceiver completely because the only subtask spawned by walreceiver connection manager is the `handle_walreceiver_connection` task, which is properly shut down and waited upon when the manager task observes cancellation and exits its retry loop. The alternative is to replace all the usage of `anyhow` on the ingest path with differentiated error types. A lot of busywork for little gain to fix a potential logging noise nuisance, so, not doing that for now. # Correctness / Risk We do not risk leaking walreceiver child tasks because existing discipline is to hold a gate guard. We will prolong `Timeline::shutdown` to the degree that we're no longer making progress with the rest of shutdown while the walreceiver task hasn't yet observed cancellation. In practice, this should be negligible. `Timeline::shutdown` could fail to complete if there is a hidden dependency of walreceiver shutdown on some subsystem. The code certainly suggests there isn't, and I'm not aware of any such dependency. 
Anyway, impact will be low because we only shut down Timeline instances that are obsolete, either because there is a newer attachment at a different location, or because the timeline got deleted by the user. We would learn about this through stuck cplane operations or stuck storcon reconciliations. We would be able to mitigate by cancelling such stuck operations/reconciliations and/or by rolling back pageserver. # Refs - identified this while investigating https://github.com/neondatabase/neon/issues/11762 - PR that _does_ fix a bunch of _real_ `flush task cancelled` noise on the compaction path: https://github.com/neondatabase/neon/pull/11853	2025-05-08 18:48:24 +00:00
Conrad Ludgate	bef5954fd7	feat(proxy): track SNI usage by protocol, including for http (#11863 ) ## Problem We want to see how many users of the legacy serverless driver are still using the old URL for SQL-over-HTTP traffic. ## Summary of changes Adds a protocol field to the connections_by_sni metric. Ensures it's incremented for sql-over-http.	2025-05-08 16:46:57 +00:00
Christian Schwarz	8477d15f95	feat(direct IO): remove special case in test suite for compat tests (#11864 ) PR - https://github.com/neondatabase/neon/pull/11558 adds special treatment for compat snapshot binaries which don't understand the `direct-rw` mode. A new compat snapshot has been published since, so, we can remove the special case. refs: - fixes https://github.com/neondatabase/neon/issues/11598	2025-05-08 16:11:45 +00:00
Arpad Müller	622b3b2993	Fixes for enabling --timelines-onto-safekeepers in tests (#11854 ) Second PR with fixes extracted from #11712, relating to `--timelines-onto-safekeepers`. Does the following: * Move safekeeper registration to `neon_local` instead of the test fixtures * Pass safekeeper JWT token if `--timelines-onto-safekeepers` is enabled * Allow some warnings related to offline safekeepers (similarly to how we allow them for offline pageservers) * Enable generations on the compute's config if `--timelines-onto-safekeepers` is enabled * Fix parallel `pull_timeline` race condition (the one that #11786 put off for later) Fixes #11424 Part of #11670	2025-05-08 15:13:11 +00:00
Santosh Pingale	659366060d	Reuse remote_client from the SnapshotDownloader instead of recreating in download function (#11812 ) ## Problem At the moment, remote_client and target are recreated in the download function. We could reuse them from the SnapshotDownloader instance. This isn't a problem per se, just a quality-of-life improvement, but it caught my attention when we were trying out snapshot downloading in one of the older versions and ran into a curious case of S3 clients behaving in two different manners: one client used `force_path_style` and the other didn't. Logs from this run: ``` 2025-05-02T12:56:22.384626Z DEBUG /data/snappie/2739e7da34e625e3934ef0b76fa12483/timelines/d44b831adb0a6ba96792dc3a5cc30910/000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__00000000014E8F20-00000000014E8F99-00000001 requires download... 2025-05-02T12:56:22.384689Z DEBUG invoke{service=s3 operation=ListObjectVersions sdk_invocation_id=7315885}:apply_configuration: timeout settings for this operation: TimeoutConfig { connect_timeout: Set(3.1s), read_timeout: Disabled, operation_timeout: Disabled, operation_attempt_timeout: Disabled } 2025-05-02T12:56:22.384730Z DEBUG invoke{service=s3 operation=ListObjectVersions sdk_invocation_id=7315885}:try_op: entering 'serialization' phase 2025-05-02T12:56:22.384784Z DEBUG invoke{service=s3 operation=ListObjectVersions sdk_invocation_id=7315885}:try_op: entering 'before transmit' phase 2025-05-02T12:56:22.384813Z DEBUG invoke{service=s3 operation=ListObjectVersions sdk_invocation_id=7315885}:try_op: retry strategy has OKed initial request 2025-05-02T12:56:22.384841Z DEBUG invoke{service=s3 operation=ListObjectVersions sdk_invocation_id=7315885}:try_op: beginning attempt #1 2025-05-02T12:56:22.384870Z DEBUG invoke{service=s3 operation=ListObjectVersions sdk_invocation_id=7315885}:try_op:try_attempt: resolving endpoint endpoint_params=EndpointResolverParams(TypeErasedBox[!Clone]:Params { bucket: Some("bucket"), region: 
Some("eu-north-1"), use_fips: false, use_dual_stack: false, endpoint: Some("https://s3.self-hosted.company.com"), force_path_style: false, accelerate: false, use_global_endpoint: false, use_object_lambda_endpoint: None, key: None, prefix: Some("/pageserver/tenants/2739e7da34e625e3934ef0b76fa12483/timelines/d44b831adb0a6ba96792dc3a5cc30910/000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__00000000014E8F20-00000000014E8F99-00000001"), copy_source: None, disable_access_points: None, disable_multi_region_access_points: false, use_arn_region: None, use_s3_express_control_endpoint: None, disable_s3_express_session_auth: None }) endpoint_prefix=None 2025-05-02T12:56:22.384979Z DEBUG invoke{service=s3 operation=ListObjectVersions sdk_invocation_id=7315885}:try_op:try_attempt: will use endpoint Endpoint { url: "https://neon.s3.self-hosted.company.com", headers: {}, properties: {"authSchemes": Array([Object({"signingRegion": String("eu-north-1"), "disableDoubleEncoding": Bool(true), "name": String("sigv4"), "signingName": String("s3")})])} } 2025-05-02T12:56:22.385042Z DEBUG invoke{service=s3 operation=ListObjectVersions sdk_invocation_id=7315885}:try_op:try_attempt:lazy_load_identity:provide_credentials{provider=default_chain}: loaded credentials provider=Environment 2025-05-02T12:56:22.385066Z DEBUG invoke{service=s3 operation=ListObjectVersions sdk_invocation_id=7315885}:try_op:try_attempt:lazy_load_identity: identity cache miss occurred; added new identity (took 35.958µs) new_expiration=2025-05-02T13:11:22.385028Z valid_for=899.999961437s partition=IdentityCachePartition(5) 2025-05-02T12:56:22.385090Z DEBUG invoke{service=s3 operation=ListObjectVersions sdk_invocation_id=7315885}:try_op:try_attempt: loaded identity 2025-05-02T12:56:22.385162Z DEBUG invoke{service=s3 operation=ListObjectVersions sdk_invocation_id=7315885}:try_op:try_attempt: entering 'transmit' phase 2025-05-02T12:56:22.385211Z DEBUG invoke{service=s3 operation=ListObjectVersions 
sdk_invocation_id=7315885}:try_op:try_attempt: new TCP connector created in 361ns 2025-05-02T12:56:22.385288Z DEBUG resolving host="neon.s3.self-hosted.company.com" 2025-05-02T12:56:22.390796Z DEBUG invoke{service=s3 operation=ListObjectVersions sdk_invocation_id=7315885}:try_op:try_attempt: encountered orchestrator error; halting ```	2025-05-08 14:09:15 +00:00
Christian Schwarz	42d93031a1	fixup(#11819 ): broken macOS build (#11861 ) refs - fixes https://github.com/neondatabase/neon/issues/11860	2025-05-08 11:48:29 +00:00
Mark Novikov	d22377c754	Skip event triggers in dump-restore (#11794 ) ## Problem Data import fails if the src db has any event triggers, because those can only be restored by a superuser. Specifically imports from Heroku and Supabase are guaranteed to fail. Closes https://github.com/neondatabase/cloud/issues/27353 ## Summary of changes Depends on `pg_dump` patches per each supported PostgreSQL version: - https://github.com/neondatabase/postgres/pull/630 - https://github.com/neondatabase/postgres/pull/629 - https://github.com/neondatabase/postgres/pull/627 - https://github.com/neondatabase/postgres/pull/628	2025-05-08 11:04:28 +00:00
Erik Grinaker	6c70789cfd	storcon: increase drain+fill secondary warmup timeout from 20 to 30 seconds (#11848 ) ## Problem During deployment drains/fills, we often see the storage controller giving up on warmups after 20 seconds, when the warmup is nearly complete (~90%). This can cause latency spikes for migrated tenants if they block on layer downloads. Touches https://github.com/neondatabase/cloud/issues/26193. ## Summary of changes Increase the drain and fill secondary warmup timeout from 20 to 30 seconds.	2025-05-08 10:14:41 +00:00
Dmitrii Kovalkov	7e55497e13	tests: flush wal before waiting for last record lsn (#11726 ) ## Problem Compute may flush WAL on page boundaries, leaving some records partially flushed for a long time. It leads to `wait_for_last_flush_lsn` stuck waiting for this partial LSN. - Closes: https://github.com/neondatabase/cloud/issues/27876 ## Summary of changes - Flush WAL via CHECKPOINT after requesting current_wal_lsn to make sure that the record we point to is flushed in full - Use proper endpoint in `test_timeline_detach_with_aux_files_with_detach_v1`	2025-05-08 10:00:45 +00:00
Vlad Lazar	40f32ea326	pageserver: refactor import flow and add job concurrency limiting (#11816 ) ## Problem Import code is one big block. Separating planning and execution will help with reporting progress of import to storcon (building block for resuming import). ## Summary of changes Split up the import into planning and execution. A concurrency limit driven by PS config is also added.	2025-05-08 09:19:14 +00:00
Christian Schwarz	1d1502bc16	fix(pageserver): `flush task cancelled` errors during timeline shutdown (#11853 ) # Refs - fixes https://github.com/neondatabase/neon/issues/11762 # Problem PR #10993 introduced internal retries for BufferedWriter flushes. PR #11052 added cancellation sensitivity to that retry loop. That cancellation sensitivity is an error path that didn't exist before. The result is that during timeline shutdown, after we `Timeline::cancel`, compaction can now fail with error `flush task cancelled`. The problem with that: 1. We mis-classify this as an `error!`-worthy event. 2. This causes tests to become flaky because the error is not in global `allowed_errors`. Technically we also trip the `compaction_circuit_breaker` because the resulting `CompactionError` is variant `::Other`. But since this is Timeline shutdown, it doesn't matter practically speaking. # Solution / Changes - Log the anyhow stack trace when classifying a compaction error as `error!`. This was helpful to identify sources of `flush task cancelled` errors. We only log at `error!` level in exceptional circumstances, so, it's ok to have slightly verbose logs. - Introduce typed errors along the `BufferedWriter::write_` => `BlobWriter::write_blob` => `{Delta,Image}LayerWriter::put_` => `Split{Delta,Image}LayerWriter::put_{value,image}` chain. - Proper mapping to `CompactionError`/`CreateImageLayersError` via new `From` impls. I am usually opposed to any magic `From` impls, but, it's how most of the compaction code works today. # Testing The symptoms are most prevalent in `test_runner/regress/test_branch_and_gc.py::test_branch_and_gc`. Before this PR, I was able to reproduce locally 1 or 2 times per 400 runs using `DEFAULT_PG_VERSION=15 BUILD_TYPE=release poetry run pytest --count 400 -n 8`. After this PR, it doesn't reproduce anymore after 2000 runs. # Future Work Technically the ingest path is also exposed to this new source of errors because `InMemoryLayer` is backed by `BufferedWriter`. 
But we haven't seen it occur in flaky tests yet. Details and a fix in - https://github.com/neondatabase/neon/pull/11851	2025-05-08 06:57:53 +00:00
Christian Schwarz	7eb85c56ac	tokio-epoll-uring: avoid warn! noise due to `ECANCELED` during shutdowns (#11819 ) # Problem Before this PR, `test_pageserver_catchup_while_compute_down` would occasionally fail due to a scary-looking WARN log line ``` WARN ephemeral_file_buffered_writer{...}:flush_attempt{attempt=1}: \ error flushing buffered writer buffer to disk, retrying after backoff err=Operation canceled (os error 125) ``` After lengthy investigation, the conclusion is that this is likely due to a kernel bug related to io_uring async workers (io-wq) and signals. The main indicator is that the error only ever happens in correlation with pageserver shutdown when SIGTERM is received. There is a fix that is merged in 6.14 kernels (`io-wq: backoff when retrying worker creation`). However, even when I revert that patch, the issue is not reproducible on 6.14, so it remains speculation. It was ruled out that the ECANCELED is due to the executor thread exiting before the async worker starts processing the operation. # Solution The workaround in this issue is to retry the operation on ECANCELED once. Retries are safe because the low-level io_engine operations are idempotent. (We don't use O_APPEND and I can't think of another flag that would make the APIs covered by this patch not idempotent.) # Testing With this PR, the warn! log no longer happens on [my reproducer setup](https://github.com/neondatabase/neon/issues/11446#issuecomment-2843015111). And the new rate-limited `info!`-level log line informing about the internal retry shows up instead, as expected. # Refs - fixes https://github.com/neondatabase/neon/issues/11446	2025-05-08 06:33:29 +00:00
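The "retry once on ECANCELED" workaround described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the helper name `retry_once_on_ecanceled` is hypothetical, the operation is modelled as a synchronous closure rather than an io_uring submission, and the errno value 125 is taken from the `os error 125` in the quoted log line (the Linux value).

```rust
use std::io;

/// Linux errno for ECANCELED; matches "os error 125" in the quoted log line.
/// Hard-coded here only to keep the sketch dependency-free.
const ECANCELED: i32 = 125;

/// Hypothetical helper: runs `op`, and if the first attempt fails with
/// ECANCELED, runs it exactly one more time. Any other error, and any error
/// from the second attempt, is returned to the caller unchanged.
///
/// This is only safe when `op` is idempotent, which is the property the
/// commit message relies on.
pub fn retry_once_on_ecanceled<T>(mut op: impl FnMut() -> io::Result<T>) -> io::Result<T> {
    match op() {
        Err(e) if e.raw_os_error() == Some(ECANCELED) => op(),
        other => other,
    }
}
```

A single bounded retry keeps the workaround from masking a persistent failure: if the second attempt is also cancelled, the error still surfaces to the existing retry-with-backoff path.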
Dmitrii Kovalkov	24d62c647f	storcon: add missing switch_timeline_membership method to sk client (#11850 ) ## Problem `switch_timeline_membership` is implemented on safekeeper's server side, but is missing in the client. - Part of https://github.com/neondatabase/neon/issues/11823 ## Summary of changes - Add `switch_timeline_membership` method to `SafekeeperClient`	2025-05-07 17:00:41 +00:00
Shockingly Good	4d2e4b19c3	fix(compute) Correct the PGXN s3 gateway URL. (#11796 ) Corrects the postgres extension s3 gateway address to be not just a domain name but a full base URL. To make the code more readable, the option is renamed to "remote_ext_base_url", while keeping the old name also accessible by providing a clap argument alias. Also provides a very simple and, perhaps, even redundant unit test to confirm the logic behind parsing of the corresponding CLI argument. ## Problem As is clearly stated in https://github.com/neondatabase/cloud/issues/26005, using the short version of the domain name might work for now, but in the future we should get rid of the `default` namespace, and this is where it will most likely break down. ## Summary of changes The changes adjust the address of the extension s3 gateway to use the proper base URL format instead of just the domain name assuming the "default" namespace, and add a new CLI argument name to reflect the change and the expectation.	2025-05-07 16:34:08 +00:00
Alexey Kondratov	0691b73f53	fix(compute): Enforce cloud_admin role in compute_ctl connections (#11827 ) ## Problem Users can override some configuration parameters on the DB level with `ALTER DATABASE ... SET ...`. Some of these overrides, like `role` or `default_transaction_read_only`, affect `compute_ctl`'s ability to configure the DB schema properly. ## Summary of changes Enforce `role=cloud_admin`, `statement_timeout=0`, and move `default_transaction_read_only=off` override from control plane [1] to `compute_ctl`. Also, enforce `search_path=public` just in case, although we do not call any functions in user databases. [1]: `133dd8c4db/goapp/controlplane/internal/pkg/compute/provisioner/provisioner_common.go (L70)` Fixes https://github.com/neondatabase/cloud/issues/28532	2025-05-07 12:14:24 +00:00
Vlad Lazar	3cf5e1386c	pageserver: fix rough edges of pageserver tracing (#11842 ) ## Problem There's a few rough edges around PS tracing. ## Summary of changes * include compute request id in pageserver trace * use the get page specific context for GET_REL_SIZE and GET_BATCH * fix assertion in download layer trace ![image](https://github.com/user-attachments/assets/2ff6779c-7c2d-4102-8013-ada8203aa42f)	2025-05-07 10:13:26 +00:00
Alex Chi Z.	608afc3055	fix(scrubber): log download error (#11833 ) ## Problem We use `head_object` to determine whether an object exists or not. However, it does not always error due to a missing object. ## Summary of changes Log the error so that we can have a better idea what's going on with the scrubber errors in prod. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-05-07 09:21:17 +00:00
Tristan Partin	0ef6851219	Make the audience claim in compute JWTs a vector (#11845 ) According to RFC 7519, `aud` is generally an array of StringOrURI, but in special cases may be a single StringOrURI value. To accommodate future control plane work where a single token may work for multiple services, make the claim a vector. Link: https://www.rfc-editor.org/rfc/rfc7519#section-4.1.3 Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-05-06 22:19:15 +00:00
Mikhail	5c356c63eb	endpoint_storage compute_ctl integration (#11550 ) Add `/lfc/(prewarm\|offload)` routes to `compute_ctl` which interact with endpoint storage. Add `prewarm_lfc_on_startup` spec option which, if enabled, downloads LFC prewarm data on compute startup. Resolves: https://github.com/neondatabase/cloud/issues/26343	2025-05-06 22:02:12 +00:00
Suhas Thalanki	384e3df2ad	fix: pinned anon extension to v2.1.0 (#11844 ) ## Problem Currently the setup for `anon` v2 in the compute image downloads the latest version of the extension. This can be problematic as on a compute start/restart it can download a version that is newer than what we have tested and potentially break things, hence not giving us the ability to control when the extension is updated. We were also using `v2.2.0`, which is not ready for production yet and has been clarified by the maintainer. Additional context: https://gitlab.com/dalibo/postgresql_anonymizer/-/issues/530 ## Summary of changes Changed the URL from which we download the `anon` extension to point to `v2.1.0` instead of `latest`.	2025-05-06 21:52:15 +00:00
Tristan Partin	f9b3a2e059	Add scoping to compute_ctl JWT claims (#11639 ) Currently we only have an admin scope which allows a user to bypass the compute_id check. When the admin scope is provided, validate the audience of the JWT to be "compute". Closes: https://github.com/neondatabase/cloud/issues/27614 Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-05-06 19:51:10 +00:00
Jakub Kołodziejczak	79ee78ea32	feat(compute): enable audit logs for pg_session_jwt extension (#11829 ) related to https://github.com/neondatabase/cloud/issues/28480 related to https://github.com/neondatabase/pg_session_jwt/pull/36 cc @MihaiBojin @conradludgate @lneves12	2025-05-06 15:18:50 +00:00
Erik Grinaker	0e0ad073bf	storcon: fix split aborts removing other tenants (#11837 ) ## Problem When aborting a split, the code accidentally removes all other tenant shards from the in-memory map that have the same shard count as the aborted split, causing "tenant not found" errors. It will recover on a storcon restart, when it loads the persisted state. This issue has been present for at least a year. Resolves https://github.com/neondatabase/cloud/issues/28589. ## Summary of changes Only remove shards belonging to the relevant tenant when aborting a split. Also adds a regression test.	2025-05-06 13:57:34 +00:00
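The bug and fix described in the commit above come down to the predicate used when pruning the in-memory shard map. A minimal sketch, assuming a simplified map keyed by a hypothetical `TenantShardId` (the `u32` tenant ID, the `String` value, and the function name are illustrative stand-ins, not the storage controller's actual types):

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified shard identifier used only for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct TenantShardId {
    pub tenant_id: u32,
    pub shard_number: u8,
    pub shard_count: u8,
}

/// Removes the child shards of an aborted split from the in-memory map.
///
/// The failure mode described in the commit corresponds to filtering on
/// `shard_count` alone, which also drops unrelated tenants that happen to
/// have the same shard count. Matching on `tenant_id` as well restricts the
/// removal to the tenant whose split is being aborted.
pub fn remove_aborted_split_children(
    shards: &mut BTreeMap<TenantShardId, String>,
    tenant_id: u32,
    child_shard_count: u8,
) {
    shards.retain(|id, _| !(id.tenant_id == tenant_id && id.shard_count == child_shard_count));
}
```

A regression test for this shape of bug only needs two tenants with the same shard count: after aborting a split for one, the other's shards must still be present.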
Alex Chi Z.	6827f2f58c	fix(pageserver): only keep `iter_with_options` API, improve docs in gc-compact (#11804 ) ## Problem Address comments in https://github.com/neondatabase/neon/pull/11709 ## Summary of changes - remove `iter` API, users always need to specify buffer size depending on the expected memory usage. - several doc improvements --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech>	2025-05-06 12:27:16 +00:00
Peter Bendel	c82e363ed9	cleanup orphan projects created by python tests, too (#11836 ) ## Problem - some projects are created during GitHub workflows but not by action project_create but by python test scripts. If the python test fails the project is not deleted ## Summary of changes - make sure we cleanup those python created projects a few days after they are no longer used, too	2025-05-06 12:26:13 +00:00
Alexander Bayandin	50dc2fae77	compute-node.Dockerfile: remove layer with duplicated name (#11807 ) ## Problem Two `rust-extensions-build-pgrx14` layers were added independently in two different PRs, and the layers are exactly the same ## Summary of changes - Remove one of `rust-extensions-build-pgrx14` layers	2025-05-06 10:52:21 +00:00
github-actions[bot]	7272d9f7b3	Proxy release 2025-05-06 09:47 UTC	2025-05-06 09:47:48 +00:00
Folke Behrens	62ac5b94b3	proxy: Include the exp/nbf timestamps in the errors (#11828 ) ## Problem It's difficult to tell when the JWT expired from current logs and error messages. ## Summary of changes Add exp/nbf timestamps to the respective error variants. Also use checked_add when deserializing a SystemTime from JWT. Related to INC-509	2025-05-06 09:28:25 +00:00
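To illustrate the two changes in the commit above (timestamps carried in the error variants, and `checked_add` when building a `SystemTime`), here is a small sketch using only the standard library. The enum `ClaimError` and both function names are hypothetical and do not reflect the proxy's actual types; only `SystemTime::checked_add` is a real API.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Hypothetical error type: each variant carries the offending timestamp so
/// that logs show when the token expired or becomes valid.
#[derive(Debug, PartialEq)]
pub enum ClaimError {
    /// The numeric claim cannot be represented as a `SystemTime`.
    OutOfRange,
    NotYetValid { nbf: SystemTime },
    Expired { exp: SystemTime },
}

/// Converts seconds since the Unix epoch into a `SystemTime` without
/// panicking on overflow, which a plain `UNIX_EPOCH + duration` could do.
pub fn system_time_from_unix_secs(secs: u64) -> Option<SystemTime> {
    UNIX_EPOCH.checked_add(Duration::from_secs(secs))
}

/// Checks the `nbf`/`exp` window against `now`.
pub fn check_validity_window(nbf_secs: u64, exp_secs: u64, now: SystemTime) -> Result<(), ClaimError> {
    let nbf = system_time_from_unix_secs(nbf_secs).ok_or(ClaimError::OutOfRange)?;
    let exp = system_time_from_unix_secs(exp_secs).ok_or(ClaimError::OutOfRange)?;
    if now < nbf {
        return Err(ClaimError::NotYetValid { nbf });
    }
    if now >= exp {
        return Err(ClaimError::Expired { exp });
    }
    Ok(())
}
```

Since the claim values come from an untrusted token, treating an unrepresentable timestamp as an ordinary error rather than a panic is the safer default.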
Konstantin Knizhnik	f0e7b3e0ef	Use unlogged build for gist_indexsortbuild_flush_ready_pages (#11753 ) ## Problem See https://github.com/neondatabase/neon/issues/11718 GIST index can be constructed in two ways: GIST_SORTED_BUILD and GIST_BUFFERING. We used unlogged build in the second case but not in the first. ## Summary of changes Use unlogged build in `gist_indexsortbuild_flush_ready_pages` Corresponding Postgres PRs: https://github.com/neondatabase/postgres/pull/624 https://github.com/neondatabase/postgres/pull/625 https://github.com/neondatabase/postgres/pull/626 --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2025-05-06 07:24:27 +00:00
Dmitrii Kovalkov	c6ff18affc	cosmetics(pgxn/neon): WP code small clean up (#11824 ) ## Problem Some small cosmetic changes I made while reading the code. Should not affect anything. ## Summary of changes - Remove `n_votes` field because it's not used anymore - Explicitly initialize `safekeepers_generation` with `INVALID_GENERATION` if the generation is not present (the struct is zero-initialized anyway, but the explicit initialization is better IMHO) - Access SafekeeperId via pointer `sk_id` created above	2025-05-06 06:51:51 +00:00
Heikki Linnakangas	16ca74a3f4	Add SAFETY comment on libc::sysconf() call (#11581 ) I got an 'undocumented_unsafe_blocks' clippy warning about it. Not sure why I got the warning now and not before, but in any case a comment is a good idea.	2025-05-06 06:49:23 +00:00
Peter Bendel	cb67f9a651	delete orphan left over projects (#11826 ) ## Problem Sometimes our benchmarking GitHub workflow is terminated by side-effects beyond our control (e.g. GitHub runner loses connection to server) and then we have left-over Neon projects created during the workflow [Example where GitHub runner lost connection and project was not deleted](https://github.com/neondatabase/neon/actions/runs/14017400543/job/39244816485) Fixes https://github.com/neondatabase/cloud/issues/28546 ## Summary of changes - Add a cleanup step that cleans up left-over projects - also give each project created during workflows a name that references the testcase and GitHub runid ## Example run (test of new job steps) https://github.com/neondatabase/neon/actions/runs/14837092399/job/41650741922#step:6:63 --------- Co-authored-by: a-masterov <72613290+a-masterov@users.noreply.github.com>	2025-05-05 14:30:13 +00:00
devin-ai-integration[bot]	baf425a2cd	[pageserver/virtual_file] impr: Improve OpenOptions API ergonomics (#11789 ) # Improve OpenOptions API ergonomics Closes #11787 This PR improves the OpenOptions API ergonomics by: 1. Making OpenOptions methods take and return owned Self instead of &mut self 2. Changing VirtualFile::open_with_options_v2 to take an owned OpenOptions 3. Removing unnecessary .clone() and .to_owned() calls These changes make the API more idiomatic Rust by leveraging the builder pattern with owned values, which is cleaner and more ergonomic than the previous approach. Link to Devin run: https://app.devin.ai/sessions/c2a4b24f7aca40a3b3777f4259bf8ee1 Requested by: christian@neon.tech --------- Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com> Co-authored-by: christian@neon.tech <christian@neon.tech>	2025-05-05 13:06:37 +00:00
Alex Chi Z.	0b243242df	fix(test): allow flush error in gc-compaction tests (#11822 ) ## Problem Part of https://github.com/neondatabase/neon/issues/11762 ## Summary of changes While #11762 needs some work to refactor the error propagating thing, we can do a hacky fix for the gc-compaction tests to allow flush error during shutdown. It does not affect correctness. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-05-05 12:15:22 +00:00
Conrad Ludgate	6131d86ec9	proxy: allow invalid SNI (#11792 ) ## Problem Some PrivateLink customers are unable to use Private DNS. As such they use an invalid domain name to address Neon. We currently are rejecting those connections because we cannot resolve the correct certificate. ## Summary of changes 1. Ensure a certificate is always returned. 2. If there is an SNI field, use endpoint fallback if it doesn't match. I suggest reviewing each commit separately.	2025-05-05 11:18:55 +00:00
Konstantin Knizhnik	4b9087651c	Checked that stored LwLSN >= FirstNormalUnloggedLSN (#11750 ) ## Problem Undo unintended change `60b9fb1baf` ## Summary of changes Add assert that we are not storing fake LSN in LwLSN. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-05-02 19:27:59 +00:00
Konstantin Knizhnik	79699aebc8	Reserve in file descriptor pool sockets used for connections to page servers (#11798 ) ## Problem See https://github.com/neondatabase/neon/issues/11790 The neon extension opens connections to the pageservers, which consumes file descriptors. Postgres has a mechanism to count how many FDs are in use, but it doesn't know about those FDs. We should call ReserveExternalFD() or AcquireExternalFD() to account for them. ## Summary of changes Call `ReserveExternalFD()` for each shard --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Mikhail Kot <mikhail@neon.tech>	2025-05-02 14:36:10 +00:00
Alexander Bayandin	22290eb7ba	CI: notify relevant team about release deploy failures (#11797 ) ## Problem We notify only Storage team about failed deploys, but Compute and Proxy teams can also benefit from that ## Summary of changes - Adjust `notify-storage-release-deploy-failure` to notify the relevant team about failed deploy	2025-05-02 12:46:21 +00:00
Alex Chi Z.	bbc35e10b8	fix(test): increase timeouts for some tests (#11781 ) ## Problem Those tests are timing out more frequently after https://github.com/neondatabase/neon/pull/11585 ## Summary of changes Increase timeout for `test_pageserver_gc_compaction_smoke` Increase rollback wait timeout for `test_tx_abort_with_many_relations` Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-05-01 18:36:26 +00:00
Alex Chi Z.	ae2c3ac12f	test: revert relsizev2 config (#11759 ) ## Problem part of https://github.com/neondatabase/neon/issues/9516 One thing I realized in the past few months is that "no-way-back" things like this are scary to roll out without a fine-grained rollout infra. The plan was to flip the flag in the repo and roll it out soon, but I don't think rolling out would happen in the near future. So I'd rather revert the flag to avoid creating a discrepancy between staging and the regress tests. ## Summary of changes Not using rel_size_v2 by default in unit tests; we still have a few tests to explicitly test the new format so we still get some test coverages. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-05-01 17:51:10 +00:00
Vlad Lazar	16d594b7b3	pagectl: list layers for given key in decreasing LSN order (#11799 ) Adds an extra key CLI arg to `pagectl layer list-layer`. When provided, only layers with key ranges containing the key will be listed in decreasing LSN order (indices are preserved for `dump-layer`).	2025-05-01 15:56:43 +00:00
Suhas Thalanki	f999632327	Adding `anon` v2 support to the dockerfile (#11313 ) ## Problem Removed `anon` v1 support as described here: https://github.com/neondatabase/cloud/issues/22663 Adding `anon` v2 support to re-introduce the `pg_anon` extension. Related Issues: https://github.com/neondatabase/cloud/issues/20456 ## Summary of changes Adding `anon` v2 support by building it in the dockerfile	2025-05-01 15:22:01 +00:00
Shockingly Good	5bd850d15a	Fix the leaked tracing context for the "compute_monitor:run". (#11791 ) Removes the leaked tracing context for the "compute_monitor:run" log, which either inherited the "start_compute" span or also the HTTP request context. ## Problem The problem is that the context of the monitor's trace is unnecessarily populated with the span data inherited from previously within the same thread. ## Summary of changes The context is completely reset by moving the span from the thread spawning the monitor into the thread where the monitor will actually start working. Addresses https://github.com/neondatabase/cloud/issues/28145 ## Examples ### Before ``` 2025-04-30T16:39:05.840298Z INFO start_compute:compute_monitor:run: compute is not running, waiting before monitoring activity ``` ### After ``` 2025-04-30T16:39:05.840298Z INFO compute_monitor:run: compute is not running, waiting before monitoring activity ```	2025-05-01 09:09:10 +00:00
Dmitrii Kovalkov	1b789e8d7c	fix(pgxn/neon): Use proper member size in TermsCollectedMset and VotesCollectedMset (#11785 ) ## Problem `TermsCollectedMset` and `VotesCollectedMset` accept a MemberSet argument to find a quorum in. It may be either `wp->mconf.members` or `wp->mconf.new_members`. But the loops inside always use `wp->mconf.members.len`. If the sizes of member sets are different, it may lead to these functions not scanning all the safekeepers from `mset`. We are not planning to change the member set size dynamically now, but it's worth fixing anyway. - Part of https://github.com/neondatabase/neon/issues/11669 ## Summary of changes - Use proper size of member set in `TermsCollectedMset` and `VotesCollectedMset`	2025-04-30 16:50:21 +00:00
Arpad Müller	bec7427d9e	pull_timeline and sk logging fixes (#11786 ) This patch contains some fixes of issues I ran into for #11712: * make `pull_timeline` return success for timeline that already exists. This follows general API design of storage components: API endpoints are retryable and converge to a status code, instead of starting to error. We change the `pull_timeline`'s return type a little bit, because we might not actually have a source sk to pull from. Note that the fix is not enough, there is still a race when two `pull_timeline` instances happen in parallel: we might try to enter both pulled timelines at the same time. That can be fixed later. * make `pull_timeline` support one safekeeper being down. In general, if one safekeeper is down, that's not a problem. the added comment explains a potential situation (found in the `test_lagging_sk` test for example) * don't log very long errors when computes try to connect to safekeepers that don't have the timeline yet, if `allow_timeline_creation` is false. That flag is enabled when a sk connection string with generation numbers is passed to the compute, so we'll hit this code path more often. E.g. when a safekeeper missed a timeline creation, but the compute connects to it first before the `pull_timeline` gets requested by the storcon reconciler: this is a perfectly normal situation. So don't log the whole error backtrace, and don't log it on the error log level, but only on info. part of #11670	2025-04-30 16:24:01 +00:00
Alex Chi Z.	e2db76b9be	feat(pageserver): ondemand download reason observability (#11780 ) ## Problem Part of https://github.com/neondatabase/neon/issues/11615 ## Summary of changes We don't understand the root cause of why we get resident size surge every now and then. This patch adds observability for that, and in the next week, we might have a better understanding of what's going on. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-30 16:04:00 +00:00
Alex Chi Z.	6b4b8e0d8b	fix(pageserver): do not increase basebackup err counter when shutdown (#11778 ) ## Problem We occasionally see basebackup errors alerts but there were no errors logged. Looking at the code, the only codepath that will cause this is shutting down. ## Summary of changes Do not increase any counter (ok/err) when basebackup request gets cancelled due to shutdowns. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-30 15:50:12 +00:00
Konstantin Knizhnik	1d68577fbd	Check target slot state in prefetch_wait_for (#11779 ) ## Problem See https://neondb.slack.com/archives/C04DGM6SMTM/p1745599814030679 Assume the following scenario: `prefetch_wait_for` is doing `CHECK_FOR_INTERRUPTS`, which tries to load prefetch responses. In case of error it calls `pageserver_disconnect`, which aborts all in-flight requests. But such a failure is not detected by `prefetch_wait_for`, which returns true. As a result `communicator_read_at_lsnv` assumes that the slot is received, but since asserts are disabled in prod, this is not actually checked. Then it tries to interpret the response and ... SIGSEGV ## Summary of changes Check target slot state in `prefetch_wait_for`. Resolves https://github.com/neondatabase/cloud/issues/28258 Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-04-30 12:44:59 +00:00
Arseny Sher	60f63c076f	Make safekeeper proto version 3 default (#11518 ) ## Problem We have been running compute <-> sk protocol version 3 for a while on staging with no issues observed, and want to fully migrate to it eventually. ## Summary of changes Let's make v3 the default. ref https://github.com/neondatabase/neon/issues/10326 --------- Co-authored-by: Arpad Müller <arpad@neon.tech>	2025-04-30 12:23:20 +00:00
Mikhail Kot	8da4ec9740	Postgres metrics for stuck getpage requests (#11710 ) https://github.com/neondatabase/neon/issues/10327 Resolves: #11720 New metrics: - `compute_getpage_stuck_requests_total` - `compute_getpage_max_inflight_stuck_time_ms`	2025-04-30 12:01:41 +00:00
Em Sharnoff	b48404952d	Bump vm-builder: v0.42.2 -> v0.46.0 (#11782 ) Bumped to pick up the changes from neondatabase/autoscaling#1366 — specifically including `uname` in the logs. Other changes included: * neondatabase/autoscaling#1301 * neondatabase/autoscaling#1296	2025-04-30 11:32:25 +00:00
devin-ai-integration[bot]	1d06172d59	pageserver: remove resident size from billing metrics (#11699 ) This is a rebase of PR #10739 by @henryliu2014 on the current main branch. ## Problem pageserver: remove resident size from billing metrics Fixes #10388 ## Summary of changes The following changes have been made to remove resident size from billing metrics: * removed the metric "resident_size" and related codes in consumption_metrics/metrics.rs * removed the item of the description of metric "resident_size" in consumption_metrics.md * refactored the metric "resident_size" related test case Requested by: John Spray (john@neon.tech) --------- Co-authored-by: liuheqing <hq.liu@qq.com> Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com> Co-authored-by: John Spray <john@neon.tech>	2025-04-29 18:34:56 +00:00
Elizabeth Murray	a08c1a23eb	Upgrade the pgrag version in the compute Dockerfile. (#11687 ) Update the compute Dockerfile to use a new version of pgrag. The new version of pgrag uses the latest pgrx, and has a fix that terminates background workers on postmaster exit.	2025-04-29 16:50:18 +00:00
John Spray	a2adc7dbd3	storcon: avoid multiple initdbs when shard 0 has stale locations (#11760 ) ## Problem In #11727 I overlooked the case of multiple attached locations for shard 0. I misread the code and thought `create_one` acts on one location, but it actually acts on one _shard_, which is potentially multiple locations. This was not a regression, but it meant that the fix was incomplete. ## Summary of changes - In `create_one`, when updating shard zero, have any "other" locations use the initdb from shard 0	2025-04-29 15:31:52 +00:00
Vlad Lazar	768a580373	pageserver: add not modified since lsn to get page span (#11774 ) It's useful when debugging.	2025-04-29 14:07:23 +00:00
Folke Behrens	09247de8d5	proxy: Enable JSON logging by default (#11772 ) This does not affect local_proxy.	2025-04-29 13:11:24 +00:00
Arpad Müller	0b35929211	Make SafekeeperReconciler parallel via semaphore (#11757 ) Right now we only support running one reconciliation per safekeeper. This is of course usually way below of what a safekeeper can do. Therefore, introduce a semaphore and spawn the tasks asynchronously as they come in. Part of #11670	2025-04-29 12:46:15 +00:00
Ivan Efremov	b3db7f66ac	fix(compute): Change the local_proxy log level (#11770 ) Related to the INC-496	2025-04-29 11:49:16 +00:00
a-masterov	498d852bde	Fix the empty region if run on schedule (#11764 ) ## Problem When the workflow ran on a schedule, the `region_id` input was not set. As a result, an empty region value was used, which caused errors during execution. ## Summary of Changes - Added fallback logic to set a default region (`aws-us-east-2`) when `region_id` is not provided. - Ensures the workflow works correctly both when triggered manually (`workflow_dispatch`) and on schedule (`cron`).	2025-04-29 09:12:14 +00:00
Busra Kugler	7f8b1d79c0	Replace dorny/paths-filter with step-security maintained version (#11663 ) ## Problem Our CI/CD security tool StepSecurity maintains safer forks of popular GitHub Actions with low security scores. We're replacing dorny/paths-filter with the maintained step-security/paths-filter version to reduce risk of supply chain breaches and potential CVEs. ## Summary of changes replace ```uses: dorny/paths-filter@de90cc6fb3 ``` with ```uses: step-security/paths-filter@v3``` This PR will fix: neondatabase/cloud#26141	2025-04-29 09:02:01 +00:00
JC Grünhage	d15f2ff57a	fix(lint-release-pr): adjust lint and action to match (#11766 ) ## Problem The `lint-release-pr` workflow run for https://github.com/neondatabase/neon/pull/11763 failed, because the new action did not match the lint. ## Summary of changes Include time in expected merge message regex.	2025-04-29 08:56:44 +00:00
Konstantin Knizhnik	3593356c10	Prewarm sql api (#11742 ) ## Problem Continue work on prewarm, see https://github.com/neondatabase/neon/pull/11740 https://github.com/neondatabase/neon/pull/11741 ## Summary of changes Add SQL API to prewarm --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-04-29 06:44:28 +00:00
github-actions[bot]	37d555aa59	Proxy release 2025/04/29 06:01 UTC	2025-04-29 06:01:28 +00:00
Tristan Partin	9e8ab2ab4f	Skip remote extensions WITH_LIB test when sanitizers are enabled (#11758 ) In order for the test to work when sanitizers are enabled, we would need to compile the dummy Postgres extension with the same sanitizer flags that we compile Postgres and the neon extension with. Doing this work would be a little more than trivial, so skipping is the best option, at least for now. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-04-28 19:13:35 +00:00
Alex Chi Z.	c1ff7db187	fix(pageserver): consider tombstones in replorigin (#11752 ) ## Problem We didn't consider tombstones in replorigin read path in the past. This was fine because tombstones are stored as LSN::Invalid before we universally define what the tombstone is for sparse keyspaces. Now we remove non-inherited keys during detach ancestor and write the universal tombstone "empty image". So we need to consider it across all the read paths. related: https://github.com/neondatabase/neon/pull/11299 ## Summary of changes Empty value gets ignored for replorigin scans. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-28 18:54:26 +00:00
Konstantin Knizhnik	6d6b83e737	Prewarm implementation (#11741 ) ## Problem Continue work on prewarm started in PR https://github.com/neondatabase/neon/pull/11740 ## Summary of changes Implement prewarm using prefetch --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-04-28 18:17:03 +00:00
John Spray	0482690534	pageserver: make control_plane_api & generations fully mandatory (#10715 ) ## Problem We had retained the ability to run in a generation-less mode to support test_generations_upgrade, which was replaced with a cleaner backward compat test in https://github.com/neondatabase/neon/pull/10701 ## Summary of changes - Remove all the special cases for "if no generation" or "if no control plane api" - Make control_plane_api config mandatory --------- Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2025-04-28 17:24:55 +00:00
Tristan Partin	a750026c2e	Fix compiler warning in libpagestore.c when WITH_SANITIZERS=yes (#11755 ) Postgres has a nice self-documenting macro called pg_unreachable() when you want to assert that a location in code won't be hit. Warning in question: ``` /home/tristan957/Projects/work/neon//pgxn/neon/libpagestore.c: In function ‘pageserver_connect’: /home/tristan957/Projects/work/neon//pgxn/neon/libpagestore.c:739:1: warning: control reaches end of non-void function [-Wreturn-type] 739 \| } \| ^ ``` Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-04-28 17:09:48 +00:00
John Spray	998d2c2ce9	storcon: use shard 0's initdb for timeline creation (#11727 ) ## Problem In principle, pageservers with different postgres binaries might generate different initdbs, resulting in inconsistency between shards. To avoid that, we should have shard 0 generate the initdb and other shards re-use it. Fixes: https://github.com/neondatabase/neon/issues/11340 ## Summary of changes - For shards with index greater than zero, set `existing_initdb_timeline_id` in timeline creation to consume the existing initdb rather than creating a new one	2025-04-28 16:43:35 +00:00
JC Grünhage	b1fa68f659	impr(ci): switch release PR creation over to use python based action (#11679 ) ## Problem Our different repositories both had code to achieve very similar results in terms of release PR creation, but they were structured differently and had different extensions. This was likely to cause maintainability problems in the long run. ## Summary of changes Switch to a python cli based composite action for creating the release PRs that will also be introduced in our other repos later. ## To Do - [ ] Adjust our docs to reflect the changes from this.	2025-04-28 16:37:36 +00:00
devin-ai-integration[bot]	84bc3380cc	Remove SAFEKEEPER_AUTH_TOKEN env var parsing from safekeeper (#11698 ) # Remove SAFEKEEPER_AUTH_TOKEN env var parsing from safekeeper This PR is a follow-up to #11443 that removes the parsing of the `SAFEKEEPER_AUTH_TOKEN` environment variable from the safekeeper codebase while keeping the `auth_token_path` CLI flag functionality. ## Changes: - Removed code that checks for the `SAFEKEEPER_AUTH_TOKEN` environment variable - Updated comments to reflect that only the `auth_token_path` CLI flag is now used As mentioned in PR #11443, the environment variable approach was planned to be deprecated and removed in favor of the file-based approach, which is more secure since environment variables can be quite public in both procfs and unit files. Link to Devin run: https://app.devin.ai/sessions/d6f56cf1b4164ea9880a9a06358a58ac Requested by: arpad@neon.tech --------- Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com> Co-authored-by: arpad@neon.tech <arpad@neon.tech> Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2025-04-28 15:34:47 +00:00
Alex Chi Z.	11f6044338	fix(pageserver): report synthetic size = 1 if all tls offloaded (2) (#11731 ) ## Problem https://github.com/neondatabase/neon/pull/11648 did this for resident size instead of synthetic size. ## Summary of changes Report synthetic_size == 1 if all timelines are offloaded. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-28 13:45:45 +00:00
Konstantin Knizhnik	692c0f3fb8	Prepare to prewarm support (#11740 ) ## Problem See (original prewarm implementation) https://github.com/neondatabase/neon/pull/9197 (functions for storing/restoring LFC state) https://github.com/neondatabase/neon/pull/9587 (store prefetch results in LFC) https://github.com/neondatabase/neon/pull/10442 ## Summary of changes Preparation for prewarm implementation. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-04-28 13:24:18 +00:00
Alexander Bayandin	2b1d2a55d6	CI: fix typo oicd -> oidc (#11747 ) ## Problem It's OIDC (OpenID Connect), not OICD ## Summary of changes - Rename actions input `aws-oicd-role-arn` -> `aws-oidc-role-arn`	2025-04-28 12:44:28 +00:00
Konstantin Knizhnik	60b9fb1baf	Ignore unlogged LSNs in set last written LSN (#11743 ) ## Problem See https://github.com/neondatabase/neon/issues/11718 and https://neondb.slack.com/archives/C033RQ5SPDH/p1745122797538509 GiST and other indexes performing an "unlogged build" use so-called fake LSNs - not a real LSN, but something like 0/1. When stored in the LwLSN cache, they cause incorrect lookups at PS. ## Summary of changes Do not store fake LSNs in LwLSN hash. Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-04-28 12:16:29 +00:00
Erik Grinaker	606f14034e	pageserver: improve `pageserver_smgr_query_seconds` buckets (#11680 ) ## Problem The `pageserver_smgr_query_seconds` buckets are too coarse, using powers of 10: 1 µs, 10 µs, 100 µs, 1 ms, 10 ms, 100 ms, 1 s, 10 s, 100 s. This is one of our most crucial latency metrics, and needs better resolution. Touches #11594. ## Summary of changes This patch uses buckets with better resolution around 1 ms (the typical latency): * 0.6 ms * 1 ms * 3 ms * 6 ms * 10 ms * 30 ms * 100 ms * 1 s * 3 s These will be the same as the compute's `compute_getpage_wait_seconds`, to make them comparable across the compute and Pageserver: https://github.com/neondatabase/flux-fleet/pull/579. We sacrifice buckets above 3 s, since these can already be considered "too slow". This does not change the previously used `CRITICAL_OP_BUCKETS`, which is also used for other operations on different timescales (e.g. LSN waits). We should consider replacing this with more appropriate buckets for specific operations, since it covers a large span with low resolution.	2025-04-28 11:52:44 +00:00
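To illustrate the resolution gain described in the commit above, here is a minimal Python sketch that compares which bucket a few latencies fall into under the old and the new boundaries. The boundary values are taken from the commit message; the helper name `bucket_for` is hypothetical, and this is not the Pageserver's actual metrics code.

```python
from bisect import bisect_left

# Bucket upper bounds in seconds, as listed in the commit message.
OLD_BUCKETS = [1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0]
NEW_BUCKETS = [0.0006, 0.001, 0.003, 0.006, 0.010, 0.030, 0.100, 1.0, 3.0]


def bucket_for(latency_s, buckets):
    """Return the smallest upper bound >= latency_s, or None for the +Inf bucket.

    `bucket_for` is a hypothetical helper, used only for this illustration.
    """
    i = bisect_left(buckets, latency_s)
    return buckets[i] if i < len(buckets) else None


if __name__ == "__main__":
    for latency in (0.002, 0.005, 0.008):
        print(latency, bucket_for(latency, OLD_BUCKETS), bucket_for(latency, NEW_BUCKETS))
```

With the old boundaries all three sample latencies land in the same 10 ms bucket, while the new boundaries separate them into the 3 ms, 6 ms and 10 ms buckets.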
Conrad Ludgate	32393b4393	pg-sni-router: support compute TLS on different port (#11732 ) ## Problem pg-sni-router isn't aware of compute TLS ## Summary of changes If connections come in on port 4433, we require TLS to compute from pg-sni-router	2025-04-28 11:29:44 +00:00
Alexander Bayandin	1a29f5672a	CI(check-macos-build): trigger workflow automatically for PRs (#11706 ) ## Problem - if-conditions for the `check-macos-build` workflow don't trigger it on PRs with relevant changes (in Rust code or Postgres submodules). - Jobs in the workflow depend on the presence of a cache, which is not guaranteed. ## Summary of changes - Fix if-conditions - Use artifacts on top of cache whenever the workflow depends on it — the cache might not be available	2025-04-28 09:03:10 +00:00
a-masterov	b8d47b5acf	Run the extensions' tests on staging (#11164 ) ## Problem We currently don't run end-to-end tests for PostgreSQL extensions on our cloud infrastructure, which means we might miss problems that only occur in a real cloud environment. ## Summary of changes - Added a workflow to run extension tests against a cloud staging instance - Set up proper project configuration for extension testing - Implemented test execution with appropriate environment settings - Added error handling and reporting for test failures --------- Co-authored-by: Alexander Bayandin <alexander@neon.tech> Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2025-04-28 08:13:49 +00:00
Alexander Lakhin	97e01ae6fd	Add workflow to run particular test(s) N times (#11050 ) ## Problem Provide an easy way to run particular test(s) N times on CI. ## Summary of changes * Allow for passing the test selection and the number of test runs to the existing "Build and Test Locally" workflow * Allow for running multiple selected tests by the "Pytest regression tests" step * Introduce a new workflow to run specified test(s) several times * Store results in a separate database to distinguish between testing tests for stability and usual testing	2025-04-28 04:04:37 +00:00
Lokesh	459d51974c	doc: minor updates to consumption-metrics document (#7153 ) ## Problem Proposed minor changes to the `consumption_metrics` document. ## Summary of changes - Fixed minor typos in the document. - Minor formatting in the description of metrics `timeline_logical_size` and `synthetic_storage_size`. This makes it consistent with the description of other metrics in the document. ## Checklist before requesting a review - [x] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist Co-authored-by: Mikhail Kot <mikhail@neon.tech>	2025-04-25 19:46:40 +00:00
StepSecurity Bot	902d361107	CI/CD Hardening: Fixing StepSecurity Flagged Issues (#11724 ) ### Summary I'm fixing one or more of the following CI/CD misconfigurations to improve security. Please feel free to leave a comment if you think the current permissions for the GITHUB_TOKEN should not be restricted so I can take a note of it as accepted behaviour. - Restrict permissions for GITHUB_TOKEN - Add step-security/harden-runner - Pin Actions to a full length commit SHA ### Security Fixes will fix https://github.com/neondatabase/cloud/issues/26141	2025-04-25 14:36:45 +00:00
Dmitrii Kovalkov	ef53a76434	storage_broker: https handler (#11603 ) ## Problem Broker supports only HTTP, no HTTPS - Closes: https://github.com/neondatabase/cloud/issues/27492 ## Summary of changes - Add `listen_https_addr`, `ssl_key_file`, `ssl_cert_file`, `ssl_cert_reload_period` arguments to storage broker - Make `listen_addr` argument optional - Listen https in storage broker - Support https for storage broker request in neon_local - Add `use_https_storage_broker_api` option to NeonEnvBuilder	2025-04-25 14:28:56 +00:00
Vlad Lazar	6f0046b688	storage_controller: ensure mutual exclusion for imports and shard splits (#11632 ) ## Problem Shard splits break timeline imports. ## Summary of Changes Ensure mutual exclusion for imports and shard splits. On the shard split code path: 1. Right before shard splitting, check the database to ensure that no import is ongoing for the tenant. Exclusion is guaranteed because this validation is done while holding the exclusive tenant lock. Timeline creation (and import creation implicitly) requires a shared tenant lock. 2. When selecting a shard to split, use the in-mem state to exclude shards with an ongoing import. This is opportunistic since an import might start after the check, but allows shard splits to make progress instead of continuously retrying to split the same shard. On the timeline creation code path: 1. Check the in-memory splitting flag on all shards of the tenant. If any of them are splitting, error out asking the client to retry. On the happy path this is not required, due to the tenant lock set-up described above, but it covers the case where we restart with a pending shard-split. Closes https://github.com/neondatabase/neon/issues/11567	2025-04-25 11:46:15 +00:00
Em Sharnoff	2b0248cd76	fix(proxy): s/Console/Control plane/ in cplane error (#11716 ) I got bamboozled by the error message while debugging, seems no objections to updating it. ref https://neondb.slack.com/archives/C060N3SEF9D/p1745570961111509 ref https://neondb.slack.com/archives/C039YKBRZB4/p1745570811957019?thread_ts=1745393368.283599	2025-04-25 11:09:56 +00:00
Fedor Dikarev	7b03216dca	CI(check-macos-build): use gh native cache (#11707 ) ## Problem - using Hetzner buckets for cache requires secrets, so we would need `secrets: inherit` to make it work - we don't have self-hosted macOS runners, so GH native cache is the better solution there ## Summary of changes - switch to GH native cache for macos builds	2025-04-25 09:18:20 +00:00
a-masterov	992aa91075	Refresh the codestyle of docker compose test script (#11715 ) ## Problem The docker compose test script (`docker_compose_test.sh`) had inconsistent codestyle, mixing legacy syntax with modern approaches and not following best practices at all. This inconsistency could lead to potential issues with variable expansion, path handling, and maintainability. ## Summary of changes This PR modernizes the test script with several codestyle improvements: * Variable scoping and exports: * Added proper export declarations for environment variables * Added explicit COMPOSE_PROFILES export to avoid repetitive flags * Modern Bash syntax: * Replaced [ ] with [[ ]] for safer conditional testing * Used arithmetic operations (( cnt += 3 )) instead of expr * Added proper variable expansion with braces ${variable} * Added proper quoting around variables and paths with "${variable}" * Docker Compose commands: * Replaced hardcoded container names with service names * Used docker compose exec instead of docker exec $CONTAINER_NAME * Removed repetitive flags by using environment variables * Shell script best practices: * Added function keyword before function definition * Used safer path handling with "$(dirname "${0}")" These changes make the script more maintainable, less error-prone, and more consistent with modern shell scripting standards.	2025-04-25 09:13:35 +00:00
Conrad Ludgate	afe9b27983	fix(compute/tls): support for checking certificate chains (#11683 ) ## Problem It seems our production-ready cert-manager setup now includes a full certificate chain. This was not accounted for and the decoder would error. ## Summary of changes Change the way we decode certificates to support cert-chains, ignoring all but the first cert. This also changes a log line to not use multi-line errors. ~~I have tested this code manually against real certificates/keys, I didn't want to embed those in a test just yet, not until the cert expires in 24 hours.~~	2025-04-25 09:09:14 +00:00
Alex Chi Z.	5d91d4e843	fix(pageserver): reduce gc-compaction memory usage (#11709 ) ## Problem close https://github.com/neondatabase/neon/issues/11694 We had the delta layer iterator and image layer iterator set to buffer at most 8MB data. Note that 8MB is the compressed size, so it is possible for those iterators to contain more than 8MB data in memory. For the recent OOM case, gc-compaction was running over 556 layers, which means that we will have 556 active iterators. So in theory, it could take up to 556*8=4448MB memory when the compaction is going on. If images get compressed and the compression ratio is high (for that tenant, we see 3x compression ratio across image layers), then that's 13344MB memory. Also we have layer rewrites, which explains the memory taken by gc-compaction itself (versus the iterators). We rewrite 424 out of 556 layers, and each of such rewrites needs a pair of delta layer writers. So we are buffering a lot of deltas in memory. The flamegraph shows that gc-compaction itself takes 6GB memory, delta iterator 7GB, and image iterator 2GB, which can be explained by the above theory. ## Summary of changes - Reduce the buffer sizes. - Estimate memory consumption and give up if it is too high. - Also give up if the number of layers-to-rewrite is too high. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-25 08:45:31 +00:00
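The worst-case estimate in the commit above can be reproduced with a few lines of arithmetic. The following Python sketch only restates the numbers from the commit message (556 layers, 8 MB buffer per iterator, 3x compression ratio); the function name `worst_case_iterator_memory_mb` is hypothetical and does not correspond to the estimation code added by the PR.

```python
def worst_case_iterator_memory_mb(num_layers, buffer_mb, compression_ratio=1):
    """Upper bound on iterator buffer memory in MB.

    Assumes one active iterator per layer, each buffering up to `buffer_mb`
    of compressed data that expands by `compression_ratio` in memory.
    Hypothetical helper for illustration only.
    """
    return num_layers * buffer_mb * compression_ratio


if __name__ == "__main__":
    # Numbers from the commit message.
    print(worst_case_iterator_memory_mb(556, 8))     # 4448
    print(worst_case_iterator_memory_mb(556, 8, 3))  # 13344
```

An estimate of this kind is cheap to compute before compaction starts, which is what makes a "give up if it is too high" check practical.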
Alexander Bayandin	2465e9141f	test_runner: bump `httpcore` to 1.0.9 and `h11` to 0.16.0 (#11711 ) ## Problem https://github.com/advisories/GHSA-vqfr-h8mv-ghfj ## Summary of changes - Bump `h11` to 0.16.0 (required to bump `httpcore` to 1.0.9)	2025-04-25 08:44:40 +00:00
Tristan Partin	2526f6aea1	Add remote extension test with library component (#11301 ) The current test was just SQL files only, but we also want to test a remote extension which includes a loadable library. With both extensions we should cover a larger portion of compute_ctl's remote extension code paths. Fixes: https://github.com/neondatabase/neon/issues/11146 Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-04-24 22:33:46 +00:00
Vlad Lazar	5ba7315c84	storage_controller: reconcile completed imports at start-up (#11614 ) ## Problem In https://github.com/neondatabase/neon/pull/11345 coordination of imports moved to the storage controller. It involves notifying cplane when the import has been completed by calling an idempotent endpoint. If the storage controller shuts down in the middle of finalizing an import, it would never be retried. ## Summary of changes Reconcile imports at start-up by fetching the complete imports from the database and spawning a background task which notifies cplane. Closes: https://github.com/neondatabase/neon/issues/11570	2025-04-24 18:39:19 +00:00
Vlad Lazar	6f7e3c18e4	storage_controller: make leadership protocol more robust (#11703 ) ## Problem We saw the following scenario in staging: 1. Pod A starts up. Becomes leader and steps down the previous pod cleanly. 2. Pod B starts up (deployment). 3. Step down request from pod B to pod A times out. Pod A did not manage to stop its reconciliations within 10 seconds and exited with return code 1 ([code](`7ba8519b43/storage_controller/src/service.rs (L8686-L8702)`)). 4. Pod B marks itself as the leader and finishes start-up 5. k8s restarts pod A 6. k8s marks pod B as ready 7. pod A sends step down request to pod B - this succeeds => pod A is now the leader 8. k8s kills pod A because it thinks pod B is healthy and pod A is part of the old replica set We end up in a situation where the only pod we have (B) is stepped down and attempts to forward requests to a leader that doesn't exist. k8s can't detect that pod B is in a bad state since the /status endpoint simply returns 200 if the pod is running. ## Summary of changes This PR includes a number of robustness improvements to the leadership protocol: * use a single step down task per controller * add a new endpoint to be used as k8s liveness probe and check leadership status there * handle restarts explicitly (i.e. don't step yourself down) * increase the step down retry count * don't kill the process on long step down since k8s will just restart it	2025-04-24 16:59:56 +00:00
Christian Schwarz	8afb783708	feat: Direct IO for the pageserver write path (#11558 ) # Problem The Pageserver read path exclusively uses direct IO if `virtual_file_io_mode=direct`. The write path is half-finished. Here is what the various writing components use: \|what\|buffering\|flags on <br/>`v_f_io_mode`<br/>=`buffered`\|flags on <br/>`virtual_file_io_mode`<br/>=`direct`\| \|-\|-\|-\|-\| \|`DeltaLayerWriter`\| BlobWriter<BUFFERED=true> \| () \| () \| \|`ImageLayerWriter`\| BlobWriter<BUFFERED=false> \| () \| () \| \|`download_layer_file`\|BufferedWriter\|()\|()\| \|`InMemoryLayer`\|BufferedWriter\|()\|O_DIRECT\| The vehicle towards direct IO support is `BufferedWriter` which - largely takes care of O_DIRECT alignment & size-multiple requirements - double-buffering to mask latency `DeltaLayerWriter`, `ImageLayerWriter` use `blob_io::BlobWriter` , which has neither of these. # Changes ## High-Level At a high-level this PR makes the following primary changes: - switch the two layer writer types to use `BufferedWriter` & make sensitive to `virtual_file_io_mode` (via open_with_options_v2) - make `download_layer_file` sensitive to `virtual_file_io_mode` (also via open_with_options_v2) - add `virtual_file_io_mode=direct-rw` as a feature gate - we're hackish-ly piggybacking on OpenOptions's ask for write access here - this means with just `=direct` InMemoryLayer reads and writes no longer uses O_DIRECT - this is transitory and we'll remove the `direct-rw` variant once the rollout is complete (The `_v2` APIs for opening / creating VirtualFile are those that are sensitive to `virtual_file_io_mode`) The result is: \|what\|uses <br/>`BufferedWriter`\|flags on <br/>`v_f_io_mode`<br/>=`buffered`\|flags on <br/>`v_f_io_mode`<br/>=`direct`\|flags on <br/>`v_f_io_mode`<br/>=`direct-rw`\| \|-\|-\|-\|-\|-\| \|`DeltaLayerWriter`\| ~~Blob~~BufferedWriter \| () \| () \| O_DIRECT \| \|`ImageLayerWriter`\| ~~Blob~~BufferedWriter \| () \| () \| O_DIRECT \| 
\|`download_layer_file`\|BufferedWriter\|()\|()\|O_DIRECT\| \|`InMemoryLayer`\|BufferedWriter\|()\|~~O_DIRECT~~()\|O_DIRECT\| ## Code-Level The main change is: - Switch `blob_io::BlobWriter` away from its own buffering method to use `BufferedWriter`. Additional prep for upholding `O_DIRECT` requirements: - Layer writer `finish()` methods switched to use IoBufferMut for guaranteed buffer address alignment. The size of the buffers is PAGE_SZ and thereby implicitly assumed to fulfill O_DIRECT requirements. For the hacky feature-gating via `=direct-rw`: - Track `OpenOptions::write(true\|false)` in a field; bunch of mechanical churn. - Consolidate the APIs in which we "open" or "create" VirtualFile for better overview over which parts of the code use the `_v2` APIs. Necessary refactorings & infra work: - Add doc comments explaining how BufferedWriter ensures that writes are compliant with O_DIRECT alignment & size constraints. This isn't new, but should be spelled out. - Add the concept of shutdown modes to `BufferedWriter::shutdown` to make writer shutdown adhere to these constraints. - The `PadThenTruncate` mode might not be necessary in practice because I believe all layer files ever written are sized in multiples of `PAGE_SZ` and since `PAGE_SZ` is larger than the current alignment requirements (512/4k depending on platform), it won't be necessary to pad. - Some test (I believe `round_trip_test_compressed`?) required it though - [ ] TODO: decide if we want to accept that complexity; if we do then address TODO in the code to separate alignment requirement from buffer capacity - Add `set_len` (=`ftruncate`) VirtualFile operation to support the above. - Allow `BufferedWriter` to start at a non-zero offset (to make room for the summary block). Cleanups unlocked by this change: - Remove non-positional APIs from VirtualFile (e.g. 
seek, write_full, read_full) Drive-by fixes: - PR https://github.com/neondatabase/neon/pull/11585 aimed to run unit tests for all `virtual_file_io_mode` combinations but didn't because of a missing `_` in the env var. # Performance This section assesses this PR's impact on deployments with current production setting (`=direct`) and anticipated impact of switching to (`=direct-rw`). For `DeltaLayerWriter`, throughput under `=direct` should remain unchanged or slightly improve because the `BlobWriter`'s buffer had the same size as the `BufferedWriter`'s buffer, but it didn't have the double-buffering that `BufferedWriter` has. The `=direct-rw` enables direct IO; throughput should not be suffering because of double-buffering; benchmarks will show if this is true. The `ImageLayerWriter` was previously not doing any buffering (`BUFFERED=false`). It went straight to issuing the IO operation to the underlying VirtualFile and the buffering was done by the kernel. The switch to `BufferedWriter` under `=direct` adds an additional memcpy into the BufferedWriter's buffer. We will win back that memcpy when enabling direct IO via `=direct-rw`. A nice win from the switch to `BufferedWriter` is that ImageLayerWriter performs >=16x fewer write operations to VirtualFile (the BlobWriter performs one write per len field and one write per image value). This should save low tens of microseconds of CPU overhead from doing all these syscalls/io_uring operations, regardless of `=direct` or `=direct-rw`. Aside from problems with alignment, this write frequency without double-buffering is prohibitive if we actually have to wait for the disk, which is what will happen when we enable direct IO via (`=direct-rw`). Throughput should not be suffering because of BufferedWriter's double-buffering; benchmarks will show if this is true. `InMemoryLayer` at `=direct` will flip back to using buffered IO but remain on BufferedWriter. The buffered IO adds back one memcpy of CPU overhead. 
Throughput should not suffer and might improve on not-memory-pressured Pageservers but let's remember that we're doing the whole direct IO thing to eliminate global memory pressure as a source of perf variability. ## bench_ingest I reran `bench_ingest` on `im4gn.2xlarge` and `Hetzner AX102`. Use `git diff` with `--word-diff` or similar to see the change. General guidance on interpretation: - immediate production impact of this PR without production config change can be gauged by comparing the same `io_mode=Direct` - end state of production switched over to `io_mode=DirectRw` can be gauged by comparing old results' `io_mode=Direct` to new results' `io_mode=DirectRw` Given above guidance, on `im4gn.2xlarge` - immediate impact is a significant improvement in all cases - end state after switching has same significant improvements in all cases - ... except `ingest/io_mode=DirectRw volume_mib=128 key_size_bytes=8192 key_layout=Sequential write_delta=Yes` which only achieves `238 MiB/s` instead of `253.43 MiB/s` - this is a 6% degradation - this workload is typical for image layer creation # Refs - epic https://github.com/neondatabase/neon/issues/9868 - stacked atop - preliminary refactor https://github.com/neondatabase/neon/pull/11549 - bench_ingest overhaul https://github.com/neondatabase/neon/pull/11667 - derived from https://github.com/neondatabase/neon/pull/10063 Co-authored-by: Yuchen Liang <yuchen@neon.tech>	2025-04-24 14:57:36 +00:00
Konstantin Knizhnik	1531712555	Undo commit `d1728a6bcd` because it causes problems with creating pg_search extension (#11700 ) ## Problem See https://neondb.slack.com/archives/C03H1K0PGKH/p1745489241982209 pg_search extension now can not be created. ## Summary of changes Undo `d1728a6bcd` Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-04-24 14:46:10 +00:00
Alexander Bayandin	5e989a3148	CI(build-tools): bump packages in build-tools image (#11697 ) ## Problem `cargo-deny` 0.16.2 spits a bunch of warnings like: ``` warning[index-failure]: unable to check for yanked crates ``` The issue is fixed for the latest version of `cargo-deny` (0.18.2). And while we're here, let's bump all the packages we have in `build-tools` image ## Summary of changes - bump cargo-hakari to 0.9.36 - bump cargo-deny to 0.18.2 - bump cargo-hack to 0.6.36 - bump cargo-nextest to 0.9.94 - bump diesel_cli to 2.2.9 - bump s5cmd to 2.3.0 - bump mold to 2.37.1 - bump python to 3.11.12	2025-04-24 14:13:04 +00:00
Alexey Kondratov	985056be37	feat(compute): Introduce Postgres downtime metrics (#11346 ) ## Problem Currently, we only report the timestamp of the last moment we think Postgres was active. The problem is that if Postgres gets completely unresponsive, we still report some old timestamp, and it's impossible to distinguish the situations 'Postgres is effectively down' and 'Postgres is running, but no client activity'. ## Summary of changes Refactor the `compute_ctl`'s compute monitor so that it is easier to track the connection errors and failed activity checks, and report - `now() - last_successful_check` as current downtime on any failure - cumulative Postgres downtime during the whole compute lifetime After adding a test, I also noticed that the compute monitor may not reconnect even though queries fail with `connection closed` or `error communicating with the server: Connection reset by peer (os error 54)`, but for some reason we do not catch it with `client.is_closed()`, so I added an explicit reconnect in case of any failures. Discussion: https://neondb.slack.com/archives/C03TN5G758R/p1742489426966639	2025-04-24 13:51:09 +00:00
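The two metrics named in the commit above can be sketched as follows. This is a simplified Python model, not the `compute_ctl` implementation: the class name `DowntimeTracker`, its fields, and in particular the way cumulative downtime is accumulated (adding the time elapsed since the previous check whenever a check fails) are assumptions made for illustration.

```python
class DowntimeTracker:
    """Hypothetical model of the current and cumulative downtime metrics."""

    def __init__(self, now):
        self.last_successful_check = now
        self.last_check = now
        self.current_downtime = 0.0
        self.total_downtime = 0.0

    def record(self, now, ok):
        """Record the outcome of one activity check at time `now` (seconds)."""
        if ok:
            self.last_successful_check = now
            self.current_downtime = 0.0
        else:
            # Current downtime as stated in the commit message:
            # now() - last_successful_check.
            self.current_downtime = now - self.last_successful_check
            # Assumption: the interval since the previous check counts as downtime.
            self.total_downtime += now - self.last_check
        self.last_check = now


if __name__ == "__main__":
    tracker = DowntimeTracker(now=0.0)
    for t, ok in [(10.0, True), (20.0, False), (30.0, False), (40.0, True)]:
        tracker.record(t, ok)
        print(t, ok, tracker.current_downtime, tracker.total_downtime)
```

Keeping the two values separate is what lets a dashboard distinguish "Postgres is down right now" from "Postgres was down at some point during this compute's lifetime".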
Christian Schwarz	9c6ff3aa2b	refactor(BufferedWriter): flush task owns the VirtualFile & abstraction for cleanup on drop (#11549 ) Main change: - `BufferedWriter` owns the `W`; no more `Arc<W>` - We introduce auto-delete-on-drop wrappers for `VirtualFile`. - `TempVirtualFile` for write-only users - `TempVirtualFileCoOwnedByEphemeralFileAndBufferedWriter` for EphemeralFile which requires read access to the immutable prefix of the file (see doc comments for details) - Users of `BufferedWriter` hand it such a wrapped `VirtualFile`. - The wrapped `VirtualFile` moves to the background flush task. - On `BufferedWriter` shutdown, ownership moves back. - Callers remove the wrapper (`disarm_into_inner()`) after doing final touches, e.g., flushing index blocks and summary for delta/image layer writers. If the BufferedWriter isn't shut down properly via `BufferedWriter::shutdown`, or if there is an error during final touches, the wrapper type ensures that the file gets unlinked. We store a GateGuard inside the wrapper to ensure that the Timeline is still alive when unlinking on drop. Rust doesn't have async drop yet, so, the unlinking happens using a synchronous syscall. NB we don't fsync the surrounding directory. This is how it's been before this PR; I believe it is correct because all of these files are temporary paths that get cleaned up on timeline load. Again, timeline load does not need to fsync because the next timeline load will unlink again if the file reappears. The auto-delete-on-drop can happen after a higher-level mechanism retries. Therefore, we switch all users to monotonically increasing, never-reused temp file disambiguators. The aspects pointed out in the last two paragraphs will receive further cleanup in follow-up task - https://github.com/neondatabase/neon/issues/11692 Drive-by changes: - It turns out we can remove the two-pronged code in the layer file download code. 
No need to make this a separate PR because all of production already uses `tokio-epoll-uring` with the buffered writer for many weeks. Refs - epic https://github.com/neondatabase/neon/issues/9868 - alternative to https://github.com/neondatabase/neon/pull/11544	2025-04-24 13:07:57 +00:00
Folke Behrens	9d472c79ce	Fix what's currently flagged by cargo deny (#11693 ) * Replace yanked papaya version * Remove unused allowed license: OpenSSL * Remove Zlib license from general allow list since it's listed in the exceptions section per crate * Drop clarification for ring since they have separate LICENSE files now * List the tower-otel repo as allowed source while we sort out the OTel deps	2025-04-24 13:02:31 +00:00
Arpad Müller	b43203928f	Switch tenant snapshot subcommand to remote_storage (#11685 ) Switches the tenant snapshot subcommand of the storage scrubber to `remote_storage`. As this is the last piece of the storage scrubber still using the S3 SDK, this finishes the project started in #7547. This allows us to do tenant snapshots on Azure as well. Builds on #11671 Fixes #8830	2025-04-24 12:22:07 +00:00
Arpad Müller	c35d489539	versioning API for remote_storage (#11671 ) Adds a versioning API to remote_storage. We want to use it in the scrubber, both for tenant snapshot as well as for metadata checks. for #8830 and for #11588	2025-04-24 11:41:48 +00:00
Vlad Lazar	3a50d95b6d	storage_controller: coordinate imports across shards in the storage controller (#11345 ) ## Problem Pageservers notify control plane directly when a shard import has completed. Control plane has to download the status of each shard from S3 and figure out if everything is truly done, before proceeding with branch activation. Issues with this approach are: * We can't control shard split behaviour on the storage controller side. It's unsafe to split during import. * Control plane needs to know about shards and implement logic to check all timelines are indeed ready. ## Summary of changes In short, storage controller coordinates imports, and, only when everything is done, notifies control plane. Big rocks: 1. Store timeline imports in the storage controller database. Each import stores the status of its shards in the database. We hook into the timeline creation call as our entry point for this. 2. Pageservers get a new upcall endpoint to notify the storage controller of shard import updates. 3. Storage controller handles these updates by updating persisted state. If an update finalizes the import, then poll pageservers until timeline activation, and, then, notify the control plane that the import is complete. Cplane side change with new endpoint is in https://github.com/neondatabase/cloud/pull/26166 Closes https://github.com/neondatabase/neon/issues/11566	2025-04-24 11:26:06 +00:00
Arpad Müller	d43b8e73ae	Update sentry to 0.37 (#11686 ) Update the sentry crate to 0.37. This deduplicates the `webpki-roots` crate in our crate graph, and brings another dependency onto newer rustls `0.23.18`.	2025-04-24 11:20:41 +00:00
devin-ai-integration[bot]	1808dad269	Add --dev CLI flag to pageserver and safekeeper binaries (#11526 ) # Add --dev CLI flag to pageserver and safekeeper binaries This PR adds the `--dev` CLI flag to both the pageserver and safekeeper binaries without implementing any functionality yet. This is a precursor to PR #11517, which will implement the full functionality to require authentication by default unless the `--dev` flag is specified. ## Changes - Add `dev_mode` config field to pageserver binary - Add `--dev` CLI flag to safekeeper binary This PR is needed for forward compatibility tests to work properly, when we try to merge #11517 Link to Devin run: https://app.devin.ai/sessions/ad8231b4e2be430398072b6fc4e85d46 Requested by: John Spray (john@neon.tech) --------- Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com> Co-authored-by: John Spray <john@neon.tech>	2025-04-24 10:45:40 +00:00
Folke Behrens	7ba8519b43	proxy: Update zerocopy to 0.8 (#11681 ) Also add some macros that might result in more efficient code.	2025-04-24 09:39:08 +00:00
Christian Schwarz	f8100d66d5	ci: extend 'Wait for extension build to finish' timeout (#11689 ) Refs - https://neondb.slack.com/archives/C059ZC138NR/p1745427571307149	2025-04-24 08:15:08 +00:00
Christian Schwarz	51cdb570eb	bench_ingest: general overhaul & add parametrization over `virtual_file_io_mode` (#11667 ) Changes: - clean up existing parametrization & criterion `BenchmarkId` - additional parametrization over `virtual_file_io_mode` - switch to `multi_thread` to be closer to production ([Slack thread](https://neondb.slack.com/archives/C033RQ5SPDH/p1745339543093159)) Refs - epic https://github.com/neondatabase/neon/issues/9868 - extracted from https://github.com/neondatabase/neon/pull/11558	2025-04-24 07:38:18 +00:00
devin-ai-integration[bot]	8e09ecf2ab	Fix KeyError in physical replication benchmark test (#11675 ) # Fix KeyError in physical replication benchmark test This PR fixes the failing physical replication benchmark test that was encountering a KeyError: 'endpoints'. The issue was in accessing `project["project"]["endpoints"][0]["id"]` when it should be `project["endpoints"][0]["id"]`, consistent with how endpoints are accessed elsewhere in the codebase. Fixed the issue in both test functions: - test_ro_replica_lag - test_replication_start_stop Link to Devin run: https://app.devin.ai/sessions/be3fe9a9ee5942e4b12e74a7055f541b Requested by: Peter Bendel Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com> Co-authored-by: peterbendel@neon.tech <peterbendel@neon.tech>	2025-04-23 14:51:08 +00:00
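The failure mode fixed in the commit above is easy to reproduce in isolation. The Python sketch below uses a made-up response dictionary whose shape is assumed from the commit message (a top-level `endpoints` list next to a `project` object); the IDs and the helper name `first_endpoint_id` are hypothetical.

```python
# Assumed response shape: "endpoints" sits at the top level, not under "project".
response = {
    "project": {"id": "example-project"},
    "endpoints": [{"id": "ep-example-123"}],
}


def first_endpoint_id(project):
    """Return the ID of the first endpoint, using the corrected access path."""
    return project["endpoints"][0]["id"]


if __name__ == "__main__":
    try:
        # The access path used before the fix.
        response["project"]["endpoints"][0]["id"]
    except KeyError as err:
        print("old access path fails with KeyError:", err)

    print("new access path:", first_endpoint_id(response))
```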
Mikhail Kot	c3534cea39	Rename object_storage->endpoint_storage (#11678 ) 1. Rename service to avoid ambiguity as discussed in Slack 2. Ignore endpoint_id in read paths as requested in https://github.com/neondatabase/cloud/issues/26346#issuecomment-2806758224	2025-04-23 14:03:19 +00:00
Folke Behrens	21d3d60cef	proxy/pglb: Add in-process connection support (#11677 ) Define a `Connection` and a `Stream` type that resemble simple QUIC connections and (multiplexed) streams.	2025-04-23 12:18:30 +00:00
Tristan Partin	b00db536bb	Add CPU architecture to the remote extensions object key (#11590 ) ARM computes are incoming and we need to account for that in remote extensions. Previously, we just blindly assumed that all computes were x86_64. Note that we use the Go architecture naming convention instead of the Rust one directly to do our best and be consistent across the stack. Part-of: https://github.com/neondatabase/cloud/issues/23148 Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-04-22 22:47:22 +00:00
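The naming difference mentioned in #11590 can be sketched as a lookup table. Rust reports `x86_64` and `aarch64`, while Go uses `amd64` and `arm64`. The function name is hypothetical, and how the result is embedded in the object key is not shown because the commit message does not describe the key layout.

```python
# Rust-style architecture names (as in std::env::consts::ARCH) mapped to
# Go-style names (as in GOARCH). Only the two architectures named in the
# commit message are covered.
RUST_TO_GO_ARCH = {
    "x86_64": "amd64",
    "aarch64": "arm64",
}


def go_arch_name(rust_arch: str) -> str:
    """Translate a Rust architecture name into the Go naming convention."""
    try:
        return RUST_TO_GO_ARCH[rust_arch]
    except KeyError:
        # Fail loudly instead of silently assuming x86_64.
        raise ValueError(f"unsupported architecture: {rust_arch}") from None
```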
Arpad Müller	149cbd1e0a	Support single and two safekeeper scenarios (#11483 ) In tests and when one safekeeper is down in small regions, we need to contend with one or two safekeepers. Before, we gave an error in `safekeepers_for_new_timeline`. Now we just silently allow the timeline to be created on one or two safekeepers. Part of #9011	2025-04-22 21:27:01 +00:00
Alexander Lakhin	7b949daf13	fix(test): allow reconcile errors in test_storage_controller_heartbeats (#11665 ) ## Problem test_storage_controller_heartbeats is flaky because of unallowed reconciler errors (#11625) ## Summary of changes Allow reconcile errors as in other tests in test_storage_controller.py.	2025-04-22 18:13:16 +00:00
Konstantin Knizhnik	132b6154bb	Unlogged build debug compare local v2 (#11554 ) ## Problem The init fork is used in DEBUG_COMPARE_LOCAL to detect an unlogged relation or an unlogged build. But it is created only after the relation is initialized and so can be swapped out, producing a `Page is evicted with zero LSN` error. ## Summary of changes Create the init fork together with the main fork for unlogged relations in DEBUG_COMPARE_LOCAL mode. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-04-22 18:07:45 +00:00
Alex Chi Z.	ad3519ebcb	fix(pageserver): report synthetic size = 1 if all tls offloaded (#11648 ) ## Problem A quick workaround for https://github.com/neondatabase/neon/issues/11631 ## Summary of changes Report synthetic size == 1 if all timelines are offloaded. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-22 14:28:22 +00:00
Dmitrii Kovalkov	6173c0f44c	safekeeper: add enable_tls_wal_service_api (#11520 ) ## Problem Safekeeper doesn't use TLS in wal service - Closes: https://github.com/neondatabase/cloud/issues/27302 ## Summary of changes - Add `enable_tls_wal_service_api` option to safekeeper's cmd arguments - Propagate `tls_server_config` to `wal_service` if the option is enabled - Create `BACKGROUND_RUNTIME` for small background tasks and offload SSL certificate reloader to it. No integration tests for now because support from compute side is required: https://github.com/neondatabase/cloud/issues/25823	2025-04-22 13:19:03 +00:00
a-masterov	fd916abf25	Remove NOTICE messages, which can make the pg_repack regression test fail. (#11659 ) ## Problem The pg_repack test can be flaky due to unpredictable `NOTICE` messages about waiting for some processes. E.g., ``` INFO: repacking table "public.issue3_2" +NOTICE: Waiting for 1 transactions to finish. First PID: 427 ``` ## Summary of changes Set `client_min_messages` to `warning` for the regression tests.	2025-04-22 11:43:45 +00:00
Alexander Bayandin	cd2e1fbc7c	CI(benchmarks): upload perf results for passed tests (#11649 ) ## Problem We run benchmarks in batches (five parallel jobs on different runners). If any test in a batch fails, we won’t upload any results for that batch, even for the tests that passed. ## Summary of changes - Move the results upload to a separate step in the run-python-test-set action, and execute this step even if tests fail.	2025-04-22 09:41:28 +00:00
github-actions[bot]	cae3e2976b	Proxy release 2025-04-22	2025-04-22 06:02:06 +00:00
Tristan Partin	5df4a747e6	Update pgbouncer in compute images to 1.24.1 (#11651 ) Fixes CVE-2025-2291. Link: https://www.postgresql.org/about/news/pgbouncer-1241-released-fixes-cve-2025-2291-3059/ Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-04-21 17:49:17 +00:00
Vlad Lazar	cbf442292b	pageserver: handle empty get vectored queries (#11652 ) ## Problem If all batched requests are excluded from the query by `Timeline::get_rel_page_at_lsn_batched` (e.g. because they are past the end of the relation), the read path would panic since it doesn't expect empty queries. This is a change in behaviour that was introduced with the scattered query implementation. ## Summary of Changes Handle empty queries explicitly.	2025-04-21 17:45:16 +00:00
Heikki Linnakangas	4d0c1e8b78	refactor: Extract some code in pagebench getpage command to function (#11563 ) This makes it easier to add a different client implementation alongside the current one. I started working on a new gRPC-based protocol to replace the libpq protocol, which will introduce a new function like `client_libpq`, but for the new protocol. It's a little more readable with less indentation anyway.	2025-04-19 08:38:03 +00:00
JC Grünhage	3158442a59	fix(ci): set token for fast-forward failure comments and allow merging with state unstable (#11647 ) ## Problem https://github.com/neondatabase/neon/actions/runs/14538136318/job/40790985693?pr=11645 failed, even though the relevant parts of the CI had passed and auto-merge determined the PR is ready to merge. After that, commenting failed. ## Summary of changes - set GH_TOKEN for commenting after fast-forward failure - allow merging with mergeable_state unstable	2025-04-18 17:49:34 +00:00
JC Grünhage	f006879fb7	fix(ci): make regex to find rc branches less strict (#11646 ) ## Problem https://github.com/neondatabase/neon/actions/runs/14537161022/job/40787763965 failed to find the correct RC PR run, preventing artifact re-use. This broke in https://github.com/neondatabase/neon/pull/11547. There's a hotfix release containing this in https://github.com/neondatabase/neon/pull/11645. ## Summary of changes Make the regex for finding the RC PR run less strict, it was needlessly precise.	2025-04-18 16:39:18 +00:00
Dmitrii Kovalkov	a0d844dfed	pageserver + safekeeper: pass ssl ca certs to broker client (#11635 ) ## Problem Pageservers and safekeepers do not pass CA certificates to the broker client, so the client does not trust locally issued certificates. - Part of https://github.com/neondatabase/cloud/issues/27492 ## Summary of changes - Change `ssl_ca_certs` type in PS/SK's config to `Pem` which may be converted to both `reqwest` and `tonic` certificates. - Pass CA certificates to storage broker client in PS and SK	2025-04-18 06:27:23 +00:00
Alex Chi Z.	5073e46df4	feat(pageserver): use rfc3339 time and print ratio in gc-compact stats (#11638 ) ## Problem follow-up on https://github.com/neondatabase/neon/pull/11601 ## Summary of changes - serialize the start/end time using rfc3339 time string - compute the size ratio of the compaction --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-18 05:28:01 +00:00
Alexander Bayandin	182bd95a4e	CI(regress-tests): run tests on `large-metal` (#11634 ) ## Problem Regression tests are more flaky on virtualised (`qemu-x64-*`) runners See https://neondb.slack.com/archives/C069Z2199DL/p1744891865307769 Ref https://github.com/neondatabase/neon/issues/11627 ## Summary of changes - Switch `regress-tests` to metal-only large runners to mitigate flaky behaviour	2025-04-18 01:25:38 +00:00
Anastasia Lubennikova	ce7795a67d	compute: use project_id, endpoint_id as tag (#11556 ) for compute audit logs part of https://github.com/neondatabase/cloud/issues/21955	2025-04-17 23:32:38 +00:00
Suhas Thalanki	134d01c771	remove pg_anon.patch (#11636 ) This PR removes `pg_anon.patch` as the `anon` v1 extension has been removed and the patch is not being used anywhere	2025-04-17 22:08:16 +00:00
Arpad Müller	c1e4befd56	Additional fixes and improvements to storcon safekeeper timelines (#11477 ) This delivers some additional fixes and improvements to storcon managed safekeeper timelines: * use `i32::MAX` for the generation number of timeline deletion * start the generation for new timelines at 1 instead of 0: this ensures that the other components actually are generation enabled * fix database operations we use for metrics * use join in list_pending_ops to prevent the classical ORM issue where one does many db queries * use enums in `test_storcon_create_delete_sk_down`. we are adding a second parameter, and having two bool parameters is weird. * extend `test_storcon_create_delete_sk_down` with a test of whole tenant deletion. this hasn't been tested before. * remove some redundant logging contexts * Don't require mutable access to the service lock for scheduling pending ops in memory. In order to pull this off, create reconcilers eagerly. The advantage is that we don't need mutable access to the service lock that way any more. Part of #9011 --------- Co-authored-by: Arseny Sher <sher-ars@yandex.ru>	2025-04-17 20:25:30 +00:00
a-masterov	6c2e5c044c	random operations test (#10986 ) ## Problem We need to test the stability of Neon. ## Summary of changes The test runs random operations on a Neon project. It performs via the Public API calls the following operations: `create a branch`, `delete a branch`, `add a read-only endpoint`, `delete a read-only endpoint`, `restore a branch to a random position in the past`. All the branches and endpoints are loaded with `pgbench`. --------- Co-authored-by: Peter Bendel <peterbendel@neon.tech> Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2025-04-17 19:59:35 +00:00
Alex Chi Z.	748539b222	fix(pageserver): lower L0 compaction threshold (#11617 ) ## Problem We saw OOMs due to L0 compaction happening simultaneously for all shards of the same tenant right after the shard split. ## Summary of changes Lower the threshold so that we compact fewer files. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-17 19:51:28 +00:00
Alex Chi Z.	ad0c5fdae7	fix(test): allow stale generation warnings in storcon (#11624 ) ## Problem https://github.com/neondatabase/neon/pull/11531 did not fully fix the problem because the warning is part of the storcon instead of pageserver. ## Summary of changes Allow stale generation error in storcon. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-17 16:12:24 +00:00
Christian Schwarz	2b041964b3	cover direct IO + concurrent IO in unit, regression & perf tests (#11585 ) This mirrors the production config. Thread that discusses the merits of this: - https://neondb.slack.com/archives/C033RQ5SPDH/p1744742010740569 # Refs - context https://neondb.slack.com/archives/C04BLQ4LW7K/p1744724844844589?thread_ts=1744705831.014169&cid=C04BLQ4LW7K - prep for https://github.com/neondatabase/neon/pull/11558 which adds new io mode `direct-rw` # Impact on CI turnaround time Spot-checking impact on CI timings - Baseline: [some recent main commit](https://github.com/neondatabase/neon/actions/runs/14471549758/job/40587837475) - Comparison: [this commit](https://github.com/neondatabase/neon/actions/runs/14471945087/job/40589613274) in this PR here Impact on CI turnaround time - Regression tests: - x64: very minor, sometimes better; likely in the noise - arm64: substantial 30min => 40min - Benchmarks (x86 only I think): very minor; noise seems higher than regress tests --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Alex Chi Z. <4198311+skyzh@users.noreply.github.com> Co-authored-by: Peter Bendel <peterbendel@neon.tech> Co-authored-by: Alex Chi Z <chi@neon.tech>	2025-04-17 15:53:10 +00:00
John Spray	d4c059a884	tests: use endpoint http wrapper to get auth (#11628 ) ## Problem `test_compute_startup_simple` and `test_compute_ondemand_slru_startup` are failing. This test implicitly asserts that the metrics.json endpoint succeeds and returns all expected metrics, but doesn't make it easy to see what went wrong if it doesn't (e.g. in this failure https://neon-github-public-dev.s3.amazonaws.com/reports/main/14513210240/index.html#suites/13d8e764c394daadbad415a08454c04e/b0f92a86b2ed309f/) In this case, it was failing because of a missing auth token, because it was using `requests` directly instead of using the endpoint http client type. ## Summary of changes - Use endpoint http wrapper to get raise_for_status & auth token	2025-04-17 15:03:23 +00:00
Folke Behrens	2c56c46d48	compute: Set max log level for local proxy sql_over_http mod to WARN (#11629 ) neondatabase/cloud#27738	2025-04-17 14:38:19 +00:00
Tristan Partin	d1728a6bcd	Remove old compatibility hack for remote extensions (#11620 ) Control plane has long since been updated to send the right value. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-04-17 14:08:42 +00:00
John Spray	0a27973584	pageserver: rename `Tenant` to `TenantShard` (#11589 ) ## Problem `Tenant` isn't really a whole tenant: it's just one shard of a tenant. ## Summary of changes - Automated rename of Tenant to TenantShard - Followup commit to change references in comments	2025-04-17 13:29:16 +00:00
Alexander Bayandin	07c2411f6b	tests: remove mentions of ALLOW_*_COMPATIBILITY_BREAKAGE (#11618 ) ## Problem There are mentions of `ALLOW_BACKWARD_COMPATIBILITY_BREAKAGE` and `ALLOW_FORWARD_COMPATIBILITY_BREAKAGE`, but in reality, this mechanism doesn't work, so let's remove it to avoid confusion. The idea behind it was to allow some breaking changes by adding a special label to a PR that would `xfail` the test. However, in practice, this means we would need to carry this label through all subsequent PRs until the release (and artifact regeneration). This approach isn't really viable, as it increases the risk of missing a compatibility break in another PR. ## Summary of changes - Remove mentions and handling of `ALLOW_BACKWARD_COMPATIBILITY_BREAKAGE` / `ALLOW_FORWARD_COMPATIBILITY_BREAKAGE`	2025-04-17 10:03:21 +00:00
Alexander Bayandin	5819938c93	CI(pg-clients): fix workflow permissions (#11623 ) ## Problem `pg-clients` can't start: ``` The workflow is not valid. .github/workflows/pg-clients.yml (Line: 44, Col: 3): Error calling workflow 'neondatabase/neon/.github/workflows/build-build-tools-image.yml@aa19f10e7e958fbe0e0641f2e8c5952ce3be44b3'. The nested job 'check-image' is requesting 'packages: read', but is only allowed 'packages: none'. .github/workflows/pg-clients.yml (Line: 44, Col: 3): Error calling workflow 'neondatabase/neon/.github/workflows/build-build-tools-image.yml@aa19f10e7e958fbe0e0641f2e8c5952ce3be44b3'. The nested job 'build-image' is requesting 'packages: write', but is only allowed 'packages: none'. ``` ## Summary of changes - Grant required `packages: write` permissions to the workflow	2025-04-17 08:54:23 +00:00
Konstantin Knizhnik	b7548de814	Disable autovacuum and increase limit for WS approximation (#11583 ) ## Problem The LFC working set approximation test became flaky after recent changes in prefetch. It may be caused by updating HLL in `lfc_write`, or by something else. ## Summary of changes 1. Disable autovacuum in this test (as a possible source of extra page accesses). 2. Increase upper boundary for WS approximation from 12 to 20. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-04-17 05:07:45 +00:00
Tristan Partin	9794f386f4	Make Postgres 17 the default version (#11619 ) This is mostly a documentation update, with a few updates to neon_local, pageserver, and tests. 17 is our default for users in production, so dropping references to 16 makes sense. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-04-16 23:23:37 +00:00
Tristan Partin	79083de61c	Remove forward compatibility hacks related to compute_ctl auth (#11621 ) These various hacks were needed for the forward compatibility tests. Enough time has passed since the merge that these are no longer needed. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-04-16 23:14:24 +00:00
Folke Behrens	ec9079f483	Allow unwrap() in tests when clippy::unwrap_used is denied (#11616 ) ## Problem The proxy denies using `unwrap()`s in regular code, but we want to use it in test code and so have to allow it for each test block. ## Summary of changes Set `allow-unwrap-in-tests = true` in clippy.toml and remove all exceptions.	2025-04-16 20:05:21 +00:00
Ivan Efremov	b9b25e13a0	feat(proxy): Return prefixed errors to testodrome (#11561 ) Testodrome measures uptime based on failed requests and errors. For testodrome requests, we send back an error based on the service. This will help us distinguish error types in testodrome and rely on the uptime SLI.	2025-04-16 19:03:23 +00:00
Alex Chi Z.	cf2e695f49	feat(pageserver): gc-compaction meta statistics (#11601 ) ## Problem We currently only have gc-compaction statistics for each single sub-compaction job. ## Summary of changes Add meta statistics across all sub-compaction jobs scheduled. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-16 18:51:48 +00:00
Conrad Ludgate	fc233794f6	fix(proxy): make sure that sql-over-http is TLS aware (#11612 ) I noticed that while auth-broker -> local-proxy is TLS aware, and TCP proxy -> postgres is TLS aware, HTTP proxy -> postgres is not 😅	2025-04-16 18:37:17 +00:00
Tristan Partin	c002236145	Remove compute_ctl authorization bypass if testing feature was enable (#11596 ) We want to exercise the authorization middleware in our regression tests. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-04-16 17:54:51 +00:00
Erik Grinaker	4af0b9b387	pageserver: don't recompress images in `ImageLayerInner::filter()` (#11592 ) ## Problem During shard ancestor compaction, we currently recompress all page images as we move them into a new layer file. This is expensive and unnecessary. Resolves #11562. Requires #11607. ## Summary of changes Pass through compressed page images in `ImageLayerInner::filter()`.	2025-04-16 17:10:15 +00:00
Vlad Lazar	0e00faf528	tests: stability fixes for `test_migration_to_cold_secondary` (#11606 ) 1. Compute may generate WAL on shutdown. The test assumes that after shutdown, no further ingest happens. Tweak the compute shutdown to make the assumption true. 2. Assertion of local layer count post cold migration is not right since we may have downloaded layers due to ingest. Remove it. Closes https://github.com/neondatabase/neon/issues/11587	2025-04-16 16:31:23 +00:00
Anastasia Lubennikova	7747a9619f	compute: fix copy-paste typo for neon GUC parameters check (#11610 ) fix for commit [`5063151`](`5063151271`)	2025-04-16 15:55:11 +00:00
Erik Grinaker	46100717ad	pageserver: add `VectoredBlob::raw_with_header` (#11607 ) ## Problem To avoid recompressing page images during layer filtering, we need access to the raw header and data from vectored reads such that we can pass them through to the target layer. Touches #11562. ## Summary of changes Adds `VectoredBlob::raw_with_header()` to return a raw view of the header+data, and updates `read()` to track it. Also adds `blob_io::Header` with header metadata and decode logic, to reuse for tests and assertions. This isn't yet widely used.	2025-04-16 15:38:10 +00:00
Erik Grinaker	00eeff9b8d	pageserver: add `compaction_shard_ancestor` to disable shard ancestor compaction (#11608 ) ## Problem Splits of large tenants (several TB) can cause a huge amount of shard ancestor compaction work, which can overload Pageservers. Touches https://github.com/neondatabase/cloud/issues/22532. ## Summary of changes Add a setting `compaction_shard_ancestor` (default `true`) to disable shard ancestor compaction on a per-tenant basis.	2025-04-16 14:41:02 +00:00
Matthias van de Meent	2a46426157	Update neon GUCs with new default settings (#11595 ) Staging and prod both have these settings configured like this, so let's update this so we can eventually drop the overrides in prod.	2025-04-16 13:42:22 +00:00
Tristan Partin	edc11253b6	Fix neon_local public key parsing when create compute JWKS (#11602 ) Finally figured out the right incantation. I had had this in my original go, but due to some refactoring and apparently missed testing, I committed a mistake. The reason this doesn't currently break anything is that we bypass the authorization middleware when the "testing" cargo feature is enabled. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-04-16 12:51:48 +00:00
Heikki Linnakangas	b4e26a6284	Set last-written LSN as part of smgr_end_unlogged_build() (#11584 ) This way, the callers don't need to do it, reducing the footprint of changes we've had to make to various index AMs' build functions.	2025-04-16 12:34:18 +00:00
Vlad Lazar	96b46365e4	tests: attach final metrics to allure report (#11604 ) ## Problem Metrics are saved in https://github.com/neondatabase/neon/pull/11559, but the file is not matched by the attachment regex. ## Summary of changes Make attachment regex match the metrics file.	2025-04-16 10:26:47 +00:00
Alex Chi Z.	aa19f10e7e	fix(test): allow shutdown warning in preempt tests (#11600 ) ## Problem test_gc_compaction_preempt is still flaky ## Summary of changes - allow shutdown warning logs Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-15 21:50:28 +00:00
Konstantin Knizhnik	35170656fe	Allocate WalProposerConn using TopMemoryAllocator (#11577 ) ## Problem See https://neondb.slack.com/archives/C04DGM6SMTM/p1744659631698609 `WalProposerConn` is allocated in the current memory context, whose lifetime is not long enough. ## Summary of changes Allocate `WalProposerConn` using `TopMemoryContext`. Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-04-15 19:13:12 +00:00
Tristan Partin	cd9ad75797	Remove compute_ctl authorization bypass on localhost (#11597 ) For whatever reason, this never worked in production computes anyway. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-04-15 19:12:34 +00:00
Tristan Partin	eadb05f78e	Teach neon_local to pass the Authorization header to compute_ctl (#11490 ) This allows us to remove hacks in the compute_ctl authorization middleware which allowed for bypasses of auth checks. Fixes: https://github.com/neondatabase/neon/issues/11316 Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-04-15 17:27:49 +00:00
Fedor Dikarev	c5115518e9	remove temp file from repo (#11586 ) ## Problem In https://github.com/neondatabase/neon/pull/11409 we added temp file to the repo. ## Summary of changes Remove temp file from the repo.	2025-04-15 15:29:15 +00:00
Alex Chi Z.	931f8c4300	fix(pageserver): check if cancelled before waiting logical size (2/2) (#11575 ) ## Problem Closes https://github.com/neondatabase/neon/issues/11486, following https://github.com/neondatabase/neon/pull/11531 ## Summary of changes This patch fixes the remaining 50% of the instability of `test_create_churn_during_restart`. During tenant warmup, we'll request logical size; however, if the startup gets cancelled, we won't be able to spawn the initial logical size calculation task that sets the `cancel_wait_for_background_loop_concurrency_limit_semaphore`. Therefore, we check `cancelled` before proceeding to get `cancel_wait_for_background_loop_concurrency_limit_semaphore`. There will still be a race if the timeline shutdown happens after L5710 and before L5711, but it should be enough to reduce the flakiness of the test. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-15 15:16:16 +00:00
Alexander Bayandin	0f7c2cc382	CI(release): add time to RC PR branch names (#11547 ) ## Problem We can't have more than one open release PR created on the same day (due to non-unique enough branch names). ## Summary of changes - Add time (hours and minutes) to RC PR branch names - Also make sure we use UTC for releases	2025-04-15 15:08:05 +00:00
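The idea in #11547 (a UTC date plus hours and minutes makes same-day branch names unique) can be sketched as follows. The branch name layout and the function name are hypothetical. The commit message only says that the time is added and that UTC is used.

```python
from datetime import datetime, timedelta, timezone


def rc_branch_name(component: str, now: datetime) -> str:
    """Build a release-candidate branch name from a timezone-aware timestamp.

    The "rc/release-<component>/<timestamp>" layout is an assumption made for
    this sketch, not the naming scheme used by the actual workflow.
    """
    if now.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    utc_now = now.astimezone(timezone.utc)  # normalise to UTC before formatting
    return f"rc/release-{component}/{utc_now.strftime('%Y-%m-%dT%H-%MZ')}"


# Two releases cut on the same day now get distinct branch names.
morning = datetime(2025, 4, 15, 6, 1, tzinfo=timezone.utc)
afternoon = datetime(2025, 4, 15, 17, 8, tzinfo=timezone(timedelta(hours=2)))
names = [rc_branch_name("proxy", morning), rc_branch_name("proxy", afternoon)]
```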
Erik Grinaker	983d56502b	pageserver: reduce shard ancestor rewrite threshold to 30% (#11582 ) ## Problem When doing power-of-two shard splits (i.e. 4 → 8 → 16), we end up rewriting all layers since half of the pages will be local due to striping. This causes a lot of resource usage when splitting large tenants. ## Summary of changes Drop the threshold of local/total pages to 30%, to reduce the amount of layer rewrites after splits.	2025-04-15 14:26:29 +00:00
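The arithmetic behind #11582 can be sketched with a deliberately simplified striping model. The assumptions here are that stripe `s` belongs to shard `s % shard_count` (the real placement function is not described in the commit message), that a layer is rewritten when its local/total ratio is at or below the threshold, and that the earlier threshold was 50%. All function names are hypothetical.

```python
def stripes_owned(shard: int, shard_count: int, total_stripes: int) -> set[int]:
    # Simplified placement: stripe s belongs to shard s % shard_count.
    return {s for s in range(total_stripes) if s % shard_count == shard}


def local_fraction(parent: int, old_count: int, child: int, new_count: int,
                   total_stripes: int = 800) -> float:
    """Fraction of a parent shard's stripes that remain local to a child shard."""
    ancestor = stripes_owned(parent, old_count, total_stripes)
    local = ancestor & stripes_owned(child, new_count, total_stripes)
    return len(local) / len(ancestor)


def should_rewrite(fraction: float, threshold: float) -> bool:
    # Assumed rule: rewrite only when at most `threshold` of the pages are local.
    return fraction <= threshold


# After a 4 -> 8 split, child shard 1 keeps half of parent shard 1's stripes.
half = local_fraction(parent=1, old_count=4, child=1, new_count=8)
rewrite_at_50 = should_rewrite(half, 0.5)  # assumed earlier threshold
rewrite_at_30 = should_rewrite(half, 0.3)  # threshold from the commit message
```

Under these assumptions, a layer that is exactly half local is rewritten at a 50% threshold but left alone at 30%, which matches the motivation given in the commit message.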
Erik Grinaker	bcef542d5b	pageserver: don't rewrite invisible layers during ancestor compaction (#11580 ) ## Problem Shard ancestor compaction can be very expensive following shard splits of large tenants. We currently rewrite garbage layers after shard splits as well, which can be a significant amount of data. Touches https://github.com/neondatabase/cloud/issues/22532. ## Summary of changes Don't rewrite invisible layers after shard splits.	2025-04-15 14:25:58 +00:00
a-masterov	e31455d936	Add the tests for the extensions `pg_jsonschema` and `pg_session_jwt` (#11323 ) ## Problem `pg_jsonschema` and `pg_session_jwt` are not yet covered by tests ## Summary of changes Added the tests for these extensions.	2025-04-15 14:06:01 +00:00
Alex Chi Z.	a4ea7d6194	fix(pageserver): gc-compaction verification false failure (#11564 ) ## Problem https://github.com/neondatabase/neon/pull/11515 introduced a bug that some key history cannot be verified. If a key only exists above the horizon, the verification will fail for its first occurrence because the history does not exist at that point. As gc-compaction skips a key range whenever an error occurs, it might be doing some wasted work in staging/prod now. But I'm not planning a hotfix this week as the bug doesn't affect correctness/performance. ## Summary of changes Allow keys with only above horizon history in the verification. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-15 13:58:32 +00:00
Alexander Bayandin	19bea5fd0c	CI: do not wait for tests to trigger deploy job (#11548 ) ## Problem There is too much delay between merging a PR into `main` and deploying the changes to staging ## Summary of changes - Trigger `deploy` job without waiting for `build-and-test-locally` job	2025-04-15 11:23:41 +00:00
a-masterov	5be94e28c4	Update the documentation of the cloud regress test (#11539 ) ## Problem The information in the README.md contained errors, and some information was missing. ## Summary of changes Found errors are fixed, and new information is added. --------- Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2025-04-15 11:00:25 +00:00
Alexander Bayandin	63a106021a	CI(allure-report-generate): Install allure to /tmp (#11579 ) ## Problem The `/__w/neon/neon` directory is mounted from host to container and persists between runs. Sometimes the next workflow run fails to delete it: ``` Deleting the contents of '/__w/neon/neon' Error: File was unable to be removed Error: EACCES: permission denied, rmdir '/__w/neon/neon/allure-2.32.2/bin' ``` ## Summary of changes - Download and install allure to `/tmp` which exists in container only Ref https://github.com/neondatabase/cloud/issues/27186	2025-04-15 09:29:36 +00:00
Fedor Dikarev	9a6ace9bde	introduce new runners: unit-perf and use them for benchmark jobs (#11409 ) ## Problem Benchmark results are inconsistent on existing small-metal runners ## Summary of changes Introduce new `unit-perf` runners and run benchmarks on them. The new hardware has a slower, but consistent, CPU frequency when run with the default governor, schedutil. Thus we needed to adjust some testcases' timeouts and add some retry steps where hard-coded timeouts couldn't be increased without changing the system under test. - [wait_for_last_record_lsn](`6592d69a67/test_runner/fixtures/pageserver/utils.py (L193)`) 1000s -> 2000s - [test_branch_creation_many](https://github.com/neondatabase/neon/pull/11409/files#diff-2ebfe76f89004d563c7e53e3ca82462e1d85e92e6d5588e8e8f598bbe119e927) 1000s - [test_ingest_insert_bulk](https://github.com/neondatabase/neon/pull/11409/files#diff-e90e685be4a87053bc264a68740969e6a8872c8897b8b748d0e8c5f683a68d9f) - with back throttling disabled, compute becomes unresponsive for more than 60 seconds (PG hard-coded client authentication connection timeout) - [test_sharded_ingest](https://github.com/neondatabase/neon/pull/11409/files#diff-e8d870165bd44acb9a6d8350f8640b301c1385a4108430b8d6d659b697e4a3f1) 600s -> 1200s Right now there are only 2 runners of that class. If we decide to go with them, we have to check how many runners of that type we need so that jobs do not get stuck waiting for one to become available. However, we have now decided to run those runners with the performance governor instead of schedutil.
This achieves almost the same performance as the previous runners while still producing consistent results for the same commit. Related issue to activate the performance governor on these runners: https://github.com/neondatabase/runner/pull/138 ## Verification that it helps ### analyze runtimes on new runner for same commit Table of runtimes for the same commit on different runners in [run](https://github.com/neondatabase/neon/actions/runs/14417589789)
| Run | Benchmarks (1) | Benchmarks (2) | Benchmarks (3) | Benchmarks (4) | Benchmarks (5) |
|--------|--------|---------|---------|---------|---------|
| 1 | 1950.37s | 6374.55s | 3646.15s | 4149.48s | 2330.22s |
| 2 | - | 6369.27s | 3666.65s | 4162.42s | 2329.23s |
| Delta % | - | 0.07 % | 0.5 % | 0.3 % | 0.04 % |
| with governor performance | 1519.57s | 4131.62s | - | - | - |
| second run gov. perf. | 1513.62s | 4134.67s | - | - | - |
| Delta % | 0.3 % | 0.07 % | - | - | - |
| speedup gov. performance | 22 % | 35 % | - | - | - |
| current desktop class hetzner runners (main) | 1487.10s | 3699.67s | - | - | - |
| slower than desktop class | 2 % | 12 % | - | - | - |
In summary, the runtimes for the same commit on this hardware vary by less than 1 %. --------- Co-authored-by: BodoBolero <peterbendel@neon.tech>	2025-04-15 08:21:44 +00:00
Erik Grinaker	8c77ccfc01	pageserver: log total progress during shard ancestor compaction (#11565 ) ## Problem Shard ancestor compaction doesn't currently log any global progress information, only for the current batch. ## Summary of changes Log the number of layers checked for eligibility this iteration, and the total number of layers to check. This will indicate how far along the total shard ancestor compaction has gotten for this iteration.	2025-04-15 07:25:09 +00:00
github-actions[bot]	51ecd1bb37	Proxy release 2025-04-15	2025-04-15 06:01:10 +00:00
Tristan Partin	cbd2fc2395	Clean up logs and error messages in compute_ctl authorize middleware (#11576 ) Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-04-15 01:21:18 +00:00
Tristan Partin	028a191040	Continue with s/spec/config changes (#11574 ) Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-04-14 21:18:21 +00:00
Vlad Lazar	8cce27bedb	pageserver: add a randomized read path test (#11519 ) ## Problem Every time we make changes to the read path to fix a bug or add a feature, we end up adding another incomprehensible test. ## Summary of changes Add some generic infrastructure for generating a layer map from a type spec and use that for a read path test. The test is randomized but uses a fixed seed by default. A fuzzing mode is available for confidence building. See [Notion page](https://www.notion.so/neondatabase/Read-Path-Unit-Testing-Fuzzing-1d1f189e0047806c8e5cd37781b0a350?pvs=4) for a diagram of the layer map used. Just for fun I tried removing [this commit](`9990199cb4`) from https://github.com/neondatabase/neon/pull/11494 and it caught the bug in the normal mode (no fuzzing required).	2025-04-14 15:31:32 +00:00
Vlad Lazar	90b706cd96	tests: save pageserver metrics at the end of the test (#11559 ) ## Problem Sometimes it's useful to see the pageserver metrics after a test in order to debug stuff. For example, for https://github.com/neondatabase/neon/issues/11465 I'd like to know what the remote storage latencies are from the client. ## Summary of changes When stopping the env, record the pageserver metrics into a file in the pageserver's workdir.	2025-04-14 15:13:20 +00:00
Alex Chi Z.	057ce115de	fix(test): allow stale generation errors (1/2) (#11531 ) ## Problem Part of https://github.com/neondatabase/neon/issues/11486 ## Summary of changes 50% of the instability of `test_create_churn_during_restart` is due to a changed error message. Allow the new error message. We still need to fix other errors caused by failure to acquire the semaphore, in this or the next patch. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-14 14:51:17 +00:00
Vlad Lazar	e85607eed8	tests: remove config tweak allowing old versions to start with a batching config (#11560 ) ## Problem Pageservers now ignore unknown config fields, so this config tweaking is no longer needed. ## Summary of changes Get rid of the hack. Closes https://github.com/neondatabase/neon/issues/11524	2025-04-14 14:42:35 +00:00
Tristan Partin	437071888e	Fix logging in nightly physical replication benchmarks (#11541 ) Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-04-14 13:57:33 +00:00
Vlad Lazar	148b3701cf	pageserver: add metrics for get page batch breaking reasons (#11545 ) ## Problem https://github.com/neondatabase/neon/pull/11494 changes the batching logic, but we don't have a way to evaluate it. ## Summary of changes This PR introduces a global and per timeline metric which tracks the reason for which a batch was broken.	2025-04-14 13:24:47 +00:00
Christian Schwarz	daebe50e19	refactor: plumb gate and cancellation down to blob_io::BlobWriter (#11543 ) In #10063 we will switch BlobWriter to use the owned buffers IO buffered writer, which implements double-buffering by virtue of a background task that performs the flushing. That task's lifecycle must be contained within the Timeline lifecycle, so it must hold the timeline gate open and respect Timeline::cancel. This PR does the noisy plumbing to reduce the #10063 diff. Refs - extracted from https://github.com/neondatabase/neon/pull/10063 - epic https://github.com/neondatabase/neon/issues/9868	2025-04-14 11:51:01 +00:00
Arpad Müller	e0ee6fbeff	Remove deprecated --compute-hook-url storcon param (#11551 ) We have already migrated the storage controller to `--control-plane-url`, added in #11173. The new param was added to support also safekeeper specific endpoints. See the docs changes in #11195 for further details. Part of #11163	2025-04-14 10:36:40 +00:00
Konstantin Knizhnik	307fa2ceb7	Remove unused n_synced variable from HandleSafekeeperResponse (#11553 ) ## Problem clang produces a warning about the unused variable `n_synced` in HandleSafekeeperResponse ## Summary of changes Remove the local variable. Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-04-14 09:45:13 +00:00
Vlad Lazar	a338984dc7	pageserver: support keys at different LSNs in one get page batch (#11494 ) ## Problem Get page batching stops when we encounter requests at different LSNs. We are leaving batching factor on the table. ## Summary of changes The goal is to support keys with different LSNs in a single batch and still serve them with a single vectored get. Important restriction: the same key at different LSNs is not supported in one batch. Returning different key versions is a much more intrusive change. Firstly, the read path is changed to support "scattered" queries. This is a conceptually simple step from https://github.com/neondatabase/neon/pull/11463. Instead of initializing the fringe for one keyspace, we do it for multiple at different LSNs and let the logic already present in the fringe handle selection. Secondly, page service code is updated to support batching at different LSNs. Each request parsed from the wire determines its effective request LSN and keeps it in memory for the batcher to inspect. The batcher allows keys at different LSNs in one batch as long as one key is not requested at different LSNs. I'd suggest doing the first pass commit by commit to get a feel for the changes. ## Results I used the batching test from [Christian's PR](https://github.com/neondatabase/neon/pull/11391) which increases the chance of batch breaks. Looking at the logs I think the new code is at the max batching factor for the workload (we only break batches due to them being oversized or because the executor is idle). 
``` Main: Reasons for stopping batching: {'LSN changed': 22843, 'of batch size': 33417} test_throughput[release-pg16-50-pipelining_config0-30-100-128-batchable {'max_batch_size': 32, 'execution': 'concurrent-futures', 'mode': 'pipelined'}].perfmetric.batching_factor: 14.6662 My branch: Reasons for stopping batching: {'of batch size': 37024} test_throughput[release-pg16-50-pipelining_config0-30-100-128-batchable {'max_batch_size': 32, 'execution': 'concurrent-futures', 'mode': 'pipelined'}].perfmetric.batching_factor: 19.8333 ``` Related: https://github.com/neondatabase/neon/issues/10765	2025-04-14 09:05:29 +00:00
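The batching rule described in #11494 can be sketched as follows. This is a minimal illustration, not the pageserver's code: the names `Batch`, `try_push` and `split_into_batches` are hypothetical, keys and LSNs are plain integers, and allowing a repeated key at the same LSN is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Batch:
    max_batch_size: int  # assumed to be at least 1
    # Maps each key to the single LSN it is requested at within this batch.
    lsn_by_key: dict = field(default_factory=dict)
    requests: list = field(default_factory=list)

    def try_push(self, key, lsn):
        """Return None if the request was accepted, else the break reason."""
        if len(self.requests) >= self.max_batch_size:
            return "batch size"
        if key in self.lsn_by_key and self.lsn_by_key[key] != lsn:
            return "same key at different LSNs"
        self.lsn_by_key[key] = lsn
        self.requests.append((key, lsn))
        return None


def split_into_batches(requests, max_batch_size):
    """Group (key, lsn) requests into batches and record why each batch ended."""
    batches, reasons = [], []
    current = Batch(max_batch_size)
    for key, lsn in requests:
        reason = current.try_push(key, lsn)
        if reason is not None:
            # Close the current batch and start a new one with this request.
            batches.append(current.requests)
            reasons.append(reason)
            current = Batch(max_batch_size)
            current.try_push(key, lsn)
    if current.requests:
        batches.append(current.requests)
    return batches, reasons
```

In this sketch, different keys at different LSNs share a batch; only a second LSN for an already-batched key, or a full batch, forces a break.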
Konstantin Knizhnik	8936a7abd8	Increase limit for worker processes for isolation test (#11504 ) ## Problem See https://github.com/neondatabase/neon/issues/10652 The Neon extension launches 2 BGWs, which reduces the limit for parallel workers and so affects the parallel_deadlock isolation test. ## Summary of changes Increase `max_worker_processes` from the default of 8 to 16 for the isolation test. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-04-12 18:09:12 +00:00
Conrad Ludgate	946e971df8	feat(proxy): add batching to cancellation queue processing (#10607 ) Add batching to the redis queue, which allows us to clear it out quicker should it slow down temporarily.	2025-04-12 09:16:22 +00:00
Dmitrii Kovalkov	d109bf8c1d	neon_local: use ed25519 to gen local ssl certs (#11542 ) ## Problem neon_local uses rsa to generate local SSL certs, which is slow Follow-up on: - https://github.com/neondatabase/neon/pull/11025#discussion_r1989453785 - https://github.com/neondatabase/neon/pull/11538 ## Summary of changes - Change key from rsa to ed25519 in neon_local	2025-04-11 17:49:15 +00:00
Alex Chi Z.	4f7b2cdd4f	feat(pageserver): gc-compaction result verification (#11515 ) ## Problem Part of #9114 There was a debug-mode verification mode that verifies at every retain_lsn. However, the code was tangled within the actual history generation itself and it's hard to reason about correctness. This patch adds a separate post-verification of the gc-compaction result that redos logs at every retain_lsn and every record above the GC horizon. This ensures that all key history we produce with gc-compaction is readable, and if there're read errors after gc-compaction, it can only be read-path errors instead of gc-compaction bugs. ## Summary of changes * Add gc_compaction_verification flag, default to true. * Implement a post-verification process. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-11 15:50:29 +00:00
Alex Chi Z.	66f56ddaec	fix(pageserver): allow shutdown errors for gc compaction tests (#11530 ) ## Problem `test_pageserver_compaction_preempt` is flaky. ## Summary of changes Allow the shutdown errors. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-11 15:20:51 +00:00
Erik Grinaker	fd16caa7d0	pageserver: yield for L0 during ancestor compaction (#11536 ) ## Problem Shard ancestor compaction does not yield for L0 compaction, potentially starving it. close https://github.com/neondatabase/neon/issues/11125 ## Summary of changes * Yield for L0 during shard ancestor compaction. * Return `CompactionOutcome::Pending` when limited by `rewrite_max`, for eager rescheduling.	2025-04-11 15:09:28 +00:00
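The control flow described in #11536 might look roughly like the sketch below. It assumes a hypothetical `l0_needs_compaction` callback and a `YIELD_FOR_L0` outcome; only `CompactionOutcome::Pending` and `rewrite_max` are named in the commit message, and the real implementation is in Rust.

```python
from enum import Enum


class CompactionOutcome(Enum):
    DONE = "done"
    PENDING = "pending"            # more work remains, reschedule eagerly
    YIELD_FOR_L0 = "yield_for_l0"  # hypothetical name for the yield case


def compact_ancestor_layers(layers, rewrite_max, l0_needs_compaction):
    """Rewrite up to rewrite_max layers, yielding as soon as L0 needs to run."""
    rewritten = []
    for layer in layers:
        if l0_needs_compaction():
            return CompactionOutcome.YIELD_FOR_L0, rewritten
        if len(rewritten) >= rewrite_max:
            # Limited by rewrite_max: report pending work instead of done.
            return CompactionOutcome.PENDING, rewritten
        rewritten.append(layer)  # stand-in for the actual layer rewrite
    return CompactionOutcome.DONE, rewritten
```

Checking the yield condition before every layer keeps the wait for L0 compaction bounded by the cost of one layer rewrite.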
Tristan Partin	ff5a527167	Consolidate compute_ctl configuration structures (#11514 ) Previously, the structure of the spec file was just the compute spec. However, the response from the control plane get spec request included the compute spec and the compute_ctl config. This divergence was hindering other work such as adding regression tests for compute_ctl HTTP authorization. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-04-11 15:06:29 +00:00
Arpad Müller	c66444ea15	Add timeline_import http endpoint (#11484 ) The added `timeline_import` endpoint allows us to migrate safekeeper timelines from control plane managed to storcon managed. Part of #9011	2025-04-11 14:10:27 +00:00
Arpad Müller	88f01c1ca1	Introduce WalIngestError (#11506 ) Introduces a `WalIngestError` struct together with a `WalIngestErrorKind` enum, to be used for walingest related failures and errors. * the enum captures backtraces, so we don't regress in comparison to `anyhow::Error`s (backtraces might be a bit shorter if we use one of the `anyhow::Error` wrappers) * it explicitly lists most/all of the potential cases that can occur. I've originally been inspired to do this in #11496, but it's a longer-term TODO.	2025-04-11 14:08:46 +00:00
Erik Grinaker	a6937a3281	pageserver: improve shard ancestor compaction logging (#11535 ) ## Problem Shard ancestor compaction always logs "starting shard ancestor compaction", even if there is no work to do. This is very spammy (every 20 seconds for every shard). It also has limited progress logging. ## Summary of changes * Only log "starting shard ancestor compaction" when there's work to do. * Include details about the amount of work. * Log progress messages for each layer, and when waiting for uploads. * Log when compaction is completed, with elapsed duration and whether there is more work for a later iteration.	2025-04-11 12:14:08 +00:00
Erik Grinaker	3c8565a194	test_runner: propagate config via `attach_hook` for test fix (#11529 ) ## Problem The `pagebench` benchmarks set up an initial dataset by creating a template tenant, copying the remote storage to a bunch of new tenants, and attaching them to Pageservers. In #11420, we found that `test_pageserver_characterize_throughput_with_n_tenants` had degraded performance because it set a custom tenant config in Pageservers that was then replaced with the default tenant config by the storage controller. The initial fix was to register the tenants directly in the storage controller, but this created the tenants with generation 1. This broke `test_basebackup_with_high_slru_count`, where the template tenant was at generation 2, leading to all layer files at generation 2 being ignored. Resolves #11485. Touches #11381. ## Summary of changes This patch addresses both test issues by modifying `attach_hook` to also take a custom tenant config. This allows attaching tenants to Pageservers from pre-existing remote storage, specifying both the generation and tenant config when registering them in the storage controller.	2025-04-11 11:31:12 +00:00
Christian Schwarz	979fa0682b	tests: update batching perf test workload to include scattered LSNs (#11391 ) The batching perf test workload is currently read-only sequential scans. However, realistic workloads have concurrent writes (to other pages) going on. This PR simulates concurrent writes to other pages by emitting logical replication messages. These degrade the achieved batching factor, for the reason see - https://github.com/neondatabase/neon/issues/10765 PR - https://github.com/neondatabase/neon/pull/11494 will fix this problem and get batching factor back up. --------- Co-authored-by: Vlad Lazar <vlad@neon.tech>	2025-04-11 09:55:49 +00:00
Christian Schwarz	8884865bca	tests: make `test_pageserver_getpage_throttle` less flaky (#11482 ) # Refs - fixes https://github.com/neondatabase/neon/issues/11395 # Problem Since 2025-03-10, we have observed increased flakiness of `test_pageserver_getpage_throttle`. The test is timing-dependent by nature, and was hitting the ``` assert duration_secs >= 10 * actual_smgr_query_seconds, ( "smgr metrics should not include throttle wait time" ) ``` quite frequently. # Analysis These failures are not reproducible. In this PR's history is a commit that reran the test 100 times without requiring a single retry. In https://github.com/neondatabase/neon/issues/11395 there is a link to a query to the test results database. It shows that the flakiness was not constant, but rather episodic: 2025-03-{10,11,12,13} 2025-03-{19,20,21} 2025-03-31 and 2025-04-01. To me, this suggests variability in available CPU. # Solution The point of the offending assertion is to ensure that most of the request latency is spent on throttling, because testing of the throttling mechanism is the point of the test. The `10` magic number means at most 10% of mean latency may be spent on request processing. Ideally we would control the passage of time (virtual clock source) to make this test deterministic. But I don't see that happening in our regression test setup. So, this PR de-flakes the test as follows: - allot up to 66% of mean latency for request processing - increase duration from 10s to 20s, hoping to get better protection from momentary CPU spikes in noisy neighbor tests or VMs on the runner host As a drive-by, switch to `pytest.approx` and remove one self-test assertion I can't make sense of anymore.	2025-04-11 09:38:05 +00:00
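The arithmetic behind the relaxed assertion in #11482 can be illustrated with a small helper. The function name and the sample numbers are hypothetical; the 10% and 66% limits and the original `duration_secs >= 10 * actual_smgr_query_seconds` check come from the commit message.

```python
def processing_share_ok(duration_secs, smgr_query_secs, max_share):
    """True if request processing used at most max_share of the total duration."""
    return smgr_query_secs <= max_share * duration_secs


# The old check, duration_secs >= 10 * smgr_query_secs, equals max_share = 0.1.
old_limit = 0.1
new_limit = 0.66

# Hypothetical run: 20 s total, 5 s of it spent on request processing (25%).
print(processing_share_ok(20.0, 5.0, old_limit))
print(processing_share_ok(20.0, 5.0, new_limit))
```

A run that spends a quarter of its time on request processing fails the old 10% limit but passes the new 66% limit, which is the kind of CPU-starved run the change is meant to tolerate.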
Dmitrii Kovalkov	4c4e33bc2e	storage: add http/https server and cert resolver metrics (#11450 ) ## Problem We need to export some metrics about certs/connections to configure alerts and make sure that all HTTP requests are gone before turning https-only mode on. - Closes: https://github.com/neondatabase/cloud/issues/25526 ## Summary of changes - Add started connection and connection error metrics to http/https Server. - Add certificate expiration time and reload metrics to ReloadingCertificateResolver.	2025-04-11 06:11:35 +00:00
Tristan Partin	342607473a	Make Endpoint::respec_deep() infinitely deep (#11527 ) Because it wasn't recursive, there was a limit to the depth of updates. This work is necessary because as we teach neon_local and compute_ctl that the content in --spec-path should match a similar structure we get from the control plane, the spec object itself will no longer be toplevel. It will be under the "spec" key. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-04-10 19:55:51 +00:00
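A recursive update of the kind #11527 describes could be sketched like this. It is an illustration under assumptions, not the body of `Endpoint::respec_deep()`: the helper name `deep_update` and the sample config keys are hypothetical.

```python
def deep_update(target, updates):
    """Merge updates into target in place, recursing into nested dicts."""
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            # Both sides are dicts: descend instead of overwriting the subtree.
            deep_update(target[key], value)
        else:
            target[key] = value
    return target


config = {"spec": {"cluster": {"settings": {"a": 1, "b": 2}}, "format_version": 1}}
deep_update(config, {"spec": {"cluster": {"settings": {"b": 3}}}})
```

Because the function recurses at every level, an update nested under a top-level "spec" key changes only the targeted leaf and leaves sibling keys intact, whatever the depth.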
John Spray	9c37bfc90a	pageserver/tests: make image_layer_rewrite write less data (#11525 ) ## Problem This test is slow to execute, particularly if you're on a slow environment like vscode in a browser. Might have got much slower when we switched to direct IO? ## Summary of changes - Reduce the scale of the test by 10x, since there was nothing special about the original size.	2025-04-10 17:03:22 +00:00
John Spray	52dee408dc	storage controller: improve safety of shard splits coinciding with controller restarts (#11412 ) ## Problem The graceful leadership transfer process involves calling step_down on the old controller, but this was not waiting for shard splits to complete, and the new controller could therefore end up trying to abort a shard split while it was still going on. We mitigated this already in #11256 by avoiding the case where shard split completion would update the database incorrectly, but this was a fragile fix because it assumes that is the only problematic part of the split running concurrently. Precursors: - #11290 - #11256 Closes: #11254 ## Summary of changes - Hold the reconciler gate from shard splits, so that step_down will wait for them. Splits should always be fairly prompt, so it is okay to wait here. - Defense in depth: if step_down times out (hardcoded 10 second limit), then fully terminate the controller process rather than letting it continue running, potentially doing split-brainy things. This makes sense because the new controller will always declare itself leader unilaterally if step_down fails, so leaving an old controller running is not beneficial. - Tests: extend `test_storage_controller_leadership_transfer_during_split` to separately exercise the case of a split holding up step_down, and the case where the overall timeout on step_down is hit and the controller terminates.	2025-04-10 16:55:37 +00:00
Anastasia Lubennikova	5487a20b72	compute: Set log_parameter=off for audit logging. (#11500 ) Log -> Base, pgaudit.log = 'ddl', pgaudit.log_parameter='off' Hipaa -> Extended. pgaudit.log = 'all, -misc', pgaudit.log_parameter='off' add new level Full: pgaudit.log='all', pgaudit.log_parameter='on' Keep old parameter names for compatibility, until cplane side changes are implemented and released. closes https://github.com/neondatabase/cloud/issues/27202	2025-04-10 15:28:28 +00:00
Alex Chi Z.	f06d721a98	test(pageserver): ensure gc-compaction does not fire critical errors (#11513 ) ## Problem Part of https://github.com/neondatabase/neon/issues/10395 ## Summary of changes Add a test case to ensure gc-compaction doesn't fire any critical errors if the key history is invalid due to partial GC. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-10 14:53:37 +00:00
Christian Schwarz	2e35f23085	tests: remove ignored `fair` field (#11521 ) Pageserver has been ignoring field `tenant_config.timeline_get_throttle.fair` for many months, since we removed it from the config struct in neondatabase/neon#8539. Refs - epic https://github.com/neondatabase/cloud/issues/27320	2025-04-10 14:24:30 +00:00
Anastasia Lubennikova	5063151271	compute: Add more neon ids to compute (#11366 ) Pass more neon ids to compute_ctl. Expose them to postgres as neon extension GUCs: neon.project_id, neon.branch_id, neon.endpoint_id. This is the compute side PR, not yet supported by cplane.	2025-04-10 13:04:18 +00:00
Erik Grinaker	0122d97f95	test_runner: only use last gen in `test_location_conf_churn` (#11511 ) ## Problem `test_location_conf_churn` performs random location updates on Pageservers. While doing this, it could instruct the compute to connect to a stale generation and execute queries. This is invalid, and will fail if a newer generation has removed layer files used by the stale generation. Resolves #11348. ## Summary of changes Only connect to the latest generation when executing queries.	2025-04-10 10:07:16 +00:00
Arseny Sher	fae7528adb	walproposer: make it aware of membership (#11407 ) ## Problem Walproposer should get elected and commit WAL on safekeepers specified by the membership configuration. ## Summary of changes - Add to wp `members_safekeepers` and `new_members_safekeepers` arrays mapping configuration members to connection slots. Establish this mapping (by node id) when safekeeper sends greeting, giving its id and when mconf becomes known / changes. - Add to TermsCollected, VotesCollected, GetAcknowledgedByQuorumWALPosition membership aware logic. Currently it partially duplicates existing one, but we'll drop the latter eventually. - In python, rename Configuration to MembershipConfiguration for clarity. - Add test_quorum_sanity testing new logic. ref https://github.com/neondatabase/neon/issues/10851	2025-04-10 09:55:37 +00:00
Dmitrii Kovalkov	8a72e6f888	pageserver: add enable_tls_page_service_api (#11508 ) ## Problem Page service doesn't use TLS for incoming requests. - Closes: https://github.com/neondatabase/cloud/issues/27236 ## Summary of changes - Add option `enable_tls_page_service_api` to pageserver config - Propagate `tls_server_config` to `page_service` if the option is enabled No integration tests for now because I didn't find out how to call page service API from python and AFAIK computes don't support TLS yet	2025-04-10 08:45:17 +00:00
Tristan Partin	a04e33ceb6	Remove --spec-json argument from compute_ctl (#11510 ) It isn't used by the production control plane or neon_local. The removal simplifies compute spec logic just a little bit more since we can remove any notion of whether we should allow live reconfigurations. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-04-09 22:39:54 +00:00
Alex Chi Z.	af0be11503	fix(pageserver): ensure gc-compaction gets preempted by L0 (#11512 ) ## Problem Part of #9114 ## Summary of changes The gc-compaction flag was not set correctly, so gc-compaction was not preempted by L0. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-09 21:41:11 +00:00
Alex Chi Z.	405a17bf0b	fix(pageserver): ensure gc-compaction gets preempted by L0 (#11512 ) ## Problem Part of #9114 ## Summary of changes The gc-compaction flag was not set correctly, so gc-compaction was not preempted by L0. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-09 20:57:50 +00:00
Erik Grinaker	63ee8e2181	test_runner: ignore `.___temp` files in `evict_random_layers` (#11509 ) ## Problem `test_location_conf_churn` often fails with `neither image nor delta layer`, but doesn't say what the file actually is. However, past local failures have indicated that it might be `.___temp` files. Touches https://github.com/neondatabase/neon/issues/11348. ## Summary of changes Ignore `.___temp` files when evicting local layers, and include the file name in the error message.	2025-04-09 19:03:49 +00:00
Alex Chi Z.	2c21a65b0b	feat(pageserver): add gc-compaction time-to-first-item stats (#11475 ) ## Problem In some cases gc-compaction doesn't respond to the L0 compaction yield notifier. I suspect it's stuck on getting the first item, and if so, we probably need to let the L0 yield notifier preempt `next_with_trace`. ## Summary of changes - Add `time_to_first_kv_pair` to gc-compaction statistics. - Invert the ratio so that a smaller ratio means better compaction. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-09 18:07:58 +00:00
Alex Chi Z.	ec66b788e2	fix(pageserver): use different walredo retry setting for gc-compaction (#11497 ) ## Problem Not a complete fix for https://github.com/neondatabase/neon/issues/11492 but should work for a short term. Our current retry strategy for walredo is to retry every request exactly once. This retry doesn't make sense because it retries all requests exactly once and each error is expected to cause process restart and cause future requests to fail. I'll explain it with a scenario of two threads requesting redos: one with an invalid history (that will cause walredo to panic) and another that has a correct redo sequence. First let's look at how we handle retries right now in do_with_walredo_process. At the beginning of the function it will spawn a new process if there's no existing one. Then it will continue to redo. If the process fails, the first process that encounters the error will remove the walredo process object from the OnceCell, so that the next time it gets accessed, a new process will be spawned; if it is the last one that uses the old walredo process, it will kill and wait the process in `drop(proc)`. I'm skeptical whether this works under races but I think this is not the root cause of the problem. In this retry handler, if there are N requests attached to a walredo process and the i-th request fails (panics the walredo), all other N-i requests will fail and they need to retry so that they can access a new walredo process. ``` time ----> proc A None B request 1 ^-----------------^ fail uses A for redo replace with None request 2 ^-------------------- fail uses A for redo request 3 ^----------------^ fail uses A for redo last ref, wait for A to be killed request 4 ^--------------- None, spawn new process B ``` The problem is with our retry strategy. Normally, for a system that we want to retry on, the probability of errors for each of the requests are uncorrelated. 
However, in walredo, a prior request that panics the walredo process will cause all future walredo on that process to fail (that's correlated). So, back to the situation where we have 2 requests where one will definitely fail and the other will succeed and we get the following sequence, where retry attempts = 1, * new walredo process A starts. * request 1 (invalid) being processed on A and panics A, waiting for retry, remove process A from the process object. * request 2 (valid) being processed on A and receives pipe broken / poisoned process error, waiting for retry, wait for A to be killed -- this very likely takes a while and cannot finish before request 1 gets processed again * new walredo process B starts. * request 1 (invalid) being processed again on B and panics B, the whole request fail. * request 2 (valid) being processed again on B, and get a poisoned error again. ``` time ----> proc A None B None request 1 ^-----------------^--------------^--------------------^ spawn A for redo fail spawn B for redo fail request 2 ^--------------------^-------------------------^------------^ use A for redo fail, wait to kill A B for redo fail again ``` In such cases, no matter how we set n_attempts, as long as the retry count applies to all requests, this sequence is bound to fail both requests because of how they get sequenced; while we could potentially make request 2 successful. There are many solutions to this -- like having a separate walredo manager for compactions, or define which errors are retryable (i.e., broken pipe can be retried, while real walredo error won't be retried), or having a exclusive big lock over the whole redo process (the current one is very fine-grained). In this patch, we go with a simple approach: use different retry attempts for different types of requests. For gc-compaction, the attempt count is set to 0, so that it never retries and consequently stops the compaction process -- no more redo will be issued from gc-compaction. 
Once the walredo process gets restarted, the normal read requests will proceed normally. ## Summary of changes Add redo_attempt for each reconstruct value request to set different retry policies. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Erik Grinaker <erik@neon.tech>	2025-04-09 18:01:31 +00:00
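The per-request retry policy from #11497 can be sketched as below. This is an assumption-laden illustration: `RedoError`, `RETRY_ATTEMPTS` and `redo_with_retry` are hypothetical names, and the real change adds a `redo_attempt` setting to each reconstruct value request in the Rust pageserver.

```python
class RedoError(Exception):
    """Stand-in for a walredo failure such as a broken pipe."""


# Retries allowed per request kind: normal reads retry once, gc-compaction never.
RETRY_ATTEMPTS = {"read": 1, "gc_compaction": 0}


def redo_with_retry(request_kind, redo):
    """Call redo(attempt) until it succeeds or the retry budget is exhausted."""
    retries = RETRY_ATTEMPTS[request_kind]
    last_error = None
    for attempt in range(retries + 1):
        try:
            return redo(attempt)
        except RedoError as err:
            last_error = err
    raise last_error
```

With a budget of zero, a gc-compaction request that hits a poisoned process fails immediately and stops issuing redo work, while a read request still gets one more try against the respawned process.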
Peter Bendel	af12647b9d	large tenant oltp benchmark: reindex with downtime (remove concurrently) (#11498 ) ## Problem Our large OLTP benchmark runs for a very long time, and we want to cut the duration of the reindex step. We don't run a concurrent workload anyway; "concurrently" was added only to have a "prod-like" approach. If it just doubles the time we report because it requires two full table scans instead of one, we can remove it. ## Summary of changes Remove the keyword `concurrently` from the reindex step.	2025-04-09 17:11:00 +00:00
Tristan Partin	1c237d0c6d	Move compute_ctl claims struct into public API (#11505 ) This is preparatory work for teaching neon_local to pass the Authorization header to compute_ctl. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-04-09 16:58:44 +00:00
Tristan Partin	afd34291ca	Make neon_local token generation generic over claims (#11507 ) Instead of encoding a certain structure for claims, let's allow the caller to specify what claims be encoded. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-04-09 16:41:29 +00:00
Vlad Lazar	66f80e77ba	tests/performance: reconcile until idle before benchmark (#11435 ) We'd like to run benchmarks starting from a steady state. To this end, do a reconciliation round before proceeding with the benchmark. This is useful for benchmarks that use tenant dir snapshots since a non-standard tenant configuration is used to generate the snapshot. The storage controller is not aware of the non default tenant configuration and will reconcile while the bench is running.	2025-04-09 16:32:19 +00:00
Conrad Ludgate	72832b3214	chore: fix clippy lints from nightly-2025-03-16 (#11273 ) I like to run nightly clippy every so often to make our future rust upgrades easier. Some notable changes: * Prefer `next_back()` over `last()`. Generic iterators will implement `last()` to run forward through the iterator until the end. * Prefer `io::Error::other()`. * Use implicit returns One case where I haven't dealt with the issues is the now [more-sensitive "large enum variant" lint](https://github.com/rust-lang/rust-clippy/pull/13833). I chose not to take any decisions around it here, and simply marked them as allow for now.	2025-04-09 15:04:42 +00:00
Vlad Lazar	d11f23a341	pageserver: refactor read path for multi LSN batching support (#11463 ) ## Problem We wish to improve pageserver batching such that one batch can contain requests for pages at different LSNs. The current shape of the code doesn't lend itself to the change. ## Summary of changes Refactor the read path such that the fringe gets initialized upfront. This is where the multi LSN change will plug in. A couple other small changes fell out of this. There should be NO behaviour change here. If you smell one, shout! I recommend reviewing commits individually (intentionally made them as small as possible). Related: https://github.com/neondatabase/neon/issues/10765	2025-04-09 13:17:02 +00:00
Dmitrii Kovalkov	e7502a3d63	pageserver: return 412 PreconditionFailed in get_timestamp_of_lsn if timestamp is not found (#11491 ) ## Problem Currently `get_timestamp_of_lsn` returns `404 NotFound` if there are no clog pages for the given LSN, which is difficult to distinguish from other 404 errors. A separate status code for this error will allow the control plane to handle this case. - Closes: https://github.com/neondatabase/neon/issues/11439 - Corresponding PR in control plane: https://github.com/neondatabase/cloud/pull/27125 ## Summary of changes - Return `412 PreconditionFailed` instead of `404 NotFound` if no timestamp is found for the given LSN. I looked briefly through the current error handling code in cloud.git and the status code change should not affect anything for the existing code. The change from the corresponding PR also looks fine and should work with the current PS status code. Additionally, here is the OK to merge from the control plane team: https://github.com/neondatabase/neon/issues/11439#issuecomment-2789327552 --------- Co-authored-by: John Spray <john@neon.tech>	2025-04-09 13:16:15 +00:00
Heikki Linnakangas	ef8101a9be	refactor: Split "communicator" routines to a separate source file (#11459 ) pagestore_smgr.c had grown pretty large. Split into two parts, such that the smgr routines that PostgreSQL code calls stays in pagestore_smgr.c, and all the prefetching logic and other lower-level routines related to communicating with the pageserver are moved to a new source file, "communicator.c". There are plans to replace communicator parts with a new implementation. See https://github.com/neondatabase/neon/pull/10799. This commit doesn't implement any of the new things yet, but it is good preparation for it. I'm imagining that the new implementation will approximately replace the current "communicator.c" code, exposing roughly the same functions to pagestore_smgr.c. This commit doesn't change any functionality or behavior, or make any other changes to the existing code: It just moves existing code around.	2025-04-09 12:28:59 +00:00
Arpad Müller	d2825e72ad	Add is_stopping check around critical macro in walreceiver (#11496 ) The timeline stopping state is set much earlier than the cancellation token is fired, so by checking for the stopping state, we can prevent races with timeline shutdown where we issue a cancellation error but the cancellation token hasn't been fired yet. Fix #11427.	2025-04-09 12:17:45 +00:00
Erik Grinaker	a6ff8ec3d4	storcon: change default stripe size to 16 MB (#11168 ) ## Problem The current stripe size of 256 MB is a bit large, and can cause load imbalances across shards. A stripe size of 16 MB appears more reasonable to avoid hotspots, although we don't see evidence of this in benchmarks. Resolves https://github.com/neondatabase/cloud/issues/25634. Touches https://github.com/neondatabase/cloud/issues/21870. ## Summary of changes * Change the default stripe size to 16 MB. * Remove `ShardParameters::DEFAULT_STRIPE_SIZE`, and only use `pageserver_api::shard::DEFAULT_STRIPE_SIZE`. * Update a bunch of tests that assumed a certain stripe size.	2025-04-09 08:41:38 +00:00
Dmitrii Kovalkov	cf62017a5b	storcon: add https metrics for pageservers/safekeepers (#11460 ) ## Problem Storcon will not start up if `use_https` is on and there are some pageservers or safekeepers without https port in the database. Metrics "how many nodes with https we have in DB" will help us to make sure that `use_https` may be turned on safely. - Part of https://github.com/neondatabase/cloud/issues/25526 ## Summary of changes - Add `storage_controller_https_pageserver_nodes`, `storage_controller_safekeeper_nodes` and `storage_controller_https_safekeeper_nodes` Prometheus metrics.	2025-04-09 08:33:49 +00:00
Erik Grinaker	c610f3584d	test_runner: tweak `test_create_snapshot` compaction (#11495 ) ## Problem With the recent improvements to L0 compaction responsiveness, `test_create_snapshot` now ends up generating 10,000 layer files (compared to 1,000 in previous snapshots). This increases the snapshot size by 4x, and significantly slows down tests. ## Summary of changes Increase the target layer size from 128 KB to 256 KB, and the L0 compaction threshold from 1 to 5. This reduces the layer count from about 10,000 to 1,000.	2025-04-09 06:52:49 +00:00
Konstantin Knizhnik	c9ca8b7c4a	One more fix for unlogged build support in DEBUG_COMPARE_LOCAL (#11474 ) ## Problem Support of unlogged build in DEBUG_COMPARE_LOCAL. Neon SMGR treats the presence of a local file as an indicator of an unlogged relation, but this doesn't work in DEBUG_COMPARE_LOCAL mode. ## Summary of changes Use INIT_FORKNUM as the indicator of an unlogged file and create this file during an unlogged index build. Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-04-09 05:14:29 +00:00
Erik Grinaker	7679b63a2c	pageserver: persist stripe size in tenant manifest for tenant_import (#11181 ) ## Problem `tenant_import`, used to import an existing tenant from remote storage into a storage controller for support and debugging, assumed `DEFAULT_STRIPE_SIZE` since this can't be recovered from remote storage. In #11168, we are changing the stripe size, which will break `tenant_import`. Resolves #11175. ## Summary of changes * Add `stripe_size` to the tenant manifest. * Add `TenantScanRemoteStorageShard::stripe_size` and return from `tenant_scan_remote` if present. * Recover the stripe size during `tenant_import`, or fall back to 32768 (the original default stripe size). * Add tenant manifest compatibility snapshot: `2025-04-08-pgv17-tenant-manifest-v1.tar.zst` There are no cross-version concerns here, since unknown fields are ignored during deserialization where relevant.	2025-04-08 20:43:27 +00:00
Erik Grinaker	d177654e5f	gitignore: add `/artifact_cache` (#11493 ) ## Problem This is generated e.g. by `test_historic_storage_formats`, and causes VSCode to list all the contained files as new. ## Summary of changes Add `/artifact_cache` to `.gitignore`.	2025-04-08 16:57:10 +00:00
Alex Chi Z.	a09c933de3	test(pageserver): add conditional append test record (#11476 ) ## Problem For future gc-compaction tests when we support https://github.com/neondatabase/neon/issues/10395 ## Summary of changes Add a new type of neon test WAL record that is conditionally applied (i.e., only when image == the specified value). We can use this to mock the situation where we lose some records in the middle, firing an error, and see how gc-compaction reacts to it. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-08 16:08:44 +00:00
Mikhail Kot	6138d61592	Object storage proxy (#11357 ) Service targeted for storing and retrieving LFC prewarm data. Can be used for proxying S3 access for Postgres extensions like pg_mooncake as well. Requests must include a Bearer JWT token. The token is validated using a pemfile (should be passed in infra/). Note: the app is not tolerant of extra trailing slashes, see app.rs `delete_prefix` test for comments. Resolves: https://github.com/neondatabase/cloud/issues/26342 Unrelated changes: gate a `rename_noreplace` feature and disable it in `remote_storage` so that `object_storage` can be built with musl	2025-04-08 14:54:53 +00:00
Roman Zaynetdinov	a7142f3bc6	Configure rsyslog for logs export using the spec (#11338 ) - Work on https://github.com/neondatabase/cloud/issues/24896 - Cplane part https://github.com/neondatabase/cloud/pull/26808 Instead of reconfiguring rsyslog via an API endpoint [we have agreed](https://neondb.slack.com/archives/C04DGM6SMTM/p1743513810964509?thread_ts=1743170369.865859&cid=C04DGM6SMTM) to have a new `logs_export_host` field as part of the compute spec. --------- Co-authored-by: Tristan Partin <tristan@neon.tech>	2025-04-08 14:03:09 +00:00
Dmitrii Kovalkov	7791a49dd4	fix(tests): improve test_scrubber_tenant_snapshot stability (#11471 ) ## Problem `test_scrubber_tenant_snapshot` is flaky with `request was dropped` errors. More details are in the issue. - Closes: https://github.com/neondatabase/neon/issues/11278 ## Summary of changes - Disable shard scheduling during pageservers restart - Add `reconcile_until_idle` in the end of the test	2025-04-08 10:03:38 +00:00
dependabot[bot]	8a6d0dccaa	build(deps): bump tokio from 1.38.0 to 1.38.2 in /test_runner/pg_clients/rust/tokio-postgres in the cargo group across 1 directory (#11478 ) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-04-08 10:01:15 +00:00
Heikki Linnakangas	7ffcbfde9a	refactor: Move LFC function prototypes to separate header file (#11458 ) Also, move the call to the lfc_init() function. It was weird to have it in libpagestore.c, when libpagestore.c otherwise had nothing to do with the LFC. Move it directly into _PG_init()	2025-04-08 09:03:56 +00:00
github-actions[bot]	1e6bb48076	Proxy release 2025-04-08	2025-04-08 06:01:37 +00:00
Konstantin Knizhnik	b2a0b2e9dd	Skip hole tags in local_cache view (#11454 ) ## Problem If the local file cache is shrunk, so that we punch some holes in the underlying file, the local_cache view displays the holes incorrectly. See https://github.com/neondatabase/neon/issues/10770 ## Summary of changes Skip hole tags in the local_cache view. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-04-08 03:52:50 +00:00
Alex Chi Z.	0875dacce0	fix(pageserver): more aggressively yield in gc-compaction, degrade errors to warnings (#11469 ) ## Problem Fix various small issues discovered during gc-compaction rollout. ## Summary of changes - Log level changes: if errors are from gc-compaction, fire a warning instead of errors or critical errors. - Yield to L0 compaction more aggressively. Instead of checking every 1k keys, we check on every key. Sometimes a single key reconstruct takes a long time. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-07 21:19:06 +00:00
Erik Grinaker	99d8788756	pageserver: improve tenant manifest lifecycle (#11328 ) ## Problem Currently, the tenant manifest is only uploaded if there are offloaded timelines. The checks are also a bit loose (e.g. only checks number of offloaded timelines). We want to start using the manifest for other things too (e.g. stripe size). Resolves #11271. ## Summary of changes This patch ensures that a tenant manifest always exists. The lifecycle is: * During preload, fetch the existing manifest, if any. * During attach, upload a tenant manifest if it differs from the preloaded one (or does not exist). * Upload a new manifest as needed, if it differs from the last-known manifest (ignoring version number). * On splits, pre-populate the manifest from the parent. * During Pageserver physical GC, remove old manifests but keep the latest 2 generations. This will cause nearly all existing tenants to upload a new tenant manifest on their first attach after this change. Attaches are concurrency-limited in the storage controller, so we expect this will be fine. Also updates `make_broken` to automatically log at `INFO` level when the tenant has been cancelled, to avoid spurious error logs during shutdown.	2025-04-07 19:10:36 +00:00
Erik Grinaker	26c5c7e942	pageserver: set `Stopping` state on attach cancellation (#11462 ) ## Problem If a tenant is cancelled (e.g. due to Pageserver shutdown) during attach, it is set to `Broken`. This results both in error log spam and 500 responses for clients -- shutdown is supposed to return 503 responses which can be retried. This becomes more likely to happen with #11328, where we perform tenant manifest downloads/uploads during attach. ## Summary of changes Set tenant state to `Stopping` when attach fails and the tenant is cancelled, downgrading the log messages to INFO. This introduces two variants of `Stopping` -- with and without a caller barrier -- where the latter is used to signal attach cancellation.	2025-04-07 17:56:56 +00:00
Arpad Müller	8a2b19f467	Allow potential warning in test_storcon_create_delete_sk_down (#11466 ) Since merging #11400 and addition of `test_storcon_create_delete_sk_down`, we've seen an error occur multiple times. https://github.com/neondatabase/neon/pull/11400#issuecomment-2782528369	2025-04-07 16:52:54 +00:00
Arpad Müller	486872dd28	Add support to specify auth token via --auth-token-path (#11443 ) Before we specified the JWT via `SAFEKEEPER_AUTH_TOKEN`, but env vars are quite public, both in procfs as well as the unit files. So add a way to put the auth token into a file directly. context: https://neondb.slack.com/archives/C033RQ5SPDH/p1743692566311099	2025-04-07 16:12:04 +00:00
Alex Chi Z.	d37e90f430	fix(pageserver): allow shard ancestor compaction to be cancelled (#11452 ) ## Problem https://github.com/neondatabase/neon/issues/11330 https://github.com/neondatabase/neon/issues/11358 ## Summary of changes Looking at the staging log, a few tenants right after shard split are stuck on shutdown because they are running shard ancestor compaction. The compaction does not respect the cancellation token. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-07 16:01:21 +00:00
Konstantin Knizhnik	8eb701d706	Save FSM/VM pages on normal shutdown (#11449 ) ## Problem See https://neondb.slack.com/archives/C03QLRH7PPD/p1743746717119179 We WAL-log FSM/VM pages when they are written to disk to persist them in PS. But this does not happen during the shutdown checkpoint, because writing to WAL during a checkpoint causes a Postgres panic. ## Summary of changes Move `CheckPointBuffers` call to `PreCheckPointGuts` Postgres PRs: https://github.com/neondatabase/postgres/pull/615 https://github.com/neondatabase/postgres/pull/614 https://github.com/neondatabase/postgres/pull/613 https://github.com/neondatabase/postgres/pull/612 --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-04-07 13:56:55 +00:00
Conrad Ludgate	85a515c176	update tokio for RUSTSEC-2025-0023 (#11464 )	2025-04-07 13:33:56 +00:00
Christian Schwarz	aa88279681	fix(storcon/http): node status API returns serialized runtime object (#11461 ) The Serialize impl on the `Node` type is for the `/debug` endpoint only. Committed APIs should use the `NodeDescribeResponse`. Refs - fixes https://github.com/neondatabase/neon/issues/11326 - found while working on admin UI change https://github.com/neondatabase/cloud/pull/26207	2025-04-07 12:23:40 +00:00
Heikki Linnakangas	b2a670c765	refactor: Use same prototype for neon_read_at_lsn on all PG versions (#11457 ) The 'neon_read' function needs to have a different prototype on PG < 16, because it's part of the smgr interface. But neon_read_at_lsn doesn't have that restriction.	2025-04-07 11:04:36 +00:00
a-masterov	ad9655bb01	Fix the errors in pg_regress test running on the staging. (#11432 ) ## Problem The shared libraries preloaded by default interfered with the `pg_regress` tests on staging, causing wrong results ## Summary of changes The projects used for these tests are now free from unnecessary extensions. Some changes were made in patches.	2025-04-06 19:30:21 +00:00
Heikki Linnakangas	1a87975d95	Misc cleanup of #includes and comments in the neon extension (#11456 ) Remove useless and often wrong IDENTIFICATION comments. PostgreSQL sources have them, mostly for historical reasons, but there's no need for us to copy that style. Remove unnecessary #includes in header files, putting the #includes directly in the .c files that need them. The principle is that a header file should #include other header files if they need definitions from them, such that each header file can be compiled on its own, but not other #includes. (There are tools to enforce that, but this was just a manual clean up of violations that I happened to spot.)	2025-04-06 15:34:13 +00:00
dependabot[bot]	417b2781d9	build(deps): bump openssl from 0.10.70 to 0.10.72 in /test_runner/pg_clients/rust/tokio-postgres in the cargo group across 1 directory (#11455 ) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-04-05 13:00:51 +00:00
Suhas Thalanki	2841f1ffa5	removal of `pg_embedding` (#11440 ) ## Problem The `pg_embedding` extension has been deprecated and can cause issues with recent changes such as with https://github.com/neondatabase/neon/issues/10973 Issue: `PG:2025-04-03 15:39:25.498 GMT ttid=a4de5bee50225424b053dc64bac96d87/d6f3891b8f968458b3f7edea58fb3c6f sqlstate=58P01 [15526] ERROR: could not load library "/usr/local/lib/embedding.so": /usr/local/lib/embedding.so: undefined symbol: SetLastWrittenLSNForRelation` ## Summary of changes Removed `pg_embedding` extension from the compute image.	2025-04-04 18:21:23 +00:00
Christian Schwarz	aad410c8f1	improve ondemand-download latency observability (#11421 ) ## Problem We don't have metrics to exactly quantify the end user impact of on-demand downloads. Perf tracing is underway (#11140) to supply us with high-resolution samples. But it will also be useful to have some aggregate per-timeline and per-instance metrics that definitively contain all observations. ## Summary of changes This PR consists of independent commits that should be reviewed independently. However, for convenience, we're going to merge them together. - refactor(metrics): measure_remote_op can use async traits - impr(pageserver metrics): task_kind dimension for remote_timeline_client latency histo - implements https://github.com/neondatabase/cloud/issues/26800 - refs https://github.com/neondatabase/cloud/issues/26193#issuecomment-2769705793 - use the opportunity to rename the metric and add a _global suffix; checked grafana export, it's only used in two personal dashboards, one of them mine, the other by Heikki - log on-demand download latency for expensive-to-query but precise ground truth - metric for wall clock time spent waiting for on-demand downloads ## Refs - refs https://github.com/neondatabase/cloud/issues/26800 - a bunch of minor investigations / incidents into latency outliers	2025-04-04 18:04:39 +00:00
Christian Schwarz	4f94751b75	pageserver config: ignore+warn about unknown fields (instead of `deny_unknown_fields`) (#11275 ) # Refs - refs https://github.com/neondatabase/neon/issues/8915 - discussion thread: https://neondb.slack.com/archives/C033RQ5SPDH/p1742406381132599 - stacked atop https://github.com/neondatabase/neon/pull/11298 - corresponding internal docs update that illustrates how this PR removes friction: https://github.com/neondatabase/docs/pull/404 # Problem Rejecting `pageserver.toml`s with unknown fields adds friction, especially when using `pageserver.toml` fields as feature flags that need to be decommissioned. See the added paragraphs on `pageserver_api::models::ConfigToml` for details on what kind of friction it causes. Also read the corresponding internal docs update linked above to see a more imperative guide for using `pageserver.toml` flags as feature flags. # Solution ## Ignoring unknown fields Ignoring is the serde default behavior. So, just remove `serde(deny_unknown_fields)` from all structs in `pageserver_api::config::ConfigToml` and `pageserver_api::config::TenantConfigToml`. I went through all the child fields and verified they don't use `deny_unknown_fields` either, including those shared with `pageserver_api::models`. ## Warning about unknown fields We still want to warn about unknown fields to - be informed about typos in the config template - be reminded about feature-flag style configs that have been cleaned up in code but not yet in config templates We tried `serde_ignore` (cf draft #11319) but it doesn't work with `serde(flatten)`. The solution we arrived at is to compare the on-disk TOML with the TOML that we produce if we serialize the `ConfigToml` again. Any key specified in the on-disk TOML but not present in the serialized TOML is flagged as an ignored key. The mechanism to do it is a tiny recursive descent visitor on the `toml_edit::DocumentMut`. 
# Future Work Invalid config _values_ in known fields will continue to fail pageserver startup. See - https://github.com/neondatabase/cloud/issues/24349 for current worst case impact to deployments & ideas to improve.	2025-04-04 17:30:58 +00:00
Christian Schwarz	6ee84d985a	impr(perf tracing): ability to correlate with page_service logs (#11398 ) # Problem Current perf tracing fields do not allow answering the question of what a specific Postgres backend was waiting for. # Background For Pageserver logs, we set the backend PID as the libpq `application_name` on the compute side, and funnel that into a tracing field for the spans that emit to the global tracing subscriber. # Solution Funnel `application_name`, and the other fields that we use in the logging spans, into the root span for perf tracing. # Refs - fixes https://github.com/neondatabase/neon/issues/11393 - stacked atop https://github.com/neondatabase/neon/pull/11433 - epic: https://github.com/neondatabase/neon/issues/9873	2025-04-04 15:13:54 +00:00
JC Grünhage	295be03a33	impr(ci): send clearer notifications to slack when retrying container image pushes (#11447 ) ## Problem We've started sending slack notifications for failed container image pushes that are being retried. There are more messages coming in than expected, so clicking through the link to see what image failed is happening more often than we hoped. ## Summary of changes - Make slack notifications clearer, including whether the job succeeded and what retries have happened. - Log failures/retries in step more clearly, so that you can easily see when something fails.	2025-04-04 14:56:41 +00:00
Alexander Bayandin	8e1b5a9727	Fix Postgres build on macOS (#11442 ) ## Problem Postgres build fails with the following error on macOS: ``` /Users/bayandin/work/neon//vendor/postgres-v14/src/port/snprintf.c:424:27: error: 'strchrnul' is only available on macOS 15.4 or newer [-Werror,-Wunguarded-availability-new] 424 \| const char *next_pct = strchrnul(format + 1, '%'); \| ^~~~~~~~~ /Users/bayandin/work/neon//vendor/postgres-v14/src/port/snprintf.c:376:14: note: 'strchrnul' has been marked as being introduced in macOS 15.4 here, but the deployment target is macOS 15.0.0 376 \| extern char *strchrnul(const char *s, int c); \| ^ /Users/bayandin/work/neon//vendor/postgres-v14/src/port/snprintf.c:424:27: note: enclose 'strchrnul' in a __builtin_available check to silence this warning 424 \| const char *next_pct = strchrnul(format + 1, '%'); \| ^~~~~~~~~ 425 \| 426 \| /* Dump literal data we just scanned over */ 427 \| dostr(format, next_pct - format, target); 428 \| if (target->failed) 429 \| break; 430 \| 431 \| if (*next_pct == '\0') 432 \| break; 433 \| format = next_pct; \| 1 error generated. ``` ## Summary of changes - Update Postgres fork to include changes from `6da2ba1d8a` Corresponding Postgres PRs: - https://github.com/neondatabase/postgres/pull/608 - https://github.com/neondatabase/postgres/pull/609 - https://github.com/neondatabase/postgres/pull/610 - https://github.com/neondatabase/postgres/pull/611	2025-04-04 14:09:15 +00:00
Vlad Lazar	1ef4258f29	pageserver: add tenant level performance tracing sampling ratio (#11433 ) ## Problem https://github.com/neondatabase/neon/pull/11140 introduces performance tracing with OTEL and a pageserver config which configures the sampling ratio of get page requests. Enabling a non-zero sampling ratio on a per region basis is too aggressive and comes with perf impact that isn't very well understood yet. ## Summary of changes Add a `sampling_ratio` tenant level config which overrides the pageserver level config. Note that we do not cache the config and load it on every get page request such that changes propagate timely. Note that I've had to remove the `SHARD_SELECTION` span to get this to work. The tracing library doesn't expose a neat way to drop a span if one realises it's not needed at runtime. Closes https://github.com/neondatabase/neon/issues/11392	2025-04-04 13:41:28 +00:00
Vlad Lazar	65e2aae6e4	pageserver/secondary: deregister IO metrics (#11283 ) ## Problem IO metrics for secondary locations do not get deregistered when the timeline is removed. ## Summary of changes Stash the request context to be used for downloads in `SecondaryTimelineDetail`. These objects match the lifetime of the secondary timeline location pretty well. When the timeline is removed, deregister the metrics too. Closes https://github.com/neondatabase/neon/issues/11156	2025-04-04 10:52:59 +00:00
a-masterov	edc874e1b3	Use the same test image version as the compute one (#11448 ) ## Problem Changes in compute can cause errors in tests if another version of the `neon-test-extensions` image is used. ## Summary of changes Use the same version of the `neon-test-extensions` image as the `compute` one for docker-compose based extension tests.	2025-04-04 10:13:00 +00:00
Dmitrii Kovalkov	181af302b5	storcon + safekeeper + scrubber: propagate root CA certs everywhere (#11418 ) ## Problem There are some places in the code where we create `reqwest::Client` without providing SSL CA certs from `ssl_ca_file`. These will break after we enable TLS everywhere. - Part of https://github.com/neondatabase/cloud/issues/22686 ## Summary of changes - Support `ssl_ca_file` in storage scrubber. - Add `use_https_safekeeper_api` option to safekeeper to use https for peer requests. - Propagate SSL CA certs to storage_controller/client, storcon's ComputeHook, PeerClient and maybe_forward.	2025-04-04 06:30:48 +00:00
Tristan Partin	497116b76d	Download extension if it does not exist on the filesystem (#11315 ) Previously we attempted to download all extensions in CREATE EXTENSION statements. Extensions like pg_stat_statements and neon are not remote extensions, but still we were requesting them when skip_pg_catalog_updates was set to false. Fixes: https://github.com/neondatabase/neon/issues/11127 Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-04-04 01:06:22 +00:00
Arpad Müller	a917952b30	Add test_storcon_create_delete_sk_down and make it work (#11400 ) Adds a test `test_storcon_create_delete_sk_down` which tests the reconciler and pending op persistence when faced with a temporary safekeeper downtime during timeline creation or deletion. This is in contrast to `test_explicit_timeline_creation_storcon`, which tests the happy path. We also do some fixes: * Timeline and tenant deletion http requests didn't expect a body, but `()` sent one. * We got the tenant deletion http request's return type wrong: it's supposed to be a hash map. * We add some logging to improve observability. * We fix `list_pending_ops`, which had broken code meant to make it possible to restrict oneself to a single pageserver. But diesel doesn't support that sadly, or at least I couldn't figure out a way to make it work. We don't need that functionality, so remove it. * We add an info span to the heartbeater futures with the node id, so that there are no context-free messages like "Backoff: waiting 1.1 seconds before processing with the task" in the storcon logs. We could also add the full base url of the node but don't do it, as most other log lines contain that information already, and if we do duplication it should at least not be verbose. One can always find out the base url from the node id. Successor of #11261 Part of #9011	2025-04-04 00:17:40 +00:00
Tristan Partin	e581b670f4	Improve nightly physical replication benchmark (#11389 ) Log the created project and endpoint IDs and improve typing in the source code to improve readability. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-04-03 23:00:58 +00:00
Alexander Bayandin	8ed79ed773	build(deps): bump h2 to 4.2.0 (#11437 ) ## Problem We switched `h2` from 4.1.0 to a git commit to fix stubgen (in https://github.com/neondatabase/neon/pull/10491). `h2` 4.2.0 was released soon after that, so we can switch back to a pinned version. Expected no changes, as 4.2.0 is the right next commit after the commit we currently use: `dacd614fed`^ ## Summary of changes - Bump `h2` to 4.2.0	2025-04-03 21:42:34 +00:00
Alex Chi Z.	381f42519e	fix(pageserver): skip gc-compaction over sparse keyspaces (#11404 ) ## Problem Part of https://github.com/neondatabase/neon/issues/11318 It's not 100% safe for now to run gc-compaction over the sparse keyspace. It might cause deleted files to re-appear if a specific sequence of operations is done as in the issue, which in reality doesn't happen due to how we split delta/image layers based on the key range. A long-term fix would be either having a separate gc-compaction code path for metadata keys (as we have a different code path for metadata image layer generation), or making the compaction process aware of the information that "there's an image layer that doesn't contain a key" so that we can skip the keys. ## Summary of changes * gc-compaction auto trigger only triggers compaction over the normal data range. * do not hold gc_block_guard across the full compaction job, only hold it during each subcompaction. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-03 19:40:44 +00:00
Arpad Müller	375df517a0	storcon: return 503 instead of 500 if there is no new leader yet (#11417 ) The leadership transfer protocol between storage controller instances is as follows, listing the steps for the new pod: 1. The new pod comes online and looks in the database for a leader. If there is one, it asks that leader to step down. 2. The new pod does some operations to come online. They should be fairly short, but the time is not zero. 3. The new pod updates the leader entry in the database. The old pod, once it gets the step down request, changes its internal state to stepped down. It treats all incoming requests specially now: instead of processing, it wants to forward them to the new pod. The forwarding however only works if the new pod is online already, so before forwarding it reads from the db for a leader (also to get the address to forward to in the first place). If the new pod is not online yet, i.e. during step 2 above, the old pod might legitimately land in the branch which this patch is editing: the leader in the database is a stepped down instance. Previously, we returned an `ApiError::InternalServerError`, but that would print the full backtrace plus an error log. With this patch, we cut down on the noise, as it's an expected situation to have a short storcon downtime while we are cutting over to the new instance. A `ResourceUnavailable` error is not just more fitting, it also doesn't print a backtrace once encountered, and only prints on the INFO log level (see `api_error_handler` function). Fixes #11320 cc #8954	2025-04-03 18:43:16 +00:00
Vlad Lazar	9db63fea7a	pageserver: optionally export perf traces in OTEL format (#11140 ) Based on https://github.com/neondatabase/neon/pull/11139 ## Problem We want to export performance traces from the pageserver in OTEL format. End goal is to see them in Grafana. ## Summary of changes https://github.com/neondatabase/neon/pull/11139 introduces the infrastructure required to run the otel collector alongside the pageserver. ### Design Requirements: 1. We'd like to avoid implementing our own performance tracing stack and use the `tracing` crate if possible. 2. Ideally, we'd like zero overhead at a sampling rate of zero and to be able to change the tracing config for a tenant on the fly. 3. We should leave the current span hierarchy intact. This includes adding perf traces without modifying existing tracing. To satisfy (3) (and (2) in part) a separate span hierarchy is used. `RequestContext` gains an optional `perf_span` member that's only set when the request was chosen by sampling. All perf span related methods added to `RequestContext` are no-ops for requests that are not sampled. This on its own is not enough for (3), so performance spans use a separate tracing subscriber. The `tracing` crate doesn't have great support for this, so there's a fair amount of boilerplate to override the subscriber at all points of the perf span lifecycle. ### Perf Impact [Periodic pagebench](https://neonprod.grafana.net/d/ddqtbfykfqfi8d/e904990?orgId=1&from=2025-02-08T14:15:59.362Z&to=2025-03-10T14:15:59.362Z&timezone=utc) shows no statistically significant regression with a sample ratio of 0. There's an annotation on the dashboard on 2025-03-06. ### Overview of changes: 1. Clean up the `RequestContext` API a bit. Namely, get rid of the `RequestContext::extend` API and use the builder instead. 2. Add pageserver level configs for tracing: sampling ratio, otel endpoint, etc. 3. Introduce some perf span tracking utilities and expose them via `RequestContext`. 
We add a `tracing::Span` wrapper to be used for perf spans and a `tracing::Instrumented` equivalent for it. See doc comments for reason. 4. Set up OTEL tracing infra according to configuration. A separate runtime is used for the collector. 5. Add perf traces to the read path. ## Refs - epic https://github.com/neondatabase/neon/issues/9873 --------- Co-authored-by: Christian Schwarz <christian@neon.tech>	2025-04-03 17:56:51 +00:00
Alex Chi Z.	bfc767d60d	fix(test): wait for shard split complete for test_lsn_lease_storcon (#11436 ) ## Problem close https://github.com/neondatabase/neon/issues/11397 ref https://github.com/neondatabase/cloud/issues/23667 ## Summary of changes We need to wait until the shard split is complete, otherwise it will print warning like waiting for shard split exclusive lock for 30s. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-03 17:49:45 +00:00
Alex Chi Z.	109c54a300	fix(pageserver): avoid gc-compaction triggering circuit breaker (#11403 ) ## Problem There are some cases where traditional gc might collect some layer files, so that gc-compaction cannot read the full history of the key. This needs to be resolved in the long term by improving the compaction process. For now, let's simply avoid such errors triggering the circuit breaker. ## Summary of changes * Move the place where we trigger the circuit breaker. We only trigger it during compactions other than L0 compactions. We added the trigger a year ago due to file cleanup concerns in image layer compaction. * For gc-compaction, only return errors to the upper compaction_iteration if it's a shutdown error. Otherwise, just log it and skip the compaction for a key range. Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech>	2025-04-03 17:18:37 +00:00
Vlad Lazar	74920d8cd8	storcon: notify compute if correct observed state was refreshed (#11342 ) ## Problem Previously, if the observed state was refreshed and matching the intent, we wouldn't send a compute notification. This is unsafe. There's no guarantee that the location landed on the pageserver _and_ a compute notification for it was delivered. See https://github.com/neondatabase/neon/issues/11291#issuecomment-2743205411 for one such example. ## Summary of changes Add a reproducer and notify the compute if the correct observed state required a refresh. Closes https://github.com/neondatabase/neon/issues/11291	2025-04-03 16:35:55 +00:00
Alex Chi Z.	131b32ef48	fix(pageserver): clean up aux files before detaching (#11299 ) ## Problem Related to https://github.com/neondatabase/cloud/issues/26091 and https://github.com/neondatabase/cloud/issues/25840 Close https://github.com/neondatabase/neon/issues/11297 Discussion on Slack: https://neondb.slack.com/archives/C033RQ5SPDH/p1742320666313969 ## Summary of changes * When detaching, scan all aux files within `sparse_non_inherited_keyspace` in the ancestor timeline and create an image layer exactly at the ancestor LSN. All scanned keys will map to an empty value, which is a delete tombstone. - Note that end_lsn for rewritten delta layers = ancestor_lsn + 1, so the image layer will have image_end_lsn=end_lsn. With the current `select_layer` logic, the read path will always first read the image layer. * Add a test case. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech>	2025-04-03 15:55:22 +00:00
Suhas Thalanki	581bb5d7d5	removed pg_anon setup from compute dockerfile (#10960 ) ## Problem Removing the `anon` v1 extension in postgres as described in https://github.com/neondatabase/cloud/issues/22663. This extension is not built for postgres v17 and is out of date when compared to the upstream variant which is v2 (we have v1.4). ## Summary of changes Removed the `anon` v1 extension from the compute docker image Related to https://github.com/neondatabase/cloud/issues/22663	2025-04-03 15:26:35 +00:00
JC Grünhage	3c78133477	feat(ci): add 'released' tag to container images from release runs (#11425 ) ## Problem We had a problem with https://github.com/neondatabase/neon/pull/11413 having e2e tests failing, because an e2e test (`8d271bed47`) depended on an unreleased pageserver fix (`0ee5bfa2fc`). This came up because neon release CI runs against the most recent releases of the other components, but cloud e2e tests run against latest, which is tagged from main. ## Summary of changes Add an additional `released` tag for released versions. ## Alternative to consider We could (and maybe should) instead switch to `latest` being used for released versions and `main` being used where we use `latest` right now. That'd also mean we don't have to adjust the CI in the cloud repo.	2025-04-03 14:57:44 +00:00
Suhas Thalanki	46e046e779	Exporting `file_cache_used` to calculate LFC utilization (#11384 ) ## Problem Exporting `file_cache_used` which specifies the number of used chunks in the LFC. This helps calculate LFC utilization as: `file_cache_used_pages / (file_cache_used * file_cache_chunk_size_pages)` ## Summary of changes Exporting `file_cache_used`. Related Issue: https://github.com/neondatabase/cloud/issues/26688	2025-04-03 14:54:45 +00:00
Arpad Müller	d8cee52637	Update rust to 1.86.0 (#11431 ) We keep the practice of keeping the compiler up to date, pointing to the latest release. This is done by many other projects in the Rust ecosystem as well. [Announcement blog post](https://blog.rust-lang.org/2025/04/03/Rust-1.86.0.html). Prior update was in #10914.	2025-04-03 14:53:28 +00:00
Dmitrii Kovalkov	2e11d129d0	tests: suppress mgm api timeout error in storcon (#11428 ) ## Problem Since `0f367cb665` the timeout in `with_client_retries` is implemented via `tokio::timeout` instead of `reqwest::ClientBuilder::timeout` (because we reuse the client). It changed the error representation if the timeout is exceeded. Such errors were suppressed in `allowed_errors.py`, but old regexps do not match the new error. Discussion: https://neondb.slack.com/archives/C033RQ5SPDH/p1743533184736319 ## Summary of changes - Add new `Timeout` error to `allowed_errors.py`	2025-04-03 14:18:50 +00:00
Luís Tavares	43a7423f72	feat: bump pg_session_jwt extension to 0.3.0 (#11399 ) ## Problem Bumps https://github.com/neondatabase/pg_session_jwt to the latest release [v0.3.0](https://github.com/neondatabase/pg_session_jwt/releases/tag/v0.3.0) that introduces PostgREST fallback mechanisms. ## Summary of changes Updates the extension download tar and the extension version in the proxy constant. ## Subscribers @mrl5	2025-04-03 13:01:18 +00:00
Arpad Müller	374736a4de	Print remote_addr span for Failed to serve HTTP connection error (#11423 ) I've encountered this error in #11422. Ideally we'd have the URL as well to associate it with a tenant, but at this level we only have the remote addr I guess. Better than nothing.	2025-04-03 11:58:12 +00:00
Alexander Bayandin	5e507776bc	Compute: update plv8 patch (#11426 ) ## Problem https://github.com/neondatabase/cloud/issues/26866 ## Summary of changes - Update plv8 patch Co-authored-by: Alexey Kondratov <kondratov.aleksey@gmail.com>	2025-04-03 11:29:53 +00:00
Alexander Lakhin	4e8e0951be	Increase timeout for test_pageserver_gc_compaction_smoke (#11410 ) ## Problem The test_pageserver_gc_compaction_smoke fails rather often due to a timeout on slow machines. See https://github.com/neondatabase/neon/issues/11355. ## Summary of changes Increase the timeout for the test.	2025-04-03 11:23:30 +00:00
JC Grünhage	64a8d0c2e6	impr(ci): retry container image pushing and send slack messages for failures (#11416 ) ## Problem We've seen quite a few CI failures related to pushes to docker hub failing with weird error messages that indicate maybe docker hub is just not reliable. ## Summary of changes Retry container image pushing up to 10 times, and send a slack message if we had to retry, regardless of the job succeeding or not.	2025-04-03 08:59:42 +00:00
Tristan Partin	7602e6ffc0	Skip compute_ctl authorization checks in testing builds (#11186 ) We will require authorization in production. We need to skip in testing builds for now because regression tests would fail. See https://github.com/neondatabase/neon/issues/11316 for more information. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-04-03 00:00:28 +00:00
Erik Grinaker	17193d6a33	test_runner: fix pagebench tenant configs (#11420 ) ## Problem Pagebench creates a bunch of tenants by first creating a template tenant and copying its remote storage, then attaching the copies to the Pageserver. These tenants had custom configurations to disable GC and compaction. However, these configs were only picked up by the Pageserver on attach, and not registered with the storage controller. This caused the storage controller to replace the tenant configs with the default tenant config, re-enabling GC and compaction which interferes with benchmark performance. Resolves #11381. ## Summary of changes Register the copied tenants with the storage controller, instead of directly attaching them to the Pageserver.	2025-04-02 20:11:39 +00:00
Erik Grinaker	03ae57236f	docs: add compaction notes (#11415 ) Lifted from https://www.notion.so/neondatabase/Rough-Notes-on-Compaction-1baf189e004780859e65ef63b85cfa81?pvs=4.	2025-04-02 19:55:08 +00:00
Arpad Müller	e3d27b2f68	Start safekeeper node IDs with 1 and forbid 0 from registering (#11419 ) Right now we start safekeeper node ids at 0. However, other code treats 0 as invalid (see #11407). We decided on the latter. Therefore, make the register python tests register safekeepers starting at node id 1 instead of 0, and forbid safekeepers with id 0 from registering. Context: https://github.com/neondatabase/neon/pull/11407#discussion_r2024852328	2025-04-02 18:36:50 +00:00
Alex Chi Z.	dd1299f337	feat(storcon): passthrough mark invisible and add tests (#11401 ) ## Problem close https://github.com/neondatabase/neon/issues/11279 ## Summary of changes * Allow passthrough of other methods in tenant timeline shard0 passthrough of storcon. * Passthrough mark invisible API in storcon. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-02 17:11:49 +00:00
John Spray	cb19e4e05d	pageserver: remove legacy TimelineInfo::latest_gc_cutoff field (2/2) (#11136 ) ## Problem This field was retained for backward compat only in #10707. Once https://github.com/neondatabase/cloud/pull/25233 is released, nothing will be reading this field. Related: https://github.com/neondatabase/cloud/issues/24250 ## Summary of changes - Remove TimelineInfo::latest_gc_cutoff_lsn	2025-04-02 15:21:58 +00:00
Erik Grinaker	9df230c837	storcon: improve autosplit defaults (#11332 ) ## Problem In #11122, we changed the autosplit behavior to allow repeated and initial splits. The defaults were set such that they retain the current production settings (8 shards at 64 GB). However, these defaults don't really make sense by themselves. Once we deploy new settings to production, we should change the defaults to something more reasonable. ## Summary of changes Changes the following default settings: * `max_split_shards`: 8 → 16 * `initial_split_threshold`: 64 GB → disabled * `initial_split_shards`: 8 → 2	2025-04-02 15:11:52 +00:00
JC Grünhage	3c2bc5baba	fix(ci): run checks on release PRs (#11375 ) ## Problem Hotfix releases mean that sometimes changes in release PRs haven't been tested and linted yet. Disabling tests and lints is therefore not necessarily safe. In the future we will check whether tests have run on the same git tree already to speed things up, but for now we need to turn tests back on fully. This partially reverts: https://github.com/neondatabase/neon/pull/11272 ## Summary of changes Run checks on `.*-rc-pr` runs.	2025-04-02 14:32:53 +00:00
Alexey Kondratov	6667810800	chore(compute_ctl): Minor code and comment fixes (#11411 ) ## Problem In #11376 I mistakenly reworded one comment and also forgot to commit one of the suggestions. ## Summary of changes Fix it here.	2025-04-02 14:20:52 +00:00
Erik Grinaker	47f5bcf2bc	pageserver: don't periodically flush layers for stale attachments (#11317 ) ## Problem Tenants in attachment state `Stale` can't upload layers, and don't run compaction, but still do periodic L0 layer flushes in the tenant housekeeping loop. If the tenant remains stuck in stale mode, this causes a large buildup of L0 layers, causing logging, metrics increases, and possibly alerts. Resolves #11245. ## Summary of changes Don't perform periodic layer flushes in stale attachment state.	2025-04-02 12:55:15 +00:00
a-masterov	c179d098ef	Increase the timeout for extensions upgrade tests (on schedule). (#11406 ) ## Problem Sometimes the forced extension upgrade test fails (on schedule) due to a timeout. ## Summary of changes The timeout is increased to 60 mins.	2025-04-02 11:18:37 +00:00
Peter Bendel	4bc6dbdd5f	use a prod-like shared_buffers size for some perf unit tests (#11373 ) ## Problem In Neon DBaaS we adjust shared_buffers to the size of the compute; or, better described, we adjust the max number of connections to the compute size and adjust the shared_buffers size to the number of max connections, according to approximately the following sizes: `2 CU: 225mb; 4 CU: 450mb; 8 CU: 900mb` [see](`877e33b428/goapp/controlplane/internal/pkg/compute/computespec/pg_settings.go (L405)`) ## Summary of changes We should run perf unit tests with settings that are realistic for a paying customer and select 8 CU as the reference for those tests.	2025-04-02 10:43:05 +00:00
John Spray	7dc8370848	storcon: do graceful migrations from chaos injector (#11028 ) ## Problem Followup to https://github.com/neondatabase/neon/pull/10913 Existing chaos injection just does simple cutovers to secondary locations. Let's also exercise code for doing graceful migrations. This should implicitly test how such migrations cope with overlapping with service restarts. ## Summary of changes	2025-04-02 10:08:05 +00:00
Alex Chi Z.	c4fc602115	feat(pageserver): support synthetic size calculation for invisible branches (#11335 ) ## Problem ref https://github.com/neondatabase/neon/issues/11279 Imagine we have a branch with 3 snapshots A, B, and C: ``` base---+---+---+---main \-A \-B \-C base=100G, base-A=1G, A-B=1G, B-C=1G, C-main=1G ``` At this point, the synthetic size should be 100+1+1+1+1=104G. After main is deleted, the structure looks like: ``` base---+---+---+ \-A \-B \-C ``` If we simply assume main never existed, the size will be calculated as size(A) + size(B) + size(C)=300GB, which obviously is not what the user would expect. The correct way to do this is to assume part of main still exists, that is to say, set C-main=0: ``` base---+---+---+main \-A \-B \-C ``` And we will get the correct synthetic size of 100G+1+1+1=103G. ## Summary of changes * Do not generate gc cutoff point for invisible branches. * Use the same LSN as the last branchpoint for branch end. * Remove test_api_handler for mark_invisible. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-01 18:50:58 +00:00
Konstantin Knizhnik	02936b82c5	Fix effective_lsn calculation for prefetch (#11219 ) ## Problem See https://neondb.slack.com/archives/C04DGM6SMTM/p1741594233757489 Consider the following scenario: 1. Backend A wants to prefetch some block B 2. Backend A checks that block B is not present in shared buffers 3. Backend A registers a new prefetch request and calls prefetch_do_request 4. prefetch_do_request calls neon_get_request_lsns 5. neon_get_request_lsns obtains the LwLSN for block B 6. Backend B downloads B, updates and wal-logs it (let's say at Lsn1) 7. Block B is once again evicted from shared buffers, and its LwLSN is set to Lsn1 8. Backend A obtains the current flush LSN, let's say that it is Lsn1 9. Backend A stores Lsn1 as effective_lsn in the prefetch slot. 10. Backend A reads page B with LwLSN=Lsn1 11. Backend A finds in the prefetch ring a response for the prefetch request for block B with effective_lsn=Lsn1, so it satisfies the neon_prefetch_response_usable condition 12. Backend A uses a stale version of the page! ## Summary of changes Use `not_modified_since` as `effective_lsn`. This should not degrade performance, because we store the LwLSN when it was not found in the LwLSN hash, so if the page is not changed before the prefetch response arrives, the LwLSN should not change either. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-04-01 15:48:02 +00:00
Arpad Müller	1fad1abb24	storcon: timeline deletion improvements and fixes (#11334 ) This PR contains a bunch of smaller followups and fixes of the original PR #11058. Most of these implement suggestions from Arseny: * remove `Queryable, Selectable` from `TimelinePersistence`: they are not needed. * no `Arc` around `CancellationToken`: it itself is an arc wrapper * only schedule deletes instead of scheduling excludes and deletes * persist and delete deletion ops * delete rows in timelines table upon tenant and timeline deletion * set `deleted_at` for timelines we are deleting before we start any reconciles: this flag will help us later to recognize half-executed deletions, or when we crashed before we could remove the timeline row but after we removed the last pending op (handling these situations is left for later). Part of #9011	2025-04-01 15:16:33 +00:00
Arpad Müller	016068b966	API spec: add safekeepers field returned by storcon (#11385 ) Add an optional `safekeepers` field to `TimelineInfo` which is returned by the storcon upon timeline creation if the `--timelines-onto-safekeepers` flag is enabled. It contains the list of safekeepers chosen. Other contexts where we return `TimelineInfo` do not contain the `safekeepers` field. Sadly, I couldn't make this more type safe like done in Rust via `TimelineCreateResponseStorcon`, as there is no way of flattening or inheritance (and I don't think that duplicating the entire type for some minor type safety improvements is worth it). The storcon side has been done in #11058. Part of https://github.com/neondatabase/cloud/issues/16176 cc https://github.com/neondatabase/cloud/issues/16796	2025-04-01 12:39:10 +00:00
Erik Grinaker	80596feeaa	pageserver: invert `CompactFlags::NoYield` as `YieldForL0` (#11382 ) ## Problem `CompactFlags::NoYield` was a bit inconvenient, since every caller except for the background compaction loop should generally set it (e.g. HTTP API calls, tests, etc). It was also inconsistent with `CompactionOutcome::YieldForL0`. ## Summary of changes Invert `CompactFlags::NoYield` as `CompactFlags::YieldForL0`. There should be no behavioral changes.	2025-04-01 11:43:58 +00:00
Erik Grinaker	225cabd84d	pageserver: update upload queue TODOs (#11377 ) Update some upload queue TODOs, particularly to track https://github.com/neondatabase/neon/issues/10283, which I won't get around to.	2025-04-01 11:38:12 +00:00
Alexey Kondratov	557127550c	feat(compute): Add compute_ctl_up metric (#11376 ) ## Problem For computes running inside NeonVM, the actual compute image tag is buried inside the NeonVM spec, and we cannot get it as part of standard k8s container metrics (it's always an image and a tag of the NeonVM runner container). The workaround we currently use is to extract the running computes info from the control plane database with SQL. It has several drawbacks: i) it's complicated, separate DB per region; ii) it's slow; iii) it's still an indirect source of info, i.e. k8s state could be different from what the control plane expects. ## Summary of changes Add a new `compute_ctl_up` gauge metric with `build_tag` and `status` labels. It will help us to both overview what are the tags/versions of all running computes; and to break them down by current status (`empty`, `running`, `failed`, etc.) Later, we could introduce low cardinality (no endpoint or compute ids) streaming aggregates for such metrics, so they will be blazingly fast and usable for monitoring the fleet-wide state.	2025-04-01 08:51:17 +00:00
github-actions[bot]	1470af0b42	Proxy release 2025-04-01	2025-04-01 06:01:27 +00:00
Konstantin Knizhnik	cfe3e6d4e1	Remove loop from pageserver_try_receive (#11387 ) ## Problem Commit `3da70abfa5` caused a noticeable performance regression (40% in update-with-prefetch in test_bulk_update): https://neondb.slack.com/archives/C04BLQ4LW7K/p1742633167580879 ## Summary of changes Remove the loop from pageserver_try_receive so that it fetches no more than one response. There is still a loop in `pump_prefetch_state` which can fetch as many responses as are available. Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-03-31 19:49:32 +00:00
Alex Chi Z.	47d47000df	fix(pageserver): passthrough lsn lease in storcon API (#11386 ) ## Problem part of https://github.com/neondatabase/cloud/issues/23667 ## Summary of changes lsn_lease API can only be used on pageservers. This patch enables storcon passthrough. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-03-31 19:16:42 +00:00
Matthias van de Meent	e5b95bc9dc	Neon LFC/prefetch: Improve page read handling (#11380 ) Previously we had different meanings for the bitmask of vector IOps. That has now been unified to "bit set = final result, no more scribbling". Furthermore, the LFC read path scribbled on pages that were already read; that's probably not a good thing so that's been fixed too. In passing, the read path of LFC has been updated to read only the requested pages into the provided buffers, thus reducing the IO size of vectorized IOs. ## Problem ## Summary of changes	2025-03-31 17:04:00 +00:00
Alex Chi Z.	0ee5bfa2fc	fix(pageserver): allow sibling archived branch for detaching (#11383 ) ## Problem close https://github.com/neondatabase/neon/issues/11379 ## Summary of changes Remove checks around archived branches for detach v2. I also updated the comments `ancestor_retain_lsn`. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-03-31 16:32:55 +00:00
Fedor Dikarev	00bcafe82e	chore(ci): upgrade stats action with docker images from ghcr.io (#11378 ) ## Problem The current version of the GitHub Workflow Stats action pulls docker images from DockerHub, which could be an issue with the new pull limits on the DockerHub side. ## Summary of changes Switch to version `v0.2.2`, with docker images hosted on `ghcr.io`	2025-03-31 14:21:07 +00:00
Conrad Ludgate	ed117af73e	chore(proxy/tokio-postgres): remove phf from sqlstate and switch to tracing (#11249 ) In sqlstate, we have a manual `phf` construction, which is not explicitly guaranteed to be stable - you're intended to use a build.rs or the macro to make sure it's constructed correctly each time. This was inherited from tokio-postgres upstream, which has the same issue (https://github.com/rust-phf/rust-phf/pull/321#issuecomment-2724521193). We don't need this encoding of sqlstate, so I've switched it to simply parse 5 bytes (https://www.postgresql.org/docs/current/errcodes-appendix.html). While here, I switched out log for tracing.	2025-03-31 12:35:51 +00:00
Konstantin Knizhnik	21a891a06d	Fix IS_LOCAL_REL macro (first class has oid=FirstNormalObjectId) (#11369 ) ## Problem The macro IS_LOCAL_REL, used for DEBUG_COMPARE_LOCAL mode, uses a greater-than rather than a greater-or-equal comparison, while the first table is actually assigned FirstNormalObjectId. ## Summary of changes Replace the strict greater-than comparison with greater-or-equal. Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-03-31 11:16:35 +00:00
Alexander Bayandin	30a7dd630c	ruff: enable TC — flake8-type-checking (#11368 ) ## Problem `TYPE_CHECKING` is used inconsistently across Python tests. ## Summary of changes - Update `ruff`: 0.7.0 -> 0.11.2 - Enable TC (flake8-type-checking): https://docs.astral.sh/ruff/rules/#flake8-type-checking-tc - (auto)fix all new issues	2025-03-30 18:58:33 +00:00
Erik Grinaker	db5384e1b0	pageserver: remove L0 flush upload wait (#11196 ) ## Problem Previously, L0 flushes would wait for uploads, as a simple form of backpressure. However, this prevented flush pipelining and upload parallelism. It has since been disabled by default and replaced by L0 compaction backpressure. Touches https://github.com/neondatabase/cloud/issues/24664. ## Summary of changes This patch removes L0 flush upload waits, along with the `l0_flush_wait_upload`. This can't be merged until the setting has been removed across the fleet.	2025-03-30 13:14:04 +00:00
JC Grünhage	5cb6a4bc8b	fix(ci): use the right sha in release PRs (#11365 ) ## Problem `github.sha` contains a merge commit of `head` and `base` if we're in a PR. In release PRs, this makes no sense, because we fast-forward the `base` branch to contain the changes from `head`. Even though we correctly use `${{ github.event.pull_request.head.sha || github.sha }}` to reference the git commit when building artifacts, we don't use that when checking out code, because we want to test the merge of head and base usually. In the case of release PRs, we definitely always want to test on the head sha though, because we're going to forward that, and it already has the base sha as a parent, so the merge would end up with the same tree anyway. As a side effect, not checking out `${{ github.event.pull_request.head.sha || github.sha }}` also caused https://github.com/neondatabase/neon/actions/runs/13986389780/job/39173256184#step:6:49 to say `release-tag=release-compute-8187`, while https://github.com/neondatabase/neon/actions/runs/14084613121/job/39445314780#step:6:48 is talking about `build-tag=release-compute-8186` ## Summary of changes Run a few things on `github.event.pull_request.head.sha`, if we're in a release PR.	2025-03-28 11:56:24 +00:00
Folke Behrens	1dbf40ee2c	proxy: Update redis crate (#11372 )	2025-03-28 11:43:52 +00:00
Fedor Dikarev	939354abea	chore(ci): pin python base images to sha (#11367 ) Similar to how we pin base `debian` images, also pin `python` base images, so we better cache them and have reproducible builds.	2025-03-27 17:42:28 +00:00
Fedor Dikarev	1d5d168626	impr(ci): use hetzner buckets for cache (#11364 ) ## Problem Occasionally getting data from GH cache could be slow, with less than 10MB/s and taking 5+ minutes to download cache: ``` Received 20971520 of 2987085791 (0.7%), 9.9 MBs/sec Received 50331648 of 2987085791 (1.7%), 15.9 MBs/sec ... Received 1065353216 of 2987085791 (35.7%), 4.8 MBs/sec Received 1065353216 of 2987085791 (35.7%), 4.7 MBs/sec ... ``` https://github.com/neondatabase/neon/actions/runs/13956437454/job/39068664599#step:7:17 As a result, fetching the cache can take even longer than the build itself. ## Summary of changes Switch to caches that are closer to the runners; they provide a stable throughput of about 70-80MB/s	2025-03-27 11:11:45 +00:00
Folke Behrens	b40dd54732	compute-node: Add some debugging tools to image (#11352 ) ## Problem Some useful debugging tools are missing from the compute image and sometimes it's impossible to install them because memory is tightly packed. ## Summary of changes Add the following tools: iproute2, lsof, screen, tcpdump. The other changes come from sorting the packages alphabetically. ```bash $ docker image inspect ghcr.io/neondatabase/vm-compute-node-v16:7555 | jaq '.[0].Size' 1389759645 $ docker image inspect ghcr.io/neondatabase/vm-compute-node-v16:14083125313 | jaq '.[0].Size' 1396051101 $ echo $((1396051101 - 1389759645)) 6291456 ```	2025-03-27 11:09:27 +00:00
Folke Behrens	4bb7087d4d	proxy: Fix some clippy warnings coming in next versions (#11359 )	2025-03-26 10:50:16 +00:00
Arpad Müller	5f3551e405	Add "still waiting for task" for slow shutdowns (#11351 ) To help with narrowing down https://github.com/neondatabase/cloud/issues/26362, we make the case where we are waiting for the shutdown of a specific task more noisy (in the case of that issue, the `gc_loop`).	2025-03-24 17:29:44 +00:00
Anastasia Lubennikova	3e5884ff01	Revert "feat(compute_ctl): allow to change audit_log_level for existi… (#11343 ) …ng (#11308)" This reverts commit `e5aef3747c`. The logic of this commit was incorrect: enabling audit requires a restart of the compute, because audit extensions use shared_preload_libraries. So it cannot be done in the configuration phase and requires an endpoint restart instead.	2025-03-21 18:09:34 +00:00
Vlad Lazar	9fc7c22cc9	storcon: add use_local_compute_notifications flag (#11333 ) ## Problem While working on bulk import, I want to use the `control-plane-url` flag for a different request. Currently, the local compute hook is used whenever no control plane is specified in the config. My test requires local compute notifications and a configured `control-plane-url` which isn't supported. ## Summary of changes Add a `use-local-compute-notifications` flag. When this is set, we use the local flow regardless of other config values. It's enabled by default in neon_local and disabled by default in all other envs. I had to turn the flag off in tests that wish to bypass the local flow, but that's expected. --------- Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2025-03-21 15:31:06 +00:00
Folke Behrens	23ad228310	pgxn: Increase the pageserver response timeout a bit (#11339 ) Increase the PS response timeout slightly but noticeably, so it does not coincide with the default TCP_RTO_MAX.	2025-03-21 14:21:53 +00:00
Dmitrii Kovalkov	aeb53fea94	storage: support multiple SSL CA certificates (#11341 ) ## Problem - We need to support multiple SSL CA certificates for graceful root CA certificate rotation. - Closes: https://github.com/neondatabase/cloud/issues/25971 ## Summary of changes - Parses `ssl_ca_file` as a pem bundle, which may contain multiple certificates. Single pem cert is a valid pem bundle, so the change is backward compatible.	2025-03-21 13:43:38 +00:00
Dmitrii Kovalkov	0f367cb665	storcon: reuse reqwest http client (#11327 ) ## Problem - Part of https://github.com/neondatabase/neon/issues/11113 - Building a new `reqwest::Client` for every request is expensive because it parses CA certs under the hood. It's noticeable in storcon's flamegraph. ## Summary of changes - Reuse one `reqwest::Client` for all API calls to avoid parsing CA certificates every time.	2025-03-21 11:48:22 +00:00
John Spray	76088c16d2	storcon: reproduce shard split issue (#11290 ) ## Problem Issue https://github.com/neondatabase/neon/issues/11254 describes a case where restart during a shard split can result in a bad end state in the database. ## Summary of changes - Add a reproducer for the issue - Tighten an existing safety check around updated row counts in complete_shard_split	2025-03-21 08:48:56 +00:00
John Spray	0d99609870	docs: storage controller retro-RFC (#11218 ) ## Problem Various aspects of the controller's job are already described in RFCs, but the overall service didn't have an RFC that records design tradeoffs and the top level structure. ## Summary of changes - Add a retrospective RFC that should be useful for anyone understanding storage controller functionality	2025-03-21 08:32:11 +00:00
Alex Chi Z.	bae9b9acdc	feat(pageserver): persist timeline invisible flag (#11331 ) ## Problem part of https://github.com/neondatabase/neon/issues/11279 ## Summary of changes The invisible flag is used to exclude a timeline from synthetic size calculation. For the first step, let's persist this flag. Most of the code are following the `is_archived` modification flow. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-03-20 18:39:08 +00:00
Nikita Kalyanov	53f54ba37a	chore: expose detach_v2 (#11325 ) We need this exposed in the spec to use it in cplane. Extracted from https://github.com/neondatabase/cloud/pull/26167	2025-03-20 18:04:17 +00:00
Dmitrii Kovalkov	28fc051dcc	storage: live ssl certificate reload (#11309 ) ## Problem SSL certs are loaded only during start up. It doesn't allow the rotation of short-lived certificates without server restart. - Closes: https://github.com/neondatabase/cloud/issues/25525 ## Summary of changes - Implement `ReloadingCertificateResolver` which reloads certificates from disk periodically.	2025-03-20 16:26:27 +00:00
Folke Behrens	d0102a473a	pgxn: Include local port in no-response log messages (#11321 ) ## Problem Now that stuck connections are quickly terminated it's not easy to quickly find the right port from the pid to correlate the connection with the one seen on pageserver side. ## Summary of changes Call getsockname() and include the local port number in the no-response-from-pageserver log messages.	2025-03-20 16:06:00 +00:00
Alex Chi Z.	78502798ae	feat(compute_ctl): pass compute type to pageserver with pg_options (#11287 ) ## Problem second try of https://github.com/neondatabase/neon/pull/11185, part of https://github.com/neondatabase/cloud/issues/24706 ## Summary of changes Tristan reminded me of the `options` field of the pg wire protocol, which can be used to pass configurations. This patch adds the parsing on the pageserver side, and supplies `neon.endpoint_type` as part of the `options`. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-03-20 15:48:40 +00:00
Erik Grinaker	65d690b21d	storcon: add repeated auto-splits and initial splits (#11122 ) ## Problem Currently, we only split tenants into 8 shards once, at the 64 GB split threshold. For very large tenants, we need to keep splitting to avoid huge shards. And we also want to eagerly split at a lower threshold to improve throughput during initial ingestion. See https://github.com/neondatabase/cloud/issues/22532#issuecomment-2706215907 for details. Touches https://github.com/neondatabase/cloud/issues/22532. Requires #11157. ## Summary of changes This adds parameters and logic to enable repeated splits when a tenant's largest timeline divided by shard count exceeds `split_threshold`, as well as eager initial splits at a lower threshold to speed up initial ingestion. The default parameters are all set such that they retain the current behavior in production (only split into 8 shards once, at 64 GB). * `split_threshold` now specifies a maximum shard size. When a shard exceeds it, all tenant shards are split by powers of 2 such that all tenant shards fall below `split_threshold`. Disabled by default, like today. * Add `max_split_shards` to specify a max shard count for autosplits. Defaults to 8 to retain current behavior. * Add `initial_split_threshold` and `initial_split_shards` to specify a threshold and target count for eager splits of unsharded tenants. Defaults to 64 GB and 8 shards to retain current production behavior. Because this PR sets `initial_split_threshold` to 64 GB by default, it has the effect of enabling autosplits by default. This was not the case previously, since `split_threshold` defaults to None, but it is already enabled across production and staging. This is temporary until we complete the production rollout. For more details, see code comments. This must wait until #11157 has been deployed to Pageservers. 
Once this has been deployed to production, we plan to change the parameters to: * `split-threshold`: 256 GB * `initial-split-threshold`: 16 GB * `initial-split-shards`: 4 * `max-split-shards`: 16 The final split points will thus be: * Start: 1 shard * 16 GB: 4 shards * 1 TB: 8 shards * 2 TB: 16 shards We will then change the default settings to be disabled by default. --------- Co-authored-by: John Spray <john@neon.tech>	2025-03-20 15:43:57 +00:00
Arpad Müller	5dd60933d3	Mark min_readable_lsn as required in the pageserver API spec (#11324 ) Fully applies the changes of https://github.com/neondatabase/cloud/pull/25233 to neon.git. The field is always present in the Rust struct definition, so it can be marked as required. cc #10707	2025-03-20 15:25:09 +00:00
Gleb Novikov	2065074559	fast_import: put job status to s3 (#11284 ) ## Problem `fast_import` binary is being run inside neonvms, and they do not support proper `kubectl describe logs` now, there are a bunch of other caveats as well: https://github.com/neondatabase/autoscaling/issues/1320 Anyway, we needed a signal if job finished successfully or not, and if not — at least some error message for the cplane operation. And after [a short discussion](https://neondb.slack.com/archives/C07PG8J1L0P/p1741954251813609), that s3 object is the most convenient at the moment. ## Summary of changes If `s3_prefix` was provided to `fast_import` call, any job run puts a status object file into `{s3_prefix}/status/fast_import` with contents `{"done": true}` or `{"done": false, "error": "..."}`. Added a test as well	2025-03-20 15:23:35 +00:00
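As an illustration of the status object described above, the sketch below builds the object key and the two JSON bodies. The function names, the trimming of a trailing `/` from the prefix, and the hand-rolled escaping are assumptions made to keep the example free of dependencies; the actual `fast_import` code may well build the JSON differently.

```rust
/// Builds the object key `{s3_prefix}/status/fast_import`.
/// Trimming a trailing slash is an assumption of this sketch.
pub fn status_key(s3_prefix: &str) -> String {
    format!("{}/status/fast_import", s3_prefix.trim_end_matches('/'))
}

/// Minimal JSON string escaping: quotes, backslashes and control characters.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Renders `{"done": true}` on success, or `{"done": false, "error": "..."}`
/// carrying the error message on failure.
pub fn status_body(outcome: &Result<(), String>) -> String {
    match outcome {
        Ok(()) => r#"{"done": true}"#.to_string(),
        Err(e) => format!(r#"{{"done": false, "error": "{}"}}"#, json_escape(e)),
    }
}
```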
Konstantin Knizhnik	3da70abfa5	Fix pageserver_try_receive (#11096 ) ## Problem See https://neondb.slack.com/archives/C04DGM6SMTM/p1741176713523469 The problem is that this function uses `PQgetCopyData(shard->conn, &resp_buff.data, 1 /* async = true */)` to try to fetch the next message. But this function returns 0 if the whole message is not yet present in the buffer, and the input buffer may contain only part of a message, so the result is not fetched. ## Summary of changes Use `PQisBusy` + `WaitEventSetWait` to check whether data is available, and `PQgetCopyData(shard->conn, &resp_buff.data, 0)` to read the whole message in that case. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-03-20 15:21:00 +00:00
Anastasia Lubennikova	e5aef3747c	feat(compute_ctl): allow to change audit_log_level for existing (#11308 ) projects. Preserve the information about the current audit log level in compute state, so that we don't relaunch rsyslog on every spec change https://github.com/neondatabase/cloud/issues/25349 --------- Co-authored-by: Tristan Partin <tristan@neon.tech>	2025-03-20 11:23:20 +00:00
Arpad Müller	91dad2514f	storcon: also support tenant deletion for safekeepers (#11289 ) If a tenant gets deleted, delete also all of its timelines. We assume that by the time a tenant is being deleted, no new timelines are being created, so we don't need to worry about races with creation in this situation. Unlike #11233, which was very simple because it listed the timelines and invoked timeline deletion, this PR obtains a list of safekeepers to invoke the tenant deletion on, and then invokes tenant deletion on each safekeeper that has one or multiple timelines. Alternative to #11233 Builds on #11288 Part of #9011	2025-03-20 10:52:21 +00:00
Dmitrii Kovalkov	9bf59989db	storcon: add https API (#11239 ) ## Problem Pageservers use unencrypted HTTP requests for storage controller API. - Closes: https://github.com/neondatabase/cloud/issues/25524 ## Summary of changes - Replace hyper0::server::Server with http_utils::server::Server in storage controller. - Add HTTPS handler for storage controller API. - Support `ssl_ca_file` in pageserver.	2025-03-20 08:22:02 +00:00
Tristan Partin	c6f5a58d3b	Remove potential for SQL injection (#11260 ) Timeline IDs do not contain characters that may cause a SQL injection, but best to always play it safe. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-03-19 19:19:38 +00:00
Suhas Thalanki	5589efb6de	moving LastWrittenLSNCache to Neon Extension (#11031 ) ## Problem We currently have this code duplicated across different PG versions. Moving this to an extension would reduce duplication and simplify maintenance. ## Summary of changes Moving the LastWrittenLSN code from PG versions to the Neon extension and linking it with hooks. Related Postgres PR: https://github.com/neondatabase/postgres/pull/590 Closes: https://github.com/neondatabase/neon/issues/10973 --------- Co-authored-by: Tristan Partin <tristan@neon.tech>	2025-03-19 17:29:40 +00:00
Folke Behrens	019a29748d	proxy: Move release PR creation to Tuesday (#11306 ) Move the creation of the proxy release PR to Tuesday mornings.	2025-03-19 16:45:29 +00:00
StepSecurity Bot	88ea855cff	fix(ci): Fixing StepSecurity Flagged Issues (#11311 ) This pull request is created by [StepSecurity](https://app.stepsecurity.io/securerepo) at the request of @areyou1or0. ## Summary This pull request is created by [StepSecurity](https://app.stepsecurity.io/securerepo) at the request of @areyou1or0. Please merge the Pull Request to incorporate the requested changes. Please tag @areyou1or0 on your message if you have any questions related to the PR. ## Security Fixes ### Least Privileged GitHub Actions Token Permissions The GITHUB_TOKEN is an automatically generated secret to make authenticated calls to the GitHub API. GitHub recommends setting minimum token permissions for the GITHUB_TOKEN. - [GitHub Security Guide](https://docs.github.com/en/actions/security-guides/automatic-token-authentication#using-the-github_token-in-a-workflow) - [The Open Source Security Foundation (OpenSSF) Security Guide](https://github.com/ossf/scorecard/blob/main/docs/checks.md#token-permissions) ### Pinned Dependencies GitHub Action tags and Docker tags are mutable. This poses a security risk. GitHub's Security Hardening guide recommends pinning actions to full length commit. - [GitHub Security Guide](https://docs.github.com/en/actions/security-guides/security-hardening-for-github-actions#using-third-party-actions) - [The Open Source Security Foundation (OpenSSF) Security Guide](https://github.com/ossf/scorecard/blob/main/docs/checks.md#pinned-dependencies) ### Harden Runner [Harden-Runner](https://github.com/step-security/harden-runner) is an open-source security agent for the GitHub-hosted runner to prevent software supply chain attacks. 
It prevents exfiltration of credentials, detects tampering of source code during build, and enables running jobs without `sudo` access. See how popular open-source projects use Harden-Runner [here](https://docs.stepsecurity.io/whos-using-harden-runner). <details> <summary>Harden runner usage</summary> You can find link to view insights and policy recommendation in the build log <img src="https://github.com/step-security/harden-runner/blob/main/images/buildlog1.png?raw=true" width="60%" height="60%"> Please refer to [documentation](https://docs.stepsecurity.io/harden-runner) to find more details. </details> will fix https://github.com/neondatabase/cloud/issues/26141	2025-03-19 16:44:22 +00:00
Vlad Lazar	9ce3704ab5	pageserver: rename cplane api to storage controller api (#11310 ) ## Problem The pageserver upcall api was designed to work with control plane or the storage controller. We have completed the transition period and now the upcall api only targets the storage controller. ## Summary of changes Rename types accordingly and tweak some comments.	2025-03-19 16:29:52 +00:00
Alexey Kondratov	518269ea6a	feat(compute): Add perf test for compute startup time breakdown (#11198 ) ## Problem We had a recent Postgres startup latency (`start_postgres_ms`) degradation, but it was only caught with SLO alerts. There was actually an existing test for the same purpose -- `start_postgres_ms`, but it's doing only two starts, so it's a bit noisy. ## Summary of changes Add new compute startup latency test that does 100 iterations and reports p50, p90 and p99 latencies. Part of https://github.com/neondatabase/cloud/issues/24882	2025-03-19 16:11:33 +00:00
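For reference, p50, p90 and p99 over a fixed set of startup samples can be computed with the nearest-rank method, as in the sketch below. The function name and the use of nearest-rank are assumptions; the test added in the commit above may aggregate its 100 iterations differently.

```rust
/// Nearest-rank percentile over latency samples in milliseconds.
/// Returns `None` for an empty slice or a percentile outside 1..=100.
pub fn percentile_ms(samples: &[u64], p: u32) -> Option<u64> {
    if samples.is_empty() || p == 0 || p > 100 {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    // rank = ceil(p * n / 100), computed in integers to avoid float rounding.
    let rank = (p as usize * sorted.len() + 99) / 100;
    Some(sorted[rank - 1])
}
```

With 100 iterations, p99 under this definition is the second-largest sample, so a single outlier does not move it, which is one reason many iterations give a steadier signal than two starts.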
a-masterov	cf9d817a21	Add tests for some extensions currently not covered by the regression tests (#11191 ) ## Problem Some extensions do not contain tests, which can be easily run on top of docker-compose or staging. ## Summary of changes Added the pg_regress based tests for `pg_tiktoken`, `pgx_ulid`, `pg_rag`. Now they will be run on top of docker-compose, but I intend to adapt them to run on top of staging in the next PRs	2025-03-19 15:38:33 +00:00
John Spray	55cb07f680	pageserver: improve debuggability of timeline creation failures during chaos testing (#11300 ) ## Problem We're seeing timeline creation failures that look suspiciously like some race with the cleanup-deletion of initdb temporary directories. I couldn't spot the bug, but we can make it a bit easier to debug. Related: https://github.com/neondatabase/neon/issues/11296 ## Summary of changes - Avoid surfacing distracting ENOENT failure to delete as a log error -- this is fine, and can happen if timeline is cancelled while doing initdb, or if initdb itself has an error where it doesn't write the dir (this error is surfaced separately) - Log after purging initdb temp directories	2025-03-19 13:38:03 +00:00
Christian Schwarz	0f20dae3c3	impr: merge `pageserver_api::models::TenantConfig` and `pageserver::tenant::config::TenantConfOpt` (#11298 ) The only difference between - `pageserver_api::models::TenantConfig` and - `pageserver::tenant::config::TenantConfOpt` at this point is that `TenantConfOpt` serializes with `skip_serializing_if = Option::is_none`. That is an efficiency improvement for all the places that currently serde `models::TenantConfig` because new serializations will no longer write `$fieldname: null` for each field that is `None` at runtime. This should be particularly beneficial for Storcon, which stores JSON-serialized `models::TenantConfig` in its DB. # Behavior Changes This PR changes the serialization behavior: we omit `None` fields instead of serializing `$fieldname: null`. So it's a data format change (see section on compatibility below). And it changes API responses from Storcon and Pageserver. ## API Response Compatibility Storcon returns the location description. Afaik it is passed through into - storcon_cli output - storcon UI in console admin UI These outputs will no longer contain `$fieldname: null` values, which de-bloats the output (good). But in storcon UI, it also serves as an editor "default", which will be eliminated after a storcon with this PR is released. ## Data Format Compatibility Backwards compat: new software reading old serialized data will deserialize to the same runtime value because all the field types are exactly the same and `skip_serializing_if` does not affect deserialization. Forward compat: old software reading data serialized by new software will map absent fields in the serialized form to runtime value `Option::None`. 
This is serde default behavior, see this playground to convince yourself: https://play.rust-lang.org/?version=stable&mode=debug&edition=2024&gist=f7f4e1a169959a3085b6158c022a05eb The `serde(with="humantime_serde")` however behaves strangely: if used on an `Option<Duration>`, it still requires the field to be present, unlike the serde default behavior shown in the previous paragraph. The workaround is to set `serde(default)`. Previously it was set on each individual field, but, we do have the container attribute, so, set it there. This requires deriving a `Default` impl, which, because all fields are `Option`, is non-magic. See my notes here: https://gist.github.com/problame/eddbc225a5d12617e9f2c6413e0cf799 # Future Work We should have separate types (& crates) for - runtime types configuration (e.g. PageServerConf::tenant_config, AttachedLocationConf) - `config-v1` file pageserver local disk file format - `mgmt API` - `pageserver.toml` Right now they all use the same, which is convenient but makes it hard to reason about compatibility breakage. # Refs - corresponding docs.neon.build PR https://github.com/neondatabase/docs/pull/470	2025-03-19 12:47:17 +00:00
JC Grünhage	aedeb37220	fix(ci): put the BUILD_TAG of the upcoming release into RC PR artifacts (#11304 ) ## Problem #11061 changed how artifacts for releases are built, by reusing/retagging the artifacts from release PRs. This resulted in the BUILD_TAG that's baked into the images to not be as expected. Context: https://neondb.slack.com/archives/C08JBTT3R1Q/p1742333300129069 ## Summary of changes Set BUILD_TAG to the release tag of the upcoming release when running inside release PRs.	2025-03-19 09:34:28 +00:00
Anastasia Lubennikova	6af974548e	feat(compute_ctl): Add basic audit logging for computes. (#11170 ) if `audit_log_level` is set to Log, preload pgaudit extension and log DDL with masked parameters into standard postgresql log	2025-03-19 00:13:36 +00:00
Christian Schwarz	9fb77d6cdd	buffered writer: add cancellation sensitivity (#11052 ) In - https://github.com/neondatabase/neon/pull/10993#issuecomment-2690428336 I added infinite retries for buffered writer flush IOs, primarily to gracefully handle ENOSPC but more generally so that the buffered writer is not left in a state where reads from the surrounding InMemoryLayer cause panics. However, I didn't add cancellation sensitivity, which is concerning because then there is no way to detach a timeline/tenant that is encountering the write IO errors. That’s a legitimate scenario in the case of some edge case bug. See the #10993 description for details. This PR - first makes flush loop infallible, enabled by infinite retries - then adds sensitivity to `Timeline::cancel` to the flush loop, thereby making it fallible in one specific way again - finally fixes the InMemoryLayer/EphemeralFile/BufferedWriter amalgamate to remain read-available after flush loop is cancelled. The support for read-availability after cancellation is necessary so that reads from the InMemoryLayer that are already queued up behind the RwLock that wraps the BufferedWriter won't panic because of the `mutable=None` that we leave behind in case the flush loop gets cancelled. # Alternatives One might think that we can only ship the change for read-availability if flush encounters an error, without the infinite retrying and/or cancellation sensitivity complexity. The problem with that is that read-availability sounds good but is really quite useless, because we cannot ingest new WAL without a writable InMemoryLayer. Thus, very soon after we transition to read-only mode, reads from compute are going to wait anyway, but on `wait_lsn` instead of the RwLock, because ingest isn't progressing. Thus, having the infinite flush retries still makes more sense because they're just "slowness" to the user, whereas wait_lsn is hard errors.	2025-03-18 18:48:43 +00:00
JC Grünhage	99639c26b4	fix(ci): update build-tools image references (#11293 ) ## Problem https://github.com/neondatabase/neon/pull/11210 migrated pushing images to ghcr. Unfortunately, it was incomplete in using images from ghcr, which resulted in a few places referencing the ghcr build-tools image, while trying to use docker hub credentials. ## Summary of changes Use build-tools image from ghcr consistently.	2025-03-18 15:21:22 +00:00
Ivan Efremov	86fe26c676	fix(proxy): Fix testodrome HTTP header handling in proxy (#11292 ) Relates to #22486	2025-03-18 15:14:08 +00:00
JC Grünhage	eb6efda98b	impr(ci): move some kinds of tests to PR runs only (#11272 ) ## Problem The pipelines after release merges are slower than they need to be at the moment. This is because some kinds of tests/checks run on all kinds of pipelines, even though they only matter in some of those. ## Summary of changes Run `check-codestyle-{rust,python,jsonnet}`, `build-and-test-locally` and `trigger-e2e-tests` only on regular PRs, not release PR or pushes to main or release branches.	2025-03-18 13:49:34 +00:00
Conrad Ludgate	fd41ab9bb6	chore: remove x509-parser (#11247 ) Both crates seem well maintained. x509-cert is part of the high quality RustCrypto project that we already make heavy use of, and I think it makes sense to reduce the dependencies where possible.	2025-03-18 13:05:08 +00:00
JC Grünhage	2dfff6a2a3	impr(ci): use ghcr.io as the default container registry (#11210 ) ## Problem Docker Hub has new rate limits coming up, and to avoid problems coming with those we're switching to GHCR. ## Summary of changes - Push images to GHCR initially and distribute them from there - Use images from GHCR in docker-compose	2025-03-18 11:30:49 +00:00
Arpad Müller	2cf6ae76fc	storcon: move safekeeper related stuff out of service.rs (#11288 ) There is no functional change here. We move safekeeper related code from `service.rs` to `service/safekeeper_service.rs`, so that safekeeper related stuff is contained in a single file. This also helps with preventing `service.rs` from growing even further. Part of #9011.	2025-03-18 09:00:53 +00:00
Dmitrii Kovalkov	57d51e949d	tests: suppress excessive pageserver errors in test_timeline_ancestor_detach_errors (#11277 ) ## Problem The test is flaky because of the same reasons as described in https://github.com/neondatabase/neon/issues/11177. The test has already suppressed these `WARN` and `ERROR` log messages, but the regexp didn't match all possible errors. ## Summary of changes - Change regexp to suppress all possible allowed error log messages.	2025-03-18 07:10:11 +00:00
Arpad Müller	0d3d639ef3	storcon: remove timeouts for safekeeper heartbeating (#11232 ) PRs #10891 and #10902 have time-bounded the safekeeper heartbeating of the storage controller. Those timeouts were not meant to be permanent, but temporary until we figured out the reasons for the safekeeper heartbeating causing problems. Now they are better understood and resolved. A comment is [here](https://github.com/neondatabase/cloud/issues/24396#issuecomment-2679342929), but most importantly, we've had: * #10954 to send heartbeats concurrently (previously we sent them sequentially, so the total time was the number of nodes times the time for the timeout to be hit; now the total time is the maximum over all the nodes we are heartbeating) * work to actually make heartbeats work and not error, i.e. JWT rollout for storcon, not sending heartbeats to decommissioned safekeepers, removal of decommissioned safekeepers from the databases Part of https://github.com/neondatabase/cloud/issues/25473	2025-03-18 03:37:45 +00:00
Alex Chi Z.	05ca27c981	fix(pagectl/benches): scope context with debug tools (#11285 ) ## Problem `7c462b3417` requires all contexts have scopes. pagectl/benches don't have such scopes. close https://github.com/neondatabase/neon/issues/11280 ## Summary of changes Adding scopes for the tools. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-03-17 21:27:27 +00:00
Alex Chi Z.	bb64beffbb	fix(pageserver): log compaction errors with timeline ids (#11231 ) ## Problem Makes it easier to debug. ## Summary of changes Log compaction errors with timeline ids. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-03-17 19:42:02 +00:00
Konstantin Knizhnik	24f41bee5c	Update LFC in case of unlogged build (#11262 ) ## Problem Unlogged build is used for GIST/SPGIST/GIN/HNSW indexes. In this mode we first change relation class to `RELPERSISTENCE_UNLOGGED` and save them on local disk. But we do not save unlogged relations in LFC. This may cause an incorrect value to be fetched from LFC if the relfilenode is reused. ## Summary of changes Save modified pages in LFC on the second stage of unlogged build (when modified pages are WAL-logged). There is no need to save pages in LFC at the first phase because they will in any case be overwritten with the assigned LSN at the second phase. Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-03-17 19:06:42 +00:00
Suhas Thalanki	a05c99f487	fix: removed anon pg extension (#10936 ) ## Problem Removing the `anon` v1 extension in postgres as described in https://github.com/neondatabase/cloud/issues/22663. This extension is not built for postgres v17 and is out of date when compared to the upstream variant which is v2 (we have v1.4). ## Summary of changes Removed the `anon` v1 extension from being built or preloaded Related to https://github.com/neondatabase/cloud/issues/22663	2025-03-17 18:23:32 +00:00
JC Grünhage	486ffeef6d	fix(ci): don't have neon-test-extensions release tag push depend on compute-node-image build (#11281 ) ## Problem Failures like https://github.com/neondatabase/neon/actions/runs/13901493608/job/38896940612?pr=11272 are caused by the dependency on `compute-node-image`, which was wrong on release jobs anyway. ## Summary of changes Remove dependency on `compute-node-image` from the job `add-release-tag-to-neon-test-extension-image`.	2025-03-17 16:31:49 +00:00
Arpad Müller	56149a046a	Add test_explicit_timeline_creation_storcon and make it work (#11261 ) Adds a basic test that makes the storcon issue explicit creation of a timeline on safekeepers (main storcon PR in #11058). It was adapted from `test_explicit_timeline_creation` from #11002. Also, do a bunch of fixes needed to get the test to work (the API definitions weren't correct), and log more stuff when we can't create a new timeline due to no safekeepers being active. Part of #9011 --------- Co-authored-by: Arseny Sher <sher-ars@yandex.ru>	2025-03-17 16:28:21 +00:00
Roman Zaynetdinov	db30e1669c	Add /configure_telemetry API endpoint (#11117 ) Work on https://github.com/neondatabase/cloud/issues/23721 and https://github.com/neondatabase/cloud/issues/23714 Depends on https://github.com/neondatabase/neon/pull/11111 - Add `/configure_telemetry` API endpoint - Support second rsyslog configuration for Postgres logs export - Enable logs export when compute feature is enabled and configure Postgres to send logs to syslog I have used `/configure_telemetry` name because in the future I see it also being used for configuring a `pg_tracing` extension to export traces. Let me know if you'd rather have these APIs separate. In this case we can rename it to `/configure_rsyslog`.	2025-03-17 13:53:23 +00:00
JC Grünhage	fdf04d4d81	fix(ci): use correct branch ref for checking whether this is a release merge queue (#11270 ) ## Problem https://github.com/neondatabase/neon/actions/runs/13894288475/job/38871819190 shows the "Add fast-fordward label to PR to trigger fast-forward merge" job being skipped. This is due to not using the right variable for checking which branch the merge queue is merging into. ## Summary of changes Use the `branch` output of the `meta` task for checking the target branch of a merge group.	2025-03-17 09:26:45 +00:00
Alexander Bayandin	136cae76c2	fix(ci): correct regex to detect release-compute RC PRs (#11269 ) ## Problem The regex in `_meta.yml` workflow doesn't detect RC PRs for compute releases: https://neondb.slack.com/archives/C059ZC138NR/p1742164884669389 ## Summary of changes - Fix regex --------- Co-authored-by: Peter Bendel <peterbendel@neon.tech>	2025-03-17 07:25:12 +00:00
Konstantin Knizhnik	15e63afe7d	Support DEBUG_COMPARE_LOCAL mode for unlogged index build (#11257 ) ## Problem In unlogged index build (used for GIST/SPGIST/GIN indexes) files are created on disk and then removed at the end. This contradicts the logic of DEBUG_COMPARE_LOCAL mode. ## Summary of changes Do not create and unlink files in unlogged build in DEBUG_COMPARE_LOCAL mode. Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-03-17 06:07:24 +00:00
Alexey Kondratov	966abd3bd6	fix(compute_ctl): Dollar escaping helper fixes (#11263 ) ## Problem In the previous PR #11045, one edge-case wasn't covered, when an ident contains only one `$`, we were picking `$$` as a 'wrapper'. Yet, when this `$` is at the beginning or at the end of the ident, then we end up with `$$$` in a row which breaks the escaping. ## Summary of changes Start from `x` tag instead of a blank string. Slack: https://neondb.slack.com/archives/C08HV951W2W/p1742076675079769?thread_ts=1742004205.461159&cid=C08HV951W2W	2025-03-16 18:39:54 +00:00
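The edge case in the commit above can be illustrated with a small dollar-quoting helper. This is a sketch of the general technique, not the `compute_ctl` helper itself: the function name and the exact collision check are assumptions. It starts from the tag `x` rather than an empty tag, and grows the tag until the closing delimiter is the first place `$tag$` occurs after the opening delimiter.

```rust
/// Wraps `ident` in a Postgres dollar-quoted literal, e.g. `$x$...$x$`.
/// The tag starts at "x" (never empty) and grows until it cannot collide
/// with the contents.
pub fn dollar_quote(ident: &str) -> String {
    let mut tag = String::from("x");
    loop {
        let delim = format!("${tag}$");
        // `tail` is what follows the opening delimiter. The closing delimiter
        // must be the first occurrence of `$tag$` in it; this also rejects
        // idents that end in `$tag`, where the ident's own `$` would combine
        // with the closing delimiter to terminate the literal early.
        let tail = format!("{ident}{delim}");
        if tail.find(&delim) == Some(ident.len()) {
            return format!("{delim}{tail}");
        }
        tag.push('x');
    }
}
```

For an ident such as `$foo`, the sketch yields `$x$$foo$x$`, where the non-empty tag prevents the run of three `$` that breaks an empty-tag wrapper. The loop always terminates, because a tag longer than the ident cannot occur inside it.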
Alexey Kondratov	8566cad23b	chore(docs): Refresh RFC guide to suggest using YYYY-MM-DD prefix (#11252 ) ## Problem Serial/numeric IDs lead to collisions, which is not critical but looks awkward. Previous discussion: https://neondb.slack.com/archives/C033A2WE6BZ/p1741891345869979 ## Summary of changes Suggest using the `YYYY-MM-DD` prefix, which i) has less chance of collision; ii) provides out-of-the-box lexicographic sorting; iii) even if it collides, it's not a big deal -- just two RFCs have been started on the same day. --------- Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2025-03-16 17:17:58 +00:00
Peter Bendel	228bb75354	Extend large tenant OLTP workload ... (#11166 ) ... to better match the workload characteristics of real Neon customers ## Problem We analyzed workloads of large Neon users and want to extend the oltp workload to include characteristics seen in those workloads. ## Summary of changes - for re-use branch delete inserted rows from last run - adjust expected run-time (time-outs) in GitHub workflow - add queries that expose the prefetch getpages path - add I/U/D transactions for another table (so far the workload was insert/append-only) - add an explicit vacuum analyze step and measure its time - add reindex concurrently step and measure its time (and take care that this step succeeds even if prior reindex runs have failed or were canceled) - create a second connection string for the pooled connection that removes the `-pooler` suffix from the hostname because we want to run long-running statements (database maintenance) and bypass the pooler which doesn't support unlimited statement timeout ## Test run https://github.com/neondatabase/neon/actions/runs/13851772887/job/38760172415	2025-03-16 14:04:48 +00:00
Cihan Demirci	a5b00b87ba	CI(pre-merge-checks): use step-security/changed-files (#11265 ) Use Step Security maintained version of `tj-actions/changed-files`. https://www.stepsecurity.io/blog/harden-runner-detection-tj-actions-changed-files-action-is-compromised#use-the-stepsecurity-maintained-changed-files-action	2025-03-16 13:53:27 +00:00
John Spray	a674ed8caf	storcon: safety check when completing shard split (#11256 ) ## Problem There is a rare race between controller graceful deployment and shard splitting where we may incorrectly both abort _and_ complete the split (on different pods), and thereby leave no shards at all in the database. Related: #11254 ## Summary of changes - In complete_shard_split, refuse to delete anything if child shards are not found	2025-03-14 20:08:24 +00:00
Erik Grinaker	53d50c7ea5	pageserver: deflake compaction tests (#11246 ) These need to set `NoYield`, otherwise they may be preempted by pending L0 compaction.	2025-03-14 17:45:18 +00:00
Dmitrii Kovalkov	3168bd0e3a	tests: suppress "Cancelled request finished with an error" in test_timeline_archive (#11241 ) ## Problem Previous PR https://github.com/neondatabase/neon/pull/11190 didn't suppress `Cancelled request finished with an error` messages, which are also expected, so the test https://github.com/neondatabase/neon/issues/11177 is still flaky. ## Summary of changes - Suppress `Cancelled request finished with an error` in `test_timeline_archive`	2025-03-14 17:42:09 +00:00
Alexander Bayandin	4a97cd0b7e	test_runner: fix tests with jsonnet for Python 3.13 (#11240 ) ## Problem Python's `jsonnet` 0.20.0 doesn't support Python 3.13, so we have a couple of tests xfailed because of that. ## Summary of changes - Bump `jsonnet` to `0.21.0rc2` which supports Python 3.13 - Unxfail `test_sql_exporter_metrics_e2e` and `test_sql_exporter_metrics_smoke` on Python 3.13	2025-03-14 17:02:55 +00:00
Anastasia Lubennikova	b7c6738524	feat(compute_ctl): add pgaudit log gc to compute_ctl (#11169 ) - add pgaudt_gc thread to compute_ctl to clean up old pgaudit logs if they exist. pgaudit can rotate files, but it doesn't delete the old files - Add AUDIT_LOG_DIR_SIZE metric to compute_ctl to track the size of the audit log directory in bytes. - Fix permissions for rsyslog state files directory	2025-03-14 14:08:16 +00:00
Conrad Ludgate	7fe5a689b4	feat(proxy): export ingress metrics (#11244 ) ## Problem We exposed the direction tag in #10925 but didn't actually include the ingress tag in the export to allow for an adaption period. ## Summary of changes We now export the ingress direction	2025-03-14 13:54:57 +00:00
Dmitrii Kovalkov	b0922967e0	Bump humantime version and remove advisories.ignore (#11242 ) ## Problem - Closes: https://github.com/neondatabase/neon/issues/11179#issuecomment-2724222041 ## Summary of changes - Bump humantime version to `2.2` - Remove `RUSTSEC-2025-0014` from `advisories.ignore`	2025-03-14 11:51:11 +00:00
Dmitrii Kovalkov	f68be2b5e2	safekeeper: https for management API (#11171 ) ## Problem Storage controller uses unencrypted HTTP requests for safekeeper management API. - Closes: https://github.com/neondatabase/cloud/issues/24836 ## Summary of changes - Replace `hyper0::server::Server` with `http_utils::server::Server` in safekeeper. - Add HTTPS handler for safekeeper management API.	2025-03-14 11:41:22 +00:00
Christian Schwarz	04370b48b3	fix(storcon): optimization validation makes decisions based on wrong SecondaryProgress (#11229 ) # Refs - fixes https://github.com/neondatabase/neon/issues/11228 # Problem High-Level When storcon validates whether a `ScheduleOptimizationAction` should be applied, it retrieves the `tenant_secondary_status` to determine whether a secondary is ready for the optimization. When collecting results, it associates secondary statuses with the wrong optimization actions in the batch of optimizations that we're validating. The result is that we make the decision for shard/location X based on the SecondaryStatus of a random secondary location Y in the current batch of optimizations. A possible symptom is an early cutover, as seen in this engineering investigation here: - https://github.com/neondatabase/cloud/issues/25734 # Problem Code-Level This code here in `optimize_all_validate` `97e2e27f68/storage_controller/src/service.rs (L7012-L7029)` zips the `want_secondary_status` with the Vec returned from `tenant_for_shards_api`. However, the Vec returned from `tenant_for_shards_api` is not ordered (it uses FuturesUnordered internally). # Solution Sort the Vec in input order before returning it. `optimize_all_validate` was the only caller affected by this problem. While at it, also future-proof similar-looking function `tenant_for_shards`. None of its callers care about the order, but this type of function signature is easy to use incorrectly. # Future Work Avoid the additional iteration, map, and allocation. Change API to leverage AsyncFn (async closure). And/or invert `tenant_for_shards_api` into a Future ext trait / iterator adaptor thing.	2025-03-14 11:21:16 +00:00
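The fix described above amounts to tagging each request with its input position and sorting completed results by that position before zipping them with the inputs. The sketch below shows the idea with plain vectors standing in for results that complete out of order; `restore_input_order` is a hypothetical helper, not a function from `service.rs`.

```rust
/// Takes results in completion order, each tagged with the index of the
/// input that produced it, and returns them in input order.
pub fn restore_input_order<T>(mut completed: Vec<(usize, T)>) -> Vec<T> {
    // Sorting by the input index makes a later `zip` with the inputs safe.
    completed.sort_by_key(|(idx, _)| *idx);
    completed.into_iter().map(|(_, value)| value).collect()
}
```

Zipping the unsorted results directly would pair each input with whichever result happened to finish in that position, which is the mismatch the commit describes.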
Arpad Müller	5359cf717c	storcon: add API definitions for exclude_timeline and term_bump (#11197 ) Adds API definitions for the safekeeper API endpoints `exclude_timeline` and `term_bump`. Also does a bugfix to return the correct type from `delete_timeline`. Part of #8614	2025-03-14 00:00:37 +00:00
Erik Grinaker	d6d78a050f	pageserver: disable `l0_flush_wait_upload` by default (#11215 ) ## Problem This is already disabled in production, as it is replaced by L0 flush delays. It will be removed in a later PR, once the config option is no longer specified in production. ## Summary of changes Disable `l0_flush_wait_upload` by default.	2025-03-13 21:08:28 +00:00
Erik Grinaker	4ff000c042	pageserver: deflake `test_metadata_image_creation` (#11230 ) ## Problem `test_metadata_image_creation ` became flaky with #11212, since image compaction may yield to L0 compaction. ## Summary of changes Set `NoYield` when compacting in tenant tests.	2025-03-13 20:46:21 +00:00
Conrad Ludgate	9a3020d2ce	chore(proxy): pre-initialise metricvecs (#11226 ) ## Problem We noticed that error metrics didn't show for some services with light load. This is not great and can cause problems for dashboards/alerts ## Summary of changes Pre-initialise some metricvecs.	2025-03-13 20:23:53 +00:00
Alex Chi Z.	23b713900e	feat(storcon): passthrough ancestor detach behavior (#11199 ) ## Problem https://github.com/neondatabase/neon/issues/10310 https://github.com/neondatabase/neon/pull/11158 ## Summary of changes We need to passthrough the new detach behavior through the storcon API. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-03-13 20:21:23 +00:00
Arpad Müller	b1a1be6a4c	switch pytests and neon_local to control_plane_hooks_api (#11195 ) We want to switch away from and deprecate the `--compute-hook-url` param for the storcon in favour of `--control-plane-url` because it allows us to construct urls with `notify-safekeepers`. This PR switches the pytests and neon_local from a `control_plane_compute_hook_api` to a new param named `control_plane_hooks_api` which is supposed to point to the parent of the `notify-attach` URL. We still support reading the old url from disk to not be too disruptive with existing deployments, but we just ignore it. Also add docs for the `notify-safekeepers` upcall API. Follow-up of #11173 Part of https://github.com/neondatabase/neon/issues/11163	2025-03-13 19:50:52 +00:00
Erik Grinaker	8afae9d03c	pageserver: enable `l0_flush_delay_threshold` by default (#11214 ) ## Problem `l0_flush_delay_threshold` has already been set to 30 in production for a couple of weeks. Let's harmonize the default. ## Summary of changes Update `DEFAULT_L0_FLUSH_DELAY_FACTOR` to 3 such that the default `l0_flush_delay_threshold` is `3 * compaction_threshold`. This differs from the production setting, which is hardcoded to 30 (with `compaction_threshold` at 10), and is more appropriate for any tenants that have custom `compaction_threshold` overrides.	2025-03-13 19:15:22 +00:00
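The arithmetic in #11214 can be sketched as follows. This is an illustration in Python, not the pageserver code: the function name is hypothetical, and only the factor of 3 and the production values (threshold 30 with `compaction_threshold` at 10) come from the commit message.

```python
# Factor taken from the commit message; the function name is hypothetical.
DEFAULT_L0_FLUSH_DELAY_FACTOR = 3


def default_l0_flush_delay_threshold(compaction_threshold: int) -> int:
    """Default delay threshold, derived from the tenant's compaction threshold."""
    return DEFAULT_L0_FLUSH_DELAY_FACTOR * compaction_threshold


# With compaction_threshold = 10 the derived default equals the
# hardcoded production value of 30.
production_like = default_l0_flush_delay_threshold(10)

# A tenant with a custom compaction_threshold override gets a threshold
# that scales with it instead of staying fixed at 30.
custom_override = default_l0_flush_delay_threshold(20)
```

Deriving the threshold from `compaction_threshold` is what makes the default more appropriate for tenants with custom overrides than a fixed value.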
JC Grünhage	066b0a1be9	fix(ci): correctly push neon-test-extensions in releases and to ghcr (#11225 ) ## Problem `ef0d4a48a` adjusted how we build container images and how we push them, and the neon-test-extensions image was overlooked. Additionally, it was also missed in `1f0dea9a1`, which pushed our container images to GHCR. ## Summary of changes Push neon-test-extensions to GHCR and also push release tags for it.	2025-03-13 18:18:55 +00:00
Konstantin Knizhnik	398d2794eb	Handle DEBUG_COMPARE_LOCAL mode in neon_zeroextend (#11220 ) ## Problem DEBUG_COMPARE_LOCAL is not supported in neon_zeroextend added in PG16 ## Summary of changes Add support of DEBUG_COMPARE_LOCAL in neon_zeroextend Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-03-13 16:30:32 +00:00
Erik Grinaker	3c3b9dc919	pageserver: enable `image_creation_preempt_threshold` by default (#11216 ) ## Problem This is already set in production, we should harmonize the default. ## Summary of changes Default `image_creation_preempt_threshold` to 3.	2025-03-13 16:28:21 +00:00
Christian Schwarz	ed31dd2a3c	pageserver: better observability for slow wait_lsn (#11176 ) # Problem We leave too few observability breadcrumbs in the case where wait_lsn is exceptionally slow. # Changes - refactor: extract the monitoring logic out of `log_slow` into `monitor_slow_future` - add global + per-timeline counter for time spent waiting for wait_lsn - It is updated while we're still waiting, similar to what we do for page_service response flush. - add per-timeline counterpair for started & finished wait_lsn count - add slow-logging to leave breadcrumbs in logs, not just metrics For the slow-logging, we need to consider not flooding the logs during a broker or network outage/blip. The solution is a "log-streak-level" concurrency limit per timeline. At any given time, there is at most one slow wait_lsn that is logging the "still running" and "completed" sequence of logs. Other concurrent slow wait_lsn's don't log at all. This leaves at least one breadcrumb in each timeline's logs if some wait_lsn was exceptionally slow during a given period. The full degree of slowness can then be determined by looking at the per-timeline metric. # Performance Reran the `bench_log_slow` benchmark, no difference, so, existing call sites are fine. We do use a Semaphore, but only try_acquire it _after_ things have already been determined to be slow. So, no baseline overhead anticipated. # Refs - https://github.com/neondatabase/cloud/issues/23486#issuecomment-2711587222	2025-03-13 15:03:53 +00:00
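The "log-streak-level" concurrency limit from #11176 can be sketched with a one-permit semaphore that is only tried after an operation has already been classified as slow. This is a Python illustration under that assumption; `SlowLogStreak`, `try_start`, and `finish` are hypothetical names, not the pageserver API.

```python
import threading


class SlowLogStreak:
    """Per-timeline limiter: at most one slow operation logs at any time."""

    def __init__(self) -> None:
        self._permit = threading.Semaphore(1)

    def try_start(self) -> bool:
        # Non-blocking: a slow operation that loses the race simply
        # does not log, it never waits for the permit.
        return self._permit.acquire(blocking=False)

    def finish(self) -> None:
        # Called by the logging operation once it has emitted "completed".
        self._permit.release()


streak = SlowLogStreak()
first = streak.try_start()   # first slow wait_lsn wins the log streak
second = streak.try_start()  # a concurrent slow wait_lsn stays silent
streak.finish()              # the first one completes
third = streak.try_start()   # a later slow wait_lsn may log again
```

Because the permit is only tried on the slow path, fast operations never touch the semaphore, which matches the commit's claim of no baseline overhead.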
Conrad Ludgate	3dec117572	feat(compute_ctl): use TLS if configured (#10972 ) Closes: https://github.com/neondatabase/cloud/issues/22998 If control-plane reports that TLS should be used, load the certificates (and watch for updates), make sure postgres uses them, and detect updates. Procedure: 1. Load certificates 2. Reconfigure postgres/pgbouncer 3. Loop on a timer until certificates have loaded 4. Go to 1 Notes: 1. We only run this procedure if requested on startup by control plane. 2. We needed to compile pgbouncer with openssl enabled 3. Postgres doesn't allow tls keys to be globally accessible - must be read only to the postgres user. I couldn't convince the autoscaling team to let me put this logic into the VM settings, so instead compute_ctl will copy the keys to be read-only by postgres. 4. To mitigate a race condition, we also verify that the key matches the cert.	2025-03-13 15:03:22 +00:00
Alex Chi Z.	b2286f5bcb	fix(pageserver): don't panic if gc-compaction find no keys (#11200 ) ## Problem There was a panic on staging because compaction didn't find any keys. This is possible if none of the layers selected for compaction contain any keys within the current shard. ## Summary of changes Turn the panic into an error. In the future, we can try creating an empty image layer so that GC can clean up those layers. Otherwise, for now, we can only rely on shard ancestor compaction to remove this data. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-03-13 14:38:45 +00:00
Erik Grinaker	c036fec065	pageserver: enable `compaction_l0_first` by default (#11212 ) ## Problem `compaction_l0_first` has already been enabled in production for a couple of weeks. ## Summary of changes Enable `compaction_l0_first` by default. Also set `CompactFlags::NoYield` in `timeline_checkpoint_handler`, to ensure explicitly requested compaction runs to completion. This endpoint is mainly used in tests, and caused some flakiness where tests expected compaction to complete.	2025-03-13 14:28:42 +00:00
github-actions[bot]	f92f92b91b	Proxy release 2025-03-13	2025-03-13 13:43:01 +00:00
JC Grünhage	89c7e4e917	fix(ci): use parentheses for error handling in jq when fetching release PRs (#11217 ) ## Problem #11061 introduced code fetching previous releases. #11151 introduced jq error handling, which has also been applied in #11061, but the parentheses were missed. ## Summary of changes Add parentheses around error handling code.	2025-03-13 13:40:43 +00:00
Erik Grinaker	5a245a837d	storcon: retain stripe size when autosplitting sharded tenants (#11194 ) ## Problem Autosplits always request `DEFAULT_STRIPE_SIZE` for splits. However, splits do not allow changing the stripe size of already-sharded tenants, and will error out if it differs. In #11168, we are changing the stripe size, which could hit this when attempting to autosplit already sharded tenants. Touches #11168. ## Summary of changes Pass `new_stripe_size: None` when autosplitting already sharded tenants. Otherwise, pass `DEFAULT_STRIPE_SIZE` instead of the shard identity's stripe size, since we want to use the current default rather than an old, persisted default.	2025-03-13 13:28:10 +00:00
devin-ai-integration[bot]	efb1df4362	fix: Change metric_unit from 'microseconds' to 'μs' in test_compute_ctl_api.py (#11209 ) # Fix metric_unit length in test_compute_ctl_api.py ## Description This PR changes the metric_unit from "microseconds" to "μs" in test_compute_ctl_api.py to fix the issue where perf test results were not being stored in the database due to the string exceeding the 10 character limit of the metric_unit column in the perf_test_results table. ## Problem As reported in Slack, the perf test results were not being uploaded to the database because the "microseconds" string (12 characters) exceeds the 10 character limit of the metric_unit column in the perf_test_results table. ## Solution Replace "microseconds" with "μs" in all metric_unit parameters in the test_compute_ctl_api.py file. ## Testing The changes have been committed and pushed. The PR is ready for review. Link to Devin run: https://app.devin.ai/sessions/e29edd672bd34114b059915820e8a853 Requested by: Peter Bendel Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com> Co-authored-by: peterbendel@neon.tech <peterbendel@neon.tech>	2025-03-13 10:17:01 +00:00
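The length problem fixed in #11209 is easy to check. The sketch below is in Python; the constant and helper names are hypothetical, and only the 10-character limit and the two unit strings come from the commit message.

```python
# Limit of the metric_unit column, as stated in the commit message.
METRIC_UNIT_MAX_LEN = 10


def fits_metric_unit(unit: str) -> bool:
    """Return True if `unit` fits into the metric_unit column."""
    return len(unit) <= METRIC_UNIT_MAX_LEN


old_unit_len = len("microseconds")          # 12 characters, over the limit
old_unit_ok = fits_metric_unit("microseconds")
new_unit_ok = fits_metric_unit("μs")        # 2 characters, within the limit
```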
github-actions[bot]	dbb205ae92	Proxy release 2025-03-13	2025-03-13 09:50:35 +00:00
JC Grünhage	803e6f908a	fix(ci): fix syntax of lint-release-pr (#11208 ) ## Problem A small adjustment in #11061 broke the lint-release-pr.sh script, and the new version was neither tested nor linted. This has been done now, the script is once again tested and passing `shellcheck`. ## Summary of changes Add missing `el` of `elif` condition chain.	2025-03-13 09:42:38 +00:00
JC Grünhage	afc9524bc7	fix(ci): run lint-release-pr on head-ref (#11206 ) ## Problem #11061 changed release pr creation, and I missed that the workflow will checkout a would-be-merge of the rc branch and the release branch instead of the head ref, unless explicitly instructed otherwise. ## Summary of changes Check out head ref for linting the release PRs.	2025-03-13 08:17:33 +00:00
JC Grünhage	507353404c	fix(ci): pass empty body when creating release PRs (#11203 ) ## Problem #11061 changed release pr creation, and I missed that creating PRs using `gh` in non-interactive environments requires `--body` instead of defaulting to an empty body. ## Summary of changes Explicitly set an empty body when creating release PRs.	2025-03-12 23:54:43 +00:00
JC Grünhage	48be4df3f3	fix(ci): fetch all refs in release PR creation (#11201 ) ## Problem #11061 changed release PR creation, and I missed that we need to explicitly fetch the whole history so that the relevant git refs and objects are available. ## Summary of changes - Fetch all git refs including history by setting fetch-depth to 0 - Reference release branch as a remote branch, because we haven't checked it out locally	2025-03-12 22:32:38 +00:00
Alex Chi Z.	c3b3b507f7	feat(pageserver): support detaching behavior v2 (#11158 ) ## Problem close https://github.com/neondatabase/neon/issues/10310 ## Summary of changes This patch adds a new behavior for the detach_ancestor API: detach with multi-level ancestor and no reparenting. Though we can potentially support multi-level + do reparenting / single-level + no-reparenting in the future, as it's not required for the recovery/snapshot epic, I'd prefer keeping things simple now that we only handle the old one and the new one instead of supporting the full feature matrix. I only added a test case of successful detaching instead of testing failures. I'd like to make this into staging and add more tests in the future. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-03-12 22:27:23 +00:00
JC Grünhage	ef0d4a48a8	Reuse artifacts from release PRs (#11061 ) ## Problem When we release our components, we perform builds in the release PR, then test the components, then merge the PR, and then build everything again, run tests again, and only then start deployments. To speed things up, we want to perform builds and run tests in the PR, and start deployments using the existing artifacts from the release PR. To make that possible, we need to have both CI pipelines running on the same commit hash, which requires fast forwarding release. That only works if we have a commit in the PR that has the current release branch state as an ancestor. ## Summary of changes - Changes to release PR creation: - Remove templates and automatic bodies for release PRs. The previous template wasn't used anymore, and the automatic body we created in the pipeline didn't contain any useful content anymore after the changes here. - Make it possible to select the source branch. For releases that aren't cut from `main`, like https://github.com/neondatabase/neon/pull/11051, we need a way to trigger the new flow from a different branch. - Determine `release-branch` automatically from the component name instead of passing that as well. - Changes to the merge queue job: - Rename `get-changed-files` to `meta` in preparation of additional data being fetched as part of that job - Fail the merge queue if we're trying to merge into a branch other than main - this is to prevent non-fast-forward merges. - Label PRs to branches other than main as `fast-forward`, to trigger the fast-forward job - Add a fast-forward job that can be triggered with the `fast-forward` label that performs a fast-forward merge. This only happens if the PR has `mergeable_state == clean`, so CI having passed. - Build and Test on releases now skips building images, skips testing images and skips triggering e2e tests. 
We add new tags to the images from the release PR to tag them as release images, and we push them to the prod registries.	2025-03-12 21:00:59 +00:00
Alex Chi Z.	8a5a739af0	test(pageserver): add small tenant compaction (#11049 ) ## Problem close https://github.com/neondatabase/neon/issues/10881 ## Summary of changes Mock a tenant with very small amount of data. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-03-12 20:34:19 +00:00
Tristan Partin	5eed0e4b94	Add docs to performance/test_logical_replication.py on how to run the suite (#10175 ) These docs are in tandem with what was recently published on the internal docs site. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-03-12 17:31:09 +00:00
Tristan Partin	bb3c0ff251	Make collecting the installed extensions metric async (#11071 ) If the goal is to make compute_ctl completely asynchronous, then this is one step to getting there. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-03-12 16:09:02 +00:00
Conrad Ludgate	7aec1364dd	chore(proxy): remove enum and composite type queries (#11178 ) In our json encoding, we only need to know about array types. Information about composites or enums is not actually used. Enums are quite popular, and running type queries for them when they are not needed adds latency for no gain.	2025-03-12 15:47:17 +00:00
Tristan Partin	40672b739e	Move maybe_add_request_id_header middleware into middleware module (#11187 ) This matches the authorization middleware. --------- Signed-off-by: Tristan Partin <tristan@neon.tech> Co-authored-by: Mikhail Kot <mikhail@neon.tech>	2025-03-12 15:34:46 +00:00
Vlad Lazar	02a83913ec	storcon: do not update observed state on node activation (#11155 ) ## Problem When a node becomes active, we query its locations and update the observed state in-place. This can race with the observed state updates done when processing reconcile results. ## Summary of changes The argument for this reconciliation step is that it reduces the need for background reconciliations. I don't think this is actually true anymore. There are two cases. 1. Restart of node after drain. Usually the node does not go through the offline state here, so observed locations were not marked as none. In any case, there should be a handful of shards max on the node since we've just drained it. 2. Node comes back online after failure or network partition. When the node is marked offline, we reschedule everything away from it. When it later becomes active, the previous observed location is extraneous and requires a reconciliation anyway. Closes https://github.com/neondatabase/neon/issues/11148	2025-03-12 15:31:28 +00:00
Erik Grinaker	c7717c85c7	storcon,pageserver: use persisted stripe size when loading unsharded tenants (#11193 ) ## Problem When the storage controller and Pageserver load tenants from persisted storage, they use `ShardIdentity::unsharded()` for unsharded tenants. However, this replaces the persisted stripe size of unsharded tenants with the default stripe size. This doesn't really matter for practical purposes, since the stripe size is meaningless for unsharded tenants anyway, but can cause consistency check failures if the persisted stripe size differs from the default. This was seen in #11168, where we change the default stripe size. Touches #11168. ## Summary of changes Carry over the persisted stripe size from `TenantShardPersistence` for unsharded tenants, and from `LocationConf` on Pageservers. Also add bounds checks for type casts when loading persisted shard metadata.	2025-03-12 15:16:54 +00:00
Erik Grinaker	1436b8469c	pageserver: appease unused lint on macOS (#11192 ) ## Problem `info_span!` is only used in a `linux` branch, causing the unused lint to fire on macOS. ## Summary of changes Fully qualify the `info_span!` use.	2025-03-12 14:34:29 +00:00
JC Grünhage	fc515e7be2	chore(deps): bump env_logger to 0.11.7 (#11188 ) ## Problem `humantime` is unmaintained, we want to migrate to `jiff`, see https://github.com/neondatabase/neon/issues/11179. Older versions of `env_logger` depend on `humantime`, and newer versions depend on `jiff`, so we need to update it. ## Summary of changes Update `env_logger` to the most recent release, which does not depend on `humantime` anymore.	2025-03-12 14:26:52 +00:00
John Spray	7015dbbdf0	storcon_cli: remove pre-warm helper (#11183 ) ## Problem This command was used when onboarding tenants to the storage controller. We no longer do that, so the command can go. ## Summary of changes - Remove `storcon_cli tenant-warmup` command	2025-03-12 14:02:11 +00:00
Dmitrii Kovalkov	73e37ae388	Suppress "request was dropped" errors in test_timeline_archive (#11190 ) ## Problem Test `test_timeline_archive` is flaky because it makes requests that are intended to fail. This sometimes leads to warnings in the pageserver's logs. More details are in the issue. - Closes: https://github.com/neondatabase/neon/issues/11177 ## Summary of changes - Suppress such errors.	2025-03-12 13:23:31 +00:00
Vlad Lazar	1c0ff3c04d	utils: explicit OTEL export config and OTEL enablement via common entry point (#11139 ) We want to export performance traces from the pageserver in OTEL format. End goal is to see them in Grafana. To this end, there are two changes here: 1. Update the `tracing-utils` crate to allow for explicitly specifying the export configuration. Pageserver configuration is loaded from a file on start-up. This allows us to use the same flow for export configs there. 2. Update the `utils::logging::init` common entry point to set up OTEL tracing infrastructure if requested. Note that an entirely different tracing subscriber is used. This is to avoid interference with the existing tracing set-up. For now, no service uses this functionality. PR to plug this into the pageserver is [here](https://github.com/neondatabase/neon/pull/11140). Related https://github.com/neondatabase/neon/issues/9873	2025-03-12 11:07:49 +00:00
John Spray	7bf6397334	pageserver: remove legacy `TimelineInfo::latest_gc_cutoff` field (1/2) (#11149 ) ## Problem This field was retained for backward compat only in https://github.com/neondatabase/neon/pull/10707. Once https://github.com/neondatabase/cloud/pull/25233 is released, nothing external will be reading this field. Internally, this was a mandatory field so storage controller is still trying to decode it, so we must do this removal in two steps: this PR makes the field optional, and after one release we can fully remove it. Related: https://github.com/neondatabase/cloud/issues/24250 ## Summary of changes - Rename field to `_unused` - Remove field from swagger - Make field optional	2025-03-12 10:23:41 +00:00
Konstantin Knizhnik	f60ffe3021	Rebase compare local debug mode (#11174 ) ## Problem DEBUG_COMPARE_LOCAL mode is broken See https://neondb.slack.com/archives/C03QLRH7PPD/p1732862608323269?thread_ts=1732711054.862919&cid=C03QLRH7PPD ## Summary of changes Fix compile errors and unlogged build issues. Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-03-12 05:52:18 +00:00
Arpad Müller	da2431f11f	storcon: add --control-plane-url config option (#11173 ) Adds the `--control-plane-url` config option to the storcon, which we want to migrate to instead of using `notify-attach`. Part of #11163	2025-03-12 02:30:56 +00:00
JC Grünhage	e8396034ac	fix(ci): fail meta using jq halt_error if data is unexpectedly missing (#11151 ) ## Problem When the GitHub API is having problems, we might not get data back, and we happily set vars to empty values. This causes problems down the line. See https://github.com/neondatabase/neon/actions/runs/13718859397/job/38381946590?pr=11132#step:5:1 for example. ## Summary of changes Fail the `meta` job if we don't get expected data back from GitHub.	2025-03-11 22:59:30 +00:00
Tristan Partin	decd265c99	Revert notify to 6.0.0 (#11162 ) The upgrade to 8.0.0 caused severe performance regressions in the start_postgres_ms metric, which measures the time it takes from execing Postgres to the time Postgres marks itself as ready in the postmaster.pid file. We use the notify crate to watch for changes in the pgdata directory and the postmaster.pid file. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-03-11 22:18:09 +00:00
Christian Schwarz	158db414bf	buffered writer: handle write errors by retrying all write IO errors indefinitely (#10993 ) # Problem If the Pageserver ingest path (InMemoryLayer=>EphemeralFile=>BufferedWriter) encounters ENOSPC or any other write IO error when flushing the mutable buffer of the BufferedWriter, the buffered writer is left in a state where subsequent _reads_ from the InMemoryLayer will cause a `must not use after we returned an error` panic. The reason is that 1. the flush background task bails on flush failure, 2. causing the `FlushHandle::flush` function to fail at channel.recv() and 3. causing the `FlushHandle::flush` function to bail with the flush error, 4. leaving its caller `BufferedWriter::flush` with `BufferedWriter::mutable = None`, 5. once the InMemoryLayer's RwLock::write guard is dropped, subsequent reads can enter, 6. those reads find `mutable = None` and cause the panic. # Context It has always been the contract that writes against the BufferedWriter API must not be retried because the writer/stream-style/append-only interface makes no atomicity guarantees ("On error, did nothing or a piece of the buffer get appended?"). The idea was that the error would bubble up to upper layers that can throw away the buffered writer and create a new one. (See our [internal error handling policy document on how to handle e.g. `ENOSPC`](`c870a50bc0/src/storage/handling_io_and_logical_errors.md (L36-L43)`)). That _might_ be true for delta/image layer writers, I haven't checked. But it's certainly not true for the ingest path: there are no provisions to throw away an InMemoryLayer that encountered a write error and reingest the WAL already written to it. Adding such higher-level retries would involve resetting last_record_lsn to a lower value and restarting walreceiver. The code isn't flexible enough to do that, and such complexity likely isn't worth it given that write errors are rare. 
# Solution The solution in this PR is to retry _any_ failing write operation _indefinitely_ inside the buffered writer flush task, except of course those that are fatal as per `maybe_fatal_err`. Retrying indefinitely ensures that `BufferedWriter::mutable` is never left `None` in the case of IO errors, thereby solving the problem described above. It's a clear improvement over the status quo. However, while we're retrying, we build up backpressure because the `flush` is only double-buffered, not infinitely buffered. Backpressure here is generally good to avoid resource exhaustion, but blocks reads and hence stalls GetPage requests because InMemoryLayer reads and writes are mutually exclusive. That's orthogonal to the problem that is solved here, though. ## Caveats Note that there are some remaining conditions in the flush background task where it can bail with an error. I have annotated one of them with a TODO comment. Hence the `FlushHandle::flush` is still fallible and hence the overall scenario of leaving `mutable = None` on the bail path is still possible. We can clean that up in a later commit. Note also that retrying indefinitely is great for temporary errors like ENOSPC but likely undesirable in case the `std::io::Error` we get is really due to higher-level logic bugs. For example, we could fail to flush because the timeline or tenant directory got deleted and VirtualFile's reopen fails with ENOENT. Note finally that cancellation is not respected while we're retrying. This means we will block timeline/tenant/pageserver shutdown. The reason is that the existing cancellation story for the buffered writer background task was to recv from flush op channel until the sending side (FlushHandle) is explicitly shut down or dropped. Failing to handle cancellation carries the operational risk that even if a single timeline gets stuck because of a logic bug such as the one laid out above, we must still restart the whole pageserver process. 
# Alternatives Considered As pointed out in the `Context` section, throwing away an InMemoryLayer that encountered an error and reingesting the WAL is a lot of complexity that IMO isn't justified for such an edge case. Also, it's wasteful. I think it's a local optimum. A more general and simpler solution for ENOSPC is to `abort()` the process and run eviction on startup before bringing up the rest of pageserver. I argued for it in the past, the pro arguments are still valid and complete: https://neondb.slack.com/archives/C033RQ5SPDH/p1716896265296329 The trouble at the time was implementing eviction on startup. However, maybe things are simpler now that we are fully storcon-managed and all tenants have secondaries. For example, if pageserver `abort()`s on ENOSPC and then simply doesn't respond to storcon heartbeats while it is running eviction on startup, storcon will fail tenants over to the secondary anyway, giving us all the time we need to clean up. The downside is that if there's a systemic space management bug, the above proposal will just propagate the problem to other nodes. But I imagine that because of the delays involved with filling up disks, the system might reach a half-stable state, providing operators more time to react. # Demo Intermediary commit `a03f335121480afc0171b0f34606bdf929e962c5` is demoed in this (internal) screen recording: https://drive.google.com/file/d/1nBC6lFV2himQ8vRXDXrY30yfWmI2JL5J/view?usp=drive_link # Perf Testing Ran `bench_ingest` on tmpfs, no measurable difference. Spans are uniquely owned by the flush task, and the span stack isn't too deep, so, enter and exit should be cheap. Plus, each flush takes ~150us with direct IO enabled, so, not _that_ high-frequency an event anyway. # Refs - fixes https://github.com/neondatabase/neon/issues/10856	2025-03-11 20:40:23 +00:00
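The retry policy described in #10993 (retry any non-fatal write error indefinitely inside the flush task) can be sketched as follows. This is a Python illustration, not the pageserver's Rust code: `flush_with_retry`, `FlakyDisk`, and the `is_fatal` predicate are hypothetical, and treating `EIO` as fatal is an arbitrary assumption standing in for `maybe_fatal_err`.

```python
import errno


def flush_with_retry(flush_once, is_fatal, sleep=lambda attempt: None):
    """Retry `flush_once` until it succeeds; only fatal errors propagate."""
    attempt = 0
    while True:
        try:
            return flush_once()
        except OSError as err:
            if is_fatal(err):
                raise
            # Non-fatal (e.g. ENOSPC): back off and retry, so the caller
            # never observes a failed flush.
            attempt += 1
            sleep(attempt)


class FlakyDisk:
    """Hypothetical disk that reports ENOSPC for the first `failures` flushes."""

    def __init__(self, failures: int) -> None:
        self.failures = failures
        self.calls = 0

    def flush(self) -> str:
        self.calls += 1
        if self.calls <= self.failures:
            raise OSError(errno.ENOSPC, "No space left on device")
        return "flushed"


disk = FlakyDisk(failures=2)
result = flush_with_retry(
    disk.flush,
    is_fatal=lambda err: err.errno == errno.EIO,  # assumption, see lead-in
)
```

As the caveats in the commit message note, a loop like this ignores cancellation while it retries; a real implementation would need a shutdown signal in addition to the fatal-error check.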
Christian Schwarz	083a30b1e2	storage broker: disable deploy by default (#11172 ) context - https://github.com/neondatabase/cloud/issues/23486#issuecomment-2711587222 - companion infra.git PR: https://github.com/neondatabase/infra/pull/3249	2025-03-11 19:45:06 +00:00
Alex Chi Z.	7d221214bb	feat(pageserver): support no-yield for gc-compaction (#11184 ) ## Problem This should also resolve the test flakiness of `test_gc_feedback`. close https://github.com/neondatabase/neon/issues/11144 ## Summary of changes If `NoYield` is set, do not yield in gc-compaction. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-03-11 19:13:52 +00:00
Dmitrii Kovalkov	8983677f29	Ignore cargo deny advisory RUSTSEC-2025-0014 for humantime (#11180 ) ## Problem `humantime` is not maintained and `cargo deny check` fails - Will be addressed in https://github.com/neondatabase/neon/issues/11179 ## Summary of changes Ignore RUSTSEC-2025-0014 advisory for now	2025-03-11 19:09:32 +00:00
Ivan Efremov	011f7c21a3	fix(proxy): Add testodrome query id HTTP header (#11167 ) Handle "X-Neon-Query-ID" header to glue data with testodrome queries. Relates to the #22486	2025-03-11 17:17:30 +00:00
Alex Chi Z.	7588983168	fix(scrubber): log even if no refs are found (#11160 ) ## Problem Investigate https://github.com/neondatabase/neon/issues/11159 ## Summary of changes This doesn't fix the issue, but at least we can narrow down the cause next time it happens by logging ancestor referenced layer cnt even if it's 0. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-03-11 14:33:35 +00:00
Arseny Sher	359c64c779	walproposer: pre generations refactoring (#11060 ) ## Problem https://github.com/neondatabase/neon/issues/10851 ## Summary of changes Do some refactoring before making walproposer generations aware. - Rename SS_VOTING to SS_WAIT_VOTING, SS_IDLE to SS_WAIT_ELECTED - Continue to get rid of epochs: rename GetEpoch to GetLastLogTerm, donorEpoch to donorLastLogTerm - Instead of counting n_votes, n_connected, introduce explicit WalProposerState (collecting terms / voting / elected). Refactor out TermsCollected and VotesCollected; they will determine state transition differently depending on whether generations are enabled or not. There is no new logic in this PR and thus no new tests.	2025-03-11 14:01:00 +00:00
Erik Grinaker	f466c01995	pageserver: add `max_logical_size_per_shard` for `get_top_tenants` (#11157 ) ## Problem In #11122, we want to split shards once the logical size of the largest timeline exceeds a split threshold. However, `get_top_tenants` currently only returns `max_logical_size`, which tracks the max _total_ logical size of a timeline across all shards. This is problematic, because the storage controller needs to fetch a list of N tenants that are eligible for splits, but the API doesn't currently have a way to express this. For example, with a split threshold of 1 GB, a tenant with `max_logical_size` of 4 GB is eligible to split if it has 1 or 2 shards, but not if it already has 4 shards. We need to express this in per-shard terms, otherwise the `get_top_tenants` endpoint may end up only returning tenants that can't be split, blocking splits entirely. Touches https://github.com/neondatabase/neon/pull/11122. Touches https://github.com/neondatabase/cloud/issues/22532. ## Summary of changes Add `TenantShardItem::max_logical_size_per_shard` containing `max_logical_size / shard_count`, and `TenantSorting::MaxLogicalSizePerShard` to order and filter by it.	2025-03-11 11:43:55 +00:00
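The per-shard reasoning in #11157 can be made concrete with the commit's own numbers (1 GB split threshold, 4 GB tenant). The sketch is in Python; the function names are hypothetical, and treating a shard count of 0 as a single shard is an assumption made here, not something the commit states.

```python
GIB = 1024 ** 3


def max_logical_size_per_shard(max_logical_size: int, shard_count: int) -> int:
    """max_logical_size / shard_count, as described in the commit message."""
    # Assumption: a shard count of 0 is treated as a single shard.
    return max_logical_size // max(shard_count, 1)


def eligible_for_split(max_logical_size: int, shard_count: int,
                       split_threshold: int) -> bool:
    """A tenant is eligible once its per-shard size exceeds the threshold."""
    return max_logical_size_per_shard(max_logical_size, shard_count) > split_threshold


# A 4 GiB tenant with a 1 GiB threshold, at 1, 2 and 4 shards.
eligibility = {
    shards: eligible_for_split(4 * GIB, shards, 1 * GIB)
    for shards in (1, 2, 4)
}
```

Sorting and filtering on the total `max_logical_size` alone would rank the 4-shard tenant as high as the unsharded one, even though only the latter can still be split.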
Conrad Ludgate	d1b60fa0b6	fix(proxy): delete prepared statements when discarding (#11165 ) Fixes https://github.com/neondatabase/serverless/issues/144 When tables have enums, we need to perform type queries for that data. We cache these query statements for performance reasons. In Neon RLS, we run "discard all" for security reasons, which discards all the statements. When we need to type check again, the statements are no longer valid. This fixes it to discard the statements as well. I've also added some new logs and error types to monitor this. Currently we don't see the prepared statement errors in our logs.	2025-03-11 10:48:50 +00:00
Christian Schwarz	7c462b3417	impr: propagate VirtualFile metrics via RequestContext (#7202 ) # Refs - fixes https://github.com/neondatabase/neon/issues/6107 # Problem `VirtualFile` currently parses the path it is opened with to identify the `tenant,shard,timeline` labels to be used for the `STORAGE_IO_SIZE` metric. Further, for each read or write call to VirtualFile, it uses `with_label_values` to retrieve the correct metrics object, which under the hood is a global hashmap guarded by a parking_lot mutex. We perform tens of thousands of reads and writes per second on every pageserver instance; thus, doing the mutex lock + hashmap lookup is wasteful. # Changes Apply the technique we use for all other timeline-scoped metrics to avoid the repeated `with_label_values`: add it to `TimelineMetrics`. Wrap `TimelineMetrics` into an `Arc`. Propagate the `Arc<TimelineMetrics>` down to `VirtualFile`, and use `Timeline::metrics::storage_io_size`. To avoid contention on the `Arc<TimelineMetrics>`'s refcount atomics between different connection handlers for the same timeline, we wrap it into another Arc. To avoid frequent allocations, we store that Arc<Arc<TimelineMetrics>> inside the per-connection timeline cache. Preliminary refactorings to enable this change: - https://github.com/neondatabase/neon/pull/11001 - https://github.com/neondatabase/neon/pull/11030 # Performance I ran the benchmarks in `test_runner/performance/pageserver/pagebench/test_pageserver_max_throughput_getpage_at_latest_lsn.py` on an `i3en.3xlarge` because that's what we currently run them on. None of the benchmarks shows a meaningful difference in latency or throughput or CPU utilization. I would have expected some improvement in the many-tenants-one-client-each workload because they all hit that hashmap constantly, and clone the same `UintCounter` / `Arc` inside of it. But apparently the overhead is minuscule compared to the remaining work we do per getpage. 
Yet, since the changes are already made, the added complexity is manageable, and the perf overhead of `with_label_values` demonstrable in micro-benchmarks, let's have this change anyway. Also, propagating TimelineMetrics through RequestContext might come in handy down the line. The micro-benchmark that demonstrates perf impact of `with_label_values`, along with other pitfalls and mitigation techniques around the `metrics`/`prometheus` crate: - https://github.com/neondatabase/neon/pull/11019 # Alternative Designs An earlier iteration of this PR stored an `Arc<Arc<Timeline>>` inside `RequestContext`. The problem is that this risks reference cycles if the RequestContext gets stored in an object that is owned directly or indirectly by `Timeline`. Ideally, we wouldn't be using this mess of Arc's at all and propagate Rust references instead. But tokio requires tasks to be `'static`, and so, we wouldn't be able to propagate references across task boundaries, which is incompatible with any sort of fan-out code we already have (e.g. concurrent IO) or future code (parallel compaction). So, opt for Arc for now.	2025-03-11 07:23:06 +00:00
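The technique described in #7202 (resolve the labelled metric once, then reuse the handle on the hot path) can be sketched with the standard library alone. This is a minimal illustration, not the pageserver code: `CounterVec` here is a hypothetical stand-in for a labelled counter family, and the `TimelineMetrics` struct below only borrows the name from the commit message.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a labelled counter family:
/// a global map guarded by a mutex.
#[derive(Default)]
struct CounterVec {
    by_labels: Mutex<HashMap<(String, String), Arc<AtomicU64>>>,
}

impl CounterVec {
    /// Slow path: takes the lock and does a hash lookup on every call.
    fn with_label_values(&self, tenant: &str, timeline: &str) -> Arc<AtomicU64> {
        let mut map = self.by_labels.lock().unwrap();
        map.entry((tenant.to_string(), timeline.to_string()))
            .or_default()
            .clone()
    }
}

/// Per-timeline metrics: the counter is resolved once at construction.
struct TimelineMetrics {
    storage_io_size: Arc<AtomicU64>,
}

impl TimelineMetrics {
    fn new(family: &CounterVec, tenant: &str, timeline: &str) -> Self {
        Self {
            storage_io_size: family.with_label_values(tenant, timeline),
        }
    }

    /// Hot path: a single atomic add, no lock and no hashing.
    fn observe_io(&self, bytes: u64) {
        self.storage_io_size.fetch_add(bytes, Ordering::Relaxed);
    }
}
```

Each read or write then calls `observe_io` on the cached handle instead of repeating the label lookup.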
Christian Schwarz	420f7b07b4	add benchmark demonstrating `metrics`/`prometheus` crate multicore scalability pitfalls & workarounds (#11019 ) We use the `metrics` / `prometheus` crate in the Pageserver code base. This PR demonstrates - typical performance pitfalls with that crate - our current set of techniques to avoid most of these pitfalls. refs - https://github.com/neondatabase/neon/issues/10948 - https://github.com/neondatabase/neon/pull/7202 - I applied the `label_values__cache_label_values_lookup` technique there. - It didn't yield measurable results in high-level benchmarks though.	2025-03-11 07:22:56 +00:00
Arpad Müller	4d3c477689	storcon: timeline table, creation and deletion (#11058 ) This PR extends the storcon with basic safekeeper management of timelines, mainly timeline creation and deletion. We want to make the storcon manage safekeepers in the future. Timeline creation is controlled by the `--timelines-onto-safekeepers` flag. 1. it adds the `timelines` and `safekeeper_timeline_pending_ops` tables to the storcon db 2. it extends the code for timeline creation and deletion 3. it adds per-safekeeper reconciler tasks TODO: * maybe not immediately schedule reconciliations for deletions but have a prior manual step * tenant deletions * add exclude API definitions (probably separate PR) * how to choose safekeeper to do exclude on vs deletion? this can be a bit hairy because the safekeeper might go offline in the meantime. * error/failure case handling * tests (cc test_explicit_timeline_creation from #11002) * single safekeeper mode: we often only have one SK (in tests for example) * `notify-safekeepers` hook: https://github.com/neondatabase/neon/issues/11163 TODOs implemented: * cancellations of enqueued reconciliations on a per-timeline basis, helpful if there is an ongoing deletion * implement pending ops overwrite behavior * load pending operations from db RFC section for important reading: [link](https://github.com/neondatabase/neon/blob/main/docs/rfcs/035-safekeeper-dynamic-membership-change.md#storage_controller-implementation) Implements the bulk of #9011 Successor of #10440. --------- Co-authored-by: Arseny Sher <sher-ars@yandex.ru>	2025-03-11 02:31:22 +00:00
Alex Chi Z.	3451bdd3d2	fix(test): force L0 compaction before gc-compaction (#11143 ) ## Problem Fix test flakiness of `test_gc_feedback` Closes: https://github.com/neondatabase/neon/issues/11153 ## Summary of changes Looking at the log, gc-compaction is interrupted by L0 compaction. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-03-10 20:03:49 +00:00
Konstantin Knizhnik	fb1957936c	Fix calculation of LFC used_pages (#11095 ) ## Problem The async prefetch in LFC PR causes incorrect calculation of LFC `used_pages` when a page is overwritten. ## Summary of changes Decrement `used_pages` if a page is overwritten. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Matthias van de Meent <matthias@neon.tech>	2025-03-10 18:28:55 +00:00
Matthias van de Meent	bc052fd0fc	Add configuration options to disable prevlink checks (#11138 ) This allows for improved decoding of otherwise broken WAL. ## Problem Currently, if (or when) a WAL record has a wrong prevptr, that breaks decoding. With this, we don't have to break on that if we decide it's OK to proceed after that. ## Summary of changes Use a Neon GUC to allow the system to enable the NEON-specific skip_lsn_checks option in XLogReader.	2025-03-10 17:02:30 +00:00
Vlad Lazar	8c553297cb	safekeeper: use max end lsn as start of next batch (#11152 ) ## Problem Partial reads are still problematic. They are stored in the buffer of the wal decoder and result in gaps being reported too eagerly on the pageserver side. ## Summary of changes Previously, we always used the start LSN of the chunk of WAL that was just read. This patch switches to using the end LSN of the last record that was decoded in the previous iteration.	2025-03-10 16:33:28 +00:00
Dmitrii Kovalkov	63b22d3fb1	pageserver: https for management API (#11025 ) ## Problem Storage controller uses unencrypted HTTP requests for pageserver management API. Closes: https://github.com/neondatabase/cloud/issues/24283 ## Summary of changes - Implement `http_utils::server::Server` with TLS support. - Replace `hyper0::server::Server` with `http_utils::server::Server` in pageserver. - Add HTTPS handler for pageserver management API. - Generate local SSL certificates in neon local.	2025-03-10 15:07:59 +00:00
JC Grünhage	f17931870f	fix(ci): use <!subteam^ID> syntax for pinging groups on slack (#11135 ) ## Problem Pinging groups on slack didn't work, because I didn't use the correct syntax. ## Summary of changes Use the correct syntax for pinging groups.	2025-03-10 13:27:23 +00:00
Arpad Müller	33c3c34c95	Appease cargo deny errors (#11142 ) * pprof can also use `prost` as a backend, switch to it as `protobuf` has no update available but a security issue. * `paste` is a build time dependency, so add the unmaintained warning as an exception.	2025-03-10 13:24:14 +00:00
Ivan Efremov	5d38fd6c43	fix(proxy): Use testodrome query id for latency measurement (#11150 ) Add a new neon option "neon_query_id" to glue data with testodrome queries. Log latency in microseconds always. Relates to the #22486	2025-03-10 12:55:16 +00:00
Dmitrii Kovalkov	66881b4394	storcon: update scheduler stats when changing node's preferred az (#11147 ) ## Problem `home_shard_count` is not updated on the preferred AZ change. Closes: https://github.com/neondatabase/neon/issues/10493 ## Summary of changes - Update scheduler stats (node ref counts) on preferred AZ change.	2025-03-10 11:33:00 +00:00
Konstantin Knizhnik	c87d307e8c	Print state of connection buffer when no response is received from PS for a long time (#11145 ) ## Problem See https://neondb.slack.com/archives/C08DE6Q9C3B Sometimes compute is not able to receive responses from PS for a long time (minutes). I do not think that the problem is on the compute side, but in order to exclude this possibility I want to see more information about the connection state on the compute side, particularly the amount of data cached in the connection buffer. ## Summary of changes Right now we are dumping the state of the socket buffer. This PR adds printing the state of the connection buffer. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-03-09 18:36:36 +00:00
Tristan Partin	1b8c4286c4	Fetch remote extension in ALTER EXTENSION UPDATE statements (#11102 ) Previously, remote extensions were not fetched unless they were used in some other manner. For instance, loading a BM25 index in pg_search fetches the pg_search extension. However, if on a fresh compute with pg_search 0.15.5 installed, the user ran `ALTER EXTENSION pg_search UPDATE TO '0.15.6'` without first using the pg_search extension, we would not fetch the extension and fail to find an update path. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-03-09 17:29:44 +00:00
Tristan Partin	3fe5650039	Fix dropping role with table privileges granted by non-neon_superuser (#10964 ) We were previously only revoking privileges granted by neon_superuser. However, we need to do it for all grantors. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-03-07 19:00:11 +00:00
Alex Chi Z.	cd438406fb	feat(pageserver): add force patch index_part API (#11119 ) ## Problem As part of the disaster recovery tool. Partly for https://github.com/neondatabase/neon/issues/9114. ## Summary of changes * Add a new pageserver API to force patch the fields in index_part and modify the timeline internal structures. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-03-07 17:42:52 +00:00
Dmitrii Kovalkov	e876794ce5	storcon: use https safekeeper api (#11065 ) ## Problem Storage controller uses http for requests to safekeeper management API. Closes: https://github.com/neondatabase/cloud/issues/24835 ## Summary of changes - Add `use_https_safekeeper_api` option to storcon to use https api - Use https for requests to safekeeper management API if this option is enabled - Add `ssl_ca_file` option to storcon for ability to specify custom root CA certificate	2025-03-07 17:22:47 +00:00
John Spray	87e6117dfd	storage controller: API-driven graceful migrations (#10913 ) ## Problem The current migration API does a live migration, but if the destination doesn't already have a secondary, that live migration is unlikely to be able to warm up a tenant properly within its timeout (full warmup of a big tenant can take tens of minutes). Background optimisation code knows how to do this gracefully by creating a secondary first, but we don't currently give a human a way to trigger that. Closes: https://github.com/neondatabase/neon/issues/10540 ## Summary of changes - Add `preferred_node` parameter to TenantShard, which is respected by optimize_attachment - Modify migration API to have optional prewarm=true mode, in which we set preferred_node and call optimize_attachment, rather than directly modifying intentstate - Require override_scheduler=true flag if migrating somewhere that is a less-than-optimal scheduling location (e.g. wrong AZ) - Add `origin_node_id` to migration API so that callers can ensure they're moving from where they think they're moving from - Add tests for the above The storcon_cli wrapper for this has a 'watch' mode that waits for eventual cutover. This doesn't show how the warmth of the secondary evolves because we don't currently have an API for that in the controller, as the passthrough API only targets attached locations, not secondaries. It would be straightforward to add later as a dedicated endpoint for getting secondary status, then extend the storcon_cli to consume that and print a nice progress indicator.	2025-03-07 17:02:38 +00:00
Vlad Lazar	084fc4a757	pageserver: enable previous heatmaps by default (#11132 ) We added the off-by-default configs in https://github.com/neondatabase/neon/pull/11088 because the unarchival heatmap was causing oversized secondary locations. That was fixed in https://github.com/neondatabase/neon/pull/11098, so let's turn them on by default.	2025-03-07 16:05:31 +00:00
Vlad Lazar	937876cbe2	safekeeper: don't skip empty records for shard zero (#11137 ) ## Problem Shard zero needs to track the start LSN of the latest record in addition to the LSN of the next record to ingest. The former is included in basebackup persisted by the compute in WAL. Previously, empty records were skipped for all shards. This caused the prev LSN tracking on the PS to fall behind and led to logical replication issues. ## Summary of changes Shard zero now receives empty interpreted records for LSN tracking purposes. A test is included too.	2025-03-07 15:52:01 +00:00
Alexander Lakhin	a4ce20db5c	Support workflow_dispatch event in _meta.yml (#11133 ) ## Problem Allow for using _meta.yml with workflow_dispatch event. ## Summary of changes Handle this event in the run-kind step; fix and update the description of the run-kind output.	2025-03-07 15:00:06 +00:00
Erik Grinaker	eedd179f0c	storcon: initial autosplit tweaks (#11134 ) ## Problem This patch makes some initial tweaks as preparation for https://github.com/neondatabase/cloud/issues/22532, where we will be introducing additional autosplit logic. The plan is outlined in https://github.com/neondatabase/cloud/issues/22532#issuecomment-2706215907. ## Summary of changes Minor code cleanups and behavioral changes: * Decide that we'll split based on `max_logical_size` (later possibly `total_logical_size`). * Fix a bug that would split the smallest candidate rather than the largest. * Pick the largest candidate by `max_logical_size` rather than `resident_size`, for consistency (debatable). * Split out `get_top_tenant_shards()` to fetch split candidates. * Fetch candidates concurrently from all nodes. * Make `TenantShard.get_scheduling_policy()` return a copy instead of a reference.	2025-03-07 14:38:01 +00:00
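The candidate selection described in #11134 (pick the largest candidate by `max_logical_size`, not the smallest) can be illustrated as below. This is only a sketch under assumptions: the `SplitCandidate` struct and `pick_split_candidate` function are hypothetical, and only the two size field names are taken from the commit message.

```rust
/// Hypothetical shape of a split candidate; the struct itself is illustrative.
#[derive(Debug, Clone, PartialEq)]
struct SplitCandidate {
    tenant_shard_id: String,
    max_logical_size: u64,
    resident_size: u64,
}

/// Returns the candidate with the largest `max_logical_size`, if any.
/// Using `min_by_key` here would reproduce the kind of bug the commit
/// describes: splitting the smallest candidate instead of the largest.
fn pick_split_candidate(candidates: &[SplitCandidate]) -> Option<&SplitCandidate> {
    candidates.iter().max_by_key(|c| c.max_logical_size)
}
```

Note that `resident_size` is deliberately ignored: a shard can have a large logical size while little of it is resident.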
Arpad Müller	f1b18874c3	storcon: require safekeeper jwt's in strict mode (#11116 ) We have introduced the ability to specify safekeeper JWTs for the storage controller. It now does heartbeats. We now want to also require the presence of those JWTs. Let's merge this PR shortly after the release cutoff. Part of / follow-up of https://github.com/neondatabase/cloud/issues/24727	2025-03-07 13:29:48 +00:00
Fedor Dikarev	db77896e92	remove CODEOWNER assignment for the test_runner/ (#11130 ) ## Problem That adds an extra unnecessary load to the PerfCorr team ## Summary of changes Remove `CODEOWNERS` assignment for the `test_runner/` folder	2025-03-07 12:38:27 +00:00
Alexey Kondratov	f5aa8c3eac	feat(compute_ctl): Add a basic HTTP API benchmark (#11123 ) ## Problem We just had a regression reported at https://neondb.slack.com/archives/C08EXUJF554/p1741102467515599, which clearly came with one of the releases. It's not a huge problem yet, but it's annoying that we cannot quickly attribute it to a specific commit. ## Summary of changes Add a very simple `compute_ctl` HTTP API benchmark that does 10k requests to `/status` and `metrics.json` and reports p50 and p99. --------- Co-authored-by: Peter Bendel <peterbendel@neon.tech>	2025-03-07 12:35:42 +00:00
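Reporting p50 and p99 over a fixed number of requests, as the benchmark in #11123 does, comes down to a percentile over latency samples. The sketch below uses the nearest-rank definition, which is an assumption; the commit does not say which percentile method the benchmark uses, and the `percentile` helper is hypothetical.

```rust
use std::time::Duration;

/// Nearest-rank percentile over a set of latency samples.
/// Returns `None` for empty input or a percentile outside (0, 100].
fn percentile(samples: &[Duration], pct: f64) -> Option<Duration> {
    if samples.is_empty() || !(pct > 0.0 && pct <= 100.0) {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort();
    // Nearest rank: the smallest sample such that at least pct% of all
    // samples are less than or equal to it (1-based rank).
    let rank = (pct / 100.0 * sorted.len() as f64).ceil() as usize;
    Some(sorted[rank - 1])
}
```

With 10k samples, p99 under this definition is the 9,900th smallest latency, so a handful of slow outliers do not dominate it.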
Arpad Müller	cea67fc062	update ring to 0.17.13 (#11131 ) Update ring from 0.17.6 to 0.17.13. Addresses the advisory: https://rustsec.org/advisories/RUSTSEC-2025-0009	2025-03-07 12:17:04 +00:00
Alex Chi Z.	e825974a2d	feat(pageserver): yield gc-compaction to L0 compaction (#11120 ) ## Problem Part of https://github.com/neondatabase/neon/issues/9114 ## Summary of changes gc-compaction could take a long time in some cases, for example, if the job split heuristic is wrong and we selected too large a region for compaction that can't be finished within a reasonable amount of time. We will give up such tasks and yield to L0 compaction. Each gc-compaction sub-compaction job is atomic and cannot be split further so we have to give up (instead of storing a state and continuing later as in image creation). --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-03-06 20:30:11 +00:00
Fedor Dikarev	50d883d516	Add performance-correctness to the CODEOWNERS (#11124 ) ## Problem After splitting teams it became a bit more complicated for the PerfCorr team to work on test changes. ## Summary of changes 1. Add PerfCorr team co-owners for `.github/` folder 2. Add PerfCorr team as owner for `test_runner/` folder	2025-03-06 19:59:17 +00:00
Alexey Kondratov	a485022300	fix(compute_ctl): Properly escape identifiers inside PL/pgSQL blocks (#11045 ) ## Problem In `f37eeb56`, I properly escaped the identifier, but I didn't notice that the resulting string is used in the `format('...')`, so it needs additional escaping. Yet, after looking at it closer and with Heikki's and Tristan's help, it turned out to be a whole can of worms and we have problems all over the code in places where we use PL/pgSQL blocks. ## Summary of changes Add a new `pg_quote_dollar()` helper to deal with it, as dollar-quoting of strings seems to be the only robust way to escape strings in dynamic PL/pgSQL blocks. We mimic the Postgres' `pg_get_functiondef` logic here [1]. While on it, I added more tests and caught a couple more bugs with string escaping: 1. `get_existing_dbs_async()` was wrapping `owner` in additional double-quotes if it contained special characters 2. `construct_superuser_query()` was flawed in even more ways than the rest of the code. It wasn't realistic to fix it quickly, but after thinking about it more, I realized that we could drop most of it altogether. IIUC, it was added as some sort of migration, probably back when we didn't have migrations yet. So all the complicated code was needed to properly update existing roles and DBs. In the current Neon, this code only runs before we create the very first DB and role. When we create roles and DBs, all `neon_superuser` grants are added in different places. So the worst thing that could happen is that there is an ancient branch somewhere, so when users poke it, they will realize that not all Neon features work as expected. Yet, the fix is simple and self-serve -- just create a new role via UI or API, and it will get a proper `neon_superuser` grant. [1]: `8b49392b27/src/backend/utils/adt/ruleutils.c (L3153)` Closes neondatabase/cloud#25048	2025-03-06 19:54:29 +00:00
Anastasia Lubennikova	3dee29eb00	Spawn rsyslog from neonvm (#11111 ) Then configure it from compute_ctl, to make it more robust in case of restarts and rsyslogd crashes.	2025-03-06 19:14:19 +00:00
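The dollar-quoting idea from #11045 can be sketched as follows: choose a tag that does not occur in the string, growing it until it is unique, then wrap the string in that tag. This is an illustration only. The function name `dollar_quote`, its signature, and the `base_tag` parameter are hypothetical and not the actual `pg_quote_dollar()` helper; `base_tag` is assumed to be a plain identifier without `$`.

```rust
/// Wraps `body` in a dollar-quoted literal whose tag does not occur in `body`.
/// `base_tag` is assumed to be a plain identifier (letters, digits, underscores).
fn dollar_quote(body: &str, base_tag: &str) -> String {
    let mut tag = format!("${base_tag}");
    // Grow the tag until the body no longer contains it, so nothing inside
    // the body can terminate the literal early.
    while body.contains(&tag) {
        tag.push('x');
    }
    tag.push('$');
    format!("{tag}{body}{tag}")
}
```

For example, `dollar_quote("it's $q$ tricky", "q")` yields `$qx$it's $q$ tricky$qx$`, which needs no further escaping of quotes or backslashes inside the literal.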
Peter Bendel	3bb318a295	run periodic page bench more frequently to simplify bi-secting regressions (#11121 ) ## Problem When periodic pagebench runs only once a day a lot of commits can be in between a good run and a regression. ## Summary of changes Run the workflow every 3 hours	2025-03-06 17:47:54 +00:00
Alex Chi Z.	11334a2cdb	feat(pageserver): more statistics for gc-compaction (#11103 ) ## Problem part of https://github.com/neondatabase/neon/issues/9114 ## Summary of changes * Add timers for each phase of the gc-compaction. * Add a final ratio computation to directly show the garbage collection ratio in the logs. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-03-06 16:44:00 +00:00
Alexey Kondratov	4b77807de9	fix(compute/sql_exporter): Ignore invalid DBs when collecting size (#11097 ) ## Problem Original Slack discussion: https://neondb.slack.com/archives/C04DGM6SMTM/p1739915430147169 TL;DR in Postgres, it's totally normal to have 'invalid' DBs (state after the interrupted `DROP DATABASE`). Yet, some of our metrics collected with `sql_exporter` try to get the size of such invalid DBs. Typical log lines: ``` time=2025-03-05T16:30:32.368Z level=ERROR source=promhttp.go:52 msg="Error gathering metrics" error="[from Gatherer #1] [collector=neon_collector,query=pg_stats_userdb] pq: [NEON_SMGR] [reqid 0] could not read db size of db 173228 from page server at lsn 0/44A0E8C0" time=2025-03-05T16:30:32.369Z level=ERROR source=promhttp.go:52 msg="Error gathering metrics" error="[from Gatherer #1] [collector=neon_collector,query=db_total_size] pq: [NEON_SMGR] [reqid 0] could not read db size of db 173228 from page server at lsn 0/44A0E8C0" ``` ## Summary of changes Ignore invalid DBs in these two metrics -- `pg_stats_userdb` and `db_total_size`	2025-03-06 15:32:17 +00:00
Vlad Lazar	5ceb8c994d	pageserver: mark unarchival heatmap layers as cold (#11098 ) ## Problem On unarchival, we update the previous heatmap with all visible layers. When the primary generates a new heatmap it includes all those layers, so the secondary will download them. Since they're not actually resident on the primary (we didn't call the warm up API), they'll never be evicted, so they remain in the heatmap. We want these layers in the heatmap, since we might wish to warm-up an unarchived timeline after a shard migration. However, we don't want them to be downloaded on the secondary until we've warmed up the primary. ## Summary of Changes Include these layers in the heatmap and mark them as cold. All heatmap operations act on non-cold layers apart from the attached location warming up API, which will download the cold layers. Once the cold layers are downloaded on the primary, they'll be included in the next heatmap as hot and the secondary starts fetching them too.	2025-03-06 11:25:02 +00:00
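The hot/cold distinction from #11098 can be pictured as a filter over heatmap entries. This is a sketch under assumptions: the `HeatmapLayer` type, its `cold` field, and `layers_to_download` are hypothetical names, not the pageserver's actual types.

```rust
/// Hypothetical heatmap entry: a layer name plus a cold marker.
#[derive(Debug, Clone, PartialEq)]
struct HeatmapLayer {
    name: String,
    cold: bool,
}

/// Selects the layers an operation should download.
/// Ordinary heatmap consumers (e.g. a secondary) pass `include_cold = false`;
/// a warm-up of the attached location passes `include_cold = true`.
fn layers_to_download(layers: &[HeatmapLayer], include_cold: bool) -> Vec<&str> {
    layers
        .iter()
        .filter(|layer| include_cold || !layer.cold)
        .map(|layer| layer.name.as_str())
        .collect()
}
```

Once a cold layer has been downloaded on the primary, the next heatmap would list it as hot, and the same filter then lets the secondary fetch it.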
Vlad Lazar	43cea0df91	pageserver: allow for unit test stress test (#11112 ) ## Problem I like using `cargo stress` to hammer on a test, but it doesn't work out of the box because it does parallel runs by default and tests always use the same repo dir. ## Summary of changes Add an uuid to the test repo dir when generating it.	2025-03-06 11:23:25 +00:00
Erik Grinaker	ab7efe9e47	pageserver: add amortized read amp metrics (#11093 ) ## Problem In a batch, `pageserver_layers_per_read_global` counts all layer visits towards every read in the batch, since this directly affects the observed latency of the read. However, this doesn't give a good picture of the amortized read amplification due to batching. ## Summary of changes Add two more global read amp metrics: * `pageserver_layers_per_read_batch_global`: number of layers visited per batch. * `pageserver_layers_per_read_amortized_global`: number of layers divided by reads in a batch.	2025-03-06 10:23:48 +00:00
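The two values added in #11093 differ only in whether the layer count is divided by the number of reads in the batch. A minimal sketch, assuming a hypothetical `BatchReadAmp` helper that is not part of the pageserver:

```rust
/// Layer-visit statistics for one batch of reads (hypothetical helper type).
struct BatchReadAmp {
    layers_visited: u64,
    reads: u64,
}

impl BatchReadAmp {
    /// Per-batch value: total number of layers visited by the batch.
    fn per_batch(&self) -> u64 {
        self.layers_visited
    }

    /// Amortized value: layers visited divided by reads in the batch.
    /// Returns `None` for an empty batch to avoid dividing by zero.
    fn amortized(&self) -> Option<f64> {
        (self.reads > 0).then(|| self.layers_visited as f64 / self.reads as f64)
    }
}
```

A batch of 4 reads that visits 12 layers has a per-batch value of 12 and an amortized value of 3, whereas the existing per-read metric would count all 12 layers towards each of the 4 reads.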
Folke Behrens	16b8a3f598	Update Jinja2 to 3.1.6 (#11109 ) https://github.com/neondatabase/neon/security/dependabot/89	2025-03-06 09:55:41 +00:00
Conrad Ludgate	85072b715f	Merge pull request #11106 from neondatabase/rc/release-proxy/2025-03-06 Proxy release 2025-03-06	2025-03-06 09:53:00 +00:00
Folke Behrens	f343537e4d	proxy: Small adjustments to json logging (#11107 ) * Remove callsite identifier registration on span creation. Forgot to remove from last PR. Was part of alternative idea. * Move "spans" object to right after "fields", so event and span fields are listed together.	2025-03-06 09:18:28 +00:00
github-actions[bot]	6c86fe7143	Proxy release 2025-03-06	2025-03-06 06:02:15 +00:00
Alex Chi Z.	78b322f616	rfc: add 041-rel-sparse-keyspace (#10412 ) Based on the PoC patch I've done in #10316, I'd like to put an RFC in advance to ensure everyone is on the same page, and start incrementally port the code to the main branch. https://github.com/neondatabase/neon/issues/9516 [Rendered](https://github.com/neondatabase/neon/blob/skyzh/rfc-041-rel-sparse-keyspace/docs/rfcs/041-rel-sparse-keyspace.md) --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Erik Grinaker <erik@neon.tech>	2025-03-05 21:43:16 +00:00
Alex Chi Z.	2de3629b88	test(pageserver): use reldirv2 by default in regress tests (#11081 ) ## Problem For pg_regress test, we do both v1 and v2; for all the rest, we default to v2. part of https://github.com/neondatabase/neon/issues/9516 ## Summary of changes Use reldir v2 across test cases by default. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-03-05 21:02:44 +00:00
Em Sharnoff	1fe23fe8d2	compute/lfc: Add chunk size to neon_lfc_stats (#11100 ) This PR adds a new key to neon.neon_lfc_stats — 'file_cache_chunk_size_pages'. It just returns the value of BLOCKS_PER_CHUNK from the LFC implementation. The new value should (eventually) allow changing the chunk size without breaking any places that rely on LFC stats values measured in number of chunks. See neondatabase/cloud#25170 for more.	2025-03-05 20:35:08 +00:00
Peter Bendel	604eb5e8d4	fix grafana dashboard link for pooler endpoints (#11099 ) ## Problem Our benchmarking workflows contain links to grafana dashboards to troubleshoot problems. This works fine for non-pooled endpoints. For pooled endpoints we need to remove the `-pooler` suffix from the endpoint's hostname to get a valid endpoint ID. Example link that doesn't work in this run https://github.com/neondatabase/neon/actions/runs/13678933253/job/38246028316#step:8:311 ## Summary of changes Check if connection string is a -pooler connection string and if so remove this suffix from the endpoint ID. --------- Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2025-03-05 20:01:17 +00:00
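The suffix handling from #11099 can be sketched as below. Two assumptions are made here: the endpoint ID is taken to be the first DNS label of the hostname, and the function name and example hostnames are hypothetical. The actual workflow change is not Rust.

```rust
/// Derives an endpoint ID from a hostname, dropping a `-pooler` suffix
/// if present. Assumes the endpoint ID is the first DNS label.
fn endpoint_id_from_host(host: &str) -> &str {
    let first_label = host.split('.').next().unwrap_or(host);
    first_label
        .strip_suffix("-pooler")
        .unwrap_or(first_label)
}
```

Only a trailing `-pooler` is removed, so an ID that merely contains the word elsewhere is left untouched.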
Tristan Partin	d599d2df80	Update postgres_exporter to 0.17.1 (#11094 ) Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-03-05 18:32:45 +00:00
Alexey Kondratov	8263107f6c	feat(compute): Add filename label to remote ext requests metric (#11091 ) ## Problem We realized that we may use this metric for more 'live' info about extension installations vs. what we have with installed extensions metric, which is only updated at start, atm. ## Summary of changes Add `filename` label to `compute_ctl_remote_ext_requests_total`. Note that it contains the raw archive name with `.tar.zst` at the end, so the consumer may need to strip this suffix. Closes https://github.com/neondatabase/cloud/issues/24694	2025-03-05 18:17:57 +00:00
Anastasia Lubennikova	d94fc75cfc	Setup compute_ctl pgaudit and rsyslog (#10615 ) Setup pgaudit and pgauditlogtofile extensions in compute_ctl when the ComputeAuditLogLevel is set to 'hipaa'. See cloud PR https://github.com/neondatabase/cloud/pull/24568 Add rsyslog setup for compute_ctl. Spin up a rsyslog server in the compute VM, and configure it to send logs to the endpoint specified in AUDIT_LOGGING_ENDPOINT env.	2025-03-05 18:01:00 +00:00
Alex Chi Z.	9cdc8c0e6c	feat(pageserver): revisit error types for gc-compaction (#11082 ) ## Problem part of https://github.com/neondatabase/neon/issues/9114 We used anyhow::Error everywhere and it's time to fix. ## Summary of changes * Make sure that cancel errors are correctly propagated as CompactionError::ShuttingDown. * Skip all the trigger computation work if gc_cutoff is not generated yet. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-03-05 15:57:38 +00:00
Arpad Müller	2d45522fa6	storcon db: load safekeepers from DB again (#11087 ) Earlier PR #11041 soft-disabled the loading code for safekeepers from the storcon db. This PR makes us load the safekeepers from the database again, now that we have [JWTs available on staging](https://github.com/neondatabase/neon/pull/11087) and soon on prod. This reverts commit `23fb8053c5`. Part of https://github.com/neondatabase/cloud/issues/24727	2025-03-05 15:45:43 +00:00
JC Grünhage	94e6897ead	fix(ci): make deploy job depend on pushing images to dev registries (#11089 ) ## Problem If an image fails to push to dev registries, we shouldn't trigger the deploy job, because that depends on images existing in dev registries. To ensure this is the case, the deploy job needs to depend on pushing to dev registries. ## Summary of changes Make `deploy` depend on `push-neon-image-dev` and `push-compute-image-dev`.	2025-03-05 14:28:43 +00:00
Erik Grinaker	332aae1484	test_runner/regress: speed up `test_check_visibility_map` (#11086 ) ## Problem `test_check_visibility_map` is the slowest test in CI, and can cause timeouts under particularly slow configurations (`debug` and `without-lfc`). ## Summary of changes * Reduce the `pgbench` scale factor from 10 to 8. * Omit a redundant vacuum during `pgbench` init. * Remove a final `vacuum freeze` + `pg_check_visible` pass, which has questionable value (we've already done a vacuum freeze previously, and we don't flush the compute cache before checking anyway).	2025-03-05 13:50:35 +00:00
Vlad Lazar	8c12ccf729	pageserver: gate previous heatmap behind config flag (#11088 ) ## Problem On unarchival, we update the previous heatmap with all visible layers. When the primary generates a new heatmap it includes all those layers, so the secondary will download them. Since they're not actually resident on the primary (we didn't call the warm up API), they'll never be evicted, so they remain in the heatmap. This leads to oversized secondary locations like we saw in pre-prod. ## Summary of changes Gate the loading of the previous heatmaps and the heatmap generation on unarchival behind configuration flags. They are disabled by default, but enabled in tests.	2025-03-05 12:20:18 +00:00
Vlad Lazar	abae7637d6	pageserver: do big reads to fetch slru segment (#11029 ) ## Problem Each page of the slru segment is fetched individually when it's loaded on demand. ## Summary of Changes Use `Timeline::get_vectored` to fetch 16 at a time.	2025-03-05 11:55:55 +00:00
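Fetching 16 pages at a time, as in #11029, amounts to splitting a segment's block range into fixed-size batches, each of which would then go into one vectored read. A sketch of just the batching step; `plan_segment_reads` and `SLRU_READ_BATCH` are hypothetical names, and the call to `Timeline::get_vectored` itself is not shown.

```rust
use std::ops::Range;

/// Batch size taken from the commit message ("fetch 16 at a time").
const SLRU_READ_BATCH: usize = 16;

/// Splits blocks `0..n_blocks` into consecutive ranges of at most
/// `SLRU_READ_BATCH` blocks; the last range may be shorter.
fn plan_segment_reads(n_blocks: u32) -> Vec<Range<u32>> {
    let step = SLRU_READ_BATCH as u32;
    (0..n_blocks)
        .step_by(SLRU_READ_BATCH)
        .map(|start| start..start.saturating_add(step).min(n_blocks))
        .collect()
}
```

A 32-block segment thus needs two batched reads instead of 32 individual ones.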
Anastasia Lubennikova	38a883118a	Skip dropping tablesync replication slots on the publisher from branch (#11073 ) fixes https://github.com/neondatabase/cloud/issues/24292 Do not drop tablesync replication slots on the publisher when we're in the process of dropping subscriptions inherited by a neon branch, because these slots are still needed by the parent branch subscriptions. For regular slots we handle this by setting the slot_name to NONE before calling DROP SUBSCRIPTION, but tablesync slots are not exposed to SQL. Rely on GUC disable_logical_replication_subscribers=true to know that we're in the Neon-specific process of dropping subscriptions.	2025-03-05 11:29:46 +00:00
Erik Grinaker	40aa4d7151	utils: log Sentry initialization (#11077 ) ## Problem We don't have any logging for Sentry initialization. This makes it hard to verify that it has been configured correctly. ## Summary of changes Log some basic info when Sentry has been initialized, but omit the public key (which allows submitting events). Also log when `SENTRY_DSN` isn't specified at all, and when it fails to initialize (which is supposed to panic, but we may as well).	2025-03-05 11:23:07 +00:00
Folke Behrens	8e51bfc597	proxy: JSON logging field refactor (#11078 ) ## Problem Grafana Loki's JSON handling is somewhat limited and the log message should be structured in a way that it's easy to sift through logs and filter. ## Summary of changes * Drop span_id. It's too short lived to be of value and only bloats the logs. * Use the span's name as the object key, but append a unique numeric value to prevent name collisions. * Extract interesting span fields into a separate object at the root. New format: ```json { "timestamp": "2025-03-04T18:54:44.134435Z", "level": "INFO", "message": "connected to compute node at 127.0.0.1 (127.0.0.1:5432) latency=client: 22.002292ms, cplane: 0ns, compute: 5.338875ms, retry: 0ns", "fields": { "cold_start_info": "unknown" }, "process_id": 56675, "thread_id": 9122892, "task_id": "24", "target": "proxy::compute", "src": "proxy/src/compute.rs:288", "trace_id": "5eb89b840ec63fee5fc56cebd633e197", "spans": { "connect_request#1": { "ep": "endpoint", "role": "proxy", "session_id": "b8a41818-12bd-4c3f-8ef0-9a942cc99514", "protocol": "tcp", "conn_info": "127.0.0.1" }, "connect_to_compute#6": {}, "connect_once#8": { "compute_id": "compute", "pid": "853" } }, "extract": { "session_id": "b8a41818-12bd-4c3f-8ef0-9a942cc99514" } } ```	2025-03-05 10:27:46 +00:00
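The key scheme from #11078 (span name plus a unique numeric suffix) can be sketched as below. The `SpanInfo` type and `spans_object` function are hypothetical, and the sketch builds a plain map without serializing it to JSON.

```rust
use std::collections::BTreeMap;

/// Hypothetical description of one active span.
struct SpanInfo<'a> {
    name: &'a str,
    id: u64,
    fields: Vec<(&'a str, &'a str)>,
}

/// Builds the `spans` object: keys are `<name>#<id>`, so two spans with the
/// same name but different ids do not overwrite each other.
fn spans_object(spans: &[SpanInfo]) -> BTreeMap<String, BTreeMap<String, String>> {
    spans
        .iter()
        .map(|span| {
            let key = format!("{}#{}", span.name, span.id);
            let fields = span
                .fields
                .iter()
                .map(|(k, v)| (k.to_string(), v.to_string()))
                .collect();
            (key, fields)
        })
        .collect()
}
```

Keying by name keeps the logs filterable in a JSON-aware log viewer, while the numeric suffix guards against collisions between nested spans that share a name.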
Peter Bendel	906d7468cc	exclude separate perf tests from bench step (#11084 ) ## Problem Our benchmarking workflow has a job step `bench`which runs all tests in test_runner/performance/* except those that we want to run separately. We recently added two test cases to that testcase directory that we want to run separately but forgot to ignore them during the bench step. This is now causing [failures](https://github.com/neondatabase/neon/actions/runs/13667689340/job/38212087331#step:7:392). ## Summary of changes Ignore the separately run tests in the bench step.	2025-03-05 10:14:51 +00:00
Konstantin Knizhnik	438f7bb726	Check response status in prefetch_lookup (#11080 ) ## Problem New async prefetch introduces the `prefetch_lookup` function, which is called before LFC lookup to check if a prefetch request is already completed. This function currently does not check that the response is actually `T_NeonGetPageResponse` (and not an error). ## Summary of changes Add checks for response tag. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-03-05 10:03:09 +00:00
Peter Bendel	f62ddb11ed	Distinguish manually submitted runs for periodic pagebench in grafana dashboard (#11079 ) ## Problem Periodic pagebench workflow runs periodically from latest main commit and also allows to dispatch it manually for a given commit hash to bi-sect regressions. However in the dashboards we can not distinguish manual runs from periodic runs which makes it harder to follow the trend. ## Summary of changes Send an additional flag commit type to the benchmark runner instance to distinguish the run type. Note: this needs a follow-up PR on the receiving side.	2025-03-04 18:11:43 +00:00
Tristan Partin	7b7e4a9fd3	Authorize compute_ctl requests from the control plane (#10530 ) The compute should only act if requests come from the control plane. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-03-04 18:08:00 +00:00
Erik Grinaker	4bbdb758ec	compute_tools: appease unused lint on macOS (#11074 ) ## Problem On macOS, the `unused` lint complains about two variables not used in `!linux` builds. These were introduced in #11007. ## Summary of changes Appease the linter by explicitly using the variables in `!linux` branches.	2025-03-04 16:39:32 +00:00
Alex Chi Z.	20af9cef17	fix(test): use the same value for reldir v1+v2 (#11070 ) ## Problem part of https://github.com/neondatabase/neon/issues/11067 My observation is that with the current value of settings, x86-v1 usually takes 30s, arm-v1 1m30s, x86-v2 1m, arm-v2 3m. But sometimes the system could run too slow and cause test to timeout on arm with reldir v2. While I investigate what's going on and further improve the performance, I'd like to set both of them to use the same test input, so that it doesn't timeout and we don't abuse this test case as a performance test. ## Summary of changes Use the same settings for both test cases. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-03-04 14:55:50 +00:00
Erik Grinaker	a2902e774a	http-utils: generate heap profiles with jemalloc_pprof (#11075 ) ## Problem The code to generate symbolized pprof heap profiles and flamegraph SVGs has been upstreamed to the `jemalloc_pprof` crate: * https://github.com/polarsignals/rust-jemalloc-pprof/pull/22 * https://github.com/polarsignals/rust-jemalloc-pprof/pull/23 ## Summary of changes Use `jemalloc_pprof` to generate symbolized pprof heap profiles and flamegraph SVGs. This reintroduces a bunch of internal jemalloc stack frames that we'd previously strip, e.g. each stack now always ends with `prof_backtrace_impl` (where jemalloc takes a stack trace for heap profiling), but that seems ok.	2025-03-04 12:13:41 +00:00
Vlad Lazar	435bf452e6	tests: remove obsolete err log whitelisting (#11069 ) The pageserver read path now supports overlapped in-memory and image layers via https://github.com/neondatabase/neon/pull/11000. These allowed errors are now obsolete.	2025-03-04 08:18:19 +00:00
Erik Grinaker	65addfc524	storcon: add per-tenant rate limiting for API requests (#10924 ) ## Problem Incoming requests often take the service lock, and sometimes even do database transactions. That creates a risk that a rogue client can starve the controller of the ability to do its primary job of reconciling tenants to an available state. ## Summary of changes * Use the `governor` crate to rate limit tenant requests at 10 requests per second. This is ~10-100x lower than the worst "attack" we've seen from a client bug. Admin APIs are not rate limited. * Add a `storage_controller_http_request_rate_limited` histogram for rate limited requests. * Log a warning every 10 seconds for rate limited tenants. The rate limiter is parametrized on TenantId, because the kinds of client bug we're protecting against generally happen within tenant scope, and the rates should be somewhat stable: we expect the global rate of requests to increase as we do more work, but we do not expect the rate of requests to one tenant to increase. --------- Co-authored-by: John Spray <john@neon.tech>	2025-03-03 22:04:59 +00:00
Alex Chi Z.	6d0976dad5	feat(pageserver): persist reldir v2 migration status (#10980 ) ## Problem part of https://github.com/neondatabase/neon/issues/9516 ## Summary of changes Similar to the aux v2 migration, we persist the relv2 migration status into index_part, so that even if the config item is set to false, we will still read from the v2 storage to avoid loss of data. Note that only the two variants `None` and `Some(RelSizeMigration::Migrating)` are used for now. We don't have full migration implemented so it will never be set to `RelSizeMigration::Migrated`. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-03-03 21:05:43 +00:00
Alex Chi Z.	dbf9a80261	fix(pageserver): avoid flooding gc-compaction logs (#11024 ) ## Problem The "did not trigger" gets logged at 10k/minute in staging. ## Summary of changes Change it to debug level. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-03-03 20:23:20 +00:00
Vlad Lazar	6ca49b4d0c	safekeeper: fix a gap tracking edge case (#11054 ) The interpreted reader tracks a record aligned current position in the WAL stream. Partial reads move the stream internally, but not from the pov of the interpreted WAL reader. Hence, new shards may subscribe with a start position that matches the reader's current position even though we've also done some partial reads. This confuses the gap tracking. To make it more robust, update the current batch start to the min between the new start position and its current value. Since no record has been decoded yet (position matches), we can't have lost it.	2025-03-03 19:16:03 +00:00
Vlad Lazar	5197e43396	pageserver: add recurse flag to layer download spec (#11068 ) I missed updating the open api spec in the original PR. We need this so that the cplane auto-generated client sees the flag.	2025-03-03 19:04:01 +00:00
Alexander Lakhin	9a4e2eab61	Fix artifact name for build with sanitizers (#11066 ) ## Problem When a build is made with sanitizers, this is not reflected in the artifact name, which can lead to overriding normal builds with sanitized ones. ## Summary of changes Take this property of a build into account when constructing the artifact name.	2025-03-03 18:00:53 +00:00
Vlad Lazar	8298bc903c	pageserver: handle in-memory layer overlaps with persistent layers (#11000 ) ## Problem Image layers may be nested inside in-memory layers as diagnosed [here](https://github.com/neondatabase/neon/issues/10720#issuecomment-2649419252). The read path doesn't support this and may skip over the image layer, resulting in a failure to reconstruct the page. ## Summary of changes We already support nesting of image layers inside delta layers. The logic lives in `LayerMap::select_layer`. The main goal of this PR is to propagate the candidate in-memory layer down to that point and update the selection logic. Important changes are: 1. Support partial reads for the in-memory layer. Previously, we could only specify the start LSN of the read. We need to control the end LSN too. 2. `LayerMap::ranged_search` considers in-memory layers too. Previously, the search for in-memory layers was done explicitly in `Timeline::get_reconstruct_data_timeline`. Note that `LayerMap::ranged_search` now returns a weak readable layer which the `LayerManager` can upgrade. This dance is such that we can unit test the layer selection logic. 3. Update `LayerMap::select_layer` to consider the candidate in-memory layer too Loosely related drive bys: 1. Remove the "keys not found" tracking in the ranged search. This wasn't used anywhere and it just complicates things. 2. Remove the difficulty map stuff from the layer map. Again, not used anywhere. Closes https://github.com/neondatabase/neon/issues/9185 Closes https://github.com/neondatabase/neon/issues/10720	2025-03-03 17:52:59 +00:00
John Spray	b953daa21f	safekeeper: allow remote deletion to proceed after dropped requests (#11042 ) ## Problem If a caller times out on safekeeper timeline deletion on a large timeline, and waits a while before retrying, the deletion will not progress while the retry is waiting. The net effect is very very slow deletion as it only proceeds in 30 second bursts across 5 minute idle periods. Related: https://github.com/neondatabase/neon/issues/10265 ## Summary of changes - Run remote deletion in a background task - Carry a watch::Receiver on the Timeline for other callers to join the wait - Restart deletion if the API is called again and the previous attempt failed	2025-03-03 16:03:51 +00:00
Peter Bendel	a07599949f	First version of a new benchmark to test larger OLTP workload (#11053 ) ## Problem We want to support larger tenants (regarding logical database size, number of transactions per second etc.) and should increase our test coverage of OLTP transactions at larger scale. ## Summary of changes Start a new benchmark that over time will add more OLTP tests at larger scale. This PR covers the first version and will be extended in further PRs. Also fix some infrastructure: - default for new connections and large tenants is to use connection pooler pgbouncer, however our fixture always added `statement_timeout=120` which is not compatible with pooler [see](https://neon.tech/docs/connect/connection-errors#unsupported-startup-parameter) - action to create branch timed out after 10 seconds and 10 retries but for large tenants it can take longer so use increasing back-off for retries ## Test run https://github.com/neondatabase/neon/actions/runs/13593446706	2025-03-03 15:25:48 +00:00
Vlad Lazar	38277497fd	pageserver: log shutdown at info level for basebackup (#11046 ) ## Problem Timeline shutdown during basebackup logs at error level because the cancellation error is smushed into BasebackupError::Server. ## Summary of changes Introduce BasebackupError::Shutdown and use it. `log_query_error` will now see `QueryError::Shutdown` and log at info level.	2025-03-03 13:46:50 +00:00
Arseny Sher	ef2b50994c	walproposer: basic infra to enable generations (#11002 ) ## Problem Preparation for https://github.com/neondatabase/neon/issues/10851 ## Summary of changes Add walproposer `safekeepers_generations` field which can be set by prefixing `neon.safekeepers` GUC with `g#n:`. Non zero value (n) forces walproposer to use generations. In particular, this also disables implicit timeline creation as timeline will be created by storcon. Add test checking this. Also add missing infra: `--safekeepers-generation` flag to neon_local endpoint start + fix `--start-timeout` flag: it existed but value wasn't used.	2025-03-03 13:20:20 +00:00
Konstantin Knizhnik	8669bfe493	Do not store zero pages in inmem SMGR for walredo (#11043 ) ## Problem See https://neondb.slack.com/archives/C033RQ5SPDH/p1740157873114339 smgrextend for FSM fork is called during page reconstruction by walredo process causing overflow of inmem SMGR (64 pages). ## Summary of changes Do not store zero pages in inmem SMGR because `inmem_read` returns zero page if it is not able to locate specified block. Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-03-03 12:50:07 +00:00
Misha Sakhnov	625c526bdd	ci: create multiarch vm images (#11017 ) ## Problem We build compute-nodes as multi-arch images, but not the vm-compute-nodes. The PR adds multiarch vm images the same way as in autoscaling repo. ## Summary of changes Add architecture to the matrix for vm compute build steps Add merge job --------- Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2025-03-03 11:47:09 +00:00
a-masterov	df0767176a	Change the tags names according to the current state (#11059 ) ## Problem We have not synced `force-test-extensions-upgrade.yml` with the last changes. The variable `TEST_EXTENSIONS_UPGRADE` was ignored in the script and actually set to `NEW_COMPUTE_TAG` while it should be set to `OLD_COMPUTE_TAG` as we are about to run compatibility tests. ## Summary of changes The tag names were synced, the logic was fixed.	2025-03-03 09:40:49 +00:00
Heikki Linnakangas	38ddfab643	compute_ctl: Perform more startup actions in parallel (#11008 ) To speed up compute startup. Resizing swap in particular takes about 100 ms on my laptop. By performing it in parallel with downloading the basebackup, that latency is effectively hidden. I would imagine that downloading remote extensions can also take a non-trivial amount of time, although I didn't try to measure that. In any case that's now also performed in parallel with downloading the basebackup.	2025-03-03 00:29:37 +00:00
Heikki Linnakangas	066324d6ec	compute_ctl: Rearrange startup code (#11007 ) Move most of the code to compute.rs, so that all the major startup steps are visible in one place. You can now get a pretty good picture of what happens in the latency-critical path at compute startup by reading ComputeNode::start_compute(). This also clarifies the error handling in start_compute. Previously, the start_postgres function sometimes returned an Err, and sometimes Ok but with the compute status already set to Failed. Now the start_compute function always returns Err on failure, and it's the caller's responsibility to change the compute status to Failed. Separately from that, it returns a handle to the Postgres process via a `&mut` reference if it had already started Postgres (i.e. on success, or if the failure happens after launching the Postgres process). --------- Co-authored-by: Alexey Kondratov <kondratov.aleksey@gmail.com>	2025-02-28 22:48:05 +00:00
Suhas Thalanki	ee0c8ca8fd	Add -fsigned-char for cross platform signed chars (#10852 ) ## Problem In multi-character keys, the GIN index creates a CRC hash of the first 3 bytes of the key. The first bit of the hash can be set or unset, so `char` needs a consistent representation across architectures for consistent results. GIN stores these keys by their hashes, which determines the order in which the keys are obtained from the GIN index. By default, chars are signed on x86 and unsigned on arm, leading to inconsistent behavior across different platform architectures. Adding the `-fsigned-char` flag to the GCC compiler forces chars to be treated as signed across platforms, ensuring that the order in which the keys are obtained is consistent. ## Summary of changes Added `-fsigned-char` to the `CFLAGS` to force GCC to use signed chars across platforms. Added a test to check this across platforms. Fixes: https://github.com/neondatabase/cloud/issues/23199	2025-02-28 21:07:21 +00:00
Ivan Efremov	56033189c1	feat(proxy): Log latency after connect to compute (#11048 ) ## Problem To measure latency accurately, we should associate the testodrome role with the latency data. ## Summary of changes Add latency logging to associate different roles with a latency. Relates to the #22486	2025-02-28 17:58:42 +00:00
Erik Grinaker	d857f63e3b	pageserver: fix race that can wedge background tasks (#11047 ) ## Problem `wait_for_active_tenant()`, used when starting background tasks, has a race condition that can cause it to wait forever (until cancelled). It first checks the current tenant state, and then subscribes for state updates, but if the state changes between these then it won't be notified about it. We've seen this wedge compaction tasks, which can cause unbounded layer file buildup and read amplification. ## Summary of changes Use `watch::Receiver::wait_for()` to check both the current and new tenant states.	2025-02-28 17:00:22 +00:00
Alex Chi Z.	f79ee0bb88	fix(storcon): loop in chaos injection (#11004 ) ## Problem Somehow the previous patch loses the loop in the chaos injector function so everything will only run once. https://github.com/neondatabase/neon/pull/10934 ## Summary of changes Add back the loop. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-28 15:49:15 +00:00
Vlad Lazar	23fb8053c5	storcon: soft disable SK heartbeats (#11041 ) ## Problem JWT tokens aren't in place, so all SK heartbeats fail. This is equivalent to a wait before applying the PS heartbeats and makes things more flaky. ## Summary of Changes Add a flag that skips loading SKs from the db on start-up and at runtime.	2025-02-28 15:49:09 +00:00
Conrad Ludgate	d9ced89ec0	feat(proxy): require TLS to compute if prompted by cplane (#10717 ) https://github.com/neondatabase/cloud/issues/23008 For TLS between proxy and compute, we are using an internally provisioned CA to sign the compute certificates. This change ensures that proxy will load them from a supplied env var pointing to the correct file - this file and env var will be configured later, using a kubernetes secret. Control plane responds with a `server_name` field if and only if the compute uses TLS. This server name is the name we use to validate the certificate. Control plane still sends us the IP to connect to as well (to support overlay IP). To support this change, I had to split `host` and `host_addr` into separate fields. We use `host_addr` and bypass `lookup_addr` if possible (which is what happens in production). `host` then is only used for the TLS connection. There's no blocker to merging this. The code paths will not be triggered until the new control plane is deployed and the `enableTLS` compute flag is enabled on a project.	2025-02-28 14:20:25 +00:00
Erik Grinaker	c7ff3c4c9b	safekeeper: downgrade interpreted reader errors (#11034 ) ## Problem This `critical!` could fire on IO errors, which is just noisy. Resolves #11027. ## Summary of changes Downgrade to error, except for decode errors. These could be either data corruption or a bug, but seem worth investigating either way.	2025-02-28 14:06:56 +00:00
Christian Schwarz	7c53fd0d56	refactor(page_service / timeline::handle): the GateGuard need not be a special case (#11030 ) # Changes While working on - https://github.com/neondatabase/neon/pull/7202 I found myself needing to cache another expensive Arc::clone inside the timeline::handle::Cache by wrapping it in another Arc. Before this PR, it seemed like the only expensive thing we were caching was the connection handler tasks' clone of `Arc<Timeline>`. But in fact the GateGuard was another such thing, but it was special-cased in the implementation. So, this refactoring PR de-special-cases the GateGuard. # Performance With this PR we are doing strictly _fewer_ operations per `Cache::get`. The reason is that we wrap the entire `Types::Timeline` into one Arc. Before this PR, it was a separate Arc around the Arc<Timeline> and one around the Arc<GateGuard>. With this PR, we avoid an allocation per cached item, namely, the separate Arc around the GateGuard. This PR does not change the amount of shared mutable state. So, all in all, it should be a net positive, albeit probably not noticeable with our small non-NUMA instances and generally high CPU usage per request. # Reviewing To understand the refactoring logistics, look at the changes to the unit test types first. Then read the improved module doc comment. Then the remaining changes. In the future, we could rename things to be even more generic. For example, `Types::TenantMgr` could really be a `Types::Resolver`. And `Types::Timeline` should, to avoid constant confusion in the doc comment, be called `Types::Cached` or `Types::Resolved`. Because the `handle` module, after this PR, really doesn't care that we're using it for storing Arc's and GateGuards. Then again, specificity is sometimes more useful than being generic. And writing the module doc comment in a totally generic way would probably also be more confusing than helpful.	2025-02-28 12:31:52 +00:00
a-masterov	7607686f25	Make test extensions upgrade work with absent images (#11036 ) ## Problem CI does not pass for the compute release due to the absence of some images ## Summary of changes Now we use the images from the old non-compute releases for non-compute images	2025-02-28 11:16:22 +00:00
Vlad Lazar	0d6d58bd3e	pageserver: make heatmap layer download API more cplane friendly (#10957 ) ## Problem We intend for cplane to use the heatmap layer download API to warm up timelines after unarchival. It's tricky for them to recurse in the ancestors, and the current implementation doesn't work well when unarchiving a chain of branches and warming them up. ## Summary of changes * Add a `recurse` flag to the API. When the flag is set, the operation recurses into the parent timeline after the current one is done. * Be resilient to warming up a chain of unarchived branches. Let's say we unarchived `B` and `C` from the `A -> B -> C` branch hierarchy. `B` got unarchived first. We generated the unarchival heatmaps and stashed them in `A` and `B`. When `C` was unarchived, it dropped its unarchival heatmap since `A` and `B` already had one. If `C` needed layers from `A` and `B`, it was out of luck. Now, when choosing whether to keep an unarchival heatmap we look at its end LSN. If it's more inclusive than what we currently have, keep it.	2025-02-28 10:36:53 +00:00
John Spray	55633ebe3a	storcon: enable API passthrough to nonzero shards (#11026 ) ## Problem Storage controller will proxy GETs to pageserver-like tenant/timeline paths through to the pageserver. Usually GET passthroughs make sense to go to shard 0, e.g. if you want to list timelines. But sometimes you really want to know about a particular shard, e.g. reading its cache state or similar. ## Summary of changes - Accept shard IDs as well as tenant IDs in the passthrough route - Refactor node lookup to take a shard ID and make the tenant ID case a layer on top of that. This is one more lock take-drop during these requests, but it's not particularly expensive and these requests shouldn't be terribly frequent This is not immediately used by anything, but will be there any time we want to e.g. do a pass-through query to check the warmth of a tenant cache on a particular shard or somesuch.	2025-02-28 08:42:08 +00:00
Heikki Linnakangas	a4b2009800	compute_ctl: Refactor, moving spec_apply functions to spec_apply.rs (#11006 ) Seems nice to have the function and all its subroutines in the same source file.	2025-02-27 20:13:06 +00:00
JC Grünhage	66d5fe7f5b	Merge pull request #11023 from neondatabase/rc/release-proxy/2025-02-27 Proxy release 2025-02-27	2025-02-27 19:10:58 +01:00
Alex Chi Z.	ab1f22b7d1	fix(pageserver): correctly access layer map in gc-compaction (#11021 ) ## Problem layer_map access was unwrapped. It might return an error during shutdown. ## Summary of changes Propagate the layer_map access error back to the compaction loop. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-27 16:26:55 +00:00
github-actions[bot]	a1b9528757	Proxy release 2025-02-27	2025-02-27 16:18:42 +00:00
JC Grünhage	7ed236e17e	fix(ci): push prod container images again (#11020 ) ## Problem https://github.com/neondatabase/neon/pull/10841 made building compute and neon images optional on releases that don't need them. The `push-<component>-image-prod` jobs had transitive dependencies that were skipped due to that, causing the images not to be pushed to production registries. ## Summary of changes Add `!failure() && !cancelled() &&` to the beginning of the conditions for these jobs to ensure they run even if some of their transitive dependencies are skipped.	2025-02-27 16:16:14 +00:00
Konstantin Knizhnik	e58f264a05	Increase inmem SMGR size for walredo process to 100 pages (#10937 ) ## Problem We see `Inmem storage overflow` in page server logs: https://neondb.slack.com/archives/C033RQ5SPDH/p1740157873114339 The walredo process is using inmem SMGR with storage size limited to 64 pages with a warning watermark of 32 (based on the assumption that XLR_MAX_BLOCK_ID is 32, so a WAL record cannot access more than 32 pages). Actually, this is not true. We can update up to 3 forks for each block (including updates of the FSM and VM forks). ## Summary of changes This PR increases the inmem SMGR size for the walredo process to 100 pages and prints a stack trace in case of overflow. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-02-27 14:31:05 +00:00
Matthias van de Meent	a283edaccf	PS/Prefetch: Use a timeout for reading data from TCP (#10834 ) This reduces pressure on OS TCP buffers, reducing flush times in other systems like PageServer. ## Problem ## Summary of changes	2025-02-27 14:00:18 +00:00
a-masterov	ad37199745	Separate the upgrade tests in timelines (#10974 ) ## Problem We created extensions in a single database. The tests could interfere, i.e., discover some service tables left by other extensions and produce unexpected results. ## Summary of changes The tests are now run in a separate timeline, so only one extension owns the database, which prevents interference.	2025-02-27 13:45:18 +00:00
Erik Grinaker	93b59e65a2	pageserver: remove stale comment (#11016 ) No longer true now that we eagerly notify the compaction loop.	2025-02-27 12:56:28 +00:00
Ivan Efremov	1423bb8aa2	Merge pull request #11011 from neondatabase/rc/release-proxy/2025-02-27 Proxy release 2025-02-27	2025-02-27 13:57:49 +02:00
Christian Schwarz	e35f7758d8	impr(controller_upcall_client): clean up copy-pasta code & add context to retries (#10991 ) Before this PR, re-attach and validate would log the same warning ``` calling control plane generation validation API failed ``` on retry errors. This can be confusing. This PR makes the message generically valid for any upcall and adds additional tracing spans to capture context. Along the way, clean up some copy-pasta variable naming. refs - https://github.com/neondatabase/neon/issues/10381#issuecomment-2684755827 --------- Co-authored-by: Alexander Lakhin <alexander.lakhin@neon.tech>	2025-02-27 10:59:43 +00:00
Peter Bendel	3a3d62dc4f	Bodobolero/test cum stats persistence (#10995 ) ## Problem So far cumulative statistics have not been persisted when Neon scales to zero (suspends endpoint). With PR https://github.com/neondatabase/neon/pull/6560 the cumulative statistics should now survive endpoint restarts and correctly trigger the auto-vacuum and auto-analyze maintenance. So far we did not have a testcase that validates that improvement in our dev cloud environment with a real project. ## Summary of changes Introduce testcase `test_cumulative_statistics_persistence` in the benchmarking workflow running daily to verify: - The cumulative statistics are correctly persisted across restarts. - Cumulative statistics are important to persist across restarts because they are used when auto-vacuum and auto-analyze trigger conditions are met. - The test performs the following steps: - Seed a new project using pgbench - insert tuples that by themselves are not enough to trigger auto-vacuum - suspend the endpoint - resume the endpoint - insert additional tuples that by themselves are not enough to trigger auto-vacuum but in combination with the previous tuples are - verify that autovacuum is triggered by the combination of tuples inserted before and after endpoint suspension ## Test run https://github.com/neondatabase/neon/actions/runs/13546879714/job/37860609089#step:6:282	2025-02-27 10:45:13 +00:00
Arpad Müller	a22be5af72	Migrate the last crates to edition 2024 (#10998 ) Migrates the remaining crates to edition 2024. We like to stay on the latest edition if possible. There are no functional changes, however some code changes had to be done to accommodate the edition's breaking changes. Like the previous migration PRs, this is comprised of three commits: * the first does the edition update and makes `cargo check`/`cargo clippy` pass. we had to update bindgen to make its output [satisfy the requirements of edition 2024](https://doc.rust-lang.org/edition-guide/rust-2024/unsafe-extern.html) * the second commit does a `cargo fmt` for the new style edition. * the third commit reorders imports as a one-off change. As before, it is entirely optional. Part of #10918	2025-02-27 09:40:40 +00:00
Christian Schwarz	f09843ef17	refactor(pageserver): propagate RequestContext to layer downloads (#11001 ) For some reason the layer download API never fully got `RequestContext`-infected. This PR fixes that as a precursor to - https://github.com/neondatabase/neon/issues/6107	2025-02-27 09:26:25 +00:00
JC Grünhage	c92a36740b	fix(ci): support PR-on-top-of-PR usecase again (#11013 ) ## Problem https://github.com/neondatabase/neon/pull/10841 broke CI on PRs that aren't based on main or a release branch but want to merge into another PR. ## Summary of changes Replace `run-kind=pr-main` with `run-kind=pr`, so that all PRs that aren't release PRs are treated equally.	2025-02-27 09:05:15 +00:00
Arseny Sher	8b86cd1154	safekeeper: follow membership configuration rules (#10781 ) ## Problem safekeepers must ignore walproposer messages with non matching membership conf. ## Summary of changes Make safekeepers reject vote request, proposer elected and append request messages with non matching generation. Switch to the configuration in the greeting message if it is higher. In passing, fix one comment and WAL truncation. Last part of https://github.com/neondatabase/neon/issues/9965	2025-02-27 06:13:30 +00:00
github-actions[bot]	332f064a42	Proxy release 2025-02-27	2025-02-27 00:17:57 +00:00
Heikki Linnakangas	c50b38ab72	compute_ctl: Fix comment on start_postgres (#11005 ) The comment was woefully outdated and outright wrong. It applied a long time ago (before commit `e5cc2f92c4` to be precise), but nowadays the function just launches postgres and waits until it starts accepting connections. The other things the comment talked about are done in other functions.	2025-02-26 23:38:45 +00:00
Fedor Dikarev	4f4a3910d0	fix error (Line: 74, Col: 26): Unexpected value 'false' (#10999 ) ## Problem Check neon with extra platform builds is failing on main with: ``` The template is not valid. .github/workflows/neon_extra_builds.yml (Line: 74, Col: 26): Unexpected value 'false' ``` https://github.com/neondatabase/neon/actions/runs/13549634905 ## Summary of changes Use `fromJson()` to have `false` as boolean value. thanks to @skyzh for pointing out the issue	2025-02-26 19:54:46 +00:00
Alex Chi Z.	11aab9f0de	fix(pageserver): further stabilize gc-compaction tests (#10975 ) ## Problem Yet another source of flakiness for https://github.com/neondatabase/neon/issues/10517 ## Summary of changes The test scenario we want to create is that we have an image layer in index_part and then overwrite it, so we have to ensure it gets persisted in index_part by doing a force checkpoint. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-26 19:50:10 +00:00
Heikki Linnakangas	5cfdb1244f	compute_ctl: Add OTEL tracing to incoming HTTP requests and startup (#10971 ) We lost this with the switch to axum for the HTTP server. Add it back. In addition to just resurrecting the functionality we had before, pass the tracing context of the /configure HTTP request to the start_postgres operation that runs in the main thread. This way, the 'start_postgres' and all its sub-spans like getting the basebackup become children of the HTTP request span. This allows end-to-end tracing of a compute start, all the way from the proxy to the SQL queries executed by compute_ctl as part of compute startup.	2025-02-26 19:27:16 +00:00
Arseny Sher	643a48210f	safekeeper: exclude API (#10757 ) ## Problem https://github.com/neondatabase/neon/pull/10241 added configuration switch endpoint, but it didn't delete timeline if node was excluded. ## Summary of changes Add separate /exclude API endpoint which similarly accepts membership configuration where sk is supposed to be excluded. Implementation deletes the timeline locally. Some more small related tweaks: - make mconf switch API PUT instead of POST as it is idempotent; - return 409 if switch was refused instead of 200 with requested & current; - remove unused was_active flag from delete response; - remove meaningless _force suffix from delete functions names; - reuse timeline.rs delete_dir function in timelines_global_map instead of its own copy. part of https://github.com/neondatabase/neon/issues/9965	2025-02-26 19:26:33 +00:00
Arseny Sher	c1a040447d	walproposer: send valid timeline_start_lsn in v2 (#10994 ) ## Problem https://github.com/neondatabase/neon/pull/10647 dropped timeline_start_lsn from protocol messages as it can be taken from term history. In v2 0 was sent in the placeholder. However, until safekeepers are deployed with that PR they still use the value, setting timeline_start_lsn to 0, which confuses WAL reading; problem appears only when compute includes 10647 but safekeepers don't. ref https://neondb.slack.com/archives/C04DGM6SMTM/p1740577649644269?thread_ts=1740572363.541619&cid=C04DGM6SMTM ## Summary of changes Send real value instead of 0 in v2.	2025-02-26 17:38:44 +00:00
Alex Chi Z.	30f3be9840	fix(test): reduce number of relations in test_tx_abort_with_many_relations (#10997 ) ## Problem I see a lot of timeout errors, which indicates that this test is too slow. It seems that creating relations is fast, but the subsequent truncating step is slow. ## Summary of changes Reduce number of relations for now, and investigate later. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-26 17:19:14 +00:00
JC Grünhage	8dfa8f0b94	feat(ci): don't build storage on compute-releases and vice versa (#10841 ) ## Problem Release CI is slow, because we're doing unnecessary work, for example building compute images on storage releases and vice versa. ## Summary of changes - Extract tag generation into reusable workflow and extend it with fetching of previous component releases - Don't build neon images on compute releases and don't build compute images on proxy and storage releases - Reuse images from previous releases for tests on branches where we don't build those images ## Open questions - We differentiate between `TAG` and `COMPUTE_TAG` in a few places, but we don't differentiate between storage and proxy releases. Since they use the same image, this will continue to work, but I'm not sure this is what we want.	2025-02-26 17:17:26 +00:00
Alex Chi Z.	a138a6de9b	fix(pageserver): correctly handle collect_keyspace errors (#10976 ) ## Problem ref https://github.com/neondatabase/neon/issues/10927 ## Summary of changes * Implement `is_critical` and `is_cancel` over `CompactionError`. * Revisit all places that use `CollectKeyspaceError` to ensure they are handled correctly. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-26 17:09:50 +00:00
Arpad Müller	14347630a4	ancestor detach: delete hardlinked layers on error (#10977 ) Delete layers that we have hardlinked so far when there is an error in `remote_copy`. This prevents a retry of the ancestor detach from stumbling over already present layer files: the hardlink would fail with an error. If there is a crash, we already clean up during the timeline attach: we loop over all layer files and purge all layers that are not referenced by the `index_part.json`. Make sure to hold the timeline gate to prevent races with detach&attach&read from the layer file. These cleanups aren't completely enough however, as there is code after `prepare` as well. To handle errors there, we add a special case for `AlreadyExists` errors during the hardlink, where we check if the layer is an orphan, and if yes, we delete it from local disk. That is ideally not the case we hit, as it is less clear in that scenario where the layer came from, but it provides good defense in depth. Related #10729 Fixes #10970	2025-02-26 16:11:15 +00:00
Erik Grinaker	86b9703f06	pageserver: set `SO_KEEPALIVE` on the page service socket (#10992 ) ## Problem If the client connection goes dead without an explicit close (e.g. due to network infrastructure dropping the connection) then we currently won't detect it for a long time, which may e.g. block GetPage flushes and keep the task running. Touches https://github.com/neondatabase/cloud/issues/23515. ## Summary of changes Enable `SO_KEEPALIVE` on the page service socket, to enable periodic TCP keepalive probes. These are configured via Linux sysctls, which will be deployed separately. By default, the first probe is sent after 2 hours, so this doesn't have a practical effect until we change the sysctls.	2025-02-26 14:36:05 +00:00
Arseny Sher	01581f3af5	safekeeper: drop json_ctrl (#10722 ) ## Problem json_ctrl.rs is an obsolete attempt to have tests with fine control of feeding messages into safekeeper superseded by desim framework. ## Summary of changes Drop it.	2025-02-26 13:32:37 +00:00
Arpad Müller	f94286f0c9	Upgrade compute_tools and compute_api to edition 2024 (#10983 ) Updates `compute_tools` and `compute_api` crates to edition 2024. We like to stay on the latest edition if possible. There are no functional changes, however some code changes had to be done to accommodate the edition's breaking changes. The PR has three commits: * the first commit updates the named crates to edition 2024 and appeases `cargo clippy` by changing code. * the second commit performs a `cargo fmt` that does some minor changes (not many) * the third commit performs a cargo fmt with nightly options to reorder imports as a one-time thing. it's completely optional, but I offer it here for the compute team to review it. I'd like to hear opinions about the third commit, if it's wanted and felt worth the diff or not. I think most attention should be put onto the first commit. Part of #10918	2025-02-26 13:12:26 +00:00
Fedor Dikarev	c2a768086d	add credentials for pulling containers for the jobs (#10987 ) Ref: https://github.com/neondatabase/cloud/issues/24939 ## Problem I found that we are missing authorization for some container jobs, which makes them use anonymous pulls. It's not an issue for now, with high enough limits, but it could become one when the new limits are introduced in DockerHub (10 pulls / hour) ## Summary of changes - add credentials for the jobs that run in containers	2025-02-26 12:50:06 +00:00
Vlad Lazar	622a9def6f	tests: use generated record lsn instead of hardcoded one (#10990 ) ... and start the initial reader with the correct lsn Closes https://github.com/neondatabase/neon/issues/10978	2025-02-26 12:47:13 +00:00
Arpad Müller	26bda17551	storcon: use the SchedulingPolicy enum in SafekeeperPersistence (#10897 ) We don't want to serialize to/from string all the time, so use `SchedulingPolicy` in `SafekeeperPersistence` via the use of a wrapper. Stacked atop #10891	2025-02-26 12:12:50 +00:00
Folke Behrens	0d36f52a6c	proxy: Record and export user-agent header (#10955 ) neondatabase/cloud#24464	2025-02-26 11:39:34 +00:00
Heikki Linnakangas	40ad42d556	Silence "sudo: unable to resolve host" messages at compute startup (#10985 )	2025-02-26 10:10:05 +00:00
Heikki Linnakangas	e452f2a5a3	Remove some redundant log lines at postgres startup (#10958 )	2025-02-26 10:06:42 +00:00
Heikki Linnakangas	43b109af69	compute_ctl: Add more detailed tracing spans to startup subroutines (#10979 ) In local dev environment, these steps take around 100 ms, and they are in the critical path of a compute startup on a compute pool hit. I don't know if it's like that in production, but as first step, add tracing spans to the functions so that they can be measured more easily.	2025-02-26 09:51:07 +00:00
Arthur Petukhovsky	3684162d9f	Bump vm-builder v0.37.1 -> v0.42.2 (#10981 ) Bump version to pick up changes introduced in https://github.com/neondatabase/autoscaling/pull/1286 It's better to have a compute release for this change first, because: - vm-runner changes kernel loglevel from 7 to 6 - vm-builder has a change to bring it back to 7 after startup Previous update: https://github.com/neondatabase/neon/pull/10015	2025-02-26 09:19:19 +00:00
Arpad Müller	920040e402	Update storage components to edition 2024 (#10919 ) Updates storage components to edition 2024. We like to stay on the latest edition if possible. There are no functional changes, however some code changes had to be done to accommodate the edition's breaking changes. The PR has two commits: * the first commit updates storage crates to edition 2024 and appeases `cargo clippy` by changing code. I accidentally ran the formatter on some files that had other edits. * the second commit performs a `cargo fmt` I would recommend a closer review of the first commit and a less close review of the second one (as it just runs `cargo fmt`). part of https://github.com/neondatabase/neon/issues/10918	2025-02-25 23:51:37 +00:00
Konstantin Knizhnik	dc975d554a	Increment getpage histogram in prefetch_lookup (#10965 ) ## Problem PR https://github.com/neondatabase/neon/pull/10442 added prefetch_lookup function. It changed handling of getpage requests in compute. Before: 1. Lookup in LFC (return if found) 2. Register prefetch buffer 3. Wait prefetch result (increment getpage_hist) Now: 1. Lookup prefetch ring (return if prefetch request is already completed) 2. Lookup in LFC (return if found) 3. Register prefetch buffer 4. Wait prefetch result (increment getpage_hist) So if the prefetch result is already available, the getpage histogram is not incremented. This caused failures of some test_throughtput benchmarks: https://neondb.slack.com/archives/C033RQ5SPDH/p1740425527249499 ## Summary of changes Increment getpage histogram in `prefetch_lookup` Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-02-25 19:51:38 +00:00
Suhas Thalanki	d05606252d	fix: only showing LSN for static computes in `neon endpoint list` (#10931 ) ## Problem `neon endpoint list` shows a different LSN than what the state of the replica is. This is mainly down to what we define as LSN in this output. If we define it as the LSN that a compute was started with, it only makes sense to show it for static computes. ## Summary of changes Removed the output of `last_record_lsn` for primary/hot standby computes. Closes: https://github.com/neondatabase/neon/issues/5825 --------- Co-authored-by: Tristan Partin <tristan@neon.tech>	2025-02-25 19:26:14 +00:00
Alex Chi Z.	c69ebb4486	fix(ci): extend timeout to 75min (#10963 ) 60min is not enough for debug builds Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-25 17:37:23 +00:00
a-masterov	1fb2faab5b	Rename the patch files for the semver test (#10966 ) ## Problem The patch for `semver` extensions relies on `PG_VERSION` environment variable. The files were named without the letter `v` so script cannot find them. ## Summary of changes The patch files were renamed.	2025-02-25 16:00:43 +00:00
Alex Chi Z.	015092d259	feat(pageserver): add automatic trigger for gc-compaction (#10798 ) ## Problem part of https://github.com/neondatabase/neon/issues/9114 ## Summary of changes Add the auto trigger for gc-compaction. It computes two values: L1 size and L2 size. When L1 size >= initial trigger threshold, we will trigger an initial gc-compaction. When l1_size / l2_size >= gc_compaction_ratio_percent, we will trigger the "tiered" gc-compaction. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-25 14:50:39 +00:00
Alex Chi Z.	b7fcf2c7a7	test(pageserver): add reldir v2 into tests (#10750 ) ## Problem We have `test_perf_many_relations` but it only runs on remote clusters, and we cannot directly modify tenant config. Therefore, I patched one of the current tests to benchmark relv2 performance. close https://github.com/neondatabase/neon/issues/9986 ## Summary of changes * Add `v1/v2` selector to `test_tx_abort_with_many_relations`. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-25 14:50:22 +00:00
Erik Grinaker	8deeddd4f0	pageserver: ignore `CollectKeySpaceError::Cancelled` during compaction (#10968 ) This pops up a few times during deployment. Not sure why it fires without `self.cancel` being cancelled, but could be e.g. ancestor timelines or sth.	2025-02-25 14:49:41 +00:00
a-masterov	f78ac44748	Use the Dockerfile COPY instead of docker cp (#10943 ) ## Problem We use `docker cp` to copy the files required for the extension tests now. It causes problems if we run older images with the newer source tree. ## Summary of changes Copying the files was moved to the compute Dockerfile.	2025-02-25 12:44:06 +00:00
Folke Behrens	f4fefd9f2f	pre-commit: Switch to cargo fmt to handle per-crate editions (#10969 ) cargo knows what edition each crate uses.	2025-02-25 12:29:27 +00:00
Konstantin Knizhnik	8f82c661d4	Move neon_pgstat_file_size_limit to the extension (#10959 ) ## Problem PG14 uses a separate backend for the stats collector, which has no access to shared memory. Since the AUX mechanism requires access to shared memory, persisting the pgstat.stat file is not supported on PG14, and so there is no definition of the `neon_pgstat_file_size_limit` variable. This makes it impossible to provide the same config for all Postgres versions. ## Summary of changes Move neon_pgstat_file_size_limit to Neon extension. Postgres submodules PR: https://github.com/neondatabase/postgres/pull/587 https://github.com/neondatabase/postgres/pull/588 https://github.com/neondatabase/postgres/pull/589 --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Tristan Partin <tristan@neon.tech>	2025-02-25 12:23:04 +00:00
Arseny Sher	758f597280	compute <-> sk protocol v3 (#10647 ) ## Problem As part of https://github.com/neondatabase/neon/issues/8614 we need to pass membership configurations between compute and safekeepers. ## Summary of changes Add version 3 of the protocol carrying membership configurations. Greeting message in both sides gets full conf, and other messages generation number only. Use protocol bump to include other accumulated changes: - stop packing whole structs on the wire as is; - make the tag u8 instead of u64; - send all ints in network order; - drop proposer_uuid, we can pass it in START_WAL_PUSH and it wasn't much useful anyway. Per message changes, apart from mconf: - ProposerGreeting: tenant / timeline id is sent now as hex cstring. Remove proto version, it is passed outside in START_WAL_PUSH. Remove postgres timeline, it is unused. Reorder fields a bit. - AcceptorGreeting: reorder fields - VoteResponse: timeline_start_lsn is removed. It can be taken from first member of term history, and later we won't need it at all when all timelines will be explicitly created. Vote itself is u8 instead of u64. - ProposerElected: timeline_start_lsn is removed for the same reasons. - AppendRequest: epoch_start_lsn removed, it is known from term history in ProposerElected. Both compute and sk are able to talk v2 and v3 to make rollbacks (in case we need them) easier; neon.safekeeper_proto_version GUC sets the client version. v2 code can be dropped later. So far empty conf is passed everywhere, future PRs will handle them. To test, add param to some tests choosing proto version; we want to test both 2 and 3 until we fully migrate. ref https://github.com/neondatabase/neon/issues/10326 --------- Co-authored-by: Arthur Petukhovsky <petuhovskiy@yandex.ru>	2025-02-25 11:56:05 +00:00
Vlad Lazar	0d9a45a475	safekeeper: invalidate start of interpreted batch on reader resets (#10951 ) ## Problem The interpreted WAL reader tracks the start of the current logical batch. This needs to be invalidated when the reader is reset. This bug caused a couple of WAL gap alerts in staging. ## Summary of changes * Refactor to make it possible to write a reproducer * Add repro unit test * Fix by resetting the start with the reader Related https://github.com/neondatabase/cloud/issues/23935	2025-02-25 10:21:35 +00:00
Vlad Lazar	5d17640944	storcon: send heartbeats concurrently (#10954 ) ## Problem While looking at logs I noticed that heartbeats are sent sequentially. The loop polling the UnorderedSet is at the wrong level of indentation. Instead of polling after we have the full set, we polled after each entry. ## Summary of Changes Poll the UnorderedSet properly.	2025-02-25 09:33:08 +00:00
Erik Grinaker	6621be6b7b	pageserver: tweak slow GetPage logging (#10956 ) ## Problem We recently added slow GetPage request logging. However, this unintentionally included the flush time when logging (which we already have separate logging for). It also logs at WARN level, which is a bit aggressive since we see this fire quite frequently. Follows https://github.com/neondatabase/neon/pull/10906. ## Summary of changes * Only log the request execution time, not the flush time. * Extract a `pagestream_dispatch_batched_message()` helper. * Rename `warn_slow()` to `log_slow()` and downgrade to INFO.	2025-02-24 22:01:14 +00:00
Heikki Linnakangas	565a9e62a1	compute: Disconnect if no response to a pageserver request is received (#10882 ) We've seen some cases in production where a compute doesn't get a response to a pageserver request for several minutes, or even more. We haven't found the root cause for that yet, but whatever the reason is, it seems overly optimistic to think that if the pageserver hasn't responded for 2 minutes, we'd get a response if we just wait patiently a little longer. More likely, the pageserver is dead or there's some kind of a network glitch so that the TCP connection is dead, or at least stuck for a long time. Either way, it's better to disconnect and reconnect. I set the default timeout to 2 minutes, which should be enough for any GetPage request under normal circumstances, even if the pageserver has to download several layer files from remote storage. Make the disconnect timeout configurable. Also make the "log interval", after which we print a message to the log configurable, so that if you change the disconnect timeout, you can set the log timeout correspondingly. The default log interval is still 10 s. The new GUCs are called "neon.pageserver_response_log_timeout" and "neon.pageserver_response_disconnect_timeout". Includes a basic test for the log and disconnect timeouts. Implements issue #10857	2025-02-24 20:16:37 +00:00
Peter Bendel	8fd0f89b94	rename libduckdb.so in pg_duckdb context to avoid conflict with pg_mooncake (#10915 ) ## Problem Introducing pg_duckdb caused a conflict with pg_mooncake. Both use libduckdb.so in different versions. ## Summary of changes - Rename the libduckdb.so to libduckdb_pg_duckdb.so in the context of pg_duckdb so that it doesn't conflict with libduckdb.so referenced by pg_mooncake. - use a version map to rename the duckdb symbols to a version specific name - DUCKDB_1.1.3 for pg_mooncake - DUCKDB_1.2.0 for pg_duckdb For the concept of version maps see - https://www.man7.org/conf/lca2006/shared_libraries/slide19a.html - https://peeterjoot.com/2019/09/20/an-example-of-linux-glibc-symbol-versioning/ - https://akkadia.org/drepper/dsohowto.pdf	2025-02-24 17:50:49 +00:00
JC Grünhage	1f0dea9a1a	feat(ci): push container images to ghcr.io as well (#10945 ) ## Problem There's new rate-limits coming on docker hub. To reduce our reliance on docker hub and the problems the limits are going to cause for us, we want to prepare for this by also pushing our container images to ghcr.io ## Summary of changes Push our images to ghcr.io as well and not just docker hub.	2025-02-24 17:45:23 +00:00
Heikki Linnakangas	40acb0c06d	Fix usage of WaitEventSetWait() with timeout (#10947 ) WaitEventSetWait() returns the number of "events" that happened, and only that many events in the WaitEvent array are updated. When the timeout is reached, it returns 0 and does not modify the WaitEvent array at all. We were reading 'event.events' without checking the return value, which would be uninitialized when the timeout was hit. No test included, as this is harmless at the moment. But this caused the test I'm including in PR #10882 to fail. That PR changes the logic to loop back to retry the PQgetCopyData() call if WL_SOCKET_READABLE was set. Currently, making an extra call to PQconsumeInput() is harmless, but with that change in logic, it turns into a busy-wait.	2025-02-24 17:15:07 +00:00
Arpad Müller	df362de0dd	Reject basebackup requests for archived timelines (#10828 ) For archived timelines, we would like to prevent all non-pageserver issued getpage requests, as users are not supposed to send these. Instead, they should unarchive a timeline before issuing any external read traffic. As this is non-trivial to do, at least prevent launches of new computes, by erroring on basebackup requests for archived timelines. In #10688, we started issuing a warning instead of an error, because an error would mean a stuck project. Now that we can confirm the warning has not been present in the logs for about a week, we can issue errors. Follow-up of #10688 Related: #9548	2025-02-24 16:38:13 +00:00
Alex Chi Z.	5fad4a4cee	feat(storcon): chaos injection of force exit (#10934 ) ## Problem close https://github.com/neondatabase/cloud/issues/24485 ## Summary of changes This patch adds a new chaos injection mode for the storcon. The chaos injector reads the crontab and exits immediately at the configured time. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-24 15:30:21 +00:00
Arpad Müller	fdde58120c	Upgrade proxy crates to edition 2024 (#10942 ) This upgrades the `proxy/` crate as well as the forked libraries in `libs/proxy/` to edition 2024. Also reformats the imports of those forked libraries via: ``` cargo +nightly fmt -p proxy -p postgres-protocol2 -p postgres-types2 -p tokio-postgres2 -- -l --config imports_granularity=Module,group_imports=StdExternalCrate,reorder_imports=true ``` It can be read commit-by-commit: the first commit has no formatting changes, only changes to accommodate the new edition. Part of #10918	2025-02-24 15:26:28 +00:00
Vlad Lazar	459446fcb8	pageserver: include visible layers in heatmaps after unarchival (#10880 ) ## Problem https://github.com/neondatabase/neon/pull/10788 introduced an API for warming up attached locations by downloading all layers in the heatmap. We intend to use it for warming up timelines after unarchival too, but it doesn't work. Any heatmap generated after the unarchival will not include our timeline, so we've lost all those layers. ## Summary of changes Generate a cheeky heatmap on unarchival. It includes all the visible layers. Use that as the `PreviousHeatmap` which inputs into actual heatmap generation. Closes: https://github.com/neondatabase/neon/issues/10541	2025-02-24 15:21:17 +00:00
Alexander Bayandin	17724a19e6	CI(allure-reports): update dependencies and cleanup code (#10794 ) ## Problem There are a bunch of minor improvements that are too small and insignificant as is, so collecting them in one PR. ## Summary of changes - Add runner arch to artifact name to make it easier to distinguish files on S3 ([ref](https://neondb.slack.com/archives/C059ZC138NR/p1739365938371149)) - Use `github.event.pull_request.number` instead of parsing `$GITHUB_EVENT_PATH` file - Update Allure CLI and `allure-pytest`	2025-02-24 15:07:14 +00:00
John Spray	2a5d7e5a78	tests: improve compat test coverage of controller-pageserver interaction (#10848 ) ## Problem We failed to detect https://github.com/neondatabase/neon/pull/10845 before merging, because the tests we run with a matrix of component versions didn't include the ones that did live migrations. ## Summary of changes - Do a live migration during the storage controller smoke test, since this is a pretty core piece of functionality - Apply a compat version matrix to the graceful cluster restart test, since this is the functionality that we most urgently need to work across versions to make deploys work. I expect the first CI run of this to fail, because https://github.com/neondatabase/neon/pull/10845 isn't merged yet.	2025-02-24 12:22:22 +00:00
Conrad Ludgate	fb77f28326	feat(proxy): add direction and private link id to billing export (#10925 ) ref: https://github.com/neondatabase/cloud/issues/23385 Adds a direction flag as well as private-link ID to the traffic reporting pipeline. We do not yet actually count ingress, but we include the flag anyway. I have additionally moved vpce_id string parsing earlier, since we expect it to be utf8 (ascii).	2025-02-24 11:49:11 +00:00
Heikki Linnakangas	a6f315c9c9	Remove unnecessary dependencies to synchronous 'postgres' crate (#10938 ) The synchronous 'postgres' crate is just a wrapper around the async 'tokio_postgres' crate. Some places were unnecessarily using the re-exported NoTls and Error from the synchronous 'postgres' crate, even though they were otherwise using the 'tokio_postgres' crate. Tidy up by using the tokio_postgres types directly.	2025-02-24 09:40:25 +00:00
Alexey Kondratov	df264380b9	fix(compute_ctl): Skip invalid DBs in PerDatabasePhase (#10910 ) ## Problem After refactoring the configuration code to phases, it became a bit fuzzy who filters out DBs that are not present in Postgres, are invalid, or have `datallowconn = false`. The first 2 are important for the DB dropping case, as we could be in operation retry, so DB could be already absent in Postgres or invalid (interrupted `DROP DATABASE`). Recent case: https://neondb.slack.com/archives/C03H1K0PGKH/p1740053359712419 ## Summary of changes Add a common code that filters out inaccessible DBs inside `ApplySpecPhase::RunInEachDatabase`.	2025-02-21 21:50:50 +00:00
Arpad Müller	4bbe75de8c	Update vm_monitor to edition 2024 (#10916 ) Updates `vm_monitor` to edition 2024. We like to stay on the latest edition if possible. There is no functional changes, it's only changes due to the rustfmt edition. part of https://github.com/neondatabase/neon/issues/10918	2025-02-21 20:29:05 +00:00
Tristan Partin	c0c3ed94a9	Fix flaky test_compute_installed_extensions_metric test (#10933 ) There was a race condition with compute_ctl and the metric being collected related to whether the neon extension had been updated or not. compute_ctl will run `ALTER EXTENSION neon UPDATE` on compute start in the postgres database. Fixes: https://github.com/neondatabase/neon/issues/10932 Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-02-21 18:29:48 +00:00
Konstantin Knizhnik	b1d8771d5f	Store prefetch results in LFC cache once as soon as they are received (#10442 ) ## Problem Prefetch is performed locally, so different backends can request the same pages from PS. Such duplicated requests increase the load on the page server and the network traffic. Making prefetch global seems to be very difficult and undesirable, because different queries can access chunks at different speeds. Storing prefetch chunks in LFC will not completely eliminate duplicates, but can minimise such requests. The problem with storing the prefetch result in LFC is that in this case the page is not protected by the shared buffer lock. So we will have to perform extra synchronisation on the LFC side. See: https://neondb.slack.com/archives/C0875PUD0LC/p1736772890602029?thread_ts=1736762541.116949&cid=C0875PUD0LC @MMeent implementation of prewarm: See https://github.com/neondatabase/neon/pull/10312/ ## Summary of changes Use condition variables to synchronize access to LFC entry. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-02-21 16:56:16 +00:00
Vlad Lazar	3e82addd64	storcon: use `Duration` for durations in the storage controller tenant config (#10928 ) ## Problem The storage controller treats durations in the tenant config as strings. These are loaded from the db. The pageserver maps these durations to a seconds only format and we always get a mismatch compared to what's in the db. ## Summary of changes Treat durations as durations inside the storage controller and not as strings. Nothing changes in the cross service API's themselves or the way things are stored in the db. I also added some logging which would have made the investigation a 10min job: 1. Reason for why the reconciliation was spawned 2. Location config diff between the observed and wanted states	2025-02-21 15:45:00 +00:00
John Spray	5e3c234edc	storcon: do more concurrent optimisations (#10929 ) ## Problem Now that we rely on the optimisation logic to handle fixing things up after some tenants are in the wrong AZ (e.g. after node failure), it's no longer appropriate to treat optimisations as an ultra-low-priority task. We used to reflect that low priority with a very low limit on concurrent execution, such that we would only migrate 2 things every 20 seconds. ## Summary of changes - Increase MAX_OPTIMIZATIONS_EXEC_PER_PASS from 2 to 16 - Increase MAX_OPTIMIZATIONS_PLAN_PER_PASS from 8 to 64. Since we recently gave user-initiated actions their own semaphore, this should not risk starving out API requests.	2025-02-21 14:58:49 +00:00
Arpad Müller	ff3819efc7	storcon: infrastructure for safekeeper specific JWT tokens (#10905 ) Safekeepers only respond to requests with the per-token scope, or the `safekeeperdata` JWT scope. Therefore, add infrastructure in the storage controller for safekeeper JWTs. Also, rename the ambiguous `jwt_token` to `pageserver_jwt_token`. Part of #9011 Related: https://github.com/neondatabase/cloud/issues/24727	2025-02-21 11:02:02 +00:00
Arpad Müller	f927ae6e15	Return a json response in scheduling_policy handler (#10904 ) Return an empty json response in the `scheduling_policy` handler. This prevents errors of the form: ``` Error: receive body: error decoding response body: EOF while parsing a value at line 1 column 0 ``` when setting the scheduling policy via the `storcon_cli`. part of #9011.	2025-02-21 11:01:57 +00:00
Heikki Linnakangas	61d385caea	Split plv8 build into two parts (#10920 ) Plv8 consists of two parts: 1. the V8 engine, which is built from vendored sources, and 2. the PostgreSQL extension. Split those into two separate steps in the Dockerfile. The first step doesn't need any PostgreSQL sources or any other files from the neon repository, just the build tools and the upstream plv8 sources. Use the build-deps image as the base for that step, so that the layer can be cached and doesn't need to be rebuilt every time. This is worthwhile because the V8 build takes a very long time.	2025-02-21 09:03:54 +00:00
Alex Chi Z.	c214c32d3f	fix(pageserver): avoid creating empty job for gc-compaction (#10917 ) ## Problem This should be one last fix for https://github.com/neondatabase/neon/issues/10517. ## Summary of changes If a keyspace is empty, we might produce a gc-compaction job which covers no layer files. We should avoid generating such jobs so that the gc-compaction image layer can cover the full key range. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-21 01:56:30 +00:00
Erik Grinaker	9b42d1ce1a	pageserver: periodically log slow ongoing getpage requests (#10906 ) ## Problem We don't have good observability for "stuck" getpage requests. Resolves https://github.com/neondatabase/cloud/issues/23808. ## Summary of changes Log a periodic warning (every 30 seconds) if GetPage request execution is slow to complete, to aid in debugging stuck GetPage requests. This does not cover response flushing (we have separate logging for that), nor reading the request from the socket and batching it (expected to be insignificant and not straightforward to handle with the current protocol). This costs 95 nanoseconds on the happy path when awaiting a `tokio::task::yield_now()`: ``` warn_slow/enabled=false time: [45.716 ns 46.116 ns 46.687 ns] warn_slow/enabled=true time: [141.53 ns 141.83 ns 142.18 ns] ```	2025-02-20 21:38:42 +00:00
Konstantin Knizhnik	0b9b391ea0	Fix calculation of prefetch ring position to fit in-flight request in resized ring buffer (#10899 ) ## Problem Refer https://github.com/neondatabase/neon/issues/10885 The wait position in the ring buffer, used to restrict the number of in-flight requests, is not correctly calculated. ## Summary of changes Update condition and remove redundant assertion Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-02-20 20:21:44 +00:00
Heikki Linnakangas	3f376e44ba	Temporarily disable pg_duckdb (#10909 ) It clashed with pg_mooncake This is the same as the hotfix #10908 , but for the main branch, to keep the release and main branches in sync. In particular, we don't want to accidentally revert this temporary fix, if we cut a new release from main.	2025-02-20 19:48:06 +00:00
Arpad Müller	5b81a774fc	Update rust to 1.85.0 (#10914 ) We keep the practice of keeping the compiler up to date, pointing to the latest release. This is done by many other projects in the Rust ecosystem as well. [Announcement blog post](https://blog.rust-lang.org/2025/02/20/Rust-1.85.0.html). Prior update was in #10618.	2025-02-20 19:16:22 +00:00
Konstantin Knizhnik	bd335fa751	Fix prototype of CheckPointReplicationState (#10907 ) ## Problem Accidentally removed (void) from the definition of the `CheckPointReplicationState` function ## Summary of changes Restore function prototype. https://github.com/neondatabase/postgres/pull/585 https://github.com/neondatabase/postgres/pull/586 --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-02-20 18:29:14 +00:00
Vlad Lazar	34996416d6	pageserver: guard against WAL gaps in the interpreted protocol (#10858 ) ## Problem The interpreted SK <-> PS protocol does not guard against gaps (neither does the Vanilla one, but that's beside the point). ## Summary of changes Extend the protocol to include the start LSN of the PG WAL section from which the records were interpreted. Validation is enabled via a config flag on the pageserver and works as follows: Case 1: `raw_wal_start_lsn` is smaller than the requested LSN There can't be gaps here, but we check that the shard received records which it hasn't seen before. Case 2: `raw_wal_start_lsn` is equal to the requested LSN This is the happy case. No gap and nothing to check Case 3: `raw_wal_start_lsn` is greater than the requested LSN This is a gap. To make Case 3 work I had to bend the protocol a bit. We read record chunks of WAL which aren't record aligned and feed them to the decoder. The picture below shows a shard which subscribes at a position somewhere within Record 2. We already have a wal reader which is below that position so we wait to catch up. We read some wal in Read 1 (all of Record 1 and some of Record 2). The new shard doesn't need Record 1 (it has already processed it according to the starting position), but we read past it's starting position. When we do Read 2, we decode Record 2 and ship it off to the shard, but the starting position of Read 2 is greater than the starting position the shard requested. This looks like a gap. ![image](https://github.com/user-attachments/assets/8aed292e-5d62-46a3-9b01-fbf9dc25efe0) To make it work, we extend the protocol to send an empty `InterpretedWalRecords` to shards if the WAL the records originated from ends the requested start position. On the pageserver, that just updates the tracking LSNs in memory (no-op really). This gives us a workaround for the fake gap. As a drive by, make `InterpretedWalRecords::next_record_lsn` mandatory in the application level definition. 
It's always included. Related: https://github.com/neondatabase/cloud/issues/23935	2025-02-20 17:49:05 +00:00
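The three validation cases in the commit message above can be sketched as a small classifier. This is an illustrative sketch only, not the pageserver's code: LSNs are modeled as plain integers, and the names `WalStartCheck` and `classify_wal_start` are hypothetical.

```python
from enum import Enum


class WalStartCheck(Enum):
    """Outcome of comparing the raw WAL start LSN with the requested LSN."""
    OVERLAP = "overlap"  # Case 1: no gap possible, but expect unseen records
    EXACT = "exact"      # Case 2: happy case, nothing to check
    GAP = "gap"          # Case 3: WAL starts after the requested position


def classify_wal_start(raw_wal_start_lsn: int, requested_lsn: int) -> WalStartCheck:
    """Classify a batch by where its source WAL section starts (hypothetical helper)."""
    if raw_wal_start_lsn < requested_lsn:
        return WalStartCheck.OVERLAP
    if raw_wal_start_lsn == requested_lsn:
        return WalStartCheck.EXACT
    return WalStartCheck.GAP
```

Per the message, the empty `InterpretedWalRecords` batch exists so that the "fake gap" produced by non-record-aligned reads does not end up classified as Case 3.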
Tristan Partin	d571553d8a	Remove hacks in compute_ctl related to compute ID (#10751 )	2025-02-20 17:31:52 +00:00
Tristan Partin	f7474d3f41	Remove forward compatibility hacks related to compute HTTP servers (#10797 ) These hacks were added to appease the forward compatibility tests and can be removed. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-02-20 17:31:42 +00:00
Dmitrii Kovalkov	e808e9432a	storcon: use https for pageservers (#10759 ) ## Problem Storage controller uses unsecure http for pageserver API. Closes: https://github.com/neondatabase/cloud/issues/23734 Closes: https://github.com/neondatabase/cloud/issues/24091 ## Summary of changes - Add an optional `listen_https_port` field to storage controller's Node state and its API (RegisterNode/ListNodes/etc). - Allow updating `listen_https_port` on node registration to gradually add https port for all nodes. - Add `use_https_pageserver_api` CLI option to storage controller to enable https. - Pageserver doesn't support https for now and always reports `https_port=None`. This will be addressed in follow-up PR.	2025-02-20 17:16:04 +00:00
Anastasia Lubennikova	7c7180a79d	Fix deadlock in drop_subscriptions_before_start (#10806 ) ALTER SUBSCRIPTION requires AccessExclusive lock which conflicts with iteration over pg_subscription when multiple databases are present and operations are applied concurrently. Fix by explicitly locking pg_subscription in the beginning of the transaction in each database. ## Problem https://github.com/neondatabase/cloud/issues/24292	2025-02-20 17:14:16 +00:00
Erik Grinaker	07bee60037	pageserver: make compaction walredo errors critical (#10884 ) Mark walredo errors as critical too. Also pull the pattern matching out into the outer `match`. Follows #10872.	2025-02-20 12:08:54 +00:00
Erik Grinaker	f7edcf12e3	pageserver: downgrade ephemeral layer roll wait message (#10883 ) We already log a message for this during the L0 flush, so the additional message is mostly noise.	2025-02-20 12:08:30 +00:00
a-masterov	1d9346f8b7	Add pg_repack test (#10638 ) ## Problem We don't test `pg_repack` ## Summary of changes The test for `pg_repack` is added	2025-02-20 10:05:01 +00:00
Folke Behrens	c962f2b447	Merge pull request #10903 from neondatabase/rc/release-proxy/2025-02-20 Proxy release 2025-02-20	2025-02-20 10:37:47 +01:00
Konstantin Knizhnik	a6d8640d6f	Persist pg_stat information in pageserver (#6560 ) ## Problem Statistics are saved in a local file and so are lost on compute restart. Persist them in the pageserver using the same AUX file mechanism used for replication slots. See more about the motivation in https://neondb.slack.com/archives/C04DGM6SMTM/p1703077676522789 ## Summary of changes Persist the statistics file using the AUX mechanism. Postgres PRs: https://github.com/neondatabase/postgres/pull/547 https://github.com/neondatabase/postgres/pull/446 https://github.com/neondatabase/postgres/pull/445 Related to #6684 and #6228 Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-02-20 06:38:55 +00:00
github-actions[bot]	446b3f9d28	Proxy release 2025-02-20	2025-02-20 06:02:01 +00:00
Arpad Müller	bb7e244a42	storcon: fix heartbeats timing out causing a panic (#10902 ) Fix an issue caused by PR https://github.com/neondatabase/neon/pull/10891: we introduced the concept of timeouts for heartbeats, where we would hang up on the other side of the oneshot channel if a timeout happened (future gets cancelled, receiver is dropped). This hang up would make the heartbeat task panic when it did obtain the response, as we unwrap the result of the result sending operation. The panic would lead to the heartbeat task panicking itself, which, according to logs, is then the last sign of life we see of that process invocation. I'm not sure what brings down the process, in theory tokio [should continue](https://docs.rs/tokio/latest/tokio/runtime/enum.UnhandledPanic.html#variant.Ignore), but idk. Alternative to #10901.	2025-02-19 23:04:05 +00:00
Arpad Müller	787b98f8f2	storcon: log all safekeepers marked as offline (#10898 ) Doing this to help debugging offline safekeepers. Part of https://github.com/neondatabase/neon/issues/9011	2025-02-19 20:45:22 +00:00
Vlad Lazar	f148d71d9b	test: disable background heatmap uploads and downloads in cold migration test (#10895 ) ## Problem Background heatmap uploads and downloads were blocking the ones done manually by the test. ## Summary of changes Disable Background heatmap uploads and downloads for the cold migration test. The test does them explicitly.	2025-02-19 19:30:17 +00:00
JC Grünhage	aad817d806	refactor(ci): use reusable push-to-container-registry workflow for pinning the build-tools image (#10890 ) ## Problem Pinning build tools still replicated the ACR/ECR/Docker Hub login and pushing, even though we have a reusable workflow for this. Was mentioned as a TODO in https://github.com/neondatabase/neon/pull/10613. ## Summary of changes Reuse `_push-to-container-registry.yml` for pinning the build-tools images.	2025-02-19 17:26:09 +00:00
Erik Grinaker	0b3db74c44	libs: remove unnecessary regex in `pprof::symbolize` (#10893 ) `pprof::symbolize()` used a regex to strip the Rust monomorphization suffix from generic methods. However, the `backtrace` crate can do this itself if formatted with the `:#` flag. Also tighten up the code a bit.	2025-02-19 17:11:12 +00:00
Arpad Müller	9ba2a87e69	storcon: sk heartbeat fixes (#10891 ) This PR does the following things: * The initial heartbeat round blocks the storage controller from becoming online again. If all safekeepers are unresponsive, this can cause storage controller startup to be very slow. The original intent of #10583 was that heartbeats don't affect normal functionality of the storage controller. So add a short timeout to prevent it from impeding storcon functionality. * Fix the URL of the utilization endpoint. * Don't send heartbeats to safekeepers which are decommissioned. Part of https://github.com/neondatabase/neon/issues/9011 context: https://neondb.slack.com/archives/C033RQ5SPDH/p1739966807592589	2025-02-19 16:57:11 +00:00
Alex Chi Z.	1f9511dbd9	feat(pageserver): yield image creation to L0 compactions across timelines (#10877 ) ## Problem A simpler version of https://github.com/neondatabase/neon/pull/10812 ## Summary of changes Image layer creation will be preempted by L0 accumulated on other timelines. We stop image layer generation if there's a pending L0 compaction request. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-19 15:10:12 +00:00
Erik Grinaker	aab5482fd5	storcon: add CPU/heap profiling endpoints (#10894 ) Adds CPU/heap profiling for storcon. Also fixes allowlists to match on the path only, since profiling endpoints take query parameters. Requires #10892 for heap profiling.	2025-02-19 14:43:29 +00:00
Erik Grinaker	3720cf1c5a	storcon: use jemalloc (#10892 ) ## Problem We'd like to enable CPU/heap profiling for storcon. This requires jemalloc. ## Summary of changes Use jemalloc as the global allocator, and enable heap sampling for profiling.	2025-02-19 14:20:51 +00:00
Erik Grinaker	0453eaf65c	pageserver: reduce default `compaction_upper_limit` to 20 (#10889 ) ## Problem We've seen the previous default of 50 cause OOMs. Compacting many L0 layers at once now has limited benefit, since the cost is mostly linear anyway. This is already being reduced to 20 in production settings. ## Summary of changes Reduce `DEFAULT_COMPACTION_UPPER_LIMIT` to 20. Once released, let's remove the config overrides.	2025-02-19 14:12:05 +00:00
Heikki Linnakangas	2d96134a4e	Remove unused dependencies (#10887 ) Per cargo machete.	2025-02-19 14:09:01 +00:00
JC Grünhage	e52e93797f	refactor(ci): use variables for AWS account IDs (#10886 ) ## Problem Our AWS account IDs are copy-pasted all over the place. A wrong paste might only be caught late if we hardcode them, but will get flagged instantly by actionlint if we access them from github actions variables. Resolves https://github.com/neondatabase/neon/issues/10787, follow-up for https://github.com/neondatabase/neon/pull/10613. ## Summary of changes Access AWS account IDs using Github Actions variables.	2025-02-19 12:34:41 +00:00
Erik Grinaker	aa115a774c	storcon: eagerly attempt autosplits (#10849 ) ## Problem Autosplits are crucial for bulk ingest performance. However, autosplits were only attempted when there was no other pending work. This could cause e.g. mass AZ affinity violations following Pageserver restarts to starve out autosplits for hours. Resolves #10762. ## Summary of changes Always attempt autosplits in the background reconciliation loop, regardless of other pending work.	2025-02-19 09:01:02 +00:00
Peter Bendel	2f0d6571a9	add a variant to ingest benchmark with shard-splitting disabled (#10876 ) ## Problem we measure ingest performance for a few variants (stripe-sizes, pre-sharded, shard-splitted). However some phenomena (e.g. related to L0 compaction) in PS can be better observed and optimized with un-sharded tenants. ## Summary of changes - Allow to create projects with a policy that disables sharding (`{"scheduling": "Essential"}`) - add a variant to ingest_benchmark that uses that policy for the new project ## Test run https://github.com/neondatabase/neon/actions/runs/13396325970	2025-02-19 08:43:53 +00:00
a-masterov	7199919f04	Fix the problems discovered in the upgrade test (#10826 ) ## Problem The nightly test discovered problems in the extensions upgrade test. 1. `PLv8` has different versions on PGv17 and PGv16 and a different test set, which was not implemented correctly [sample](https://github.com/neondatabase/neon/actions/runs/13382330475/job/37372930271) 2. The same for `semver` [sample](https://github.com/neondatabase/neon/actions/runs/13382330475/job/37372930017) 3. `pgtap` interfered with the other tests, e.g. tables, created by other extensions caused the tests to fail. ## Summary of changes The discovered problems were fixed. 1. The tests list for `PLv8` is now generated using the original Makefile 2. The patches for `semver` are now split for PGv16 and PGv17. 3. `pgtap` is being tested in a separate database now. --------- Co-authored-by: Mikhail Kot <mikhail@neon.tech>	2025-02-19 06:40:09 +00:00
Alex Chi Z.	a4e3989c8d	fix(pageserver): make repartition error critical (#10872 ) ## Problem Read errors during repartition should be a critical error. ## Summary of changes <del>We only have one call site</del> We have two call sites of `repartition` where one of them is during the initial image upload optimization and another is during image layer creation, so I added a `critical!` here instead of inside `collect_keyspace`. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-18 20:19:23 +00:00
Peter Bendel	9d074db18d	Use link to cross-service-endpoint dashboard in allure reports and benchmarking workflow logs (#10874 ) ## Problem We have links to deprecated dashboards in our logs Example https://github.com/neondatabase/neon/actions/runs/13382454571/job/37401983608#step:8:348 ## Summary of changes Use link to cross service endpoint instead. Example: https://github.com/neondatabase/neon/actions/runs/13395407925/job/37413056148#step:7:345	2025-02-18 19:54:21 +00:00
Alex Chi Z.	538ea03f73	feat(pageserver): allow read path debug in getpagelsn API (#10748 ) ## Problem The usual workflow for me to debug read path errors in staging is: download the tenant to my laptop, import, and then run some read tests. With this patch, we can do this directly over staging pageservers. ## Summary of changes * Add a new `touchpagelsn` API that does a page read but does not return page info back. * Allow read from latest record LSN from get/touchpagelsn * Add read_debug config in the context. * The read path will read the context config to decide whether to enable read path tracing or not. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-18 18:54:53 +00:00
Erik Grinaker	cb8060545d	pageserver: don't log noop image compaction (#10873 ) ## Problem We log image compaction stats even when no image compaction happened. This is logged every 10 seconds for every timeline. ## Summary of changes Only log when we actually performed any image compaction.	2025-02-18 17:49:01 +00:00
JC Grünhage	9151d3a318	feat(ci): notify storage oncall if deploy job fails on release branch (#10865 ) ## Problem If the deploy job on the release branch doesn't succeed, the preprod deployment will not have happened. It was requested that this triggers a notification in https://github.com/neondatabase/neon/issues/10662. ## Summary of changes If we're on the release branch and the deploy job doesn't end up in "success", notify storage oncall on slack.	2025-02-18 17:20:03 +00:00
Anastasia Lubennikova	381115b68e	Add pgaudit and pgauditlogtofile extensions (#10763 ) to compute image. This commit doesn't enable anything yet. It is a preparatory work for enabling audit logging in computes.	2025-02-18 16:32:32 +00:00
Vlad Lazar	1a69a8cba7	storage: add APIs for warming up location after cold migrations (#10788 ) ## Problem We lack an API for warming up attached locations based on the heatmap contents. This is problematic in two places: 1. If we manually migrate and cut over while the secondary is still cold 2. When we re-attach a previously offloaded tenant ## Summary of changes https://github.com/neondatabase/neon/pull/10597 made heatmap generation additive across migrations, so we won't clobber it after a cold migration. This allows us to implement: 1. An endpoint for downloading all missing heatmap layers on the pageserver: `/v1/tenant/:tenant_shard_id/timeline/:timeline_id/download_heatmap_layers`. Only one such operation per timeline is allowed at any given time. The granularity is tenant shard. 2. An endpoint to the storage controller to trigger the downloads on the pageserver: `/v1/tenant/:tenant_shard_id/timeline/:timeline_id/download_heatmap_layers`. This works both at tenant and tenant shard level. If an unsharded tenant id is provided, the operation is started on all shards, otherwise only the specified shard. 3. A storcon cli command. Again, tenant and tenant-shard level granularities are supported. Cplane will call into storcon and trigger the downloads for all shards. When we want to rescue a migration, we will use storcon cli targeting the specific tenant shard. Related: https://github.com/neondatabase/neon/issues/10541	2025-02-18 16:09:06 +00:00
Alex Chi Z.	ed98f6d57e	feat(pageserver): log lease request (#10832 ) ## Problem To investigate https://github.com/neondatabase/cloud/issues/23650 ## Summary of changes We log lease requests to see why there are clients accessing things below gc_cutoff. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-18 16:06:39 +00:00
Alex Chi Z.	f9a063e2e9	test(pageserver): fix test_pageserver_gc_compaction_idempotent (#10833 ) ## Problem ref https://github.com/neondatabase/neon/issues/10517 ## Summary of changes For some reason the job split algorithm decides to have a different image coverage range for the two compactions before/after restart. So we remove the subcompaction key range and let it generate an image covering the full range, which should make the test more stable. Also slightly tuned the logging span. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-18 16:06:20 +00:00
Heikki Linnakangas	f36ec5c84b	chore(compute): Postgres 17.4, 16.8, 15.12 and 14.17 (#10868 ) Update all minor versions. No conflicts. Postgres repository PRs: - https://github.com/neondatabase/postgres/pull/584 - https://github.com/neondatabase/postgres/pull/583 - https://github.com/neondatabase/postgres/pull/582 - https://github.com/neondatabase/postgres/pull/581	2025-02-18 15:56:43 +00:00
Alexander Bayandin	274cb13293	test_runner: fix mismatch versions tests on linux (#10869 ) ## Problem Tests with mixed-version binaries always use the latest binaries on CI ([an example](https://neon-github-public-dev.s3.amazonaws.com/reports/pr-10848/13378137061/index.html#suites/8fc5d1648d2225380766afde7c428d81/1ccefc4cfd4ef176/)): The versions of new `storage_broker` and old `pageserver` are the same: `b45254a5605f6fdafdf475cdd3e920fe00898543`. This affects only Linux; on macOS the versions are mixed correctly. ## Summary of changes - Use hardlinks instead of symlinks to create a directory with mixed-version binaries	2025-02-18 15:52:00 +00:00
Alex Chi Z.	290f007b8e	Revert "feat(pageserver): repartition on L0-L1 boundary (#10548 )" (#10870 ) This reverts commit `443c8d0b4b`. ## Problem We observe a massive amount of compaction errors. ## Summary of changes If the tenant did not write any L1 layers (i.e., they accumulate L0 layers where number of them is below L0 threshold), image creation will always fail. Therefore, it's not correct to simply use the disk_consistent_lsn or L0/L1 boundary for the image creation.	2025-02-18 15:43:33 +00:00
Alexander Lakhin	29e4ca351e	Pass asan/ubsan options to pg_dump/pg_restore started by fast_import (#10866 )	2025-02-18 15:41:20 +00:00
Arpad Müller	caece02da7	move pull_timeline to safekeeper_api and add SafekeeperGeneration (#10863 ) Preparations for a successor of #10440: * move `pull_timeline` to `safekeeper_api` and add it to `SafekeeperClient`. we want to do `pull_timeline` on any creations that we couldn't do initially. * Add a `SafekeeperGeneration` type instead of relying on a type alias. we want to maintain a safekeeper specific generation number now in the storcon database. A separate type is important to make it impossible to mix it up with the tenant's pageserver specific generation number. We absolutely want to avoid that for correctness reasons. If someone mixes up a safekeeper and pageserver id (both use the `NodeId` type), that's bad but there is no wrong generations flying around. part of #9011	2025-02-18 14:02:22 +00:00
Arseny Sher	d36baae758	Add gc_blocking and restore latest_gc_cutoff in openapi spec (#10867 ) ## Problem gc_blocking is missing in the tenant info, but cplane wants to use it. Also, https://github.com/neondatabase/neon/pull/10707/ removed latest_gc_cutoff from the spec, renaming it to applied_gc_cutoff. Temporarily get it back until cplane migrates. ## Summary of changes Add them. ref https://neondb.slack.com/archives/C03438W3FLZ/p1739877734963979	2025-02-18 13:57:12 +00:00
Alexander Lakhin	f81259967d	Add test to make sure sanitizers really work when expected (#10838 )	2025-02-18 13:23:18 +00:00
Conrad Ludgate	719ec378cd	fix(local_proxy): discard all in tx (#10864 ) ## Problem `discard all` cannot run in a transaction (even if implicit) ## Summary of changes Split up the query into two, we don't need transaction support.	2025-02-18 08:54:20 +00:00
Alexander Bayandin	27241f039c	test_runner: fix `neon_local` usage for version mismatch tests (#10859 ) ## Problem Tests with mixed versions of binaries always pick up new versions if services are started using `neon_local`. ## Summary of changes - Set `neon_local_binpath` along with `neon_binpath` and `pg_distrib_dir` for tests with mixed versions	2025-02-17 20:29:14 +00:00
Heikki Linnakangas	811506aaa2	fast_import: Use rust s3 client for uploading (#10777 ) This replaces the use of the awscli utility. awscli binary is massive, it added about 200 MB to the docker image size, while the s3 client was already a dependency so using that is essentially free, as far as binary size is concerned. I implemented a simple upload function that tries to keep 10 uploads going in parallel. I believe that's the default behavior of the "aws s3 sync" command too.	2025-02-17 20:07:31 +00:00
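The "keep 10 uploads going in parallel" idea from the message above can be sketched with a bounded worker pool. This is only an illustration in Python, not the Rust code from the PR; `upload_all` and the `upload_one` callback are hypothetical names, and a real caller would supply a function that performs the S3 upload.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def upload_all(
    paths: Iterable[str],
    upload_one: Callable[[str], str],
    max_in_flight: int = 10,
) -> List[str]:
    """Run upload_one over all paths with at most max_in_flight running at once.

    Results come back in input order; the first exception raised by
    upload_one propagates to the caller.
    """
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        return list(pool.map(upload_one, paths))
```

A bounded pool keeps the network busy without opening one connection per file, which matters when a directory holds many small files.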
Heikki Linnakangas	2884917bd4	compute: Allow postgres user to power off the VM also on <= v16 (#10860 ) I did this for debian bookworm variant in PR #10710, but forgot to update the "bullseye" dockerfile that is used to build older PostgreSQL versions.	2025-02-17 19:42:57 +00:00
Tristan Partin	b34598516f	Warn when PR may require regenerating cloud PG settings (#10229 ) These generated Postgres settings JSON files can get out of sync, causing the control plane to reject updates to an endpoint or project's Postgres settings. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-02-17 19:02:16 +00:00
Erik Grinaker	84bbe87d60	pageserver: tweak `pageserver_layers_per_read` histogram resolution (#10847 ) ## Problem The current `pageserver_layers_per_read` histogram buckets don't represent the current reality very well. For the percentiles we care about (e.g. p50 and p99), we often see fairly high read amp, especially during ingestion, and anything below 4 can be considered very good. ## Summary of changes Change the per-timeline read amp histogram buckets to `[4.0, 8.0, 16.0, 32.0, 64.0, 128.0, 256.0]`.	2025-02-17 17:24:17 +00:00
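To see what the new bucket boundaries mean for an individual observation, here is a small sketch that maps a layers-per-read value to the first bucket whose upper bound contains it, following the usual Prometheus "less than or equal" bucket convention. The helper name `bucket_for` is hypothetical; only the bucket list comes from the message above.

```python
import math
from bisect import bisect_left

# Upper bounds taken from the commit message.
BUCKETS = [4.0, 8.0, 16.0, 32.0, 64.0, 128.0, 256.0]


def bucket_for(layers_visited: float) -> float:
    """Return the smallest bucket upper bound >= the observation (+Inf if none)."""
    idx = bisect_left(BUCKETS, layers_visited)
    return BUCKETS[idx] if idx < len(BUCKETS) else math.inf
```

With these bounds, every read that visits four or fewer layers lands in the first bucket, matching the message's view that anything below 4 is already very good.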
Vlad Lazar	b10890b81c	tests: compare digests in test_peer_recovery (#10853 ) ## Problem Test fails when comparing the first WAL segment because the system id in the segment header is different. The system id is not consistently set correctly since segments are usually inited on the safekeeper sync step with sysid 0. ## Summary of changes Compare timeline digests instead. This skips the header. Closes https://github.com/neondatabase/neon/issues/10596	2025-02-17 16:32:24 +00:00
Conrad Ludgate	3204efc860	chore(proxy): use specially named prepared statements for type-checking (#10843 ) I was looking into https://github.com/neondatabase/serverless/issues/144, and I recall previous cases where proxy would trigger these prepared statements, which would conflict with other statements prepared by our client downstream. Because of that, and also to aid in debugging, I've made sure all prepared statements that proxy needs to make have specific names that likely won't conflict and that make it clear in an error log if it's our statements that are causing issues.	2025-02-17 16:19:57 +00:00
Tristan Partin	da79cc5eee	Add neon.extension_server_{connect,request}_timeout (#10801 ) Instead of hardcoding the request timeout, let's make it configurable as a PGC_SUSET GUC. Additionally, add a connect timeout GUC. Although the extension server runs on the compute, it is always best to keep operations from hanging. Better to present a timeout error to the user than a stuck backend. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-02-17 15:40:43 +00:00
John Spray	39d42d846a	pageserver_api: fix decoding old-version TimelineInfo (#10845 ) ## Problem In #10707 some new fields were introduced in TimelineInfo. I forgot that we do not only use TimelineInfo for encoding, but also decoding when the storage controller calls into a pageserver, so this broke some calls from controller to pageserver while in a mixed-version state. ## Summary of changes - Make new fields have default behavior so that they are optional	2025-02-17 15:04:47 +00:00
Arpad Müller	0330b61729	Azure SDK: use neon branch again (#10844 ) Originally I wanted to switch back to the `neon` branch before merging #10825, but I forgot to do it. Do it in a separate PR now. No actual change of the source code, only changes the branch name (so that maybe in a few weeks we can delete the temporary branch `arpad/neon-rebase`).	2025-02-17 14:59:01 +00:00
Erik Grinaker	8a2d95b4b5	pageserver: appease unused lint on macOS (#10846 ) ## Problem `SmgrOpFlushInProgress::measure()` takes a `socket_fd` argument which is only used on Linux. This causes linter warnings on macOS. Touches #10823. ## Summary of changes Add a noop use of `socket_fd` on non-Linux branch.	2025-02-17 14:41:22 +00:00
Konstantin Knizhnik	8c6d133d31	Fix out-of-boundaries access in addSHLL function (#10840 ) ## Problem See https://github.com/neondatabase/neon/issues/10839 The rho(x,b) function returns values in the range [1,b+1] and addSHLL tries to store them in an array of size b+1. ## Summary of changes Subtract 1 from the value returned by rho --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-02-17 12:54:17 +00:00
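The off-by-one can be illustrated with a toy model. The definition of `rho` below is the common HyperLogLog one (1-based position of the first set bit among the top `b` bits, or `b + 1` if none is set) and is an assumption; the source only states the value range. `bucket_index` is a hypothetical helper standing in for the fixed indexing.

```python
def rho(x: int, b: int) -> int:
    """1-based position of the first set bit among b bits, scanning from the
    most significant; returns b + 1 when none of the b bits is set.
    (Assumed definition; only the range [1, b + 1] is given in the source.)"""
    for i in range(b):
        if x & (1 << (b - 1 - i)):
            return i + 1
    return b + 1


def bucket_index(x: int, b: int) -> int:
    """Map rho's range [1, b + 1] onto valid indices [0, b] of a size b + 1 array."""
    return rho(x, b) - 1
```

Without the subtraction, an input with no set bits would index position `b + 1` of an array whose last valid index is `b`.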
Arpad Müller	81f08d304a	Rebase Azure SDK and apply newest patch (#10825 ) The [upstream PR](https://github.com/Azure/azure-sdk-for-rust/pull/1997) has been merged with some changes to use threads with async, so apply them to the neon specific fork to be nice to the executor (before, we had the state as of filing of that PR). Also, rebase onto the latest version of upstream's `legacy` branch. current SDK commits: [link](https://github.com/neondatabase/azure-sdk-for-rust/commits/neon-2025-02-14) now: [link](https://github.com/neondatabase/azure-sdk-for-rust/commits/arpad/neon-refresh) Prior update was in #10790	2025-02-17 10:44:44 +00:00
Peter Bendel	d566d604cf	feat(compute) add pg_duckdb extension v0.3.1 (#10829 ) We want to host pg_duckdb (starting with v0.3.1) on Neon. This PR replaces https://github.com/neondatabase/neon/pull/10350 which was for the older pg_duckdb v0.2.0 Use cases - faster OLAP queries - access to data lake files (e.g. parquet) on S3 buckets from Neon PostgreSQL Because Neon does not provide the superuser role to Neon customers, we need to grant some additional permissions to neon_superuser: Note: some grants that we require are already granted to `PUBLIC` in the new release of pg_duckdb [here](`3789e4c509/sql/pg_duckdb--0.2.0--0.3.0.sql (L1054)`) ```sql GRANT ALL ON FUNCTION duckdb.install_extension(TEXT) TO neon_superuser; GRANT ALL ON TABLE duckdb.extensions TO neon_superuser; GRANT ALL ON SEQUENCE duckdb.extensions_table_seq TO neon_superuser; ```	2025-02-17 10:43:16 +00:00
Alexander Lakhin	f739773edd	Fix format of milliseconds in pytest output (#10836 ) ## Problem The timestamp prefix of pytest log lines contains milliseconds without leading zeros, so values of milliseconds less than 100 printed incorrectly. For example: ``` 2025-02-15 12:02:51.997 INFO [_internal.py:97] 127.0.0.1 - - ... 2025-02-15 12:02:52.4 INFO [_internal.py:97] 127.0.0.1 - - ... 2025-02-15 12:02:52.9 INFO [_internal.py:97] 127.0.0.1 - - ... 2025-02-15 12:02:52.23 INFO [_internal.py:97] 127.0.0.1 - - ... ``` ## Summary of changes Fix log_format for pytest so that milliseconds are printed with leading zeros.	2025-02-16 04:59:52 +00:00
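The zero-padding fix can be reproduced with Python's standard `logging` module, where `%(msecs)03d` pads the millisecond field to three digits. The format string below is an illustrative assumption; the actual `log_format` changed in the PR is not shown here.

```python
import logging

# %(msecs)03d pads milliseconds with leading zeros: 4 -> "004", 23 -> "023".
formatter = logging.Formatter(
    fmt="%(asctime)s.%(msecs)03d %(levelname)s %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
)


def render(msecs: int) -> str:
    """Format a demo record with its millisecond field forced to `msecs`."""
    record = logging.LogRecord(
        "demo", logging.INFO, "demo.py", 1, "hello", None, None
    )
    record.msecs = msecs  # overridden for demonstration only
    return formatter.format(record)
```

With an unpadded `%(msecs)d`, a timestamp at 4 ms past the second renders as `.4`, which reads like 400 ms and sorts incorrectly as text.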
Heikki Linnakangas	2dae0612dd	fast_import: Fix shared_buffers setting (#10837 ) In commit `9537829ccd` I made shared_buffers be derived from the system's available RAM. However, I failed to remove the old hard-coded shared_buffers=10GB setting, so shared_buffers was set twice. Oopsie.	2025-02-16 00:01:19 +00:00
Alexander Bayandin	2ec8dff6f7	CI(build-and-test-locally): set `session-timeout` for pytest (#10831 ) ## Problem Sometimes, a regression test run gets stuck (taking more than 60 minutes) and is killed by GitHub's `timeout-minutes` without leaving any traces in the test results database. I find no correlation between this and either the build type, the architecture, or the Postgres version. See: https://neonprod.grafana.net/goto/nM7ih7cHR?orgId=1 ## Summary of changes - Bump `pytest-timeout` to the version that supports `--session-timeout` - Set `--session-timeout` to (timeout-minutes - 10 minutes) * 60 seconds in an attempt to stop tests gracefully and generate test reports before they are forcibly stopped by the stricter `timeout-minutes` limit.	2025-02-15 10:34:11 +00:00
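As a worked example of the formula above, the sketch below computes the value passed to `--session-timeout` from a job's `timeout-minutes`. The function name and the guard against non-positive results are assumptions added for illustration.

```python
def session_timeout_seconds(timeout_minutes: int, grace_minutes: int = 10) -> int:
    """(timeout-minutes - 10 minutes) * 60 seconds, per the commit message.

    Raises ValueError if the job timeout leaves no room for the grace period
    (this guard is an assumption, not part of the described change).
    """
    remaining = timeout_minutes - grace_minutes
    if remaining <= 0:
        raise ValueError("timeout_minutes must exceed grace_minutes")
    return remaining * 60
```

For a 60-minute job this gives 50 minutes, or 3000 seconds, leaving 10 minutes for pytest to wind down and write its reports.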
Alex Chi Z.	ae091c6913	feat(pageserver): store reldir in sparse keyspace (#10593 ) ## Problem Part of https://github.com/neondatabase/neon/issues/9516 ## Summary of changes This patch adds support for storing reldir in the sparse keyspace. All of the logic is guarded with the `rel_size_v2_enabled` flag, so if it's set to false, the code path is exactly the same as what's currently in prod. Note that we did not persist the `rel_size_v2_enabled` flag and the logic around it will be implemented in the next patch. (i.e., what if we enabled it, restart the pageserver, and then it gets set to false? we should still read from v2 using the rel_size_v2_migration_status in the index_part). The persistence logic I'll implement in the next patch will disallow switching from v2->v1 via config item. I also refactored the metrics so that they can work with the new reldir store. However, this metric was not correctly computed for reldirs before (see the comments). With the refactor, the value will be computed only when we have an initial value for the reldir size. The refactor keeps the incorrectness of the computation when there is more than 1 database. For the tests, we currently run all the tests with v2, and I'll set it to false and add some v2-specific tests before merging, probably also v1->v2 migration tests. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-14 20:31:54 +00:00
Christian Schwarz	a32e8871ac	compute/pageserver: correlation of logs through backend PID (via `application_name`) (#10810 ) This PR makes compute set the `application_name` field to the PG backend process PID which is also included in each compute log line. This allows correlation of Pageserver connection logs with compute logs in a way that was guesswork before this PR. In future, we can switch for a more unique identifier for a page_service session. Refs - discussion in https://neondb.slack.com/archives/C08DE6Q9C3B/p1739465208296169?thread_ts=1739462628.361019&cid=C08DE6Q9C3B - fixes https://github.com/neondatabase/neon/issues/10808	2025-02-14 20:11:42 +00:00
Christian Schwarz	9177312ba6	basebackup: use `Timeline::get` for `get_rel` instead of `get_rel_page_at_lsn` (#10476 ) I noticed the opportunity to simplify here while working on https://github.com/neondatabase/neon/pull/9353 . The only difference is the zero-fill behavior: if one reads past rel size, `get_rel_page_at_lsn` returns a zeroed page whereas `Timeline::get` returns an error. However, the `endblk` is at most rel size large, because `nblocks` is eq `get_rel_size`, see a few lines above this change. We're using the same LSN (`self.lsn`) for everything, so there is no chance of non-determinism. Refs: - Slack discussion debating correctness: https://neondb.slack.com/archives/C033RQ5SPDH/p1737457010607119	2025-02-14 17:57:18 +00:00
Christian Schwarz	b992a1a62a	page_service: include socket send & recv queue length in slow flush log message (#10823 ) # Summary In - https://github.com/neondatabase/neon/pull/10813 we added slow flush logging but it didn't log the TCP send & recv queue length. This PR adds that data to the log message. I believe the implementation to be safe & correct right now, but it's brittle and thus this PR should be reverted or improved upon once the investigation is over. Refs: - stacked atop https://github.com/neondatabase/neon/pull/10813 - context: https://neondb.slack.com/archives/C08DE6Q9C3B/p1739464533762049?thread_ts=1739462628.361019&cid=C08DE6Q9C3B - improves https://github.com/neondatabase/neon/issues/10668 - part of https://github.com/neondatabase/cloud/issues/23515 # How It Works The trouble is two-fold: 1. getting to the raw socket file descriptor through the many Rust types that wrap it and 2. integrating with the `measure()` function Rust wraps it in types to model file descriptor lifetimes and ownership, and usually one can get access using `as_raw_fd()`. However, we `split()` the stream and the resulting [`tokio::io::WriteHalf`](https://docs.rs/tokio/latest/tokio/io/struct.WriteHalf.html) . Check the PR commit history for my attempts to do it. My solution is to get the socket fd before we wrap it in our protocol types, and to store that fd in the new `PostgresBackend::socket_fd` field. I believe it's safe because the lifetime of `PostgresBackend::socket_fd` value == the lifetime of the `TcpStream` that we wrap and store in `PostgresBackend::framed`. Specifically, the only place that close()s the socket is the `impl Drop for TcpStream`. I think the protocol stack calls `TcpStream::shutdown()`, but, that doesn't `close()` the file descriptor underneath. Regarding integration with the `measure()` function, the trouble is that `flush_fut` is currently a generic `Future` type. So, we just pass in the `socket_fd` as a separate argument. 
A clean implementation would convert the `pgb_writer.flush()` to a named future that provides an accessor for the socket fd while not being polled. I tried (see PR history), but failed to break through the `WriteHalf`. # Testing Tested locally by running ``` ./target/debug/pagebench get-page-latest-lsn --num-clients=1000 --queue-depth=1000 ``` in one terminal, waiting a bit, then ``` pkill -STOP pagebench ``` then wait for slow logs to show up in `pageserver.log`. Pick one of the slow log message's port pairs, e.g., `127.0.0.1:39500`, and then checking sockstat output ``` ss -ntp \| grep '127.0.0.1:39500' ``` to ensure that send & recv queue size match those in the log message.	2025-02-14 16:20:07 +00:00
Gleb Novikov	3d7a32f619	fast import: allow restore to provided connection string (#10407 ) Within https://github.com/neondatabase/cloud/issues/22089 we decided that would be nice to start with import that runs dump-restore into a running compute (more on this [here](https://www.notion.so/neondatabase/2024-Jan-13-Migration-Assistant-Next-Steps-Proposal-Revised-17af189e004780228bdbcad13eeda93f?pvs=4#17af189e004780de816ccd9c13afd953)) We could do it by writing another tool or by extending existing `fast_import.rs`, we chose the latter. In this PR, I have added optional `restore_connection_string` as a cli arg and as a part of the json spec. If specified, the script will not run postgres and will just perform restore into provided connection string. TODO: - [x] fast_import.rs: - [x] cli arg in the fast_import.rs - [x] encoded connstring in json spec - [x] simplify `fn main` a little, take out too verbose stuff to some functions - [ ] ~~allow streaming from dump stdout to restore stdin~~ will do in a separate PR - [ ] ~~address https://github.com/neondatabase/neon/pull/10251#pullrequestreview-2551877845~~ will do in a separate PR - [x] tests: - [x] restore with cli arg in the fast_import.rs - [x] restore with encoded connstring in json spec in s3 - [ ] ~~test with custom dbname~~ will do in a separate PR - [ ] ~~test with s3 + pageserver + fast import binary~~ https://github.com/neondatabase/neon/pull/10487 - [ ] ~~https://github.com/neondatabase/neon/pull/10271#discussion_r1923715493~~ will do in a separate PR neondatabase/cloud#22775 --------- Co-authored-by: Eduard Dykman <bird.duskpoet@gmail.com>	2025-02-14 16:10:06 +00:00
Christian Schwarz	fac5db3c8d	page_service: emit periodic log message while response flush is slow (#10813 ) The logic might seem a bit intricate / over-optimized, but I recently spent time benchmarking this code path in the context of a nightly pagebench regression (https://github.com/neondatabase/cloud/issues/21759) and I want to avoid regressing it any further. Ideally would also log the socket send & recv queue length like we do on the compute side in - https://github.com/neondatabase/neon/pull/10673 But that is proving difficult due to the Rust abstractions that wrap the socket fd. Work in progress on that is happening in - https://github.com/neondatabase/neon/pull/10823 Regarding production impact, I am worried at a theoretical level that the additional logging may cause a downward spiral in the case where a pageserver is slow to flush because there is not enough CPU. The logging would consume more CPU and thereby slow down flushes even more. However, I don't think this matters practically speaking. # Refs - context: https://neondb.slack.com/archives/C08DE6Q9C3B/p1739464533762049?thread_ts=1739462628.361019&cid=C08DE6Q9C3B - fixes https://github.com/neondatabase/neon/issues/10668 - part of https://github.com/neondatabase/cloud/issues/23515 # Testing Tested locally by running ``` ./target/debug/pagebench get-page-latest-lsn --num-clients=1000 --queue-depth=1000 ``` in one terminal, waiting a bit, then ``` pkill -STOP pagebench ``` then wait for slow logs to show up in `pageserver.log`. To see that the completion log message is logged, run ``` pkill -CONT pagebench ```	2025-02-14 14:37:03 +00:00
John Spray	a82a6631fd	storage controller: prioritize reconciles for user-facing operations (#10822 ) ## Problem Some situations may produce a large number of pending reconciles. If we experience an issue where reconciles are processed more slowly than expected, that can prevent us responding promptly to user requests like tenant/timeline CRUD. This is a cleaner implementation of the hotfix in https://github.com/neondatabase/neon/pull/10815 ## Summary of changes - Introduce a second semaphore for high priority tasks, with configurable units (default 256). The intent is that in practical situations these user-facing requests should never have to wait. - Use the high priority semaphore for: tenant/timeline CRUD, and shard splitting operations. Use normal priority for everything else.	2025-02-14 13:25:43 +00:00
Folke Behrens	da7496e1ee	proxy: Post-refactor + future clippy lint cleanup (#10824 ) * Clean up deps and code after logging and binary refactor * Also include future clippy lint cleanup	2025-02-14 12:34:09 +00:00
a-masterov	646e011c4d	Tests the test-upgrade scripts themselves (#10664 ) ## Problem We run the compatibility tests only if we are upgrading the extension. An accidental code change may break the test itself, so we have to check this code as well. ## Summary of changes The test is scheduled once a day to save time and resources. --------- Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2025-02-14 11:41:57 +00:00
Arpad Müller	878c1c7110	offload_timeline: check if the timeline is archived on HasChildren error (#10776 ) PR #10305 makes sure that there is no actual race, i.e. we will never attempt to offload a timeline that has just been unarchived, or similar. However, if a timeline has been unarchived and has children that are unarchived too, we will get an error log line. Such races can occur as in compaction we check if the timeline can be offloaded way before we attempt to offload it: the result might change in the meantime. This patch checks if the delete guard can't be obtained because the timeline has unarchived children, and if yes, it does another check for whether the timeline has become unarchived or not. If it is unarchived, it just prints an info log msg and integrates itself into the error suppression logic of the compaction calling into it. If you squint at it really closely, there is still a possible race in which we print an error log, but this one is unlikely because the timeline and its children need to be archived right after the check for whether the timeline has any unarchived children, and right before the check whether the timeline is archived. Archival involves a network operation while nothing between these two checks does that, so it's very unlikely to happen in real life. https://github.com/neondatabase/cloud/issues/23979#issuecomment-2651265729	2025-02-14 10:21:50 +00:00
John Spray	996f0a3753	storcon: fix eliding parameters from proxied URL labels (#10817 ) ## Problem We had code for stripping IDs out of proxied paths to reduce cardinality of metrics, but it was only stripping out tenant IDs, and leaving in timeline IDs and query parameters (e.g. LSN in lsn->timestamp lookups). ## Summary of changes - Use a more general regex approach. There is still some risk that a future pageserver API might include a parameter in `/the/path/`, but we control that API and it is not often extended. We will also alert on metrics cardinality in staging so that if we made that mistake we would notice.	2025-02-14 09:57:19 +00:00
Konstantin Knizhnik	8bdb1828c8	Perform seqscan to fill LFC chunks with data so that on-disk file size includes size of table (#10775 ) ## Problem See https://github.com/neondatabase/neon/issues/10755 Random access pattern of pgbench leaves sparse chunks, which makes the on-disk size of file.cache unpredictable. ## Summary of changes Perform seqscan to fill LFC chunks with data so that on-disk file size includes size of table. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-02-14 08:19:56 +00:00
Alexander Bayandin	3e8bf2159d	CI(build-and-test): run `benchmarks` after `deploy` job (#10791 ) ## Problem `benchmarks` is a long-running and non-blocking job. If, on Staging, a deploy-blocking job fails, restarting it requires cancelling any running `benchmarks` jobs, which is a waste of CI resources and requires a couple of extra clicks for a human to do. Ref: https://neondb.slack.com/archives/C059ZC138NR/p1739292995400899 ## Summary of changes - Run `benchmarks` after `deploy` job - Handle `benchmarks` run in PRs with `run-benchmarks` label but without `deploy` job.	2025-02-13 22:03:47 +00:00
Arpad Müller	5008324460	Fix utilization URL and ensure heartbeats work (#10811 ) There was a typo in the name of the utilization endpoint URL, fix it. Also, ensure that the heartbeat mechanism actually works. Related: #10583, #10429 Part of #9011	2025-02-13 20:55:53 +00:00
Christian Schwarz	487f3202fe	pageserver read path: abort on fatal IO errors from disk / filesystem (#10786 ) Before this PR, an IO error returned from the kernel, e.g., due to a bad disk, would get bubbled up, all the way to a user-visible query failing. This is against the IO error handling policy that we have established and is hence being rectified in this PR. [[(internal Policy document link)]](`bef44149f7/src/storage/handling_io_and_logical_errors.md (L33-L35)`) The practice on the write path seems to be that we call `maybe_fatal_err()` or `fatal_err()` fairly high up the stack, regardless of whether std::fs, tokio::fs, or VirtualFile is used to perform the IO. For the read path, I chose a centralized approach in this PR by checking for errors as close to the kernel interface as possible. I believe this is better for long-term consistency. To mitigate the problem of missing context if we abort so far down in the stack, the `on_fatal_io_error` now captures and logs a backtrace. I grepped the pageserver code base for `fs::read` to convince myself that all non-VirtualFile reads already handle IO errors according to policy. Refs - fixes https://github.com/neondatabase/neon/issues/10454	2025-02-13 20:53:39 +00:00
Alex Chi Z.	6a741fd1c2	fix(pageserver): ensure all basebackup client errors are caught (#10793 ) ## Problem We didn't catch all client errors causing alerts. ## Summary of changes Client errors should be wrapped with ClientError so that it doesn't fire alerts. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-13 19:38:02 +00:00
a-masterov	7ac7755dad	Add tests for pgtap (#10589 ) ## Problem We do not test `pgtap` which is shipped with Neon ## Summary of changes Test and binaries for `pgtap` are added.	2025-02-13 19:04:08 +00:00
Arseny Sher	98e18e9a54	Add s3 storage to test_s3_wal_replay (#10809 ) ## Problem The test is flaky: WAL in remote storage appears to be corrupted. One of the hypotheses so far is that the corruption results from the local fs implementation being non-atomic, and safekeepers may concurrently PUT the same segment. That's dubious, though: looking at the local_fs impl, I'd expect an early EOF on segment read rather than the zeros observed in test failures, but other directions seem even less probable. ## Summary of changes Let's add the s3 backend as well and see if it is also flaky. Also add some more logging around segment uploads. ref https://github.com/neondatabase/neon/issues/10761	2025-02-13 18:05:15 +00:00
Tristan Partin	0cf9157adc	Handle new compute_ctl_config parameter in compute spec requests (#10746 ) There is now a compute_ctl_config field in the response that currently only contains a JSON Web Key set. compute_ctl currently doesn't do anything with the keys, but will in the future. The reasoning for the new field is due to the nature of empty computes. When an empty compute is created, it does not have a tenant. A compute spec is the primary means of communicating the details of an attached tenant. In the empty compute state, there is no spec. Instead we wait for the control plane to pass us one via /configure. If we were to include the jwks field in the compute spec, we would have a partial compute spec, which doesn't logically make sense. Instead, we can have two means of passing settings to the compute: - spec: tenant specific config details - compute_ctl_config: compute specific settings For instance, the JSON Web Key set passed to the compute is independent of any tenant. It is a setting of the compute whether it is attached or not. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-02-13 18:04:36 +00:00
Tristan Partin	b6f972ed83	Increase the extension server request timeout to 1 minute (#10800 ) pg_search is 46ish MB. All other remote extensions are around hundreds of KB. 3 seconds is not long enough to download the tarball if the S3 gateway cache doesn't already contain a copy. According to our setup, the cache is limited to 10 GB in size and anything that has not been accessed for an hour is purged. This is really bad for scaling to 0, even more so if you're the only project actively using the extension in a production Kubernetes cluster. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-02-13 17:33:27 +00:00
John Spray	a4d0a34591	tests: flush in test_isolation (#10658 ) ## Problem This test occasionally fails while the test teardown tries to do a graceful shutdown, because the test has quickly written lots of data into the pageserver. Closes: #10654 ## Summary of changes - Call `post_checks` at the end of `test_isolation`, as we already do for test_pg_regress -- this improves our detection of issues, and as a nice side effect flushes the pageserver. - Ignore pg_notify files when validating state at end of test, these are not expected to be the same	2025-02-13 16:23:51 +00:00
John Spray	ae463f366b	tests: broaden allow-list for #10720 workaround (#10807 ) ## Problem In #10752 I used an overly-strict regex that only ignored error on a particular key. ## Summary of changes - Drop key from regex so it matches all such errors	2025-02-13 16:15:04 +00:00
Alexey Kondratov	8c2f85b209	chore(compute): Postgres 17.3, 16.7, 15.11 and 14.16 (#10771 ) ## Summary of changes Bump all minor versions. The only non-trivial conflict was between - `0350b876b0` - and `bd09a752f4` It seems that just adding this extra argument is enough. I also got conflict with `c1c9df3159` but for some reason only in PG 15. Yet, that was a trivial one around ```c if (XLogCtl) LWLockRelease(ControlFileLock); /* durable_rename already emitted log message */ return false; ``` in `xlog.c` ## Postgres PRs - https://github.com/neondatabase/postgres/pull/580 - https://github.com/neondatabase/postgres/pull/579 - https://github.com/neondatabase/postgres/pull/577 - https://github.com/neondatabase/postgres/pull/578	2025-02-13 13:28:05 +00:00
JC Grünhage	e37ba8642d	Integrate cargo-chef into Dockerfile (#10782 ) ## Problem The build of the neon container image is not caching any part of the rust build, making it fairly slow. ## Summary of changes Cache dependency building using cargo-chef.	2025-02-13 13:08:46 +00:00
Vlad Lazar	8fea43a5ba	pageserver: make heatmap generation additive (#10597 ) ## Problem Previously, when cutting over to cold secondary locations, we would clobber the previous, good, heatmap with a cold one. This is because heatmap generation used to include only resident layers. Once this merges, we can add an endpoint which triggers full heatmap hydration on attached locations to heal cold migrations. ## Summary of changes With this patch, heatmap generation becomes additive. If we have a heatmap from when this location was secondary, the new uploaded heatmap will be the result of a reconciliation between the old one and the on disk resident layers. More concretely, when we have the previous heatmap: 1. Filter the previous heatmap and keep layers that are (a) present in the current layer map, (b) visible, (c) not resident. Call this set of layers `visible_non_resident`. 2. From the layer map, select all layers that are resident and visible. Call this set of layers `resident`. 3. The new heatmap is the result of merging the two disjoint sets. Related https://github.com/neondatabase/neon/issues/10541	2025-02-13 12:48:47 +00:00
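The three reconciliation steps in the commit above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pageserver's code: the `Layer` struct, its fields, and `generate_heatmap` are hypothetical names, and layers are identified by plain strings.

```rust
use std::collections::BTreeSet;

/// Hypothetical, simplified view of a layer; the real pageserver types differ.
#[derive(Debug, Clone)]
struct Layer {
    name: String,
    visible: bool,
    resident: bool,
}

/// Build the new heatmap from the previous heatmap and the current layer map.
fn generate_heatmap(previous: &[String], layer_map: &[Layer]) -> BTreeSet<String> {
    // Step 1: layers from the previous heatmap that are still in the layer map,
    // visible, and not resident (`visible_non_resident`).
    let visible_non_resident = layer_map
        .iter()
        .filter(|l| l.visible && !l.resident && previous.contains(&l.name))
        .map(|l| l.name.clone());

    // Step 2: layers from the layer map that are resident and visible (`resident`).
    let resident = layer_map
        .iter()
        .filter(|l| l.visible && l.resident)
        .map(|l| l.name.clone());

    // Step 3: the new heatmap is the merge of the two disjoint sets.
    visible_non_resident.chain(resident).collect()
}
```

The two sets are disjoint by construction, since a layer cannot be both resident and non-resident, so the merge never has to resolve duplicates.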
Arpad Müller	536bdb3209	storcon: track safekeepers in memory, send heartbeats to them (#10583 ) In #9011, we want to schedule timelines to safekeepers. In order to do such scheduling, we need information about how utilized a safekeeper is and if it's available or not. Therefore, send constant heartbeats to the safekeepers and try to figure out if they are online or not. Includes some code from #10440.	2025-02-13 11:06:30 +00:00
John Spray	b8095f84a0	pageserver: make true GC cutoff visible in admin API, rebrand `latest_gc_cutoff` as `applied_gc_cutoff` (#10707 ) ## Problem We expose `latest_gc_cutoff` in our API, and callers understandably were using that to validate LSNs for branch creation. However, this is _not_ the true GC cutoff from a user's point of view: it's just the point at which we last actually did GC. The actual cutoff used when validating branch creations and page_service reads is the min() of latest_gc_cutoff and the planned GC lsn in GcInfo. Closes: https://github.com/neondatabase/neon/issues/10639 ## Summary of changes - Expose the more useful min() of GC cutoffs as `gc_cutoff_lsn` in the API, so that the most obviously named field is really the one people should use. - Retain the ability to read the LSN at which GC was actually done, in an `applied_gc_cutoff_lsn` field. - Internally rename `latest_gc_cutoff_lsn` to `applied_gc_cutoff_lsn` ("latest" was a confusing name, as the value in GcInfo is more up to date in terms of what a user experiences) - Temporarily preserve the old `latest_gc_cutoff_lsn` field for compat with control plane until we update it to use the new field. --------- Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2025-02-13 10:33:47 +00:00
Ivan Efremov	356cca23a5	fix(proxy): Change HSet to HDel for cancellation key metric (#10789 )	2025-02-13 10:22:13 +00:00
JC Grünhage	7b966a2b71	CI(trigger-e2e-tests): fix checking for successful image pushes (#10803 ) ## Problem https://github.com/neondatabase/neon/pull/10613 changed how images are pushed, and therefore also how we have to wait for images to be pushed in `trigger-e2e-tests`. The `trigger-e2e-tests` workflow is triggered in three different ways: - When a pull request is pushed to that is already ready to review, here we call the workflow from `build_and_test` - When a pull request is marked ready for review, then the workflow is triggered directly - When a push to `main` or `release(-.*)?` triggers `build_and_test` and that indirectly calls `trigger-e2e-tests`. The second of these paths had a bug, which was not tested in the PR, because this path being different wasn't clear to me. ## Summary of changes Fix the jq statement that caused the bug.	2025-02-13 10:13:26 +00:00
Conrad Ludgate	23352dc2e9	Merge pull request #10802 from neondatabase/rc/release-proxy/2025-02-13 Proxy release 2025-02-13	2025-02-13 08:41:01 +00:00
github-actions[bot]	c65fc5a955	Proxy release 2025-02-13	2025-02-13 06:02:01 +00:00
JC Grünhage	e38694742c	fix(ci): don't try pushing to prod container registries from main (#10795 ) ## Problem https://github.com/neondatabase/neon/pull/10613 changed how images are pushed, and there was a small mismatch between the github workflow and the script generating what to push where. This resulted in the workflow trying to push images to prod registries from the main branch, even though we don't do that and therefore didn't generate a mapping for those registries in the script that decides what to push where. This misconception happened because promote-images-dev pushed to dev registries, and promote-images-prod pushed to prod registries, but promote-images-prod also updated the latest tag in the dev registries if and only if we are on the main branch. This last bit is why the push-<component>-image-prod jobs were trying to run on the main branch. ## Summary of changes Don't try pushing to prod registries from the main branch.	2025-02-12 20:26:05 +00:00
Arpad Müller	922f3ee17d	Compress git history of Azure SDK (#10790 ) Switch the Azure SDK git fork to one with a compressed git history. This helps with download speed of the git repository. closes #10732	2025-02-12 19:48:11 +00:00
Arpad Müller	61d2474632	Also check by the planned gc cutoff for lease creation (#10764 ) We don't want to allow new leases below the planned gc cutoff either. Other APIs like branch creation or getpage requests already enforce this.	2025-02-12 19:29:17 +00:00
JC Grünhage	b77dd66bc4	refactor(ci): overhaul container image pushing (#10613 ) ## Problem Retagging container images and pushing container images taken from one registry to another is very tangled up with artifact building and not separated by component. This makes not building compute for storage releases and vice versa pretty tricky. To enable that, I want to clean up retagging and pushing of container images and then continue on making the pipelines for releases leaner by not building unnecessary things. ## Summary of changes - Add a reusable workflow that can push to ACR, ECR and Docker Hub, while being very flexible in terms of source and target images. This allows for retagging and pushing images between container registries. - Stop pushing images to registries aside of docker hub in the jobs that build the images - Split image pushing into 4 different jobs (not mentioning special cases): - neon-dev - neon-prod - compute-dev - compute-prod ## TODO - Consider also using this for `pin-build-tools-image`, as it's basically another instance of the same thing. ## Known limitations - The ECR part of this workflow supports authenticating to multiple AWS accounts and therefore multiple ECR endpoints, but the ACR part only supports one Azure Account. If someone with more knowledge on Azure can tell me whether an equivalent to https://github.com/aws-actions/amazon-ecr-login?tab=readme-ov-file#login-to-ecr-on-multiple-aws-accounts is easily possible, that'd be great. - The `image_map` input is a bit complex. 
It expects something along the lines of ``` { "docker.io/neondatabase/compute-node-v14:13196061314": [ "docker.io/neondatabase/compute-node-v14:13196061314", "369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-node-v14:13196061314", "neoneastus2.azurecr.io/neondatabase/compute-node-v14:13196061314" ], "docker.io/neondatabase/compute-node-v15:13196061314": [ "docker.io/neondatabase/compute-node-v15:13196061314", "369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-node-v15:13196061314", "neoneastus2.azurecr.io/neondatabase/compute-node-v15:13196061314" ] } ``` to map from source to target image. We have a small python step to generate this map for the 4 main image pushing jobs. The concrete example is taken from https://github.com/neondatabase/neon/actions/runs/13196061314/job/36838584098?pr=10613#step:3:6 and shortened to two images.	2025-02-12 17:54:51 +00:00
Alexey Kondratov	49775d28e4	fix(compute): Respect skip_pg_catalog_updates in reconfigure() (#10696 ) ## Problem We respect `skip_pg_catalog_updates` at the initial start, but ignore it on follow-up `/configure` calls. Yet it's used for storage->cplane->compute notify requests after migrations, shard splits, etc. So every time we get them, applying the new config takes much longer than it should because we go through Postgres catalog checks. Cplane sets this flag when it serves a notify-attach call `9068c7d743` Related to `inc-403`, for example ## Summary of changes Look at `skip_pg_catalog_updates` in `compute.reconfigure()`	2025-02-12 17:54:21 +00:00
Alexander Bayandin	f45f9209b9	CI(trigger-e2e-tests): check permissions before running jobs (#10785 ) ## Problem PRs created by external contributors, in some cases might list failed jobs - `Trigger E2E Tests / cancel-previous-e2e-tests` - `Trigger E2E Tests / tag` They don't block the merge, and tests in fact pass (their counterparts in internal PR), but because jobs are triggered from an external PR (and not from the corresponding internal one) they still present as red marks. For example https://github.com/neondatabase/neon/pull/10778 ## Summary of changes - Check permissions before triggering e2e tests	2025-02-12 17:00:23 +00:00
Cheng Chen	20fe4b8ec3	chore(compute): pg_mooncake v0.1.2 (#10778 ) ## Problem Upgrade pg_mooncake to v0.1.2 ## Summary of changes https://github.com/Mooncake-Labs/pg_mooncake/blob/main/CHANGELOG.md#012-2025-02-11	2025-02-12 16:29:19 +00:00
Erik Grinaker	f62047ae97	pageserver: add separate semaphore for L0 compaction (#10780 ) ## Problem L0 compaction frequently gets starved out by other background tasks and image/GC compaction. L0 compaction must be responsive to keep read amplification under control. Touches #10694. Resolves #10689. ## Summary of changes Use a separate semaphore for the L0-only compaction pass. * Add a `CONCURRENT_L0_COMPACTION_TASKS` semaphore and `BackgroundLoopKind::L0Compaction`. * Add a setting `compaction_l0_semaphore` (default off via `compaction_l0_first`). * Use the L0 semaphore when doing an `OnlyL0Compaction` pass. * Use the background semaphore when doing a regular compaction pass (which includes an initial L0 pass). * While waiting for the background semaphore, yield for L0 compaction if triggered. * Add `CompactFlags::NoYield` to disable L0 yielding, and set it for the HTTP API route. * Remove the old `use_compaction_semaphore` setting and compaction-scoped semaphore. * Remove the warning when waiting for a semaphore; it's noisy and we have metrics.	2025-02-12 16:12:21 +00:00
Fedor Dikarev	ec354884ea	Feat/pin docker images to sha (#10730 ) ## Problem With the current approach to base images in our `Dockerfiles`, it's hard to track when an image is updated, and because these are base images, any update invalidates all the layers built on top of them. This gets more complicated because we have a number of runners that may hold different images under the tag `bookworm-slim`, so caches are invalidated when an image built on one runner is used on another. To fix that, we can pin our base images to a specific SHA, which not only aligns images across runners but also gives us reproducible builds that don't depend on spontaneous upstream changes. Fix: https://github.com/neondatabase/cloud/issues/24084 ## Summary of changes Besides the main goal, this PR also includes some small changes around Dockerfiles: 1. Main change: use `SHA` for `bookworm-slim` and `bullseye-slim` Debian images 2. For the layers requiring `curl`, add `curl` and `unzip` to the `build-deps` image and use it as the base image for all the steps, removing the extra dependency on `alpine/curl` 3. Add `retry-on-host-error=on` to the `wgetrc`, as I ran into failures to resolve a hostname	2025-02-12 14:03:10 +00:00
John Spray	9989d8bfae	tests: make Workload more deterministic (#10741 ) ## Problem Previously, Workload was reconfiguring the compute before each run of writes, which was meant to be a no-op when nothing changed, but was actually writing extra data due to an issue being fixed in https://github.com/neondatabase/neon/pull/10696. The row counts in tests were too low in some cases; these tests were only working because of those extra writes that shouldn't have been happening, and moreover were relying on checkpoints happening. ## Summary of changes - Only reconfigure compute if the attached pageserver actually changed. If pageserver is set to None, that means controller is managing everything, so never reconfigure compute. - Update tests that wrote too few rows. --------- Co-authored-by: Alexey Kondratov <kondratov.aleksey@gmail.com>	2025-02-12 12:35:29 +00:00
Heikki Linnakangas	9537829ccd	fast_import: Make CPU & memory size configurable (#10709 ) The old values assumed that you have at least about 18 GB of RAM available (shared_buffers=10GB and maintenance_work_mem=8GB). That's a lot when testing locally. Make it configurable, and make the default assumption much smaller: 256 MB. This is nice for local testing, but it's also in preparation for starting to use VMs to run these jobs. When launched in a VM, the control plane can set these env variables according to the max size of the VM. Also change the formula for how RAM is distributed: use 10% of RAM for shared_buffers, and 70% for maintenance_work_mem. That leaves a good amount for misc. other stuff and the OS. A very large shared_buffers setting won't typically help with bulk loading. It won't help with the network and I/O of processing all the tables, unless maybe if the whole database fits in shared buffers, but even then it's not much faster than using local disk. Bulk loading is all sequential I/O. It also won't help much with index creation, which is also sequential I/O. A large maintenance_work_mem can be quite useful, however, so that's where we put most of the RAM.	2025-02-12 11:43:23 +00:00
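The RAM split described in the commit above amounts to simple arithmetic. The sketch below is illustrative only: `memory_settings_mb` is a hypothetical helper, and the integer rounding is an assumption rather than what `fast_import` necessarily does.

```rust
/// Hypothetical helper: split the available RAM (in MB) between the two settings.
/// Returns (shared_buffers_mb, maintenance_work_mem_mb).
fn memory_settings_mb(total_ram_mb: u64) -> (u64, u64) {
    // 10% of RAM for shared_buffers; a large value rarely helps bulk loading.
    let shared_buffers_mb = total_ram_mb / 10;
    // 70% of RAM for maintenance_work_mem, which benefits index creation.
    let maintenance_work_mem_mb = total_ram_mb * 70 / 100;
    // The remaining ~20% is left for everything else and the OS.
    (shared_buffers_mb, maintenance_work_mem_mb)
}
```

With the 256 MB default mentioned in the commit, this integer arithmetic yields 25 MB for `shared_buffers` and 179 MB for `maintenance_work_mem`.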
Mikhail Kot	2c4c6e6330	fix(neon): Add tests clarifying postgres sigabrt on pageserver unavailability (#10666 ) Resolves: https://github.com/neondatabase/neon/issues/5734 When we query pageserver and it's unavailable after some retries, postgres sigabrt's. This is intended behavior so I've added tests checking it	2025-02-12 10:52:26 +00:00
Erik Grinaker	71c30e52fa	pageserver: properly yield for L0 compaction (#10769 ) ## Problem When image compaction yields for L0 compaction, it may not immediately schedule L0 compaction, because it just goes on to compact the next pending timeline. Touches #10694. Requires #10744. ## Summary of changes Extend `CompactionOutcome` with `YieldForL0` and `Skipped` variants, and immediately schedule an L0 compaction pass in the `YieldForL0` case.	2025-02-11 23:43:58 +00:00
Erik Grinaker	6c83ac3fd2	pageserver: do all L0 compaction before image compaction (#10744 ) ## Problem Image compaction can starve out L0 compaction if a tenant has several timelines with L0 debt. Touches #10694. Requires #10740. ## Summary of changes * Add an initial L0 compaction pass, in order of L0 count. * Add a tenant option `compaction_l0_first` to control the L0 pass (disabled by default). * Add `CompactFlags::OnlyL0Compaction` to run an L0-only compaction pass. * Clean up the compaction iteration logic. A later PR will use separate semaphores for the L0 and image compaction passes to avoid cross-tenant L0 starvation. That PR will also make image compaction yield if _any_ of the tenant's timelines have pending L0 compaction to further avoid starvation.	2025-02-11 22:08:46 +00:00
Heikki Linnakangas	635b67508b	Split utils::http to separate crate (#10753 ) Avoids compiling the crate and its dependencies into binaries that don't need them. Shrinks the compute_ctl binary from about 31MB to 28MB in the release-line-debug-size-lto profile.	2025-02-11 22:06:53 +00:00
dependabot[bot]	9491154eae	build(deps): bump cryptography from 43.0.1 to 44.0.1 in the pip group (#10773 ) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2025-02-11 21:23:17 +00:00
Heikki Linnakangas	b5e09fdaf3	Re-order Dockerfile steps for putting together final compute image (#10736 ) Run "apt install" first, and only then COPY the files from the intermediary build layers to the final image. This way, if you modify any of the sources that trigger e.g. rebuilding compute_ctl, the "apt install" step can still be cached.	2025-02-11 20:10:06 +00:00
John Spray	cd51ed2f86	tests: parametrize test_graceful_cluster_restart on AZ count (#10427 ) ## Problem In https://github.com/neondatabase/neon/pull/10411 fill logic changes such that it benefits us to test it with & without AZs set up. I didn't extend the test inline in that PR because there were overlapping test changes in flight to add `num_az` parameter. ## Summary of changes - Parameterise test on AZ count (1 or 2) - When AZ count is 2, use a different balance check that just asserts the _tenants_ are balanced (since AZ affinity is chosen on a per-tenant basis)	2025-02-11 20:09:41 +00:00
Folke Behrens	f62bc28086	proxy: Move binaries into the lib (#10758 ) * This way all clippy lints defined in the lib also cover the binary code. * It's much easier to detect unused code. * Fix all discovered lints.	2025-02-11 19:46:23 +00:00
Tristan Partin	da9c101939	Implement a second HTTP server within compute_ctl (#10574 ) The compute_ctl HTTP server has the following purposes: - Allow management via the control plane - Provide an endpoint for scraping metrics - Provide APIs for compute internal clients - Neon Postgres extension for installing remote extensions - local_proxy for installing extensions and adding grants The first two purposes require the HTTP server to be available outside the compute. The Neon threat model is a bad actor within our internal network. We need to reduce the surface area of attack. By exposing unnecessary unauthenticated HTTP endpoints to the internal network, we increase the surface area of attack. For endpoints described in the third bullet point, we can just run an extra HTTP server, which is only bound to the loopback interface since all consumers of those endpoints are within the compute.	2025-02-11 18:02:22 +00:00
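The split between an externally reachable server and a loopback-only one comes down to which address each listener binds. A minimal sketch using only the standard library follows; `bind_listeners` and its port parameters are hypothetical, and compute_ctl's actual server setup is not shown in the commit message.

```rust
use std::net::{Ipv4Addr, SocketAddr, TcpListener};

/// Hypothetical helper: bind one listener reachable from outside the compute
/// and one reachable only from processes on the same host.
fn bind_listeners(
    external_port: u16,
    internal_port: u16,
) -> std::io::Result<(TcpListener, TcpListener)> {
    // 0.0.0.0: control plane management and metrics scraping need outside access.
    let external = TcpListener::bind(SocketAddr::from((Ipv4Addr::UNSPECIFIED, external_port)))?;
    // 127.0.0.1: consumers such as the Postgres extension and local_proxy run
    // inside the compute, so these endpoints never face the internal network.
    let internal = TcpListener::bind(SocketAddr::from((Ipv4Addr::LOCALHOST, internal_port)))?;
    Ok((external, internal))
}
```

Passing port `0` lets the operating system pick a free port, which is convenient when trying the sketch locally.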
Arpad Müller	f7b2293317	Hardlink resident layers during detach ancestor (#10729 ) After a detach ancestor operation, we don't want to on-demand download layers that are already resident. This has shown to impede performance, sometimes quite a lot (50 seconds: https://github.com/neondatabase/neon/issues/8828#issuecomment-2643735644) Fixes #8828.	2025-02-11 16:58:34 +00:00
Arpad Müller	be447ba4f8	Change timeline_offloading setting default to true (#10760 ) This changes the default value of the `timeline_offloading` pageserver and tenant configs to true, now that offloading has been rolled out without problems. There is also a small fix in the tenant config merge function, where we applied the `lazy_slru_download` value instead of `timeline_offloading`. Related issue: https://github.com/neondatabase/cloud/issues/21353	2025-02-11 16:36:54 +00:00
Christian Schwarz	9247331c67	fix(page_service / batching): smgr op latency metric of dropped responses include flush time (#10756 ) # Problem Say we have a batch of 10 responses to send out. Then, even with - #10728 we've still only called observe_execution_end_flush_start for the first 3 responses. The remaining 7 response timers are still ticking. When compute now closes the connection, the waiting flush fails with an error and we `drop()` the remaining 7 responses' smgr op timers. The `impl Drop for SmgrOpTimer` will observe an execution time that includes the flush time. In practice, this is suspected to produce the `+Inf` observations in the smgr op latency histogram we've seen since the introduction of pipelining, even after shipping #10728. refs: - fixup of https://github.com/neondatabase/neon/pull/10042 - fixup of https://github.com/neondatabase/neon/pull/10728 - fixes https://github.com/neondatabase/neon/issues/10754	2025-02-11 14:05:59 +00:00
John Spray	fcedd10226	tests: temporarily permit a log error (#10752 ) ## Problem These tests can encounter a bug in the pageserver read path (#9185) which occurs under the very specific circumstances that the tests create, but is very unlikely to happen in the field. We will fix the bug, but in the meantime let's un-flake the tests. Related: https://github.com/neondatabase/neon/issues/10720 ## Summary of changes - Permit "could not find data for key" errors in tests affected by #9185	2025-02-11 12:37:09 +00:00
Arpad Müller	a4ea1e53ae	Apply Azure SDK patch to periodically load workload identity file (#10415 ) The SDK bug https://github.com/Azure/azure-sdk-for-rust/issues/1739 was originally worked around via #10378, but now upstream has provided a fix in [this](https://github.com/Azure/azure-sdk-for-rust/pull/1997) PR, which we've been asked to test. So this is what this PR is doing: revert #10378 (to make sure we fail if the bug isn't fixed by the SDK PR), and apply the SDK PR to our fork. Currently pointing to my local branch to check CI. I'd like to merge the [SDK fork PR](https://github.com/neondatabase/azure-sdk-for-rust/pull/2) before merging this to main.	2025-02-11 09:40:22 +00:00
Heikki Linnakangas	c26131c2b3	Link pgbouncer dynamically (#10749 ) I don't see the point of static linking, postgres itself and many of the extensions are already built dynamically. One reason for the change is that I'm working on bigger changes to start using systemd in the compute, and as part of that I wanted to add the --with-systemd configure option to pgbouncer, and there doesn't seem to be a static version of libsystemd (at least not on Debian).	2025-02-11 07:48:54 +00:00
Andrew Rudenko	4ab18444ec	compute_ctl: database_schema should keep process::Child as part of returned value (#10273 ) ## Problem /database_schema endpoint returns incomplete output from `pg_dump` ## Summary of changes The Tokio process was not used properly. The returned stream does not include `process::Child`, and the process is scheduled to be killed immediately after the `get_database_schema` call when `cmd` goes out of scope. The solution in this PR is to return a special Stream implementation that retains `process::Child`.	2025-02-11 07:02:13 +00:00
Heikki Linnakangas	98883e4b30	compute_ctl: Use a single tokio runtime (#10743 ) compute_ctl is mostly written in synchronous fashion, intended to run in a single thread. However various parts had become async, and they launched their own tokio runtimes to run the async code. For example, VM monitor ran in its own multi-threaded runtime, and apply_spec_sql() launched another multi-threaded runtime to run the per-database SQL commands in parallel. In addition to that, a few places used a current-thread runtime to run async code in the main thread, or launched a current-thread runtime in a different thread to run background tasks. Unify the runtimes so that there is only one tokio runtime. It's created very early at process startup, and the main thread "enters" the runtime, so that it's always available for tokio::spawn() and runtime.block_on() calls. All code that needs to run async code uses the same runtime. The main thread still mostly runs in a synchronous fashion. When it needs to run async code, it uses rt.block_on(). Spawn fewer additional threads, prefer to spawn tokio tasks instead. Convert some code that ran synchronously in background threads into async. I didn't go all the way, though, some background threads are still spawned.	2025-02-11 00:39:44 +00:00
Tristan Partin	3d143ad799	Unbrick the forward compatibility test failures (#10747 ) Since the merge of https://github.com/neondatabase/neon/pull/10523, forward compatibility tests have been broken everywhere. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-02-10 22:22:10 +00:00
Alex Chi Z.	b0c7ee0175	feat(pageserver): better gc_compaction_split heuristics (#10727 ) ## Problem close https://github.com/neondatabase/neon/issues/10213 `range_search` only returns the top-most layers that may satisfy the search, so it doesn't include all layers that might be accessed (the user needs to recursively call this function). We need to retrieve the full layer map and find overlaps in order to have a correct heuristic for the job split. ## Summary of changes Retrieve all layers and find overlaps instead of doing `range_search`. The patch also reduces the time holding the layer map read guard. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-10 19:33:34 +00:00
Erik Grinaker	8c4e94107d	pageserver: notify compaction loop at threshold (#10740 ) ## Problem The compaction loop currently runs periodically, which can cause it to wait for up to 20 seconds before starting L0 compaction by default. Also, when we later separate the semaphores for L0 compaction and image compaction, we want to give up waiting for the image compaction semaphore if L0 compaction is needed on any timeline. Touches #10694. ## Summary of changes Notify the compaction loop when an L0 flush (on any timeline) exceeds `compaction_threshold`. Also do some opportunistic cleanups in the area.	2025-02-10 17:48:09 +00:00
Heikki Linnakangas	c368b0fe14	Use a cache mount to speed up rebuilding compute node image (#10737 ) Building the compute rust binaries from scratch is pretty slow, it takes between 4-15 minutes on my laptop, depending on which compiler flags and other tricks I use. A cache mount allows caching the dependencies and incremental builds, which speeds up rebuilding significantly when you only make a small change in a source file.	2025-02-10 16:58:29 +00:00
Heikki Linnakangas	aba61a3712	Download awscli in separate layer in Dockerfile, to allow caching (#10733 ) The awscli was downloaded at the last stages of the overall compute image build, which meant that if you modified any part of the build, it would trigger a re-download of the awscli. That's a bit annoying when developing locally and rebuilding the compute image repeatedly. Move it to a separate layer, to cache separately and to avoid the spurious rebuilds.	2025-02-10 16:48:28 +00:00
Tristan Partin	946da3f7e2	Require --compute-id when running compute_ctl (#10523 ) The compute_id will be used when verifying claims sent by the control plane. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-02-10 16:46:20 +00:00
Ivan Efremov	73633e27ed	fix(proxy): Log errors from the local proxy in auth-broker (#10659 ) Handle errors from local proxy by parsing HTTP response in auth broker code Closes [#19476](https://github.com/neondatabase/cloud/issues/19476)	2025-02-10 16:06:13 +00:00
Konstantin Knizhnik	0cf0119751	Add --save_records option to pg_waldump (#10626 ) ## Problem Make it possible to dump WAL records in format recognised by walredo process. Intended usage: ``` pg_waldump -R 1663/5/16396 -B 771727 000000010000000100000034 --save-records=/tmp/walredo.records postgres --wal-redo < /tmp/walredo.records > /tmp/page.img ``` ## Summary of changes Related Postgres PRs: https://github.com/neondatabase/postgres/pull/575 https://github.com/neondatabase/postgres/pull/572 --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-02-10 15:48:03 +00:00
Alex Chi Z.	b37f52fdf1	feat(pageserver): dump read path on missing key error (#10528 ) ## Problem helps investigate https://github.com/neondatabase/neon/issues/10482 ## Summary of changes In debug mode and testing mode, we will record all files visited by a read operation, and print it out when it errors. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-10 14:25:56 +00:00
Alex Chi Z.	443c8d0b4b	feat(pageserver): repartition on L0-L1 boundary (#10548 ) ## Problem Reduce the read amplification when doing `repartition`. ## Summary of changes Compute the L0-L1 boundary LSN and do repartition here. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-10 14:25:48 +00:00
Alexander Bayandin	2f36bdb218	CI(build-neon): fix duplicated builds (#10731 ) ## Problem Parameterising the `build-neon` job with `test-cfg` makes it build exactly the same thing several times. See - `874accd6ed/.github/workflows/_build-and-test-locally.yml (L51-L52)` - https://github.com/neondatabase/neon/actions/runs/13215068271/job/36893373038 ## Summary of changes - Extract `sanitizers` to a separate input from `test-cfg` and set it separately - Don't parametrise `build-neon` with `test-cfg`	2025-02-10 12:29:39 +00:00
Ivan Efremov	e7118213ab	impr(proxy): Set TTL for Redis cancellation map keys (#10671 ) Use expire() op to set TTL for Redis cancellation key	2025-02-10 10:51:53 +00:00
a-masterov	d204d51faf	Fix the upgrade test for pg_jwt by adding the database name (#10738 ) ## Problem The upgrade test for pg_jwt does not work correctly. ## Summary of changes The script for the upgrade test is modified to use the database `contrib_regression`.	2025-02-10 09:56:46 +00:00
Erik Grinaker	ac55e2dbe5	pageserver: improve tenant housekeeping task (#10725 ) # Problem walredo shutdown is done in the compaction task. Let's move it to tenant housekeeping. # Summary of changes * Rename "ingest housekeeping" to "tenant housekeeping". * Move walredo shutdown into tenant housekeeping. * Add a constant `WALREDO_IDLE_TIMEOUT` set to 3 minutes (previously 10x compaction threshold).	2025-02-08 12:42:55 +00:00
Erik Grinaker	874accd6ed	pageserver: misc task cleanups (#10723 ) This patch does a bunch of superficial cleanups of `tenant::tasks` to avoid noise in subsequent PRs. There are no functional changes. PS: enable "hide whitespace" when reviewing, due to the unindentation of large async blocks.	2025-02-08 11:02:13 +00:00
Christian Schwarz	6cd3b501ec	fix(page_service / batching): smgr op latency metrics includes the flush time of preceding requests (#10728 ) Before this PR, if a batch contains N responses, the smgr op latency reported for response (N-i) would include the time we spent flushing the preceding requests. refs: - fixup of https://github.com/neondatabase/neon/pull/10042 - fixes https://github.com/neondatabase/neon/issues/10674	2025-02-08 09:28:09 +00:00
Christian Schwarz	bf20d78292	fix(page_service): page reconstruct error log does not include `shard_id` label (#10680 ) # Problem Before this PR, the `shard_id` field was missing when page_service logs a reconstruct error. This was caused by batching-related refactorings. Example from staging: ``` 2025-01-30T07:10:04.346022Z ERROR page_service_conn_main{peer_addr=...}:process_query{tenant_id=... timeline_id=...}:handle_pagerequests:request:handle_get_page_at_lsn_request_batched{req_lsn=FFFFFFFF/FFFFFFFF}: error reading relation or page version: Read error: whole vectored get request failed because one or more of the requested keys were missing: could not find data for key ... ``` # Changes Delay creation of the handler-specific span until after shard routing This also avoids the need for the record() call in the pagestream hot path. # Testing Manual testing with a failpoint that is part of this PR's history but will be squashed away. # Refs - fixes https://github.com/neondatabase/neon/issues/10599	2025-02-07 19:45:39 +00:00
Arpad Müller	2656c713a4	Revert recent AWS SDK update (#10724 ) We've been seeing some regressions in staging since the AWS SDK updates: https://github.com/neondatabase/neon/issues/10695 . We aren't sure the regression was caused by the SDK update, but the issues do involve S3, so it's not unlikely. By reverting the SDK update we find out whether it was really the SDK update, or something else. Reverts the two PRs: * https://github.com/neondatabase/neon/pull/10588 * https://github.com/neondatabase/neon/pull/10699 https://neondb.slack.com/archives/C08C2G15M6U/p1738576986047179	2025-02-07 17:37:53 +00:00
John Spray	5e95860e70	tests: wait for manifest persistence in test_timeline_archival_chaos (#10719 ) ## Problem This test would sometimes fail its assertion that a timeline does not revert to active once archived. That's because it was using the in-memory offload state, not the persistent state, so this was sometimes lost across a pageserver restart. Closes: https://github.com/neondatabase/neon/issues/10389 ## Summary of changes - When reading offload status, read from pageserver API _and_ remote storage before considering the timeline offloaded	2025-02-07 16:27:39 +00:00
Heikki Linnakangas	0abff59e97	compute: Allow postgres user to power off the VM (#10710 ) I plan to use this when launching a fast_import job in a VM. There's currently no good way for an executable running in a NeonVM to exit gracefully and have the VM shut down. The inittab we use always respawns the payload command. The idea is that the control plane can use "fast_import ... && poweroff" as the command, so that when fast_import completes successfully, the VM is terminated, and the k8s Pod and VirtualMachine object are marked as completed successfully. I'm working on bigger changes to how we launch VMs, and will try to come up with a nicer system for that, but in the meanwhile, this quick hack allows us to proceed with using VMs for one-off jobs like fast_import.	2025-02-07 16:03:01 +00:00
John Spray	9609f7547e	tests: address warnings in timeline shutdown (#10702 ) ## Problem There are a couple of log warnings tripping up `test_timeline_archival_chaos` - `[stopping left-over name="timeline_delete" tenant_shard_id=2d526292b67dac0e6425266d7079c253 timeline_id=Some(44ba36bfdee5023672c93778985facd9) kind=TimelineDeletionWorker\n')](https://neon-github-public-dev.s3.amazonaws.com/reports/pr-10672/13161357302/index.html#/testresult/716b997bb1d8a021)` - `ignoring attempt to restart exited flush_loop 503d8f401d8887cfaae873040a6cc193/d5eed0673ba37d8992f7ec411363a7e3\n')` Related: https://github.com/neondatabase/neon/issues/10389 ## Summary of changes - Downgrade the 'ignoring attempt to restart' to info -- there's nothing in the design that forbids this happening, i.e. someone calling maybe_spawn_flush_loop concurrently with shutdown() - Prevent timeline deletion tasks outliving tenants by carrying a gateguard. This logically makes sense because the deletion process does call into Tenant to update manifests.	2025-02-07 15:29:34 +00:00
Erik Grinaker	d6e87a3a9c	pageserver: add separate, disabled compaction semaphore (#10716 ) ## Problem L0 compaction can get starved by other background tasks. It needs to be responsive to avoid read amp blowing up during heavy write workloads. Touches #10694. ## Summary of changes Add a separate semaphore for compaction, configurable via `use_compaction_semaphore` (disabled by default). This is primarily for testing in staging; it needs further work (in particular to split image/L0 compaction jobs) before it can be enabled.	2025-02-07 15:11:31 +00:00
Arpad Müller	f5243992fa	safekeeper: make timeline deletions a bit more verbose (#10721 ) Make timeline deletion print the sub-steps, so that we can narrow down some stuck timeline deletion issues we are observing. https://neondb.slack.com/archives/C08C2G15M6U/p1738930694716009	2025-02-07 15:06:26 +00:00
John Spray	95220ba43e	tests: fix flaky endpoint in test_ingest_logical_message (#10700 ) ## Problem Endpoint kept running while timeline was deleted, causing forbidden warnings on the pageserver when the tenant is not found. ## Summary of changes - Explicitly stop the endpoint before the end of the test, so that it isn't trying to talk to the pageserver in the background while things are torn down	2025-02-07 14:51:36 +00:00
John Spray	08f92bb916	pageserver: clean up DeletionQueue push_layers_sync (#10701 ) ## Problem This is tech debt. While we introduced generations for tenants, some legacy situations without generations needed to delete things inline (async operation) instead of enqueuing them (sync operation). ## Summary of changes - Remove the async code, replace calls with the sync variant, and assert that the generation is always set	2025-02-07 13:03:01 +00:00
Fedor Dikarev	8f651f9582	switch from localtest.me to local.neon.build (#10714 ) ## Problem Ref: https://github.com/neondatabase/neon/issues/10632 We use DNS names under `.localtest.me` in our tests. That domain is well known and widely used for this purpose: all of its records resolve to localhost, both IPv4 and IPv6 (`127.0.0.1` and `::1`). In some cases on our runners these names resolve only to IPv6, so components fail to connect when the runner doesn't have an IPv6 address. We suspect an issue in systemd-resolved (https://github.com/systemd/systemd/issues/17745). To work around that and improve test stability, we introduced our own domain `.local.neon.build`, which resolves to the IPv4 address `127.0.0.1` only. See the full details and troubleshooting log in the referenced issue. P.S. If you're a FritzBox user, don't forget to add the domain `local.neon.build` to the `DNS Rebind Protection` section under `Home Network -> Network -> Network Settings`; otherwise FritzBox will block names that resolve to local addresses. For other devices/vendors, please check the corresponding documentation if resolving `local.neon.build` produces an empty answer for you. ## Summary of changes Replace all the occurrences of `localtest.me` with `local.neon.build`	2025-02-07 12:25:16 +00:00
Arseny Sher	b5a239c4ae	Add reconciliation details to sk membership change rfc (#10514 ) ## Problem The RFC pointed out the need for reconciliation, but didn't detail how it can be done. ## Summary of changes Add these details.	2025-02-07 11:20:49 +00:00
Alexander Lakhin	de05258419	Adjust diesel schema check for build with sanitizers (#10711 ) We need to disable the detection of memory leaks when running `neon_local init` for builds with sanitizers to avoid an error thrown by AddressSanitizer.	2025-02-07 08:56:39 +00:00
Peter Bendel	e73d681a0e	Patch pgcopydb and fix another segfault (#10706 ) ## Problem Found another pgcopydb segfault in error handling ```bash 2025-02-06 15:30:40.112 51299 ERROR pgsql.c:2330 [TARGET -738302813] FATAL: terminating connection due to administrator command 2025-02-06 15:30:40.112 51298 ERROR pgsql.c:2330 [TARGET -1407749748] FATAL: terminating connection due to administrator command 2025-02-06 15:30:40.112 51297 ERROR pgsql.c:2330 [TARGET -2073308066] FATAL: terminating connection due to administrator command 2025-02-06 15:30:40.112 51300 ERROR pgsql.c:2330 [TARGET 1220908650] FATAL: terminating connection due to administrator command 2025-02-06 15:30:40.432 51300 ERROR pgsql.c:2536 [Postgres] FATAL: terminating connection due to administrator command 2025-02-06 15:30:40.513 51290 ERROR copydb.c:773 Sub-process 51300 exited with code 0 and signal Segmentation fault 2025-02-06 15:30:40.578 51299 ERROR pgsql.c:2536 [Postgres] FATAL: terminating connection due to administrator command 2025-02-06 15:30:40.613 51290 ERROR copydb.c:773 Sub-process 51299 exited with code 0 and signal Segmentation fault 2025-02-06 15:30:41.253 51298 ERROR pgsql.c:2536 [Postgres] FATAL: terminating connection due to administrator command 2025-02-06 15:30:41.314 51290 ERROR copydb.c:773 Sub-process 51298 exited with code 0 and signal Segmentation fault 2025-02-06 15:30:43.133 51297 ERROR pgsql.c:2536 [Postgres] FATAL: terminating connection due to administrator command 2025-02-06 15:30:43.215 51290 ERROR copydb.c:773 Sub-process 51297 exited with code 0 and signal Segmentation fault 2025-02-06 15:30:43.215 51290 ERROR indexes.c:123 Some INDEX worker process(es) have exited with error, see above for details 2025-02-06 15:30:43.215 51290 ERROR indexes.c:59 Failed to create indexes, see above for details 2025-02-06 15:30:43.232 51271 ERROR copydb.c:768 Sub-process 51290 exited with code 12 ``` ```bashadmin@ip-172-31-38-164:~/pgcopydb$ gdb /usr/local/pgsql/bin/pgcopydb core GNU gdb 
(Debian 13.1-3) 13.1 Copyright (C) 2023 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "aarch64-linux-gnu". Type "show configuration" for configuration details. For bug reporting instructions, please see: <https://www.gnu.org/software/gdb/bugs/>. Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>. For help, type "help". Type "apropos word" to search for commands related to "word"... Reading symbols from /usr/local/pgsql/bin/pgcopydb... [New LWP 51297] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1". Core was generated by `pgcopydb: create index ocr.ocr_pipeline_step_results_version_pkey '. Program terminated with signal SIGSEGV, Segmentation fault. 
#0 0x0000aaaac3a4b030 in splitLines (lbuf=lbuf@entry=0xffffd8b86930, buffer=<optimized out>) at string_utils.c:630 630 newLinePtr = '\0'; (gdb) bt #0 0x0000aaaac3a4b030 in splitLines (lbuf=lbuf@entry=0xffffd8b86930, buffer=<optimized out>) at string_utils.c:630 #1 0x0000aaaac3a3a678 in pgsql_execute_log_error (pgsql=pgsql@entry=0xffffd8b87040, result=result@entry=0x0, sql=sql@entry=0xffff81fe9be0 "CREATE UNIQUE INDEX IF NOT EXISTS ocr_pipeline_step_results_version_pkey ON ocr.ocr_pipeline_step_results_version USING btree (id, transaction_id);", debugParameters=debugParameters@entry=0xaaaaec5f92f0, context=context@entry=0x0) at pgsql.c:2322 #2 0x0000aaaac3a3bbec in pgsql_execute_with_params (pgsql=pgsql@entry=0xffffd8b87040, sql=0xffff81fe9be0 "CREATE UNIQUE INDEX IF NOT EXISTS ocr_pipeline_step_results_version_pkey ON ocr.ocr_pipeline_step_results_version USING btree (id, transaction_id);", paramCount=paramCount@entry=0, paramTypes=paramTypes@entry=0x0, paramValues=paramValues@entry=0x0, context=context@entry=0x0, parseFun=parseFun@entry=0x0) at pgsql.c:1649 #3 0x0000aaaac3a3c468 in pgsql_execute (pgsql=pgsql@entry=0xffffd8b87040, sql=<optimized out>) at pgsql.c:1522 #4 0x0000aaaac3a245f4 in copydb_create_index (specs=specs@entry=0xffffd8b8ec98, dst=dst@entry=0xffffd8b87040, index=index@entry=0xffff81f71800, ifNotExists=<optimized out>) at indexes.c:846 #5 0x0000aaaac3a24ca8 in copydb_create_index_by_oid (specs=specs@entry=0xffffd8b8ec98, dst=dst@entry=0xffffd8b87040, indexOid=<optimized out>) at indexes.c:410 #6 0x0000aaaac3a25040 in copydb_index_worker (specs=specs@entry=0xffffd8b8ec98) at indexes.c:297 #7 0x0000aaaac3a25238 in copydb_start_index_workers (specs=specs@entry=0xffffd8b8ec98) at indexes.c:209 #8 0x0000aaaac3a252f4 in copydb_index_supervisor (specs=specs@entry=0xffffd8b8ec98) at indexes.c:112 #9 0x0000aaaac3a253f4 in copydb_start_index_supervisor (specs=0xffffd8b8ec98) at indexes.c:57 #10 copydb_start_index_supervisor 
(specs=specs@entry=0xffffd8b8ec98) at indexes.c:34 #11 0x0000aaaac3a51ff4 in copydb_process_table_data (specs=specs@entry=0xffffd8b8ec98) at table-data.c:146 #12 0x0000aaaac3a520dc in copydb_copy_all_table_data (specs=specs@entry=0xffffd8b8ec98) at table-data.c:69 #13 0x0000aaaac3a0ccd8 in cloneDB (copySpecs=copySpecs@entry=0xffffd8b8ec98) at cli_clone_follow.c:602 #14 0x0000aaaac3a0d2cc in start_clone_process (pid=0xffffd8b743d8, copySpecs=0xffffd8b8ec98) at cli_clone_follow.c:502 #15 start_clone_process (copySpecs=copySpecs@entry=0xffffd8b8ec98, pid=pid@entry=0xffffd8b89788) at cli_clone_follow.c:482 #16 0x0000aaaac3a0d52c in cli_clone (argc=<optimized out>, argv=<optimized out>) at cli_clone_follow.c:164 #17 0x0000aaaac3a53850 in commandline_run (command=command@entry=0xffffd8b9eb88, argc=0, argc@entry=22, argv=0xffffd8b9edf8, argv@entry=0xffffd8b9ed48) at /home/admin/pgcopydb/src/bin/pgcopydb/../lib/subcommands.c/commandline.c:71 #18 0x0000aaaac3a01464 in main (argc=22, argv=0xffffd8b9ed48) at main.c:140 (gdb) ``` The problem is most likely that the following call returned a message in a read-only memory segment, where we cannot replace \n with \0 in the splitLines() function in string_utils.c ```C char *message = PQerrorMessage(pgsql->connection); ``` ## Summary of changes Modified the patch to also address this problem	2025-02-06 20:21:18 +00:00
Anastasia Lubennikova	44b905d14b	Fix remote extension lookup (#10708 ) when library name doesn't match extension name. The bug was introduced by recent commit `ebc55e6a`	2025-02-06 19:21:38 +00:00
Arseny Sher	186199f406	Update aws sdk (#10699 ) ## Problem We have an unclear issue with a stuck s3 client, probably after a partial aws sdk update that did not update sdk-s3: https://github.com/neondatabase/neon/pull/10588 Let's try to update s3 as well. ## Summary of changes Result of running cargo update -p aws-types -p aws-sigv4 -p aws-credential-types -p aws-smithy-types -p aws-smithy-async -p aws-sdk-kms -p aws-sdk-iam -p aws-sdk-s3 -p aws-config ref https://github.com/neondatabase/neon/issues/10695	2025-02-06 17:28:27 +00:00
OBBO67	82cbab7512	Switch reqlsns[0].request_lsn to arrow operator in neon_read_at_lsnv() (#10620 ) (#10687 ) ## Problem Currently the following line below uses array subscript notation which is confusing since `reqlsns` is not an array but just a pointer to a struct. ``` XLogWaitForReplayOf(reqlsns[0].request_lsn); ``` ## Summary of changes Switch from array subscript notation to arrow operator to improve readability of code. Close #10620.	2025-02-06 17:26:26 +00:00
Erik Grinaker	2943590694	pageserver: use histogram for background job semaphore waits (#10697 ) ## Problem We don't have visibility into how long an individual background job is waiting for a semaphore permit. ## Summary of changes * Make `pageserver_background_loop_semaphore_wait_seconds` a histogram rather than a sum. * Add a paced warning when a task takes more than 10 minutes to get a permit (for now). * Drive-by cleanup of some `EnumMap` usage.	2025-02-06 17:17:47 +00:00
John Spray	df06c41085	tests: don't detach from controller in test_issue_5878 (#10675 ) ## Problem This test called NeonPageserver.tenant_detach, which as well as detaching locally on the pageserver, also updates the storage controller to put the tenant into Detached mode. When the test runs slowly in debug mode, it sometimes takes long enough that the background_reconcile loop wakes up and drops the tenant from memory in response, such that the pageserver can't validate its deletions and the test does not behave as expected. Closes: https://github.com/neondatabase/neon/issues/10513 ## Summary of changes - Call the pageserver HTTP client directly rather than going via NeonPageserver.tenant_detach	2025-02-06 15:18:50 +00:00
Alexander Bayandin	ddd7c36343	CI(approved-for-ci-run): Use internal CI_ACCESS_TOKEN for cloning repo (#10693 ) ## Problem The default `GITHUB_TOKEN` is used to push changes created with `approved-for-ci-run`, which doesn't work: ``` Run git push --force origin "${BRANCH}" remote: Permission to neondatabase/neon.git denied to github-actions[bot]. fatal: unable to access 'https://github.com/neondatabase/neon/': The requested URL returned error: 403 ``` Ref: https://github.com/neondatabase/neon/actions/runs/13166108303/job/36746518291?pr=10687 ## Summary of changes - Use `CI_ACCESS_TOKEN` to clone an external repo - Remove unneeded `actions/checkout`	2025-02-06 14:40:22 +00:00
Peter Bendel	839f41f5bb	fix pgcopydb seg fault and -c idle_in_transaction_session_timeout=0 (#10692 ) ## Problem During ingest_benchmark, which uses `pgcopydb` ([see](https://github.com/dimitri/pgcopydb)), we sometimes had outages. - When the PostgreSQL COPY step failed, we got a segfault (reported [here](https://github.com/dimitri/pgcopydb/issues/899)) - The root cause was that Neon's idle_in_transaction_session_timeout is set to 5 minutes, which is suboptimal for long-running tasks like project import (reported [here](https://github.com/dimitri/pgcopydb/issues/900)) ## Summary of changes Patch pgcopydb to avoid the segfault. Override idle_in_transaction_session_timeout and set it to "unlimited".	2025-02-06 14:39:45 +00:00
Alex Chi Z.	f22d41eaec	feat(pageserver): num of background job metrics (#10690 ) ## Problem We need metrics to know what's going on in the pageserver's background jobs. ## Summary of changes * Waiting tasks: tasks still waiting for the semaphore. * Running tasks: tasks doing their actual jobs. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Erik Grinaker <erik@neon.tech>	2025-02-06 14:39:37 +00:00
Alexander Lakhin	977781e423	Enable sanitizers for postgres v17 (#10401 ) Add a build with sanitizers (asan, ubsan) to the CI pipeline and run tests on it. See https://github.com/neondatabase/neon/issues/6053 --------- Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2025-02-06 12:53:43 +00:00
Arpad Müller	67b71538d0	Limit returned lsn for timestamp by the planned gc cutoff (#10678 ) Often the output of the timestamp->lsn API is used as input for branch creation, and branch creation takes the planned lsn into account, i.e. rejects branch lsns that are before the planned lsn. This patch doesn't fix all race conditions, it's still racy. But at least it is a step in the right direction. For #10639	2025-02-06 11:17:08 +00:00
Erik Grinaker	f4cfa725b8	pageserver: add a few critical errors (#10657 ) ## Problem Following #10641, let's add a few critical errors. Resolves #10094. ## Summary of changes Adds the following critical errors: * WAL sender read/decode failure. * WAL record ingestion failure. * WAL redo failure. * Missing key during compaction. We don't add an error for missing keys during GetPage requests, since we've seen a handful of these in production recently, and the cause is still unclear (most likely a benign race).	2025-02-06 10:30:27 +00:00
Arpad Müller	05326cc247	Skip gc cutoff lsn check at timeline creation if lease exists (#10685 ) Right now, branch creation doesn't care if a lsn lease exists or not, it just fails if the passed lsn is older than either the last or the planned gc cutoff. However, if an lsn lease exists for a given lsn, we can actually create a branch at that point: nothing has been gc'd away. This prevents race conditions that #10678 still leaves around. Related: #10639 https://github.com/neondatabase/cloud/issues/23667	2025-02-06 10:10:11 +00:00
Arpad Müller	b66fbd6176	Warn on basebackups for archived timelines (#10688 ) We don't want any external requests for an archived timeline. This includes basebackup requests, i.e. when a compute is being started up. Therefore, we'd like to forbid such basebackup requests: any attempt to get a basebackup on an archived timeline (or any getpage request really) is a cplane bug. Make this a warning for now so that, if there is potentially a bug, we can detect cases in the wild before they cause stuck operations, but the intention is to return an error eventually. Related: #9548	2025-02-06 10:09:20 +00:00
Vlad Lazar	95588dab98	safekeeper: fix wal fan-out shard subscription data race (#10677 ) ## Problem [This select arm](https://github.com/neondatabase/neon/blob/main/safekeeper/src/send_interpreted_wal.rs#L414) runs when we want to attach a new reader to the current cursor. It checks the current position of the cursor and resets it if required. The current position of the cursor is updated in the [other select arm](https://github.com/neondatabase/neon/blob/main/safekeeper/src/send_interpreted_wal.rs#L336-L345). That runs when we get some WAL to send. Now, what happens if we want to attach two shards consecutively to the cursor? Let's say [this select arm](https://github.com/neondatabase/neon/blob/main/safekeeper/src/send_interpreted_wal.rs#L397) runs twice in a row. Let's assume cursor is currently at LSN X. First shard wants to attach at position V and the other one at W. Assume X > W > V. First shard resets the stream to position V. Second shard comes in, sees stale cursor position X and resets it to W. This means that the first shard doesn't get wal in the [V, W) range. ## Summary of changes Ultimately, this boils down to the current position not being kept in sync with the reset of the WAL stream. This patch fixes the race by updating it when resetting the WAL stream and adds a unit test repro. Closes https://github.com/neondatabase/cloud/issues/23750	2025-02-06 09:24:28 +00:00
Christian Schwarz	1686d9e733	perf(page_service): dont `.instrument(span.clone())` the response flush (#10686 ) On my AX102 Hetzner box, removing this line removes about 20us from the `latency_mean` result in `test_pageserver_characterize_latencies_with_1_client_and_throughput_with_many_clients_one_tenant`. If the same 20us can be removed in the nightly benchmark run, this will be a ~10% improvement because there, mean latencies are about ~220us. This span was added during batching refactors, we didn't have it before, and I don't think it's terribly useful. refs - https://github.com/neondatabase/cloud/issues/21759	2025-02-06 08:33:37 +00:00
Ivan Efremov	3e624581cd	Merge pull request #10691 from neondatabase/rc/release-proxy/2025-02-06 Proxy release 2025-02-06	2025-02-06 10:23:43 +02:00
Erik Grinaker	abcd00181c	pageserver: set a concurrency limit for LocalFS (#10676 ) ## Problem The local filesystem backend for remote storage doesn't set a concurrency limit. While it can't/won't enforce a concurrency limit itself, this also bounds the upload queue concurrency. Some tests create thousands of uploads, which slows down the quadratic scheduling of the upload queue, and there is no point spawning that many Tokio tasks. Resolves #10409. ## Summary of changes Set a concurrency limit of 100 for the LocalFS backend. Before: `test_layer_map[release-pg17].test_query: 68.338 s` After: `test_layer_map[release-pg17].test_query: 5.209 s`	2025-02-06 07:24:36 +00:00
Konstantin Knizhnik	01f0be03b5	Fix bugs in lfc_cache_containsv (#10682 ) ## Problem Incorrect manipulations with iteration index in `lfc_cache_containsv` ## Summary of changes Two fixes: replace `int this_chunk = Min(nblocks, BLOCKS_PER_CHUNK - chunk_offs);` with `int this_chunk = Min(nblocks - i, BLOCKS_PER_CHUNK - chunk_offs);`, and replace `if (i + 1 >= nblocks)` with `if (i >= nblocks)`. Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-02-06 07:00:00 +00:00
github-actions[bot]	fedf4f169c	Proxy release 2025-02-06	2025-02-06 06:02:11 +00:00
Konstantin Knizhnik	81cd30e4d6	Use #ifdef instead of #if USE_ASSERT_CHECKING (#10683 ) ## Problem `USE_ASSERT_CHECKING` is defined as an empty macro, but it is checked using `#if`. ## Summary of changes Replace `#if USE_ASSERT_CHECKING` with `#ifdef USE_ASSERT_CHECKING`, as done in other places in Postgres. Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-02-06 05:47:56 +00:00
Konstantin Knizhnik	7fc6953da4	Is neon superuser (#10625 ) ## Problem The is_neon_superuser() function is public in pg14/pg15 but statically defined in publicationcmd.c in pg16/pg17. ## Summary of changes Make this function public for all Postgres versions. It is intended to be used not only in publicationcmd.c. See https://github.com/neondatabase/postgres/pull/573 https://github.com/neondatabase/postgres/pull/576 --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-02-06 05:42:14 +00:00
Christian Schwarz	77f9e74d86	pgxn: include socket send & recv queue size in slow response logs (#10673 ) # Problem When we see an apparent slow request, one possible cause is that the client is failing to consume responses, but we don't have a clear way to see that. # Solution - Log the socket queue depths on slow/stuck connections, so that we have an indication of whether the compute is keeping up with processing the connection's responses. refs - slack https://neondb.slack.com/archives/C036U0GRMRB/p1738652644396329 - refs https://github.com/neondatabase/cloud/issues/23515 - refs https://github.com/neondatabase/cloud/issues/23486	2025-02-06 01:14:29 +00:00
Alex Chi Z.	0ceeec9be3	fix(pageserver): schedule compaction immediately if pending (#10684 ) ## Problem The code is intended to reschedule compaction immediately if there are pending tasks. We set the duration to 0 before if there are pending tasks, but this will go through the `if period == Duration::ZERO {` branch and sleep for another 10 seconds. ## Summary of changes Set duration to 1 so that it doesn't sleep for too long. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-05 22:11:50 +00:00
Alex Chi Z.	733a57247b	fix(pageserver): disallow gc-compaction produce l0 layer (#10679 ) ## Problem Any compaction should never produce l0 layers. This never happened in my experiments, but would be good to guard it early. ## Summary of changes Disallow gc-compaction to produce l0 layers. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-05 20:44:28 +00:00
Heikki Linnakangas	6699a30a49	Make it easy to build only a subset of extensions into compute image (#10655 ) The full build of all extensions takes a long time. When working locally on parts that don't need extensions, you can iterate more quickly by skipping the unnecessary extensions. This adds a build argument to the dockerfile to specify extensions to build. There are three options: - EXTENSIONS=all (default) - EXTENSIONS=minimal: Build only a few extensions that are listed in shared_preload_libraries in the default neon config. - EXTENSIONS=none: Build no extensions (except for the mandatory 'neon' extension).	2025-02-05 18:07:51 +00:00
Alex Chi Z.	133b89a83d	feat(pageserver): continue from last incomplete image layer creation (#10660 ) ## Problem close https://github.com/neondatabase/neon/issues/10651 ## Summary of changes * Image layer creation starts from the next partition of the last processed partition if the previous attempt was not complete. * Add tests. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-05 17:35:39 +00:00
Arseny Sher	fba22a7123	Record more timings in test_layer_map (#10670 ) ## Problem It is not very clear how much time different operations take. ## Summary of changes Record more timings. ref https://github.com/neondatabase/neon/issues/10409	2025-02-05 17:00:26 +00:00
John Spray	14e05276a3	storcon: fix a case where optimise could get stuck on unschedulable node (#10648 ) ## Problem When a shard has two secondary locations, but one of them is on a node with MaySchedule::No, the optimiser would get stuck, because it couldn't decide which secondary to remove. This is generally okay if a node is offline, but if a node is in Pause mode for a long period of time, it's a problem. Closes: https://github.com/neondatabase/neon/issues/10646 ## Summary of changes - Instead of insisting on finding a node in the wrong AZ to remove, find an available node in the _right_ AZ, and remove all the others. This ensures that if there is one live suitable node, then other offline/paused nodes cannot hold things up.	2025-02-05 16:05:12 +00:00
Tristan Partin	ebc55e6ae8	Fix logic for checking if a compute can install a remote extension (#10656 ) Given a remote extensions manifest of the following: ```json { "public_extensions": [], "custom_extensions": null, "library_index": { "pg_search": "pg_search" }, "extension_data": { "pg_search": { "control_data": { "pg_search.control": "comment = 'pg_search: Full text search for PostgreSQL using BM25'\ndefault_version = '0.14.1'\nmodule_pathname = '$libdir/pg_search'\nrelocatable = false\nsuperuser = true\nschema = paradedb\ntrusted = true\n" }, "archive_path": "13117844657/v14/extensions/pg_search.tar.zst" } } } ``` We were allowing a compute to install a remote extension that wasn't listed in either public_extensions or custom_extensions. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-02-05 14:58:33 +00:00
Erik Grinaker	f07119cca7	pageserver: add `pageserver_wal_ingest_values_committed` metric (#10653 ) ## Problem We don't have visibility into the ratio of image vs. delta pages ingested in Pageservers. This might be useful to determine whether we should compress WAL records before storing them, which in turn might make compaction more efficient. ## Summary of changes Add `pageserver_wal_ingest_values_committed` metric with dimensions `class=metadata\|data` and `kind=image\|delta`.	2025-02-05 14:33:04 +00:00
Vlad Lazar	47975d06d9	storcon: silence cplane 404s on tenant creation (#10665 ) ## Problem We get WARN log noise on tenant creations. Cplane creates tenants via /location_config. That returns the attached locations in the response and spawns a reconciliation which will also attempt to notify cplane. If the notification is attempted before cplane persists the shards to its database, storcon gets back a 404. The situation is harmless, but annoying. ## Summary of Changes * Add a tenant creation hint to the reconciler config * If the hint is true and we get back a 404 on the notification from cplane, ignore the error, but still queue the reconcile up for a retry. Closes https://github.com/neondatabase/cloud/issues/20732	2025-02-05 12:41:09 +00:00
Fedor Dikarev	472007dd7c	ci: unify Dockerfiles, set bash as SHELL for debian layers, make cpan step as separate RUN (#10645 ) ## Problem Ref: https://github.com/neondatabase/cloud/issues/23461 and follow-up after: https://github.com/neondatabase/neon/pull/10553 We used `echo` to set up `.wgetrc` and `.curlrc`, using `\n` to write these multiline configs with a single echo command. The problem is that Debian `/bin/sh`'s built-in echo command behaves differently from the `/bin/echo` executable and from the `echo` built-in in `bash`. Namely, it does not support the `-e` option, and while it does treat `\n` as a newline, passing `-e` here will add that `-e` to the output. At the same time, when we use different base images, for example `alpine/curl`, their `/bin/sh` supports and requires `-e` for treating escape sequences like `\n`. Having different `echo` implementations and having to remember the differences in their behaviour is a poor experience for developers and makes Dockerfiles harder to maintain. Work-arounds: - Explicitly use `/bin/bash` (like in this PR) - Use `/bin/echo` instead of the shell's built-in echo function - Use printf "foo\n" instead of echo -e "foo\n" ## Summary of changes 1. To fix that, we go with the option of setting `/bin/bash` as the SHELL for the Debian-based layers. 2. No changes for `alpine/curl`-based layers. 3. One more change: the `extensions` layer step is split into 2 steps, installing dependencies from `CPAN` and installing `lcov` from GitHub, so that upgrading `lcov` can reuse the previous layer with the installed CPAN modules.	2025-02-04 18:58:02 +00:00
Vlad Lazar	f9009d6b80	pageserver: write heatmap to disk after uploading it (#10650 ) ## Problem We wish to make heatmap generation additive in https://github.com/neondatabase/neon/pull/10597. However, if the pageserver restarts and has a heatmap on disk from when it was a secondary long ago, we can end up keeping extra layers on the secondary's disk. ## Summary of changes Persist the heatmap after a successful upload.	2025-02-04 17:52:54 +00:00
Alex Chi Z.	cab60b6d9f	fix(pagesever): stablize gc-compaction tests (#10621 ) ## Problem Hopefully this can resolve https://github.com/neondatabase/neon/issues/10517. The test is flaky because, after a restart, the compute node might write some data, so the pageserver flushes some layers, which in the end causes L0 compaction to run, and we cannot get the test scenario we want. ## Summary of changes Ensure all L0 layers are compacted before starting the test. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-04 16:11:31 +00:00
Erik Grinaker	06090bbccd	pageserver: log critical error on `ClearVmBits` for unknown pages (#10634 ) ## Problem In #9895, we fixed some issues where `ClearVmBits` were broadcast to all shards, even those not owning the VM relation. As part of that, we found some ancient code from #1417, which discarded spurious incorrect `ClearVmBits` records for pages outside of the VM relation. We added observability in #9911 to see how often this actually happens in the wild. After two months, we have not seen this happen once in production or staging. However, out of caution, we don't want a hard error and break WAL ingestion. Resolves #10067. ## Summary of changes Log a critical error when ingesting `ClearVmBits` for unknown VM relations or pages.	2025-02-04 14:55:11 +00:00
Folke Behrens	dcf335a251	proxy: Switch proxy to JSON logging (#9857 ) ## Problem We want to switch proxy and ideally all Rust services to structured JSON logging to support better filtering and cross-referencing with tracing. ## Summary of changes * Introduce a custom tracing-subscriber to write the JSON. In a first attempt a customized tracing::fmt::FmtSubscriber was used, but it's very inefficient and can still generate invalid JSON. It also doesn't allow us to add important fields to the root object. * Make this opt in: the `LOGFMT` env var can be set to `"json"` to enable the new logger at startup.	2025-02-04 14:50:53 +00:00
Arpad Müller	b6e9daea9a	storcon: only allow errrors of the server cert verification (#10644 ) This PR does a bunch of things: * only allow errors of the server cert verification, not of the TLS handshake. The TLS handshake doesn't cause any errors for us so we can just always require it to be valid. This simplifies the code a little. * As the solution is more permanent than originally anticipated, I think it makes sense to move the `AcceptAll` verifier outside. * log the connstr information. this helps with figuring out which domain names are configured in the connstr, etc. I think it is generally useful to print it. make extra sure that the password is not leaked. Follow-up of #10640	2025-02-04 14:01:57 +00:00
a-masterov	d5c3a4e2b9	Add support for pgjwt test (#10611 ) ## Problem We don't currently test pgjwt, while it is based on pg_prove and can be easily added ## Summary of changes The test for pgjwt was added.	2025-02-04 13:49:44 +00:00
Heikki Linnakangas	8107140f7f	Refactor compute dockerfile (#10371 ) Refactor how extensions are built in compute Dockerfile 1. Rename some of the extension layers, so that names correspond more precisely to the upstream repository name and the source directory name. For example, instead of "pg-jsonschema-pg-build", spell it "pg_jsonschema-build". Some of the layer names had the extra "pg-" part, and some didn't; harmonize on not having it. And use an underscore if the upstream project name uses an underscore. 2. Each extension now consists of two dockerfile targets: [extension]-src and [extension]-build. By convention, the -src target downloads the sources and applies any neon-specific patches if necessary. The source tarball is downloaded and extracted under /ext-src. For example, the 'pgvector' extension creates the following files and directory: /ext-src/pgvector.tar.gz # original tarball /ext-src/pgvector.patch # neon-specific patch, copied from patches/ dir /ext-src/pgvector-src/ # extracted tarball, with patch applied This separation avoids re-downloading the sources every time the extension is recompiled. The 'extension-tests' target also uses the [extension]-src layers, by copying the /ext-src/ dirs from all the extensions together into one image. This refactoring came about when I was experimenting with different ways of splitting up the Dockerfile so that each extension would be in a separate file. That's not part of this PR yet, but this is a good step in modularizing the extensions.	2025-02-04 10:35:43 +00:00
Alex Chi Z.	e219d48bfe	refactor(pageserver): clearify compaction return value (#10643 ) ## Problem ## Summary of changes Make the return value of the set of compaction functions less confusing. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-03 21:56:55 +00:00
Alex Chi Z.	c1be84197e	feat(pageserver): preempt image layer generation if L0 piles up (#10572 ) ## Problem Image layer generation could block L0 compactions for a long time. ## Summary of changes * Refactored the return value of `create_image_layers_for_` functions to make it self-explainable. Preempt image layer generation in `Try` mode if L0 piles up. Note that we might potentially run into a state that only the beginning part of the keyspace gets image coverage. In that case, we either need to implement something to prioritize some keyspaces with image coverage, or tune the image_creation_threshold to ensure that the frequency of image creation could keep up with L0 compaction. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Erik Grinaker <erik@neon.tech>	2025-02-03 20:55:47 +00:00
dependabot[bot]	d80cbb2443	build(deps): bump openssl from 0.10.66 to 0.10.70 in /test_runner/pg_clients/rust/tokio-postgres in the cargo group across 1 directory (#10642 ) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-02-03 19:42:40 +00:00
Erik Grinaker	06b45fd0fd	utils/logging: add `critical!` macro and metric (#10641 ) ## Problem We don't currently have good alerts for critical errors, e.g. data loss/corruption. Touches #10094. ## Summary of changes Add a `critical!` macro and corresponding `libmetrics_tracing_event_count{level="critical"}` metric. This will: * Emit an `ERROR` log message with prefix `"CRITICAL:"` and a backtrace. * Increment `libmetrics_tracing_event_count{level="critical"}`, and indirectly `level="error"`. * Trigger a pageable alert (via the metric above). * In debug builds, panic the process. I'll add uses of the macro separately.	2025-02-03 19:23:12 +00:00
John Spray	715e20343a	storage controller: improve scheduling of tenants created in PlacementPolicy::Secondary (#10590 ) ## Problem I noticed when onboarding lots of tenants that the AZ scheduling violation stat was climbing, before falling later as optimisations happened. This was happening because we first add the tenant with PlacementPolicy::Secondary, and then later go to PlacementPolicy::Attached, and the scheduler's behavior led to a bad AZ choice: 1. Create a secondary location in the non-preferred AZ 2. Upgrade to Attached where we promote that non-preferred-AZ location to attached and then create another secondary 3. Optimiser later realises we're in the wrong AZ and moves us ## Summary of changes - Extend some logging to give more information about AZs - When scheduling secondary location in PlacementPolicy::Secondary, select it as if we were attached: in this mode, our business goal is to have a warm pageserver location that we can make available as attached quickly if needed, therefore we want it to be in the preferred AZ. - Make optimize_secondary logic the same, so that it will consider a secondary location in the preferred AZ to be optimal when in PlacementPolicy::Secondary - When transitioning from PlacementPolicy::Attached(N) to PlacementPolicy::Secondary, instead of arbitrarily picking a location to keep, prefer to keep the location in the preferred AZ	2025-02-03 19:01:16 +00:00
Arpad Müller	c774f0a147	storcon db: allow accepting any TLS certificate (#10640 ) We encountered some TLS validation errors for the storcon since applying #10614. Add an option to downgrade them to logged errors instead to allow us to debug with more peace. cc issue https://github.com/neondatabase/cloud/issues/23583	2025-02-03 18:21:01 +00:00
Folke Behrens	628a9616c4	fix(proxy): Don't use --is-private-access-proxy to disable IP check (#10633 ) ## Problem * The behavior of this flag changed. Plus, it's not necessary to disable the IP check as long as there are no IPs listed in the local postgres. ## Summary of changes * Drop the flag from the command in the README.md section. * Change the postgres URL passed to proxy to not use the endpoint hostname. * Also swap postgres creation and proxy startup, so the DB is running when proxy comes up.	2025-02-03 14:12:41 +00:00
Alexander Bayandin	43682624b5	CI(pg-clients): fix logical replication tests (#10623 ) ## Problem Tests for logical replication (on Staging) have been failing for some time because logical replication is not enabled for them. This issue occurred after switching to an org API key with a different default setting, where logical replication was not enabled by default. ## Summary of changes - Add `enable_logical_replication` input to `actions/neon-project-create` - Enable logical replication in `test-logical-replication` job	2025-02-03 13:41:41 +00:00
Em Sharnoff	e617a3a075	vm-monitor: Improve error display (#10542 ) Logging errors with the debug format specifier causes multi-line errors, which are sometimes a pain to deal with. Instead, we should use anyhow's alternate display format, which shows the same information on a single line. Also adjusted a couple of error messages that were stale. Fixes neondatabase/cloud#14710.	2025-02-03 13:34:11 +00:00
Fedor Dikarev	23ca8b061b	Use actions/checkout for checkout (#10630 ) ## Problem 1. First of all it's more correct 2. Current usage allows ` Time-of-Check-Time-of-Use (TOCTOU) 'Pwn Request' vulnerabilities`. Please check security slack channel or reach me for more details. I will update PR description after merge. ## Summary of changes 1. Use `actions/checkout` with `ref: ${{ github.event.pull_request.head.sha }}` Discovered by and Co-author: @varunsh-coder	2025-02-03 12:55:48 +00:00
Anastasia Lubennikova	b1bc33eb4d	Fix logical_replication_sync test fixture (#10531 ) Fixes flaky test_lr_with_slow_safekeeper test #10242 Fix query to `pg_catalog.pg_stat_subscription` catalog to handle table synchronization and parallel LR correctly.	2025-02-03 12:44:47 +00:00
OBBO67	b1e451091a	pageserver: clean up references to timeline delete marker, uninit marker (#5718 ) (#10627 ) ## Problem Since [#5580](https://github.com/neondatabase/neon/pull/5580) the delete and uninit file markers are no longer needed. ## Summary of changes Remove the remaining code for the delete and uninit markers. Additionally removes the `ends_with_suffix` function as it is no longer required. Closes [#5718](https://github.com/neondatabase/neon/issues/5718).	2025-02-03 11:54:07 +00:00
Arpad Müller	87ad50c925	storcon: use diesel-async again, now with tls support (#10614 ) Successor of #10280 after it was reverted in #10592. Re-introduce the usage of diesel-async again, but now also add TLS support so that we connect to the storcon database using TLS. By default, diesel-async doesn't support TLS, so add some code to make us explicitly request TLS. cc https://github.com/neondatabase/cloud/issues/23583	2025-02-03 11:53:51 +00:00
Alexander Bayandin	89b9f74077	CI(pre-merge-checks): do not run `conclusion` job for PRs (#10619 ) ## Problem While working on https://github.com/neondatabase/neon/pull/10617 I (unintentionally) merged the PR before the main CI pipeline has finished. I suspect this happens because we have received all the required job results from the pre-merge-checks workflow, which runs on PRs that include changes to relevant files. ## Summary of changes - Skip the `conclusion` job in `pre-merge-checks` workflows for PRs	2025-02-03 09:40:12 +00:00
John Spray	f071800979	tests: stabilize shard locations earlier in test_scrubber_tenant_snapshot (#10606 ) ## Problem This test would sometimes emit unexpected logs from the storage controller's requests to do migrations, which overlap with the test's restarts of pageservers, where those migrations are happening some time after a shard split as the controller moves load around. Example: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-10602/13067323736/index.html#testresult/f66f1329557a1fc5/retries ## Summary of changes - Do a reconcile_until_idle after shard split, so that the rest of the test doesn't run concurrently with migrations	2025-02-03 09:02:21 +00:00
Peter Bendel	4dfe60e2ad	revert https://github.com/neondatabase/neon/pull/10616 (#10631 ) ## Problem https://github.com/neondatabase/neon/pull/10616 was only intended as a temporary change during the weekend; we want to reset to the prior state. ## Summary of changes Revert https://github.com/neondatabase/neon/pull/10616 but keep the fixes in https://github.com/neondatabase/neon/pull/10622	2025-02-03 09:00:23 +00:00
Arpad Müller	8ae6f656a6	Don't require partial backup semaphore capacity for deletions (#10628 ) In the safekeeper, we block deletions on the timeline's gate closing, and any `WalResidentTimeline` keeps the gate open (because it owns a gate lock object). Thus, unless the `main_task` function of a partial backup returns, we can't delete the associated timeline. In order to make these tasks exit early, we call the cancellation token of the timeline upon its shutdown. However, the partial backup task wasn't looking for the cancellation while waiting to acquire a partial backup permit. On a staging safekeeper we have been in a situation in the past where the semaphore was already empty for a duration of many hours, rendering all attempted deletions unable to proceed until a restart where the semaphore was reset: https://neondb.slack.com/archives/C03H1K0PGKH/p1738416586442029	2025-02-03 04:11:06 +00:00
Peter Bendel	b9e1a67246	fix generate matrix for olap for saturdays (#10622 ) ## Problem when introducing pg17 for job step `Generate matrix for OLAP benchmarks` I introduced a syntax error that only hits on Saturdays. ## Summary of changes Remove trailing comma ## successful test run https://github.com/neondatabase/neon/actions/runs/13086363907	2025-02-01 11:09:45 +00:00
Folke Behrens	6318828c63	Update rust to 1.84.1 (#10618 ) We keep the practice of keeping the compiler up to date, pointing to the latest release. This is done by many other projects in the Rust ecosystem as well. [Release notes](https://releases.rs/docs/1.84.1/). Prior update was in https://github.com/neondatabase/neon/pull/10328. Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2025-01-31 20:52:17 +00:00
Stefan Radig	6dd48ba148	feat(proxy): Implement access control with VPC endpoint checks and block for public internet / VPC (#10143 ) - Wired up filtering on VPC endpoints - Wired up block access from public internet / VPC depending on per project flag - Added cache invalidation for VPC endpoints (partially based on PR from Raphael) - Removed BackendIpAllowlist trait --------- Co-authored-by: Ivan Efremov <ivan@neon.tech>	2025-01-31 20:32:57 +00:00
Conrad Ludgate	ad1a41157a	feat(proxy): optimizing the chances of large write in copy_bidirectional (#10608 ) We forked copy_bidirectional to solve some issues like fast-shutdown (disallowing half-open connections) and to introduce better error tracking (which side of the conn closed down). A change recently made its way upstream offering performance improvements: https://github.com/tokio-rs/tokio/pull/6532. These seem applicable to our fork, thus it makes sense to apply them here as well.	2025-01-31 19:14:27 +00:00
Tristan Partin	fcd195c2b6	Migrate compute_ctl arg parsing to clap derive (#10497 ) The primary benefit is that all the ad hoc get_matches() calls are no longer necessary. Now all it takes to get at the CLI arguments is referencing a struct member. It's also great that we can replace the ad hoc CLI struct we had with this more formal solution. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-01-31 19:04:26 +00:00
Peter Bendel	bc7822d90c	temporarily disable some steps and run more often to expose more pgbench --initialize in benchmarking workflow (#10616 ) ## Problem we want to disable some steps in benchmarking workflow that do not initialize new projects and instead run the test more frequently Test run https://github.com/neondatabase/neon/actions/runs/13077737888	2025-01-31 18:41:17 +00:00
Alexander Bayandin	48c87dc458	CI(pre-merge-checks): fix condition (#10617 ) ## Problem Merge Queue fails if changes include Rust code. ## Summary of changes - Fix condition for `build-build-tools-image` - Add a couple of no-op `false \|\|` to make predicates look symmetric	2025-01-31 18:07:26 +00:00
John Spray	aedeb1c7c2	pageserver: revise logging of cancelled request results (#10604 ) ## Problem When a client dropped before a request completed, and a handler returned an ApiError, we would log that at error severity. That was excessive in the case of a request erroring on a shutdown, and could cause test flakes. example: https://neon-github-public-dev.s3.amazonaws.com/reports/main/13067651123/index.html#suites/ad9c266207b45eafe19909d1020dd987/6021ce86a0d72ae7/ ``` Cancelled request finished with an error: ShuttingDown ``` ## Summary of changes - Log a different info-level on ShuttingDown and ResourceUnavailable API errors from cancelled requests	2025-01-31 17:43:54 +00:00
John Spray	a93e9f22fc	pageserver: remove faulty debug assertion in compaction (#10610 ) ## Problem This assertion is incorrect: it is legal to see another shard's data at this point, after a shard split. Closes: https://github.com/neondatabase/neon/issues/10609 ## Summary of changes - Remove faulty assertion	2025-01-31 17:43:31 +00:00
JC Grünhage	10cf5e7a38	Move cargo-deny into a separate workflow on a schedule (#10289 ) ## Problem There are two (related) problems with the previous handling of `cargo-deny`: - When a new advisory is added to rustsec that affects a dependency, unrelated pull requests will fail. - New advisories rely on pushes or PRs to be surfaced. Problems that already exist on main will only be found if we try to merge new things into main. ## Summary of changes We split out `cargo-deny` into a separate workflow that runs on all PRs that touch `Cargo.lock`, and on a schedule on `main`, `release`, `release-compute` and `release-proxy` to find new advisories.	2025-01-31 13:42:59 +00:00
Arpad Müller	dce617fe07	Update to rebased rust-postgres (#10584 ) Update to a rebased version of our rust-postgres patches, rebased on [this](`98f5a11bc0`) commit this time. With #10280 reapplied, this means that the rust-postgres crates will be deduplicated, as the new crate versions are finally compatible with the requirements of diesel-async. Earlier update: #10561 rust-postgres PR: https://github.com/neondatabase/rust-postgres/pull/39	2025-01-31 12:40:20 +00:00
Alexander Bayandin	503bc72d31	CI: add `diesel print-schema` check (#10527 ) ## Problem We want to check that `diesel print-schema` doesn't generate any changes (`storage_controller/src/schema.rs`) in comparison with the list of migration. ## Summary of changes - Add `diesel_cli` to `build-tools` image - Add `Check diesel schema` step to `build-neon` job, at this stage we have all required binaries, so don't need to compile anything additionally - Check runs only on x86 release builds to be sure we do it at least once per CI run.	2025-01-31 11:48:46 +00:00
Fedor Dikarev	89cff08354	unify pg-build-nonroot-with-cargo base layer and config retries in curl (#10575 ) Ref: https://github.com/neondatabase/cloud/issues/23461 ## Problem While making changes in this area, I saw that these 2 base layers could be optimised. Also, following a review comment from @myrrc, set up timeouts and retries in the `alpine/curl` image. ## Summary of changes	2025-01-31 11:46:33 +00:00
Erik Grinaker	afbcebe7f7	test_runner: force-compact in `test_sharding_autosplit` (#10605 ) ## Problem This test may not fully detect data corruption during splits, since we don't force-compact the entire keyspace. ## Summary of changes Force-compact all data in `test_sharding_autosplit`.	2025-01-31 11:31:58 +00:00
Arpad Müller	7d5c70c717	Update AWS SDK crates (#10588 ) We want to keep the AWS SDK up to date as that way we benefit from new developments and improvements. Prior update was in #10056	2025-01-31 11:23:12 +00:00
John Spray	f09cfd11cb	pageserver: exclude archived timelines from freeze+flush on shutdown (#10594 ) ## Problem If offloading races with normal shutdown, we get a "failed to freeze and flush: cannot flush frozen layers when flush_loop is not running, state is Exited". This is harmless but points to it being quite strange to try and freeze and flush such a timeline. flushing on shutdown for an archived timeline isn't useful. Related: https://github.com/neondatabase/neon/issues/10389 ## Summary of changes - During Timeline::shutdown, ignore ShutdownMode::FreezeAndFlush if the timeline is archived	2025-01-31 10:54:14 +00:00
Arseny Sher	765ba43438	Allow pageserver unreachable errors in test_scrubber_tenant_snapshot (#10585 ) ## Problem test_scrubber_tenant_snapshot restarts pageservers, but log validation fails tests on any non white listed storcon warnings, making the test flaky. ## Summary of changes Allow warns like 2025-01-29T12:37:42.622179Z WARN reconciler{seq=1 tenant_id=2011077aea9b4e8a60e8e8a19407634c shard_id=0004}: Call to node 2 (localhost:15352) management API failed, will retry (attempt 1): receive body: error sending request for url (http://localhost:15352/v1/tenant/2011077aea9b4e8a60e8e8a19407634c-0004/location_config): client error (Connect) ref https://github.com/neondatabase/neon/issues/10462	2025-01-31 10:33:24 +00:00
Folke Behrens	6041a93591	Update tokio base crates (#10556 ) Update `tokio` base crates and their deps. Pin `tokio` to at least 1.41 which stabilized task ID APIs. To dedup `mio` dep the `notify` crate is updated. It's used in `compute_tools`. `9f81828429/compute_tools/src/pg_helpers.rs (L258-L367)`	2025-01-31 09:54:31 +00:00
Conrad Ludgate	738bf83583	chore: replace dashmap with clashmap (#10582 ) ## Problem Because dashmap 6 switched to hashbrown RawTable API, it required us to use unsafe code in the upgrade: https://github.com/neondatabase/neon/pull/8107 ## Summary of changes Switch to clashmap, a fork maintained by me which removes much of the unsafe and ultimately switches to HashTable instead of RawTable to remove much of the unsafe requirement on us.	2025-01-31 09:53:43 +00:00
Anna Stepanyan	423e239617	[infra/notes] impr: add issue types to issue templates (#10018 ) refs #0000 --------- Co-authored-by: Fedor Dikarev <fedor@neon.tech>	2025-01-31 06:29:06 +00:00
Heikki Linnakangas	df87a55609	tests: Speed up test_pgdata_import_smoke on Postgres v17 (#10567 ) The test runs this query: select count(*), sum(data::bigint)::bigint from t to validate the test results between each part of the test. It performs a simple sequential scan and aggregation, but was taking an order of magnitude longer on v17 than on previous Postgres versions, which sometimes caused the test to time out. There were two reasons for that: 1. On v17, the planner estimates the table to have only one row. In reality it has 305790 rows, and older versions estimated it at 611580, which is not too bad given that the table has not been analyzed so the planner bases that estimate just on the number of pages and the widths of the datatypes. The new estimate of 1 row is much worse, and it leads the planner to disregard parallel plans, whereas on older versions you got a Parallel Seq Scan. I tracked this down to upstream commit 29cf61ade3, "Consider fillfactor when estimating relation size". With that commit, table_block_relation_estimate_size() function calculates that each page accommodates less than 1 row when the fillfactor is taken into account, which rounds down to 0. In reality, the executor will always place at least one row on a page regardless of fillfactor, but the new estimation formula doesn't take that into account. I reported this to pgsql-hackers (https://www.postgresql.org/message-id/2bf9d973-7789-4937-a7ca-0af9fb49c71e%40iki.fi), we don't need to do anything more about it in neon. It's OK to not use parallel scans here; once issue 2. below is addressed, the queries are fast enough without parallelism. 2. On v17, prefetching was not happening for the sequential scan. That's because starting with v17, buffers are reserved in the shared buffer cache before prefetching is initiated, and we use a tiny shared_buffers=1MB setting in the tests. 
The prefetching is effectively disabled with such a small shared_buffers setting, to protect the system from completely starving out of buffers. To address that, simply bump up shared_buffers in the test. This patch addresses the second issue, which is enough to fix the problem.	2025-01-30 22:55:17 +00:00
John Spray	5e0c40709f	storcon: refine chaos selection logic (#10600 ) ## Problem In https://github.com/neondatabase/neon/pull/10438 it was pointed out that it would be good to avoid picking tenants in ID order, and also to avoid situations where we might double-select the same tenant. There was an initial swing at this in https://github.com/neondatabase/neon/pull/10443, where Chi suggested a simpler approach which is done in this PR ## Summary of changes - Split total set of tenants into in and out of home AZ - Consume out of home AZ first, and if necessary shuffle + consume from in home AZ	2025-01-30 22:45:43 +00:00
John Spray	e1273acdb1	pageserver: handle shutdown cleanly in layer download API (#10598 ) ## Problem This API is used in tests and occasionally for support. It cast all errors to 500. That can cause a failure on the log checks: https://neon-github-public-dev.s3.amazonaws.com/reports/main/13056992876/index.html#suites/ad9c266207b45eafe19909d1020dd987/683a7031d877f3db/ ## Summary of changes - Avoid using generic anyhow::Error for layer downloads - Map shutdown cases to 503 in http route	2025-01-30 22:43:36 +00:00
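The status mapping described in #10598 can be sketched as follows. This is a minimal illustration, not the pageserver's code: `DownloadError`, its variants, and `http_status` are hypothetical names chosen for the example.

```rust
/// Hypothetical typed error for a layer download, replacing a generic
/// `anyhow::Error` so the HTTP layer can tell shutdown apart from real failures.
#[derive(Debug)]
enum DownloadError {
    /// The tenant or timeline is shutting down (assumed variant name).
    Cancelled,
    /// Any other failure, carrying a description.
    Other(String),
}

/// Map the typed error to an HTTP status code: shutdown becomes
/// 503 Service Unavailable, everything else stays 500.
fn http_status(err: &DownloadError) -> u16 {
    match err {
        DownloadError::Cancelled => 503,
        DownloadError::Other(_) => 500,
    }
}
```

With a typed error, a request that races with shutdown returns 503 instead of a 500 that trips the log checks.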
John Spray	d18f6198e1	storcon: fix AZ-driven tenant selection in chaos (#10443 ) ## Problem In https://github.com/neondatabase/neon/pull/10438 I had got the function for picking tenants backwards, and it was preferring to move things _away_ from their preferred AZ. ## Summary of changes - Fix condition in `is_attached_outside_preferred_az`	2025-01-30 22:17:07 +00:00
John Spray	6da7c556c2	pageserver: fix race cleaning up timeline files when shut down during bootstrap (#10532 ) ## Problem Timeline bootstrap starts a flush loop, but doesn't reliably shut down the timeline (incl. waiting for flush loop to exit) before destroying UninitializedTimeline, and that destructor tries to clean up local storage. If local storage is still being written to, then this is unsound. Currently the symptom is that we see a "Directory not empty" error log, e.g. https://neon-github-public-dev.s3.amazonaws.com/reports/main/12966756686/index.html#testresult/5523f7d15f46f7f7/retries ## Summary of changes - Move fallible IO part of bootstrap into a function (notably, this is fallible in the case of the tenant being shut down while creation is happening) - When that function returns an error, call shutdown() on the timeline	2025-01-30 20:33:22 +00:00
a-masterov	bf6d5e93ba	Run tests of the contrib extensions (#10392 ) ## Problem We don't test the extensions, shipped with contrib ## Summary of changes The tests are now running	2025-01-30 19:32:35 +00:00
Arpad Müller	4d2c2e9460	Revert "storcon: switch to diesel-async and tokio-postgres (#10280 )" (#10592 ) There was a regression of #10280, tracked in [#23583](https://github.com/neondatabase/cloud/issues/23583). I have ideas how to fix the issue, but we are too close to the release cutoff, so revert #10280 for now. We can revert the revert later :).	2025-01-30 19:23:25 +00:00
John Spray	bae0de643e	tests: relax constraints on test_timeline_archival_chaos (#10595 ) ## Problem The test asserts that it completes at least 10 full timeline lifecycles, but the noisy CI environment sometimes doesn't meet that goal. Related: https://github.com/neondatabase/neon/issues/10389 ## Summary of changes - Sleep for longer between pageserver restarts, so that the timeline workers have more chance to make progress - Sleep for shorter between retries from timeline worker, so that they have better chance to get in while a pageserver is up between restarts - Relax the success condition to complete at least 5 iterations instead of 10	2025-01-30 19:22:59 +00:00
Cheng Chen	8293b252b2	chore(compute): pg_mooncake v0.1.1 (#10578 ) ## Problem Upgrade pg_mooncake to v0.1.1 ## Summary of changes https://github.com/Mooncake-Labs/pg_mooncake/blob/main/CHANGELOG.md#011-2025-01-29	2025-01-30 18:33:25 +00:00
Peter Bendel	6c8fc909d6	Benchmarking PostgreSQL17: for OLAP need specific connstr secrets (#10587 ) ## Problem for OLAP benchmarks we need specific connstr secrets with different database names for each job step This is a follow-up for https://github.com/neondatabase/neon/pull/10536 In previous PR we used a common GitHub secret for a shared re-use project that has 4 databases: neondb, tpch, clickbench and userexamples. [Failure example](https://neon-github-public-dev.s3.amazonaws.com/reports/main/13044872855/index.html#suites/54d0af6f403f1d8611e8894c2e07d023/fc029330265e9f6e/): ```log # /tmp/neon/pg_install/v17/bin/psql user=neondb_owner dbname=neondb host=ep-broad-brook-w2luwzzv.us-east-2.aws.neon.build sslmode=require options='-cstatement_timeout=0 ' -c -- $ID$ -- TPC-H/TPC-R Pricing Summary Report Query (Q1) -- Functional Query Definition -- Approved February 1998 ... ERROR: relation "lineitem" does not exist ``` ## Summary of changes We need dedicated GitHub secrets and dedicated connection strings for each of the use cases. ## Test run https://github.com/neondatabase/neon/actions/runs/13053968231	2025-01-30 16:41:46 +00:00
Heikki Linnakangas	efe42db264	tests: test_pgdata_import_smoke requires the 'testing' cargo feature (#10569 ) It took me ages to figure out why it was failing on my laptop. What I saw was that when the test makes the 'import_pgdata' in the pageserver, the pageserver actually performs a regular 'bootstrap' timeline creation by running initdb, with no importing. It boiled down to the json request that the test uses: ``` { "new_timeline_id": str(timeline_id), "import_pgdata": { "idempotency_key": str(idempotency), "location": {"LocalFs": {"path": str(importbucket.absolute())}}, }, }, ``` and how serde deserializes into rust structs. The 'LocalFs' enum variant in `models.rs` is gated on the 'testing' cargo feature. On a non-testing build, that got deserialized into the default Bootstrap enum variant, as a valid TimelineCreateRequestModeImportPgdata variant could not be formed. PS. IMHO we should get rid of the testing feature, compile in all the functionality, and have a runtime flag to disable anything dangerous. With that, you would've gotten a nice "feature only enabled in testing mode" error in this case, or the test would've simply worked. But that's another story.	2025-01-30 16:11:26 +00:00
Alex Chi Z.	cf6dee946e	fix(pageserver): gc-compaction race with read (#10543 ) ## Problem close https://github.com/neondatabase/neon/issues/10482 ## Summary of changes Add an extra lock on the read path to protect against races. The read path has an implication that only certain kind of compactions can be performed. Garbage keys must first have an image layer covering the range, and then being gc-ed -- they cannot be done in one operation. An alternative to fix this is to move the layers read guard to be acquired at the beginning of `get_vectored_reconstruct_data_timeline`, but that was intentionally optimized out and I don't want to regress. The race is not limited to image layers. Gc-compaction will consolidate deltas automatically and produce a flat delta layer (i.e., when we have retain_lsns below the gc-horizon). The same race would also cause behaviors like getting an un-replayable key history as in https://github.com/neondatabase/neon/issues/10049. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-01-30 15:25:29 +00:00
Alexey Kondratov	be51b10da7	chore(compute): Print some compute_ctl errors in debug mode (#10586 ) ## Problem In some cases, we were returning a very shallow error like `error sending request for url (XXX)`, which made it very hard to figure out the actual error. ## Summary of changes Use `{:?}` in a few places, and remove it from places where we were printing a string anyway.	2025-01-30 14:31:49 +00:00
Arpad Müller	93714c4c7b	secondary downloader: load metadata on loading of timeline (#10539 ) Related to #10308, we might have legitimate changes in file size or generation. Those changes should not cause warn log lines. In order to detect changes of the generation number while the file size stayed the same, load the metadata that we store on disk on loading of the timeline. Still do a comparison with the on-disk layer sizes to find any discrepancies that might occur due to race conditions (new metadata file gets written but layer file has not been updated yet, and PS shuts down). However, as it's possible to hit it in a race condition, downgrade it to a warning. Also fix a mistake in #10529: we want to compare the old with the new metadata, not the old metadata with itself.	2025-01-30 12:03:36 +00:00
John Spray	ab627ad9fd	storcon_cli: fix spurious error setting preferred AZ (#10568 ) ## Problem The client code for `tenant-set-preferred-az` declared response type `()`, so printed a spurious error on each use: ``` Error: receive body: error decoding response body: invalid type: map, expected unit at line 1 column 0 ``` The requests were successful anyway. ## Summary of changes - Declare the proper return type, so that the command succeeds quietly.	2025-01-30 11:54:02 +00:00
Erik Grinaker	6a2afa0c02	pageserver: add per-timeline read amp histogram (#10566 ) ## Problem We don't have per-timeline observability for read amplification. Touches https://github.com/neondatabase/cloud/issues/23283. ## Summary of changes Add a per-timeline `pageserver_layers_per_read` histogram. NB: per-timeline histograms are expensive, but probably worth it in this case.	2025-01-30 11:24:49 +00:00
Alexander Bayandin	8804d58943	Nightly Benchmarks: use pgbench from artifacts (#10370 ) We don't use statically linked OpenSSL anymore (#10302), it's ok to switch to Neon's pgbench for pgvector benchmarks	2025-01-30 11:18:07 +00:00
Erik Grinaker	d3db96c211	pageserver: add `pageserver_deltas_per_read_global` metric (#10570 ) ## Problem We suspect that Postgres checkpoints will limit the number of page deltas necessary to reconstruct a page, but don't know for certain. Touches https://github.com/neondatabase/cloud/issues/23283. ## Summary of changes Add `pageserver_deltas_per_read_global` metric. This pairs with `pageserver_layers_per_read_global` from #10573.	2025-01-30 10:55:07 +00:00
Erik Grinaker	b24727134c	pageserver: improve read amp metric (#10573 ) ## Problem The current global `pageserver_layers_visited_per_vectored_read_global` metric does not appear to accurately measure read amplification. It divides the layer count by the number of reads in a batch, but this means that e.g. 10 reads with 100 L0 layers will only measure a read amp of 10 per read, while the actual read amp was 100. While the cost of layer visits are amortized across the batch, and some layers may not intersect with a given key, each visited layer contributes directly to the observed latency for every read in the batch, which is what we care about. Touches https://github.com/neondatabase/cloud/issues/23283. Extracted from #10566. ## Summary of changes * Count the number of layers visited towards each read in the batch, instead of the average across the batch. * Rename `pageserver_layers_visited_per_vectored_read_global` to `pageserver_layers_per_read_global`. * Reduce the read amp log warning threshold down from 512 to 100.	2025-01-30 09:27:40 +00:00
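The difference between the old and new read amp measurements in #10573 can be worked through with the numbers from the commit message. This is a simplified sketch: the function names are hypothetical, and it assumes every read in the batch visits the same set of layers.

```rust
/// Old behaviour: the layer count is divided by the number of reads in the
/// batch, so the cost appears amortized across the batch.
fn averaged_read_amp(layers_visited: u64, reads_in_batch: u64) -> f64 {
    layers_visited as f64 / reads_in_batch as f64
}

/// New behaviour: each read in the batch is charged the full number of
/// layers visited on its behalf, giving one observation per read.
fn per_read_amp(layers_visited: u64, reads_in_batch: u64) -> Vec<u64> {
    vec![layers_visited; reads_in_batch as usize]
}
```

For 10 reads over 100 L0 layers, `averaged_read_amp(100, 10)` reports 10, while `per_read_amp(100, 10)` records 100 for each of the 10 reads, which matches the latency each read actually observes.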
Alexander Lakhin	a7a706cff7	Fix submodule reference after #10473 (#10577 )	2025-01-30 09:09:43 +00:00
Folke Behrens	86d5798108	Merge pull request #10576 from neondatabase/rc/release-proxy/2025-01-30 Proxy release 2025-01-30	2025-01-30 08:52:09 +01:00
github-actions[bot]	8b4088dd8a	Proxy release 2025-01-30	2025-01-30 06:02:00 +00:00
Alex Chi Z.	77ea9b16fe	fix(pageserver): use the larger one of upper limit and threshold (#10571 ) ## Problem Follow up of https://github.com/neondatabase/neon/pull/10550 in case the upper limit is set larger than threshold. It does not make sense for someone to enforce the behavior like "if there are >= 50 L0s, only compact 10 of them". ## Summary of changes Use the maximum of compaction threshold and upper limit when selecting L0 files to compact. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-01-30 00:05:40 +00:00
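The rule from #10571 is small enough to state as a one-line function. This is a sketch under assumed names (`effective_l0_limit` and its parameters are hypothetical), not the pageserver's implementation.

```rust
/// Number of L0 layers a compaction pass may select: the larger of the
/// compaction threshold and the configured upper limit, so a pass never
/// picks fewer layers than the threshold that triggered it.
fn effective_l0_limit(compaction_threshold: usize, compaction_upper_limit: usize) -> usize {
    compaction_threshold.max(compaction_upper_limit)
}
```

With a threshold of 50 and an upper limit of 10, the effective limit is 50, which avoids the "50 or more L0s, only compact 10 of them" behaviour called out in the commit message.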
Alex Chi Z.	9dff6cc2a4	fix(pageserver): skip repartition if we need L0 compaction (#10547 ) ## Problem Repartition is slow, but it's only used in image layer creation. We can skip it if we have a lot of L0 layers to ingest. ## Summary of changes If L0 compaction is not complete, do not repartition and do not create image layers. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-01-29 21:32:50 +00:00
Erik Grinaker	ff298afb97	pageserver: add `level` for timeline layer metrics (#10563 ) ## Problem We don't have good observability for per-timeline compaction debt, specifically the number of delta layers in the frozen, L0, and L1 levels. Touches https://github.com/neondatabase/cloud/issues/23283. ## Summary of changes * Add a `level` label for `pageserver_layer_{count,size}` with values `l0`, `l1`, and `frozen`. * Track metrics for frozen layers. There is already a `kind={delta,image}` label. `kind=image` is only possible for `level=l1`. We don't include the currently open ephemeral layer, only frozen layers. There is always exactly 1 ephemeral layer, with a dynamic size which is already tracked in `pageserver_timeline_ephemeral_bytes`.	2025-01-29 21:10:56 +00:00
Fedor Dikarev	de1c35fab3	add retries for apt, wget and curl (#10553 ) Ref: https://github.com/neondatabase/cloud/issues/23461 ## Problem > recent CI failure due to apt-get: ``` 4.266 E: Failed to fetch http://deb.debian.org/debian/pool/main/g/gcc-10/libgfortran5_10.2.1-6_arm64.deb Error reading from server - read (104: Connection reset by peer) [IP: 146.75.122.132 80] ``` https://github.com/neondatabase/neon/actions/runs/11144974698/job/30973537767?pr=9186 thinking about if there should be a mirror-selector at the beginning of the dockerfile so that it uses a debian mirror closer to the build server? ## Summary of changes We could consider adding local mirror or proxy and keep it close to our self-hosted runners. For now lets just add retries for `apt`, `wget` and `curl` thanks to @skyzh for reporting that in October 2024, I just finally found time to take a look here :)	2025-01-29 21:02:54 +00:00
Peter Bendel	62819aca36	Add PostgreSQL version 17 benchmarks (#10536 ) ## Problem benchmarking.yml so far is only running benchmarks with PostgreSQL version 16. However neon recently changed the default for new customers to PostgreSQL version 17. See related [epic](https://github.com/neondatabase/cloud/issues/23295) ## Summary of changes We do not want to run every job step with both pg 16 and 17 because this would need excessive resources (runners, computes) and extend the benchmarking run wall clock time too much. So we select an opinionated subset of testcases that we also report in weekly reporting and add a postgres v17 job step. For re-use projects associated Neon projects have been created and connection strings have been added to neon database organization secrets. A follow up is to add the reporting for these new runs to some grafana dashboards.	2025-01-29 20:21:42 +00:00
Tristan Partin	707a926057	Remove unused compute_ctl HTTP routes (#10544 ) These are not used anywhere within the platform, so let's remove dead code. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-01-29 19:22:01 +00:00
Alex Chi Z.	5bcefb4ee1	fix(pageserver): compaction perftest wrt upper limit (#10564 ) ## Problem The config is added in https://github.com/neondatabase/neon/pull/10550 causing behavior change for l0 compaction. close https://github.com/neondatabase/neon/issues/10562 ## Summary of changes Fix the test case to consider the effect of upper_limit. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-01-29 18:43:39 +00:00
Alexey Kondratov	34322b2424	chore(compute): Simplify new compute_ctl metrics and fix flaky test (#10560 ) ## Problem 1. `d04d924` added separate metrics for total requests and failures separately, but it doesn't make much sense. We could just have a unified counter with `http_status`. 2. `test_compute_migrations_retry` had a race, i.e., it was waiting for the last successful migration, not an actual failure. This was revealed after adding an assert on failure metric in `d04d924`. ## Summary of changes 1. Switch to unified counters for `compute_ctl` requests. 2. Add a waiting loop into `test_compute_migrations_retry` to eliminate the race. Part of neondatabase/cloud#17590	2025-01-29 18:09:25 +00:00
Vlad Lazar	fdfbc7b358	pageserver: hold GC while reading from a timeline (#10559 ) ## Problem If we are GC-ing because a new image layer was added while traversing the timeline, then it will remove layers that are required for fulfilling the current get request (read-path cannot "look back" and notice the new image layer). ## Summary of Changes Prevent GC from progressing on the current timeline while it is being visited for a read. Epic: https://github.com/neondatabase/neon/issues/9376	2025-01-29 17:08:25 +00:00
Conrad Ludgate	190c19c034	chore: update rust-postgres on rebase (#10561 ) I tried a full update of our tokio-postgres fork before. We hit some breaking change. This PR only pulls in ~50% of the changes from upstream: https://github.com/neondatabase/rust-postgres/pull/38.	2025-01-29 17:02:07 +00:00
Mikhail Kot	34e560fe37	download exporters from releases rather than using docker images (#10551 ) Use releases for postgres-exporter, pgbouncer-exporter, and sql-exporter	2025-01-29 15:52:00 +00:00
Tristan Partin	7922458b98	Use num_cpus from the workspace in pageserver (#10545 ) Luckily they were the same version, so we didn't spend time compiling two versions, which could have been the case in the future. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-01-29 15:45:36 +00:00
a-masterov	34d9e2d8e3	Add a test for GraphQL (#10156 ) ## Problem We currently don't run the tests shipped with `pg_graphql`. ## Summary of changes The tests for `pg_graphql` are added.	2025-01-29 15:01:56 +00:00
Conrad Ludgate	2f82c21c63	chore: update rust-postgres fork (#10557 ) I updated the fork to fix some lints. Cargo keeps getting confused by it so let's just update the lockfile here	2025-01-29 12:55:24 +00:00
Ivan Efremov	222cc181e9	impr(proxy): Move the CancelMap to Redis hashes (#10364 ) ## Problem The approach of having CancelMap as an in-memory structure increases code complexity, as well as putting additional load for Redis streams. ## Summary of changes - Implement a set of KV ops for Redis client; - Remove cancel notifications code; - Send KV ops over the bounded channel to the handling background task for removing and adding the cancel keys. Closes #9660	2025-01-29 11:19:10 +00:00
alexanderlaw	4d2328ebe3	Fix C code to satisfy sanitizers (#10473 )	2025-01-29 10:05:43 +00:00
a-masterov	9f81828429	Test extension upgrade compatibility (#10244 ) ## Problem We have to test the extensions, shipped with Neon for compatibility before the upgrade. ## Summary of changes Added the test for compatibility with the upgraded extensions.	2025-01-29 09:19:11 +00:00
Arseny Sher	9ab13d6e2c	Log statements in test_layer_map (#10554 ) ## Problem test_layer_map doesn't log statements and it is not clear how long they take. ## Summary of changes Do log them. ref https://github.com/neondatabase/neon/issues/10409	2025-01-29 09:16:00 +00:00
Alex Chi Z.	983e18e63e	feat(pageserver): add compaction_upper_limit config (#10550 ) ## Problem Follow-up of the incident, we should not use the same bound on lower/upper limit of compaction files. This patch adds an upper bound limit, which is set to 50 for now. ## Summary of changes Add `compaction_upper_limit`. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech>	2025-01-28 23:18:32 +00:00
Alex Chi Z.	b735df6ff0	fix(pageserver): make image layer generation atomic (#10516 ) ## Problem close https://github.com/neondatabase/neon/issues/8362 ## Summary of changes Use `BatchLayerWriter` to ensure we clean up image layers after failed compaction. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-01-28 21:29:51 +00:00
Fedor Dikarev	68cf0ba439	run benchmark tests on small-metal runners (#10549 ) ## Problem Ref: https://github.com/neondatabase/cloud/issues/23314 We suspect some inconsistency in Benchmark tests runs could be due to different type of runners they are landed in. To have that aligned in both terms: failure rates and benchmark results, lets run them for now on `small-metal` servers and see the progress for the tests stability. ## Summary of changes	2025-01-28 21:26:38 +00:00
Alexey Kondratov	d04d924649	feat(compute): Add some basic compute_ctl metrics (#10504 ) ## Problem There are several parts of `compute_ctl` with a very low visibility of errors: 1. DB migrations that run async in the background after compute start. 2. Requests made to control plane (currently only `GetSpec`). 3. Requests made to the remote extensions server. ## Summary of changes Add new counters to quickly evaluate the amount of errors among the fleet. Part of neondatabase/cloud#17590	2025-01-28 19:24:07 +00:00
JC Grünhage	f5fdaa6dc6	feat(ci): generate basic release notes with links (#10511 ) ## Problem https://github.com/neondatabase/neon/pull/10448 removed release notes, because if their generation failed, the whole release was failing. People liked them though, and wanted some basic release notes as a fall-back instead of completely removing them. ## Summary of changes Include basic release notes that link to the release PR and to a diff to the previous release.	2025-01-28 19:13:39 +00:00
Vlad Lazar	c54cd9e76a	storcon: signal LSN wait to pageserver during live migration (#10452 ) ## Problem We've seen the ingest connection manager get stuck shortly after a migration. ## Summary of changes A speculative mitigation is to use the same mechanism as get page requests for kicking LSN ingest. The connection manager monitors LSN waits and queries the broker if no updates are received for the timeline. Closes https://github.com/neondatabase/neon/issues/10351	2025-01-28 17:33:07 +00:00
Erik Grinaker	1010b8add4	pageserver: add `l0_flush_wait_upload` setting (#10534 ) ## Problem We need a setting to disable the flush upload wait, to test L0 flush backpressure in staging. ## Summary of changes Add `l0_flush_wait_upload` setting.	2025-01-28 17:21:05 +00:00
Folke Behrens	ae4b2af299	fix(proxy): Use correct identifier for usage metrics upload (#10538 ) ## Problem The request data and usage metrics S3 requests use the same identifier shown in logs, causing confusion about what type of upload failed. ## Summary of changes Use the correct identifier for usage metrics uploads. neondatabase/cloud#23084	2025-01-28 17:08:17 +00:00
Tristan Partin	15fecb8474	Update axum to 0.8.1 (#10332 ) Only a few things that needed updating: - async_trait was removed - Message::Text takes a Utf8Bytes object instead of a String Signed-off-by: Tristan Partin <tristan@neon.tech> Co-authored-by: Conrad Ludgate <connor@neon.tech>	2025-01-28 15:32:59 +00:00
Erik Grinaker	47677ba578	pageserver: disable L0 backpressure by default (#10535 ) ## Problem We'll need further improvements to compaction before enabling L0 flush backpressure by default. See: https://neondb.slack.com/archives/C033RQ5SPDH/p1738066068960519?thread_ts=1737818888.474179&cid=C033RQ5SPDH. Touches #5415. ## Summary of changes Disable `l0_flush_delay_threshold` by default.	2025-01-28 14:51:30 +00:00
Arpad Müller	83b6bfa229	Re-download layer if its local and on-disk metadata diverge (#10529 ) In #10308, we noticed many warnings about the local layer having different sizes on-disk compared to the metadata. However, the layer downloader would never redownload layer files if the sizes or generation numbers change. This is obviously a bug, which we aim to fix with this PR. This change also moves the code deciding what to do about a layer to a dedicated function: before we handled the "routing" via control flow, but now it's become too complicated and it is nicer to have the different verdicts for a layer spelled out in a list/match.	2025-01-28 13:39:53 +00:00
Erik Grinaker	ed942b05f7	Revert "pageserver: revert flush backpressure" (#10402 )" (#10533 ) This reverts commit `9e55d79803`. We'll still need this until we can tune L0 flush backpressure and compaction. I'll add a setting to disable this separately.	2025-01-28 13:33:58 +00:00
Vlad Lazar	62a717a2ca	pageserver: use PS node id for SK appname (#10522 ) ## Problem This one is fairly embarrassing. Safekeeper node id was used in the pageserver application name when connecting to safekeepers. ## Summary of changes Use the right node id. Closes https://github.com/neondatabase/neon/issues/10461	2025-01-28 13:11:51 +00:00
Peter Bendel	c8fbbb9b65	Test ingest_benchmark with different stripe size and also PostgreSQL version 17 (#10510 ) We want to verify if pageserver stripe size has an impact on ingest performance. We want to verify if ingest performance has improved or regressed with postgres version 17. ## Summary of changes - Allow to create new project with different postgres versions - allow to pre-shard new project with different stripe sizes instead of relying on storage manager to shard_split the project once a threshold is exceeded Replaces https://github.com/neondatabase/neon/pull/10509 Test run https://github.com/neondatabase/neon/actions/runs/12986410381	2025-01-27 21:06:05 +00:00
John Spray	d73f4a6470	pageserver: retry wrapper on manifest upload (#10524 ) ## Problem On remote storage errors (e.g. I/O timeout) uploading tenant manifest, all of compaction could fail. This is a problem IRL because we shouldn't abort compaction on a single IO error, and in tests because it generates spurious failures. Related: https://github.com/orgs/neondatabase/projects/51/views/2?sliceBy%5Bvalue%5D=jcsp&pane=issue&itemId=93692919&issue=neondatabase%7Cneon%7C10389 ## Summary of changes - Use `backoff::retry` when uploading tenant manifest	2025-01-27 21:02:25 +00:00
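The idea behind wrapping the manifest upload in a retry (#10524) can be sketched with the standard library alone. This is not Neon's `backoff::retry`, whose signature is not shown here; `retry_with_backoff` and its parameters are hypothetical, and the sketch is synchronous for brevity.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Run `op` until it succeeds or `max_attempts` attempts have failed,
/// doubling the delay after each failure (exponential backoff).
fn retry_with_backoff<T, E>(
    max_attempts: u32,
    base_delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(err) => {
                attempt += 1;
                if attempt >= max_attempts {
                    // Out of attempts: surface the last error to the caller.
                    return Err(err);
                }
                // Delays grow as base, 2*base, 4*base, ...
                sleep(base_delay * 2u32.pow(attempt - 1));
            }
        }
    }
}
```

A single transient I/O timeout then costs one extra attempt rather than failing the whole compaction.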
Heikki Linnakangas	5477d7db93	fast_import: fixes for Postgres v17 (#10414 ) Now that the tests are run on v17, they're also run in debug mode, which is slow. Increase statement_timeout in the test to work around that.	2025-01-27 19:47:49 +00:00
Arpad Müller	eb9832d846	Remove PQ_LIB_DIR env var (#10526 ) We now don't need libpq any more for the build of the storage controller, as we use `diesel-async` since #10280. Therefore, we remove the env var that gave cargo/rustc the location for libpq. Follow-up of #10280	2025-01-27 19:38:18 +00:00
Christian Schwarz	3d36dfe533	fix: noisy `broker subscription failed` error during storage broker deploys (#10521 ) During broker deploys, pageservers log this noisy WARN en masse. I can trivially reproduce the WARN message in neon_local by SIGKILLing broker during e.g. `pgbench -i`. I don't understand why tonic is not detecting the error as `Code::Unavailable`. Until we find time to understand that / fix upstream, this PR adds the error message to the existing list of known error messages that get demoted to INFO level. Refs: - refs https://github.com/neondatabase/neon/issues/9562	2025-01-27 19:19:55 +00:00
John Spray	ebf44210ba	remote_storage: less sensitive timeout logging in ABS listings (#10518 ) ## Problem We were logging a warning after a single request timeout, while listing objects. Closes: https://github.com/neondatabase/neon/issues/10166 ## Summary of changes - These timeouts are a pretty normal part of life, so back it off to only log a warning after two in a row.	2025-01-27 17:44:18 +00:00
John Spray	aabf455dfb	README: clarify that neon_local is a dev/test tool (#10512 ) ## Problem From time to time, folks discover our `control_plane/` folder and make the (reasonable) mistake of thinking it's a tool for running full-sized Neon systems, whereas in reality it is a tool for dev/test. ## Summary of changes - Change control_plane's readme title to "Local Development Control Plane (`neon_local`)" - Change "Running local installation" to "Running a local development environment" in the main readme	2025-01-27 17:24:42 +00:00
John Spray	aec92bfc34	pageserver: decrease utilization MAX_SHARDS (#10489 ) ## Problem The intent of this parameter is to have pageservers consider themselves "full" if they've got lots of shards, even if they have plenty of capacity. It works, but because we typically successfully oversubscribe capacity up to 200%, the MAX_SHARDS limit is effectively doubled, so this 20,000 value ends up meaning 40,000, whereas the original intent was to limit nodes to ~10000 shards. ## Summary of changes - Change MAX_SHARDS to 5000, so that a node with 5000 will get a 100% utilization, which is equivalent in practice to being considered "half full" by the storage controller in capacity terms. This is all a bit subtle and indirect. Originally the limit was baked into the pageserver with the idea that the pageserver knows better what its own resources tolerate than the storage controller does, but in practice it would probably be easier to understand all this if we just did it controller-side. So there's scope to refactor here in future.	2025-01-27 17:03:32 +00:00
Arpad Müller	b0b4b7dd8f	storcon: switch to diesel-async and tokio-postgres (#10280 ) Switches the storcon away from using diesel's synchronous APIs in favour of `diesel-async`. Advantages: * less C dependencies, especially no openssl, which might be behind the bug: https://github.com/neondatabase/cloud/issues/21010 * Better to only have async than mix of async plus `spawn_blocking` We had to turn off usage of the connection pool for migrations, as diesel migrations don't support async APIs. Thus we still use `spawn_blocking` in that one place. But this is explicitly done in one of the `diesel-async` examples.	2025-01-27 14:25:11 +00:00
Mikhail Kot	4dd4096f11	Pgbouncer exporter in compute image (#10503 ) https://github.com/neondatabase/cloud/issues/19081 Include pgbouncer_exporter in compute image and run it at port 9127	2025-01-27 14:09:21 +00:00
Erik Grinaker	be718ed121	pageserver: disable L0 flush stalls, tune delay threshold (#10507 ) ## Problem In ingest benchmarks, we see L0 compaction delays of over 10 minutes due to image compaction. We can't stall L0 flushes for that long. ## Summary of changes Disable L0 flush stalls, and bump the default L0 flush delay threshold from 20 to 30 L0 layers.	2025-01-25 16:51:54 +00:00
Konstantin Knizhnik	9f1408fdf3	Do not assign max(lsn) to maxLastWrittenLsn in SetLastWrittenLSNForBlockv (#10474 ) ## Problem See https://github.com/neondatabase/neon/issues/10281 `SetLastWrittenLSNForBlockv` is assigning max(lsn) to `maxLastWrittenLsn` while it should contain only the max LSN not present in the LwLSN cache. This causes unnecessary waits in PS. ## Summary of changes Restore status-quo for pg17. Related Postgres PR: https://github.com/neondatabase/postgres/pull/563 --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-01-24 14:57:32 +00:00
Conrad Ludgate	7000aaaf75	chore: fix h2 stubgen (#10491 ) ## Problem ## Summary of changes --------- Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2025-01-24 14:55:48 +00:00
Erik Grinaker	ef2a2555b1	pageserver: tighten compaction failure detection (#10502 ) ## Problem If compaction fails, we disable L0 flush stalls to avoid persistent stalls. However, the logic would unset the failure marker on offload failures or shutdown. This can lead to sudden L0 flush stalls if we try and fail to offload a timeline with compaction failures, or if there is some kind of shutdown race. Touches #10405. ## Summary of changes Don't touch the compaction failure marker on offload failures or shutdown.	2025-01-24 13:55:05 +00:00
Konstantin Knizhnik	d8ab6ddb0f	Check if relation has storage in calculate_relation_size (#10477 ) ## Problem The parent of a partitioned table has no storage; its relfilelocator is zero. It can be incorrectly hashed and produce wrong results. See https://github.com/neondatabase/postgres/pull/518 ## Summary of changes This problem is already addressed in pg17. Add the same check for all other PG versions. Postgres PRs: https://github.com/neondatabase/postgres/pull/566 https://github.com/neondatabase/postgres/pull/565 https://github.com/neondatabase/postgres/pull/564 Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-01-24 12:43:52 +00:00
JC Grünhage	dcc437da1d	Make promote-images-prod depend on promote-images-dev (#10494 ) ## Problem After talking about it again with @bayandin, this should replace the changes from https://github.com/neondatabase/neon/pull/10475. While the previous changes worked, they are less visually clear in what happens, and we might end up in a situation where we update `latest`, but don't actually have the tagged image pushed that contains the same changes. The latter would result in potentially hard to debug situations. ## Summary of changes Revert `c283aaaf8d` and make promote-images-prod depend on promote-images-dev instead.	2025-01-24 11:03:39 +00:00
a-masterov	c286fea018	Print logs in extensions test in another step to improve readability (#10483 ) ## Problem The containers' log output is mixed with the tests' output, so you must scroll up to find the error. ## Summary of changes Printing of containers' logs moved to a separate step.	2025-01-24 10:44:48 +00:00
Vlad Lazar	de8276488d	tests: enable wal reader fanout in tests (#10301 ) Note: this has to merge after the release is cut on `2025-01-17` for compat tests to start passing. ## Problem SK wal reader fan-out is not enabled in tests by default. ## Summary of changes Enable it.	2025-01-24 10:34:57 +00:00
Erik Grinaker	ddb9ae1214	pageserver: add compaction backpressure for layer flushes (#10405 ) ## Problem There is no direct backpressure for compaction and L0 read amplification. This allows a large buildup of compaction debt and read amplification. Resolves #5415. Requires #10402. ## Summary of changes Delay layer flushes based on the number of level 0 delta layers: * `l0_flush_delay_threshold`: delay flushes such that they take 2x as long (default `2 * compaction_threshold`). * `l0_flush_stall_threshold`: stall flushes until level 0 delta layers drop below threshold (default `4 * compaction_threshold`). If either threshold is reached, ephemeral layer rolls also synchronously wait for layer flushes to propagate this backpressure up into WAL ingestion. This will bound the number of frozen layers to 1 once backpressure kicks in, since all other frozen layers must flush before the rolled layer. ## Analysis This will significantly change the compute backpressure characteristics. Recall the three compute backpressure knobs: * `max_replication_write_lag`: 500 MB (based on Pageserver `last_received_lsn`). * `max_replication_flush_lag`: 10 GB (based on Pageserver `disk_consistent_lsn`). * `max_replication_apply_lag`: disabled (based on Pageserver `remote_consistent_lsn`). Previously, the Pageserver would keep ingesting WAL and build up ephemeral layers and L0 layers until the compute hit `max_replication_flush_lag` at 10 GB and began backpressuring. Now, once we delay/stall WAL ingestion, the compute will begin backpressuring after `max_replication_write_lag`, i.e. 500 MB. This is probably a good thing (we're not building up a ton of compaction debt), but we should consider tuning these settings. `max_replication_flush_lag` probably doesn't serve a purpose anymore, and we should consider removing it. Furthermore, the removal of the upload barrier in #10402 will mean that we no longer backpressure flushes based on S3 uploads, since `max_replication_apply_lag` is disabled. 
We should consider enabling this as well. ### When and what do we compact? Default compaction settings: * `compaction_threshold`: 10 L0 delta layers. * `compaction_period`: 20 seconds (between each compaction loop check). * `checkpoint_distance`: 256 MB (size of L0 delta layers). * `l0_flush_delay_threshold`: 20 L0 delta layers. * `l0_flush_stall_threshold`: 40 L0 delta layers. Compaction characteristics: * Minimum compaction volume: 10 layers * 256 MB = 2.5 GB. * Additional compaction volume (assuming 128 MB/s WAL): 128 MB/s * 20 seconds = 2.5 GB (10 L0 layers). * Required compaction bandwidth: 5.0 GB / 20 seconds = 256 MB/s. ### When do we hit `max_replication_write_lag`? Depending on how fast compaction and flushes happen, the compute will backpressure somewhere between `l0_flush_delay_threshold` and `l0_flush_stall_threshold` + `max_replication_write_lag`. * Minimum compute backpressure lag: 20 layers * 256 MB + 500 MB = 5.6 GB * Maximum compute backpressure lag: 40 layers * 256 MB + 500 MB = 10.7 GB This seems like a reasonable range to me.	2025-01-24 09:47:28 +00:00
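The threshold arithmetic in the commit message above can be checked with a short sketch. The defaults are the ones quoted in the message; the constant and function names are hypothetical and chosen only for this illustration, and sizes are kept in MB to avoid rounding.

```rust
/// Defaults quoted in the commit message (names are illustrative only).
pub const COMPACTION_THRESHOLD: u64 = 10; // L0 delta layers
pub const CHECKPOINT_DISTANCE_MB: u64 = 256; // size of one L0 delta layer
pub const MAX_REPLICATION_WRITE_LAG_MB: u64 = 500;

/// Flushes are delayed once this many L0 delta layers exist.
pub fn l0_flush_delay_threshold(compaction_threshold: u64) -> u64 {
    2 * compaction_threshold
}

/// Flushes stall once this many L0 delta layers exist.
pub fn l0_flush_stall_threshold(compaction_threshold: u64) -> u64 {
    4 * compaction_threshold
}

/// Approximate lag (in MB) at which the compute starts backpressuring when
/// ingestion is held back at `l0_layers` L0 layers: the data already in L0
/// plus the write lag the compute tolerates.
pub fn backpressure_lag_mb(l0_layers: u64) -> u64 {
    l0_layers * CHECKPOINT_DISTANCE_MB + MAX_REPLICATION_WRITE_LAG_MB
}
```

With these defaults, `backpressure_lag_mb` gives 5620 MB at the delay threshold (20 layers) and 10740 MB at the stall threshold (40 layers).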
Erik Grinaker	9e55d79803	Reapply "pageserver: revert flush backpressure" (#10270 ) (#10402 ) This reapplies #10135. Just removing this flush backpressure without further mitigations caused read amp increases during bulk ingestion (predictably), so it was reverted. We will replace it by compaction-based backpressure. ## Problem In #8550, we made the flush loop wait for uploads after every layer. This was to avoid unbounded buildup of uploads, and to reduce compaction debt. However, the approach has several problems: * It prevents upload parallelism. * It prevents flush and upload pipelining. * It slows down ingestion even when there is no need to backpressure. * It does not directly backpressure based on compaction debt and read amplification. We will instead implement compaction-based backpressure in a PR immediately following this removal (#5415). Touches #5415. Touches #10095. ## Summary of changes Remove waiting on the upload queue in the flush loop.	2025-01-24 08:35:35 +00:00
Alex Chi Z.	8d47a60de2	fix(pageserver): handle dup layers during gc-compaction (#10430 ) ## Problem If gc-compaction decides to rewrite an image layer, it will now cause index_part to lose reference to that layer. In detail, * Assume there's only one image layer of key 0000...AAAA at LSN 0x100 and generation 0xA in the system. * gc-compaction kicks in at gc-horizon 0x100, and then produces 0000...AAAA at LSN 0x100 and generation 0xB. * It submits a compaction result update into the index part that unlinks 0000-AAAA-100-A and adds 0000-AAAA-100-B On the remote storage / local disk side, this is fine -- it unlinks things correctly and uploads the new file. However, the `index_part.json` itself doesn't record generations. The buggy procedure is as follows: 1. upload the new file 2. update the index part to remove the old file and add the new file 3. remove the new file Therefore, the correct update result process for gc-compaction should be as follows: * When modifying the layer map, delete the old one and upload the new one. * When updating the index, upload the new one in the index without deleting the old one. ## Summary of changes * Modify `finish_gc_compaction` to correctly order insertions and deletions. * Update the way gc-compaction uploads the layer files. * Add new tests. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-01-23 21:54:44 +00:00
Alexey Kondratov	6166482589	feat(compute): Automatically create release PRs (#10495 ) We've finally transitioned to using a separate `release-compute` branch. Now, we can finally automatically create release PRs on Fri and release them during the following week. Part of neondatabase/cloud#11698	2025-01-23 20:47:20 +00:00
Arpad Müller	ca6d72ba2a	Increase reconciler timeout after shard split (#10490 ) Sometimes, especially when the host running the tests is overloaded, we can run into reconcile timeouts in `test_timeline_ancestor_detach_idempotent_success`, making the test flaky. By increasing the timeouts from 30 seconds to 120 seconds, we can address the flakiness. Fixes #10464	2025-01-23 16:43:04 +00:00
a-masterov	b6c0f66619	CI(autocomment): add the lfc state (#10121 ) ## Problem Currently, the report does not contain the LFC state of the failed tests. ## Summary of changes Added the LFC state to the link to the allure report. --------- Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2025-01-23 14:52:07 +00:00
Mikhail Kot	3702ec889f	Enable postgres_fdw (#10426 ) Update compute image to include postgres_fdw #3720	2025-01-23 13:22:31 +00:00
Anastasia Lubennikova	8e8df1b453	Disable logical replication subscribers (#10249 ) Drop logical replication subscribers before compute starts on a non-main branch. Add new compute_ctl spec flag: drop_subscriptions_before_start If it is set, drop all the subscriptions from the compute node before it starts. To avoid race on compute start, use new GUC neon.disable_logical_replication_subscribers to temporarily disable logical replication workers until we drop the subscriptions. Ensure that we drop subscriptions exactly once when endpoint starts on a new branch. It is essential, because otherwise, we may drop not only inherited, but newly created subscriptions. We cannot rely only on spec.drop_subscriptions_before_start flag, because if for some reason compute restarts inside VM, it will start again with the same spec and flag value. To handle this, we save the fact of the operation in the database in the neon.drop_subscriptions_done table. If the table does not exist, we assume that the operation was never performed, so we must do it. If table exists, we check if the operation was performed on the current timeline. fixes: https://github.com/neondatabase/neon/issues/8790	2025-01-23 11:02:15 +00:00
Alex Chi Z.	92d95b08cf	fix(pageserver): extend split job key range to the end (#10484 ) ## Problem Not really a bug fix, but hopefully can reproduce https://github.com/neondatabase/neon/issues/10482 more. If the layer map does not contain layers that end at exactly the end range of the compaction job, the current split algorithm will produce the last job that ends at the maximum layer key. This patch extends it all the way to the compaction job end key. For example, the user requests a compaction of 0000...FFFF. However, we only have a layer 0000..3000 in the layer map, and the split job will have a range of 0000..3000 instead of 0000..FFFF. This is not a correctness issue but it would be better to fix it so that we can get consistent job splits. ## Summary of changes Compaction job split will always cover the full specified key range. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-01-23 00:15:46 +00:00
Arpad Müller	0af40b5494	Only churn rows once in test_scrubber_physical_gc_ancestors (#10481 ) ## Problem PR #10457 was supposed to fix the flakiness of `test_scrubber_physical_gc_ancestors`, but instead it made it even more flaky. However, the original error causes disappeared, now to be replaced by key not found errors. See this for a longer explanation: https://github.com/neondatabase/neon/issues/10391#issuecomment-2608018967 ## Solution This does one churn rows after all compactions, and before we do any timeline gc's. That way, we remain more accessible at older lsn's.	2025-01-22 19:45:12 +00:00
Arpad Müller	c60b91369a	Expose safekeeper APIs for creation and deletion (#10478 ) Add APIs for timeline creation and deletion to the safekeeper client crate. Going to be used later in #10440. Split off from #10440. Part of https://github.com/neondatabase/neon/issues/9011	2025-01-22 18:52:16 +00:00
a-masterov	f1473dd438	Fix the connection error for extension tests (#10480 ) ## Problem The trust connection to the compute required for `pg_anon` was removed. However, the PGPASSWORD environment variable was not added to `docker-compose.yml`. This caused connection errors, which were interpreted as success due to errors in the bash script. ## Summary of changes The environment variable was added, and the logic in the bash script was fixed.	2025-01-22 16:34:57 +00:00
JC Grünhage	c283aaaf8d	Tag images from docker-hub in promote-images-prod (#10475 ) ## Problem https://github.com/neondatabase/neon/actions/runs/12896686483/job/35961290336#step:5:107 showed that `promote-images-prod` was missing another dependency. ## Summary of changes Modify `promote-images-prod` to tag based on docker-hub images, so that `promote-images-prod` does not rely on `promote-images-dev`. The result should be the exact same, but allows the two jobs to run in parallel.	2025-01-22 16:09:41 +00:00
Vlad Lazar	414ed82c1f	pageserver: issue concurrent IO on the read path (#9353 ) ## Refs - Epic: https://github.com/neondatabase/neon/issues/9378 Co-authored-by: Vlad Lazar <vlad@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech> ## Problem The read path does its IOs sequentially. This means that if N values need to be read to reconstruct a page, we will do N IOs and getpage latency is `O(N * IoLatency)`. ## Solution With this PR we gain the ability to issue IO concurrently within one layer visit and to move on to the next layer without waiting for IOs from the previous visit to complete. This is an evolved version of the work done at the Lisbon hackathon, cf https://github.com/neondatabase/neon/pull/9002. ## Design ### `will_init` now sourced from disk btree index keys On the algorithmic level, the only change is that the `get_values_reconstruct_data` now sources `will_init` from the disk btree index key (which is PS-page_cache'd), instead of from the `Value`, which is only available after the IO completes. ### Concurrent IOs, Submission & Completion To separate IO submission from waiting for its completion, while simultaneously feature-gating the change, we introduce the notion of an `IoConcurrency` struct through which IO futures are "spawned". An IO is an opaque future, and waiting for completions is handled through `tokio::sync::oneshot` channels. The oneshot Receivers take the place of the `img` and `records` fields inside `VectoredValueReconstructState`. When we're done visiting all the layers and submitting all the IOs along the way we concurrently `collect_pending_ios` for each value, which means for each value there is a future that awaits all the oneshot receivers and then calls into walredo to reconstruct the page image. Walredo is now invoked concurrently for each value instead of sequentially. Walredo itself remains unchanged. 
The spawned IO futures are driven to completion by a sidecar tokio task that is separate from the task that performs all the layer visiting and spawning of IOs. That task receives the IO futures via an unbounded mpsc channel and drives them to completion inside a `FuturesUnordered`. (The behavior from before this PR is available through `IoConcurrency::Sequential`, which awaits the IO futures in place, without "spawning" or "submitting" them anywhere.) #### Alternatives Explored A few words on the rationale behind having a sidecar task and what alternatives were considered. One option is to queue up all IO futures in a FuturesUnordered that is polled the first time when we `collect_pending_ios`. Firstly, the IO futures are opaque, compiler-generated futures that need to be polled at least once to submit their IO. "At least once" because tokio-epoll-uring may not be able to submit the IO to the kernel on first poll right away. Second, there are deadlocks if we don't drive the IO futures to completion independently of the spawning task. The reason is that both the IO futures and the spawning task may hold some _and_ try to acquire _more_ shared limited resources. For example, both spawning task and IO future may try to acquire * a VirtualFile file descriptor cache slot async mutex (observed during impl) * a tokio-epoll-uring submission slot (observed during impl) * a PageCache slot (currently this is not the case but we may move more code into the IO futures in the future) Another option is to spawn a short-lived `tokio::task` for each IO future. We implemented and benchmarked it during development, but found little throughput improvement and moderate mean & tail latency degradation. Concerns about pressure on the tokio scheduler made us discard this variant. The sidecar task could be obsoleted if the IOs were not arbitrary code but a well-defined struct. However, 1. the opaque futures approach taken in this PR allows leaving the existing code unchanged, which 2. 
allows us to implement the `IoConcurrency::Sequential` mode for feature-gating the change. Once the new mode sidecar task implementation is rolled out everywhere, and `::Sequential` removed, we can think about a descriptive submission & completion interface. The problems around deadlocks pointed out earlier will need to be solved then. For example, we could eliminate VirtualFile file descriptor cache and tokio-epoll-uring slots. The latter has been drafted in https://github.com/neondatabase/tokio-epoll-uring/pull/63. See the lengthy doc comment on `spawn_io()` for more details. ### Error handling There are two error classes during reconstruct data retrieval: * traversal errors: index lookup, move to next layer, and the like * value read IO errors A traversal error fails the entire get_vectored request, as before this PR. A value read error only fails that value. In any case, we preserve the existing behavior that once `get_vectored` returns, all IOs are done. Panics and failing to poll `get_vectored` to completion will leave the IOs dangling, which is safe but shouldn't happen, and so, a rate-limited log statement will be emitted at warning level. There is a doc comment on `collect_pending_ios` giving more code-level details and rationale. ### Feature Gating The new behavior is opt-in via pageserver config. The `Sequential` mode is the default. The only significant change in `Sequential` mode compared to before this PR is the buffering of results in the `oneshot`s. ## Code-Level Changes Prep work: * Make `GateGuard` clonable. Core Feature: * Traversal code: track `will_init` in `BlobMeta` and source it from the Delta/Image/InMemory layer index, instead of determining `will_init` after we've read the value. This avoids having to read the value to determine whether traversal can stop. * Introduce `IoConcurrency` & its sidecar task. * `IoConcurrency` is the clonable handle. * It connects to the sidecar task via an `mpsc`. 
* Plumb through `IoConcurrency` from high level code to the individual layer implementations' `get_values_reconstruct_data`. We piggy-back on the `ValuesReconstructState` for this. * The sidecar task should be long-lived, so, `IoConcurrency` needs to be rooted up "high" in the call stack. * Roots as of this PR: * `page_service`: outside of pagestream loop * `create_image_layers`: when it is called * `basebackup`(only auxfiles + replorigin + SLRU segments) * Code with no roots that uses `IoConcurrency::sequential` * any `Timeline::get` call * `collect_keyspace` is a good example * follow-up: https://github.com/neondatabase/neon/issues/10460 * `TimelineAdaptor` code used by the compaction simulator, unused in practice * `ingest_xlog_dbase_create` * Transform Delta/Image/InMemoryLayer to * do their values IO in a distinct `async {}` block * extend the residence of the Delta/Image layer until the IO is done * buffer their results in a `oneshot` channel instead of straight in `ValuesReconstructState` * the `oneshot` channel is wrapped in `OnDiskValueIo` / `OnDiskValueIoWaiter` types that aid in expressiveness and are used to keep track of in-flight IOs so we can print warnings if we leave them dangling. * Change `ValuesReconstructState` to hold the receiving end of the `oneshot` channel aka `OnDiskValueIoWaiter`. * Change `get_vectored_impl` to `collect_pending_ios` and issue walredo concurrently, in a `FuturesUnordered`. Testing / Benchmarking: * Support queue-depth in pagebench for manual benchmarking. * Add test suite support for setting concurrency mode ps config field via a) an env var and b) via NeonEnvBuilder. * Hacky helper to have sidecar-based IoConcurrency in tests. This will be cleaned up later. More benchmarking will happen post-merge in nightly benchmarks, plus in staging/pre-prod. Some intermediate helpers for manual benchmarking have been preserved in https://github.com/neondatabase/neon/pull/10466 and will be landed in later PRs. 
(L0 layer stack generator!) Drive-By: * test suite actually didn't enable batching by default because `config.compatibility_neon_binpath` is always Truthy in our CI environment => https://neondb.slack.com/archives/C059ZC138NR/p1737490501941309 * initial logical size calculation wasn't always polled to completion, which was surfaced through the added WARN logs emitted when dropping a `ValuesReconstructState` that still has inflight IOs. * remove the timing histograms `pageserver_getpage_get_reconstruct_data_seconds` and `pageserver_getpage_reconstruct_seconds` because with planning, value read IO, and walredo happening concurrently, one can no longer attribute latency to any one of them; we'll revisit this when Vlad's work on tracing/sampling through RequestContext lands. * remove code related to `get_cached_lsn()`. The logic around this has been dead at runtime for a long time, ever since the removal of the materialized page cache in #8105. ## Testing Unit tests use the sidecar task by default and run both modes in CI. Python regression tests and benchmarks also use the sidecar task by default. We'll test more in staging and possibly preprod. # Future Work Please refer to the parent epic for the full plan. The next step will be to fold the plumbing of IoConcurrency into RequestContext so that the function signatures get cleaned up. Once `Sequential` isn't used anymore, we can take the next big leap which is replacing the opaque IOs with structs that have well-defined semantics. --------- Co-authored-by: Christian Schwarz <christian@neon.tech>	2025-01-22 15:30:23 +00:00
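The split between IO submission and completion described in the commit above can be illustrated with a small std-only analogue. This is a sketch under loud assumptions, not the PR's code: it uses OS threads and blocking closures instead of tokio futures, `std::sync::mpsc` channels stand in for `tokio::sync::oneshot`, and the single sidecar thread runs jobs one at a time rather than driving them concurrently in a `FuturesUnordered`. The names `IoConcurrency`, `spawn_io`, and `collect_pending_ios` are borrowed from the description, but their signatures and bodies here are invented.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// A unit of IO work, modelled as an opaque blocking closure (assumption).
type Job = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical stand-in for the PR's `IoConcurrency` handle.
pub enum IoConcurrency {
    /// Run each IO in place, on the submitting thread.
    Sequential,
    /// Hand each IO to a long-lived sidecar thread over a channel.
    Sidecar(Sender<Job>),
}

impl IoConcurrency {
    /// Start the sidecar thread; it exits once every sender is dropped.
    pub fn sidecar() -> Self {
        let (tx, rx) = channel::<Job>();
        thread::spawn(move || {
            for job in rx {
                job();
            }
        });
        IoConcurrency::Sidecar(tx)
    }

    /// Submit an IO and return a waiter for its result. In both modes the
    /// result is buffered in a channel, so callers collect it the same way.
    pub fn spawn_io<T, F>(&self, io: F) -> Receiver<T>
    where
        T: Send + 'static,
        F: FnOnce() -> T + Send + 'static,
    {
        let (done_tx, done_rx) = channel::<T>();
        let job = move || {
            // The waiter may have been dropped; ignore the send error then.
            let _ = done_tx.send(io());
        };
        match self {
            IoConcurrency::Sequential => job(),
            IoConcurrency::Sidecar(tx) => {
                tx.send(Box::new(job)).expect("sidecar thread is alive")
            }
        }
        done_rx
    }
}

/// Wait for every submitted IO, returning results in submission order.
pub fn collect_pending_ios<T>(waiters: Vec<Receiver<T>>) -> Vec<T> {
    waiters
        .into_iter()
        .map(|w| w.recv().expect("IO was dropped before completing"))
        .collect()
}
```

A caller would submit all reads first (`let waiters: Vec<_> = keys.iter().map(|k| io.spawn_io(...)).collect();`) and only then call `collect_pending_ios(waiters)`, so that submission is never blocked on an earlier completion. Keeping a `Sequential` variant behind the same interface mirrors the feature-gating idea in the description: callers do not change when the mode does.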
Alexey Kondratov	881e351f69	feat(compute): Allow installing both 0.8.0 and 0.7.4 pgvector (#10345 ) ## Problem Both these versions are binary compatible, but the way pgvector structures the SQL files forbids installing 0.7.4 if you have a 0.8.0 distribution. Yet, some users may need a previous version for backward compatibility, e.g., restoring the dump. See this thread for discussion https://neondb.slack.com/archives/C04DGM6SMTM/p1735911490242919?thread_ts=1731343604.259169&cid=C04DGM6SMTM ## Summary of changes Put `vector--0.7.4.sql` file into compute image to allow installing this version as well. Tested on staging and it seems to be working as expected: ```sql select * from pg_available_extensions where name = 'vector'; name \| default_version \| installed_version \| comment --------+-----------------+-------------------+------------------------------------------------------ vector \| 0.8.0 \| (null) \| vector data type and ivfflat and hnsw access methods create extension vector version '0.7.4'; select * from pg_available_extensions where name = 'vector'; name \| default_version \| installed_version \| comment --------+-----------------+-------------------+------------------------------------------------------ vector \| 0.8.0 \| 0.7.4 \| vector data type and ivfflat and hnsw access methods alter extension vector update; select * from pg_available_extensions where name = 'vector'; name \| default_version \| installed_version \| comment --------+-----------------+-------------------+------------------------------------------------------ vector \| 0.8.0 \| 0.8.0 \| vector data type and ivfflat and hnsw access methods drop extension vector; create extension vector; select * from pg_available_extensions where name = 'vector'; name \| default_version \| installed_version \| comment --------+-----------------+-------------------+------------------------------------------------------ vector \| 0.8.0 \| 0.8.0 \| vector data type and ivfflat and hnsw access methods ``` 
If we find out it's a good approach, we can adopt the same for other extensions with a stable ABI -- support both `current` and `current - 1` releases.	2025-01-22 12:38:23 +00:00
Christian Schwarz	b31ce14083	initial logical size calculation: always poll to completion (#10471 ) # Refs - extracted from https://github.com/neondatabase/neon/pull/9353 # Problem Before this PR, when task_mgr shutdown is signalled, e.g. during pageserver shutdown or Tenant shutdown, initial logical size calculation stops polling and drops the future that represents the calculation. This is against the current policy that we poll all futures to completion. This became apparent during development of concurrent IO which warns if we drop a `Timeline::get_vectored` future that still has in-flight IOs. We may revise the policy in the future, but, right now initial logical size calculation is the only part of the codebase that doesn't adhere to the policy, so let's fix it. ## Code Changes - make it sensitive exclusively to `Timeline::cancel` - This should be sufficient for all cases of shutdowns; the sensitivity to task_mgr shutdown is unnecessary. - this broke the various cancel tests in `test_timeline_size.py`, e.g., `test_timeline_initial_logical_size_calculation_cancellation` - the tests would time out because the await point was not sensitive to cancellation - to fix this, refactor `pausable_failpoint` so that it accepts a cancellation token - side note: we _really_ should write our own failpoint library; maybe after we get heap-allocated RequestContext, we can plumb failpoints through there.	2025-01-22 12:28:26 +00:00
Christian Schwarz	b4d87b9dfe	fix(tests): actually enable pipelining by default in the test suite (#10472 ) ## Problem PR #9993 was supposed to enable `page_service_pipelining` by default for all `NeonEnv`s, but this was ineffective in our CI environment. Thus, CI Python-based tests and benchmarks, unless explicitly configuring pipelining, were still using serial protocol handling. ## Analysis The root cause was that in our CI environment, `config.compatibility_neon_binpath` is always Truthy. It's not in local environments, which is why this slipped through in local testing. Lesson: always add a log line to pageserver startup and spot-check tests to ensure the intended default is picked up. ## Summary of changes Fix it. Since enough time has passed, the compatibility snapshot contains a recent enough software version so we don't need to worry about `compatibility_neon_binpath` anymore. ## Future Work The question how to add a new default except for compatibility tests, which is what the broken code was supposed to do, is still unsolved. Slack discussion: https://neondb.slack.com/archives/C059ZC138NR/p1737490501941309	2025-01-22 10:10:43 +00:00
Conrad Ludgate	2b49d6ee05	feat: adjust the tonic features to remove axum dependency (#10348 ) To help facilitate an upgrade to axum 0.8 (https://github.com/neondatabase/neon/pull/10332#pullrequestreview-2541989619) this massages the tonic dependency features so that tonic does not depend on axum.	2025-01-22 09:15:52 +00:00
Erik Grinaker	14e1f89053	pageserver: eagerly notify flush waiters (#10469 ) ## Problem Currently, the layer flush loop will continue flushing layers as long as any are pending, and only notify waiters once there are no further layers to flush. This can cause waiters to wait longer than necessary, and potentially starve them if pending layers keep arriving faster than they can be flushed. The impact of this will increase when we add compaction backpressure and propagate it up into the WAL receiver. Extracted from #10405. ## Summary of changes Break out of the layer flush loop once we've flushed up to the requested LSN. If further flush requests have arrived in the meanwhile, flushing will resume immediately after.	2025-01-21 22:01:27 +00:00
Erik Grinaker	8a8c656c06	pageserver: add `LayerMap::watch_layer0_deltas()` (#10470 ) ## Problem For compaction backpressure, we need a mechanism to signal when compaction has reduced the L0 delta layer count below the backpressure threshold. Extracted from #10405. ## Summary of changes Add `LayerMap::watch_level0_deltas()` which returns a `tokio::sync::watch::Receiver` signalling the current L0 delta layer count.	2025-01-21 21:18:09 +00:00
Erik Grinaker	a75e11cc00	pageserver: return duration from `StorageTimeMetricsTimer` (#10468 ) ## Problem It's sometimes useful to obtain the elapsed duration from a `StorageTimeMetricsTimer` for purposes beyond just recording it in metrics (e.g. to log it). Extracted from #10405. ## Summary of changes Add `StorageTimeMetricsTimer.elapsed()` and return the duration from `stop_and_record()`.	2025-01-21 20:56:34 +00:00
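A rough sketch of the timer shape described in a75e11cc00, assuming a plain `Vec<Duration>` as a stand-in for the metrics histogram. `MetricsTimer` is a hypothetical simplification, not the actual `StorageTimeMetricsTimer`; only the `elapsed()` and `stop_and_record()` method names come from the commit message.

```rust
use std::time::{Duration, Instant};

/// Hypothetical, simplified timer. The `sink` vector stands in for a
/// metrics histogram; it is an assumption made for this sketch.
pub struct MetricsTimer<'a> {
    start: Instant,
    sink: &'a mut Vec<Duration>,
}

impl<'a> MetricsTimer<'a> {
    /// Starts timing; observations will be pushed into `sink`.
    pub fn start(sink: &'a mut Vec<Duration>) -> Self {
        Self { start: Instant::now(), sink }
    }

    /// Time elapsed so far, without stopping the timer.
    pub fn elapsed(&self) -> Duration {
        self.start.elapsed()
    }

    /// Records the elapsed time and also returns it to the caller,
    /// so it can be logged in addition to being recorded.
    pub fn stop_and_record(self) -> Duration {
        let Self { start, sink } = self;
        let duration = start.elapsed();
        sink.push(duration);
        duration
    }
}
```

Returning the duration from `stop_and_record()` means the caller logs the same value that was recorded, without taking a second clock reading.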
Alex Chi Z.	7d4bfcdc47	feat(pageserver): add config items for gc-compaction auto trigger (#10455 ) ## Problem part of https://github.com/neondatabase/neon/issues/9114 The automatic trigger is already implemented at https://github.com/neondatabase/neon/pull/10221 but I need to write some tests and finish my experiments in staging before I can merge it with confidence. Given that I have some other patches that will modify the config items, I'd like to get the config items merged first to reduce conflicts. ## Summary of changes * add `l2_lsn` to index_part.json -- below that LSN, data have been processed by gc-compaction * add a set of gc-compaction auto trigger control items into the config --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-01-21 19:29:38 +00:00
a-masterov	737888e5c9	Remove the tests for `pg_anon` (#10382 ) ## Problem We are removing the `pg_anon` v1 extension from Neon. So we don't need to test it anymore and can remove the code for simplicity. ## Summary of changes The code required for testing `pg_anon` is removed.	2025-01-21 19:17:14 +00:00
Gleb Novikov	19bf7b78a0	fast import: basic python test (#10271 ) We did not have any tests on fast_import binary yet. In this PR I have introduced: - `FastImport` class and tools for testing in python - basic test that runs fast import against vanilla postgres and checks that data is there Should be merged after https://github.com/neondatabase/neon/pull/10251	2025-01-21 16:50:44 +00:00
Arpad Müller	7e4a39ea53	Fix two flakiness sources in test_scrubber_physical_gc_ancestors (#10457 ) We currently have some flakiness in `test_scrubber_physical_gc_ancestors`, see #10391. The first flakiness kind is about the reconciler not actually becoming idle within the timeout of 30 seconds. We see continuous forward progress so this is likely not a hang. We also see this happen in parallel to a test failure, so this is likely due to runners being overloaded. Therefore, we increase the timeout. The second flakiness kind is an assertion failure. This one is a little bit more tricky, but we saw in the successful run that there was some advance of the lsn between the compaction run (which created layer files) and the gc run. Apparently gc rejects reductions to the single image layer setting if the cutoff lsn is the same as the lsn of the image layer: it will claim that that layer is newer than the space cutoff and therefore skip it, while thinking the old layer (that we want to delete) is the latest one (so it's not deleted). We address the second flakiness kind by inserting a tiny amount of WAL between the compaction and gc. This should hopefully fix things. Related issue: #10391 (not closing it with the merger of the PR as we'll need to validate that these changes had the intended effect). Thanks to Chi for going over this together with me in a call.	2025-01-21 15:40:04 +00:00
JC Grünhage	624a507544	Create Github releases with empty body for now (#10448 ) ## Problem When releasing `release-7574`, the Github Release creation failed with "body is too long" (see https://github.com/neondatabase/neon/actions/runs/12834025431/job/35792346745#step:5:77). There's lots of room for improvement of the release notes, but for now we'll disable them instead. ## Summary of changes - Disable automatic generation of release notes for Github releases - Enable creation of Github releases for proxy/compute	2025-01-21 12:45:21 +00:00
Arpad Müller	2ab9f69825	Simplify pageserver_physical_gc function (#10104 ) This simplifies the code in `pageserver_physical_gc` a little bit after the feedback in #10007 that the code is too complicated. Most importantly, we don't pass around `GcSummary` any more in a complicated fashion, and we save on async stream-combinator-inception in one place in favour of `try_stream!{}`. Follow-up of #10007	2025-01-20 21:57:15 +00:00
Alex Chi Z.	2de2b26c62	feat(pageserver): add reldir migration configs (#10439 ) ## Problem Part of #9516 per RFC at https://github.com/neondatabase/neon/pull/10412 ## Summary of changes Adding the necessary config items and index_part items for the large relation count work. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-01-20 20:44:12 +00:00
Matthias van de Meent	e781cf6dd8	Compute/LFC: Apply limits consistently (#10449 ) Otherwise we might hit ERRORs in otherwise safe situations (such as user queries), which isn't a great user experience. ## Problem https://github.com/neondatabase/neon/pull/10376 ## Summary of changes Instead of accepting internal errors as acceptable, we ensure we don't exceed our allocated usage.	2025-01-20 18:29:21 +00:00
Christian Schwarz	72130d7d6c	fix(page_service / handle): panic when parallel client disconnect & Timeline shutdown (#10445 ) ## Refs - fixes https://github.com/neondatabase/neon/issues/10444 ## Problem We're seeing a panic `handles are only shut down once in their lifetime` in our performance testbed. ## Hypothesis Annotated code in https://github.com/neondatabase/neon/issues/10444#issuecomment-2602286415. ``` T1: drop Cache, executes up to (1) => HandleInner is now in state ShutDown T2: Timeline::shutdown => PerTimelineState::shutdown executes shutdown() again => panics ``` Likely this snuck in the final touches of #10386 where I narrowed down the locking rules. ## Summary of changes Make duplicate shutdowns a no-op.	2025-01-20 17:51:30 +00:00
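The "make duplicate shutdowns a no-op" fix in 72130d7d6c can be sketched as follows. This is an illustration only: `HandleState` and its two-variant `State` enum are hypothetical simplifications, not the actual `HandleInner` type. The point is that a second `shutdown()` call observes the `ShutDown` state and returns instead of panicking.

```rust
use std::sync::Mutex;

/// Hypothetical two-state lifecycle, loosely modelled on the commit text.
#[derive(Debug, PartialEq, Eq)]
enum State {
    Open,
    ShutDown,
}

/// Simplified stand-in for the shared per-handle state.
pub struct HandleState {
    state: Mutex<State>,
}

impl HandleState {
    pub fn new() -> Self {
        Self { state: Mutex::new(State::Open) }
    }

    /// Idempotent shutdown: returns `true` if this call performed the
    /// transition, `false` if the state was already `ShutDown`.
    pub fn shutdown(&self) -> bool {
        let mut state = self.state.lock().unwrap();
        match *state {
            State::Open => {
                *state = State::ShutDown;
                true
            }
            // Previously this path would have panicked; now it is a no-op.
            State::ShutDown => false,
        }
    }

    pub fn is_shut_down(&self) -> bool {
        *self.state.lock().unwrap() == State::ShutDown
    }
}
```

In the T1/T2 interleaving from the hypothesis, whichever thread calls `shutdown()` second simply gets `false` back.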
John Spray	2657b7ec75	rfcs: add sharded ingest RFC (#8754 ) ## Summary Whereas currently we send all WAL to all pageserver shards, and each shard filters out the data that it needs, in this RFC we add a mechanism to filter the WAL on the safekeeper, so that each shard receives only the data it needs. This will place some extra CPU load on the safekeepers, in exchange for reducing the network bandwidth for ingesting WAL back to scaling as O(1) with shard count, rather than O(N_shards). Touches #9329. ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Vlad Lazar <vlalazar.vlad@gmail.com> Co-authored-by: Vlad Lazar <vlad@neon.tech>	2025-01-20 17:33:07 +00:00
Christian Schwarz	02fc58b878	impr(timeline handles): add more tests covering reference cycle (#10446 ) The other tests focus on the external interface usage while the tests added in this PR add some testing around HandleInner's lifecycle, ensuring we don't leak it once either connection gets dropped or per-timeline-state is shut down explicitly.	2025-01-20 14:37:24 +00:00
Arpad Müller	b312a3c320	Move DeleteTimelineFlow::prepare to separate function and use enum (#10334 ) It was requested by review in #10305 to use an enum or something like it for distinguishing the different modes instead of two parameters, because two flags allow four combinations, and two of them don't really make sense/ aren't used. follow-up of #10305	2025-01-20 12:50:44 +00:00
John Spray	7d761a9d22	storage controller: make chaos less disruptive to AZ locality (#10438 ) ## Problem Since #9916 , the chaos code is actively fighting the optimizer: tenants tend to be attached in their preferred AZ, so most chaos migrations were moving them to a non-preferred AZ. ## Summary of changes - When picking migrations, prefer to migrate things _toward_ their preferred AZ when possible. Then pick shards to move the other way when necessary. The resulting behavior should be an alternating "back and forth" where the chaos code migrates things away from home, and then migrates them back on the next iteration. The side effect will be that the chaos code actively helps to push things into their home AZ. That's not contrary to its purpose though: we mainly just want it to continuously migrate things to exercise migration+notification code.	2025-01-20 09:47:23 +00:00
John Spray	8bdaee35f3	pageserver: safety checks on validity of uploaded indices (#10403 ) ## Problem Occasionally, we encounter bugs in test environments that can be detected at the point of uploading an index, but we proceed to upload it anyway and leave a tenant in a broken state that's awkward to handle. ## Summary of changes - Validate index when submitting it for upload, so that we can see the issue quickly e.g. in an API invoking compaction - Validate index before executing the upload, so that we have a hard enforcement that any code path that tries to upload an index will not overwrite a valid index with an invalid one.	2025-01-20 09:20:31 +00:00
Arpad Müller	b0f34099f9	Add safekeeper utilization endpoint (#10429 ) Add an endpoint to obtain the utilization of a safekeeper. Future changes to the storage controller can use this endpoint to find the most suitable safekeepers for newly created timelines, analogously to how it's done for pageservers already. Initially we just want to assign by timeline count, then we can iterate from there. Part of https://github.com/neondatabase/neon/issues/9011	2025-01-17 21:43:52 +00:00
Vlad Lazar	6975228a76	pageserver: add initdb metrics (#10434 ) ## Problem Initdb observability is poor. ## Summary of changes Add some metrics so we can figure out which part, if any, is slow. Closes https://github.com/neondatabase/neon/issues/10423	2025-01-17 14:51:33 +00:00
JC Grünhage	053abff71f	Fix dependency on neon-image in promote-images-dev (#10437 ) ## Problem `871e8b325f` failed CI on main because a job ran too soon. This was caused by `ea84ec357f`. While `promote-images-dev` does not inherently need `neon-image`, a few jobs depending on `promote-images-dev` do need it, and previously had it when it was `promote-images`, which depended on `test-images`, which in turn depended on `neon-image`. ## Summary of changes To ensure jobs depending on `docker.io/neondatabase/neon` images get them, `promote-images-dev` gets the dependency on `neon-image` back, which it previously had transitively through `test-images`.	2025-01-17 14:21:30 +00:00
Tristan Partin	871e8b325f	Use the request ID given by the control plane in compute_ctl (#10418 ) Instead of generating our own request ID, we can just use the one provided by the control plane. In the event, we get a request from a client which doesn't set X-Request-ID, then we just generate one which is useful for tracing purposes. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-01-16 20:46:53 +00:00
Christian Schwarz	c47c5f4ace	fix(page_service pipelining): tenant cannot shut down because gate kept open while flushing responses (#10386 ) # Refs - fixes https://github.com/neondatabase/neon/issues/10309 - fixup of batching design, first introduced in https://github.com/neondatabase/neon/pull/9851 - refinement of https://github.com/neondatabase/neon/pull/8339 # Problem `Tenant::shutdown` was occasionally taking many minutes (sometimes up to 20) in staging and prod if the `page_service_pipelining.mode="concurrent-futures"` is enabled. # Symptoms The issue happens during shard migration between pageservers. There is page_service unavailability and hence effectively downtime for customers in the following case: 1. The source (state `AttachedStale`) gets stuck in `Tenant::shutdown`, waiting for the gate to close. 2. Cplane/Storcon decides to transition the target `AttachedMulti` to `AttachedSingle`. 3. That transition comes with a bump of the generation number, causing the `PUT .../location_config` endpoint to do a full `Tenant::shutdown` / `Tenant::attach` cycle for the target location. 4. That `Tenant::shutdown` on the target gets stuck, waiting for the gate to close. 5. Eventually the gate closes (`close completed`), correlating with a `page_service` connection handler logging that it's exiting because of a network error (`Connection reset by peer` or `Broken pipe`). While in (4): - `Tenant::shutdown` is stuck waiting for all `Timeline::shutdown` calls to complete. So, really, this is a `Timeline::shutdown` bug. - retries from Cplane/Storcon to complete above state transitions, fail with errors related to the tenant mgr slot being in state `TenantSlot::InProgress`, the tenant state being `TenantState::Stopping`, and the timelines being in `TimelineState::Stopping`, and the `Timeline::cancel` being cancelled. - Existing (and/or new?) 
page_service connections log errors `error reading relation or page version: Not found: Timed out waiting 30s for tenant active state. Latest state: None` # Root-Cause After a lengthy investigation ([internal write-up](https://www.notion.so/neondatabase/2025-01-09-batching-deadlock-Slow-Log-Analysis-in-Staging-176f189e00478050bc21c1a072157ca4?pvs=4)) I arrived at the following root cause. The `spsc_fold` channel (`batch_tx`/`batch_rx`) that connects the Batcher and Executor stages of the pipelined mode was storing a `Handle` and thus `GateGuard` of the Timeline that was not shutting down. The design assumption with pipelining was that this would always be a short transient state. However, that was incorrect: the Executor was stuck on writing/flushing an earlier response into the connection to the client, i.e., socket write being slow because of TCP backpressure. The probable scenario of how we end up in that case: 1. Compute backend process sends a continuous stream of getpage prefetch requests into the connection, but never reads the responses (why this happens: see Appendix section). 2. Batch N is processed by Batcher and Executor, up to the point where Executor starts flushing the response. 3. Batch N+1 is processed by Batcher and queued in the `spsc_fold`. 4. Executor is still waiting for batch N flush to finish. 5. Batcher eventually hits the `TimeoutReader` error (10min). From here on it waits on the `spsc_fold.send(Err(QueryError(TimeoutReader_error)))` which will never finish because the batch already inside the `spsc_fold` is not being read by the Executor, because the Executor is still stuck in the flush. (This state is not observable at our default `info` log level) 6. Eventually, Compute backend process is killed (`close()` on the socket) or Compute as a whole gets killed (probably no clean TCP shutdown happening in that case). 7. Eventually, Pageserver TCP stack learns about (6) through RST packets and the Executor's flush() call fails with an error. 8. 
The Executor exits, dropping `cancel_batcher` and its end of the spsc_fold. This wakes Batcher, causing the `spsc_fold.send` to fail. Batcher exits. The pipeline shuts down as intended. We return from `process_query` and log the `Connection reset by peer` or `Broken pipe` error. The following diagram visualizes the wait-for graph at (5) ```mermaid flowchart TD Batcher --spsc_fold.send(TimeoutReader_error)--> Executor Executor --flush batch N responses--> socket.write_end socket.write_end --wait for TCP window to move forward--> Compute ``` # Analysis By holding the GateGuard inside the `spsc_fold` open, the pipelining implementation violated the principle established in (https://github.com/neondatabase/neon/pull/8339). That is, that `Handle`s must only be held across an await point if that await point is sensitive to the `<Handle as Deref<Target=Timeline>>::cancel` token. In this case, we were holding the Handle inside the `spsc_fold` while awaiting the `pgb_writer.flush()` future. One may jump to the conclusion that we should simply peek into the spsc_fold to get that Timeline cancel token and be sensitive to it during flush, then. But that violates another principle of the design from https://github.com/neondatabase/neon/pull/8339. That is, that the page_service connection lifecycle and the Timeline lifecycles must be completely decoupled. It must be possible to shut down one shard without shutting down the page_service connection, because on that single connection we might be serving other shards attached to this pageserver. (The current compute client opens separate connections per shard, but, there are plans to change that.) # Solution This PR adds a `handle::WeakHandle` struct that does _not_ hold the timeline gate open. It must be `upgrade()`d to get a `handle::Handle`. That `handle::Handle` _does_ hold the timeline gate open. The batch queued inside the `spsc_fold` only holds a `WeakHandle`. 
We only upgrade it while calling into the various `handle_` methods, i.e., while interacting with the `Timeline` via `<Handle as Deref<Target=Timeline>>`. All that code has always been required to be (and is!) sensitive to `Timeline::cancel`, and therefore we're guaranteed to bail from it quickly when `Timeline::shutdown` starts. We will drop the `Handle` immediately, before we start `pgb_writer.flush()`ing the responses. Thereby letting go of our hold on the `GateGuard`, allowing the timeline shutdown to complete while the page_service handler remains intact. # Code Changes * Reproducer & Regression Test * Developed and proven to reproduce the issue in https://github.com/neondatabase/neon/pull/10399 * Add a `Test` message to the pagestream protocol (`cfg(feature = "testing")`). * Drive-by minimal improvement to the parsing code, we now have a `PagestreamFeMessageTag`. * Refactor `pageserver/client` to allow sending and receiving `page_service` requests independently. * Add a Rust helper binary to produce situation (4) from above * Rationale: (4) and (5) are the same bug class, we're holding a gate open while `flush()`ing. * Add a Python regression test that uses the helper binary to demonstrate the problem. * Fix * Introduce and use `WeakHandle` as explained earlier. * Replace the `shut_down` atomic with two enum states for `HandleInner`, wrapped in a `Mutex`. * To make `WeakHandle::upgrade()` and `Handle::downgrade()` cache-efficient: * Wrap the `Types::Timeline` in an `Arc` * Wrap the `GateGuard` in an `Arc` * The separate `Arc`s enable uncontended cloning of the timeline reference in `upgrade()` and `downgrade()`. If instead we did `Arc<Timeline>::clone`, different connection handlers would be hitting the same cache line on every upgrade()/downgrade(), causing contention. * Please read the updated module-level comment in `mod handle` for details. 
# Testing & Performance The reproducer test that failed before the changes now passes, and obviously other tests are passing as well. We'll do more testing in staging, where the issue happens every ~4h if chaos migrations are enabled in storcon. Existing perf testing will be sufficient, no perf degradation is expected. It's a few more allocations due to the added Arc's, but, they're low frequency. # Appendix: Why Compute Sometimes Doesn't Read Responses Remember, the whole problem surfaced because flush() was slow because Compute was not reading responses. Why is that? In short, the way the compute works, it only advances the page_service protocol processing when it has an interest in data, i.e., when the pagestore smgr is called to return pages. Thus, if compute issues a bunch of requests as part of prefetch but then it turns out it can service the query without reading those pages, it may very well happen that these messages stay in the TCP until the next smgr read happens, either in that session, or possibly in another session. If there are too many unread responses in the TCP, the pageserver kernel is going to backpressure into userspace, resulting in our stuck flush(). All of this stems from the way vanilla Postgres does prefetching and "async IO": it issues `fadvise()` to make the kernel do the IO in the background, buffering results in the kernel page cache. It then consumes the results through synchronous `read()` system calls, which hopefully will be fast because of the `fadvise()`. If it turns out that some / all of the prefetch results are not needed, Postgres will not be issuing those `read()` system calls. The kernel will eventually react to that by reusing page cache pages that hold completed prefetched data. Uncompleted prefetch requests may or may not be processed -- it's up to the kernel. In Neon, the smgr + Pageserver together take on the role of the kernel in above paragraphs. 
In the current implementation, all prefetches are sent as GetPage requests to Pageserver. The responses are only processed in the places where vanilla Postgres would do the synchronous `read()` system call. If we never get to that, the responses are queued inside the TCP connection, which, once buffers run full, will backpressure into Pageserver's sending code, i.e., the `pgb_writer.flush()` that was the root cause of the problems we're fixing in this PR.	2025-01-16 20:34:02 +00:00
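The `WeakHandle` / `Handle` split from c47c5f4ace can be sketched with std types only. This is an illustration under stated assumptions, not the `mod handle` implementation: `Gate` is reduced to a counter of outstanding guards, the shutdown state is a plain `Mutex<bool>`, and the constructor is hypothetical. Only the names `WeakHandle`, `Handle`, `GateGuard` and `upgrade()` come from the commit message.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

/// Simplified gate: shutdown could only complete once `open_guards` is zero.
#[derive(Default)]
pub struct Gate {
    open_guards: AtomicUsize,
}

/// Holds the gate open for as long as it lives.
pub struct GateGuard {
    gate: Arc<Gate>,
}

impl Gate {
    pub fn enter(self: &Arc<Self>) -> GateGuard {
        self.open_guards.fetch_add(1, Ordering::SeqCst);
        GateGuard { gate: Arc::clone(self) }
    }

    pub fn open_guards(&self) -> usize {
        self.open_guards.load(Ordering::SeqCst)
    }
}

impl Drop for GateGuard {
    fn drop(&mut self) {
        self.gate.open_guards.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Does NOT hold the gate open; safe to park in a queue during a slow flush.
pub struct WeakHandle {
    gate: Arc<Gate>,
    shut_down: Arc<Mutex<bool>>,
}

/// DOES hold the gate open; should be dropped before flushing responses.
pub struct Handle {
    _guard: GateGuard,
}

impl WeakHandle {
    /// Hypothetical constructor for this sketch.
    pub fn new(gate: Arc<Gate>, shut_down: Arc<Mutex<bool>>) -> Self {
        Self { gate, shut_down }
    }

    /// Returns `None` once shutdown has begun; otherwise enters the gate.
    pub fn upgrade(&self) -> Option<Handle> {
        // Hold the lock while entering so shutdown cannot race in between.
        let shut_down = self.shut_down.lock().unwrap();
        if *shut_down {
            return None;
        }
        Some(Handle { _guard: self.gate.enter() })
    }
}
```

A queued batch would carry only the `WeakHandle`; the `Handle` would exist just for the duration of the `handle_` call and be dropped before the flush, so a stalled socket write never keeps the gate open.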
Tristan Partin	b0838a68e5	Enable pgx_ulid on Postgres 17 (#10397 ) The extension now supports Postgres 17. The release also seems to be binary compatible with the previous version. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-01-16 19:49:04 +00:00
John Spray	8f2ebc0684	tests: stabilize test_storage_controller_node_deletion (#10420 ) ## Problem `test_storage_controller_node_deletion` sometimes failed because shards were moving around during timeline creation, and neon_local isn't tolerant of that. The movements were unexpected because the shards had only just been created. This was a regression from #9916 Closes: #10383 ## Summary of changes - Make this test use multiple AZs -- this makes the storage controller's scheduling reliably stable Why this works: in #9916 , I made a simplifying assumption that we would have multiple AZs to get nice stable scheduling -- it's much easier, because each tenant has a well defined primary+secondary location when they have an AZ preference and nodes have different AZs. Everything still works if you don't have multiple AZs, but you just have this quirk that sometimes the optimizer can disagree with initial scheduling, so once in a while a shard moves after being created -- annoying for tests, harmless IRL.	2025-01-16 19:00:16 +00:00
Vlad Lazar	3a285a046b	pageserver: include node id when subscribing to SK (#10432 ) ## Problem All pageservers have the same application name, which makes it hard to distinguish them. ## Summary of changes Include the node id in the application name sent to the safekeeper. This should give us more visibility in logs. There are a few metrics that will increase in cardinality by `pageserver_count`, but that's fine.	2025-01-16 18:51:56 +00:00
John Spray	da13154791	storcon: revise fill logic to prioritize AZ (#10411 ) ## Problem Node fills were limited to moving (total shards / node_count) shards. In systems that aren't perfectly balanced already, that leads us to skip migrating some of the shards that belong on this node, generating work for the optimizer later to gradually move them back. ## Summary of changes - Where a shard has a preferred AZ and is currently attached outside this AZ, then always promote it during fill, irrespective of target fill count	2025-01-16 17:33:46 +00:00
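One possible reading of the fill rule in da13154791, as a sketch. The `Shard` struct and `pick_fill_candidates` are hypothetical and do not mirror the storage controller's types. The second pass, which tops up with shards that have no AZ preference until the target count is reached, is an assumption added for illustration; the commit message only specifies that AZ-preferring shards are always promoted.

```rust
/// Hypothetical, simplified view of a shard for scheduling purposes.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Shard {
    pub id: u32,
    pub preferred_az: Option<String>,
    pub attached_az: String,
}

/// Picks shards to move onto a node located in `node_az`.
pub fn pick_fill_candidates(
    shards: &[Shard],
    node_az: &str,
    target_fill_count: usize,
) -> Vec<u32> {
    let mut picked = Vec::new();

    // Pass 1: shards that prefer this AZ but are attached elsewhere are
    // always promoted, irrespective of the target fill count.
    for shard in shards {
        if shard.preferred_az.as_deref() == Some(node_az) && shard.attached_az != node_az {
            picked.push(shard.id);
        }
    }

    // Pass 2 (assumption): top up with shards that have no AZ preference,
    // but only until the target count is reached.
    for shard in shards {
        if picked.len() >= target_fill_count {
            break;
        }
        if shard.preferred_az.is_none() && shard.attached_az != node_az {
            picked.push(shard.id);
        }
    }

    picked
}
```

With a target of 2 and three shards that belong in the node's AZ, all three are still picked, which is the case the old `total shards / node_count` cap would have truncated.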
John Spray	2e13a3aa7a	storage controller: handle legacy TenantConf in consistency_check (#10422 ) ## Problem We were comparing serialized configs from the database with serialized configs from memory. If fields have been added/removed to TenantConfig, this generates spurious consistency errors. This is fine in test environments, but limits the usefulness of this debug API in the field. Closes: https://github.com/neondatabase/neon/issues/10369 ## Summary of changes - Do a decode/encode cycle on the config before comparing it, so that it will have exactly the expected fields.	2025-01-16 16:56:44 +00:00
Alex Chi Z.	cccc196848	refactor(pageserver): make partitioning an ArcSwap (#10377 ) ## Problem gc-compaction needs the partitioning data to decide the job split. This refactor allows concurrent access/computing the partitioning. ## Summary of changes Make `partitioning` an ArcSwap so that others can access the partitioning while we compute it. Fully eliminate the `repartition is called concurrently` warning when gc-compaction is going on. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-01-16 15:33:37 +00:00
Arpad Müller	e436dcad57	Rename "disabled" safekeeper scheduling policy to "pause" (#10410 ) Rename the safekeeper scheduling policy "disabled" to "pause". A rename was requested in https://github.com/neondatabase/neon/pull/10400#discussion_r1916259124, as the "disabled" policy is meant to be analogous to the "pause" policy for pageservers. Also simplify the `SkSchedulingPolicyArg::from_str` function, relying on the `from_str` implementation of `SkSchedulingPolicy`. Latter is used for the database format as well, so it is quite stable. If we ever want to change the UI, we'll need to duplicate the function again but this is cheap.	2025-01-16 14:30:49 +00:00
John Spray	21d7b6a258	tests: refactor test_tenant_delete_races_timeline_creation (#10425 ) ## Problem Threads spawned in `test_tenant_delete_races_timeline_creation` are not joined before the test ends, and can generate `PytestUnhandledThreadExceptionWarning` in other tests. https://neon-github-public-dev.s3.amazonaws.com/reports/pr-10419/12805365523/index.html#/testresult/53a72568acd04dbd ## Summary of changes - Wrap threads in ThreadPoolExecutor which will join them before the test ends - Remove a spurious deletion call -- the background thread doing deletion ought to succeed.	2025-01-16 14:11:33 +00:00
JC Grünhage	86dbc44db1	CI: Run check-codestyle-rust as part of pre-merge-checks (#10387 ) ## Problem When multiple changes are grouped in a merge group to be merged as part of the merge queue, the changes might individually pass `check-codestyle-rust` but not in their combined form. ## Summary of changes - Move `check-codestyle-rust` into a reusable workflow that is called from its previous location in `build_and_test.yml`, and additionally call it from `pre_merge_checks.yml`. The additional call does not run on ARM, only x86, to ensure the merge queue continues being responsive. - Trigger `pre_merge_checks.yml` on PRs that change any of the workflows running in `pre_merge_checks.yml`, so that we get feedback on those early and not only after trying to merge those changes.	2025-01-16 09:20:24 +00:00
Tristan Partin	58f6af6c9a	Clean up compute_ctl extension server code (#10417 )	2025-01-16 08:35:36 +00:00
Matthias van de Meent	7be971081a	Make sure we request pages with a known-flushed LSN. (#10413 ) This should fix the largest source of flakiness of test_nbtree_pagesplit_cycleid. ## Problem https://github.com/neondatabase/neon/issues/10390 ## Summary of changes By using a guaranteed-flushed LSN, we ensure that PS won't have to wait forever. (If it does wait forever, we know the issue can't be with Compute's WAL)	2025-01-16 08:34:11 +00:00
Ivan Efremov	c91905e643	Merge pull request #10416 from neondatabase/rc/release-proxy/2025-01-16 Proxy release 2025-01-16	2025-01-16 10:04:38 +02:00
Arseny Sher	6fe4c6798f	Add START_WAL_PUSH proto_version and allow_timeline_creation options. (#10406 ) ## Problem As part of https://github.com/neondatabase/neon/issues/8614 we need to pass options to START_WAL_PUSH. ## Summary of changes Add two options. `allow_timeline_creation`, default true, disables implicit timeline creation in the connection from compute. Eventually such creation will be forbidden completely, but as we migrate to configurations we need to support both: current mode and configurations enabled where creation by compute is disabled. `proto_version` specifies compute <-> sk protocol version. We have it currently in the first greeting package also, but I plan to change tag size from u64 to u8, which would make it hard to use. Command is more appropriate place for it anyway.	2025-01-16 08:01:19 +00:00
github-actions[bot]	44b4e355a2	Proxy release 2025-01-16	2025-01-16 06:02:04 +00:00
Matthias van de Meent	2eda484ef6	prefetch: Read more frequently from TCP buffer (#10394 ) This reduces pressure on the OS TCP read buffer by increasing the moments we read data out of the receive buffer, and increasing the number of bytes we can pull from that buffer when we do reads. ## Problem A backend may not always consume its prefetch data quick enough ## Summary of changes We add a new function `prefetch_pump_state` which pulls as many prefetch requests from the OS TCP receive buffer as possible, but without blocking. This thus reduces pressure on OS-level TCP buffers, thus increasing throughput by limiting throttling caused by full TCP buffers.	2025-01-16 02:43:47 +00:00
Mikhail Kot	c7429af8a0	Enable dblink (#10358 ) Update compute image to include dblink #3720	2025-01-15 22:29:18 +00:00
Alex Chi Z.	a753349cb0	feat(pageserver): validate data integrity during gc-compaction (#10131 ) ## Problem part of https://github.com/neondatabase/neon/issues/9114 part of investigation of https://github.com/neondatabase/neon/issues/10049 ## Summary of changes * If `cfg!(test) or cfg!(feature = testing)`, then we will always try generating an image to ensure the history is replayable, but not put the image layer into the final layer results, therefore discovering wrong key history before we hit a read error. * I suspect it's easier to trigger some races if gc-compaction is continuously run on a timeline, so I increased the frequency to twice per 10 churns. * Also, create branches in gc-compaction smoke tests to get more test coverage. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Arpad Müller <arpad@neon.tech>	2025-01-15 22:04:06 +00:00
Gleb Novikov	55a68b28a2	fast import: restore to neondb (not postgres) database (#10251 ) ## Problem `postgres` is a system database at Neon, so we need to do `pg_restore` into `neondb` instead https://github.com/neondatabase/cloud/issues/22100 ## Summary of changes Changed fast_import a little bit: 1. After a successful connection, create `neondb` in the postgres instance 2. Changed restore connstring to use the new db 3. Added optional `source_connection_string`, which allows skipping `s3_prefix` and connecting directly. 4. Added `-i` that stops the process until sigterm ## TODO - [x] test image in cplane e2e - [ ] Change import job image back to latest after this merged (partial revert of https://github.com/neondatabase/cloud/pull/22338)	2025-01-15 20:51:09 +00:00
John Spray	fb0e2acb2f	pageserver: add `page_trace` API for debugging (#10293 ) ## Problem When a pageserver is receiving high rates of requests, we don't have a good way to efficiently discover what the client's access pattern is. Closes: https://github.com/neondatabase/neon/issues/10275 ## Summary of changes - Add `/v1/tenant/x/timeline/y/page_trace?size_limit_bytes=...&time_limit_secs=...` API, which returns a binary buffer. - Add `pagectl page-trace` tool to decode and analyze the output. --------- Co-authored-by: Erik Grinaker <erik@neon.tech>	2025-01-15 19:07:22 +00:00
Arpad Müller	efaec6cdf8	Add endpoint and storcon cli cmd to set sk scheduling policy (#10400 ) Implementing the last missing endpoint of #9981, this adds support to set the scheduling policy of an individual safekeeper, as specified in the RFC. However, unlike in the RFC we call the endpoint `scheduling_policy` not `status` Closes #9981. As for why not use the upsert endpoint for this: we want to have the safekeeper upsert endpoint be used for testing and for deploying new safekeepers, but not for changes of the scheduling policy. We don't want to change any of the other fields when marking a safekeeper as decommissioned for example, so we'd have to first fetch them only to then specify them again. Of course one can also design an endpoint where one can omit any field and it doesn't get modified, but it's still not great for observability to put everything into one big "change something about this safekeeper" endpoint.	2025-01-15 18:15:30 +00:00
Tristan Partin	3d41069dc4	Update pgrx in extension builds to 0.12.9 (#10372 ) Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-01-15 16:26:58 +00:00
Vlad Lazar	dbebede7bf	safekeeper: fan out from single wal reader to multiple shards (#10190 ) ## Problem Safekeepers currently decode and interpret WAL for each shard separately. This is wasteful in terms of CPU memory usage - we've seen this in profiles. ## Summary of changes Fan-out interpreted WAL to multiple shards. The basic idea is that WAL decoding and interpretation happens in a separate tokio task and senders attach to it. Senders only receive batches concerning their shard and only past the Lsn they've last seen. Fan-out is gated behind the `wal_reader_fanout` safekeeper flag (disabled by default for now). When fan-out is enabled, it might be desirable to control the absolute delta between the current position and a new shard's desired position (i.e. how far behind or ahead a shard may be). `max_delta_for_fanout` is a new optional safekeeper flag which dictates whether to create a new WAL reader or attach to the existing one. By default, this behaviour is disabled. Let's consider enabling it if we spot the need for it in the field. ## Testing Tests passed [here](https://github.com/neondatabase/neon/pull/10301) with wal reader fanout enabled as of `34f6a71718`. Related: https://github.com/neondatabase/neon/issues/9337 Epic: https://github.com/neondatabase/neon/issues/9329	2025-01-15 15:33:54 +00:00
Tristan Partin	3e529f124f	Remove leading slashes when downloading remote files (#10396 ) Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-01-15 15:29:52 +00:00
Arseny Sher	05a71c7d6a	safekeeper: add membership configuration switch endpoint (#10241 ) ## Problem https://github.com/neondatabase/neon/issues/9965 ## Summary of changes Add a safekeeper HTTP endpoint to switch membership configuration. Also add it to the Python client for tests, and add a simple test.	2025-01-15 14:16:04 +00:00
Alexander Bayandin	b9464865b6	benchmarks: report successful runs to slack as well (#10393 ) ## Problem Successful `benchmarks` runs don't have enough visibility Ref https://neondb.slack.com/archives/C069Z2199DL/p1736868055094539 ## Summary of changes - Report both successful and failed `benchmarks` to Slack - Update `slackapi/slack-github-action` action	2025-01-15 13:05:05 +00:00
Vlad Lazar	1577430408	safekeeper: decode and interpret for multiple shards in one go (#10201 ) ## Problem Currently, we call `InterpretedWalRecord::from_bytes_filtered` from each shard. To serve multiple shards at the same time, the API needs to allow for enquiring about multiple shards. ## Summary of changes This commit tweaks it in a pretty brute-force way. Naively, we could just generate the shard for a key, but pre and post split shards may be subscribed at the same time, so doing it efficiently is more complex.	2025-01-15 11:10:24 +00:00
Erik Grinaker	05d17a10ae	rfc: add CPU and heap profiling RFC (#10085 ) This document proposes a standard cross-team pattern for CPU and memory profiling across applications and languages, using the [pprof](https://github.com/google/pprof) profile format. It enables both ad hoc profiles via HTTP endpoints, and continuous profiling across the fleet via [Grafana Cloud Profiles](https://grafana.com/docs/grafana-cloud/monitor-applications/profiles/). Continuous profiling incurs an overhead of about 0.1% CPU usage and 3% slower heap allocations. [Rendered](https://github.com/neondatabase/neon/blob/erik/profiling-rfc/docs/rfcs/040-profiling.md) Touches #9534. Touches https://github.com/neondatabase/cloud/issues/14888.	2025-01-15 10:35:38 +00:00
Arseny Sher	2d0ea08524	Add safekeeper membership conf to control file. (#10196 ) ## Problem https://github.com/neondatabase/neon/issues/9965 ## Summary of changes Add the safekeeper membership configuration struct itself and store it in the control file. In passing, also add a creation timestamp to the control file (there were cases where I wanted it in the past). Remove the obsolete unused PersistedPeerInfo struct from the control file (still keep it in control_file_upgrade.rs to have it in old upgrade code). Remove the binary representation of cfile in the roundtrip test. Updating it is annoying, and we still test the actual roundtrip. Also add configuration to the timeline creation http request, currently used only in one python test. In passing, slightly change the meaning of LSNs in the request: normally start_lsn is passed (the same as ancestor_start_lsn in the similar pageserver call), but we allow specifying a higher commit_lsn for manual intervention if needed. Also, when given an LSN, initialize term_history with it.	2025-01-15 09:45:58 +00:00
Arseny Sher	c98cbbeac1	Add migration details to safekeeper membership RFC. (#10272 ) ## Problem https://github.com/neondatabase/neon/pull/8455 wasn't specific enough on migration from current situation to enabling generations. ## Summary of changes Describe the missing parts, including control plane pushing generation to compute, which also defines whether generations are enabled -- non zero value does it.	2025-01-15 09:41:49 +00:00
John Spray	47c1640acc	storage controller: pagination for tenant listing API (#10365 ) ## Problem For large deployments, the `control/v1/tenant` listing API can time out transmitting a monolithic serialized response. ## Summary of changes - Add `limit` and `start_after` parameters to listing API - Update storcon_cli to use these parameters and limit requests to 1000 items at a time	2025-01-14 21:37:32 +00:00
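The `limit` / `start_after` scheme described in the commit above is cursor-style pagination. A minimal sketch of the client-side loop follows, assuming the listing is sorted by ID; `list_page` is a hypothetical in-memory stand-in for the real HTTP call, and neither function name comes from the storage controller or `storcon_cli`.

```rust
/// Hypothetical stand-in for the listing endpoint: returns up to `limit`
/// IDs that sort strictly after `start_after`. `all` must be sorted.
fn list_page(all: &[String], start_after: Option<&str>, limit: usize) -> Vec<String> {
    all.iter()
        .filter(|id| start_after.map_or(true, |s| id.as_str() > s))
        .take(limit)
        .cloned()
        .collect()
}

/// Client-side loop: request pages until a short (or empty) page comes back,
/// passing the last ID of each page as the next `start_after` cursor.
fn list_all(all: &[String], limit: usize) -> Vec<String> {
    assert!(limit > 0, "limit must be positive or the loop cannot terminate");
    let mut out = Vec::new();
    let mut cursor: Option<String> = None;
    loop {
        let page = list_page(all, cursor.as_deref(), limit);
        let done = page.len() < limit;
        if let Some(last) = page.last() {
            cursor = Some(last.clone());
        }
        out.extend(page);
        if done {
            return out;
        }
    }
}
```

Each request then carries a bounded response, which is the property the commit relies on to avoid timeouts on large deployments.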
Erik Grinaker	6debb49b87	pageserver: coalesce index uploads when possible (#10248 ) ## Problem With upload queue reordering in #10218, we can easily get into a situation where multiple index uploads are queued back to back, which can't be parallelized. This will happen e.g. when multiple layer flushes enqueue layer/index/layer/index/... and the layers skip the queue and are uploaded in parallel. These index uploads will incur serial S3 roundtrip latencies, and may block later operations. Touches #10096. ## Summary of changes When multiple back-to-back index uploads are ready to upload, only upload the most recent index and drop the rest.	2025-01-14 21:10:17 +00:00
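The coalescing rule in the commit above (keep only the most recent of several back-to-back index uploads) can be illustrated with a simplified queue model. This is a sketch under assumptions: `UploadOp` and `coalesce_index_uploads` are hypothetical names, and the real upload queue also tracks readiness and conflicts, which are not modelled here.

```rust
/// Hypothetical, simplified upload operation.
#[derive(Debug, Clone, PartialEq)]
enum UploadOp {
    /// Upload of a layer file, identified by name.
    Layer(String),
    /// Upload of the index, identified by an illustrative version number.
    Index(u64),
}

/// Collapse each run of adjacent index uploads to its last element.
/// Layer uploads and the relative order of everything else are preserved.
fn coalesce_index_uploads(queue: Vec<UploadOp>) -> Vec<UploadOp> {
    let mut out: Vec<UploadOp> = Vec::with_capacity(queue.len());
    for op in queue {
        let both_index = matches!(
            (out.last(), &op),
            (Some(UploadOp::Index(_)), UploadOp::Index(_))
        );
        if both_index {
            // The previously queued index is superseded by the newer one.
            out.pop();
        }
        out.push(op);
    }
    out
}
```

Dropping the older indexes saves one serial S3 round trip per dropped upload in this model.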
Erik Grinaker	e58e29e639	pageserver: limit number of upload queue tasks (#10384 ) ## Problem The upload queue can currently schedule an arbitrary number of tasks. This can both spawn an unbounded number of Tokio tasks, and also significantly slow down upload queue scheduling as it's quadratic in number of operations. Touches #10096. ## Summary of changes Limit the number of inprogress tasks to the remote storage upload concurrency. While this concurrency limit is shared across all tenants, there's certainly no point in scheduling more than this -- we could even consider setting the limit lower, but don't for now to avoid artificially constraining tenants.	2025-01-14 18:01:14 +00:00
Heikki Linnakangas	d36112d20f	Simplify compute dockerfile by setting PATH just once (#10357 ) By setting PATH in the 'pg-build' layer, all the extension build layers will inherit. No need to pass PG_CONFIG to all the various make invocations either: once pg_config is in PATH, the Makefiles will pick it up from there.	2025-01-14 17:02:35 +00:00
Erik Grinaker	ffaa52ff5d	pageserver: reorder upload queue when possible (#10218 ) ## Problem The upload queue currently sees significant head-of-line blocking. For example, index uploads act as upload barriers, and for every layer flush we schedule a layer and index upload, which effectively serializes layer uploads. Resolves #10096. ## Summary of changes Allow upload queue operations to bypass the queue if they don't conflict with preceding operations, increasing parallelism. NB: the upload queue currently schedules an explicit barrier after every layer flush as well (see #8550). This must be removed to enable parallelism. This will require a better mechanism for compaction backpressure, see e.g. #8390 or #5415.	2025-01-14 16:31:59 +00:00
John Spray	aa7323a384	storage controller: quality of life improvements for AZ handling (#10379 ) ## Problem Since https://github.com/neondatabase/neon/pull/9916, the preferred AZ of a tenant is much more impactful, and we would like to make it more visible in tooling. ## Summary of changes - Include AZ in node describe API - Include AZ info in node & tenant outputs in CLI - Add metrics for per-node shard counts, labelled by AZ - Add a CLI for setting preferred AZ on a tenant - Extend AZ-setting API+CLI to handle None for clearing preferred AZ	2025-01-14 15:30:43 +00:00
Christian Schwarz	2466a2f977	page_service: throttle individual requests instead of the batched request (#10353 ) ## Problem Before this PR, the pagestream throttle was applied weighted on a per-batch basis. This had several problems: 1. The throttle occurrence counters were only bumped by `1` instead of `batch_size`. 2. The throttle wait time aggregator metric only counted one wait time, irrespective of `batch_size`. That makes sense in some ways of looking at it but not in others. 3. If the last request in the batch runs into the throttle, the other requests in the batch are also throttled, i.e., over-throttling happens (theoretical, didn't measure it in practice). ## Solution It occurred to me that we can simply push the throttling upwards into `pagestream_read_message`. This has the added benefit that in pipeline mode, the `executor` stage will, if it is idle, steal whatever requests already made it into the `spsc_fold` and execute them; before this change, that was not the case - the throttling happened in the `executor` stage instead of the `batcher` stage. ## Code Changes There are four changes in this PR: 1. Lifting up the throttling into the `pagestream_read_message` method. 2. Move the throttling metrics out of the `Throttle` type into `SmgrOpMetrics`. Unlike the other smgr metrics, throttling is per-tenant, hence the Arc. 3. Refactor the `SmgrOpTimer` implementation to account for the new observation states, and simplify its design. 4. Drive-by-fix flush time metrics. It was using the same `now` in the `observe_guard` every time. The `SmgrOpTimer` is now a state machine. Each observation point moves the state machine forward. If a timer object is dropped early some "pair"-like metrics still require an increment or observation. That's done in the Drop implementation, by driving the state machine to completion.	2025-01-14 15:28:01 +00:00
Alex Chi Z.	9bdb14c1c0	fix(pageserver): ensure initial image layers have correct key ranges (#10374 ) ## Problem Discovered during the relation dir refactor work. If we do not create images as in this patch, we would get two sets of image layers: ``` 0000...METADATA_KEYS 0000...REL_KEYS ``` They overlap at the same LSN and would cause data loss for relation keys. This doesn't happen in prod because initial image layer generation is never called, but better to be fixed to avoid future issues with the reldir refactors. ## Summary of changes * Consolidate create_image_layers call into a single one. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-01-14 15:27:48 +00:00
Conrad Ludgate	df4abd8b14	fix: force-refresh azure identity token (#10378 ) ## Problem Because of https://github.com/Azure/azure-sdk-for-rust/issues/1739, our identity token file was not being refreshed. This caused our uploads to start failing when the storage token expired. ## Summary of changes Drop and recreate the remote storage config every time we upload in order to force reload the identity token file.	2025-01-14 12:53:32 +00:00
Konstantin Knizhnik	a039f8381f	Optimize vector get last written LSN (#10360 ) ## Problem See https://github.com/neondatabase/neon/issues/10281 pg17 performs extra lock/unlock operation when fetching LwLSN. ## Summary of changes Perform all lookups under one lock, moving initialization of not found keys to separate loop. Related Postgres PR: https://github.com/neondatabase/postgres/pull/553 --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-01-14 05:54:30 +00:00
Tristan Partin	430b556b34	Update postgres-exporter and sql_exporter in computes (#10349 ) The postgres-exporter was much further out of date, but let's just bump both. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-01-14 00:44:39 +00:00
Konstantin Knizhnik	1783501eaa	Increase max connection for replica to prevent test flakiness (#10306 ) ## Problem See https://github.com/neondatabase/neon/issues/10167 Too small a value of `max_connections` (2) can cause failures of the test_physical_replication_config_mismatch_too_many_known_xids test ## Summary of changes Increase `max_connections` to 5 Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-01-13 20:01:03 +00:00
John Spray	fd1368d31e	storcon: rework scheduler optimisation, prioritize AZ (#9916 ) ## Problem We want to do a more robust job of scheduling tenants into their home AZ: https://github.com/neondatabase/neon/issues/8264. Closes: https://github.com/neondatabase/neon/issues/8969 ## Summary of changes ### Scope This PR combines prioritizing AZ with a larger rework of how we do optimisation. The rationale is that just bumping AZ in the order of Score attributes is a very tiny change: the interesting part is lining up all the optimisation logic to respect this properly, which means rewriting it to use the same scores as the scheduler, rather than the fragile hand-crafted logic that we had before. Separating these changes out is possible, but would involve doing two rounds of test updates instead of one. ### Scheduling optimisation `TenantShard`'s `optimize_attachment` and `optimize_secondary` methods now both use the scheduler to pick a new "favourite" location. Then there is some refined logic for whether + how to migrate to it: - To decide if a new location is sufficiently "better", we generate scores using some projected ScheduleContexts that exclude the shard under consideration, so that we avoid migrating from a node with AffinityScore(2) to a node with AffinityScore(1), only to migrate back later. - Score types get a `for_optimization` method so that when we compare scores, we will only do an optimisation if the scores differ by their highest-ranking attributes, not just because one pageserver is lower in utilization. Eventually we _will_ want a mode that does this, but doing it here would make scheduling logic unstable and harder to test, and to do this correctly one needs to know the size of the tenant that one is migrating. - When we find a new attached location that we would like to move to, we will create a new secondary location there, even if we already had one on some other node. 
This handles the case where we have a home AZ A, and want to migrate the attachment between pageservers in that AZ while retaining a secondary location in some other AZ as well. - A unit test is added for https://github.com/neondatabase/neon/issues/8969, which is implicitly fixed by reworking optimisation to use the same scheduling scores as scheduling.	2025-01-13 19:33:00 +00:00
Alex Chi Z.	e9ed53b14f	feat(pageserver): support inherited sparse keyspace (#10313 ) ## Problem In preparation to https://github.com/neondatabase/neon/issues/9516. We need to store rel size and directory data in the sparse keyspace, but it does not support inheritance yet. ## Summary of changes Add a new type of keyspace "sparse but inherited" into the system. On the read path: we don't remove the key range when we descend into the ancestor. The search will stop when (1) the full key range is covered by image layers (which has already been implemented before), or (2) we reach the end of the ancestor chain. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-01-13 15:43:01 +00:00
Conrad Ludgate	a338aee132	feat(local_proxy): use ed25519 signatures with pg_session_jwt (#10290 ) Generally ed25519 seems to be much preferred for cryptographic strength to P256 nowadays, and it is NIST approved finally. We should use it where we can as it's also faster than p256. This PR makes the re-signed JWTs between local_proxy and pg_session_jwt use ed25519. This does introduce a new dependency on ed25519, but I do recall some Neon Authorise customers asking for support for ed25519, so I am justifying this dependency addition in the context that we can then introduce support for customer ed25519 keys sources: * https://csrc.nist.gov/pubs/fips/186-5/final subsection 7 (EdDSA) * https://datatracker.ietf.org/doc/html/rfc8037#section-3.1	2025-01-13 15:20:46 +00:00
Heikki Linnakangas	96243af651	Stop building unnecessary extension tarballs (#10355 ) We build "custom extensions" from a different repository nowadays.	2025-01-13 15:01:13 +00:00
John Spray	ef8bfacd6b	storage controller: API + CLI for migrating secondary locations (#10284 ) ## Problem Currently, if we want to move a secondary there isn't a neat way to do that: we just have migration API for the attached location, and it is only clean to use that if you've manually created a secondary via pageserver API in the place you're going to move it to. Secondary migration API enables: - Moving the secondary somewhere because we would like to later move the attached location there. - Move the secondary location because we just want to reclaim some disk space from its current location. ## Summary of changes - Add `/migrate_secondary` API - Add `tenant-shard-migrate-secondary` CLI - Add tests for above	2025-01-13 14:52:43 +00:00
Konstantin Knizhnik	ceacc29609	Start with minimal prefetch distance to minimize prefetch overhead for exact or limited index scans (#10359 ) ## Problem See https://neondb.slack.com/archives/C04DGM6SMTM/p1736526089437179 For index-scan queries with a LIMIT clause, multiple backends can concurrently send a large number of duplicated prefetch requests which are not stored in LFC and so do useless work. The current implementation of index prefetch starts with the maximal prefetch distance (10 by default now) when there are no key bounds, so for queries with a LIMIT clause like `select * from T order by pk limit 1` compute can send a lot of useless prefetch requests to the pageserver. ## Summary of changes Always start with the minimal prefetch distance even if there are no key boundaries. Related Postgres PRs: https://github.com/neondatabase/postgres/pull/552 https://github.com/neondatabase/postgres/pull/551 https://github.com/neondatabase/postgres/pull/550 https://github.com/neondatabase/postgres/pull/549 Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-01-13 14:26:11 +00:00
Erik Grinaker	b31ed0acd1	utils: add ?force=true hint for CPU profiler (#10368 ) This makes it less annoying to try to take a CPU profile when a continuous profile is already running.	2025-01-13 14:23:42 +00:00
Alexander Bayandin	b2d0e1a519	Link OpenSSL dynamically (#10302 ) ## Problem Statically linked OpenSSL is buggy in multithreaded environment: - https://github.com/neondatabase/cloud/issues/16155 - https://github.com/neondatabase/neon/issues/8275 ## Summary of changes - Link OpenSSL dynamically (revert OpenSSL part from https://github.com/neondatabase/neon/pull/8074) Before: ``` ldd /usr/local/v17/lib/libpq.so linux-vdso.so.1 (0x0000ffffb5ce4000) libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000ffffb5c10000) libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffffb5650000) /lib/ld-linux-aarch64.so.1 (0x0000ffffb5ca7000) ``` After: ``` ldd /usr/local/v17/lib/libpq.so linux-vdso.so.1 (0x0000ffffbf3e8000) libssl.so.3 => /lib/aarch64-linux-gnu/libssl.so.3 (0x0000ffffbf260000) libcrypto.so.3 => /lib/aarch64-linux-gnu/libcrypto.so.3 (0x0000ffffbec00000) libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000ffffbf1c0000) libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffffbea50000) /lib/ld-linux-aarch64.so.1 (0x0000ffffbf3ab000) ```	2025-01-13 14:13:02 +00:00
John Spray	d1bc36f536	storage controller: fix retries of compute hook notifications while a secondary node is offline (#10352 ) ## Problem We would sometimes fail to retry compute notifications: 1. Try and send, set compute_notify_failure if we can't 2. On next reconcile, reconcile() fails for some other reason (e.g. tried to talk to an offline node), and we fail the `result.is_ok() && must_notify` condition around the re-sending. Closes: https://github.com/neondatabase/cloud/issues/22612 ## Summary of changes - Clarify the meaning of the reconcile result: it should be Ok(()) if configuring attached location worked, even if secondary or detach locations cannot be reached. - Skip trying to talk to secondaries if they're offline - Even if reconcile fails and we can't send the compute notification (we can't send it because we're not sure if it's really attached), make sure we save the `compute_notify_failure` flag so that subsequent reconciler runs will try again - Add a regression test for the above	2025-01-13 13:31:57 +00:00
Erik Grinaker	0b9032065e	utils: allow 60-second CPU profiles (#10367 ) Taking continuous profiles every 20 seconds is likely too expensive (in dollar terms). Let's try 60-second profiles. We can now interrupt running profiles via `?force=true`, so this should be fine.	2025-01-13 13:14:23 +00:00
Heikki Linnakangas	09fe3b025c	Add a websockets tunnel and a test for the proxy's websockets support. (#3823 ) For testing the proxy's websockets support. I wrote this to test https://github.com/neondatabase/neon/issues/3822. Unfortunately, that bug can not be reproduced with this tunnel. The bug only appears when the client pipelines the first query with the authentication messages. The tunnel doesn't do that. --- Update (@conradludgate 2025-01-10): We have since added some websocket tests, but they manually implemented a very simplistic setup of the postgres protocol. Introducing the tunnel would make more complex testing simpler in the future. --------- Co-authored-by: Conrad Ludgate <conradludgate@gmail.com>	2025-01-13 11:35:39 +00:00
John Spray	12053cf832	storage controller: improve consistency_check_api (#10363 ) ## Problem Limitations found while using this to investigate https://github.com/neondatabase/neon/issues/10234: - If we hit a node consistency issue, we drop out and don't check shards for consistency - The messages printed after a shard consistency issue are huge, and grafana appears to drop them. ## Summary of changes - Defer node consistency errors until the end of the function, so that we always proceed to check shards for consistency - Print out smaller log lines that just point out the diffs between expected and persistent state	2025-01-13 11:18:14 +00:00
Conrad Ludgate	de199d71e1	chore: Address lints introduced in rust 1.85.0 beta (#10340 ) With a new beta build of the rust compiler, it's good to check out the new lints. Either to find false positives, or find flaws in our code. Additionally, it helps reduce the effort required to update to 1.85 in 6 weeks.	2025-01-13 10:34:36 +00:00
Erik Grinaker	22a6460010	libs/utils: add `force` parameter for `/profile/cpu` (#10361 ) ## Problem It's only possible to take one CPU profile at a time. With Grafana continuous profiling, a (low-frequency) CPU profile will always be running, making it hard to take an ad hoc CPU profile at the same time. Resolves #10072. ## Summary of changes Add a `force` parameter for `/profile/cpu` which will end and return an already running CPU profile, starting a new one for the current caller.	2025-01-13 10:01:18 +00:00
Erik Grinaker	cd982a82ec	pageserver,safekeeper: increase heap profiling frequency to 2 MB (#10362 ) ## Problem Currently, the heap profiling frequency is every 1 MB allocated. Taking a profile stack trace takes about 1 µs, and allocating 1 MB takes about 15 µs, so the overhead is about 6.7% which is a bit high. This is a fixed cost regardless of whether heap profiles are actually accessed. ## Summary of changes Increase the heap profiling sample frequency from 1 MB to 2 MB, which reduces the overhead to about 3.3%. This seems acceptable, considering performance-sensitive code will avoid allocations as far as possible anyway.	2025-01-13 09:44:59 +00:00
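The overhead figures in the commit above follow from dividing the cost of one sample by the time spent allocating between samples. A small sketch of that arithmetic, using the commit's stated numbers (about 1 µs per stack trace, about 15 µs per MB allocated); the function name is hypothetical and not part of the codebase.

```rust
/// Fraction of allocation time spent taking profile samples.
/// `sample_cost_us`: cost of one stack trace, in microseconds.
/// `alloc_us_per_mb`: time to allocate 1 MB, in microseconds.
/// `interval_mb`: how many MB are allocated between samples.
fn sampling_overhead(sample_cost_us: f64, alloc_us_per_mb: f64, interval_mb: f64) -> f64 {
    sample_cost_us / (alloc_us_per_mb * interval_mb)
}
```

With these inputs, `sampling_overhead(1.0, 15.0, 1.0)` is 1/15 (about 6.7%) and `sampling_overhead(1.0, 15.0, 2.0)` is 1/30 (about 3.3%), matching the figures quoted in the commit message.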
Heikki Linnakangas	8327f68043	Minor cleanup of extension build commands (#10356 ) There used to be some pg version dependencies in these extensions, but now that there isn't, follow the simpler pattern used in other extensions. No change in the produced images.	2025-01-11 17:39:27 +00:00
Heikki Linnakangas	846e8fdce4	Remove obsolete hnsw extension (#8008 ) This has been deprecated and disabled for new installations for a long time. Let's remove it for good.	2025-01-11 14:20:50 +00:00
Heikki Linnakangas	70a3bf37a0	Stop building 'compute-tools' image (#10333 ) It's been unused from time immemorial. --------- Co-authored-by: Matthias van de Meent <matthias@neon.tech>	2025-01-11 13:09:55 +00:00
Arpad Müller	23c0748cdd	Remove active column (#10335 ) We don't need or want the `active` column. Remove it. Vlad pointed out that this is safe. Thanks to the separation of the schemata in earlier PRs, this is easy. follow-up of #10205 Part of https://github.com/neondatabase/neon/issues/9981	2025-01-11 02:52:45 +00:00
Alex Chi Z.	b5d54ba52a	refactor(pageserver): move queue logic to compaction.rs (#10330 ) ## Problem close https://github.com/neondatabase/neon/issues/10031, part of https://github.com/neondatabase/neon/issues/9114 ## Summary of changes Move the compaction job generation to `compaction.rs`, thus making the code more readable and debuggable. We now also return running job through the get compaction job API, versus before we only return scheduled jobs. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-01-10 20:53:00 +00:00
Christian Schwarz	58332cb361	pageserver: remove unused metric `pageserver_layers_visited_per_read_global` (#10141 ) As of commit "pageserver: remove legacy read path" (#8601) we always use vectored get, which has a separate metric.	2025-01-10 20:35:50 +00:00
Christian Schwarz	9b43204893	fix(page_service): Timeline::gate held open while throttling (#10314 ) When we moved throttling up from Timeline::get into page_service, we stopped being sensitive to `Timeline::cancel`, even though we're holding a Handle and thus a guard on the `Timeline::gate` open. This PR rectifies the situation. Refs - Found while investigating #10309 (hung detach because gate kept open), but not expected to be the root cause of that issue because the affected tenants are not being throttled according to their metrics.	2025-01-10 19:21:01 +00:00
Christian Schwarz	cdd34dfc12	impr(utils/spsc_fold): add another test case (#10319 ) Wondered about the case covered here while investigating #10309.	2025-01-10 19:19:48 +00:00
Vlad Lazar	3cd5034eac	storcon: don't assume host:port format in storcon client (#10347 ) ## Problem https://github.com/neondatabase/infra/pull/2725 updated the scrubber to use a non-host port endpoint for storcon. That breaks when unwrapping the port. ## Summary of changes Support both `host:port` and `host` formats for the storcon api.	2025-01-10 18:35:16 +00:00
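Accepting both `host:port` and `host` comes down to treating the port as optional rather than unwrapping it. A minimal sketch, assuming plain host names or IPv4 addresses (bracketed IPv6 literals are not handled); `split_host_port` is a hypothetical helper, not the function used in the storcon client.

```rust
/// Split an endpoint into host and optional port.
/// If there is no `:` or the suffix is not a valid u16, the whole
/// input is treated as the host and the port is `None`.
fn split_host_port(endpoint: &str) -> (&str, Option<u16>) {
    match endpoint.rsplit_once(':') {
        Some((host, port)) => match port.parse::<u16>() {
            Ok(p) => (host, Some(p)),
            Err(_) => (endpoint, None),
        },
        None => (endpoint, None),
    }
}
```

Callers can then fall back to a default port with `unwrap_or` instead of panicking when the port is absent.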
Cheng Chen	425b777840	chore(compute): pg_mooncake v0.1.0 (#10337 ) ## Problem Upgrade pg_mooncake to v0.1.0 ## Summary of changes	2025-01-10 16:38:13 +00:00
John Spray	4398051385	tests: smaller datasets in LFC tests (#10346 ) ## Problem These two tests came up in #9537 as doing multi-gigabyte I/O, and from inspection of the tests it doesn't seem like they need that to fulfil their purpose. ## Summary of changes - In test_local_file_cache_unlink, run fewer background threads with a smaller number of rows. These background threads AFAICT exist to make sure some I/O is going on while we unlink the LFC directory, but 5 threads should be enough for "some". - In test_lfc_resize, tweak the test to validate that the cache size is larger than the final size before resizing it, so that we're sure we're writing enough data to really be doing something. Then decrease the pgbench scale.	2025-01-10 15:53:23 +00:00
Folke Behrens	71bca6f580	poetry: Update packaging for poetry v2 (#10344 ) ## Problem When poetry v2 (released Jan 5) is used it needs `packaging.metadata` module, but we downgrade `packaging` to 23.0. `packaging==23.1` introduced the metadata submodule. ## Summary of changes Update `packaging` to 24.2.	2025-01-10 14:32:26 +00:00
John Spray	105f66c4ce	tests: move test_parallel_copy into performance tree (#10343 ) ## Problem This test writes ~5GB of data. It is not suitable to run in parallel with all the other small tests in test_runner/regress. via #9537 ## Summary of changes - Move test_parallel_copy into the performance directory, so that it does not run in parallel with other tests	2025-01-10 13:57:26 +00:00
John Spray	0d4fce2d35	tests: refine how compat snapshot is generated (#10342 ) ## Problem I noticed in https://github.com/neondatabase/neon/pull/9537 that tests which work with compat snapshots were writing several hundred MB of data, which isn't really necessary. Also, the snapshots are large but don't have the proper variety of storage format features, e.g. they could just have L0 deltas. ## Summary of changes - Use smaller scale factor and runtime to generate less data - Configure a small layer size and use force image layer generation so that our output contains L1 deltas and image layers, and has a decent number of entries in the layer map	2025-01-10 13:57:23 +00:00
Erik Grinaker	2b8ea1e768	utils: add flamegraph for heap profiles (#10223 ) ## Problem Unlike CPU profiles, the `/profile/heap` endpoint can't automatically generate SVG flamegraphs. This requires the user to install and use `pprof` tooling, which is unnecessary and annoying. Resolves #10203. ## Summary of changes Add `format=svg` for the `/profile/heap` route, and generate an SVG flamegraph using the `inferno` crate, similarly to what `pprof-rs` already does for CPU profiles.	2025-01-10 12:14:29 +00:00
Christian Schwarz	db00eb41a1	fix(spsc_fold): potentially missing wakeup when send()ing in state `SenderWaitsForReceiverToConsume` (#10318 ) # Problem Before this PR, there were cases where send() in state SenderWaitsForReceiverToConsume would never be woken up by the receiver, because it never registered with `wake_sender`. Example Scenario 1: we stop polling a send() future A that was waiting for the receiver to consume. We drop A and create a new send() future B. B would return Poll::Pending and never register a waker. Example Scenario 2: a send() future A transitions from HasData to SenderWaitsForReceiverToConsume. This registers the context X with `wake_sender`. But before the Receiver consumes the data, we poll A from a different context Y. The state is still SenderWaitsForReceiverToConsume, but we wouldn't register the new context with `wake_sender`. When the Receiver comes around to consume and `wake_sender.notify()`s, it wakes the old context X instead of Y. # Fix Register the waker in the case where we're polled in state `SenderWaitsForReceiverToConsume`. # Relation to #10309 I found this bug while investigating #10309. There was never proof that this bug here is the root cause for #10309. In the meantime we found a more probable hypothesis for the root cause than what is being fixed here. Regardless, let's walk through my thought process about how it might have been relevant: There (in page_service), Scenario 1 does not apply because we poll the send() future to completion. Scenario 2 (`tokio::join!`) also does not apply with the current `tokio::join!()` impl, because it will just poll each future every time, each with the same context. Although if we ever used something like a FuturesUnordered anywhere, that will be using a different context, so, in that case, the bug might materialize. 
Regarding tokio & spurious poll in general: @conradludgate is not aware of any spurious wakeup cases in current tokio, but within a `tokio::join!()`, any wake meant for one future will poll all the futures, so that can appear as a spurious wake up to the N-1 futures of the `tokio::join!()`.	2025-01-10 11:06:03 +00:00
Conrad Ludgate	735c66dc65	fix(proxy): propagate the existing ComputeUserInfo to connect for cancellation (#10322 ) ## Problem We were incorrectly constructing the ComputeUserInfo, used for cancellation checks, based on the return parameters from postgres. This didn't contain the correct info. ## Summary of changes Propagate down the existing ComputeUserInfo.	2025-01-10 09:36:51 +00:00
Folke Behrens	77660f3d88	proxy: Fix parsing of UnknownTopic with payload (#10339 ) ## Problem When the proxy receives a `Notification` with an unknown topic it's supposed to use the `UnknownTopic` unit variant. Unfortunately, in adjacently tagged enums serde will not simply ignore the configured content if found; instead, it tries to deserialize a map/object. ## Summary of changes * Use a custom deserialize function to ignore variant content. * Add a little unit test covering both cases.	2025-01-10 09:12:31 +00:00
Folke Behrens	b6205af4a5	Update tracing/otel crates (#10311 ) Update the tracing(-x) and opentelemetry(-x) crates. Some breaking changes require updating our code: * Initialization is done via builders now https://github.com/open-telemetry/opentelemetry-rust/blob/main/opentelemetry-otlp/CHANGELOG.md#0270 * Errors from OTel SDK are logged via tracing crate as well. https://github.com/open-telemetry/opentelemetry-rust/blob/main/opentelemetry/CHANGELOG.md#0270	2025-01-10 08:48:03 +00:00
Arpad Müller	6149ac8834	Handle race between auto-offload and unarchival (#10305 ) ## Problem Auto-offloading as requested by the compaction task is racy with unarchival, in that the compaction task might attempt to offload an unarchived timeline. By that point it will already have set the timeline to the `Stopping` state however, which makes it unusable for any purpose. For example: 1. compaction task decides to offload timeline 2. timeline gets unarchived 3. `offload_timeline` gets called by compaction task * sets timeline's state to `Stopping` * realizes that the timeline can't be unarchived, errors out 6. endpoint can't be started as the timeline is `Stopping` and thus 'can't be found'. A future iteration of the compaction task can't "heal" this state either as the timeline will still not be archived, same goes for other automatic stuff. The only way to heal this is a tenant detach+attach, or alternatively a pageserver restart. Furthermore, the compaction task is especially amenable for such races as it first stores `can_offload` into a variable, figures out whether compaction is needed (which takes some time), and only then does it attempt an offload operation: the time difference between "check" and "use" is non-trivially small. To make it even worse, we start the compaction task right after attach of a tenant, and it is a common pattern by pageserver users to attach a tenant to then immediately unarchive a timeline, so that an endpoint can be started. ## Solutions not adopted The simplest solution is to move the `can_offload` check to right before attempting of the offload. But this is not a good solution, as no lock is held between that check and timeline shutdown. So races would still be possible, just become less likely. I explored using the timeline state for this, as in adding an additional enum variant. But `Timeline::set_state` is racy (#10297). 
## Adopted solution We use the lock on the timeline's upload queue as an arbiter: either unarchival gets to it first and sours the state for auto-offloading, or auto-offloading shuts it down, which stops any parallel unarchival in its tracks. The key part is not releasing the upload queue's lock between the check whether the timeline is archived or not, and shutting it down (the actual implementation only sets `shutting_down` but it has the same effect on `initialized_mut()` as a full shutdown). The rest of the patch is stuff that follows from this. We also move the part where we set the state to `Stopping` to after that arbiter has decided the fate of the timeline. For deletions, we do keep it inside `DeleteTimelineFlow::prepare` however, so that it is called with all of the timeline's locks held that the function allocates (timelines lock most importantly). This is only a precautionary measure however, as I didn't want to analyze deletion related code for possible races. ## Future changes It might make sense to move `can_offload` to right before the offload attempt. Maybe some other properties might have changed as well. Although this will not be perfect either as no lock is held. I want to keep it out of this change to emphasize that this move wasn't the main reason we are race free now. Fixes #10220	2025-01-09 20:41:49 +00:00
Tristan Partin	49756a0d01	Implement compute_ctl management API in Axum (#10099 ) This is a refactor to create better abstractions related to our management server. It cleans up the code, and prepares everything for authorized communication to and from the control plane. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-01-09 20:08:26 +00:00
Arpad Müller	99b5a6705f	Update rust to 1.84.0 (#10328 ) We keep the practice of keeping the compiler up to date, pointing to the latest release. This is done by many other projects in the Rust ecosystem as well. [Release notes](https://releases.rs/docs/1.84.0/). Prior update was in #9926.	2025-01-09 18:29:09 +00:00
Alexey Kondratov	f37eeb56ad	fix(compute_ctl): Resolve issues with dropping roles having dangling permissions (#10299 ) ## Problem In Postgres, one cannot drop a role if it has any dependent objects in the DB. In `compute_ctl`, we automatically reassign all dependent objects in every DB to the corresponding DB owner. Yet, it seems that it doesn't help with some implicit permissions. The issue is reproduced by installing a `postgis` extension because it creates some views and tables in the public schema. ## Summary of changes Added a repro test without using a `postgis`: i) create a role via `compute_ctl` (with `neon_superuser` grant); ii) create a test role, a table in schema public, and grant permissions via the role in `neon_superuser`. To fix the issue, I added a new `compute_ctl` code that removes such dangling permissions before dropping the role. It's done in the least invasive way, i.e., only touches the schema public, because i) that's the problem we had with PostGIS; ii) it creates a smaller chance of messing anything up and getting a stuck operation again, just for a different reason. Properly, any API-based catalog operations should fail gracefully and provide an actionable error and status code to the control plane, allowing the latter to unwind the operation and propagate an error message and hint to the user. In this sense, it's aligned with another feature request https://github.com/neondatabase/cloud/issues/21611 Resolve neondatabase/cloud#13582	2025-01-09 16:39:53 +00:00
Arpad Müller	bebc46e713	Add scheduling_policy column to safekeepers table (#10205 ) Add a `scheduling_policy` column to the safekeepers table of the storage controller. Part of #9981	2025-01-09 15:55:02 +00:00
John Spray	ad51622568	remote_storage: enable Azure connection pooling by default (#10324 ) ## Problem Initially we defaulted this to zero to reduce risk. We have now been using pooling in staging for some time without issues, so let's make it the default for anyone using this software without setting the config explicitly. Closes: https://github.com/neondatabase/cloud/issues/20971 ## Summary of changes - Set Azure blob storage connection pool size to 8 by default	2025-01-09 15:34:06 +00:00
John Spray	ac6cca17ac	storcon: don't log a heartbeat error during shutdown (#10325 ) ## Problem Occasionally we see an unexpected error like: ``` ERROR spawn_heartbeat_driver: Failed to update node state 1 after heartbeat round: Shutting down\n') Hint: use scripts/check_allowed_errors.sh to test any new allowed_error you add ``` https://neon-github-public-dev.s3.amazonaws.com/reports/pr-10324/12690404952/index.html#/testresult/63406a0687bf6eca ## Summary of changes - Explicitly handle ApiError::ShuttingDown as a no-op when mutating node status	2025-01-09 15:33:44 +00:00
Alex Chi Z.	640ac4fc9e	fix(pageserver): report timestamp is in the past if the key is missing (#10210 ) ## Problem If for some reason we have already garbage-collected the data under an LSN but the caller uses a past LSN for the find_time_cutoff function, we will report a missing key error and GC will never proceed. Note that a missing key error can also happen if the key is really missing (i.e., during the past offload incidents). ## Summary of changes Make sure GC proceeds by bumping the LSN. When time_cutoff=None, we will not increase the time_cutoff (it will be set to latest_gc_cutoff). If we really need to bump the GC LSN for maintenance purposes, we need a separate API to do that. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-01-09 14:43:20 +00:00
Konstantin Knizhnik	20c40eb733	Add response tag to getpage request in V3 protocol version (#8686 ) ## Problem We have had several serious data corruption incidents caused by mismatched getpage requests: https://neondb.slack.com/archives/C07FJS4QF7V/p1723032720164359 We hope that the problem is fixed now, but it is better to prevent this kind of problem in the future. Part of https://github.com/neondatabase/cloud/issues/16472 ## Summary of changes This PR introduces a new V3 version of the compute<->pageserver protocol, adding a tag to the getpage response. The compute can now check that the response it receives really is for the requested page. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2025-01-09 13:12:04 +00:00
Vlad Lazar	f4739d49e3	pageserver: tweak interpreted ingest record metrics (#10291 ) ## Problem The filtered record metric doesn't make sense for interpreted ingest. ## Summary of changes While of dubious utility in the first place, this patch replaces them with records received and records observed metrics for interpreted ingest: * received records cause the pageserver to do _something_: write a key, value pair to storage, update some metadata or flush pending modifications * observed records are a shard 0 concept and contain only key metadata used in tracking relation sizes (received records include observed records)	2025-01-09 12:31:02 +00:00
Arseny Sher	030ab1c0e8	TLA+ spec for safekeeper membership change (#9966 ) ## Problem We want to define the algorithm for safekeeper membership change. ## Summary of changes Add spec for it, several models and logs of checking them. ref https://github.com/neondatabase/neon/issues/8699	2025-01-09 12:26:17 +00:00
John Spray	5baa4e7f0a	docker: don't set LD_LIBRARY_PATH (#10321 ) ## Problem This was causing storage controller to still use neon-built libpq instead of vanilla libpq. Since https://github.com/neondatabase/neon/pull/10269 we have a vanilla postgres in the system path -- anything that wants a postgres library will use that. ## Summary of changes - Remove LD_LIBRARY_PATH assignment in Dockerfile	2025-01-09 11:47:55 +00:00
Folke Behrens	03666a1f37	Merge pull request #10320 from neondatabase/rc/release-proxy/2025-01-09 Proxy release 2025-01-09	2025-01-09 10:19:07 +01:00
Tristan Partin	5b2751397d	Refactor MigrationRunner::run_migrations() to call a helper (#10232 ) This will make it easier to add per-db migrations, such as that for CVE-2024-4317. Link: https://www.postgresql.org/support/security/CVE-2024-4317/ Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-01-09 07:05:07 +00:00
github-actions[bot]	9c92242ca0	Proxy release 2025-01-09	2025-01-09 06:02:06 +00:00
Ivan Efremov	fcfff72454	impr(proxy): Decouple ip_allowlist from the CancelClosure (#10199 ) This PR removes the direct dependency of the IP allowlist from CancelClosure, allowing for more scalable and flexible IP restrictions and enabling the future use of Redis-based CancelMap storage. Changes: - Introduce a new BackendAuth async trait that retrieves the IP allowlist through existing authentication methods; - Improve cancellation error handling by instrument() async cancel_sesion() rather than dropping it. - Set and store IP allowlist for SCRAM Proxy to consistently perform IP allowance check Relates to #9660	2025-01-08 19:34:53 +00:00
Anastasia Lubennikova	0ad0db6ff8	compute: dropdb DROP SUBSCRIPTION fix (#10066 ) ## Problem Project gets stuck if database with subscriptions was deleted via API / UI. https://github.com/neondatabase/cloud/issues/18646 ## Summary of changes Before dropping the database, drop all the subscriptions in it. Do not drop slot on publisher, because we have no guarantee that the slot still exists or that the publisher is reachable. Add `DropSubscriptionsForDeletedDatabases` phase to run these operations in all databases, we're about to delete. Ignore the error if the database does not exist.	2025-01-08 18:55:04 +00:00
John Spray	68d8acfd05	storage controller: don't hold detached tenants in memory (#10264 ) ## Problem Typical deployments of neon have some tenants that stay in use continuously, and a background churning population of tenants that are created and then fall idle, and are configured to Detached state. Currently, this churn of short lived tenants results in an ever-increasing memory footprint. Closes: https://github.com/neondatabase/neon/issues/9712 ## Summary of changes - At startup, filter to only load shards that don't have Detached policy - In process_result, check if a tenant's shards are all Detached and observed=={}, and if so drop them from memory - In tenant_location_conf and other tenant mutators, load the tenants' shards on-demand if they are not present	2025-01-08 18:12:09 +00:00
Vlad Lazar	dc284247a5	storage_controller: fix node flap detach race (#10298 ) ## Problem The observed state removal may race with the inline updates of the observed state done from `Service::node_activate_reconcile`. This was intended to work as follows: 1. Detaches while the node is unavailable remove the entry from the observed state. 2. `Service::node_activate_reconcile` diffs the locations returned by the pageserver with the observed state and detaches in-line when required. ## Summary of changes This PR removes step (1) and lets background reconciliations deal with the mismatch between the intent and observed state. A follow up will attempt to remove `Service::node_activate_reconcile` altogether. Closes https://github.com/neondatabase/neon/issues/10253	2025-01-08 10:26:53 +00:00
Alex Chi Z.	5c76e2a983	fix(storage-scrubber): ignore errors if index_part is not consistent (#10304 ) ## Problem Consider the pageserver is doing the following sequence of operations: * upload X files * update index_part to add X and remove Y * delete Y files When storage scrubber obtains the initial timeline snapshot before "update index_part" (that is the old version that contains Y but not X), and then obtains the index_part file after it gets updated, it will report all Y files are missing. ## Summary of changes Do not report layer file missing if index_part listed and downloaded are not the same (i.e. different last_modified times) Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-01-07 23:24:17 +00:00
Erik Grinaker	237dae71a1	Revert "pageserver,safekeeper: disable heap profiling (#10268 )" (#10303 ) This reverts commit `b33299dc37`. Heap profiles weren't the culprit after all. Touches #10225.	2025-01-07 22:49:00 +00:00
Fedor Dikarev	43a5e575d6	ci: use reusable workflow for MacOs build (#9889 ) Closes: https://github.com/neondatabase/cloud/issues/17784 ## Problem Currently, we run the whole CI pipeline for any changes. It's slow and expensive. ## Suggestion Starting with MacOs builds: - check what files were changed - rebuild only needed parts - reuse results from previous builds when available - run builds in parallel when possible --------- Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2025-01-07 20:00:56 +00:00
Folke Behrens	0a117fb1f1	proxy: Parse Notification twice only for unknown topic (#10296 ) ## Problem We currently parse Notification twice even in the happy path. ## Summary of changes Use `#[serde(other)]` to catch unknown topics and defer the second parsing.	2025-01-07 15:24:54 +00:00
JC Grünhage	4aa9786c6b	Fix promote-images-prod after splitting it out (#10292 ) ## Problem `promote-images` was split into `promote-images-dev` and `promote-images-prod` in https://github.com/neondatabase/neon/pull/10267. `dev` credentials were loaded in `promote-images-dev` and `prod` credentials were loaded in `promote-images-prod`, but `promote-images-prod` needs `dev` credentials as well to access the `dev` images to replicate them from `dev` to `prod`. ## Summary of changes Load `dev` credentials in `promote-images-prod` as well.	2025-01-07 13:45:18 +00:00
Matthias van de Meent	be38123e62	Fix accounting of dropped prefetched GetPage requests (#10276 ) Apparently, we failed to do this bookkeeping in quite a few places... ## Problem Fixes https://github.com/neondatabase/cloud/issues/22364 ## Summary of changes Add accounting of dropped requests. Note that this includes prefetches dropped due to things like "PS connection dropped unexpectedly" or "prefetch queue is already full", but not (yet?) "dropped due to backend shutdown".	2025-01-07 10:41:52 +00:00
JC Grünhage	ea84ec357f	Split promote-images into promote-images-dev and promote-images-prod (#10267 ) ## Problem `trigger-e2e-tests` waits half an hour before starting to run. Nearly half of that time can be saved by promoting images before tests on them are complete, so the e2e tests can run in parallel. On `main` and `release{,-proxy,-compute}`, `promote-images` updates `latest` and pushes things to prod ecr, so we want to run `promote-images` only after `test-images` is done, but on other branches, there is no harm in promoting images that aren't tested yet. ## Summary of changes To promote images into dev container registries sooner, `promote-images` is split into `promote-images-dev` and `promote-images-prod`. The former pushes to dev container registries, the latter to prod ones. The latter also waits for `test-images`, while the former doesn't. This allows `trigger-e2e-tests` to run sooner.	2025-01-07 10:36:05 +00:00
Matthias van de Meent	30863c0104	libpagestore: timeout = max(0, difference), not min(0, difference) (#10274 ) Using `min(0, ...)` causes us to fail to wait in most situations, so a lack of data would be a hot wait loop, which is bad. ## Problem We noticed high CPU usage in some situations	2025-01-07 09:07:38 +00:00
Alexander Bayandin	02f81b6469	Fix clippy warning on macOS (#10282 ) ## Problem On macOS: ``` error: unused variable: `disable_lfc_resizing` --> compute_tools/src/bin/compute_ctl.rs:431:9 \| 431 \| disable_lfc_resizing, \| ^^^^^^^^^^^^^^^^^^^^ help: try ignoring the field: `disable_lfc_resizing: _` \| = note: `-D unused-variables` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(unused_variables)]` ``` ## Summary of changes - Initialise `disable_lfc_resizing` only on Linux (because it is only used on Linux, in a later block)	2025-01-06 20:28:33 +00:00
Alexander Bayandin	ad7f14d526	test_runner: update packages for Python 3.13 (#10285 ) ## Problem It's impossible to run regression tests with Python 3.13 as some dependencies don't support it (some of them are outdated, and `jsonnet` doesn't support it at all yet) ## Summary of changes - Update dependencies for Python 3.13 - Install `jsonnet` only on Python < 3.13 and skip relevant tests on Python 3.13 Closes #10237	2025-01-06 20:25:31 +00:00
Erik Grinaker	b342a02b1c	Dockerfile: build with `force-frame-pointers=yes` (#10286 ) See https://github.com/neondatabase/neon/pull/10226#issuecomment-2573725182.	2025-01-06 20:17:43 +00:00
Alex Chi Z.	4a6556e269	fix(pageserver): ensure GC computes time cutoff using the same start time (#10193 ) ## Problem close https://github.com/neondatabase/neon/issues/10192 ## Summary of changes * `find_gc_time_cutoff` takes `now` parameter so that all branches compute the cutoff based on the same start time, avoiding races. * gc-compaction uses a single `get_gc_compaction_watermark` function to get the safe LSN to compact. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2025-01-06 19:29:18 +00:00
Erik Grinaker	95f1920231	cargo: build with frame pointers (#10226 ) ## Problem Frame pointers are typically disabled by default (depending on CPU architecture), to improve performance. This frees up a CPU register, and avoids a couple of instructions per function call. However, it makes stack unwinding much more inefficient, since it has to use DWARF debug information instead, and gives worse results with e.g. `perf` and eBPF profiles. The `backtrace` implementation of `libunwind` is also suspected to cause seg faults. The performance benefit of frame pointer omission doesn't appear to matter that much on modern 64-bit CPU architectures (which have plenty of registers and optimized instruction execution), and benchmarks did not show measurable overhead. The Rust standard library and jemalloc already enable frame pointers by default. For more information, see https://www.brendangregg.com/blog/2024-03-17/the-return-of-the-frame-pointers.html. Resolves #10224. Touches #10225. ## Summary of changes Enable frame pointers in all builds, and use frame pointers for pprof-rs stack sampling.	2025-01-06 17:27:08 +00:00
Conrad Ludgate	fda52a0005	feat(proxy): dont trigger error alerts for unknown topics (#10266 ) ## Problem Before the holidays, and just before our code freeze, a change to cplane was made that started publishing the topics from #10197. This triggered our alerts and put us in a sticky situation as it was not an error, and we didn't want to silence the alert for the entire holidays, and we didn't want to release proxy 2 days in a row if it was not essential. We fixed it eventually by rewriting the alert based on logs, but this is not a good solution. ## Summary of changes Introduces an intermediate parsing step to check the topic name first, to allow us to ignore parsing errors for any topics we do not know about.	2025-01-06 13:05:35 +00:00
Busra Kugler	406cca643b	Update neon_fixtures.py - remove logs (#10219 ) We need to remove this line to prevent AWS keys from being exposed in the public S3 buckets	2025-01-06 10:44:23 +00:00
dependabot[bot]	b368e62cfc	build(deps): bump jinja2 from 3.1.4 to 3.1.5 in the pip group (#10236 ) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-01-04 15:40:50 +00:00
John Spray	4b2f56862d	docker: include vanilla debian postgres client (#10269 ) ## Problem We are chasing down segfaults in the storage controller: https://github.com/neondatabase/cloud/issues/21010 This is for use by the storage controller, which links dynamically with `libpq`. We currently use the neon-built libpq, but this may be unsafe for use from multi-threaded programs like the controller, as it uses a statically linked openssl. Precursor to https://github.com/neondatabase/neon/pull/10258 ## Summary of changes - Include `postgresql-15` in container builds. The reason for using version 15 is simply because that is what's available in Debian 12 without adding any extra repositories, and we don't have any special need for the latest version in our libpq usage.	2025-01-03 16:16:04 +00:00
Erik Grinaker	a77e87a48a	pageserver: assert that uploads don't modify indexed layers (#10228 ) ## Problem It's not legal to modify layers that are referenced by the current layer index. Assert this in the upload queue, as preparation for upload queue reordering. Touches #10096. ## Summary of changes Add a debug assertion that the upload queue does not modify layers referenced by the current index. I could be convinced that this should be a plain assertion, but will be conservative for now.	2025-01-03 16:03:19 +00:00
Erik Grinaker	1393cc668b	Revert "pageserver: revert flush backpressure (#8550 ) (#10135 )" (#10270 ) This reverts commit `f3ecd5d76a`. It is [suspected](https://neondb.slack.com/archives/C033RQ5SPDH/p1735907405716759) to have caused significant read amplification in the [ingest benchmark](https://neonprod.grafana.net/d/de3mupf4g68e8e/perf-test3a-ingest-benchmark?orgId=1&from=now-30d&to=now&timezone=utc&var-new_project_endpoint_id=ep-solitary-sun-w22bmut6&var-large_tenant_endpoint_id=ep-holy-bread-w203krzs) (specifically during index creation). We will revisit an intermediate improvement here to unblock [upload parallelism](https://github.com/neondatabase/neon/issues/10096) before properly addressing [compaction backpressure](https://github.com/neondatabase/neon/issues/8390).	2025-01-03 15:38:51 +00:00
Erik Grinaker	b33299dc37	pageserver,safekeeper: disable heap profiling (#10268 ) ## Problem Since enabling continuous profiling in staging, we've seen frequent seg faults. This is suspected to be because jemalloc and pprof-rs take a stack trace at the same time, and the handlers aren't signal safe. jemalloc does this probabilistically on every allocation, regardless of whether someone is taking a heap profile, which means that any CPU profile has a chance to cause a seg fault. Touches #10225. ## Summary of changes For now, just disable heap profiles -- CPU profiles are more important, and we need to be able to take them without risking a crash.	2025-01-03 15:21:31 +00:00
John Spray	e9d30edc7f	pageserver: fix a 500 during timeline creation + shutdown (#10259 ) ## Problem The test_create_churn_during_restart test fails if timeline creation calls return 500 errors (because the API shouldn't do it), and it's sometimes failing, for example: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-10256/12582034135/index.html#/testresult/3ce2e7045465012e ## Summary of changes - Avoid handling UploadQueueShutDownOrStopped case as an Other (i.e. 500)	2025-01-03 13:13:22 +00:00
Arpad Müller	1303cd5d05	Fix defusing race between Tenant::shutdown and offload_timeline (#10150 ) There is a race condition between `Tenant::shutdown`'s `defuse_for_drop` loop and `offload_timeline`, where timeline offloading can insert into a tenant that is in the process of shutting down, in fact so far progressed that the `defuse_for_drop` has already been called. This prevents warn log lines of the form: ``` offloaded timeline <hash> was dropped without having cleaned it up at the ancestor ``` The solution piggybacks on the `offloaded_timelines` lock: both the defuse loop and the offloaded timeline insertion need to acquire the lock, and we know that the defuse loop only runs after the tenant has set its `TenantState` to `Stopping`. So if we hold the `offloaded_timelines` lock, and know that the `TenantState` is not `Stopping`, then we know that the defuse loop has not run yet, and holding the lock ensures that it doesn't start running while we are inserting the offloaded timeline. Fixes #10070	2025-01-03 12:36:01 +00:00
John Spray	c08759f367	storcon: verbose logs in rare case of shards not attached yet (#10262 ) ## Problem When we do a timeline CRUD operation, we check that the shards we need to mutate are currently attached to a pageserver, by reading `generation` and `generation_pageserver` from the database. If any don't appear to be attached, we respond with a 503 and "One or more shards in tenant is not yet attached". This is happening more often than expected, and it's not obvious with current logging what's going on: specifically which shard has a problem, and exactly what we're seeing in these persistent generation columns. (Aside: it's possible that we broke something with the change in #10011 which clears generation_pageserver when we detach a shard, although if so the mechanism isn't trivial: what should happen is that if we stamp on generation_pageserver if a reconciler is running, then it shouldn't matter because we're about to ## Summary of changes - When we are in Attached mode but find that generation_pageserver/generation are unset, output details while looping over shards.	2025-01-03 10:55:15 +00:00
John Spray	ba9722a2fd	tests: add upload wait in test_scrubber_physical_gc_ancestors (#10260 ) ## Problem We see periodic failures in `test_scrubber_physical_gc_ancestors`, where the logs show that the pageserver is creating image layers that should cause child shards to no longer reference their parents' layers, but then the scrubber runs and doesn't find any unreferenced layers. https://neon-github-public-dev.s3.amazonaws.com/reports/pr-10256/12582034135/index.html#/testresult/78ea06dea6ba8dd3 From inspecting the code & test, it seems like this could be as simple as the test failing to wait for uploads before running the scrubber. It had a 2 second delay built in to satisfy the scrubber's time threshold checks, which on a lightly loaded machine would also have been easily enough for uploads to complete, but our test machines are more heavily loaded all the time. ## Summary of changes - Wait for uploads to complete after generating image layers in test_scrubber_physical_gc_ancestors, so that the scrubber should reliably see the post-compaction metadata.	2025-01-03 10:55:07 +00:00
John Spray	2d4f267983	cargo: update diesel, pq-sys (#10256 ) ## Problem Versions of `diesel` and `pq-sys` were somewhat stale. I was checking on libpq->openssl versions while investigating a segfault via https://github.com/neondatabase/cloud/issues/21010. I don't think these rust bindings are likely to be the source of issues, but we might as well freshen them as a precaution. ## Summary of changes - Update diesel to 2.2.6 - Update pq-sys to 0.6.3	2025-01-03 10:20:18 +00:00
Ivan Efremov	7a598b9842	[proxy/docs]imprv: Add local testing section to proxy README (#10230 ) Add commands to run proxy locally with the mocked control plane	2025-01-03 10:04:58 +00:00
Tristan Partin	eefad27538	Inline various migration queries (#10231 ) There was no value in saving them off to temporary variables. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-01-02 22:12:56 +00:00
Em Sharnoff	cd10c719f9	compute: Add spec support for disabling LFC resizing (#10132 ) ref neondatabase/cloud#21731 ## Problem When we manually override the LFC size for particular computes, autoscaling will typically undo that because vm-monitor will resize LFC itself. So, we'd like a way to make vm-monitor not set LFC size — this actually already exists, if we just don't give vm-monitor a postgres connection string. ## Summary of changes Add a new field to the compute spec, `disable_lfc_resizing`. When set to `true`, we pass in `None` for its postgres connection string. That matches the configuration tested in `neondatabase/autoscaling` CI.	2025-01-02 19:45:59 +00:00
Tristan Partin	363ea97f69	Add more substantial tests for compute migrations (#9811 ) The previous tests really didn't do much. This set should be quite a bit more encompassing. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-01-02 18:37:50 +00:00
Conrad Ludgate	56e6ebfe17	chore: building compute_tools and local_proxy together (#10257 ) ## Problem Building local_proxy and compute_tools features the same dependency tree, but as they are currently built in separate clean layers all that progress is wasted. For our arm builds that's an extra 10 minutes. ## Summary of changes Combines the compute_tools and local_proxy build layers.	2025-01-02 16:05:14 +00:00
Raphael 'kena' Poss	1622fd8bda	proxy: recognize but ignore the 3 new redis message types (#10197 ) ## Problem https://neondb.slack.com/archives/C085MBDUSS2/p1734604792755369 ## Summary of changes Recognize and ignore the 3 new broadcast messages: - `/block_public_or_vpc_access_updated` - `/allowed_vpc_endpoints_updated_for_org` - `/allowed_vpc_endpoints_updated_for_projects`	2025-01-02 16:02:48 +00:00
Konstantin Knizhnik	8c7dcd2598	Set heartbeat interval for chaos test (#10222 ) ## Problem See https://neondb.slack.com/archives/C033RQ5SPDH/p1734707873215729 test_timeline_archival_chaos becomes more flaky with increased heartbeat interval Resolves #10250. ## Summary of changes Override heartbeat interval for `test_timeline_archival_chaos.py` --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-01-02 14:14:18 +00:00
Folke Behrens	ee22d4c9ef	proxy: Set TCP_NODELAY for compute connections (#10240 ) neondatabase/cloud#19184	2025-01-02 13:32:24 +00:00
JC Grünhage	26600f2973	Skip running clippy without default features (#10098 ) ## Problem Running clippy with `cargo hack --feature-powerset` in CI isn't particularly fast. This PR follows up on https://github.com/neondatabase/neon/pull/8912 to improve the speed of our clippy runs. Parallelism as suggested in https://github.com/neondatabase/neon/issues/9901 was tested, but didn't show consistent enough improvements to be worth it. It actually increased the amount of work done, as there are fewer cache hits when clippy runs are spread out over multiple target directories. Additionally, parallelism makes it so caching needs to be thought about more actively and copying around target directories to enable parallelism eats the rest of the performance gains from parallel execution. After some discussion, the decision was to instead cut down on the number of jobs that are running further. The easiest way to do this is to not run clippy without default features. The list of default features is empty for all crates, and I haven't found anything using `cfg(feature = "default")` either, so this is likely not going to change anything except speeding the runs up. ## Summary of changes Reduce the number of feature combinations tried by `cargo hack` (as suggested in https://github.com/neondatabase/neon/pull/8912#pullrequestreview-2286482368) by never disabling default features. ## Alternatives - We can split things out into different jobs which reduces the time until everything is finished by running more things in parallel. This does, however, decrease the number of cache hits and increase the amount of time spent on overhead tasks like repo cloning and restoring caches by doing those multiple times instead of once. - We could replace `cargo hack [...] clippy` with `cargo clippy [...]; cargo clippy --features testing`. I'm not 100% sure how this compares to the change here in the PR, but it does seem to run a bit faster. That likely means it's doing less work, but without understanding what exactly we lose by that I'd rather not do that for now. I'd appreciate input on this though.	2025-01-02 11:33:42 +00:00
Konstantin Knizhnik	b3cd883f93	Unlock LFC mutex when LFC cache is disabled (#10235 ) ## Problem See https://github.com/neondatabase/neon/issues/10233 `lfc_containsv` returns while still holding the lock when LFC is disabled. This bug was introduced in commit `78938d1b59` ## Summary of changes Release the lock before returning. Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-01-02 11:28:15 +00:00
Conrad Ludgate	38c7a2abfc	chore(proxy): pre-load native tls certificates and propagate compute client config (#10182 ) Now that we construct the TLS client config for cancellation as well as connect, it feels appropriate to construct the same config once and re-use it elsewhere. It might also help should #7500 require any extra setup, so we can easily add it to all the appropriate call sites.	2025-01-02 09:36:13 +00:00
Conrad Ludgate	f94248a594	chore(libs/proxy): refactor tokio-postgres connection control flow (#10247 ) In #10207 it was clear there was some confusion with the current connection logic. To analyse the flow to make sure there was no poll stalling, I ended up with the following refactor. Notable changes: 1. Now all functions called `poll_xyz` and that have a `cx: &mut Context` argument must return a `Poll<_>` type, and can only return `Pending` iff an internal poll call also returned `Pending` 2. State management is handled entirely by `poll_messages`. There are now only 2 states which makes it much easier to keep track of. Each commit should be self-reviewable and should be simple to verify that it keeps the same behaviour	2025-01-02 09:35:28 +00:00
Alex Chi Z.	9c53b41245	fix(pageserver): update remote latest_gc_cutoff after gc-compaction (#10209 ) ## Problem close https://github.com/neondatabase/neon/issues/10208 part of #9114 ## Summary of changes * Ensure remote `latest_gc_cutoff` is up-to-date before removing any files for gc-compaction. Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-12-19 18:40:20 +00:00
Konstantin Knizhnik	197a89ab3d	Increase default storage controller heartbeat interval from 100msec … (#10206 ) ## Problem Currently the default value of the storage controller heartbeat interval is 100msec. This means that it establishes a connection to PS 10 times per second, which seems to be quite expensive. On macOS right now storage_controller consumes 70% CPU and trusts - 30%. So together they completely utilize one core. A lot of us have Macs. Let's save the environment a little bit and not waste electricity or contribute to global warming. By the way, on prod we have an interval of 10 seconds ## Summary of changes Increase heartbeat interval from 100msec to 1 second. Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-12-19 18:32:32 +00:00
Alex Chi Z.	b89e02f3e8	fix(pageserver): consider partial compaction layer map in layer check (#10044 ) ## Problem In https://github.com/neondatabase/neon/pull/9897 we temporarily disabled the layer valid check because the current one only considers the end result of all compaction algorithms, but partial gc-compaction would temporarily produce an "invalid" layer map. part of https://github.com/neondatabase/neon/issues/9114 ## Summary of changes Allow LSN splits to overlap in the slow path check. Currently, the valid check is only used in storage scrubber (background job) and during gc-compaction (without taking layer lock). Therefore, it's fine for such checks to be a little bit inefficient but more accurate. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2024-12-19 18:04:53 +00:00
Konstantin Knizhnik	04517c6ff3	Do not reload config file on PS reconnect (#10204 ) ## Problem See https://github.com/neondatabase/neon/issues/10184 and https://neondb.slack.com/archives/C04DGM6SMTM/p1733997259898819 Reloading the config file inside a parallel worker causes its termination ## Summary of changes Remove the call to `HandleMainLoopInterrupts()`. The page server URL update is propagated by the postmaster through shared memory, so we should not reload the config for it. Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-12-19 15:22:39 +00:00
Vlad Lazar	628451d68e	safekeeper: short-circuit interpreted wal sender (#10202 ) ## Problem Safekeeper may currently send a batch to the pageserver even if it hasn't decoded a new record. I think this is quite unlikely in the field, but worth addressing. ## Summary of changes Don't send anything if we haven't decoded a full record. Once this merges and releases, the `InterpretedWalRecords` struct can be updated to remove the Option wrapper for `next_record_lsn`.	2024-12-19 14:04:46 +00:00
Vlad Lazar	502d512fe2	safekeeper: lift benchmarking utils into safekeeper crate (#10200 ) ## Problem The benchmarking utilities are also useful for testing. We want to write tests in the safekeeper crate. ## Summary of changes This commit lifts the utils to the safekeeper crate. They are compiled if the benchmarking feature is enabled or if in test mode.	2024-12-19 14:04:42 +00:00
John Spray	afda6d4700	storage_scrubber: don't report half-created timelines as corruption (#10198 ) ## Problem test_timeline_archival_chaos does timeline creation with failure injection, and thereby sometimes leaves timelines in a part created state. This was being reported as corruption by the scrubber on test teardown, because it considered a layer without an index to be an invalid state. This was incorrect: the scrubber should accept this state, it occurs legitimately during timeline creation. Closes: https://github.com/neondatabase/neon/issues/9988 ## Summary of changes - Report a timeline with layers but no index as Relic rather than MissingIndexPart. - We retain the MissingIndexPart variant for the case where an index _was_ found in the listing, but was not found by a subsequent GET, i.e. racing with deletion.	2024-12-19 12:55:05 +00:00
John Spray	65042cbadd	tests: use high IO concurrency in `test_pgdata_import_smoke`, use `effective_io_concurrency=2` in tests by default (#10114 ) ## Problem `test_pgdata_import_smoke` writes two gigabytes of pages and then reads them back serially. This is CPU bottlenecked and results in a long runtime, and sensitivity to CPU load from other tests on the same machine. Closes: https://github.com/neondatabase/neon/issues/10071 ## Summary of changes - Use effective_io_concurrency=32 when doing sequential scans through 2GiB of pages in test_pgdata_import_smoke. This is a ~10x runtime decrease in the parts of the test that do sequential scans. - Also set `effective_io_concurrency=2` for tests, as I noticed while debugging that we were doing all getpage requests serially, which is bad for checking the stability of the batching code.	2024-12-19 10:58:49 +00:00
Folke Behrens	b135194090	proxy: Delay SASL complete message until auth is done (#10189 ) The final SASL complete message can be bundled with the remainder of the auth flow messages until ReadyForQuery. neondatabase/cloud#19184	2024-12-19 10:37:08 +00:00
Peter Bendel	43dc03459d	Run pgbench on 10 GB scale factor on database with n relations (e.g. 10k) (#10172 ) ## Problem We want to verify how much / if pgbench throughput and latency on Neon suffer if the database contains many other relations, too. ## Summary of changes Modify the benchmarking.yml pgbench-compare job to - create an additional project at scale factor 10 GiB - before running pgbench add n tables (initially 10k) to the database - then compare the pgbench throughput and latency to the existing pgbench-compare at 10 GiB scale factor We use a realistic template for the n relations that is a partitioned table with some realistic data types, indexes and constraints - similar to a table that we use internally. Example run: https://github.com/neondatabase/neon/actions/runs/12377565956/job/34547386959	2024-12-19 10:25:44 +00:00
Christian Schwarz	a1b0558493	fast import: importer: use aws s3 cli (#10162 ) ## Problem s5cmd doesn't pick up the pod service account ``` 2024/12/16 16:26:01 Ignoring, HTTP credential provider invalid endpoint host, "169.254.170.23", only loopback hosts are allowed. <nil> ERROR "ls s3://neon-dev-bulk-import-us-east-2/import-pgdata/fast-import/v1/br-wandering-hall-w2xobawv": NoCredentialProviders: no valid providers in chain. Deprecated. For verbose messaging see aws.Config.CredentialsChainVerboseErrors ``` ## Summary of changes Switch to the official CLI. ## Testing Tested the pre-merge image in staging, using `job_image` override in project settings. https://neondb.slack.com/archives/C033RQ5SPDH/p1734554944391949?thread_ts=1734368383.258759&cid=C033RQ5SPDH ## Future Work Switch back to s5cmd once https://github.com/peak/s5cmd/pull/769 gets merged. ## Refs - fixes https://github.com/neondatabase/cloud/issues/21876 --------- Co-authored-by: Gleb Novikov <NanoBjorn@users.noreply.github.com>	2024-12-19 10:04:17 +00:00
Alex Chi Z.	cc138b56f9	fix(pageserver): run psql in thread to avoid blocking (#10177 ) ## Problem ref https://github.com/neondatabase/neon/issues/10170 ref https://github.com/neondatabase/neon/issues/9994 The psql command will block the main thread, causing other async tasks to timeout (i.e., HTTP connect). Therefore, we need to move it to an I/O executor thread. ## Summary of changes * run psql connection in a thread --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: John Spray <john@neon.tech>	2024-12-19 09:45:06 +00:00
Konstantin Knizhnik	61fcf64c22	Fix flakiness of test_physical_and_logical_replication.py (#10176 ) ## Problem See https://github.com/neondatabase/neon/issues/10037 test_physical_and_logical_replication.py sometimes failed. ## Summary of changes Add `wait_replica_caughtup` to wait for replica sync Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-12-18 19:15:38 +00:00
Alex Chi Z.	6d3e8096fc	refactor(test): tighten up test_gc_feedback (#10126 ) ## Problem In https://github.com/neondatabase/neon/pull/8103 we changed the test case to have more test coverage of gc_compaction. Now that we have `test_gc_compaction_smoke`, we can revert this test case to serve its original purpose and revert the parameter changes. part of https://github.com/neondatabase/neon/issues/9114 ## Summary of changes * Revert pitr_interval from 60s to 10s. * Assert the physical/logical size ratio in the benchmark. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2024-12-18 18:10:05 +00:00
Alex Chi Z.	3d1c3a80ae	feat(pageserver): add compact queue http endpoint (#10173 ) ## Problem We cannot get the size of the compaction queue and access the info. Part of #9114 ## Summary of changes * Add an API endpoint to get the compaction queue. * gc_compaction test case now waits until the compaction finishes. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-12-18 18:09:02 +00:00
John Spray	835287ba3a	neon_local: add a `flock` to protect against concurrent execution (#10185 ) ## Problem `neon_local` has always been unsafe to run concurrently with itself: it uses simple text files for persistent state, and concurrent runs will step on each other. In some test environments we intentionally handle this with mutexes in python land, but it's fragile to try and always remember to do that. ## Summary of changes - Add a `flock` based mutex around the `main` function of neon_local, using the repo directory as the file to lock - Clean up an Option<> around control_plane_api, this is a drive-by change because it was one of the fields that had a weird effect when previous concurrent stuff stamped on it.	2024-12-18 16:29:47 +00:00
Conrad Ludgate	d63602cc78	chore(proxy): fully remove allow-self-signed-compute flag (#10168 ) When https://github.com/neondatabase/cloud/pull/21856 is merged, this flag is no longer necessary.	2024-12-18 16:03:14 +00:00
Erik Grinaker	1668d39b7c	safekeeper: fix typo in allowlist for `/profile/heap` (#10186 )	2024-12-18 15:51:53 +00:00
Alex Chi Z.	1d12efc428	fix(pageserver): allow repartition errors during gc-compaction smoke tests (#10164 ) ## Problem part of https://github.com/neondatabase/neon/issues/9114 In https://github.com/neondatabase/neon/pull/10127 we fixed the race, but we didn't add the errors to the allowlist. ## Summary of changes * Allow repartition errors in the gc-compaction smoke test. I think it might be worth refactoring the code to allow multiple threads to get a copy of the repartition status (i.e., using Rcu) in the future. Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-12-18 15:37:26 +00:00
Arpad Müller	85696297c5	Add safekeepers command to storcon_cli for listing (#10151 ) Add a `safekeepers` subcommand to `storcon_cli` that allows listing the safekeepers. ``` $ curl -X POST --url http://localhost:1234/control/v1/safekeeper/42 --data \ '{"active":true, "id":42, "created_at":"2023-10-25T09:11:25Z", "updated_at":"2024-08-28T11:32:43Z","region_id":"neon_local","host":"localhost","port":5454,"http_port":0,"version":123,"availability_zone_id":"us-east-2b"}' $ cargo run --bin storcon_cli -- --api http://localhost:1234 safekeepers Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.38s Running `target/debug/storcon_cli --api 'http://localhost:1234' safekeepers` +----+---------+-----------+------+-----------+------------+ \| Id \| Version \| Host \| Port \| Http Port \| AZ Id \| +==========================================================+ \| 42 \| 123 \| localhost \| 5454 \| 0 \| us-east-2b \| +----+---------+-----------+------+-----------+------------+ ``` Also: * Don't return the raw `SafekeeperPersistence` struct that contains the raw database presentation, but instead a new `SafekeeperDescribeResponse` struct. * The `SafekeeperPersistence` struct leaves out the `active` field on purpose because we want to deprecate it and replace it with a `scheduling_policy` one. Part of https://github.com/neondatabase/neon/issues/9981	2024-12-18 12:47:56 +00:00
Konstantin Knizhnik	aaf980f70d	Online checkpoint replication state (#9976 ) ## Problem See https://neondb.slack.com/archives/C04DGM6SMTM/p1733180965970089 Replication state is checkpointed only by shutdown checkpoint. It means that replication snapshots are not removed till compute shutdown. ## Summary of changes Checkpoint replication state during online checkpoint Related Postgres PR: https://github.com/neondatabase/postgres/pull/546 Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-12-18 09:34:38 +00:00
Conrad Ludgate	a354071dd0	Merge pull request #10180 from neondatabase/rc/release-proxy/2024-12-17 Proxy release 2024-12-17	2024-12-18 06:31:05 +00:00
github-actions[bot]	758680d4f8	Proxy release 2024-12-17	2024-12-17 22:06:42 +00:00
a-masterov	c52514ab02	Fix allure report creation on periodic `pg_regress` testing (#10171 ) ## Problem The allure report finishes with the error `HttpError: Resource not accessible by integration` while running the `pg_regress` test against a cloud staging project due to a lack of permissions. ## Summary of changes The permissions are added.	2024-12-17 20:47:44 +00:00
Conrad Ludgate	2ee6bc5ec4	chore(proxy): update vendored postgres libs to edition 2021 (#10139 ) I ran `cargo fix --edition` in each project prior, and it found nothing that needed fixing.	2024-12-17 20:06:18 +00:00
John Spray	fd230227f2	storcon: include preferred AZ in compute notifications (#9953 ) ## Problem It is unreliable for the control plane to infer the AZ for computes from where the tenant is currently attached, because if a tenant happens to be in a degraded state or a release is ongoing while a compute starts, then the tenant's attached AZ can be a different one to where it will run long-term, and the control plane doesn't check back later to restart the compute. This can land in parallel with https://github.com/neondatabase/neon/pull/9947 ## Summary of changes - Thread through the preferred AZ into the compute hook code via the reconciler - Include the preferred AZ in the body of compute hook notifications	2024-12-17 20:04:09 +00:00
Ivan Efremov	93e958341f	[proxy]: Use TLS for cancellation queries (#10152 ) ## Problem pg_sni_router assumes that all the streams are upgradable to TLS. Cancellation requests were declined because of using NoTls config. ## Summary of changes Provide TLS client config for cancellation requests. Fixes [#21789](https://github.com/orgs/neondatabase/projects/65/views/1?pane=issue&itemId=90911361&issue=neondatabase%7Ccloud%7C21789)	2024-12-17 19:26:54 +00:00
Tristan Partin	7dddbb9570	Add pg_repack extension (#10100 ) Our solutions engineers and some customers would like to have this extension available. Link: https://github.com/neondatabase/cloud/issues/18890 Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-12-17 18:36:55 +00:00
Erik Grinaker	a55853f67f	utils: symbolize heap profiles (#10153 ) ## Problem Jemalloc heap profiles aren't symbolized. This is inconvenient, and doesn't work with Grafana Cloud Profiles. Resolves #9964. ## Summary of changes Symbolize the heap profiles in-process, and strip unnecessary cruft. This uses about 100 MB additional memory to cache the DWARF information, but I believe this is already the case with CPU profiles, which use the same library for symbolization. With cached DWARF information, the symbolization CPU overhead is negligible. Example profiles: * [pageserver.pb.gz](https://github.com/user-attachments/files/18141395/pageserver.pb.gz) * [safekeeper.pb.gz](https://github.com/user-attachments/files/18141396/safekeeper.pb.gz)	2024-12-17 16:51:58 +00:00
Mikhail Kot	007b13b79a	Don't build tests in compute image, use ninja (#10149 ) Don't build tests in h3 and rdkit: ~15 min speedup. Use Ninja as cmake generator where possible: ~10 min speedup. Clean apt cache for smaller images: around 250mb size loss for intermediate layers	2024-12-17 16:43:54 +00:00
Alexey Kondratov	2dfd3cab8c	fix(compute): Report compute_backpressure_throttling_seconds as counter (#10125 ) ## Problem It was reported as `gauge`, but it's actually a `counter`. Also add `_total` suffix as that's the convention for counters. The corresponding flux-fleet PR: https://github.com/neondatabase/flux-fleet/pull/386	2024-12-17 16:14:07 +00:00
John Spray	b5833ef259	remote_storage: configurable connection pooling for ABS (#10169 ) ## Problem The ABS SDK's default behavior is to do no connection pooling, i.e. open and close a fresh connection for each request. Under high request rates, this can result in an accumulation of TCP connections in TIME_WAIT or CLOSE_WAIT state, and in extreme cases exhaustion of client ports. Related: https://github.com/neondatabase/cloud/issues/20971 ## Summary of changes - Add a configurable `conn_pool_size` parameter for Azure storage, defaulting to zero (current behavior) - Construct a custom reqwest client using this connection pool size.	2024-12-17 12:24:51 +00:00
Erik Grinaker	b0e43c2f88	postgres_ffi: add `WalStreamDecoder::complete_record()` benchmark (#10158 ) Touches #10097.	2024-12-17 10:35:00 +00:00
a-masterov	e226d7a3d1	Fix docker compose with PG17 (#10165 ) ## Problem It's impossible to run docker compose with compute v17 due to `pg_anon` extension which is not supported under PG17. ## Summary of changes The auto-loading of `pg_anon` is disabled by default	2024-12-17 08:16:54 +00:00
Folke Behrens	aa7ab9b3ac	proxy: Allow dumping TLS session keys for debugging (#10163 ) ## Problem To debug issues with TLS connections there's no easy way to decrypt packets unless a client has special support for logging the keys. ## Summary of changes Add TLS session keys logging to proxy via `SSLKEYLOGFILE` env var gated by flag.	2024-12-16 18:56:24 +00:00
Erik Grinaker	28ccda0a63	test_runner: ignore error in `test_timeline_archival_chaos` (#10161 ) Resolves #10159.	2024-12-16 17:10:55 +00:00
Conrad Ludgate	59b7ff8988	chore(proxy): disallow unwrap and unimplemented (#10142 ) As the title says, I updated the lint rules to no longer allow unwrap or unimplemented. Three special cases: * Tests are allowed to use them * std::sync::Mutex lock().unwrap() is common because it's usually correct to continue panicking on poison * `tokio::spawn_blocking(...).await.unwrap()` is common because it will only error if the blocking fn panics, so continuing the panic is also correct I've introduced two extension traits to help with these last two, that are a bit more explicit so they don't need an expect message every time.	2024-12-16 16:37:15 +00:00
Conrad Ludgate	2e4c9c5704	chore(proxy): remove allow_self_signed from regular proxy (#10157 ) I noticed that the only place we use this flag is for testing console redirect proxy. Makes sense to me to make this assumption more explicit.	2024-12-16 16:11:39 +00:00
Erik Grinaker	3d30a7a934	pageserver: make `RemoteTimelineClient::schedule_index_upload` infallible (#10155 ) Remove an unnecessary `Result` and address a `FIXME`.	2024-12-16 15:54:47 +00:00
Conrad Ludgate	6565fd4056	chore: fix clippy lints 2024-12-06 (#10138 )	2024-12-16 15:33:21 +00:00
Arseny Sher	c5e3314c6e	Add test restarting compute at WAL page boundary (#10111 ) ## Problem We had a similar test in test_logical_replication, but then removed it because it wasn't needed to trigger the LR-related bug. Restarting at a WAL page boundary is still a useful test, so add it back separately. ## Summary of changes Add the test.	2024-12-16 14:53:04 +00:00
Arseny Sher	1ed0e52bc8	Extract safekeeper http client to separate crate. (#10140 ) ## Problem We want to use safekeeper http client in storage controller and neon_local. ## Summary of changes Extract it to separate crate. No functional changes.	2024-12-16 12:07:24 +00:00
Conrad Ludgate	24d6587914	chore(proxy): refactor self-signed config (#10154 ) ## Problem While reviewing #10152 I found it tricky to actually determine whether the connection used `allow_self_signed_compute` or not. I've tried to remove this setting in the past: * https://github.com/neondatabase/neon/pull/7884 * https://github.com/neondatabase/neon/pull/7437 * https://github.com/neondatabase/cloud/pull/13702 But each time it seems it is used by e2e tests ## Summary of changes The `node_info.allow_self_signed_computes` is always initialised to false, and then sometimes inherits the proxy config value. There's no need for this to be in the node_info, so removing it and propagating it via `TcpMechanism` is simpler.	2024-12-16 11:15:25 +00:00
John Spray	ebcbc1a482	pageserver: tighten up code around SLRU dir key handling (#10082 ) ## Problem Changes in #9786 were functionally complete but missed some edges that made testing less robust than it should have been: - `is_key_disposable` didn't consider SLRU dir keys disposable - Timeline `init_empty` was always creating SLRU dir keys on all shards The result was that when we had a bug (https://github.com/neondatabase/neon/pull/10080), it wasn't apparent in tests, because one would only encounter the issue if running on a long-lived timeline with enough compaction to drop the initially created empty SLRU dir keys, _and_ some CLog truncation going on. Closes: https://github.com/neondatabase/cloud/issues/21516 ## Summary of changes - Update is_key_global and init_empty to handle SLRU dir keys properly -- the only functional impact is that we avoid writing some spurious keys in shards >0, but this makes testing much more robust. - Make `test_clog_truncate` explicitly use a sharded tenant The net result is that if one reverts #10080, then tests fail (i.e. this PR is a reproducer for the issue)	2024-12-16 10:06:08 +00:00
Konstantin Knizhnik	117c1b5dde	Do not perform prefetch for temp relations (#10146 ) ## Problem See https://neondb.slack.com/archives/C04DGM6SMTM/p1734002916827019 With recent prefetch fixes for pg17 and `effective_io_concurrency=100`, the pg_regress test stats.sql fails when temp_buffers is set to 100. The stream API will try to lock all 100 of these buffers for prefetch. ## Summary of changes Disable such behaviour for temp relations. Postgres PR: https://github.com/neondatabase/postgres/pull/548 Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-12-16 06:03:53 +00:00
Erik Grinaker	f3ecd5d76a	pageserver: revert flush backpressure (#8550 ) (#10135 ) ## Problem In #8550, we made the flush loop wait for uploads after every layer. This was to avoid unbounded buildup of uploads, and to reduce compaction debt. However, the approach has several problems: * It prevents upload parallelism. * It prevents flush and upload pipelining. * It slows down ingestion even when there is no need to backpressure. * It does not directly backpressure WAL ingestion (only via `disk_consistent_lsn`), and will build up in-memory layers. * It does not directly backpressure based on compaction debt and read amplification. An alternative solution to these problems is proposed in #8390. In the meanwhile, we revert the change to reduce the impact on ingest throughput. This does reintroduce some risk of unbounded upload/compaction buildup. Until https://github.com/neondatabase/neon/issues/8390, this can be addressed in other ways: * Use `max_replication_apply_lag` (aka `remote_consistent_lsn`), which will more directly limit upload debt. * Shard the tenant, which will spread the flush/upload work across more Pageservers and move the bottleneck to Safekeeper. Touches #10095. ## Summary of changes Remove waiting on the upload queue in the flush loop.	2024-12-15 09:45:12 +00:00
Mikhail Kot	cf161e1556	fix(adapter): password not set in role drop (#10130 ) ## Problem When entry was dropped and password wasn't set, new entry had uninitialized memory in controlplane adapter Resolves: https://github.com/neondatabase/cloud/issues/14914 ## Summary of changes Initialize password in all cases, add tests. Minor formatting for less indentation	2024-12-14 17:37:13 +00:00
Konstantin Knizhnik	2521eba674	Check for invalid down link while prefetching B-Tree leaf pages for index-only scan (#9867 ) ## Problem See #9866 The index-only scan prefetch implementation doesn't take into account that the down link may be invalid ## Summary of changes Check that the down link is a valid block number Corresponding Postgres PRs: https://github.com/neondatabase/postgres/pull/534 https://github.com/neondatabase/postgres/pull/535 https://github.com/neondatabase/postgres/pull/536 https://github.com/neondatabase/postgres/pull/537 --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-12-13 20:46:41 +00:00
Alexander Bayandin	d56fea680e	CI: always require aws-oicd-role-arn input to be set (#10145 ) ## Problem `benchmarking` job fails because `aws-oicd-role-arn` input is not set ## Summary of changes: - Set `aws-oicd-role-arn` for `benchmarking` job - Always require `aws-oicd-role-arn` to be set - Rename `aws_oicd_role_arn` to `aws-oicd-role-arn` for consistency	2024-12-13 19:56:32 +00:00
Alex Chi Z.	7ee5dca752	fix(pageserver): race between gc-compaction and repartition (#10127 ) ## Problem close https://github.com/neondatabase/neon/issues/10124 gc-compaction split_gc_jobs is holding the repartition lock for too long. ## Summary of changes * Ensure split_gc_compaction_jobs drops the repartition lock once it finishes cloning the structures. * Update comments. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-12-13 18:22:25 +00:00
Tristan Partin	07d1db54b3	Improve comments and log messages in the logical replication monitor (#9974 ) Improved comments will help others when they read the code, and the log messages will help others understand why the logical replication monitor works the way it does. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-12-13 18:10:42 +00:00
Konstantin Knizhnik	eeabecd89f	Correctly update LFC used_pages in case of LFC resize (#10128 ) ## Problem LFC used_pages statistic is not updated in case of LFC resize (shrinking `neon.file_cache_size_limit`) ## Summary of changes Update `lfc_ctl->used_pages` in `lfc_change_limit_hook` Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-12-13 17:40:26 +00:00
Christian Schwarz	fcff752851	fix(test_timeline_archival_chaos): flakiness caused by orphan layers (#10083 ) The test was failing with the scary but generic message `Remote storage metadata corrupted`. The underlying scrubber error is `Orphan layer detected: ...`. The test kills pageserver at random points, hence it's expected that we leak layers if we're killed in the window after layer upload but before it's referenced from index part. Refer to generation numbers RFC for details. Refs: - fixes https://github.com/neondatabase/neon/issues/9988 - root-cause analysis https://github.com/neondatabase/neon/issues/9988#issuecomment-2520673167	2024-12-13 16:28:21 +00:00
Alexander Bayandin	2c91062828	test_prefetch: reduce timeout to default 5m from 10m (#10105 ) ## Problem `test_prefetch` is flaky (https://github.com/neondatabase/neon/issues/9961), but if it passes, the run time is less than 30 seconds — we don't need an extended timeout for it. ## Summary of changes - Remove extended test timeout for `test_prefetch`	2024-12-13 14:52:54 +00:00
Arseny Sher	ce8eb089f3	Extract public sk types to safekeeper_api (#10137 ) ## Problem We want to extract safekeeper http client to separate crate for use in storage controller and neon_local. However, many types used in the API are internal to safekeeper. ## Summary of changes Move them to safekeeper_api crate. No functional changes. ref https://github.com/neondatabase/neon/issues/9011	2024-12-13 14:06:27 +00:00
a-masterov	7dc382601c	Fix pg_regress tests on a cloud staging instance (#10134 ) ## Problem pg_regress tests start failing due to unique ids added to Neon error messages ## Summary of changes Patches updated	2024-12-13 13:59:04 +00:00
Rahul Patil	2451969d5c	fix(ci): Allow github-action-script to post reports (#10136 ) Allow github-action-script to post reports. Failed CI: https://github.com/neondatabase/neon/actions/runs/12304655364/job/34342554049#step:13:514	2024-12-13 12:22:15 +00:00
JC Grünhage	59ef701925	CI(deploy): fix git tag/release creation (#10119 ) ## Problem When moving the comment on proxy-releases from the yaml doc into a javascript code block, I missed converting the comment marker from `#` to `//`. ## Summary of changes Correctly convert comment marker.	2024-12-12 23:38:20 +00:00
Alexander Bayandin	ac04bad457	CI: don't run debug builds with LFC (#10123 ) ## Problem I've noticed that debug builds with LFC fail more frequently and, for some reason, their failures block merging (but they should not) ## Summary of changes - Do not run Debug builds with LFC	2024-12-12 22:55:38 +00:00
Peter Bendel	2f3f98a319	use OIDC role instead of AWS access keys for managing test runner (#10117 ) in periodic pagebench workflow ## Problem for background see https://github.com/neondatabase/cloud/issues/21545 ## Summary of changes use OIDC role to manage runners instead of AWS access key which needs to be periodically rotated ## logs seems to work in https://github.com/neondatabase/neon/actions/runs/12298575888/job/34322306127#step:6:1	2024-12-12 20:25:39 +00:00
Alex Chi Z.	5ff4b991c7	feat(pageserver): gc-compaction split over LSN (#9900 ) ## Problem part of https://github.com/neondatabase/neon/issues/9114, stacked PR over https://github.com/neondatabase/neon/pull/9897, partially refactored to help with https://github.com/neondatabase/neon/issues/10031 ## Summary of changes * gc-compaction takes `above_lsn` parameter. We only compact the layers above this LSN, and all data below the LSN are treated as if they are on the ancestor branch. * refactored gc-compaction to take `GcCompactJob` that describes the rectangular range to be compacted. * Added unit test for this case. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-12-12 20:23:24 +00:00
John Spray	a93e3d31cc	storcon: refine logic for choosing AZ on tenant creation (#10054 ) ## Problem When we update our scheduler/optimization code to respect AZs properly (https://github.com/neondatabase/neon/pull/9916), the choice of AZ becomes a much higher-stakes decision. We will pretty much always run a tenant in its preferred AZ, and that AZ is fixed for the lifetime of the tenant (unless a human intervenes) Eventually, when we do auto-balancing based on utilization, I anticipate that part of that will be to automatically change the AZ of tenants if our original scheduling decisions have caused imbalance, but as an interim measure, we can at least avoid making this scheduling decision based purely on which AZ contains the emptiest node. This is a precursor to https://github.com/neondatabase/neon/pull/9947 ## Summary of changes - When creating a tenant, instead of scheduling a shard and then reading its preferred AZ back, make the AZ decision first. - Instead of choosing AZ based on which node is emptiest, use the median utilization of nodes in each AZ to pick the AZ to use. This avoids bad AZ decisions during periods when some node has very low utilization (such as after replacing a dead node) I considered also making the selection a weighted pseudo-random choice based on utilization, but wanted to avoid destabilising tests with that for now.	2024-12-12 19:35:38 +00:00
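The AZ selection described in the commit above (median utilization per AZ rather than the single emptiest node) can be sketched roughly as follows. This is an illustrative sketch only: the `Node` struct, `median`, and `pick_az` are hypothetical names, not the storage controller's actual types or functions, and utilization is assumed to be a simple fraction.

```rust
use std::collections::BTreeMap;

/// Hypothetical node description; not the storage controller's real type.
struct Node {
    az: &'static str,
    /// Assumed scale: 0.0 = empty, 1.0 = full.
    utilization: f64,
}

/// Median of a non-empty slice (mean of the two middle values for even lengths).
fn median(values: &mut [f64]) -> f64 {
    values.sort_by(|a, b| a.total_cmp(b));
    let mid = values.len() / 2;
    if values.len() % 2 == 0 {
        (values[mid - 1] + values[mid]) / 2.0
    } else {
        values[mid]
    }
}

/// Pick the AZ whose nodes have the lowest median utilization.
/// Returns `None` when there are no nodes at all.
fn pick_az(nodes: &[Node]) -> Option<&'static str> {
    // Group utilizations by AZ.
    let mut by_az: BTreeMap<&'static str, Vec<f64>> = BTreeMap::new();
    for node in nodes {
        by_az.entry(node.az).or_default().push(node.utilization);
    }
    // Compare AZs by their median, so one nearly empty node cannot dominate.
    by_az
        .into_iter()
        .map(|(az, mut utils)| (az, median(&mut utils)))
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(az, _)| az)
}
```

With one freshly replaced, nearly empty node in an otherwise busy AZ, an "emptiest node" rule would pick that AZ, whereas the median-based rule prefers the AZ that is less loaded overall.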
Rahul Patil	6d5687521b	fix(ci): Allow github-script to post test reports (#10120 ) Allow github-script to post test reports	2024-12-12 18:53:35 +00:00
Heikki Linnakangas	53721266f1	Disable connection logging in pgbouncer by default (#10118 ) It can produce a lot of logs, making pgbouncer itself consume all CPU in extreme cases. We saw that happen in stress testing.	2024-12-12 17:05:58 +00:00
a-masterov	2f3433876f	Change the channel for notification. (#10112 ) ## Problem Now notifications about failures in `pg_regress` tests run on the staging cloud instance, reach the channel `on-call-staging-stream`, while they should reach `on-call-qa-staging-stream` ## Summary of changes The channel changed.	2024-12-12 16:34:07 +00:00
Rahul Patil	58d45c6e86	ci(fix): Use OIDC auth to login on ECR (#10055 ) ## Problem CI currently uses static credentials in some places. These are less secure and hard to maintain, so we are going to deprecate them and use OIDC auth. ## Summary of changes - ci(fix): Use OIDC auth to upload artifact on s3 - ci(fix): Use OIDC auth to login on ECR	2024-12-12 15:13:08 +00:00
Conrad Ludgate	e502e880b5	chore(proxy): remove code for old API (#10109 ) ## Problem Now that https://github.com/neondatabase/cloud/issues/15245 is done, we can remove the old code. ## Summary of changes Removes support for the ManagementV2 API, in favour of the ProxyV1 API.	2024-12-12 13:42:50 +00:00
Arseny Sher	c9a773af37	Fix test_subscriber_synchronous_commit flakiness. (#10057 ) `6f7aeaa` configured LFC for USE_LFC case, but omitted setting shared_buffers for non USE_LFC, causing flakiness. ref https://github.com/neondatabase/neon/issues/9989	2024-12-12 11:57:00 +00:00
Vlad Lazar	ec0ce06c16	tests: default interpreted proto in tests (#10079 ) ## Problem We aren't using the sharded interpreted wal receiver protocol in all tests. ## Summary of changes Default to the interpreted protocol.	2024-12-12 10:53:10 +00:00
Conrad Ludgate	1738fd0a96	Merge pull request #10107 from neondatabase/rc/release-proxy/2024-12-12 Proxy release 2024-12-12	2024-12-12 10:21:30 +00:00
Conrad Ludgate	87b7edfc72	Merge branch 'release-proxy' into rc/release-proxy/2024-12-12	2024-12-12 09:58:31 +00:00
Alexander Bayandin	0bd8eca9ca	Storage: create release PRs On Fridays (#10017 ) ## Problem To give Storage more time on preprod — create a release branch on Friday ## Summary of changes - Automatically create Storage release PR on Friday instead of Monday	2024-12-12 09:18:50 +00:00
Misha Sakhnov	739f627b96	Bump vm-builder v0.35.0 -> v0.37.1 (#10015 ) Bump version to pick up changes introduced in the neonvm-daemon to support sys fs based CPU scaling (https://github.com/neondatabase/autoscaling/issues/1082). Previous update: https://github.com/neondatabase/neon/pull/9208	2024-12-12 08:45:52 +00:00
github-actions[bot]	def05700d5	Proxy release 2024-12-12	2024-12-12 06:02:08 +00:00
Arpad Müller	342cbea255	storcon: add safekeeper list API (#10089 ) This adds an API to the storage controller to list safekeepers registered to it. This PR does a `diesel print-schema > storage_controller/src/schema.rs` because of an inconsistency between up.sql and schema.rs, introduced by [this](`2c142f14f7`) commit, so there are some updates to `schema.rs` due to that. As a followup to this, we should maybe think about running `diesel print-schema` in CI. Part of #9981	2024-12-12 01:09:24 +00:00
Tristan Partin	b391b29bdc	Improve typing in test_runner/fixtures/httpserver.py (#10103 ) Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-12-11 22:21:42 +00:00
Erik Grinaker	5126ebbfed	test_runner: bump test_check_visibility_map timeout (#10091 ) ## Problem `test_check_visibility_map` has been seen to time out in debug tests. ## Summary of changes Bump the timeout to 10 minutes (test reports indicate 7 minutes is sufficient). We don't want to disable the test entirely in debug builds, to exercise this with debug assertions enabled. Resolves #10069.	2024-12-11 21:37:25 +00:00
Arpad Müller	7fa986bc92	Do tenant manifest validation with index-part (#10007 ) This adds some validation of invariants that we want to uphold wrt the tenant manifest and `index_part.json`: * the data the manifest has about a timeline must match with the data in `index_part.json`. It might actually change, e.g. when we do reparenting during detach ancestor, but that requires the timeline to be unoffloaded, i.e. removed from the manifest. * any timeline mentioned in index part, must, if present, be archived. If we unarchive, we first update the tenant manifest to unoffload, and only then update index part. And one needs to archive before offloading. * it is legal for timelines to be mentioned in the manifest but have no `index_part`: this is a temporary state visible during deletion of the timeline. if the pageserver crashed, an attach of the tenant will clean the state up. * it is also legal for offloaded timelines to have an `ancestor_retain_lsn` of None while having an `ancestor_timeline_id`. This is for the to-be-added flattening functionality: the plan is to set former to None if we have flattened a timeline. follow-up of #9942 part of #8088	2024-12-11 20:10:22 +00:00
Vlad Lazar	e8395807a5	storcon: allow for more concurrency in drain/fill operations (#10093 ) ## Problem We saw the drain/fill operations not drain fast enough in ap-southeast. ## Summary of changes These are some quick changes to speed it up: * double reconcile concurrency - this is now half of the available reconcile bandwidth * reduce the waiter polling timeout - this way we can spawn new reconciliations faster	2024-12-11 19:43:40 +00:00
Vlad Lazar	a3e80448e8	pageserver/storcon: add patch endpoints for tenant config metrics (#10020 ) ## Problem Cplane and storage controller tenant config changes are not additive. Any change overrides all existing tenant configs. This would be fine if both did client side patching, but that's not the case. Once this merges, we must update cplane to use the PATCH endpoint. ## Summary of changes ### High Level Allow for patching of tenant configuration with a `PATCH /v1/tenant/config` endpoint. It takes the same data as its PUT counterpart. For example, the payload below will update `gc_period` and unset `compaction_period`. All other fields are left in their original state. ``` { "tenant_id": "1234", "gc_period": "10s", "compaction_period": null } ``` ### Low Level * PS and storcon gain `PATCH /v1/tenant/config` endpoints. PS endpoint is only used for cplane managed instances. * `storcon_cli` is updated to have separate commands for `set-tenant-config` and `patch-tenant-config` Related https://github.com/neondatabase/cloud/issues/21043	2024-12-11 19:16:33 +00:00
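The patch semantics described in the commit above (a value sets a field, `null` unsets it, an absent field is left alone) can be modelled with a minimal sketch. The `Config` and `Patch` aliases and `apply_patch` are hypothetical stand-ins, not the pageserver's or storage controller's real types, and config values are simplified to strings.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a tenant config: field name -> value.
type Config = BTreeMap<String, String>;

/// A patch: `Some(v)` sets a field, `None` (JSON `null`) unsets it,
/// and fields absent from the patch are left untouched.
type Patch = BTreeMap<String, Option<String>>;

/// Apply `patch` to `config` in place.
fn apply_patch(config: &mut Config, patch: &Patch) {
    for (field, change) in patch {
        match change {
            Some(value) => {
                config.insert(field.clone(), value.clone());
            }
            None => {
                config.remove(field);
            }
        }
    }
}
```

A PUT-style update would instead replace the whole map, which is why two writers that each send only "their" fields overwrite one another without a PATCH operation.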
Anastasia Lubennikova	ef233e91ef	Update compute_installed_extensions metric: (#9891 ) add owned_by_superuser field to filter out system extensions. While on it, also correct related code: - fix the metric setting: use set() instead of inc() in a loop. inc() is not idempotent and can lead to incorrect results if the function called multiple times. Currently it is only called at compute start, but this will change soon. - fix the return type of the installed_extensions endpoint to match the metric. Currently it is only used in the test.	2024-12-11 16:43:26 +00:00
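The idempotency point in the commit above (`set()` versus `inc()` in a loop) can be illustrated with a toy gauge. The `Gauge` type and both collection functions below are hypothetical; they stand in for a real metrics-library gauge and for the actual collection code, which is not shown in this log.

```rust
/// Toy gauge, standing in for a real metrics-library gauge.
#[derive(Default)]
struct Gauge {
    value: i64,
}

impl Gauge {
    fn inc(&mut self) {
        self.value += 1;
    }

    fn set(&mut self, v: i64) {
        self.value = v;
    }
}

/// Not idempotent: every call adds the full count again.
fn collect_with_inc(gauge: &mut Gauge, installed: &[&str]) {
    for _ in installed {
        gauge.inc();
    }
}

/// Idempotent: repeated calls converge on the current count.
fn collect_with_set(gauge: &mut Gauge, installed: &[&str]) {
    gauge.set(installed.len() as i64);
}
```

Calling the `inc()` variant twice for the same two extensions reports four, while the `set()` variant still reports two, which matters once collection runs more than once per compute start.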
Mikhail Kot	dee2041cd3	walproposer: fix link error on debian 12 / ubuntu 22 (#10090 ) ## Problem Linking walproposer library (e.g. `cargo t`) produces linker errors: /home/myrrc/neon/pgxn/neon/walproposer_compat.c:169: undefined reference to `pg_snprintf' The library with these symbols (libpgcommon.a) is present ## Summary of changes Changed order of libraries resolution for linker	2024-12-11 16:23:59 +00:00
Arseny Sher	e4bb1ca7d8	Increase neon_local http client to compute timeout in reconfigure. (#10088 ) Seems like 30s sometimes not enough when CI runners are overloaded, causing pull_timeline flakiness. ref https://github.com/neondatabase/neon/issues/9731#issuecomment-2535946443	2024-12-11 15:46:50 +00:00
a-masterov	b987648e71	Enable LFC for all the PG versions. (#10068 ) ## Problem We added support for LFC for tests but are still using it only for the PG17 release. ## Summary of changes LFC is enabled for all PG versions. Errors in tests with LFC enabled now block merging as usual. We keep tests with disabled LFC for PG17 release. Tests on debug builds with LFC enabled still don't affect permission to merge.	2024-12-11 15:28:10 +00:00
Mikhail Kot	c79c1dd8e9	compute_ctl: don't panic if control plane can't be reached (#10078 ) ## Problem If the control plane cannot be reached for some reason, compute_ctl panics ## Summary of changes panic is removed in favour of returning an error. Code is reformatted a bit for more flat control flow Resolves: #5391	2024-12-11 15:03:11 +00:00
Vlad Lazar	a53db73851	pageserver: don't drop multixact slrus on non zero shards (#10086 ) ## Problem We get slru truncation commands on non-zero shards. Compaction will drop the slru dir keys and ingest will fail when receiving such records. https://github.com/neondatabase/neon/pull/10080 fixed it for clog, but not for multixact. ## Summary of changes Only truncate multixact slrus on shard zero. I audited the rest of the ingest code and it looks fine from this pov.	2024-12-11 14:28:18 +00:00
Christian Schwarz	9ae980bf4f	page_service: don't count time spent in Batcher towards smgr latency metrics (#10075 ) ## Problem With pipelining enabled, the time a request spends in the batcher stage counts towards the smgr op latency. If pipelining is disabled, that time is not accounted for. In practice, this results in a jump in smgr getpage latencies in various dashboards and degrades the internal SLO. ## Solution In a similar vein to #10042 and with a similar rationale, this PR stops counting the time spent in batcher stage towards smgr op latency. The smgr op latency metric is reduced to the actual execution time. Time spent in batcher stage is tracked in a separate histogram. I expect to remove that histogram after batching rollout is complete, but it will be helpful in the meantime to reason about the rollout.	2024-12-11 13:37:08 +00:00
Vlad Lazar	665369c439	wal_decoder: fix compact key protobuf encoding (#10074 ) ## Problem Protobuf doesn't support 128 bit integers, so we encode the keys as two 64 bit integers. Issue is that when we split the 128 bit compact key we use signed 64 bit integers to represent the two halves. This may result in a negative lower half when relnode is larger than `0x00800000`. When we convert the lower half to an i128 we get a negative `CompactKey`. ## Summary of Changes Use unsigned integers when encoding into Protobuf. ## Deployment * Prod: We disabled the interpreted proto, so no compat concerns. * Staging: Disable the interpreted proto, do one release, and then release the fixed version. We do this because a negative int32 will convert to a large uint32 value and could give a key in the actual pageserver space. In production we would work around this by adding new fields to the proto and deprecating the old ones, but we can make our lives easy here. * Pre-prod: Same as staging	2024-12-11 12:35:02 +00:00
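The sign-extension problem described in the commit above can be reproduced in a few lines. This is a sketch under assumptions: the function names are hypothetical, and the real encoding lives in the `wal_decoder` crate's Protobuf conversion code, which is not reproduced here. It only shows why splitting an `i128` into signed 64-bit halves can yield a negative key on reassembly.

```rust
/// Split an i128 into signed 64-bit halves (the problematic encoding).
fn split_signed(key: i128) -> (i64, i64) {
    ((key >> 64) as i64, key as i64)
}

/// Reassemble from signed halves. Casting a negative `lo` to i128
/// sign-extends it, which sets every bit of the upper half.
fn join_signed_buggy(hi: i64, lo: i64) -> i128 {
    ((hi as i128) << 64) | (lo as i128)
}

/// Split an i128 into unsigned 64-bit halves (the fixed encoding).
fn split_unsigned(key: i128) -> (u64, u64) {
    ((key >> 64) as u64, key as u64)
}

/// Reassemble from unsigned halves; widening a u64 zero-extends,
/// so the two halves never interfere with each other.
fn join_unsigned(hi: u64, lo: u64) -> i128 {
    (((hi as u128) << 64) | (lo as u128)) as i128
}
```

Any key with bit 63 set round-trips correctly through the unsigned pair but comes back negative through the signed pair.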
JC Grünhage	d7aeca2f34	CI(deploy): create git tags/releases before triggering deploy workflows (#10022 ) ## Problem When dev deployments are disabled (or fail), the tags for releases aren't created. It makes more sense to have tag and release creation before the deployment to prevent situations like [this](https://github.com/neondatabase/neon/pull/9959). It is not enough to move the tag creation before the deployment. If the deployment fails, re-running the job isn't possible because the API call to create the tag will fail. ## Summary of changes - Tag/Release creation now happens before the deployment - The two steps for tag and release have been merged into a bigger one - There are new checks to ensure that if the tags/releases already exist as expected, things will continue just fine.	2024-12-11 09:41:34 +00:00
John Spray	38415a9816	pageserver: fix ingest handling of CLog truncate (#10080 ) ## Problem In #9786 we stop storing SLRUs on non-zero shards. However, there was one code path during ingest that still tries to enumerate SLRU relations on all shards. This fails if it sees a tenant who has never seen any write to an SLRU, or who has done such thorough compaction+GC that it has dropped its SLRU directory key. ## Summary of changes - Avoid trying to list SLRU relations on nonzero shards	2024-12-11 09:16:11 +00:00
Matthias van de Meent	597125e124	Disable readstream's reliance on seqscan readahead (#9860 ) Neon doesn't have seqscan detection of its own, so stop read_stream from trying to utilize that readahead, and instead make it issue readahead of its own. ## Problem @knizhnik noticed that we didn't issue smgrprefetch[v] calls for seqscans in PG17 due to the move to the read_stream API, which assumes that the underlying IO facilities do seqscan detection for readahead. That is a wrong assumption when Neon is involved, so let's remove the code that applies that assumption. ## Summary of changes Remove the cases where seqscans are detected and prefetch is disabled as a consequence, and instead don't do that detection. PG PR: https://github.com/neondatabase/postgres/pull/532	2024-12-11 00:51:05 +00:00
Matthias van de Meent	e71d20d392	Emit nbtree vacuum cycle id in nbtree xlog through forced FPIs (#9932 ) This fixes neondatabase/neon#9929. ## Postgres repo PRS: - PG17: https://github.com/neondatabase/postgres/pull/538 - PG16: https://github.com/neondatabase/postgres/pull/539 - PG15: https://github.com/neondatabase/postgres/pull/540 - PG14: https://github.com/neondatabase/postgres/pull/541 ## Problem see #9929 ## Summary of changes We update the split code to force the code to emit an FPI whenever the cycle ID might be interesting for concurrent btree vacuum.	2024-12-10 19:42:52 +00:00
Alex Chi Z.	aa0554fd1e	feat(test_runner): allowed_errors in storage scrubber (#10062 ) ## Problem resolve https://github.com/neondatabase/neon/issues/9988#issuecomment-2528239437 ## Summary of changes * New verbose mode for storage scrubber scan metadata (pageserver) that contains the error messages. * Filter allowed_error list from the JSON output to determine the healthy flag status. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-12-10 17:00:47 +00:00
Heikki Linnakangas	b853f78136	Print a log message if GetPage response takes too long (#10046 ) We have metrics for GetPage request latencies, but this is an extra measure to capture requests that take way too long in the logs. The log message is printed every 10 s, until the response is received: ``` PG:2024-12-09 16:02:07.715 GMT [1782845] LOG: [NEON_SMGR] [shard 0] no response received from pageserver for 10.000 s, still waiting (sent 10613 requests, received 10612 responses) PG:2024-12-09 16:02:17.723 GMT [1782845] LOG: [NEON_SMGR] [shard 0] no response received from pageserver for 20.008 s, still waiting (sent 10613 requests, received 10612 responses) PG:2024-12-09 16:02:19.719 GMT [1782845] LOG: [NEON_SMGR] [shard 0] received response from pageserver after 22.006 s ```	2024-12-10 16:26:56 +00:00
Alex Chi Z.	6ad99826c1	fix(pageserver): refresh_gc_info should always increase cutoff (#9862 ) ## Problem close https://github.com/neondatabase/cloud/issues/19671 ``` Timeline ----------------------------- ^ last GC happened LSN ^ original retention period setting = 24hr > refresh-gc-info updates the gc_info ^ planned cutoff (gc_info) ^ customer set retention to 48hr, and it's still within the last GC LSN ^1 ^2 we have two choices: (1) update the planned cutoff to move backwards, or (2) keep the current one ``` In this patch, we decided to keep the current cutoff instead of moving back the gc_info to avoid races. In the future, we could allow the planned gc cutoff to go back once cplane sends a retention_history tenant config update, but this requires a careful revisit of the code. ## Summary of changes Ensure that GC cutoffs never go back if retention settings get changed. Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-12-10 15:23:26 +00:00
Konstantin Knizhnik	311ee793b9	Fix handling in-flight requests in prefetch buffer resize (#9968 ) ## Problem See https://github.com/neondatabase/neon/issues/9961 Current implementation of prefetch buffer resize doesn't correctly handle in-flight requests ## Summary of changes 1. Fix index of entry we should wait for if new prefetch buffer size is smaller than number of in-flight requests. 2. Correctly set flush position Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-12-10 15:01:40 +00:00
Erik Grinaker	ad472bd4a1	test_runner: add visibility map test (#9940 ) Verifies that visibility map pages are correctly maintained across shards. Touches #9914.	2024-12-10 12:07:00 +00:00
Arpad Müller	c51db1db61	Replace MAX_KEYS_PER_DELETE constant with function (#10061 ) Azure has a different per-request limit of 256 items for bulk deletion compared to the number of 1000 on AWS. Therefore, we need to support multiple values. Due to `GenericRemoteStorage`, we can't add an associated constant, but it has to be a function. The PR replaces the `MAX_KEYS_PER_DELETE` constant with a function of the same name, implemented on both the `RemoteStorage` trait as well as on `GenericRemoteStorage`. The value serves as hint of how many objects to pass to the `delete_objects` function. Reading: * https://learn.microsoft.com/en-us/rest/api/storageservices/blob-batch * https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html Part of #7931	2024-12-10 11:29:38 +00:00
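The commit above turns a per-backend limit into a method so that callers can size their batches. A minimal sketch of that idea follows; the trait here is a hypothetical, heavily simplified stand-in (the real `RemoteStorage` trait is larger and async), and `S3Like`, `AzureLike`, and `delete_batches` are illustrative names only. The limits 1000 and 256 are the ones quoted in the commit message.

```rust
/// Hypothetical, simplified trait; the real trait has many more methods.
trait RemoteStorage {
    /// Hint for how many keys to pass to one bulk-delete request.
    fn max_keys_per_delete(&self) -> usize;
}

struct S3Like;
struct AzureLike;

impl RemoteStorage for S3Like {
    fn max_keys_per_delete(&self) -> usize {
        1000
    }
}

impl RemoteStorage for AzureLike {
    fn max_keys_per_delete(&self) -> usize {
        256
    }
}

/// Split `keys` into batches no larger than the backend's limit.
/// Assumes the limit is non-zero (`chunks` panics on a zero size).
fn delete_batches<'a>(storage: &dyn RemoteStorage, keys: &'a [String]) -> Vec<&'a [String]> {
    keys.chunks(storage.max_keys_per_delete()).collect()
}
```

A method works through a trait object (`&dyn RemoteStorage`), whereas an associated constant cannot be read through one, which fits the commit's remark that a function was needed.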
Ivan Efremov	34c1295594	[proxy] impr: Additional logging for cancellation queries (#10039 ) ## Problem Since cancellation tasks are spawned in the background, their logs are sometimes missing context. https://neondb.slack.com/archives/C060N3SEF9D/p1733427801527419?thread_ts=1733419882.560159&cid=C060N3SEF9D ## Summary of changes Add `session_id` and change loglevel for cancellation queries	2024-12-10 10:14:28 +00:00
Evan Fleming	b593e51eae	safekeeper: use arc for global timelines and config (#10051 ) Hello! I was interested in potentially making some contributions to Neon and looking through the issue backlog I found [8200](https://github.com/neondatabase/neon/issues/8200) which seemed like a good first issue to attempt to tackle. I see it was assigned a while ago so apologies if I'm stepping on any toes with this PR. I also apologize for the size of this PR. I'm not sure if there is a simple way to reduce it given the footprint of the components being changed. ## Problem This PR is attempting to address part of the problem outlined in issue [8200](https://github.com/neondatabase/neon/issues/8200). Namely to remove global static usage of timeline state in favour of `Arc<GlobalTimelines>` and to replace wasteful clones of `SafeKeeperConf` with `Arc<SafeKeeperConf>`. I did not opt to tackle `RemoteStorage` in this PR to minimize the amount of changes as this PR is already quite large. I also did not opt to introduce a `SafekeeperApp` wrapper struct to similarly minimize changes but I can tackle either or both of these omissions in this PR if folks would like. ## Summary of changes - Remove static usage of `GlobalTimelines` in favour of `Arc<GlobalTimelines>` - Wrap `SafeKeeperConf` in `Arc` to avoid wasteful clones of the underlying struct ## Some additional thoughts - We seem to currently store `SafeKeeperConf` in `GlobalTimelines` and then expose it through a public `get_global_config` function which requires locking. This seems needlessly wasteful and based on observed usage we could remove this public accessor and force consumers to acquire `SafeKeeperConf` through the new Arc reference.	2024-12-09 21:09:20 +00:00
Alex Chi Z.	4c4cb80186	fix(pageserver): fix gc-compaction racing with legacy gc (#10052 ) ## Problem close https://github.com/neondatabase/neon/issues/10049, close https://github.com/neondatabase/neon/issues/10030, close https://github.com/neondatabase/neon/issues/8861 part of https://github.com/neondatabase/neon/issues/9114 The legacy gc process calls `get_latest_gc_cutoff`, which uses a Rcu different than the gc_info struct. In the gc_compaction_smoke test case, the "latest" cutoff could be lower than the gc_info struct, causing gc-compaction to collect data that could be accessed by `latest_gc_cutoff`. Technically speaking, there's nothing wrong with gc-compaction using gc_info without considering latest_gc_cutoff, because gc_info is the source of truth. But anyways, let's fix it. ## Summary of changes * gc-compaction uses `latest_gc_cutoff` instead of gc_info to determine the gc horizon. * if a gc-compaction is scheduled via tenant compaction iteration, it will take the gc_block lock to avoid racing with functionalities like detach ancestor (if it's triggered via manual compaction API without scheduling, then it won't take the lock) --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2024-12-09 20:06:06 +00:00
a-masterov	92273b6d5e	Enable the pg_regress tests on staging for PG17 (#9978 ) ## Problem Currently, we run the `pg_regress` tests only for PG16 However, PG17 is a part of Neon and should be tested as well ## Summary of changes Modified the workflow and added a patch for PG17 enabling the `pg_regress` tests. The problem with leftovers was solved by using branches.	2024-12-09 19:30:39 +00:00
Arpad Müller	e74e7aac93	Use updated patched azure SDK crates (#10036 ) For a while already, we've been unable to update the Azure SDK crates due to Azure adopting use of a non-tokio async runtime, see #7545. The effort to upstream the fix got stalled, and I think it's better to switch to a patched version of the SDK that is up to date. Now we have a fork of the SDK under the neondatabase github org, to which I have applied Conrad's rebased patches: https://github.com/neondatabase/azure-sdk-for-rust/tree/neon . The existence of a fork will also help with shipping bulk delete support before it's upstreamed (#7931). Also, in related news, the Azure SDK has gotten a rift in development, where the main branch pertains to a future, to-be-officially-blessed release of the SDK, and the older versions, which we are currently using, are on the `legacy` branch. Upstream doesn't really want patches for the `legacy` branch any more, they want to focus on the `main` efforts. However, even then, the `legacy` branch is still newer than what we are having right now, so let's switch to `legacy` for now. Depending on how long it takes, we can switch to the official version of the SDK once it's released or switch to the upstream `main` branch if there are changes we want before that. As a nice side effect of this PR, we now use reqwest 0.12 everywhere, dropping the dependency on version 0.11. Fixes #7545	2024-12-09 15:50:06 +00:00
Vlad Lazar	4cca5cdb12	deps: update url to 2.5.4 for RUSTSEC-2024-0421 (#10059 ) ## Problem See https://rustsec.org/advisories/RUSTSEC-2024-0421 ## Summary of changes Update url crate to 2.5.4.	2024-12-09 14:57:42 +00:00
Arpad Müller	9d425b54f7	Update AWS SDK crates (#10056 ) Result of running: cargo update -p aws-types -p aws-sigv4 -p aws-credential-types -p aws-smithy-types -p aws-smithy-async -p aws-sdk-kms -p aws-sdk-iam -p aws-sdk-s3 -p aws-config We want to keep the AWS SDK up to date as that way we benefit from new developments and improvements.	2024-12-09 12:46:59 +00:00
John Spray	ec790870d5	storcon: automatically clear Pause/Stop scheduling policies to enable detaches (#10011 ) ## Problem We saw a tenant get stuck when it had been put into Pause scheduling mode to pin it to a pageserver, then it was left idle for a while and the control plane tried to detach it. Close: https://github.com/neondatabase/neon/issues/9957 ## Summary of changes - When changing policy to Detached or Secondary, set the scheduling policy to Active. - Add a test that exercises this - When persisting tenant shards, set their `generation_pageserver` to null if the placement policy is not Attached (this enables consistency checks to work, and avoids leaving state in the DB that could be confusing/misleading in future)	2024-12-07 13:05:09 +00:00
Christian Schwarz	4d7111f240	page_service: don't count time spent flushing towards smgr latency metrics (#10042 ) ## Problem In #9962 I changed the smgr metrics to include time spent on flush. It isn't under our (=storage team's) control how long that flush takes because the client can stop reading requests. ## Summary of changes Stop the timer as soon as we've buffered up the response in the `pgb_writer`. Track flush time in a separate metric. --------- Co-authored-by: Yuchen Liang <70461588+yliang412@users.noreply.github.com>	2024-12-07 08:57:55 +00:00
Alex Chi Z.	b1fd086c0c	test(pageserver): disable gc_compaction smoke test for now (#10045 ) ## Problem The test is flaky. ## Summary of changes Disable the test. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-12-06 22:30:04 +00:00
Heikki Linnakangas	b6eea65597	Fix error message if PS connection is lost while receiving prefetch (#9923 ) If the pageserver connection is lost while receiving the prefetch request, the prefetch queue is cleared. The error message prints the values from the prefetch slot, but because the slot was already cleared, they're all zeros: LOG: [NEON_SMGR] [shard 0] No response from reading prefetch entry 0: 0/0/0.0 block 0. This can be caused by a concurrent disconnect To fix, make local copies of the values. In the passing, also add a sanity check that if the receive() call succeeds, the prefetch slot is still intact.	2024-12-06 20:56:57 +00:00
Alex Chi Z.	c42c28b339	feat(pageserver): gc-compaction split job and partial scheduler (#9897 ) ## Problem part of https://github.com/neondatabase/neon/issues/9114, stacked PR over #9809 The compaction scheduler now schedules partial compaction jobs. ## Summary of changes * Add the compaction job splitter based on size. * Schedule subcompactions using the compaction scheduler. * Test subcompaction scheduler in the smoke regress test. * Temporarily disable layer map checks --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-12-06 18:44:26 +00:00
Tristan Partin	e4837b0a5a	Bump sql_exporter to 0.16.0 (#10041 ) Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-12-06 17:43:55 +00:00
Erik Grinaker	14c4fae64a	test_runner/performance: add improved bulk insert benchmark (#9812 ) Adds an improved bulk insert benchmark, including S3 uploads. Touches #9789.	2024-12-06 15:17:15 +00:00
Vlad Lazar	cc70fc802d	pageserver: add metric for number of wal records received by each shard (#10035 ) ## Problem With the current metrics we can't identify which shards are ingesting data at any given time. ## Summary of changes Add a metric for the number of wal records received for processing by each shard. This is per (tenant, timeline, shard).	2024-12-06 12:51:41 +00:00
Alexey Kondratov	fa07097f2f	chore: Reorganize and refresh CODEOWNERS (#10008 ) ## Problem We didn't have a codeowner for `/compute`, so nobody was auto-assigned for PRs like #9973 ## Summary of changes While on it: 1. Group codeowners into sections. 2. Remove control plane from the `/compute_tools` because it's primarily the internal `compute_ctl` code. 3. Add control plane (and compute) to `/libs/compute_api` because that's the shared public interface of the compute.	2024-12-06 11:44:50 +00:00
Erik Grinaker	7838659197	pageserver: assert that keys belong to shard (#9943 ) We've seen cases where stray keys end up on the wrong shard. This shouldn't happen. Add debug assertions to prevent this. In release builds, we should be lenient in order to handle changing key ownership policies. Touches #9914.	2024-12-06 10:24:13 +00:00
Vlad Lazar	3f1c542957	pageserver: add disk consistent and remote lsn metrics (#10005 ) ## Problem There's no metrics for disk consistent LSN and remote LSN. This stuff is useful when looking at ingest performance. ## Summary of changes Two per timeline metrics are added: `pageserver_disk_consistent_lsn` and `pageserver_projected_remote_consistent_lsn`. I went for the projected remote lsn instead of the visible one because that more closely matches remote storage write tput. Ideally we would have both, but these metrics are expensive.	2024-12-06 10:21:52 +00:00
Erik Grinaker	ec4072f845	pageserver: add `wait_until_flushed` parameter for timeline checkpoint (#10013 ) ## Problem I'm writing an ingest benchmark in #9812. To time S3 uploads, I need to schedule a flush of the Pageserver's in-memory layer, but don't actually want to wait around for it to complete (which will take a minute). ## Summary of changes Add a parameter `wait_until_flushed` (default `true`) for `timeline/checkpoint` to control whether to wait for the flush to complete.	2024-12-06 10:12:39 +00:00
Erik Grinaker	56f867bde5	pageserver: only zero truncated FSM page on owning shard (#10032 ) ## Problem FSM pages are managed like regular relation pages, and owned by a single shard. However, when truncating the FSM relation the last FSM page was zeroed out on all shards. This is unnecessary and potentially confusing. The superfluous keys will be removed during compactions, as they do not belong on these shards. Resolves #10027. ## Summary of changes Only zero out the truncated FSM page on the owning shard.	2024-12-06 07:22:22 +00:00
Arpad Müller	d1ab7471e2	Fix desc_str for Azure container (#10021 ) Small logs fix I've noticed while working on https://github.com/neondatabase/cloud/issues/19963 .	2024-12-05 20:51:57 +00:00
Tristan Partin	6ff4175fd7	Send Content-Type header on reconfigure request from neon_local (#10029 ) Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-12-05 20:30:35 +00:00
Tristan Partin	6331cb2161	Bump anyhow to 1.0.94 (#10028 ) We were over a year out of date. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-12-05 19:42:52 +00:00
Alex Chi Z.	71f38d1354	feat(pageserver): support schedule gc-compaction (#9809 ) ## Problem part of https://github.com/neondatabase/neon/issues/9114 gc-compaction can take a long time. This patch adds support for scheduling a gc-compaction job. The compaction loop will first handle L0->L1 compaction, and then gc compaction. The scheduled jobs are stored in a non-persistent queue within the tenant structure. This will be the building block for the partial compaction trigger -- if the system determines that we need to do a gc compaction, it will partition the keyspace and schedule several jobs. Each of these jobs will run for a short amount of time (e.g., 1 min). L0 compaction will be prioritized over gc compaction. ## Summary of changes * Add compaction scheduler in tenant. * Run scheduled compaction in integration tests. * Change the manual compaction API to allow scheduling a compaction instead of immediately doing it. * Add LSN upper bound as gc-compaction parameter. If we schedule partial compactions, gc_cutoff might move across different runs. Therefore, we need to pass a pre-determined gc_cutoff beforehand. (TODO: support LSN lower bound so that we can compact an arbitrary "rectangle" in the layer map) * Refactor the gc_compaction internal interface. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-12-05 19:37:17 +00:00
Tristan Partin	c0ba416967	Add compute_logical_snapshots_bytes metric (#9887 ) This metric exposes the size of all non-temporary logical snapshot files. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-12-05 19:04:33 +00:00
Alexey Kondratov	13e8105740	feat(compute): Allow specifying the reconfiguration concurrency (#10006 ) ## Problem We need a higher concurrency during reconfiguration in case of many DBs, but the instance is already running and used by the client. We can easily get out of `max_connections` limit, and the current code won't handle that. ## Summary of changes Default to 1, but also allow control plane to override this value for specific projects. It's also recommended to bump `superuser_reserved_connections` += `reconfigure_concurrency` for such projects to ensure that we always have enough spare connections for reconfiguration process to succeed. Quick workaround for neondatabase/cloud#17846	2024-12-05 17:57:25 +00:00
Erik Grinaker	db79304416	storage_controller: increase shard scan timeout (#10000 ) ## Problem The node shard scan timeout of 1 second is a bit too aggressive, and we've seen this cause test failures. The scans are performed in parallel across nodes, and the entire operation has a 15 second timeout. Resolves #9801. ## Summary of changes Increase the timeout to 5 seconds. This is still enough to time out on a network failure and retry successfully within 15 seconds.	2024-12-05 17:29:21 +00:00
Ivan Efremov	b547681e08	Merge pull request #10024 from neondatabase/rc/release-proxy/2024-12-05 Proxy release 2024-12-05	2024-12-05 15:35:35 +02:00
Ivan Efremov	0fd211537b	proxy: Present new auth backend cplane_proxy_v1 (#10012 ) Implement a new auth backend based on the current Neon backend to switch to the new Proxy V1 cplane API. Implements [#21048](https://github.com/neondatabase/cloud/issues/21048)	2024-12-05 13:00:40 +02:00
Yuchen Liang	a83bd4e81c	pageserver: fix buffered-writer on macos build (#10019 ) ## Problem In https://github.com/neondatabase/neon/pull/9693, we forgot to check macos build. The [CI run](https://github.com/neondatabase/neon/actions/runs/12164541897/job/33926455468) on main showed that macos build failed with unused variables and dead code. ## Summary of changes - add `allow(dead_code)` and `allow(unused_variables)` to the relevant code that is not used on macos. Signed-off-by: Yuchen Liang <yuchen@neon.tech>	2024-12-05 13:00:40 +02:00
Conrad Ludgate	ecdad5e6d5	chore: update rust-postgres (#10002 ) Like #9931 but without rebasing upstream just yet, to try and minimise the differences. Removes all proxy-specific commits from the rust-postgres fork, now that proxy no longer depends on them. Merging upstream changes to come later.	2024-12-05 13:00:40 +02:00
Conrad Ludgate	d028929945	chore: update clap (#10009 ) This updates clap to use a new version of anstream	2024-12-05 13:00:40 +02:00
Yuchen Liang	7b0e3db868	pageserver: make `BufferedWriter` do double-buffering (#9693 ) Closes #9387. ## Problem `BufferedWriter` cannot proceed while the owned buffer is flushing to disk. We want to implement double buffering so that the flush can happen in the background. See #9387. ## Summary of changes - Maintain two owned buffers in `BufferedWriter`. - The writer is in charge of copying the data into an owned, aligned buffer and, once it is full, submitting it to the flush task. - The flush background task is in charge of flushing the owned buffer to disk, and returning the buffer to the writer for reuse. - The writer and the flush background task communicate through a bi-directional channel. For in-memory layer, we also need to be able to read from the buffered writer in `get_values_reconstruct_data`. To handle this case, we did the following - Replace `VirtualFile::write_all` with `VirtualFile::write_all_at`, and use `Arc` to share it between writer and background task. - Leverage `IoBufferMut::freeze` to get a cheaply clonable `IoBuffer`: one clone will be submitted to the channel, the other clone will be saved within the writer to serve reads. When we want to reuse the buffer, we can invoke `IoBuffer::into_mut`, which gives us back the mutable aligned buffer. - InMemoryLayer reads are now aware of the maybe_flushed part of the buffer. Caveat - We removed the owned version of write, because this interface does not work well with buffer alignment. The result is that without direct IO enabled, [`download_object`](`a439d57050/pageserver/src/tenant/remote_timeline_client/download.rs (L243)`) does one more memcpy than before this PR due to the switch to use `_borrowed` version of the write. - "Bypass aligned part of write" could be implemented later to avoid large amount of memcpy. Testing - use a oneshot-channel-based control mechanism to make flush behavior deterministic in test. 
- test reading from `EphemeralFile` when the last submitted buffer is not flushed, in-progress, and done flushing to disk. ## Performance We see performance improvement for small values, and regression on big values, likely due to being CPU bound + disk write latency. [Results](https://www.notion.so/neondatabase/Benchmarking-New-BufferedWriter-11-20-2024-143f189e0047805ba99acda89f984d51?pvs=4) --------- Signed-off-by: Yuchen Liang <yuchen@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-12-05 13:00:40 +02:00
John Spray	088eb72dd7	tests: make storcon scale test AZ-aware (#9952 ) ## Problem We have a scale test for the storage controller which also acts as a good stress test for scheduling stability. However, it created nodes with no AZs set. ## Summary of changes - Bump node count to 6 and set AZs on them. This is a precursor to other AZ-related PRs, to make sure any new code that's landed is getting scale tested in an AZ-aware environment.	2024-12-05 13:00:40 +02:00
a-masterov	d550e3f626	Create a branch for compute release (#9637 ) ## Problem We practice a manual release flow for the compute module. This will allow automation of the compute release process. ## Summary of changes The workflow was modified to make a compute release automatically on the branch release-compute.	2024-12-05 13:00:40 +02:00
Erik Grinaker	8c6b41daf5	Display reqwest error source (#10004 ) ## Problem Reqwest errors don't include details about the inner source error. This means that we get opaque errors like: ``` receive body: error sending request for url (http://localhost:9898/v1/location_config) ``` Instead of the more helpful: ``` receive body: error sending request for url (http://localhost:9898/v1/location_config): operation timed out ``` Touches #9801. ## Summary of changes Include the source error for `reqwest::Error` wherever it's displayed.	2024-12-05 13:00:40 +02:00
Alexey Kondratov	bbb050459b	feat(compute): Set default application_name for pgbouncer connections (#9973 ) ## Problem When a client specifies `application_name`, pgbouncer propagates it to Postgres. Yet, if the client doesn't do it, we have a hard time figuring out who opens a lot of Postgres connections (including the `cloud_admin` ones). See this investigation as an example: https://neondb.slack.com/archives/C0836R0RZ0D ## Summary of changes I haven't found this documented, but it looks like pgbouncer accepts standard Postgres connstring parameters in the connstring in the `[databases]` section, so put the default `application_name=pgbouncer` there. That way, we will always see who opens Postgres connections. I did tests, and if the client specifies an `application_name`, pgbouncer overrides this default, so it only works if it's not specified or set to blank `&application_name=` in the connection string. This is the last place we could potentially open some Postgres connections without `application_name`. Everything else should be either of two: 1. Direct client connections without `application_name`, but these should be strictly non-`cloud_admin` ones 2. Some ad-hoc internal connections, so if we see spikes of unidentified `cloud_admin` connections, we will need to investigate it again. Fixes neondatabase/cloud#20948	2024-12-05 13:00:40 +02:00
Conrad Ludgate	cab498c787	feat(proxy): add option to forward startup params (#9979 ) (stacked on #9990 and #9995) Partially fixes #1287 with a custom option field to enable the fixed behaviour. This allows us to gradually roll out the fix without silently changing the observed behaviour for our customers. related to https://github.com/neondatabase/cloud/issues/15284	2024-12-05 13:00:40 +02:00
Folke Behrens	6359342ffb	Assign /libs/proxy/ to proxy team (#10003 )	2024-12-05 13:00:40 +02:00
Erik Grinaker	13285c2a5e	pageserver: return proper status code for heatmap_upload errors (#9991 ) ## Problem During deploys, we see a lot of 500 errors due to heatmap uploads for inactive tenants. These should be 503s instead. Resolves #9574. ## Summary of changes Make the secondary tenant scheduler use `ApiError` rather than `anyhow::Error`, to propagate the tenant error and convert it to an appropriate status code.	2024-12-05 13:00:40 +02:00
Peter Bendel	33790d14a3	fix parsing human time output like "50m37s" (#10001 ) ## Problem In ingest_benchmark.yml workflow we use pgcopydb tool to migrate project. pgcopydb logs human time. Our parsing of the human time doesn't work for times like "50m37s". [Example workflow](https://github.com/neondatabase/neon/actions/runs/12145539948/job/33867418065#step:10:479) contains "57m45s" but we [reported](https://github.com/neondatabase/neon/actions/runs/12145539948/job/33867418065#step:10:500) only the seconds part: 45.000 s ## Summary of changes add a regex pattern for Minute/Second combination	2024-12-05 13:00:40 +02:00
Peter Bendel	709b8cd371	optimize parms for ingest bench (#9999 ) ## Problem we tried different parallelism settings for ingest bench ## Summary of changes the following settings seem optimal after merging - SK side Wal filtering - batched getpages Settings: - effective_io_concurrency 100 - concurrency limit 200 (different from Prod!) - jobs 4, maintenance workers 7 - 10 GB chunk size	2024-12-05 13:00:40 +02:00
Vlad Lazar	1c9bbf1a92	storcon: return an error for drain attempts while paused (#9997 ) ## Problem We currently allow drain operations to proceed while the node policy is paused. ## Summary of changes Return a precondition failed error in such cases. The orchestrator is updated in https://github.com/neondatabase/infra/pull/2544 to skip drain and fills if the pageserver is paused. Closes: https://github.com/neondatabase/neon/issues/9907	2024-12-05 13:00:40 +02:00
Christian Schwarz	16163fb850	page_service: enable batching in Rust & Python Tests + Python benchmarks (#9993 ) This is the first step towards batching rollout. Refs - rollout plan: https://github.com/neondatabase/cloud/issues/20620 - task https://github.com/neondatabase/neon/issues/9377 - uber-epic: https://github.com/neondatabase/neon/issues/9376	2024-12-05 13:00:40 +02:00
Alexander Bayandin	73ccc2b08c	test_page_service_batching: fix non-numeric metrics (#9998 ) ## Problem ``` 2024-12-03T15:42:46.5978335Z + poetry run python /__w/neon/neon/scripts/ingest_perf_test_result.py --ingest /__w/neon/neon/test_runner/perf-report-local 2024-12-03T15:42:49.5325077Z Traceback (most recent call last): 2024-12-03T15:42:49.5325603Z File "/__w/neon/neon/scripts/ingest_perf_test_result.py", line 165, in <module> 2024-12-03T15:42:49.5326029Z main() 2024-12-03T15:42:49.5326316Z File "/__w/neon/neon/scripts/ingest_perf_test_result.py", line 155, in main 2024-12-03T15:42:49.5326739Z ingested = ingest_perf_test_result(cur, item, recorded_at_timestamp) 2024-12-03T15:42:49.5327488Z ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 2024-12-03T15:42:49.5327914Z File "/__w/neon/neon/scripts/ingest_perf_test_result.py", line 99, in ingest_perf_test_result 2024-12-03T15:42:49.5328321Z psycopg2.extras.execute_values( 2024-12-03T15:42:49.5328940Z File "/github/home/.cache/pypoetry/virtualenvs/non-package-mode-_pxWMzVK-py3.11/lib/python3.11/site-packages/psycopg2/extras.py", line 1299, in execute_values 2024-12-03T15:42:49.5335618Z cur.execute(b''.join(parts)) 2024-12-03T15:42:49.5335967Z psycopg2.errors.InvalidTextRepresentation: invalid input syntax for type numeric: "concurrent-futures" 2024-12-03T15:42:49.5336287Z LINE 57: 'concurrent-futures', 2024-12-03T15:42:49.5336462Z ^ ``` ## Summary of changes - `test_page_service_batching`: save non-numeric params as `labels` - Add a runtime check that `metric_value` is NUMERIC	2024-12-05 13:00:40 +02:00
Christian Schwarz	c719be6474	tests & benchmarks: unify the way we customize the default tenant config (#9992 ) Before this PR, some override callbacks used `.default()`, others used `.setdefault()`. As of this PR, all callbacks use `.setdefault()` which I think is least prone to failure. Aligning on a single way will set the right example for future tests that need such customization. The `test_pageserver_getpage_throttle.py` technically is a change in behavior: before, it replaced the `tenant_config` field, now it just configures the throttle. This is what I believe is intended anyway.	2024-12-05 13:00:40 +02:00
Arpad Müller	718645e56c	Support tenant manifests in the scrubber (#9942 ) Support tenant manifests in the storage scrubber: * list the manifests, order them by generation * delete all manifests except for the two most recent generations * for the latest manifest: try parsing it. I've tested this patch by running it against a staging bucket and it successfully deleted stuff (and avoided deleting the latest two generations). In follow-up work, we might want to also check some invariants of the manifest, as mentioned in #8088. Part of #9386 Part of #8088 --------- Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-12-05 13:00:40 +02:00
Conrad Ludgate	fbc8c36983	chore(proxy): enforce single host+port (#9995 ) proxy doesn't ever provide multiple hosts/ports, so this code adds a lot of complexity of error handling for no good reason. (stacked on #9990)	2024-12-05 13:00:40 +02:00
Alexey Immoreev	5519e42612	Improvement: add console redirect timeout warning (#9985 ) ## Problem There is currently no indication that the session will be cancelled after 2 minutes. ## Summary of changes The timeout is now logged for the user.	2024-12-05 13:00:40 +02:00
Erik Grinaker	4157eaf4c5	pageserver: respond to multiple shutdown signals (#9982 ) ## Problem The Pageserver signal handler would only respond to a single signal and initiate shutdown. Subsequent signals were ignored. This meant that a `SIGQUIT` sent after a `SIGTERM` had no effect (e.g. in the case of a slow or stalled shutdown). The `test_runner` uses this to force shutdown if graceful shutdown is slow. Touches #9740. ## Summary of changes Keep responding to signals after the initial shutdown signal has been received. Arguably, the `test_runner` should also use `SIGKILL` rather than `SIGQUIT` in this case, but it seems reasonable to respond to `SIGQUIT` regardless.	2024-12-05 13:00:40 +02:00
Conrad Ludgate	60241127e2	chore(proxy): remove postgres config parser and md5 support (#9990 ) Keeping the `mock` postgres cplane adaptor using "stock" tokio-postgres allows us to remove a lot of dead weight from our actual postgres connection logic.	2024-12-05 13:00:40 +02:00
John Spray	f7d5322e8b	pageserver: more detailed logs when calling re-attach (#9996 ) ## Problem We saw a peculiar case where a pageserver apparently got a 0-tenant response to `/re-attach` but we couldn't see the request landing on a storage controller. It was hard to confirm retrospectively that the pageserver was configured properly at the moment it sent the request. ## Summary of changes - Log the URL to which we are sending the request - Log the NodeId and metadata that we sent	2024-12-05 13:00:40 +02:00
John Spray	41bb9c5280	pageserver: only store SLRUs & aux files on shard zero (#9786 ) ## Problem Since https://github.com/neondatabase/neon/pull/9423 the non-zero shards no longer need SLRU content in order to do GC. This data is now redundant on shards >0. One release cycle after merging that PR, we may merge this one, which also stops writing those pages to shards > 0, reaping the efficiency benefit. Closes: https://github.com/neondatabase/neon/issues/7512 Closes: https://github.com/neondatabase/neon/issues/9641 ## Summary of changes - Avoid storing SLRUs on non-zero shards - Bonus: avoid storing aux files on non-zero shards	2024-12-05 13:00:40 +02:00
John Spray	69c0d61c5c	storcon: in shard splits, inherit parent's AZ (#9946 ) ## Problem Sharded tenants should be run in a single AZ for best performance, so that computes have AZ-local latency to all the shards. Part of https://github.com/neondatabase/neon/issues/8264 ## Summary of changes - When we split a tenant, instead of updating each shard's preferred AZ to wherever it is scheduled, propagate the preferred AZ from the parent. - Drop the check in `test_shard_preferred_azs` that asserts shards end up in their preferred AZ: this will not be true again until the optimize_attachment logic is updated to make this so. The existing check wasn't testing anything about scheduling, it was just asserting that we set preferred AZ in a way that matches the way things happen to be scheduled at time of split.	2024-12-05 13:00:40 +02:00
Christian Schwarz	63cb8ce975	pageserver: only throttle pagestream requests & bring back throttling deduction for smgr latency metrics (#9962 ) ## Problem In the batching PR - https://github.com/neondatabase/neon/pull/9870 I stopped deducting the time-spent-in-throttle from latency metrics, i.e., - smgr latency metrics (`SmgrOpTimer`) - basebackup latency (+scan latency, which I think is part of basebackup). The reason for stopping the deduction was that with the introduction of batching, the trick with tracking time-spent-in-throttle inside RequestContext and swap-replacing it from the `impl Drop for SmgrOpTimer` no longer worked with >1 requests in a batch. However, deducting time-spent-in-throttle is desirable because our internal latency SLO definition does not account for throttling. ## Summary of changes - Redefine throttling to be a page_service pagestream request throttle instead of a throttle for repository `Key` reads through `Timeline::get` / `Timeline::get_vectored`. - This means reads done by `basebackup` are no longer subject to any throttle. - The throttle applies after batching, before handling of the request. - Drive-by fix: make throttle sensitive to cancellation. - Rename metric label `kind` from `timeline_get` to `pagestream` to reflect the new scope of throttling. To avoid config format breakage, we leave the config field named `timeline_get_throttle` and ignore the `task_kinds` field. This will be cleaned up in a future PR. ## Trade-Offs Ideally, we would apply the throttle before reading a request off the connection, so that we queue the minimal amount of work inside the process. However, that's not possible because we need to do shard routing. The redefinition of the throttle to limit pagestream request rate instead of repository `Key` rate comes with several downsides: - We're no longer able to use the throttle mechanism for other tasks, e.g. image layer creation. However, in practice, we never used that capability anyways. 
- We no longer throttle basebackup.	2024-12-05 13:00:40 +02:00
Erik Grinaker	907e4aa3c4	test_runner: use immediate shutdown in `test_sharded_ingest` (#9984 ) ## Problem `test_sharded_ingest` ingests a lot of data, which can cause shutdown to be slow e.g. due to local "S3 uploads" or compactions. This can cause test flakes during teardown. Resolves #9740. ## Summary of changes Perform an immediate shutdown of the cluster.	2024-12-05 13:00:40 +02:00
Erik Grinaker	0a2a84b766	safekeeper,pageserver: add heap profiling (#9778 ) ## Problem We don't have good observability for memory usage. This would be useful e.g. to debug OOM incidents or optimize performance or resource usage. We would also like to use continuous profiling with e.g. [Grafana Cloud Profiles](https://grafana.com/products/cloud/profiles-for-continuous-profiling/) (see https://github.com/neondatabase/cloud/issues/14888). This PR is intended as a proof of concept, to try it out in staging and drive further discussions about profiling more broadly. Touches https://github.com/neondatabase/neon/issues/9534. Touches https://github.com/neondatabase/cloud/issues/14888. Depends on #9779. Depends on #9780. ## Summary of changes Adds an HTTP route `/profile/heap` that takes a heap profile and returns it. Query parameters: * `format`: output format (`jemalloc` or `pprof`; default `pprof`). Unlike CPU profiles (see #9764), heap profiles are not symbolized and require the original binary to translate addresses to function names. To make this work with Grafana, we'll probably have to symbolize the process server-side -- this is left as future work, as are other output formats like SVG. Heap profiles don't work on macOS due to limitations in jemalloc.	2024-12-05 13:00:40 +02:00
a-masterov	85b12ddd52	Add support for the extensions test for Postgres v17 (#9748 ) ## Problem The extensions for Postgres v17 are ready but we do not test the extensions shipped with v17 ## Summary of changes Build the test image based on Postgres v17. Run the tests for v17. --------- Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech>	2024-12-05 13:00:40 +02:00
Christian Schwarz	dd76f1eeee	page_service: batching observability & include throttled time in smgr metrics (#9870 ) This PR - fixes smgr metrics https://github.com/neondatabase/neon/issues/9925 - adds an additional startup log line logging the current batching config - adds a histogram of batch sizes global and per-tenant - adds a metric exposing the current batching config The issue described in #9925 is that before this PR, request latency was only observed after batching. This means that smgr latency metrics (most importantly getpage latency) don't account for - `wait_lsn` time - time spent waiting for batch to fill up / the executor stage to pick up the batch. The fix is to use a per-request batching timer, like we did before the initial batching PR. We funnel those timers through the entire request lifecycle. I noticed that even before the initial batching changes, we weren't accounting for the time spent writing & flushing the response to the wire. This PR drive-by fixes that deficiency by dropping the timers at the very end of processing the batch, i.e., after the `pgb.flush()` call. I was unable to maintain the behavior that we deduct time-spent-in-throttle from various latency metrics. The reason is that we're using a single counter in `RequestContext` to track micros spent in throttle. But there are N metrics timers in the batch, one per request. As a consequence, the practice of consuming the counter in the drop handler of each timer no longer works because all but the first timer will encounter error `close() called on closed state`. A failed attempt to maintain the current behavior can be found in https://github.com/neondatabase/neon/pull/9951. So, this PR removes the deduction behavior from all metrics. 
I started a discussion on Slack about the implications this has for our internal SLO calculation: https://neondb.slack.com/archives/C033RQ5SPDH/p1732910861704029 # Refs - fixes https://github.com/neondatabase/neon/issues/9925 - sub-issue https://github.com/neondatabase/neon/issues/9377 - epic: https://github.com/neondatabase/neon/issues/9376	2024-12-05 13:00:40 +02:00
Christian Schwarz	8963ac85f9	storcon_cli tenant-describe: include tenant-wide information in output (#9899 ) Before this PR, the storcon_cli didn't have a way to show the tenant-wide information of the TenantDescribeResponse. Sadly, the `Serialize` impl for the tenant config doesn't skip on `None`, so, the output becomes a bit bloated. Maybe we can use `skip_serializing_if(Option::is_none)` in the future. => https://github.com/neondatabase/neon/issues/9983	2024-12-05 13:00:40 +02:00
John Spray	4a488b3e24	storcon: use proper schedule context during node delete (#9958 ) ## Problem I was touching `test_storage_controller_node_deletion` because for AZ scheduling work I was adding a change to the storage controller (kick secondaries during optimisation) that made a FIXME in this test defunct. While looking at it I also realized that we can easily fix the way node deletion currently doesn't use a proper ScheduleContext, using the iterator type recently added for that purpose. ## Summary of changes - A testing-only behavior in storage controller where if a secondary location isn't yet ready during optimisation, it will be actively polled. - Remove workaround in `test_storage_controller_node_deletion` that previously was needed because optimisation would get stuck on cold secondaries. - Update node deletion code to use a `TenantShardContextIterator` and thereby a proper ScheduleContext	2024-12-05 13:00:40 +02:00
Alexey Kondratov	c4987b0b13	fix(testing): Use 1 MB shared_buffers even with LFC (#9969 ) ## Problem After enabling LFC in tests and lowering `shared_buffers` we started having more problems with `test_pg_regress`. ## Summary of changes Set `shared_buffers` to 1MB to both exercise getPage requests/LFC, and still have enough room for Postgres to operate. Anything smaller might not be enough for Postgres under load, and can cause errors like 'no unpinned buffers available'. See Konstantin's comment [1] as well. Fixes #9956 [1]: https://github.com/neondatabase/neon/issues/9956#issuecomment-2511608097	2024-12-05 13:00:40 +02:00
Tristan Partin	84b4821118	Stop changing the value of neon.extension_server_port at runtime (#9972 ) On reconfigure, we no longer passed a port for the extension server which caused us to not write out the neon.extension_server_port line. Thus, Postgres thought we were setting the port to the default value of 0. PGC_POSTMASTER GUCs cannot be set at runtime, which causes the following log messages: > LOG: parameter "neon.extension_server_port" cannot be changed without restarting the server > LOG: configuration file "/var/db/postgres/compute/pgdata/postgresql.conf" contains errors; unaffected changes were applied Fixes: https://github.com/neondatabase/neon/issues/9945 Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-12-05 13:00:40 +02:00
Conrad Ludgate	32ba9811f9	feat(proxy): emit JWT auth method and JWT issuer in parquet logs (#9971 ) Fix the HTTP AuthMethod to accommodate the JWT authorization method. Introduces the JWT issuer as an additional field in the parquet logs	2024-12-05 13:00:40 +02:00
Folke Behrens	a0cd64c4d3	Bump OTel, tracing, reqwest crates (#9970 )	2024-12-05 13:00:40 +02:00
Arseny Sher	84687b743d	Update consensus protocol spec (#9607 ) The spec was written for the buggy protocol which we had before the one more similar to Raft was implemented. Update the spec with what we currently have. ref https://github.com/neondatabase/neon/issues/8699	2024-12-05 13:00:40 +02:00
Folke Behrens	b6f93dcec9	proxy: Create Elasticache credentials provider lazily (#9967 ) ## Problem The credentials provider tries to connect to AWS STS even when we use plain Redis connections. ## Summary of changes * Construct the CredentialsProvider only when needed ("irsa").	2024-12-05 13:00:40 +02:00
Alexander Bayandin	4f6c594973	CI(replication-tests): fix notifications about replication-tests failures (#9950 ) ## Problem `if: ${{ github.event.schedule }}` gets skipped if a previous step has failed, but we want to run the step for both `success` and `failure` ## Summary of changes - Add `!cancelled()` to notification step if-condition, to skip only cancelled jobs	2024-12-05 13:00:40 +02:00
Conrad Ludgate	a750c14735	fix(proxy): forward notifications from authentication (#9948 ) Fixes https://github.com/neondatabase/cloud/issues/20973. This refactors `connect_raw` in order to return direct access to the delayed notices. I cannot find a way to test this with psycopg2 unfortunately, although testing it with psql does return the expected results.	2024-12-05 13:00:40 +02:00
John Spray	9ce0dd4e55	storcon: add metric for AZ scheduling violations (#9949 ) ## Problem We can't easily tell how far the state of shards is from their AZ preferences. This can be a cause of performance issues, so it's important for diagnosability that we can tell easily if there are significant numbers of shards that aren't running in their preferred AZ. Related: https://github.com/neondatabase/cloud/issues/15413 ## Summary of changes - In reconcile_all, count shards that are scheduled into the wrong AZ (if they have a preference), and publish it as a prometheus gauge. - Also calculate a statistic for how many shards wanted to reconcile but couldn't. This is clearly a lazy calculation: reconcile all only runs periodically. But that's okay: shards in the wrong AZ is something that only matters if it stays that way for some period of time.	2024-12-05 13:00:40 +02:00
Erik Grinaker	0e1a336607	test_runner: improve `wait_until` (#9936 ) Improves `wait_until` by: * Use `timeout` instead of `iterations`. This allows changing the timeout/interval parameters independently. * Make `timeout` and `interval` optional (default 20s and 0.5s). Most callers don't care. * Only output status every 1s by default, and add optional `status_interval` parameter. * Remove `show_intermediate_error`, this was always emitted anyway. Most callers have been updated to use the defaults, except where they had good reason otherwise.	2024-12-05 13:00:40 +02:00
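The `wait_until` behaviour described in the commit above (a `timeout` instead of `iterations`, optional `interval`, throttled status output) can be sketched roughly as follows. This is a minimal illustration, not the actual `test_runner` helper: the signature, the use of `TimeoutError`, and the `print`-based status output are assumptions; only the parameter names and the defaults (20s, 0.5s, 1s) come from the commit message.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def wait_until(
    func: Callable[[], T],
    timeout: float = 20.0,          # default taken from the commit message
    interval: float = 0.5,          # default taken from the commit message
    status_interval: float = 1.0,   # "only output status every 1s by default"
) -> T:
    """Call func repeatedly until it stops raising or timeout seconds elapse.

    Hypothetical sketch: the real helper may differ in signature and logging.
    """
    deadline = time.monotonic() + timeout
    next_status = time.monotonic() + status_interval
    while True:
        try:
            return func()
        except Exception as e:
            now = time.monotonic()
            if now >= deadline:
                raise TimeoutError(f"timed out after {timeout}s") from e
            # Throttle status output independently of the polling interval.
            if now >= next_status:
                print(f"still waiting: {e}")
                next_status = now + status_interval
            time.sleep(interval)
```

Because `timeout` and `interval` are separate parameters, a caller can poll more often without changing how long it is willing to wait, which is the independence the commit message refers to.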
Anastasia Lubennikova	7fc2912d06	Update pgvector to 0.8.0 (#9733 )	2024-12-05 13:00:40 +02:00
John Spray	fdf231c237	storcon: don't take any Service locks in /status and /ready (#9944 ) ## Problem We saw unexpected container terminations when running in k8s with small CPU resource requests. The /status and /ready handlers called `maybe_forward`, which always takes the lock on Service::inner. If there is a lot of writer lock contention, and the container is starved of CPU, this increases the likelihood that we will get killed by the kubelet. It isn't certain that this was a cause of issues, but it is a potential source that we can eliminate. ## Summary of changes - Revise logic to return immediately if the URL is in the non-forwarded list, rather than calling maybe_forward	2024-12-05 13:00:40 +02:00
Konstantin Knizhnik	1e08b5dccc	Fix issues with prefetch ring buffer resize (#9847 ) ## Problem See https://neondb.slack.com/archives/C04DGM6SMTM/p1732110190129479 We observe the following error in the logs ``` [XX000] ERROR: [NEON_SMGR] [shard 3] Incorrect prefetch read: status=1 response=0x7fafef335138 my=128 receive=128 ``` most likely caused by changing `neon.readahead_buffer_size` ## Summary of changes 1. Copy shard state 2. Do not use prefetch_set_unused in readahead_buffer_resize 3. Change prefetch buffer overflow criteria --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-12-05 13:00:40 +02:00
Alexander Bayandin	030810ed3e	Compute image: prepare Postgres v14-v16 for Debian 12 (#9954 ) ## Problem Current compute images for Postgres 14-16 don't build on Debian 12 because of issues with extensions. This PR fixes that, but for the current setup, it is mostly a no-op change. ## Summary of changes - Use `/bin/bash -euo pipefail` as SHELL to fail earlier - Fix `plv8` build: backport a trivial patch for v8 - Fix `postgis` build: depend `sfgal` version on Debian version instead of Postgres version Tested in: https://github.com/neondatabase/neon/pull/9849	2024-12-05 13:00:40 +02:00
Konstantin Knizhnik	62b74bdc2c	Add GUC controlling whether to pause recovery if some critical GUCs at replica have smaller value than on primary (#9057 ) ## Problem See https://github.com/neondatabase/neon/issues/9023 ## Summary of changes Add GUC `recovery_pause_on_misconfig` allowing not to pause in case of replica and primary configuration mismatch See https://github.com/neondatabase/postgres/pull/501 See https://github.com/neondatabase/postgres/pull/502 See https://github.com/neondatabase/postgres/pull/503 See https://github.com/neondatabase/postgres/pull/504 ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2024-12-05 13:00:40 +02:00
Folke Behrens	8b7e9ed820	Merge the consumption metric pushes (#9939 ) #8564 ## Problem The main and backup consumption metric pushes are completely independent, resulting in different event time windows and different idempotency keys. ## Summary of changes * Merge the push tasks, but keep chunks the same size.	2024-12-05 13:00:40 +02:00
Christian Schwarz	5dad89acd4	page_service: rewrite batching to work without a timeout (#9851 ) # Problem The timeout-based batching adds latency to unbatchable workloads. We can choose a short batching timeout (e.g. 10us) but that requires high-resolution timers, which tokio doesn't have. I thoroughly explored options to use OS timers (see [this](https://github.com/neondatabase/neon/pull/9822) abandoned PR). In short, it's not an attractive option because any timer implementation adds non-trivial overheads. # Solution The insight is that, in the steady state of a batchable workload, the time we spend in `get_vectored` will be hundreds of microseconds anyway. If we prepare the next batch concurrently to `get_vectored`, we will have a sizeable batch ready once `get_vectored` of the current batch is done and do not need an explicit timeout. This can be reasonably described as pipelining of the protocol handler. # Implementation We model the sub-protocol handler for pagestream requests (`handle_pagerequests`) as two futures that form a pipeline: 1. Batching: read requests from the connection and fill the current batch 2. Execution: `take` the current batch, execute it using `get_vectored`, and send the response. The Reading and Batching stage are connected through a new type of channel called `spsc_fold`. See the long comment in the `handle_pagerequests_pipelined` for details. 
# Changes - Refactor `handle_pagerequests` - separate functions for - reading one protocol message; produces a `BatchedFeMessage` with just one page request in it - batching; tries to merge an incoming `BatchedFeMessage` into an existing `BatchedFeMessage`; returns `None` on success and returns back the incoming message in case merging isn't possible - execution of a batched message - unify the timeline handle acquisition & request span construction; it now happens in the function that reads the protocol message - Implement serial and pipelined model - serial: what we had before any of the batching changes - read one protocol message - execute protocol messages - pipelined: the design described above - optionality for execution of the pipeline: either via concurrent futures vs tokio tasks - Pageserver config - remove batching timeout field - add ability to configure pipelining mode - add ability to limit max batch size for pipelined configurations (required for the rollout, cf https://github.com/neondatabase/cloud/issues/20620 ) - ability to configure execution mode - Tests - remove `batch_timeout` parametrization - rename `test_getpage_merge_smoke` to `test_throughput` - add parametrization to test different max batch sizes and execution modes - rename `test_timer_precision` to `test_latency` - rename the test case file to `test_page_service_batching.py` - better descriptions of what the tests actually do ## On holding the `TimelineHandle` in the pending batch While batching, we hold the `TimelineHandle` in the pending batch. Therefore, the timeline will not finish shutting down while we're batching. This is not a problem in practice because the concurrently ongoing `get_vectored` call will fail quickly with an error indicating that the timeline is shutting down. This results in the Execution stage returning a `QueryError::Shutdown`, which causes the pipeline / entire page service connection to shut down. 
This drops all references to the `Arc<Mutex<Option<Box<BatchedFeMessage>>>>` object, thereby dropping the contained `TimelineHandle`s. - => fixes https://github.com/neondatabase/neon/issues/9850 # Performance Local run of the benchmarks, results in [this empty commit](`1cf5b1463f`) in the PR branch. Key take-aways: * `concurrent-futures` and `tasks` deliver identical `batching_factor` * tail latency impact unknown, cf https://github.com/neondatabase/neon/issues/9837 * `concurrent-futures` has higher throughput than `tasks` in all workloads (=lower `time` metric) * In unbatchable workloads, `concurrent-futures` has 5% higher `CPU-per-throughput` than that of `tasks`, and 15% higher than that of `serial`. * In batchable-32 workload, `concurrent-futures` has 8% lower `CPU-per-throughput` than that of `tasks` (comparison to tput of `serial` is irrelevant) * in unbatchable workloads, mean and tail latencies of `concurrent-futures` are practically identical to `serial`, whereas `tasks` adds 20-30us of overhead Overall, `concurrent-futures` seems like a slightly more attractive choice. # Rollout This change is disabled-by-default. Rollout plan: - https://github.com/neondatabase/cloud/issues/20620 # Refs - epic: https://github.com/neondatabase/neon/issues/9376 - this sub-task: https://github.com/neondatabase/neon/issues/9377 - the abandoned attempt to improve batching timeout resolution: https://github.com/neondatabase/neon/pull/9820 - closes https://github.com/neondatabase/neon/issues/9850 - fixes https://github.com/neondatabase/neon/issues/9835	2024-12-05 13:00:40 +02:00
Matthias van de Meent	547b2d2827	Fix timeout value used in XLogWaitForReplayOf (#9937 ) The previous value assumed usec precision, while the timeout used is in milliseconds, causing replica backends to wait for (potentially) many hours for WAL replay without the expected progress reports in logs. This fixes the issue. Reported-By: Alexander Lakhin <exclusion@gmail.com> ## Problem https://github.com/neondatabase/postgres/pull/279#issuecomment-2507671817 The timeout value was configured with the assumption the indicated value would be microseconds, where it's actually milliseconds. That causes the backend to wait for much longer (2h46m40s) before it emits the "I'm waiting for recovery" message. While we do have wait events configured on this, it's not great to have stuck backends without clear logs, so this fixes the timeout value in all our PostgreSQL branches. ## PG PRs * PG14: https://github.com/neondatabase/postgres/pull/542 * PG15: https://github.com/neondatabase/postgres/pull/543 * PG16: https://github.com/neondatabase/postgres/pull/544 * PG17: https://github.com/neondatabase/postgres/pull/545	2024-12-05 13:00:40 +02:00
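The 2h46m40s figure in the commit above follows from simple unit arithmetic. The sketch below assumes the constant was meant to express 10 seconds in microseconds; that value is an inference from the reported duration, not something stated in the commit message, and `format_duration` is a hypothetical helper.

```python
def format_duration(total_seconds: int) -> str:
    """Render a number of seconds as e.g. '2h46m40s'."""
    hours, rem = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours}h{minutes}m{seconds}s"


# Assumed constant: 10 seconds written in microseconds (inferred, see above).
timeout_value = 10 * 1_000_000

intended_seconds = timeout_value // 1_000_000  # what the author meant (usec)
actual_seconds = timeout_value // 1_000        # what the API does (msec)

print(format_duration(intended_seconds))  # 0h0m10s
print(format_duration(actual_seconds))    # 2h46m40s
```

A factor-of-1000 unit mismatch turns a 10 second reporting interval into 10,000 seconds, which matches the wait described in the commit message.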
Gleb Novikov	93f29a0065	Fixed fast_import pgbin in calling get_pg_version (#9933 ) Was working on https://github.com/neondatabase/cloud/pull/20795 and discovered that fast_import is not working normally.	2024-12-05 13:00:40 +02:00
John Spray	4f36494615	pageserver: download small objects using a smaller timeout (#9938 ) ## Problem It appears that the Azure storage API tends to hang TCP connections more than S3 does. Currently we use a 2 minute timeout for all downloads. This is large because sometimes the objects we download are large. However, waiting 2 minutes when doing something like downloading a manifest on tenant attach is problematic, because when someone is doing a "create tenant, create timeline" workflow, that 2 minutes is long enough for them reasonably to give up creating that timeline. Rather than propagate oversized timeouts further up the stack, we should use a different timeout for objects that we expect to be small. Closes: https://github.com/neondatabase/neon/issues/9836 ## Summary of changes - Add a `small_timeout` configuration attribute to remote storage, defaulting to 30 seconds (still a very generous period to do something like download an index) - Add a DownloadKind parameter to DownloadOpts, so that callers can indicate whether they expect the object to be small or large. - In the azure client, use small timeout for HEAD requests, and for GET requests if DownloadKind::Small is used. - Use DownloadKind::Small for manifests, indices, and heatmap downloads. This PR intentionally does not make the equivalent change to the S3 client, to reduce blast radius in case this has unexpected consequences (we could accomplish the same thing by editing lots of configs, but just skipping the code is simpler for right now)	2024-12-05 13:00:40 +02:00
Alexey Kondratov	0a550f3e7d	feat(compute_ctl): Always set application_name (#9934 ) ## Problem It was not always possible to judge what exactly some `cloud_admin` connections were doing because we didn't consistently set `application_name` everywhere. ## Summary of changes Unify the way we connect to Postgres: 1. Switch to building configs everywhere 2. Always set `application_name` and make naming consistent Follow-up for #9919 Part of neondatabase/cloud#20948	2024-12-05 13:00:40 +02:00
Erik Grinaker	4bb9554e4a	safekeeper: use jemalloc (#9780 ) ## Problem To add Safekeeper heap profiling in #9778, we need to switch to an allocator that supports it. Pageserver and proxy already use jemalloc. Touches #9534. ## Summary of changes Use jemalloc in Safekeeper.	2024-12-05 13:00:40 +02:00
John Spray	008616cfe6	storage controller: use proper ScheduleContext when evacuating a node (#9908 ) ## Problem When picking locations for a shard, we should use a ScheduleContext that includes all the other shards in the tenant, so that we apply proper anti-affinity between shards. If we don't do this, then it can lead to unstable scheduling, where we place a shard somewhere that the optimizer will then immediately move it away from. We didn't always do this, because it was a bit awkward to accumulate the context for a tenant rather than just walking tenants. This was a TODO in `handle_node_availability_transition`: ``` // TODO: populate a ScheduleContext including all shards in the same tenant_id (only matters // for tenants without secondary locations: if they have a secondary location, then this // schedule() call is just promoting an existing secondary) ``` This is a precursor to https://github.com/neondatabase/neon/issues/8264, where the current imperfect scheduling during node evacuation hampers testing. ## Summary of changes - Add an iterator type that yields each shard along with a schedulecontext that includes all the other shards from the same tenant - Use the iterator to replace hand-crafted logic in optimize_all_plan (functionally identical) - Use the iterator in `handle_node_availability_transition` to apply proper anti-affinity during node evacuation.	2024-12-05 13:00:40 +02:00
Conrad Ludgate	e61ec94fbc	chore(proxy): vendor a subset of rust-postgres (#9930 ) Our rust-postgres fork is getting messy. Mostly because proxy wants more control over the raw protocol than tokio-postgres provides. As such, it's diverging more and more. Storage and compute also make use of rust-postgres, but in more normal usage, thus they don't need our crazy changes. Idea: * proxy maintains their subset * other teams use a minimal patch set against upstream rust-postgres Reviewing this code will be difficult. To implement it, I 1. Copied tokio-postgres, postgres-protocol and postgres-types from `00940fcdb5` 2. Updated their package names with the `2` suffix to make them compile in the workspace. 3. Updated proxy to use those packages 4. Copied in the code from tokio-postgres-rustls 0.13 (with some patches applied https://github.com/jbg/tokio-postgres-rustls/pull/32 https://github.com/jbg/tokio-postgres-rustls/pull/33) 5. Removed as much dead code as I could find in the vendored libraries 6. Updated the tokio-postgres-rustls code to use our existing channel binding implementation	2024-12-05 13:00:40 +02:00
Erik Grinaker	e5152551ad	test_runner/performance: add logical message ingest benchmark (#9749 ) Adds a benchmark for logical message WAL ingestion throughput end-to-end. Logical messages are essentially noops, and thus ignored by the Pageserver. Example results from my MacBook, with fsync enabled: ``` postgres_ingest: 14.445 s safekeeper_ingest: 29.948 s pageserver_ingest: 30.013 s pageserver_recover_ingest: 8.633 s wal_written: 10,340 MB message_count: 1310720 messages postgres_throughput: 715 MB/s safekeeper_throughput: 345 MB/s pageserver_throughput: 344 MB/s pageserver_recover_throughput: 1197 MB/s ``` See https://github.com/neondatabase/neon/issues/9642#issuecomment-2475995205 for running analysis. Touches #9642.	2024-12-05 13:00:40 +02:00
Alexey Kondratov	b0822a5499	fix(compute_ctl): Allow usage of DB names with whitespaces (#9919 ) ## Problem We used `set_path()` to replace the database name in the connection string. It automatically does url-safe encoding if the path is not already encoded, but it does it as per the URL standard, which assumes that tabs can be safely removed from the path without changing the meaning of the URL. See, e.g., https://url.spec.whatwg.org/#concept-basic-url-parser. It also breaks for DBs with properly %-encoded names, like with `%20`, as they are kept intact, but actually should be escaped. Yet, this is not true for Postgres, where it's completely valid to have trailing tabs in the database name. I think this is the PR that caused this regression https://github.com/neondatabase/neon/pull/9717, as it switched from `postgres::config::Config` back to `set_path()`. This was fixed a while ago already [1], btw, I just haven't added a test to catch this regression back then :( ## Summary of changes This commit changes the code back to use `postgres/tokio_postgres::Config` everywhere. While on it, also do some changes around, as I had to touch this code: 1. Bump some logging from `debug` to `info` in the spec apply path. We do not use `debug` in prod, and it was tricky to understand what was going on with this bug in prod. 2. Refactor configuration concurrency calculation code so it was reusable. Yet, still keep `1` in the case of reconfiguration. The database can be actively used at this moment, so we cannot guarantee that there will be enough spare connection slots, and the underlying code won't handle connection errors properly. 3. Simplify the installed extensions code. It was spawning a blocking task inside async function, which doesn't make much sense. Instead, just have a main sync function and call it with `spawn_blocking` in the API code -- the only place we need it to be async. 4. Add regression python test to cover this and related problems in the future. 
Also, add more extensive testing of schema dump and DBs and roles listing API. [1]: `4d1e48f3b9` [2]: https://www.postgresql.org/message-id/flat/20151023003445.931.91267%40wrigleys.postgresql.org Resolves neondatabase/cloud#20869	2024-12-05 13:00:40 +02:00
Alexander Bayandin	1fb6ab59e8	test_runner: rerun all failed tests (#9917 ) ## Problem Currently, we rerun only known flaky tests. This approach was chosen to reduce the number of tests that go unnoticed (by forcing people to take a look at failed tests and rerun the job manually), but it has some drawbacks: - In PRs, people tend to push new changes without checking failed tests (that's ok) - On main, tests are just restarted without checking (understandable) - Parametrised tests become flaky one by one, i.e. if `test[1]` is flaky, `test[2]` is not marked as flaky automatically (which may or may not be the case). I suggest rerunning all failed tests to increase the stability of GitHub jobs and using the Grafana Dashboard with flaky tests for deeper analysis. ## Summary of changes - Rerun all failed tests twice at max	2024-12-05 13:00:40 +02:00
Vlad Lazar	e16439400d	pageserver: return correct LSN for interpreted proto keep alive responses (#9928 ) ## Problem For the interpreted proto the pageserver is not returning the correct LSN in replies to keep alive requests. This is because the interpreted protocol arm was not updating `last_rec_lsn`. ## Summary of changes * Return correct LSN in keep-alive responses * Fix shard field in wal sender traces	2024-12-05 13:00:40 +02:00
Arpad Müller	e401f66698	Update rust to 1.83.0, also update cargo adjacent tools (#9926 ) We keep the practice of keeping the compiler up to date, pointing to the latest release. This is done by many other projects in the Rust ecosystem as well. [Release notes](https://releases.rs/docs/1.83.0/). Also update `cargo-hakari`, `cargo-deny`, `cargo-hack` and `cargo-nextest` to their latest versions. Prior update was in #9445.	2024-12-05 13:00:40 +02:00
Erik Grinaker	2fa461b668	Makefile: build pg_visibility (#9922 ) Build the `pg_visibility` extension for use with `neon_local`. This is useful to inspect the visibility map for debugging. Touches #9914.	2024-12-05 13:00:40 +02:00
Vlad Lazar	03d90bc0b3	remote_storage/abs: count 404 and 304 for get as ok for metrics (#9912 ) ## Problem We currently see elevated levels of errors for GetBlob requests. This is because 404 and 304 are counted as errors for metric reporting. ## Summary of Changes Bring the implementation in line with the S3 client and treat 404 and 304 responses as ok for metric purposes. Related: https://github.com/neondatabase/cloud/issues/20666	2024-12-05 13:00:40 +02:00
Ivan Efremov	268bc890ea	proxy: spawn cancellation checks in the background (#9918 ) ## Problem For cancellation, a connection is open during all the cancel checks. ## Summary of changes Spawn cancellation checks in the background, and close connection immediately. Use task_tracker for cancellation checks.	2024-12-05 13:00:40 +02:00
Ivan Efremov	ffc9c33eb2	proxy: Present new auth backend cplane_proxy_v1 (#10012 ) Implement a new auth backend based on the current Neon backend to switch to the new Proxy V1 cplane API. Implements [#21048](https://github.com/neondatabase/cloud/issues/21048)	2024-12-05 05:30:38 +00:00
Yuchen Liang	ed2d892113	pageserver: fix buffered-writer on macos build (#10019 ) ## Problem In https://github.com/neondatabase/neon/pull/9693, we forgot to check macos build. The [CI run](https://github.com/neondatabase/neon/actions/runs/12164541897/job/33926455468) on main showed that macos build failed with unused variables and dead code. ## Summary of changes - add `allow(dead_code)` and `allow(unused_variables)` to the relevant code that is not used on macos. Signed-off-by: Yuchen Liang <yuchen@neon.tech>	2024-12-05 02:16:09 +00:00
Conrad Ludgate	131585eb6b	chore: update rust-postgres (#10002 ) Like #9931 but without rebasing upstream just yet, to try and minimise the differences. Removes all proxy-specific commits from the rust-postgres fork, now that proxy no longer depends on them. Merging upstream changes to come later.	2024-12-04 21:07:44 +00:00
Conrad Ludgate	0bab7e3086	chore: update clap (#10009 ) This updates clap to use a new version of anstream	2024-12-04 17:42:17 +00:00
Yuchen Liang	e6cd5050fc	pageserver: make `BufferedWriter` do double-buffering (#9693 ) Closes #9387. ## Problem `BufferedWriter` cannot proceed while the owned buffer is flushing to disk. We want to implement double buffering so that the flush can happen in the background. See #9387. ## Summary of changes - Maintain two owned buffers in `BufferedWriter`. - The writer is in charge of copying the data into an owned, aligned buffer and, once it is full, submitting it to the flush task. - The flush background task is in charge of flushing the owned buffer to disk, and returning the buffer to the writer for reuse. - The writer and the flush background task communicate through a bi-directional channel. For in-memory layer, we also need to be able to read from the buffered writer in `get_values_reconstruct_data`. To handle this case, we did the following - Replace `VirtualFile::write_all` with `VirtualFile::write_all_at`, and use `Arc` to share it between writer and background task. - leverage `IoBufferMut::freeze` to get a cheaply clonable `IoBuffer`, one clone will be submitted to the channel, the other clone will be saved within the writer to serve reads. When we want to reuse the buffer, we can invoke `IoBuffer::into_mut`, which gives us back the mutable aligned buffer. - InMemoryLayer reads are now aware of the maybe_flushed part of the buffer. Caveat - We removed the owned version of write, because this interface does not work well with buffer alignment. The result is that without direct IO enabled, [`download_object`](`a439d57050/pageserver/src/tenant/remote_timeline_client/download.rs (L243)`) does one more memcpy than before this PR due to the switch to use `_borrowed` version of the write. - "Bypass aligned part of write" could be implemented later to avoid large amount of memcpy. Testing - use a oneshot channel based control mechanism to make flush behavior deterministic in test. 
- test reading from `EphemeralFile` when the last submitted buffer is not flushed, in-progress, and done flushing to disk. ## Performance We see performance improvement for small values, and regression on big values, likely due to being CPU bound + disk write latency. [Results](https://www.notion.so/neondatabase/Benchmarking-New-BufferedWriter-11-20-2024-143f189e0047805ba99acda89f984d51?pvs=4) ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Signed-off-by: Yuchen Liang <yuchen@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-12-04 16:54:56 +00:00
John Spray	60c0d19f57	tests: make storcon scale test AZ-aware (#9952 ) ## Problem We have a scale test for the storage controller which also acts as a good stress test for scheduling stability. However, it created nodes with no AZs set. ## Summary of changes - Bump node count to 6 and set AZs on them. This is a precursor to other AZ-related PRs, to make sure any new code that's landed is getting scale tested in an AZ-aware environment.	2024-12-04 15:04:04 +00:00
a-masterov	dec2e2fb29	Create a branch for compute release (#9637 ) ## Problem We practice a manual release flow for the compute module. This will allow automation of the compute release process. ## Summary of changes The workflow was modified to make a compute release automatically on the branch release-compute. ## Checklist before requesting a review - [x] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist	2024-12-04 13:10:00 +00:00
Erik Grinaker	699a213c5d	Display reqwest error source (#10004 ) ## Problem Reqwest errors don't include details about the inner source error. This means that we get opaque errors like: ``` receive body: error sending request for url (http://localhost:9898/v1/location_config) ``` Instead of the more helpful: ``` receive body: error sending request for url (http://localhost:9898/v1/location_config): operation timed out ``` Touches #9801. ## Summary of changes Include the source error for `reqwest::Error` wherever it's displayed.	2024-12-04 13:05:53 +00:00
Alexey Kondratov	9a4157dadb	feat(compute): Set default application_name for pgbouncer connections (#9973 ) ## Problem When client specifies `application_name`, pgbouncer propagates it to Postgres. Yet, if client doesn't do it, we have a hard time figuring out who opens a lot of Postgres connections (including the `cloud_admin` ones). See this investigation as an example: https://neondb.slack.com/archives/C0836R0RZ0D ## Summary of changes I haven't found this documented, but it looks like pgbouncer accepts standard Postgres connstring parameters in the connstring in the `[databases]` section, so put the default `application_name=pgbouncer` there. That way, we will always see who opens Postgres connections. I did tests, and if client specifies an `application_name`, pgbouncer overrides this default, so it only works if it's not specified or set to blank `&application_name=` in the connection string. This is the last place we could potentially open some Postgres connections without `application_name`. Everything else should be either of two: 1. Direct client connections without `application_name`, but these should be strictly non-`cloud_admin` ones 2. Some ad-hoc internal connections, so if we see spikes of unidentified `cloud_admin` connections, we will need to investigate it again. Fixes neondatabase/cloud#20948	2024-12-04 13:05:31 +00:00
Conrad Ludgate	bd52822e14	feat(proxy): add option to forward startup params (#9979 ) (stacked on #9990 and #9995) Partially fixes #1287 with a custom option field to enable the fixed behaviour. This allows us to gradually roll out the fix without silently changing the observed behaviour for our customers. related to https://github.com/neondatabase/cloud/issues/15284	2024-12-04 12:58:35 +00:00
Folke Behrens	dcd016bbfc	Assign /libs/proxy/ to proxy team (#10003 )	2024-12-04 12:58:31 +00:00
Erik Grinaker	7b18e33997	pageserver: return proper status code for heatmap_upload errors (#9991 ) ## Problem During deploys, we see a lot of 500 errors due to heatmap uploads for inactive tenants. These should be 503s instead. Resolves #9574. ## Summary of changes Make the secondary tenant scheduler use `ApiError` rather than `anyhow::Error`, to propagate the tenant error and convert it to an appropriate status code.	2024-12-04 12:53:52 +00:00
Peter Bendel	9d75218ba7	fix parsing human time output like "50m37s" (#10001 ) ## Problem In ingest_benchmark.yml workflow we use pgcopydb tool to migrate project. pgcopydb logs human time. Our parsing of the human time doesn't work for times like "50m37s". [Example workflow](https://github.com/neondatabase/neon/actions/runs/12145539948/job/33867418065#step:10:479) contains "57m45s" but we [reported](https://github.com/neondatabase/neon/actions/runs/12145539948/job/33867418065#step:10:500) only the seconds part: 45.000 s ## Summary of changes add a regex pattern for Minute/Second combination	2024-12-04 11:37:24 +00:00
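The kind of parsing fix described in the commit above can be sketched as below. This is an illustrative parser, not the workflow's actual regex: the function name, the optional hours component, and the choice to reject anything else (including millisecond values) are assumptions.

```python
import re

# Hypothetical pattern: optional hours, minutes and seconds, in that order.
_HUMAN_TIME = re.compile(
    r"^(?:(?P<hours>\d+)h)?"
    r"(?:(?P<minutes>\d+)m)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)s)?$"
)


def parse_human_time(text: str) -> float:
    """Convert a duration such as '57m45s' into seconds."""
    match = _HUMAN_TIME.match(text.strip())
    # The pattern also matches the empty string, so require at least one part.
    if match is None or not any(match.groupdict().values()):
        raise ValueError(f"unrecognised duration: {text!r}")
    hours = int(match.group("hours") or 0)
    minutes = int(match.group("minutes") or 0)
    seconds = float(match.group("seconds") or 0)
    return hours * 3600 + minutes * 60 + seconds


print(parse_human_time("57m45s"))  # 3465.0, not just the 45 s seconds part
```

Anchoring the pattern at both ends is the relevant design choice here: an unanchored seconds-only pattern would happily match the trailing `45s` of `57m45s`, which is the failure mode the commit describes.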
Peter Bendel	1b3558df7a	optimize parms for ingest bench (#9999 ) ## Problem we tried different parallelism settings for ingest bench ## Summary of changes the following settings seem optimal after merging - SK side Wal filtering - batched getpages Settings: - effective_io_concurrency 100 - concurrency limit 200 (different from Prod!) - jobs 4, maintenance workers 7 - 10 GB chunk size	2024-12-04 11:07:22 +00:00
Vlad Lazar	68205c48ed	storcon: return an error for drain attempts while paused (#9997 ) ## Problem We currently allow drain operations to proceed while the node policy is paused. ## Summary of changes Return a precondition failed error in such cases. The orchestrator is updated in https://github.com/neondatabase/infra/pull/2544 to skip drain and fills if the pageserver is paused. Closes: https://github.com/neondatabase/neon/issues/9907	2024-12-04 09:25:29 +00:00
Christian Schwarz	8d93d02c2f	page_service: enable batching in Rust & Python Tests + Python benchmarks (#9993 ) This is the first step towards batching rollout. Refs - rollout plan: https://github.com/neondatabase/cloud/issues/20620 - task https://github.com/neondatabase/neon/issues/9377 - uber-epic: https://github.com/neondatabase/neon/issues/9376	2024-12-04 00:07:49 +00:00
Alexander Bayandin	023821a80c	test_page_service_batching: fix non-numeric metrics (#9998 ) ## Problem ``` 2024-12-03T15:42:46.5978335Z + poetry run python /__w/neon/neon/scripts/ingest_perf_test_result.py --ingest /__w/neon/neon/test_runner/perf-report-local 2024-12-03T15:42:49.5325077Z Traceback (most recent call last): 2024-12-03T15:42:49.5325603Z File "/__w/neon/neon/scripts/ingest_perf_test_result.py", line 165, in <module> 2024-12-03T15:42:49.5326029Z main() 2024-12-03T15:42:49.5326316Z File "/__w/neon/neon/scripts/ingest_perf_test_result.py", line 155, in main 2024-12-03T15:42:49.5326739Z ingested = ingest_perf_test_result(cur, item, recorded_at_timestamp) 2024-12-03T15:42:49.5327488Z ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 2024-12-03T15:42:49.5327914Z File "/__w/neon/neon/scripts/ingest_perf_test_result.py", line 99, in ingest_perf_test_result 2024-12-03T15:42:49.5328321Z psycopg2.extras.execute_values( 2024-12-03T15:42:49.5328940Z File "/github/home/.cache/pypoetry/virtualenvs/non-package-mode-_pxWMzVK-py3.11/lib/python3.11/site-packages/psycopg2/extras.py", line 1299, in execute_values 2024-12-03T15:42:49.5335618Z cur.execute(b''.join(parts)) 2024-12-03T15:42:49.5335967Z psycopg2.errors.InvalidTextRepresentation: invalid input syntax for type numeric: "concurrent-futures" 2024-12-03T15:42:49.5336287Z LINE 57: 'concurrent-futures', 2024-12-03T15:42:49.5336462Z ^ ``` ## Summary of changes - `test_page_service_batching`: save non-numeric params as `labels` - Add a runtime check that `metric_value` is NUMERIC	2024-12-03 22:46:18 +00:00
Christian Schwarz	944c1adc4c	tests & benchmarks: unify the way we customize the default tenant config (#9992 ) Before this PR, some override callbacks used `.default()`, others used `.setdefault()`. As of this PR, all callbacks use `.setdefault()` which I think is least prone to failure. Aligning on a single way will set the right example for future tests that need such customization. The `test_pageserver_getpage_throttle.py` technically is a change in behavior: before, it replaced the `tenant_config` field, now it just configures the throttle. This is what I believe is intended anyway.	2024-12-03 22:07:03 +00:00
Arpad Müller	ca85f364ba	Support tenant manifests in the scrubber (#9942 ) Support tenant manifests in the storage scrubber: * list the manifests, order them by generation * delete all manifests except for the two most recent generations * for the latest manifest: try parsing it. I've tested this patch by running it against a staging bucket and it successfully deleted stuff (and avoided deleting the latest two generations). In follow-up work, we might want to also check some invariants of the manifest, as mentioned in #8088. Part of #9386 Part of #8088 --------- Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-12-03 20:39:10 +00:00
Conrad Ludgate	9ef0662a42	chore(proxy): enforce single host+port (#9995 ) proxy doesn't ever provide multiple hosts/ports, so this code adds a lot of complexity of error handling for no good reason. (stacked on #9990)	2024-12-03 20:00:14 +00:00
Alexey Immoreev	3baef0bca3	Improvement: add console redirect timeout warning (#9985 ) ## Problem There is currently no indication that the session will be cancelled after 2 minutes. ## Summary of changes The timeout is now logged for the user.	2024-12-03 18:59:44 +00:00
Erik Grinaker	f312c6571f	pageserver: respond to multiple shutdown signals (#9982 ) ## Problem The Pageserver signal handler would only respond to a single signal and initiate shutdown. Subsequent signals were ignored. This meant that a `SIGQUIT` sent after a `SIGTERM` had no effect (e.g. in the case of a slow or stalled shutdown). The `test_runner` uses this to force shutdown if graceful shutdown is slow. Touches #9740. ## Summary of changes Keep responding to signals after the initial shutdown signal has been received. Arguably, the `test_runner` should also use `SIGKILL` rather than `SIGQUIT` in this case, but it seems reasonable to respond to `SIGQUIT` regardless.	2024-12-03 18:47:17 +00:00
Conrad Ludgate	27a42d0f96	chore(proxy): remove postgres config parser and md5 support (#9990 ) Keeping the `mock` postgres cplane adaptor using "stock" tokio-postgres allows us to remove a lot of dead weight from our actual postgres connection logic.	2024-12-03 18:39:23 +00:00
John Spray	b04ab468ee	pageserver: more detailed logs when calling re-attach (#9996 ) ## Problem We saw a peculiar case where a pageserver apparently got a 0-tenant response to `/re-attach` but we couldn't see the request landing on a storage controller. It was hard to confirm retrospectively that the pageserver was configured properly at the moment it sent the request. ## Summary of changes - Log the URL to which we are sending the request - Log the NodeId and metadata that we sent	2024-12-03 18:36:37 +00:00
John Spray	dcb629532b	pageserver: only store SLRUs & aux files on shard zero (#9786 ) ## Problem Since https://github.com/neondatabase/neon/pull/9423 the non-zero shards no longer need SLRU content in order to do GC. This data is now redundant on shards >0. One release cycle after merging that PR, we may merge this one, which also stops writing those pages to shards > 0, reaping the efficiency benefit. Closes: https://github.com/neondatabase/neon/issues/7512 Closes: https://github.com/neondatabase/neon/issues/9641 ## Summary of changes - Avoid storing SLRUs on non-zero shards - Bonus: avoid storing aux files on non-zero shards	2024-12-03 17:22:49 +00:00
John Spray	71d004289c	storcon: in shard splits, inherit parent's AZ (#9946 ) ## Problem Sharded tenants should be run in a single AZ for best performance, so that computes have AZ-local latency to all the shards. Part of https://github.com/neondatabase/neon/issues/8264 ## Summary of changes - When we split a tenant, instead of updating each shard's preferred AZ to wherever it is scheduled, propagate the preferred AZ from the parent. - Drop the check in `test_shard_preferred_azs` that asserts shards end up in their preferred AZ: this will not be true again until the optimize_attachment logic is updated to make this so. The existing check wasn't testing anything about scheduling, it was just asserting that we set preferred AZ in a way that matches the way things happen to be scheduled at time of split.	2024-12-03 16:55:00 +00:00
Christian Schwarz	4d422b937c	pageserver: only throttle pagestream requests & bring back throttling deduction for smgr latency metrics (#9962 ) ## Problem In the batching PR - https://github.com/neondatabase/neon/pull/9870 I stopped deducting the time-spent-in-throttle from latency metrics, i.e., - smgr latency metrics (`SmgrOpTimer`) - basebackup latency (+scan latency, which I think is part of basebackup). The reason for stopping the deduction was that with the introduction of batching, the trick with tracking time-spent-in-throttle inside RequestContext and swap-replacing it from the `impl Drop for SmgrOpTimer` no longer worked with >1 requests in a batch. However, deducting time-spent-in-throttle is desirable because our internal latency SLO definition does not account for throttling. ## Summary of changes - Redefine throttling to be a page_service pagestream request throttle instead of a throttle for repository `Key` reads through `Timeline::get` / `Timeline::get_vectored`. - This means reads done by `basebackup` are no longer subject to any throttle. - The throttle applies after batching, before handling of the request. - Drive-by fix: make throttle sensitive to cancellation. - Rename metric label `kind` from `timeline_get` to `pagestream` to reflect the new scope of throttling. To avoid config format breakage, we leave the config field named `timeline_get_throttle` and ignore the `task_kinds` field. This will be cleaned up in a future PR. ## Trade-Offs Ideally, we would apply the throttle before reading a request off the connection, so that we queue the minimal amount of work inside the process. However, that's not possible because we need to do shard routing. The redefinition of the throttle to limit pagestream request rate instead of repository `Key` rate comes with several downsides: - We're no longer able to use the throttle mechanism for other tasks, e.g. image layer creation. However, in practice, we never used that capability anyways. 
- We no longer throttle basebackup.	2024-12-03 15:25:58 +00:00
Erik Grinaker	bbe4dfa991	test_runner: use immediate shutdown in `test_sharded_ingest` (#9984 ) ## Problem `test_sharded_ingest` ingests a lot of data, which can cause shutdown to be slow e.g. due to local "S3 uploads" or compactions. This can cause test flakes during teardown. Resolves #9740. ## Summary of changes Perform an immediate shutdown of the cluster.	2024-12-03 14:33:31 +00:00
Erik Grinaker	dcb24ce170	safekeeper,pageserver: add heap profiling (#9778 ) ## Problem We don't have good observability for memory usage. This would be useful e.g. to debug OOM incidents or optimize performance or resource usage. We would also like to use continuous profiling with e.g. [Grafana Cloud Profiles](https://grafana.com/products/cloud/profiles-for-continuous-profiling/) (see https://github.com/neondatabase/cloud/issues/14888). This PR is intended as a proof of concept, to try it out in staging and drive further discussions about profiling more broadly. Touches https://github.com/neondatabase/neon/issues/9534. Touches https://github.com/neondatabase/cloud/issues/14888. Depends on #9779. Depends on #9780. ## Summary of changes Adds an HTTP route `/profile/heap` that takes a heap profile and returns it. Query parameters: * `format`: output format (`jemalloc` or `pprof`; default `pprof`). Unlike CPU profiles (see #9764), heap profiles are not symbolized and require the original binary to translate addresses to function names. To make this work with Grafana, we'll probably have to symbolize the process server-side -- this is left as future work, as are other output formats like SVG. Heap profiles don't work on macOS due to limitations in jemalloc.	2024-12-03 11:35:59 +00:00
a-masterov	a2a942f93c	Add support for the extensions test for Postgres v17 (#9748 ) ## Problem The extensions for Postgres v17 are ready but we do not test the extensions shipped with v17 ## Summary of changes Build the test image based on Postgres v17. Run the tests for v17. --------- Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech>	2024-12-03 11:25:29 +00:00
Christian Schwarz	cb10be710d	page_service: batching observability & include throttled time in smgr metrics (#9870 ) This PR - fixes smgr metrics https://github.com/neondatabase/neon/issues/9925 - adds an additional startup log line logging the current batching config - adds a histogram of batch sizes global and per-tenant - adds a metric exposing the current batching config The issue described in #9925 is that before this PR, request latency was only observed after batching. This means that smgr latency metrics (most importantly getpage latency) don't account for - `wait_lsn` time - time spent waiting for batch to fill up / the executor stage to pick up the batch. The fix is to use a per-request batching timer, like we did before the initial batching PR. We funnel those timers through the entire request lifecycle. I noticed that even before the initial batching changes, we weren't accounting for the time spent writing & flushing the response to the wire. This PR drive-by fixes that deficiency by dropping the timers at the very end of processing the batch, i.e., after the `pgb.flush()` call. I was unable to maintain the behavior that we deduct time-spent-in-throttle from various latency metrics. The reason is that we're using a single counter in `RequestContext` to track micros spent in throttle. But there are N metrics timers in the batch, one per request. As a consequence, the practice of consuming the counter in the drop handler of each timer no longer works because all but the first timer will encounter error `close() called on closed state`. A failed attempt to maintain the current behavior can be found in https://github.com/neondatabase/neon/pull/9951. So, this PR removes the deduction behavior from all metrics. 
I started a discussion on Slack about the implications this has for our internal SLO calculation: https://neondb.slack.com/archives/C033RQ5SPDH/p1732910861704029 # Refs - fixes https://github.com/neondatabase/neon/issues/9925 - sub-issue https://github.com/neondatabase/neon/issues/9377 - epic: https://github.com/neondatabase/neon/issues/9376	2024-12-03 11:03:23 +00:00
Christian Schwarz	15d01b257a	storcon_cli tenant-describe: include tenant-wide information in output (#9899 ) Before this PR, the storcon_cli didn't have a way to show the tenant-wide information of the TenantDescribeResponse. Sadly, the `Serialize` impl for the tenant config doesn't skip on `None`, so, the output becomes a bit bloated. Maybe we can use `skip_serializing_if(Option::is_none)` in the future. => https://github.com/neondatabase/neon/issues/9983	2024-12-03 10:55:13 +00:00
John Spray	aaee713e53	storcon: use proper schedule context during node delete (#9958 ) ## Problem I was touching `test_storage_controller_node_deletion` because for AZ scheduling work I was adding a change to the storage controller (kick secondaries during optimisation) that made a FIXME in this test defunct. While looking at it I also realized that we can easily fix the way node deletion currently doesn't use a proper ScheduleContext, using the iterator type recently added for that purpose. ## Summary of changes - A testing-only behavior in storage controller where if a secondary location isn't yet ready during optimisation, it will be actively polled. - Remove workaround in `test_storage_controller_node_deletion` that previously was needed because optimisation would get stuck on cold secondaries. - Update node deletion code to use a `TenantShardContextIterator` and thereby a proper ScheduleContext	2024-12-03 08:59:38 +00:00
Alexey Kondratov	2e9207fdf3	fix(testing): Use 1 MB shared_buffers even with LFC (#9969 ) ## Problem After enabling LFC in tests and lowering `shared_buffers` we started having more problems with `test_pg_regress`. ## Summary of changes Set `shared_buffers` to 1MB to both exercise getPage requests/LFC, and still have enough room for Postgres to operate. Anything smaller might not be enough for Postgres under load, and can cause errors like 'no unpinned buffers available'. See Konstantin's comment [1] as well. Fixes #9956 [1]: https://github.com/neondatabase/neon/issues/9956#issuecomment-2511608097	2024-12-02 18:46:06 +00:00
Tristan Partin	d8ebd33fe6	Stop changing the value of neon.extension_server_port at runtime (#9972 ) On reconfigure, we no longer passed a port for the extension server which caused us to not write out the neon.extension_server_port line. Thus, Postgres thought we were setting the port to the default value of 0. PGC_POSTMASTER GUCs cannot be set at runtime, which causes the following log messages: > LOG: parameter "neon.extension_server_port" cannot be changed without restarting the server > LOG: configuration file "/var/db/postgres/compute/pgdata/postgresql.conf" contains errors; unaffected changes were applied Fixes: https://github.com/neondatabase/neon/issues/9945 Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-12-02 18:06:19 +00:00
Conrad Ludgate	2dc238e5b3	feat(proxy): emit JWT auth method and JWT issuer in parquet logs (#9971 ) Fix the HTTP AuthMethod to accommodate the JWT authorization method. Introduces the JWT issuer as an additional field in the parquet logs	2024-12-02 17:54:32 +00:00
Folke Behrens	243bca1c49	Bump OTel, tracing, reqwest crates (#9970 )	2024-12-02 17:24:48 +00:00
Arseny Sher	fa909c27fc	Update consensus protocol spec (#9607 ) The spec was written for the buggy protocol which we had before the one more similar to Raft was implemented. Update the spec with what we currently have. ref https://github.com/neondatabase/neon/issues/8699	2024-12-02 16:10:44 +00:00
Folke Behrens	1b60571636	proxy: Create Elasticache credentials provider lazily (#9967 ) ## Problem The credentials provider tries to connect to AWS STS even when we use plain Redis connections. ## Summary of changes * Construct the CredentialsProvider only when needed ("irsa").	2024-12-02 15:38:12 +00:00
Alexander Bayandin	c18716bb3f	CI(replication-tests): fix notifications about replication-tests failures (#9950 ) ## Problem `if: ${{ github.event.schedule }}` gets skipped if a previous step has failed, but we want to run the step for both `success` and `failure` ## Summary of changes - Add `!cancelled()` to notification step if-condition, to skip only cancelled jobs	2024-12-02 12:46:07 +00:00
Conrad Ludgate	cd1d2d1996	fix(proxy): forward notifications from authentication (#9948 ) Fixes https://github.com/neondatabase/cloud/issues/20973. This refactors `connect_raw` in order to return direct access to the delayed notices. I cannot find a way to test this with psycopg2 unfortunately, although testing it with psql does return the expected results.	2024-12-02 12:29:57 +00:00
John Spray	bd09369198	storcon: add metric for AZ scheduling violations (#9949 ) ## Problem We can't easily tell how far the state of shards is from their AZ preferences. This can be a cause of performance issues, so it's important for diagnosability that we can tell easily if there are significant numbers of shards that aren't running in their preferred AZ. Related: https://github.com/neondatabase/cloud/issues/15413 ## Summary of changes - In reconcile_all, count shards that are scheduled into the wrong AZ (if they have a preference), and publish it as a prometheus gauge. - Also calculate a statistic for how many shards wanted to reconcile but couldn't. This is clearly a lazy calculation: reconcile all only runs periodically. But that's okay: shards in the wrong AZ is something that only matters if it stays that way for some period of time.	2024-12-02 11:50:22 +00:00
Erik Grinaker	5330122049	test_runner: improve `wait_until` (#9936 ) Improves `wait_until` by: * Use `timeout` instead of `iterations`. This allows changing the timeout/interval parameters independently. * Make `timeout` and `interval` optional (default 20s and 0.5s). Most callers don't care. * Only output status every 1s by default, and add optional `status_interval` parameter. * Remove `show_intermediate_error`, this was always emitted anyway. Most callers have been updated to use the defaults, except where they had good reason otherwise.	2024-12-02 10:26:15 +00:00
Anastasia Lubennikova	45658ccccb	Update pgvector to 0.8.0 (#9733 )	2024-12-02 10:10:51 +00:00
John Spray	14853a3284	storcon: don't take any Service locks in /status and /ready (#9944 ) ## Problem We saw unexpected container terminations when running in k8s with small CPU resource requests. The /status and /ready handlers called `maybe_forward`, which always takes the lock on Service::inner. If there is a lot of writer lock contention, and the container is starved of CPU, this increases the likelihood that we will get killed by the kubelet. It isn't certain that this was a cause of issues, but it is a potential source that we can eliminate. ## Summary of changes - Revise logic to return immediately if the URL is in the non-forwarded list, rather than calling maybe_forward	2024-12-01 18:09:58 +00:00
Konstantin Knizhnik	aad809b048	Fix issues with prefetch ring buffer resize (#9847 ) ## Problem See https://neondb.slack.com/archives/C04DGM6SMTM/p1732110190129479 We observe the following error in the logs ``` [XX000] ERROR: [NEON_SMGR] [shard 3] Incorrect prefetch read: status=1 response=0x7fafef335138 my=128 receive=128 ``` most likely caused by changing `neon.readahead_buffer_size` ## Summary of changes 1. Copy shard state 2. Do not use prefetch_set_unused in readahead_buffer_resize 3. Change prefetch buffer overflow criteria --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-12-01 15:47:28 +00:00
Alexander Bayandin	fae8e7ba76	Compute image: prepare Postgres v14-v16 for Debian 12 (#9954 ) ## Problem Current compute images for Postgres 14-16 don't build on Debian 12 because of issues with extensions. This PR fixes that, but for the current setup, it is mostly a no-op change. ## Summary of changes - Use `/bin/bash -euo pipefail` as SHELL to fail earlier - Fix `plv8` build: backport a trivial patch for v8 - Fix `postgis` build: depend `sfgal` version on Debian version instead of Postgres version Tested in: https://github.com/neondatabase/neon/pull/9849	2024-12-01 13:04:37 +00:00
Konstantin Knizhnik	97a9abd181	Add GUC controlling whether to pause recovery if some critical GUCs at replica have smaller value than on primary (#9057 ) ## Problem See https://github.com/neondatabase/neon/issues/9023 ## Summary of changes Add GUC `recovery_pause_on_misconfig` allowing not to pause in case of replica and primary configuration mismatch See https://github.com/neondatabase/postgres/pull/501 See https://github.com/neondatabase/postgres/pull/502 See https://github.com/neondatabase/postgres/pull/503 See https://github.com/neondatabase/postgres/pull/504 ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2024-12-01 12:23:10 +00:00
Folke Behrens	4abc8e5282	Merge the consumption metric pushes (#9939 ) #8564 ## Problem The main and backup consumption metric pushes are completely independent, resulting in different event time windows and different idempotency keys. ## Summary of changes * Merge the push tasks, but keep chunks the same size.	2024-11-30 10:11:37 +00:00
Christian Schwarz	aa4ec11af9	page_service: rewrite batching to work without a timeout (#9851 ) # Problem The timeout-based batching adds latency to unbatchable workloads. We can choose a short batching timeout (e.g. 10us) but that requires high-resolution timers, which tokio doesn't have. I thoroughly explored options to use OS timers (see [this](https://github.com/neondatabase/neon/pull/9822) abandoned PR). In short, it's not an attractive option because any timer implementation adds non-trivial overheads. # Solution The insight is that, in the steady state of a batchable workload, the time we spend in `get_vectored` will be hundreds of microseconds anyway. If we prepare the next batch concurrently to `get_vectored`, we will have a sizeable batch ready once `get_vectored` of the current batch is done and do not need an explicit timeout. This can be reasonably described as pipelining of the protocol handler. # Implementation We model the sub-protocol handler for pagestream requests (`handle_pagerequests`) as two futures that form a pipeline: 1. Batching: read requests from the connection and fill the current batch 2. Execution: `take` the current batch, execute it using `get_vectored`, and send the response. The Batching and Execution stages are connected through a new type of channel called `spsc_fold`. See the long comment in the `handle_pagerequests_pipelined` for details. 
# Changes - Refactor `handle_pagerequests` - separate functions for - reading one protocol message; produces a `BatchedFeMessage` with just one page request in it - batching; tries to merge an incoming `BatchedFeMessage` into an existing `BatchedFeMessage`; returns `None` on success and returns back the incoming message in case merging isn't possible - execution of a batched message - unify the timeline handle acquisition & request span construction; it now happens in the function that reads the protocol message - Implement serial and pipelined model - serial: what we had before any of the batching changes - read one protocol message - execute protocol messages - pipelined: the design described above - optionality for execution of the pipeline: either via concurrent futures or tokio tasks - Pageserver config - remove batching timeout field - add ability to configure pipelining mode - add ability to limit max batch size for pipelined configurations (required for the rollout, cf https://github.com/neondatabase/cloud/issues/20620 ) - ability to configure execution mode - Tests - remove `batch_timeout` parametrization - rename `test_getpage_merge_smoke` to `test_throughput` - add parametrization to test different max batch sizes and execution modes - rename `test_timer_precision` to `test_latency` - rename the test case file to `test_page_service_batching.py` - better descriptions of what the tests actually do ## On holding the `TimelineHandle` in the pending batch While batching, we hold the `TimelineHandle` in the pending batch. Therefore, the timeline will not finish shutting down while we're batching. This is not a problem in practice because the concurrently ongoing `get_vectored` call will fail quickly with an error indicating that the timeline is shutting down. This results in the Execution stage returning a `QueryError::Shutdown`, which causes the pipeline / entire page service connection to shut down. 
This drops all references to the `Arc<Mutex<Option<Box<BatchedFeMessage>>>>` object, thereby dropping the contained `TimelineHandle`s. - => fixes https://github.com/neondatabase/neon/issues/9850 # Performance Local run of the benchmarks, results in [this empty commit](`1cf5b1463f`) in the PR branch. Key take-aways: * `concurrent-futures` and `tasks` deliver identical `batching_factor` * tail latency impact unknown, cf https://github.com/neondatabase/neon/issues/9837 * `concurrent-futures` has higher throughput than `tasks` in all workloads (=lower `time` metric) * In unbatchable workloads, `concurrent-futures` has 5% higher `CPU-per-throughput` than that of `tasks`, and 15% higher than that of `serial`. * In batchable-32 workload, `concurrent-futures` has 8% lower `CPU-per-throughput` than that of `tasks` (comparison to tput of `serial` is irrelevant) * in unbatchable workloads, mean and tail latencies of `concurrent-futures` are practically identical to `serial`, whereas `tasks` adds 20-30us of overhead Overall, `concurrent-futures` seems like a slightly more attractive choice. # Rollout This change is disabled-by-default. Rollout plan: - https://github.com/neondatabase/cloud/issues/20620 # Refs - epic: https://github.com/neondatabase/neon/issues/9376 - this sub-task: https://github.com/neondatabase/neon/issues/9377 - the abandoned attempt to improve batching timeout resolution: https://github.com/neondatabase/neon/pull/9820 - closes https://github.com/neondatabase/neon/issues/9850 - fixes https://github.com/neondatabase/neon/issues/9835	2024-11-30 00:16:24 +00:00
Matthias van de Meent	973a8d2680	Fix timeout value used in XLogWaitForReplayOf (#9937 ) The previous value assumed usec precision, while the timeout used is in milliseconds, causing replica backends to wait for (potentially) many hours for WAL replay without the expected progress reports in logs. This fixes the issue. Reported-By: Alexander Lakhin <exclusion@gmail.com> ## Problem https://github.com/neondatabase/postgres/pull/279#issuecomment-2507671817 The timeout value was configured with the assumption the indicated value would be microseconds, where it's actually milliseconds. That causes the backend to wait for much longer (2h46m40s) before it emits the "I'm waiting for recovery" message. While we do have wait events configured on this, it's not great to have stuck backends without clear logs, so this fixes the timeout value in all our PostgreSQL branches. ## PG PRs * PG14: https://github.com/neondatabase/postgres/pull/542 * PG15: https://github.com/neondatabase/postgres/pull/543 * PG16: https://github.com/neondatabase/postgres/pull/544 * PG17: https://github.com/neondatabase/postgres/pull/545	2024-11-29 19:10:26 +00:00
Gleb Novikov	c848f25ec2	Fixed fast_import pgbin in calling get_pg_version (#9933 ) Was working on https://github.com/neondatabase/cloud/pull/20795 and discovered that fast_import is not working normally.	2024-11-29 17:58:36 +00:00
John Spray	d5624cc505	pageserver: download small objects using a smaller timeout (#9938 ) ## Problem It appears that the Azure storage API tends to hang TCP connections more than S3 does. Currently we use a 2 minute timeout for all downloads. This is large because sometimes the objects we download are large. However, waiting 2 minutes when doing something like downloading a manifest on tenant attach is problematic, because when someone is doing a "create tenant, create timeline" workflow, that 2 minutes is long enough for them reasonably to give up creating that timeline. Rather than propagate oversized timeouts further up the stack, we should use a different timeout for objects that we expect to be small. Closes: https://github.com/neondatabase/neon/issues/9836 ## Summary of changes - Add a `small_timeout` configuration attribute to remote storage, defaulting to 30 seconds (still a very generous period to do something like download an index) - Add a DownloadKind parameter to DownloadOpts, so that callers can indicate whether they expect the object to be small or large. - In the azure client, use small timeout for HEAD requests, and for GET requests if DownloadKind::Small is used. - Use DownloadKind::Small for manifests, indices, and heatmap downloads. This PR intentionally does not make the equivalent change to the S3 client, to reduce blast radius in case this has unexpected consequences (we could accomplish the same thing by editing lots of configs, but just skipping the code is simpler for right now)	2024-11-29 15:11:44 +00:00
Alexey Kondratov	538e2312a6	feat(compute_ctl): Always set application_name (#9934 ) ## Problem It was not always possible to judge what exactly some `cloud_admin` connections were doing because we didn't consistently set `application_name` everywhere. ## Summary of changes Unify the way we connect to Postgres: 1. Switch to building configs everywhere 2. Always set `application_name` and make naming consistent Follow-up for #9919 Part of neondatabase/cloud#20948	2024-11-29 13:55:56 +00:00
Erik Grinaker	a6073b5013	safekeeper: use jemalloc (#9780 ) ## Problem To add Safekeeper heap profiling in #9778, we need to switch to an allocator that supports it. Pageserver and proxy already use jemalloc. Touches #9534. ## Summary of changes Use jemalloc in Safekeeper.	2024-11-29 13:38:04 +00:00
John Spray	ea3798e3b3	storage controller: use proper ScheduleContext when evacuating a node (#9908 ) ## Problem When picking locations for a shard, we should use a ScheduleContext that includes all the other shards in the tenant, so that we apply proper anti-affinity between shards. If we don't do this, then it can lead to unstable scheduling, where we place a shard somewhere that the optimizer will then immediately move it away from. We didn't always do this, because it was a bit awkward to accumulate the context for a tenant rather than just walking tenants. This was a TODO in `handle_node_availability_transition`: ``` // TODO: populate a ScheduleContext including all shards in the same tenant_id (only matters // for tenants without secondary locations: if they have a secondary location, then this // schedule() call is just promoting an existing secondary) ``` This is a precursor to https://github.com/neondatabase/neon/issues/8264, where the current imperfect scheduling during node evacuation hampers testing. ## Summary of changes - Add an iterator type that yields each shard along with a schedulecontext that includes all the other shards from the same tenant - Use the iterator to replace hand-crafted logic in optimize_all_plan (functionally identical) - Use the iterator in `handle_node_availability_transition` to apply proper anti-affinity during node evacuation.	2024-11-29 13:27:49 +00:00
Conrad Ludgate	1d642d6a57	chore(proxy): vendor a subset of rust-postgres (#9930 ) Our rust-postgres fork is getting messy. Mostly because proxy wants more control over the raw protocol than tokio-postgres provides. As such, it's diverging more and more. Storage and compute also make use of rust-postgres, but in more normal usage, thus they don't need our crazy changes. Idea: * proxy maintains their subset * other teams use a minimal patch set against upstream rust-postgres Reviewing this code will be difficult. To implement it, I 1. Copied tokio-postgres, postgres-protocol and postgres-types from `00940fcdb5` 2. Updated their package names with the `2` suffix to make them compile in the workspace. 3. Updated proxy to use those packages 4. Copied in the code from tokio-postgres-rustls 0.13 (with some patches applied https://github.com/jbg/tokio-postgres-rustls/pull/32 https://github.com/jbg/tokio-postgres-rustls/pull/33) 5. Removed as much dead code as I could find in the vendored libraries 6. Updated the tokio-postgres-rustls code to use our existing channel binding implementation	2024-11-29 11:08:01 +00:00
Erik Grinaker	3ffe6de0b9	test_runner/performance: add logical message ingest benchmark (#9749 ) Adds a benchmark for logical message WAL ingestion throughput end-to-end. Logical messages are essentially noops, and thus ignored by the Pageserver. Example results from my MacBook, with fsync enabled: ``` postgres_ingest: 14.445 s safekeeper_ingest: 29.948 s pageserver_ingest: 30.013 s pageserver_recover_ingest: 8.633 s wal_written: 10,340 MB message_count: 1310720 messages postgres_throughput: 715 MB/s safekeeper_throughput: 345 MB/s pageserver_throughput: 344 MB/s pageserver_recover_throughput: 1197 MB/s ``` See https://github.com/neondatabase/neon/issues/9642#issuecomment-2475995205 for running analysis. Touches #9642.	2024-11-29 09:40:08 +00:00
Alexey Kondratov	42fb3c4d30	fix(compute_ctl): Allow usage of DB names with whitespaces (#9919 ) ## Problem We used `set_path()` to replace the database name in the connection string. It automatically does url-safe encoding if the path is not already encoded, but it does it as per the URL standard, which assumes that tabs can be safely removed from the path without changing the meaning of the URL. See, e.g., https://url.spec.whatwg.org/#concept-basic-url-parser. It also breaks for DBs with properly %-encoded names, like with `%20`, as they are kept intact, but actually should be escaped. Yet, this is not true for Postgres, where it's completely valid to have trailing tabs in the database name. I think this is the PR that caused this regression https://github.com/neondatabase/neon/pull/9717, as it switched from `postgres::config::Config` back to `set_path()`. This was fixed a while ago already [1], btw, I just haven't added a test to catch this regression back then :( ## Summary of changes This commit changes the code back to use `postgres/tokio_postgres::Config` everywhere. While on it, also do some changes around, as I had to touch this code: 1. Bump some logging from `debug` to `info` in the spec apply path. We do not use `debug` in prod, and it was tricky to understand what was going on with this bug in prod. 2. Refactor configuration concurrency calculation code so it was reusable. Yet, still keep `1` in the case of reconfiguration. The database can be actively used at this moment, so we cannot guarantee that there will be enough spare connection slots, and the underlying code won't handle connection errors properly. 3. Simplify the installed extensions code. It was spawning a blocking task inside async function, which doesn't make much sense. Instead, just have a main sync function and call it with `spawn_blocking` in the API code -- the only place we need it to be async. 4. Add regression python test to cover this and related problems in the future. 
Also, add more extensive testing of schema dump and DBs and roles listing API. [1]: `4d1e48f3b9` [2]: https://www.postgresql.org/message-id/flat/20151023003445.931.91267%40wrigleys.postgresql.org Resolves neondatabase/cloud#20869	2024-11-28 21:38:30 +00:00
Alexander Bayandin	e04dd3be0b	test_runner: rerun all failed tests (#9917 ) ## Problem Currently, we rerun only known flaky tests. This approach was chosen to reduce the number of tests that go unnoticed (by forcing people to take a look at failed tests and rerun the job manually), but it has some drawbacks: - In PRs, people tend to push new changes without checking failed tests (that's ok) - On main, tests are just restarted without checking (understandable) - Parametrised tests become flaky one by one, i.e. if `test[1]` is flaky, `test[2]` is not marked as flaky automatically (which may or may not be the case). I suggest rerunning all failed tests to increase the stability of GitHub jobs and using the Grafana Dashboard with flaky tests for deeper analysis. ## Summary of changes - Rerun all failed tests at most twice	2024-11-28 19:02:57 +00:00
Vlad Lazar	eb520a14ce	pageserver: return correct LSN for interpreted proto keep alive responses (#9928 ) ## Problem For the interpreted proto the pageserver is not returning the correct LSN in replies to keep alive requests. This is because the interpreted protocol arm was not updating `last_rec_lsn`. ## Summary of changes * Return correct LSN in keep-alive responses * Fix shard field in wal sender traces	2024-11-28 17:38:47 +00:00
Arpad Müller	eb5d832e6f	Update rust to 1.83.0, also update cargo adjacent tools (#9926 ) We keep the practice of keeping the compiler up to date, pointing to the latest release. This is done by many other projects in the Rust ecosystem as well. [Release notes](https://releases.rs/docs/1.83.0/). Also update `cargo-hakari`, `cargo-deny`, `cargo-hack` and `cargo-nextest` to their latest versions. Prior update was in #9445.	2024-11-28 15:49:30 +00:00
Erik Grinaker	70780e310c	Makefile: build pg_visibility (#9922 ) Build the `pg_visibility` extension for use with `neon_local`. This is useful to inspect the visibility map for debugging. Touches #9914.	2024-11-28 15:48:18 +00:00
Vlad Lazar	e82f7f0dfc	remote_storage/abs: count 404 and 304 for get as ok for metrics (#9912 ) ## Problem We currently see elevated levels of errors for GetBlob requests. This is because 404 and 304 are counted as errors for metric reporting. ## Summary of Changes Bring the implementation in line with the S3 client and treat 404 and 304 responses as ok for metric purposes. Related: https://github.com/neondatabase/cloud/issues/20666	2024-11-28 10:11:08 +00:00
Folke Behrens	8a6ee79f6f	Merge pull request #9921 from neondatabase/rc/release-proxy/2024-11-28 Proxy release 2024-11-28	2024-11-28 11:09:06 +01:00
Ivan Efremov	8173dc600a	proxy: spawn cancellation checks in the background (#9918 ) ## Problem For cancellation, a connection is open during all the cancel checks. ## Summary of changes Spawn cancellation checks in the background, and close connection immediately. Use task_tracker for cancellation checks.	2024-11-28 06:32:22 +00:00
github-actions[bot]	9052c32b46	Proxy release 2024-11-28	2024-11-28 06:02:15 +00:00
Erik Grinaker	da1daa2426	pageserver: only apply `ClearVmBits` on relevant shards (#9895 ) # Problem VM (visibility map) pages are stored and managed as any regular relation page, in the VM fork of the main relation. They are also sharded like other pages. Regular WAL writes to the VM pages (typically performed by vacuum) are routed to the correct shard as usual. However, VM pages are also updated via `ClearVmBits` metadata records emitted when main relation pages are updated. These metadata records were sent to all shards, like other metadata records. This had the following effects: * On shards responsible for VM pages, the `ClearVmBits` applies as expected. * On shard 0, which knows about the VM relation and its size but doesn't necessarily have any VM pages, the `ClearVmBits` writes may have been applied without also having applied the explicit WAL writes to VM pages. * If VM pages are spread across multiple shards (unlikely with 256MB stripe size), all shards may have applied `ClearVmBits` if the pages fall within their local view of the relation size, even for pages they do not own. * On other shards, this caused a relation size cache miss and a DbDir and RelDir lookup before dropping the `ClearVmBits`. With many relations, this could cause significant CPU overhead. This is not believed to be a correctness problem, but this will be verified in #9914. Resolves #9855. # Changes Route `ClearVmBits` metadata records only to the shards responsible for the VM pages. Verification of the current VM handling and cleanup of incomplete VM pages on shard 0 (and potentially elsewhere) is left as follow-up work.	2024-11-27 19:44:24 +00:00
Alex Chi Z.	9e3cb75bc7	fix(pageserver): flush deletion queue in `reload` shutdown mode (#9884 ) ## Problem close https://github.com/neondatabase/neon/issues/9859 ## Summary of changes Ensure that the deletion queue gets fully flushed (i.e., the deletion lists get applied) during a graceful shutdown. It is still possible that an incomplete shutdown leaves a deletion list behind and causes a race upon the next startup, but we assume this is unlikely to happen, and even if it does, the pageserver should already be in a tainted state and the tenant should be moved to a new tenant with a new generation number. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-11-27 18:30:54 +00:00
Folke Behrens	5c41707bee	proxy: promote two logs to error, fix multiline log (#9913 ) * Promote two logs from mpsc send errors to error level. The channels are unbounded and there shouldn't be errors. * Fix one multiline log from anyhow::Error. Use Debug instead of Display.	2024-11-27 18:05:46 +00:00
Erik Grinaker	cc37fa0f33	pageserver: add metrics for unknown `ClearVmBits` pages (#9911 ) ## Problem When ingesting implicit `ClearVmBits` operations, we silently drop the writes if the relation or page is unknown. There are implicit assumptions around VM pages wrt. explicit/implicit updates, sharding, and relation sizes, which can possibly drop writes incorrectly. Adding a few metrics will allow us to investigate further and tighten up the logic. Touches #9855. ## Summary of changes Add a `pageserver_wal_ingest_clear_vm_bits_unknown` metric to record dropped `ClearVmBits` writes. Also add comments clarifying the behavior of relation sizes on non-zero shards.	2024-11-27 17:16:41 +00:00
Alex Chi Z.	23f5a27146	fix(storage-scrubber): valid layermap error degrades to warning (#9902 ) Valid layer assumption is a necessary condition for a layer map to be valid. It's a stronger check imposed by gc-compaction than the actual valid layermap definition. Actually, the system can work as long as there are no overlapping layer maps. Therefore, we degrade that into a warning. Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-11-27 16:07:39 +00:00
Erik Grinaker	e4f437a354	pageserver: add relsize cache metrics (#9890 ) ## Problem We don't have any observability for the relation size cache. We have seen cache misses cause significant performance impact with high relation counts. Touches #9855. ## Summary of changes Adds the following metrics: * `pageserver_relsize_cache_entries` * `pageserver_relsize_cache_hits` * `pageserver_relsize_cache_misses` * `pageserver_relsize_cache_misses_old`	2024-11-27 13:54:14 +00:00
Vlad Lazar	8fdf786217	pageserver: add tenant config override for wal receiver proto (#9888 ) ## Problem Can't change protocol at tenant granularity. ## Summary of changes Add tenant config level override for wal receiver protocol. ## Links Related: https://github.com/neondatabase/neon/issues/9336 Epic: https://github.com/neondatabase/neon/issues/9329	2024-11-27 13:46:23 +00:00
Vlad Lazar	9e0148de11	safekeeper: use protobuf for sending compressed records to pageserver (#9821 ) ## Problem https://github.com/neondatabase/neon/pull/9746 lifted decoding and interpretation of WAL to the safekeeper. This reduced the ingested amount on the pageservers by around 10x for a tenant with 8 shards, but doubled the ingested amount for single sharded tenants. Also, https://github.com/neondatabase/neon/pull/9746 uses bincode which doesn't support schema evolution. Technically the schema can be evolved, but it's very cumbersome. ## Summary of changes This patch set addresses both problems by adding protobuf support for the interpreted wal records and adding compression support. Compressed protobuf reduced the ingested amount by 100x on the 32 shards `test_sharded_ingest` case (compared to non-interpreted proto). For the 1 shard case the reduction is 5x. Sister change to `rust-postgres` is [here](https://github.com/neondatabase/rust-postgres/pull/33). ## Links Related: https://github.com/neondatabase/neon/issues/9336 Epic: https://github.com/neondatabase/neon/issues/9329	2024-11-27 12:12:21 +00:00
Alexander Bayandin	7b41ee872e	CI(pre-merge-checks): build only one build-tools-image (#9718 ) ## Problem The `pre-merge-checks` workflow relies on the build-tools image. If changes to the `build-tools` image have been merged into the main branch since the last CI run for a PR (with other changes to the `build-tools`), the image will be rebuilt during the merge queue run. Otherwise, cached images are used. Rebuilding the image adds approximately 10 minutes on x86-64 and 20 minutes on arm64 to the process. ## Summary of changes - parametrise `build-build-tools-image` job with arch and Debian version - Run `pre-merge-checks` only on Debian 12 x86-64 image	2024-11-27 10:42:26 +00:00
Peter Bendel	277c33ba3f	ingest benchmark: after effective_io_concurrency = 100 we can increase compute side parallelism (#9904 ) ## Problem ingest benchmark tests project migration to Neon involving steps - COPY relation data - create indexes - create constraints Previously we used only 4 copy jobs, 4 create index jobs and 7 maintenance workers. After increasing effective_io_concurrency on compute we see that we can sustain more parallelism in the ingest bench ## Summary of changes Increase copy jobs to 8, create index jobs to 8 and maintenance workers to 16	2024-11-27 10:09:01 +00:00
Tristan Partin	2b788cb53f	Bump neon.logical_replication_max_snap_files default to 10000 (#9896 ) This bump comes from a recommendation from Chi. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-11-26 17:49:37 +00:00
Peter Bendel	13feda0669	track how much time the flush loop is stalled waiting for uploads (#9885 ) ## Problem We don't know how much time PS is losing during ingest when waiting for remote storage uploads in the flush frozen layer loop. We also don't know how many remote storage requests get a permit without waiting (not throttled by remote_storage concurrency_limit). ## Summary of changes - Add a metric that accumulates the time waited per shard/PS - in [remote storage semaphore wait seconds](https://neonprod.grafana.net/d/febd9732-9bcf-4992-a821-49b1f6b02724/remote-storage?orgId=1&var-datasource=HUNg6jvVk&var-instance=pageserver-26.us-east-2.aws.neon.build&var-instance=pageserver-27.us-east-2.aws.neon.build&var-instance=pageserver-28.us-east-2.aws.neon.build&var-instance=pageserver-29.us-east-2.aws.neon.build&var-instance=pageserver-30.us-east-2.aws.neon.build&var-instance=pageserver-31.us-east-2.aws.neon.build&var-instance=pageserver-36.us-east-2.aws.neon.build&var-instance=pageserver-37.us-east-2.aws.neon.build&var-instance=pageserver-38.us-east-2.aws.neon.build&var-instance=pageserver-39.us-east-2.aws.neon.build&var-instance=pageserver-40.us-east-2.aws.neon.build&var-instance=pageserver-41.us-east-2.aws.neon.build&var-request_type=put_object&from=1731961336340&to=1731964762933&viewPanel=3) add a first bucket with 100 microseconds to count requests that do not need to wait on the semaphore Update: created a new version that uses a Gauge (one increasing value per PS/shard) instead of a histogram, as suggested in review	2024-11-26 11:46:58 +00:00
Conrad Ludgate	96a1b71c84	chore(proxy): discard request context span during passthrough (#9882 ) ## Problem The RequestContext::span shouldn't live for the entire postgres connection, only the handshake. ## Summary of changes * Slight refactor to the RequestContext to discard the span upon handshake completion. * Make sure the temporary future for the handshake is dropped (not bound to a variable) * Runs our nightly fmt script	2024-11-25 21:32:53 +00:00
Arpad Müller	a74ab9338d	fast_import: remove hardcoding of pg_version (#9878 ) Before, we hardcoded the pg_version to 140000, while the code expected version numbers like 14. Now we use an enum, and code from `extension_server.rs` to auto-detect the correct version. The enum helps when we add support for a version: enums ensure that compilation fails if one forgets to put the version to one of the `match` locations. cc https://github.com/neondatabase/neon/pull/9218	2024-11-25 20:23:42 +00:00
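The enum approach described in #9878 can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the type name `PgMajorVersion`, its variants, and the helper names are assumptions made up for the example.

```rust
/// Hypothetical enum of supported major versions (names are illustrative,
/// not taken from the PR).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PgMajorVersion {
    V14,
    V15,
    V16,
    V17,
}

impl PgMajorVersion {
    /// Accepts plain major numbers like 14; a value such as 140000 is rejected
    /// instead of being silently passed along.
    fn from_major(major: u32) -> Option<Self> {
        match major {
            14 => Some(Self::V14),
            15 => Some(Self::V15),
            16 => Some(Self::V16),
            17 => Some(Self::V17),
            _ => None,
        }
    }

    /// No wildcard arm here: adding a new variant makes this `match`
    /// fail to compile until the new version is handled.
    fn major(self) -> u32 {
        match self {
            Self::V14 => 14,
            Self::V15 => 15,
            Self::V16 => 16,
            Self::V17 => 17,
        }
    }
}
```

The exhaustive `match` in `major` is what gives the compile-time reminder mentioned in the commit message.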
Folke Behrens	7404887b81	proxy: Demote errors from cplane request routines to debug (#9886 ) ## Problem Any errors from these async blocks are unconditionally logged at error level even though we already handle such errors based on context. ## Summary of changes * Log raw errors from creating and executing cplane requests at debug level. * Inline macro calls to retain the correct callsite.	2024-11-25 19:35:32 +00:00
Folke Behrens	87e4dd23a1	proxy: Demote all cplane error replies to info log level (#9880 ) ## Problem The vast majority of the error/warn logs from cplane are about time or data transfer quotas exceeded or endpoint-not-found errors and not operational errors in proxy or cplane. ## Summary of changes * Demote cplane error replies to info level. * Raise other errors from warn back to error.	2024-11-25 17:53:26 +00:00
Vlad Lazar	7a2f0ed8d4	safekeeper: lift decoding and interpretation of WAL to the safekeeper (#9746 ) ## Problem For any given tenant shard, pageservers receive all of the tenant's WAL from the safekeeper. This soft-blocks us from using larger shard counts due to bandwidth concerns and CPU overhead of filtering out the records. ## Summary of changes This PR lifts the decoding and interpretation of WAL from the pageserver into the safekeeper. A customised PG replication protocol is used where instead of sending raw WAL, the safekeeper sends filtered, interpreted records. The receiver drives the protocol selection, so, on the pageserver side, usage of the new protocol is gated by a new pageserver config: `wal_receiver_protocol`. More granularly the changes are: 1. Optionally inject the protocol and shard identity into the arguments used for starting replication 2. On the safekeeper side, implement a new wal sending primitive which decodes and interprets records before sending them over 3. On the pageserver side, implement the ingestion of this new replication message type. It's very similar to what we already have for raw wal (minus decoding and interpreting). ## Notes * This PR currently uses my [branch of rust-postgres](https://github.com/neondatabase/rust-postgres/tree/vlad/interpreted-wal-record-replication-support) which includes the deserialization logic for the new replication message type. PR for that is open [here](https://github.com/neondatabase/rust-postgres/pull/32). * This PR contains changes for both pageservers and safekeepers. It's safe to merge because the new protocol is disabled by default on the pageserver side. We can gradually start enabling it in subsequent releases. * CI tests are running on https://github.com/neondatabase/neon/pull/9747 ## Links Related: https://github.com/neondatabase/neon/issues/9336 Epic: https://github.com/neondatabase/neon/issues/9329	2024-11-25 17:29:28 +00:00
Christian Schwarz	5c2356988e	page_service: add benchmark for batching (#9820 ) This PR adds two benchmarks to demonstrate the effect of server-side getpage request batching added in https://github.com/neondatabase/neon/pull/9321. For the CPU usage, I found that the `prometheus` crate's built-in CPU usage accounts the seconds at integer granularity. That's not enough if you reduce the target benchmark runtime for local iteration. So, add a new `libmetrics` metric and report that. The benchmarks are disabled because [on our benchmark nodes, timer resolution isn't high enough](https://neondb.slack.com/archives/C059ZC138NR/p1732264223207449). They work (no statement about quality) on my bare-metal devbox. They will be refined and enabled once we find a fix. Candidates at time of writing are: - https://github.com/neondatabase/neon/pull/9822 - https://github.com/neondatabase/neon/pull/9851 Refs: - Epic: https://github.com/neondatabase/neon/issues/9376 - Extracted from https://github.com/neondatabase/neon/pull/9792	2024-11-25 15:52:39 +00:00
Konstantin Knizhnik	441612c1ce	Prefetch on macos (#9875 ) ## Problem Prefetch is disabled on macOS because `posix_fadvise` is not available. But Neon prefetch does not use this function, and for testing on macOS it is very convenient to have prefetch available. ## Summary of changes Define `USE_PREFETCH` in Makefile. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-11-25 15:21:52 +00:00
Arpad Müller	77630e5408	Address beta clippy lint needless_lifetimes (#9877 ) The 1.83.0 version of Rust will be stable soon, let's get the clippy lint fixes in before the compiler version upgrade.	2024-11-25 14:59:12 +00:00
Alexander Bayandin	3d380acbd1	Bump default Debian version to Bookworm everywhere (#9863 ) ## Problem We have a couple of CI workflows that still run on Debian Bullseye, and the default Debian version in images is Bullseye as well (we explicitly set building on Bookworm) ## Summary of changes - Run `pgbench-pgvector` on Bookworm (fix a couple of packages) - Run `trigger_bench_on_ec2_machine_in_eu_central_1` on Bookworm - Change default `DEBIAN_VERSION` in Dockerfiles to Bookworm - Make `pinned` docker tag an alias to `pinned-bookworm`	2024-11-25 14:43:32 +00:00
Alex Chi Z.	4630b70962	fix(pageserver): ensure all layers are flushed before measuring RSS (#9861 ) ## Problem close https://github.com/neondatabase/neon/issues/9761 The test assumed that no new L0 layers are flushed throughout the process, which is not true. ## Summary of changes Fix the test case `test_compaction_l0_memory` by flushing in-memory layers before compaction. Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-11-25 14:25:18 +00:00
Conrad Ludgate	6f6749c4a9	chore: update rustls (#9871 )	2024-11-25 12:01:30 +00:00
Folke Behrens	0d1e82f0a7	Bump futures-* crates, drop unused license, hide duplicate crate warnings (#9858 ) * The futures-util crate we use was yanked. Bump it and its siblings to new patch release. https://github.com/rust-lang/futures-rs/releases/tag/0.3.31 * cargo-deny: Drop an unused license. * cargo-deny: Don't warn about duplicate crate. Duplicate crates are unavoidable and the noise just hides real warnings.	2024-11-25 10:59:49 +00:00
Alexander Bayandin	6f7aeaa1c5	test_runner: use LFC by default (#8613 ) ## Problem LFC is not enabled by default in tests, but it is enabled in production. This increases the risk of errors in the production environment, which were not found during the routine workflow. However, enabling LFC for all the tests may overload the disk on our servers and increase the number of failures. So, we try enabling LFC in one case to evaluate the possible risk. ## Summary of changes A new environment variable, USE_LFC is introduced. If it is set to true, LFC is enabled by default in all the tests. In our workflow, we enable LFC for PG17, release, x86-64, and disabled for all other combinations. --------- Co-authored-by: Alexey Masterov <alexeymasterov@neon.tech> Co-authored-by: a-masterov <72613290+a-masterov@users.noreply.github.com>	2024-11-25 09:01:05 +00:00
Christian Schwarz	450be26bbb	fast imports: initial Importer and Storage changes (#9218 ) Co-authored-by: Heikki Linnakangas <heikki@neon.tech> Co-authored-by: Stas Kelvic <stas@neon.tech> # Context This PR contains PoC-level changes for a product feature that allows onboarding large databases into Neon without going through the regular data path. # Changes This internal RFC provides all the context * https://github.com/neondatabase/cloud/pull/19799 In the language of the RFC, this PR covers * the Importer code (`fast_import`) * all the Pageserver changes (mgmt API changes, flow implementation, etc) * a basic test for the Pageserver changes # Reviewing As acknowledged in the RFC, the code added in this PR is not ready for general availability. Also, the architecture is not to be discussed in this PR, but in the RFC and associated Slack channel instead. Reviewers of this PR should take that into consideration. The quality bar to apply during review depends on what area of the code is being reviewed: * Importer code (`fast_import`): practically anything goes * Core flow (`flow.rs`): * Malicious input data must be expected and the existing threat models apply. * The code must be safe to execute on dedicated Pageserver instances: * This means in particular that tenants on other Pageserver instances must not be affected negatively wrt data confidentiality, integrity or availability. * Other code: the usual quality bar * Pay special attention to correct use of gate guards, timeline cancellation in all places during shutdown & migration, etc. * Consider the broader system impact; if you find potentially problematic interactions with Storage features that were not covered in the RFC, bring that up during the review. I recommend submitting three separate reviews, for the three high-level areas with different quality bars. 
# References (Internal-only) * refs https://github.com/neondatabase/cloud/issues/17507 * refs https://github.com/neondatabase/company_projects/issues/293 * refs https://github.com/neondatabase/company_projects/issues/309 * refs https://github.com/neondatabase/cloud/issues/20646 --------- Co-authored-by: Stas Kelvich <stas.kelvich@gmail.com> Co-authored-by: Heikki Linnakangas <heikki@neon.tech> Co-authored-by: John Spray <john@neon.tech>	2024-11-22 22:47:06 +00:00
Anastasia Lubennikova	3245f7b88d	Rename 'installed_extensions' metric to 'compute_installed_extensions' (#9759 ) to keep it consistent with existing compute metrics. flux-fleet change is not needed, because it doesn't have any filter by metric name for compute metrics.	2024-11-22 19:27:04 +00:00
Alex Chi Z.	c1937d073f	fix(pageserver): ensure upload happens after delete (#9844 ) ## Problem Follow up of https://github.com/neondatabase/neon/pull/9682, that patch didn't fully address the problem: what if shutdown fails due to whatever reason and then we reattach the tenant? Then we will still remove the future layer. The underlying problem is that the fix for #5878 gets voided because of the generation optimizations. Of course, we also need to ensure that delete happens after uploads, but note that we only schedule deletes when there are no ongoing upload tasks, so that's fine. ## Summary of changes * Add a test case to reproduce the behavior (by changing the original test case to attach the same generation). * If layer upload happens after the deletion, drain the deletion queue before uploading. * If blocked_deletion is enabled, directly remove it from the blocked_deletion queue. * Local fs backend fix to avoid race between deletion and preload. * test_emergency_mode does not need to wait for uploads (and it's generally not possible to wait for uploads). * ~~Optimize deletion executor to skip validation if there are no files to delete.~~ this doesn't work --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-11-22 18:30:53 +00:00
Alex Chi Z.	6f8b1eb5a6	test(pageserver): add detach ancestor smoke test (#9842 ) ## Problem Follow up to https://github.com/neondatabase/neon/pull/9682, hopefully we can detect some issues or assure ourselves that this is ready for production. ## Summary of changes * Add a compaction-detach-ancestor smoke test. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-11-22 18:21:51 +00:00
Erik Grinaker	e939d36dd4	safekeeper,pageserver: fix CPU profiling allowlists (#9856 ) ## Problem The HTTP router allowlists matched both on the path and the query string. This meant that only `/profile/cpu` would be allowed without auth, while `/profile/cpu?format=svg` would require auth. Follows #9764. ## Summary of changes * Match allowlists on URI path, rather than the entire URI. * Fix the allowlist for Safekeeper to use `/profile/cpu` rather than the old `/pprof/profile`. * Just use a constant slice for the allowlist; it's only a handful of items, and these handlers are not on hot paths.	2024-11-22 17:50:33 +00:00
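The path-only allowlist match from #9856 can be illustrated with a small sketch. The allowlist entries and function names below are hypothetical and chosen for the example; the real routers define their own lists and use their HTTP library's URI type rather than raw strings.

```rust
/// Hypothetical allowlist as a constant slice; the entries are illustrative only.
const ALLOWLIST: &[&str] = &["/metrics", "/profile/cpu"];

/// Returns the path component of a request target such as `/profile/cpu?format=svg`.
fn uri_path(request_target: &str) -> &str {
    // Everything before the first `?` (or `#`) is the path.
    request_target
        .split(|c| c == '?' || c == '#')
        .next()
        .unwrap_or(request_target)
}

/// True if the request may skip auth. Only the path is compared, so a query
/// string no longer changes the outcome.
fn is_allowlisted(request_target: &str) -> bool {
    ALLOWLIST.contains(&uri_path(request_target))
}
```

With this shape, `/profile/cpu` and `/profile/cpu?format=svg` are treated the same, which is the behaviour the commit message describes.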
Alex Chi Z.	211e4174d2	fix(pageserver): preempt and retry azure list operation (#9840 ) ## Problem close https://github.com/neondatabase/neon/issues/9836 Looking at Azure SDK, the only related issue I can find is https://github.com/azure/azure-sdk-for-rust/issues/1549. Azure uses reqwest as the backend, so I assume there's some underlying magic unknown to us that might have caused the stuck in #9836. The observation is: * We didn't get an explicit out of resource HTTP error from Azure. * The connection simply gets stuck and times out. * But when we retry after we reach the timeout, it succeeds. This issue is hard to identify -- maybe something went wrong at the ABS side, or something wrong with our side. But we know that a retry will usually succeed if we give up the stuck connection. Therefore, I propose the fix that we preempt stuck HTTP operation and actively retry. This would mitigate the problem, while in the long run, we need to keep an eye on ABS usage and see if we can fully resolve this problem. The reasoning of such timeout mechanism: we use a much smaller timeout than before to preempt, while it is possible that a normal listing operation would take a longer time than the initial timeout if it contains a lot of keys. Therefore, after we terminate the connection, we should double the timeout, so that such requests would eventually succeed. ## Summary of changes * Use exponential growth for ABS list timeout. * Rather than using a fixed timeout, use a timeout that starts small and grows * Rather than exposing timeouts to the list_streaming caller as soon as we see them, only do so after we have retried a few times Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-11-22 17:50:00 +00:00
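A rough sketch of the timeout policy described in #9840: start with a small timeout, double it after each preempted attempt, and only surface the timeout after a few retries. This is a synchronous simplification under assumed names (`attempt_timeout`, `retry_with_growing_timeout`, and the cap `max` are not from the PR); the actual code is async and lives in the ABS client.

```rust
use std::time::Duration;

/// Timeout for the given zero-based attempt: starts at `initial` and doubles
/// on every retry, capped at `max` (the cap is an assumption of this sketch).
fn attempt_timeout(initial: Duration, max: Duration, attempt: u32) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    initial.saturating_mul(factor).min(max)
}

/// Runs `op` with a growing timeout. `op` receives the timeout it must respect
/// and returns `None` if it was preempted. The caller only sees a failure
/// (`None`) after `max_attempts` tries.
fn retry_with_growing_timeout<T>(
    initial: Duration,
    max: Duration,
    max_attempts: u32,
    mut op: impl FnMut(Duration) -> Option<T>,
) -> Option<T> {
    (0..max_attempts).find_map(|attempt| op(attempt_timeout(initial, max, attempt)))
}
```

Doubling means a listing with many keys, which legitimately needs longer than the initial timeout, still succeeds on a later attempt.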
Ivan Efremov	3b1ac8b14a	proxy: Implement cancellation rate limiting (#9739 ) Implement cancellation rate limiting and ip allowlist checks. Add ip_allowlist to the cancel closure Fixes [#16456](https://github.com/neondatabase/cloud/issues/16456)	2024-11-22 16:46:38 +00:00
Alexander Bayandin	b3b579b45e	test_bulk_insert: fix typing for PgVersion (#9854 ) ## Problem Along with the migration to Python 3.11, I switched `C(str, Enum)` with `C(StrEnum)`; one such example is the `PgVersion` enum. It required more changes in `PgVersion` itself (before, it accepted both `str` and `int`, and after it, it supports only `str`), which caused the `test_bulk_insert` test to fail. ## Summary of changes - `test_bulk_insert`: explicitly cast pg_version from `timeline_detail` to str	2024-11-22 16:13:53 +00:00
Conrad Ludgate	8ab96cc71f	chore(proxy/jwks): reduce the rightward drift of jwks renewal (#9853 ) I found the rightward drift of the `renew_jwks` function hard to review. This PR splits out some major logic and uses early returns to make the happy path more linear.	2024-11-22 14:51:32 +00:00
Alexander Bayandin	51d26a261b	build(deps): bump mypy from 1.3.0 to 1.13.0 (#9670 ) ## Problem We use a pretty old version of `mypy` 1.3 (released 1.5 years ago), it produces false positives for `typing.Self`. ## Summary of changes - Bump `mypy` from 1.3 to 1.13 - Fix new warnings and errors - Use `typing.Self` whenever we `return self`	2024-11-22 14:31:36 +00:00
Tristan Partin	c10b7f7de9	Write a newline after adding dynamic_shared_memory_type to PG conf (#9843 ) Without adding a newline, we can end up with a conf line that looks like the following: dynamic_shared_memory_type = mmap# Managed by compute_ctl: begin This leads to Postgres logging: LOG: configuration file "/var/db/postgres/compute/pgdata/postgresql.conf" contains errors; unaffected changes were applied Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-11-22 13:37:06 +00:00
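The failure mode in #9843 is easy to reproduce with plain strings. The helper below is a hypothetical sketch (the name `append_conf_line` is not from compute_ctl): it always terminates the setting with a newline so that whatever is appended next starts on its own line.

```rust
/// Appends `line` to `conf` and terminates it with a newline, so a marker
/// written afterwards cannot end up glued to the setting.
fn append_conf_line(conf: &mut String, line: &str) {
    // Also guard against existing content that lacks a trailing newline.
    if !conf.is_empty() && !conf.ends_with('\n') {
        conf.push('\n');
    }
    conf.push_str(line);
    conf.push('\n');
}
```

Without the trailing newline, appending the `# Managed by compute_ctl: begin` marker would produce the malformed line quoted in the commit message.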
Heikki Linnakangas	7372312a73	Avoid unnecessary send_replace calls in seqwait (#9852 ) The notifications need to be sent whenever the waiters heap changes, per the comment in `update_status`. But if 'advance' is called when there are no waiters, or the new LSN is lower than the waiters so that no one needs to be woken up, there's no need to send notifications. This saves some CPU cycles in the common case that there are no waiters.	2024-11-22 13:29:49 +00:00
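The idea in #9852 can be sketched as follows: only signal a change when advancing actually removes entries from the waiters heap. The `Waiters` type and its fields are hypothetical stand-ins for this example, not the real `seqwait` types, and the returned `bool` stands in for the decision to call `send_replace`.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical stand-in for the waiter bookkeeping: a min-heap of the
/// sequence numbers that waiters are blocked on.
struct Waiters {
    current: u64,
    heap: BinaryHeap<Reverse<u64>>,
}

impl Waiters {
    /// Advances the current value and pops every waiter that is now satisfied.
    /// Returns `true` only if the heap changed, i.e. only then does a
    /// notification need to be sent.
    fn advance(&mut self, new: u64) -> bool {
        if new <= self.current {
            return false;
        }
        self.current = new;
        let mut changed = false;
        while let Some(&Reverse(wanted)) = self.heap.peek() {
            if wanted > new {
                break; // the smallest waiter is still ahead: nobody to wake
            }
            self.heap.pop();
            changed = true;
        }
        changed
    }
}
```

In the common case of an empty heap, `advance` returns `false` without doing any notification work.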
John Spray	d9de65ee8f	pageserver: permit reads behind GC cutoff during LSN grace period (#9833 ) ## Problem In https://github.com/neondatabase/neon/issues/9754 and the flakiness of `test_readonly_node_gc`, we saw that although our logic for controlling GC was sound, the validation of getpage requests was not, because it could not consider LSN leases when requests arrived shortly after restart. Closes https://github.com/neondatabase/neon/issues/9754 ## Summary of changes This is the "Option 3" discussed verbally -- rather than holding back gc cutoff, we waive the usual validation of request LSN if we are still waiting for leases to be sent after startup - When validating LSN in `wait_or_get_last_lsn`, skip the validation relative to GC cutoff if the timeline is still in its LSN lease grace period - Re-enable test_readonly_node_gc	2024-11-22 09:24:23 +00:00
Fedor Dikarev	83b73fc24e	Batch scrape workflows up to last 30 days and stop ad-hoc (#9846 ) Comparing the Batch and Ad-hoc collectors, there is no big difference; we just need to scrape over a longer duration to catch retries. Dashboard with comparison: https://neonprod.grafana.net/d/be3pjm7c9ne2oe/compare-ad-hoc-and-batch?orgId=1&from=1731345095814&to=1731946295814 I should raise a support case with GitHub about that anyway; meanwhile, this should be a working solution and should save us some cost, so it is worth switching to Batch now. Ref: https://github.com/neondatabase/cloud/issues/17503	2024-11-22 09:06:00 +00:00
Peter Bendel	1e05e3a6e2	minor PostgreSQL update in benchmarking (#9845 ) ## Problem in benchmarking.yml job pgvector we install postgres from deb packages. After the minor postgres update the referenced packages no longer exist [Failing job: ](https://github.com/neondatabase/neon/actions/runs/11965785323/job/33360391115#step:4:41) ## Summary of changes Reference and install the updated packages. [Successful job after this fix](https://github.com/neondatabase/neon/actions/runs/11967959920/job/33366011934#step:4:45)	2024-11-22 08:31:54 +00:00
Tristan Partin	37962e729e	Fix panic in compute_ctl metrics collection (#9831 ) Calling unwrap on the encoder is a little overzealous. One of the errors that the encode function can return is the non-existence of metrics for a metric family, so we should filter out such metric families before encoding. I believe this panic was caused by a race condition between the prometheus collector and the compute collecting the installed extensions metric for the first time. The HTTP server is spawned on a separate thread before we even start bringing up Postgres. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-11-21 20:19:02 +00:00
Erik Grinaker	190e8cebac	safekeeper,pageserver: add CPU profiling (#9764 ) ## Problem We don't have a convenient way to gather CPU profiles from a running binary, e.g. during production incidents or end-to-end benchmarks, nor during microbenchmarks (particularly on macOS). We would also like to have continuous profiling in production, likely using [Grafana Cloud Profiles](https://grafana.com/products/cloud/profiles-for-continuous-profiling/). We may choose to use either eBPF profiles or pprof profiles for this (pending testing and discussion with SREs), but pprof profiles appear useful regardless for the reasons listed above. See https://github.com/neondatabase/cloud/issues/14888. This PR is intended as a proof of concept, to try it out in staging and drive further discussions about profiling more broadly. Touches #9534. Touches https://github.com/neondatabase/cloud/issues/14888. ## Summary of changes Adds a HTTP route `/profile/cpu` that takes a CPU profile and returns it. Defaults to a 5-second pprof Protobuf profile for use with e.g. `pprof` or Grafana Alloy, but can also emit an SVG flamegraph. Query parameters: * `format`: output format (`pprof` or `svg`) * `frequency`: sampling frequency in microseconds (default 100) * `seconds`: number of seconds to profile (default 5) Also integrates pprof profiles into Criterion benchmarks, such that flamegraph reports can be taken with `cargo bench ... --profile-duration <seconds>`. Output under `target/criterion//profile/flamegraph.svg`. Example profiles: pprof profile (use [`pprof`](https://github.com/google/pprof)): [profile.pb.gz](https://github.com/user-attachments/files/17756788/profile.pb.gz) * Web interface: `pprof -http :6060 profile.pb.gz` * Interactive flamegraph: [profile.svg.gz](https://github.com/user-attachments/files/17756782/profile.svg.gz)	2024-11-21 18:59:46 +00:00
Conrad Ludgate	725a5ff003	fix(proxy): CancelKeyData display log masking (#9838 ) Fixes the masking for the CancelKeyData display format. Due to negative i32 cast to u64, the top-bits all had `0xffffffff` prefix. On the bitwise-or that followed, these took priority. This PR also compresses 3 logs during sql-over-http into 1 log with durations as label fields, as prior discussed.	2024-11-21 16:46:30 +00:00
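The sign-extension pitfall described in #9838 is easy to reproduce in isolation. The sketch below is not the proxy's code: the function names and the choice of which half goes into the high bits are assumptions. It only shows that casting a negative `i32` directly to `u64` sets all upper bits, and that going through `u32` first avoids this.

```rust
/// Buggy packing: `i32 as u64` sign-extends, so a negative low half
/// fills the upper 32 bits with ones and wins in the bitwise OR.
fn pack_buggy(high: i32, low: i32) -> u64 {
    ((high as u64) << 32) | (low as u64)
}

/// Fixed packing: reinterpret each half as `u32` first, so the
/// widening cast to `u64` zero-extends instead of sign-extending.
fn pack_fixed(high: i32, low: i32) -> u64 {
    ((high as u32 as u64) << 32) | (low as u32 as u64)
}

fn main() {
    // Hypothetical values: a small positive high half, a negative low half.
    println!("buggy: {:#018x}", pack_buggy(1, -1));
    println!("fixed: {:#018x}", pack_fixed(1, -1));
}
```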
Alexander Bayandin	8d1c44039e	Python 3.11 (#9515 ) ## Problem On Debian 12 (Bookworm), Python 3.11 is the latest available version. ## Summary of changes - Update Python to 3.11 in build-tools - Fix ruff check / format - Fix mypy - Use `StrEnum` instead of pair `str`, `Enum` - Update docs	2024-11-21 16:25:31 +00:00
Konstantin Knizhnik	0713ff3176	Bump Postgres version (#9808 ) ## Problem I made a mistake when merging Postgres PRs ## Summary of changes Restore consistency of the referenced submodules. Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-11-21 14:56:56 +00:00
John Spray	42bda5d632	pageserver: revise metrics lifetime for SecondaryTenant (#9818 ) ## Problem We saw a scale test failure when one shard went secondary->attached->secondary in a short period of time -- the metrics for the shard failed a validation assertion that is meant to ensure the size metric matches the sum of layer sizes in the SecondaryDetail struct. This appears to be due to two SecondaryTenants being alive at the same time -- the first one was shut down but still had its contributions to the metrics. Closes: https://github.com/neondatabase/neon/issues/9628 ## Summary of changes - Refactor code for validating metrics and call it in shutdown as well as during downloads - Move code for dropping per-tenant secondary metrics from drop() into shutdown(), so that once shutdown() completes it is definitely safe to instantiate another SecondaryTenant for the same tenant.	2024-11-21 08:31:24 +00:00
Ivan Efremov	995e729ebe	Merge pull request #9832 from neondatabase/rc/release-proxy/2024-11-21 Proxy release 2024-11-21	2024-11-21 09:41:31 +02:00
github-actions[bot]	76077e1ddf	Proxy release 2024-11-21	2024-11-21 06:02:11 +00:00
Arpad Müller	59c2c3f8ad	compute_ctl: print OpenTelemetry errors via tracing, not stdout (#9830 ) Before, `OpenTelemetry` errors were printed to stdout/stderr directly, causing one of the few log lines without a timestamp, like: ``` OpenTelemetry trace error occurred. error sending request for url (http://localhost:4318/v1/traces) ``` Now, we print: ``` 2024-11-21T02:24:20.511160Z INFO OpenTelemetry error: error sending request for url (http://localhost:4318/v1/traces) ``` I found this while investigating #9731.	2024-11-21 04:46:01 +00:00
Ivan Efremov	2d6bf176a0	proxy: Refactor http conn pool (#9785 ) - Use the same ConnPoolEntry for http connection pool. - Rename EndpointConnPool to the HttpConnPool. - Narrow clone bound for client Fixes #9284	2024-11-20 19:36:29 +00:00
Vadim Kharitonov	313ebfdb88	[proxy] chore: allow bypassing empty `params` to `/sql` endpoint (#9827 ) ## Problem ``` curl -H "Neon-Connection-String: postgresql://neondb_owner:PASSWORD@ep-autumn-rain-a58lubg0.us-east-2.aws.neon.tech/neondb?sslmode=require" https://ep-autumn-rain-a58lubg0.us-east-2.aws.neon.tech/sql -d '{"query":"SELECT 1","params":[]}' ``` For such a query, I also need to send `params`. Do I really need it? ## Summary of changes I've marked `params` as optional	2024-11-20 19:36:23 +00:00
Arpad Müller	811fab136f	scrubber: allow restricting find_garbage to a partial tenant id prefix (#9814 ) Adds support to the `find_garbage` command to restrict itself to a partial tenant ID prefix, say `a`, and then it only traverses tenants with IDs starting with `a`. One can now pass the `--tenant-id-prefix` parameter. That way, one can shard the `find_garbage` command and make it run in parallel. The PR also changes how `remote_storage` first removes trailing `/`s, only to then add them in the listing function. It turns out that this isn't necessary and it prevents the prefix functionality from working. S3 doesn't do this either.	2024-11-20 19:31:02 +00:00
Vlad Lazar	ee26f09e45	pageserver: remove shard split hard link assertion (#9829 ) ## Problem We were hitting this assertion in debug mode tests sometimes. This case was being hit when the parent shard has no resident layers. For instance, this is the case on split retry where the previous attempt shut-down the parent and deleted local state for it. If the logical size calculation does not download some layers before we get to the hardlinking, then the assertion is hit. ## Summary of Changes Remove the assertion. It's fine for the ancestor to not have any resident layers at the time of the split. Closes https://github.com/neondatabase/neon/issues/9412	2024-11-20 18:33:05 +00:00
Conrad Ludgate	f36f0068b8	chore(proxy): demote more logs during successful connection attempts (#9828 ) Follow up to #9803 See https://github.com/neondatabase/cloud/issues/14378 In collaboration with @cloneable and @awarus, we sifted through logs and simply demoted some logs to debug. This is not at all finished and there are more logs to review, but we ran out of time in the session we organised. In any slightly more nuanced cases, we didn't touch the log, instead leaving a TODO comment. I've also slightly refactored the sql-over-http body read/length reject code. I can split that into a separate PR. It just felt natural after I switched to `read_body_with_limit` as we discussed during the meet.	2024-11-20 17:50:39 +00:00
John Spray	5ff2f1ee7d	pageserver: enable compaction to proceed while live-migrating (#5397 ) ## Problem Long ago, in #5299 the tenant states for migration were added, but they are respected only in a coarse-grained way: when hinted not to do deletions, tenants will just avoid doing all GC or compaction. Skipping compaction is not necessary for AttachedMulti, as we will soon become the primary attached location, and it is not a waste of resources to proceed with compaction. Instead, per the RFC (https://github.com/neondatabase/neon/pull/5029/files), deletions should be queued up in this state, and executed later when we switch to AttachedSingle. Avoiding compaction in AttachedMulti can have an operational impact if a tenant is under significant write load, as a long-running migration can result in a large accumulation of delta layers with commensurate impact on read latency. Closes: https://github.com/neondatabase/neon/issues/5396 ## Summary of changes - Add a 'config' part to RemoteTimelineClient so that it can be aware of the mode of the tenant it belongs to, and wire this through for construction + updates - Add a special buffer for delayed deletions, and when in AttachedMulti route deletions here instead of into the main remote client queue. This is drained when transitioning to AttachedSingle. If the tenant is detached or our process dies before then, then these objects are leaked. - As a quality of life improvement, also use the remote timeline client's knowledge of the tenant state to avoid submitting remote consistent LSN updates for validation when in AttachedStale (as we know these will fail)	2024-11-20 17:31:55 +00:00
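The delayed-deletion buffer from #5397 can be sketched as follows. This is a minimal model under stated assumptions, not the real `RemoteTimelineClient`: the `RemoteClient` struct, its field names, and the two-variant `Mode` enum are hypothetical, and deletions are represented as plain object-key strings.

```rust
/// Hypothetical subset of the tenant attachment modes named in the commit.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Mode {
    AttachedSingle,
    AttachedMulti,
}

/// Hypothetical, simplified remote client that knows its tenant's mode.
struct RemoteClient {
    mode: Mode,
    /// Main queue: deletions here are eligible for execution.
    deletion_queue: Vec<String>,
    /// Buffer for deletions that must wait until we are AttachedSingle.
    delayed_deletions: Vec<String>,
}

impl RemoteClient {
    fn new(mode: Mode) -> Self {
        RemoteClient { mode, deletion_queue: Vec::new(), delayed_deletions: Vec::new() }
    }

    /// Route a deletion according to the current mode.
    fn schedule_deletion(&mut self, key: &str) {
        match self.mode {
            Mode::AttachedMulti => self.delayed_deletions.push(key.to_string()),
            Mode::AttachedSingle => self.deletion_queue.push(key.to_string()),
        }
    }

    /// Update the mode; drain the delayed buffer on entering AttachedSingle.
    fn set_mode(&mut self, mode: Mode) {
        self.mode = mode;
        if mode == Mode::AttachedSingle {
            self.deletion_queue.append(&mut self.delayed_deletions);
        }
    }
}

fn main() {
    let mut client = RemoteClient::new(Mode::AttachedMulti);
    // Compaction proceeds during migration, but its deletions are held back.
    client.schedule_deletion("layer-a");
    client.schedule_deletion("layer-b");
    println!("queued while migrating: {}", client.deletion_queue.len());
    client.set_mode(Mode::AttachedSingle);
    println!("queued after migration: {}", client.deletion_queue.len());
}
```

Because the buffer in this sketch lives only in memory, anything still in `delayed_deletions` when the process exits is never deleted, which mirrors the leak caveat in the commit message.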
John Spray	67f5f83edc	pageserver: avoid reading SLRU blocks for GC on shards >0 (#9423 ) ## Problem SLRU blocks, which can add up to several gigabytes, are currently ingested by all shards, multiplying their capacity cost by the shard count and slowing down ingest. We do this because all shards need the SLRU pages to do timestamp->LSN lookup for GC. Related: https://github.com/neondatabase/neon/issues/7512 ## Summary of changes - On non-zero shards, learn the GC offset from shard 0's index instead of calculating it. - Add a test `test_sharding_gc` that exercises this - Do GC in test_pg_regress as a general smoke test that GC functions run (e.g. this would fail if we were using SLRUs we didn't have) In this PR we are still ingesting SLRUs everywhere, but not using them any more. Part 2 PR (https://github.com/neondatabase/neon/pull/9786) makes the change to not store them at all.	2024-11-20 15:56:14 +00:00
John Spray	593e35027a	tests: use fewer pageservers in test_sharding_split_smoke (#9804 ) ## Problem This test uses a gratuitous number of pageservers (16). This works fine when there are plenty of system resources, but causes issues on test runners that have limited resources and run many tests concurrently. Related: https://github.com/neondatabase/neon/issues/9802 ## Summary of changes - Split from 2 shards to 4, instead of 4 to 8 - Don't give every shard a separate pageserver, let two locations share each pageserver. Net result is 4 pageservers instead of 16	2024-11-20 14:57:59 +00:00
Folke Behrens	bf7d859a8b	proxy: Rename RequestMonitoring to RequestContext (#9805 ) ## Problem It is called context/ctx everywhere and the Monitoring suffix needlessly confuses with proper monitoring code. ## Summary of changes * Rename RequestMonitoring to RequestContext * Rename RequestMonitoringInner to RequestContextInner	2024-11-20 12:50:36 +00:00
Alexander Bayandin	899933e159	scan_log_for_errors: check that regex is correct (#9815 ) ## Problem I've noticed that we have 2 flaky tests which failed with error: ``` re.error: missing ), unterminated subpattern at position 21 ``` - `test_timeline_archival_chaos` — has been already fixed - `test_sharded_tad_interleaved_after_partial_success` — I didn't manage to find the incorrect regex [Internal link](https://neonprod.grafana.net/goto/yfmVHV7NR?orgId=1) ## Summary of changes - Wrap `re.match` in `try..except` block and print incorrect regex	2024-11-20 12:48:21 +00:00
Alexander Bayandin	46beecacce	CI(benchmarking): route test failures to on-call-qa-staging-stream (#9813 ) ## Problem We want to keep `#on-call-staging-stream` channel close to the prod one and redirect notifications from failing benchmarks to another channel for investigation. ## Summary of changes - Send notifications regarding failures in `benchmarking` job to `#on-call-qa-staging-stream` - Send notifications regarding failures in `periodic_pagebench` job to `#on-call-qa-staging-stream`	2024-11-20 12:23:41 +00:00
Fedor Dikarev	94e4a0e2a0	update macos version for runner (#9817 ) Closes: https://github.com/neondatabase/neon/issues/9816 Run MacOs builds on `macos-15`. As `pkg-config` is bundled in runner image, don't install it with `brew`	2024-11-20 13:04:14 +01:00
John Spray	33dce25af8	safekeeper: block deletion on protocol handler shutdown (#9364 ) ## Problem Two recently observed log errors indicate safekeeper tasks for a timeline running after that timeline's deletion has started. - https://github.com/neondatabase/neon/issues/8972 - https://github.com/neondatabase/neon/issues/8974 These code paths do not have a mechanism that coordinates task shutdown with the overall shutdown of the timeline. ## Summary of changes - Add a `Gate` to `Timeline` - Take the gate as part of resident timeline guard: any code that holds a guard over a timeline staying resident should also hold a guard over the timeline's total lifetime. - Take the gate from the wal removal task - Respect Timeline::cancel in WAL send/recv code, so that we do not block shutdown indefinitely. - Add a test that deletes timelines with open pageserver+compute connections, to check these get torn down as expected. There is some risk to introducing gates: if there is code holding a gate which does not properly respect a cancellation token, it can cause shutdown hangs. The risk of this for safekeepers is lower in practice than it is for other services, because in a healthy timeline deletion, the compute is shutdown first, then the timeline is deleted on the pageserver, and finally it is deleted on the safekeepers -- that makes it much less likely that some protocol handler will still be running. Closes: #8972 Closes: #8974	2024-11-20 11:07:45 +00:00
Conrad Ludgate	3ae0b2149e	chore(proxy): demote a ton of logs for successful connection attempts (#9803 ) See https://github.com/neondatabase/cloud/issues/14378 In collaboration with @cloneable and @awarus, we sifted through logs and simply demoted some logs to debug. This is not at all finished and there are more logs to review, but we ran out of time in the session we organised. In any slightly more nuanced cases, we didn't touch the log, instead leaving a TODO comment.	2024-11-20 10:14:28 +00:00
Arpad Müller	0a499a3176	Don't preload offloaded timelines (#9646 ) In timeline preloading, we also do a preload for offloaded timelines. This includes the download of `index-part.json`. Ultimately, such a download is wasteful, therefore avoid it. Same goes for the remote client, we just discard it immediately thereafter. Part of #8088 --------- Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-11-20 05:44:23 +00:00
Matthias van de Meent	ea1858e3b6	compute_ctl: Streamline and Pipeline startup SQL (#9717 ) Before, compute_ctl didn't have a good registry for what command would run when, depending exclusively on sync code to apply changes. When users have many databases/roles to manage, this step can take a substantial amount of time, breaking assumptions about low (re)start times in other systems. This commit reduces the time compute_ctl takes to restart when changes must be applied, by making all commands more or less blind writes, and applying these commands in an asynchronous context, only waiting for completion once we know the commands have all been sent. Additionally, this reduces time spent by batching per-database operations where previously we would create a new SQL connection for every user-database operation we planned to execute.	2024-11-20 02:14:58 +01:00
Alexander Bayandin	2281a02c49	CODEOWNERS: add developer-productivity team (#9810 ) Notify @neondatabase/developer-productivity team about changes in CI (i.e. in `.github/` directory)	2024-11-20 00:30:24 +00:00
Alexander Bayandin	725e0a1ac9	CI(release): create reusable workflow for releases (#9806 ) ## Problem We have a bunch of duplicated code for automated releases. There will be even more, once we have `release-compute` branch (https://github.com/neondatabase/neon/pull/9637). Another issue with the current `release` workflow is that it creates a PR from the main as is. If we create 2 different releases from the same commit, GitHub could mix up results from different PRs. ## Summary of changes - Create a reusable workflow for releases - Create an empty commit to differentiate releases	2024-11-19 23:03:15 +00:00
Konstantin Knizhnik	770ac34ae6	Register custom xlog reader callbacks for on-demand WAL download in StartupDecodingContext (#9007 ) ## Problem See https://github.com/neondatabase/neon/issues/8931 On-demand WAL download callbacks are not set in all cases where WAL is accessed by logical replication ## Summary of changes Set custom xlog reader callbacks in StartupDecodingContext Related changes in Postgres modules: https://github.com/neondatabase/postgres/pull/495 https://github.com/neondatabase/postgres/pull/496 https://github.com/neondatabase/postgres/pull/497 https://github.com/neondatabase/postgres/pull/498 --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-11-19 22:29:57 +02:00
Alex Chi Z.	b22a84a7bf	feat(pageserver): support key range for manual compaction trigger (#9723 ) part of https://github.com/neondatabase/neon/issues/9114, we want to be able to run partial gc-compaction in tests. In the future, we can also expand this functionality to legacy compaction, so that we can trigger compaction for a specific key range. ## Summary of changes * Support passing compaction key range through pageserver routes. * Refactor input parameters of compact related function to take the new `CompactOptions`. * Add tests for partial compaction. Note that the test may or may not trigger compaction based on GC horizon. We need to improve the test case to ensure things always get below the gc_horizon and the gc-compaction can be triggered. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-11-19 19:38:41 +00:00
Arpad Müller	b092126c94	scrubber: fix parsing issue with Azure (#9797 ) Apparently Azure returns timelines ending with `/` which confuses the parsing. So remove all trailing `/`s before attempting to parse. Part of https://github.com/neondatabase/cloud/issues/19963	2024-11-19 20:10:53 +01:00
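A minimal sketch of the trailing-slash handling described in #9797, assuming a hypothetical helper `last_segment` and a hypothetical key layout (the scrubber's real parsing code is not shown in this log):

```rust
/// Return the last non-empty path segment of a listing key,
/// ignoring any number of trailing `/` characters.
fn last_segment(key: &str) -> Option<&str> {
    let trimmed = key.trim_end_matches('/');
    trimmed.rsplit('/').next().filter(|segment| !segment.is_empty())
}

fn main() {
    // Hypothetical keys: one as a backend might return it with a trailing
    // slash, one without. Both yield the same final segment.
    for key in ["tenants/t1/timelines/tl1/", "tenants/t1/timelines/tl1"] {
        println!("{key} -> {:?}", last_segment(key));
    }
}
```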
Alex Chi Z.	5e3fbef721	fix(pageserver): queue stopped error should be ignored during create timeline (#9767 ) close https://github.com/neondatabase/neon/issues/9730 The test case checks whether anything goes wrong when the pageserver restarts while a timeline creation is still incomplete. A "queue is stopped" error is therefore normal in this case, but it should be categorized as a shutdown error instead of a real error. ## Summary of changes * More comments for the test case. * Queue stopped error will now be forwarded as CreateTimelineError::ShuttingDown. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-11-19 14:10:09 -05:00
dependabot[bot]	15468cd23c	build(deps): bump aiohttp from 3.10.2 to 3.10.11 (#9794 )	2024-11-19 19:08:00 +00:00
Peter Bendel	a8ac895b83	re-acquire S3 OIDC token after long running tests for report upload to S3 (#9799 ) ## Problem If a benchmark or test case runs longer than the AWS OIDC token lifetime, the subsequent upload of test reports to S3 fails - example: https://github.com/neondatabase/neon/actions/runs/11905529176/job/33176168174#step:9:243 ## Summary of changes In actions that require access to S3 and which are invoked after a long-running Python test case, we re-acquire the OIDC token explicitly. Note that we need to pass down the aws_oicd_role_arn from the workflow to the action because actions have no access to GitHub vars for security reasons. Sample run https://github.com/neondatabase/neon/actions/runs/11912328276/job/33195676867	2024-11-19 18:22:51 +01:00
Heikki Linnakangas	ada84400b7	PostgreSQL minor version updates (17.2, 16.6, 15.10, 14.15) (#9795 ) The community decided to make a new off-schedule release due to ABI breakage in last week's release. We're not affected by the ABI breakage because we rebuild all extensions in our docker images, but let's stay up-to-date. There were a few other fixes in the release too.	2024-11-19 17:01:05 +02:00
Conrad Ludgate	191f745c81	fix(proxy/auth_broker): ignore -pooler suffix (#9800 ) Fixes https://github.com/neondatabase/cloud/issues/20400 We cannot mix local_proxy and pgbouncer, so we are filtering out the `-pooler` suffix prior to calling wake_compute.	2024-11-19 13:58:26 +00:00
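The suffix filtering from #9800 amounts to a one-line string operation. The helper name `strip_pooler_suffix` is hypothetical and the proxy's actual endpoint type is not shown here; the sketch only illustrates dropping a trailing `-pooler` before the endpoint name is used further.

```rust
/// Drop a trailing `-pooler` from an endpoint name, if present.
fn strip_pooler_suffix(endpoint: &str) -> &str {
    endpoint.strip_suffix("-pooler").unwrap_or(endpoint)
}

fn main() {
    // Endpoint name reused from another commit message in this log.
    for name in ["ep-autumn-rain-a58lubg0-pooler", "ep-autumn-rain-a58lubg0"] {
        println!("{name} -> {}", strip_pooler_suffix(name));
    }
}
```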
Conrad Ludgate	37b97b3a68	chore(local_proxy): reduce some startup logging (#9798 ) Currently, local_proxy will write an error log if it doesn't find the config file. This is expected for startup, so it's just noise. It is an error if we do receive an explicit SIGHUP though. I've also demoted the build info logs to be debug level. We don't need them in the compute image since we have other ways to determine what code is running. Lastly, I've demoted SIGHUP signal handling from warn to info, since it's not really a warning event. See https://github.com/neondatabase/cloud/issues/10880 for more details	2024-11-19 13:58:11 +00:00
Konstantin Knizhnik	c9acd214ae	Do not create DSM segment for wal_redo_postgres (#9793 ) ## Problem See https://github.com/neondatabase/neon/issues/9738 ## Summary of changes Do not create DSM segment for wal_redo Postgres --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-11-19 11:56:40 +02:00
Peter Bendel	982cb1c15d	Move logic for ingest benchmark from GitHub workflow into python testcase (#9762 ) ## Problem The first version of the ingest benchmark had some parsing and reporting logic in shell script inside GitHub workflow. it is better to move that logic into a python testcase so that we can also run it locally. ## Summary of changes - Create new python testcase - invoke pgcopydb inside python test case - move the following logic into python testcase - determine backpressure - invoke pgcopydb and report its progress - parse pgcopydb log and extract metrics - insert metrics into perf test database - add additional column to perf test database that can receive endpoint ID used for pgcopydb run to have it available in grafana dashboard when retrieving other metrics for an endpoint ## Example run https://github.com/neondatabase/neon/actions/runs/11860622170/job/33056264386	2024-11-19 09:46:46 +00:00
Arpad Müller	9b6af2bcad	Add the ability to configure GenericRemoteStorage for the scrubber (#9652 ) Earlier work (#7547) has made the scrubber internally generic, but one could only configure it to use S3 storage. This is the final piece to make the scrubber (most of it; snapshotting still requires S3) configurable via GenericRemoteStorage. I.e. you can now set an env var like: ``` REMOTE_STORAGE_CONFIG='remote_storage = { bucket_name = "neon-dev-safekeeper-us-east-2d", bucket_region = "us-east-2" }' ``` and the scrubber will read it instead.	2024-11-18 21:01:48 +00:00
Arpad Müller	4fc3af15dd	Remove at most one retain_lsn entry from (possibly offloaded) timeline's parent (#9791 ) There is a potential data corruption issue, not one I've encountered, but it's still not hard to hit with some correct looking code given our current architecture. It has to do with the timeline's memory object storage via reference counted `Arc`s, and the removal of `retain_lsn` entries at the drop of the last `Arc` reference. The corruption steps are as follows: 1. timeline gets offloaded. timeline object A doesn't get dropped though, because some long-running task accesses it 2. the same timeline gets unoffloaded again. timeline object B gets created for it, timeline object A still referenced. both point to the same timeline. 3. the task keeping the reference to timeline object A exits. destructor for object A runs, removing `retain_lsn` in the timeline's parent. 4. the timeline's parent runs gc without the `retain_lsn` of the still extant timeline's child, leading to data corruption. In general we are susceptible each time we recreate a `Timeline` object in the same process, which happens both during a timeline offload/unoffload cycle, as well as during an ancestor detach operation. The solution this PR implements is to make the destructor for a timeline as well as an offloaded timeline remove at most one `retain_lsn`. PR #9760 has added a log line to print the refcounts at timeline offload, but this only detects one of the places where we do such a recycle operation. Plus it doesn't prevent the actual issue. I doubt that this occurs in practice. It is more a defense in depth measure. Usually I'd assume that the timeline gets dropped immediately in step 1, as there are no background tasks referencing it after its shutdown. But one never knows, and reducing the stakes of step 1 actually occurring is a really good idea, from potential data corruption to waste of CPU time. Part of #8088	2024-11-18 21:42:19 +01:00
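The difference between removing every matching `retain_lsn` entry and removing at most one can be sketched as below. The entry type `(u64, u32)` (an LSN paired with a child identifier) and both helper names are assumptions for illustration; the pageserver's real data structure is not shown in this log.

```rust
/// Hypothetical entry: (retain LSN, child timeline identifier).
type RetainEntry = (u64, u32);

/// Removes every matching entry. If two in-memory objects registered
/// the same entry, dropping one of them erases the other's protection.
fn remove_all(entries: &mut Vec<RetainEntry>, entry: RetainEntry) {
    entries.retain(|e| *e != entry);
}

/// Removes at most one matching entry and reports whether it found one.
fn remove_at_most_one(entries: &mut Vec<RetainEntry>, entry: RetainEntry) -> bool {
    match entries.iter().position(|e| *e == entry) {
        Some(index) => {
            entries.remove(index);
            true
        }
        None => false,
    }
}

fn main() {
    // Objects A and B for the same child timeline both registered an entry.
    let mut parent = vec![(100, 7), (100, 7)];
    remove_at_most_one(&mut parent, (100, 7)); // object A is dropped
    println!("after dropping A: {parent:?}"); // B's entry survives

    let mut parent = vec![(100, 7), (100, 7)];
    remove_all(&mut parent, (100, 7));
    println!("with remove_all: {parent:?}"); // nothing protects the child
}
```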
Vlad Lazar	d7662fdc7b	feat(page_service): timeout-based batching of requests (#9321 ) ## Problem We don't take advantage of queue depth generated by the compute on the pageserver. We can process getpage requests more efficiently by batching them. ## Summary of changes Batch up incoming getpage requests that arrive within a configurable time window (`server_side_batch_timeout`). Then process the entire batch via one `get_vectored` timeline operation. By default, no merging takes place. ## Testing * Functional: https://github.com/neondatabase/neon/pull/9792 * Performance: will be done in staging/pre-prod # Refs * https://github.com/neondatabase/neon/issues/9377 * https://github.com/neondatabase/neon/issues/9376 Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-11-18 20:24:03 +00:00
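Timeout-based batching as described in #9321 can be sketched with a standard-library channel. This is not the page service implementation: `next_batch`, the `max_batch` cap, and the use of `std::sync::mpsc` are assumptions for illustration, and the `window` parameter merely plays the role that `server_side_batch_timeout` plays in the commit message.

```rust
use std::sync::mpsc::{self, Receiver};
use std::time::{Duration, Instant};

/// Wait for one request, then keep collecting requests that arrive
/// within `window`, up to `max_batch` items. Returns an empty batch
/// only if the channel is closed and drained.
fn next_batch<T>(rx: &Receiver<T>, window: Duration, max_batch: usize) -> Vec<T> {
    let mut batch = Vec::new();
    // Block until the first request arrives (or all senders are gone).
    match rx.recv() {
        Ok(first) => batch.push(first),
        Err(_) => return batch,
    }
    let deadline = Instant::now() + window;
    while batch.len() < max_batch {
        let remaining = deadline.saturating_duration_since(Instant::now());
        match rx.recv_timeout(remaining) {
            Ok(request) => batch.push(request),
            Err(_) => break, // window elapsed or channel closed
        }
    }
    batch
}

fn main() {
    let (tx, rx) = mpsc::channel();
    for page in [1u32, 2, 3] {
        tx.send(page).unwrap();
    }
    drop(tx);
    // The whole batch would then be handed to one vectored read.
    let first = next_batch(&rx, Duration::from_millis(5), 2);
    let second = next_batch(&rx, Duration::from_millis(5), 2);
    println!("first={first:?} second={second:?}");
}
```

A larger window lets more requests share a batch at the cost of added latency for the first request in each batch.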
Alex Chi Z.	e5c89f3da3	feat(pageserver): drop disposable keys during gc-compaction (#9765 ) close https://github.com/neondatabase/neon/issues/9552, close https://github.com/neondatabase/neon/issues/8920, part of https://github.com/neondatabase/neon/issues/9114 ## Summary of changes * Drop keys not belonging to this shard during gc-compaction to avoid constructing history that might have been truncated during shard compaction. * Run gc-compaction at the end of shard compaction test. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-11-18 19:27:52 +00:00
Alexey Kondratov	5f0e9c9a94	feat(compute/tests): Report successful replication test runs as well (#9787 ) It should increase the visibility of whether they run and pass.	2024-11-18 16:05:09 +00:00
Alexander Bayandin	44f33b2bd6	Bump default Postgres version for tests to v17 (#9777 ) ## Problem Tests that are marked with `run_only_on_default_postgres` do not run on debug builds on CI because we run debug builds only for the latest Postgres version (which is 17) ## Summary of changes - Bump `PgVersion.DEFAULT` to `v17` - Skip `test_timeline_archival_chaos` in debug builds	2024-11-18 15:06:24 +00:00
Alexander Bayandin	913b5b7027	CI: remove separate check-build-tools-image workflow (#9708 ) ## Problem We call `check-build-tools-image` twice for each workflow whenever we use it, along with `build-build-tools-image`, once as a workflow itself, and the second time from `build-build-tools-image`. This is not necessary. ## Summary of changes - Inline `check-build-tools-image` into `build-build-tools-image` - Remove separate `check-build-tools-image` workflow	2024-11-18 13:14:28 +00:00
John Spray	3f401a328f	tests: mitigate bug to stabilize test_storage_controller_many_tenants (#9771 ) ## Problem Due to #9471 , the scale test occasionally gets 404s while trying to modify the config of a timeline that belongs to a tenant being migrated. We rarely see this narrow race in the field, but the test is quite good at reproducing it. ## Summary of changes - Ignore 404 errors in this test.	2024-11-18 11:33:27 +00:00
Peter Bendel	c3eecf6763	adapt pgvector bench to minor version upgrades of PostgreSQL (#9784 ) ## Problem The pgvector benchmark is failing because, after a PostgreSQL minor version upgrade, the previous version's packages are no longer available in the deb repository [example failure](https://github.com/neondatabase/neon/actions/runs/11875503070/job/33092787149#step:4:40) ## Summary of changes Update the Postgres minor version of the packages to the current version [Example run after this change](https://github.com/neondatabase/neon/actions/runs/11888978279/job/33124614605)	2024-11-18 10:47:43 +00:00
Konstantin Knizhnik	6fa9b0cd8c	Use DATA_DIR instead of current working directory in restore_from_wal script (#9729 ) ## Problem See https://github.com/neondatabase/neon/issues/7750 test_wal_restore.sh copies a file to the current working directory, which can cause interference between test_wal_restore.py tests spawned with different configurations. ## Summary of changes Copy file to $DATA_DIR Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-11-18 11:55:38 +02:00
a-masterov	10bc1903e1	Fix the regression test running against the staging instance (#9773 ) ## Problem The Postgres version was updated. The patch has to be updated accordingly. ## Summary of changes The patch of the regression test was updated.	2024-11-18 10:30:50 +01:00
John Spray	261d065e6f	pageserver: respect no_sync in `VirtualFile` (#9772 ) ## Problem `no_sync` initially just skipped syncfs on startup (#9677). I'm also interested in flaky tests that time out during pageserver shutdown while flushing l0s, so to eliminate disk throughput as a source of issues there, `no_sync` should be respected by `VirtualFile` as well. ## Summary of changes - Drive-by change for test timeouts: add a couple more ::info logs during pageserver startup so it's obvious which part got stuck. - Add a SyncMode enum to configure VirtualFile and respect it in sync_all and sync_data functions - During pageserver startup, set SyncMode according to `no_sync`	2024-11-18 08:59:05 +00:00
Christian Schwarz	b6154b03f4	build(deps): bump smallvec to 1.13.2 to get UB fix (#9781 ) Smallvec 1.13.2 contains [a UB fix](https://github.com/servo/rust-smallvec/pull/345). Upstream opened [a request](https://github.com/rustsec/advisory-db/issues/1960) for this in the advisory-db but it never got acted upon. Found while working on https://github.com/neondatabase/neon/pull/9321.	2024-11-17 21:25:16 +01:00
Erik Grinaker	8880134171	Cargo.toml: upgrade tikv-jemallocator to 0.6.0 (#9779 )	2024-11-17 19:52:05 +01:00
Erik Grinaker	de7e4a34ca	safekeeper: send `AppendResponse` on segment flush (#9692 ) ## Problem When processing pipelined `AppendRequest`s, we explicitly flush the WAL every second and return an `AppendResponse`. However, the WAL is also implicitly flushed on segment bounds, but this does not result in an `AppendResponse`. Because of this, concurrent transactions may take up to 1 second to commit and writes may take up to 1 second before sending to the pageserver. ## Summary of changes Advance `flush_lsn` when a WAL segment is closed and flushed, and emit an `AppendResponse`. To accommodate this, track the `flush_lsn` in addition to the `flush_record_lsn`.	2024-11-17 18:19:14 +01:00
Vlad Lazar	ac689ab014	wal_decoder: rename end_lsn to next_record_lsn (#9776 ) ## Problem It turns out that `WalStreamDecoder::poll_decode` returns the start LSN of the next record and not the end LSN of the current record. They are not always equal. For example, they're not equal when the record in question is an XLOG SWITCH record. ## Summary of changes Rename things to reflect that.	2024-11-15 21:53:11 +00:00
Tristan Partin	23eabb9919	Fix PG_MAJORVERSION_NUM typo In `ea32f1d0a3`, Matthias added a feature to our extension to expose more granular wait events. However, due to the typo, those wait events were never registered, so we used the more generic wait events instead. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-11-15 15:17:23 -06:00
Vlad Lazar	2af791ba83	wal_decoder: make InterpretedWalRecord serde (#9775 ) ## Problem We want to serialize interpreted records to send them over the wire from safekeeper to pageserver. ## Summary of changes Make `InterpretedWalRecord` ser/de. This is a temporary change to get the bulk of the lift merged in https://github.com/neondatabase/neon/pull/9746. For going to prod, we don't want to use bincode since we can't evolve the schema. Questions on serialization will be tackled separately.	2024-11-15 20:34:48 +00:00
Mikhail Kot	e12628fe93	Collect max_connections metric (#9770 ) This will further allow us to expose this metric to users	2024-11-15 17:42:41 +00:00
Arpad Müller	7880c246f1	Correct mistakes in offloaded timeline retain_lsn management (#9760 ) PR #9308 has modified tenant activation code to take offloaded child timelines into account for populating the list of `retain_lsn` values. However, there are more places than just tenant activation where one needs to update the `retain_lsn`s. This PR fixes some bugs of the current code that could lead to corruption in the worst case: 1. Deleting of an offloaded timeline would not get its `retain_lsn` purged from its parent. With the patch we now do it, but as the parent can be offloaded as well, the situation is a bit trickier than for non-offloaded timelines which can just keep a pointer to their parent. Here we can't keep a pointer because the parent might get offloaded, then unoffloaded again, creating a dangling pointer situation. Keeping a pointer to the tenant is not good either, because we might drop the offloaded timeline in a context where an `offloaded_timelines` lock is already held: so we don't want to acquire a lock in the drop code of OffloadedTimeline. 2. Unoffloading a timeline would not get its `retain_lsn` values populated, leading to it maybe garbage collecting values that its children might need. We now call `initialize_gc_info` on the parent. 3. Offloading of a timeline would not get its `retain_lsn` values registered as offloaded at the parent. So if we drop the `Timeline` object, and its registration is removed, the parent would not have any of the child's `retain_lsn`s around. Also, before, the `Timeline` object would delete anything related to its timeline ID, now it only deletes `retain_lsn`s that have `MaybeOffloaded::No` set. Incorporates Chi's reproducer from #9753. cc https://github.com/neondatabase/cloud/issues/20199 The `test_timeline_retain_lsn` test is extended: 1. it gains a new dimension, duplicating each mode, to either have the "main" branch be the direct parent of the timeline we archive, or the "test_archived_parent" branch intermediary, creating a three timeline structure. This doesn't test anything fixed by this PR in particular, just explores the vast space of possible configurations a little bit more. 2. it gains two new modes, `offload-parent`, which tests the second point, and `offload-no-restart` which tests the third point. It's easy to verify the test actually is "sharp" by removing one of the respective `self.initialize_gc_info()`, `gc_info.insert_child()` or `ancestor_children.push()`. Part of #8088 --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Alex Chi Z <chi@neon.tech>	2024-11-15 14:22:29 +01:00
John Spray	04938d9d55	tests: tolerate pageserver 500s in test_timeline_archival_chaos (#9769 ) ## Problem Test exposes cases where pageserver gives 500 responses, causing failures like https://neon-github-public-dev.s3.amazonaws.com/reports/pr-9766/11844529470/index.html#suites/d1acc79950edeb0563fc86236c620898/3546be2ffed99ba6 ## Summary of changes - Tolerate such messages, and link an issue for cleaning up the pageserver not to return such 500s.	2024-11-15 13:22:05 +00:00
Erik Grinaker	19f7d40c1d	deny.toml: allow CDDL-1.0 license (#9766 ) #9764, which adds profiling support to Safekeeper, pulls in the dependency [`inferno`](https://crates.io/crates/inferno) via [`pprof-rs`](https://crates.io/crates/pprof). This is licenced under the [Common Development and Distribution License 1.0](https://spdx.org/licenses/CDDL-1.0.html), which is not allowed by `cargo-deny`. This patch allows the CDDL-1.0 license. It is a derivative of the Mozilla Public License, which we already allow, but avoids some issues around European copyright law that the MPL had. As such, I don't expect this to be problematic.	2024-11-15 10:41:43 +00:00
John Spray	38563de7dd	storcon: exclude non-Active tenants from shard autosplitting (#9743 ) ## Problem We didn't have a neat way to prevent auto-splitting of tenants. This could be useful during incidents or for testing. Closes https://github.com/neondatabase/neon/issues/9332 ## Summary of changes - Filter splitting candidates by scheduling policy	2024-11-14 19:41:10 +00:00
John Spray	93939f123f	tests: add test_timeline_archival_chaos (#9609 ) ## Problem - We lack test coverage of cases where multiple timelines fight for updates to the same manifest (https://github.com/neondatabase/neon/pull/9557), and in timeline archival changes while dual-attached (https://github.com/neondatabase/neon/pull/9555) ## Summary of changes - Add a chaos test for timeline creation->archival->offload->deletion	2024-11-14 17:31:35 +00:00
Tristan Partin	49b599c113	Remove the replication slot in test_snap_files at the end of the test Analysis of the LR benchmarking tests indicates that in the duration of test_subscriber_lag, a leftover 'slotter' replication slot can lead to retained WAL growing on the publisher. This replication slot is not used by any subscriber. The only purpose of the slot is to generate snapshot files for the purpose of test_snap_files. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-11-14 10:59:15 -06:00
Yuchen Liang	8cde37bc0b	test: disable test_readonly_node_gc until proper fix (#9755 ) ## Problem After investigation, we think to make `test_readonly_node_gc` less flaky, we need to make a proper fix (likely involving persisting part of the lease state). See https://github.com/neondatabase/neon/issues/9754 for details. ## Summary of changes - skip the test until proper fix. Signed-off-by: Yuchen Liang <yuchen@neon.tech>	2024-11-14 15:26:58 +00:00
Konstantin Knizhnik	f70611c8df	Correctly truncate VM (#9342 ) ## Problem https://github.com/neondatabase/neon/issues/9240 ## Summary of changes Correctly truncate the VM page instead of just replacing it with a zero page. ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2024-11-14 17:19:13 +02:00
Vlad Lazar	21282aa113	cargo: use neon branch of rust-postgres (#9757 ) ## Problem We are pinning our fork of rust-postgres to a commit hash and that prevents us from making further changes to it. The latest commit in rust-postgres requires https://github.com/neondatabase/neon/pull/8747, but that seems to have gone stale. I reverted rust-postgres `neon` branch to the pinned commit in https://github.com/neondatabase/rust-postgres/pull/31. ## Summary of changes Switch back to using the `neon` branch of the rust-postgres fork.	2024-11-14 15:16:43 +00:00
Arseny Sher	d06bf4b0fe	safekeeper: fix atomicity of WAL truncation (#9685 ) If WAL truncation fails in the middle it might leave some data on disk above the write/flush LSN. In theory, concatenated with previous records it might form bogus WAL (though very unlikely in practice because CRC would protect from that). To protect from that, set the pending_wal_truncation flag: it means that before any WAL writes, truncation must be retried until it succeeds. We already did that in case of safekeeper restart, now extend this mechanism for failures without restart. Also, importantly, reset LSNs in the beginning of the operation, not in the end, because once on-disk deletion starts, previous pointers are wrong. All this most likely hasn't created any problems in practice because CRC protects from the consequences. Tests for this are hard; simulation infrastructure might be useful here in the future, but not yet.	2024-11-14 13:06:42 +03:00
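The retry-before-write idea in d06bf4b0fe can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `WalStorageSketch`, its in-memory `data` buffer, and the `fail` switch are hypothetical stand-ins, not the safekeeper's actual (Rust) code. It shows the two points from the message: the pointer is reset and the flag is set at the beginning of truncation, and any later write first retries a pending truncation.

```python
class WalStorageSketch:
    """Hypothetical in-memory stand-in for on-disk WAL storage."""

    def __init__(self, data: bytes):
        self.data = bytearray(data)
        self.write_lsn = len(data)
        self.pending_wal_truncation = False

    def truncate(self, end_pos: int, fail: bool = False) -> None:
        # Set the flag and reset the pointer at the beginning: once deletion
        # starts, the previous pointer no longer describes what is on disk.
        self.pending_wal_truncation = True
        self.write_lsn = end_pos
        if fail:
            # Simulated mid-operation failure; stale bytes stay behind.
            raise OSError("simulated truncation failure")
        del self.data[end_pos:]
        self.pending_wal_truncation = False

    def write(self, buf: bytes) -> None:
        # Before any write, a pending truncation must be retried.
        if self.pending_wal_truncation:
            self.truncate(self.write_lsn)
        self.data += buf
        self.write_lsn += len(buf)


wal = WalStorageSketch(b"abcdef")
try:
    wal.truncate(3, fail=True)
except OSError:
    pass  # the flag remains set, so the next write retries the truncation
wal.write(b"XY")
```

After the failed truncation the stale bytes `def` are still present, but the retry inside `write` removes them before the new bytes are appended.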
Ivan Efremov	0467d88f06	Merge pull request #9756 from neondatabase/rc/proxy/2024-11-14 Proxy release 2024-11-14	2024-11-14 09:46:52 +02:00
Tristan Partin	1280b708f1	Improve error handling for NeonAPI fixture Move error handling to the common request function and add a debug log. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-11-13 20:35:48 -06:00
John Spray	b4e00b8b22	pageserver: refuse to load tenants with suspiciously old indices in old generations (#9719 ) ## Problem Historically, if a control component passed a pageserver "generation: 1" this could be a quick way to corrupt a tenant by loading a historic index. Follows https://github.com/neondatabase/neon/pull/9383 Closes #6951 ## Summary of changes - Introduce a Fatal variant to DownloadError, to enable index downloads to signal when they have encountered a scary enough situation that we shouldn't proceed to load the tenant. - Handle this variant by putting the tenant into a broken state (no matter which timeline within the tenant reported it) - Add a test for this case In the event that this behavior fires when we don't want it to, we have ways to intervene: - "Touch" an affected index to update its mtime (download+upload S3 object) - If this behavior is triggered, it indicates we're attaching in some old generation, so we should be able to fix that by manually bumping generation numbers in the storage controller database (this should never happen, but it's an option if it does)	2024-11-13 18:07:39 +00:00
Heikki Linnakangas	10aaa3677d	PostgreSQL minor version updates (17.1, 16.5, 15.9, 14.14) (#9727 ) This includes a patch to temporarily disable one test in the pg_anon test suite. It is an upstream issue, the test started failing with the new PostgreSQL minor versions because of a change in the default timezone used in tests. We don't want to block the release for this, so just disable the test for now. See `199f0a392b (note_2148017485)` Corresponding postgres repository PRs: https://github.com/neondatabase/postgres/pull/524 https://github.com/neondatabase/postgres/pull/525 https://github.com/neondatabase/postgres/pull/526 https://github.com/neondatabase/postgres/pull/527	2024-11-13 15:08:58 +02:00
Heikki Linnakangas	d5435b1a81	tests: Increase timeout in test_create_churn_during_restart (#9736 ) This test was seen to be flaky, e.g. at: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-9457/11804246485/index.html#suites/ec4311502db344eee91f1354e9dc839b/982bd121ea698414/. If I _reduce_ the timeout from 10s to 8s on my laptop, it reliably hits that timeout and fails. That suggests that the test is pretty close to the edge even when it passes. Let's bump up the timeout to 30 s to make it more robust. See also https://github.com/neondatabase/neon/issues/9730, although the error message is different there.	2024-11-13 12:20:32 +02:00
Anastasia Lubennikova	080d585b22	Add installed_extensions prometheus metric (#9608 ) and add /metrics endpoint to compute_ctl to expose such metrics metric format example for extension pg_rag with versions 1.2.3 and 1.4.2 installed in 3 and 1 databases respectively: neon_extensions_installed{extension="pg_rag", version="1.2.3"} = 3 neon_extensions_installed{extension="pg_rag", version="1.4.2"} = 1 ------ infra part: https://github.com/neondatabase/flux-fleet/pull/251 --------- Co-authored-by: Tristan Partin <tristan@neon.tech>	2024-11-13 09:36:48 +00:00
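The counting behind the metric example in 080d585b22 can be illustrated with a minimal Python sketch. The input shape (one list of `(extension, version)` pairs per database) and the helper name `render_installed_extensions` are assumptions made for illustration; the output mirrors the notation used in the commit message rather than any particular exposition format.

```python
from collections import Counter


def render_installed_extensions(per_db_extensions):
    """Count in how many databases each (extension, version) pair is installed."""
    counts = Counter()
    for extensions in per_db_extensions:
        # A set guards against counting the same pair twice within one database.
        counts.update(set(extensions))
    return [
        f'neon_extensions_installed{{extension="{name}", version="{version}"}} = {n}'
        for (name, version), n in sorted(counts.items())
    ]


# pg_rag 1.2.3 installed in three databases, pg_rag 1.4.2 in one.
databases = [
    [("pg_rag", "1.2.3")],
    [("pg_rag", "1.2.3")],
    [("pg_rag", "1.2.3")],
    [("pg_rag", "1.4.2")],
]
lines = render_installed_extensions(databases)
```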
John Spray	7595d3afe6	pageserver: add `no_sync` for use in regression tests (2/2) (#9678 ) ## Problem Followup to https://github.com/neondatabase/neon/pull/9677 which enables `no_sync` in tests. This can be merged once the next release has happened. ## Summary of changes - Always run pageserver with `no_sync = true` in tests.	2024-11-13 09:17:26 +00:00
Konstantin Knizhnik	1ff5333a1b	Do not wallog AUX files at replica (#9457 ) ## Problem An attempt to persist LR stuff at a replica causes a `cannot make new WAL entries during recovery` error. See https://neondb.slack.com/archives/C07S7RBFVRA/p1729280401283389 ## Summary of changes Do not wallog AUX files at replica. Related Postgres PRs: https://github.com/neondatabase/postgres/pull/517 https://github.com/neondatabase/postgres/pull/516 https://github.com/neondatabase/postgres/pull/515 https://github.com/neondatabase/postgres/pull/514 ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2024-11-13 08:50:01 +02:00
Tristan Partin	d8f5d43549	Fix autocommit footguns in performance tests psycopg2 has the following warning related to autocommit: > By default, any query execution, including a simple SELECT will start > a transaction: for long-running programs, if no further action is > taken, the session will remain “idle in transaction”, an undesirable > condition for several reasons (locks are held by the session, tables > bloat…). For long lived scripts, either ensure to terminate a > transaction as soon as possible or use an autocommit connection. In the 2.9 release notes, psycopg2 also made the following change: > `with connection` starts a transaction on autocommit transactions too Some of these connections are indeed long-lived, so we were retaining tons of WAL on the endpoints because we had a transaction pinned in the past. Link: https://www.psycopg.org/docs/news.html#what-s-new-in-psycopg-2-9 Link: https://github.com/psycopg/psycopg2/issues/941 Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-11-12 15:48:19 -06:00
Erik Grinaker	2256a5727a	safekeeper: use `WAL_SEGMENT_SIZE` for empty timeline state (#9734 ) ## Problem `TimelinePersistentState::empty()`, used for tests and benchmarks, had a hardcoded 16 MB WAL segment size. This caused confusion when attempting to change the global segment size. ## Summary of changes Inherit from `WAL_SEGMENT_SIZE` in `TimelinePersistentState::empty()`.	2024-11-12 20:35:44 +00:00
Tristan Partin	3f80af8b1d	Add neon.logical_replication_max_logicalsnapdir_size This GUC will drop replication slots if the size of the pg_logical/snapshots directory (not including temp snapshot files) becomes larger than the specified size. Keeping the size of this directory smaller will help with basebackup size from the pageserver. Part-of: https://github.com/neondatabase/neon/issues/8619 Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-11-12 13:13:28 -06:00
Tristan Partin	a61d81bbc7	Calculate compute_backpressure_throttling_seconds correctly The original value that we get is measured in microseconds. It comes from a calculation using Postgres' GetCurrentTimestamp(), which is implemented in terms of gettimeofday(2). Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-11-12 13:12:08 -06:00
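The unit conversion implied by a61d81bbc7 is small but easy to get wrong, so here is a short Python sketch of it. The function name `throttling_seconds` is hypothetical; the actual fix lives in the metric definition, which is not shown in this log.

```python
MICROSECONDS_PER_SECOND = 1_000_000


def throttling_seconds(throttling_microseconds: int) -> float:
    """Convert a microsecond counter into the seconds the metric name promises."""
    return throttling_microseconds / MICROSECONDS_PER_SECOND


example = throttling_seconds(2_500_000)
```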
Erik Grinaker	05381a48f0	utils: remove unnecessary fsync in `durable_rename()` (#9686 ) ## Problem WAL segment fsyncs significantly affect WAL ingestion throughput. `durable_rename()` is used when initializing every 16 MB segment, and issues 3 fsyncs of which 1 was unnecessary. ## Summary of changes Remove an fsync in `durable_rename` which is unnecessary with Linux and ext4 (which we currently use). This improves WAL ingestion throughput by up to 23% with large appends on my MacBook.	2024-11-12 18:57:31 +01:00
Alex Chi Z.	cef165818c	test(pageserver): add gc-compaction tests with delta will_init (#9724 ) I had an impression that gc-compaction didn't test the case where the first record of the key history is will_init because some code paths would panic in this case. Luckily it got fixed in https://github.com/neondatabase/neon/pull/9026 so we can now implement such tests. Part of https://github.com/neondatabase/neon/issues/9114 ## Summary of changes * Randomly changed some images into will_init neon wal record * Split `test_simple_bottom_most_compaction_deltas` into two test cases, one of them has the bottom layer as delta layer with will_init flags, while the other is the original one with image layers. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-11-12 10:37:31 -05:00
Erik Grinaker	6b19867410	safekeeper: don't flush control file on WAL ingest path (#9698 ) ## Problem The control file is flushed on the WAL ingest path when the commit LSN advances by one segment, to bound the amount of recovery work in case of a crash. This involves 3 additional fsyncs, which can have a significant impact on WAL ingest throughput. This is to some extent mitigated by `AppendResponse` not being emitted on segment bound flushes, since this will prevent commit LSN advancement, which will be addressed separately. ## Summary of changes Don't flush the control file on the WAL ingest path at all. Instead, leave that responsibility to the timeline manager, but ask it to flush eagerly if the control file lags the in-memory commit LSN by more than one segment. This should not cause more than `REFRESH_INTERVAL` (300 ms) additional latency before flushing the control file, which is negligible.	2024-11-12 15:17:03 +00:00
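The eager-flush condition described in 6b19867410 can be expressed as a simple predicate. The sketch below is an assumption-laden illustration in Python: the function name, the integer LSN representation, and the 16 MB default (taken from the segment size mentioned elsewhere in this log) are not the safekeeper's actual code.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumed 16 MB segments


def should_flush_eagerly(
    inmem_commit_lsn: int,
    persisted_commit_lsn: int,
    segment_size: int = WAL_SEGMENT_SIZE,
) -> bool:
    """True when the control file lags the in-memory commit LSN by more than one segment."""
    return inmem_commit_lsn - persisted_commit_lsn > segment_size


# Lag of exactly one segment: no eager flush yet.
at_boundary = should_flush_eagerly(WAL_SEGMENT_SIZE, 0)
# Lag of one segment plus one byte: ask for an eager flush.
past_boundary = should_flush_eagerly(WAL_SEGMENT_SIZE + 1, 0)
```

Under this reading, the ingest path never flushes the control file itself; a periodic manager evaluates a predicate like this and flushes when it holds.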
Tristan Partin	cc8029c4c8	Update pg_cron to 1.6.4 This comes with PG 17 support. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-11-11 20:10:53 -06:00
Tristan Partin	5be6b07cf1	Improve typing related to regress/test_logical_replication.py (#9725 ) Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-11-11 17:36:45 -06:00
Arpad Müller	b018bc7da8	Add a retain_lsn test (#9599 ) Add a test that ensures the `retain_lsn` functionality works. Right now, there is not a single test that is broken if offloaded or non-offloaded timelines don't get registered at their parents, preventing gc from discarding the ancestor_lsns of the children. This PR fills that gap. The test has four modes: * `offloaded`: offload the child timeline, run compaction on the parent timeline, unarchive the child timeline, then try reading from it. hopefully the data is still there. * `offloaded-corrupted`: offload the child timeline, corrupts the manifest in a way that the pageserver believes the timeline was flattened. This is the closest we can get to pretend the `retain_lsn` mechanism doesn't exist for offloaded timelines, so we can avoid adding endpoints to the pageserver that do this manually for tests. The test then checks that indeed data is corrupted and the endpoint can't be started. That way we know that the test is actually working, and actually tests the `retain_lsn` mechanism, instead of say the lsn lease mechanism, or one of the many other mechanisms that impede gc. * `archived`: the child timeline gets archived but doesn't get offloaded. this currently matches the `None` case but we might have refactors in the future that make archived timelines sufficiently different from non-archived ones. * `None`: the child timeline doesn't even get archived. this tests that normal timelines participate in `retain_lsn`. I've made them locally not participate in `retain_lsn` (via commenting out the respective `ancestor_children.push` statement in tenant.rs) and ran the testsuite, and not a single test failed. So this test is first of its kind. Part of #8088.	2024-11-11 22:29:21 +00:00
Tristan Partin	4b075db7ea	Add a postgres_exporter config file This exporter logs an ERROR if a file called `postgres_exporter.yml` is not located in its current working directory. We can silence it by adding an empty config file and pointing the exporter at it. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-11-11 14:49:37 -06:00
Fedor Dikarev	fde16f8614	use batch gh-workflow-stats-action with separate table (#9722 ) We found that exporting GH Workflow Runs in batch is more efficient due to - better utilisation of the GitHub API - and gh runners usage is rounded to minutes, so even when an ad-hoc export is done in 5-10 seconds, we are billed for one minute of usage. So now we introduce batch exporting, with version v0.2.x of github workflow stats exporter. How it's expected to work now: - every 15 minutes we query for the workflow runs created in the last 2 hours - to avoid missing workflows that ran for more than 2 hours, every night (00:25) we will query workflows created in the past 24 hours and export them as well - should we query for even longer periods? - let's see how it works with the current schedule - for longer periods like days or weeks, it may require adjusting the logic and concurrency of querying data, so let's use the simpler version for now	2024-11-11 20:33:29 +00:00
Alex Chi Z.	5a138d08a3	feat(pageserver): support partial gc-compaction for delta layers (#9611 ) The final patch for partial compaction, part of https://github.com/neondatabase/neon/issues/9114, close https://github.com/neondatabase/neon/issues/8921 (note that we didn't implement parallel compaction or compaction scheduler for partial compaction -- currently this needs to be scheduled by using a Python script to split the keyspace, and in the future, automatically split based on the key partitioning when the pageserver wants to trigger a gc-compaction) ## Summary of changes * Update the layer selection algorithm to use the same selection as full compaction (everything intersect/below gc horizon) * Update the layer selection algorithm to also generate a list of delta layers that need to be rewritten * Add the logic to rewrite delta layers and add them back to the layer map * Update test case to do partial compaction on deltas --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-11-11 20:30:32 +00:00
Tristan Partin	2d9652c434	Clean up C.UTF-8 locale changes Removes some unnecessary initdb arguments, and fixes Neon for MacOS since it doesn't seem to ship a C.UTF-8 locale. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-11-11 13:53:12 -06:00
Alexander Bayandin	e9dcfa2eb2	test_runner: skip more tests using decorator instead of pytest.skip (#9704 ) ## Problem Running `pytest.skip(...)` in a test body instead of marking the test with `@pytest.mark.skipif(...)` causes all fixtures to be initialised, which is not necessary if the test is going to be skipped anyway. Also, some tests are unnecessarily skipped (e.g. `test_layer_bloating` on Postgres 17, or `test_idle_reconnections` at all) or run (e.g. `test_parse_project_git_version_output_positive` on more than one configuration) according to comments. ## Summary of changes - Move `skip_on_postgres` / `xfail_on_postgres` / `run_only_on_default_postgres` decorators to `fixture.utils` - Add new `skip_in_debug_build` and `skip_on_ci` decorators - Replace `pytest.skip(...)` calls with decorators where possible	2024-11-11 18:07:01 +00:00
Peter Bendel	8db84d9964	new ingest benchmark (#9711 ) ## Problem We have no specific benchmark testing project migration of postgresql project with existing data into Neon. Typical steps of such a project migration are - schema creation in the neon project - initial COPY of relations - creation of indexes and constraints - vacuum analyze ## Summary of changes Add a periodic benchmark running 9 AM UTC every day. In each run: - copy a 200 GiB project that has realistic schema, data, tables, indexes and constraints from another project into - a new Neon project (7 CU fixed) - an existing tenant, (but new branch and new database) that already has 4 TiB of data - use pgcopydb tool to automate all steps and parallelize COPY and index creation - parse pgcopydb output and report performance metrics in Neon performance test database ## Logs This benchmark has been tested first manually and then as part of benchmarking.yml workflow, example run see https://github.com/neondatabase/neon/actions/runs/11757679870	2024-11-11 17:51:15 +00:00
Alexander Bayandin	1aab34715a	Remove checklist from the PR template (#9702 ) ## Problem Once we enable the merge queue for the `main` branch, it won't be possible to adjust the commit message right after pressing the "Squash and merge" button and the PR title + description will be used as is. To avoid extra noise in the commits in the `main` with the checklist leftovers, I propose removing the checklist from the PR template and keeping only the Problem / Summary of changes. ## Summary of changes - Remove the checklist from the PR template	2024-11-11 17:01:02 +00:00
Erik Grinaker	f63de5f527	safekeeper: add `initialize_segment` variant of `safekeeper_wal_storage_operation_seconds` (#9691 ) ## Problem We don't have a metric capturing the latency of segment initialization. This can be significant due to fsyncs. ## Summary of changes Add an `initialize_segment` variant of `safekeeper_wal_storage_operation_seconds`.	2024-11-11 17:55:50 +01:00
Alex Chi Z.	54a1676680	rfc: update aux file rfc to reflect latest optimizations (#9681 ) Reflects https://github.com/neondatabase/neon/pull/9631 in the RFC. Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-11-11 09:19:03 -05:00
Alex Chi Z.	48c06d9f7b	fix(pageserver): increase frozen layer warning threshold; ignore in tests (#9705 ) Perf benchmarks produce a lot of layers. ## Summary of changes Bump the threshold and ignore the warning. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-11-11 09:13:46 -05:00
Alexander Bayandin	f510647c7e	CI: retry `actions/github-script` for 5XX errors (#9703 ) ## Problem GitHub API can return error 500, and it fails jobs that use `actions/github-script` action. ## Summary of changes - Add `retry: 500` to all `actions/github-script` usage	2024-11-11 12:42:32 +00:00
Vlad Lazar	ceaa80ffeb	storcon: add peer token for peer to peer communication (#9695 ) ## Problem We wish to stop using admin tokens in the infra repo, but step down requests use the admin token. ## Summary of Changes Introduce a new "ControllerPeer" scope and use it for step-down requests.	2024-11-11 09:58:41 +00:00
Alexander Bayandin	2fcac0e66b	CI(pre-merge-checks): add required checks (#9700 ) ## Problem The Merge queue doesn't work because it expects certain jobs, which we don't have in the `pre-merge-checks` workflow. But it turns out we can just create jobs/checks with the same names in any workflow that we run. ## Summary of changes - Add `conclusion` jobs - Create `neon-cloud-e2e` status check - Add a bunch of `if`s to handle cases with no relevant changes found and prepare the workflow to run rust checks in the future - List the workflow in `report-workflow-stats` to collect stats about it	2024-11-09 01:02:54 +00:00
Tristan Partin	ecde8d7632	Improve type safety according to pyright Pyright found many issues that mypy doesn't seem to want to catch or mypy isn't configured to catch. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-11-08 14:43:15 -06:00
Alex Chi Z.	af8238ae52	fix(pageserver): drain upload queue before offloading timeline (#9682 ) It is possible that at the point we shut down the timeline, there are still layer files we did not upload. ## Summary of changes * If the queue is not empty, avoid offloading. * Shut down the timeline gracefully using the flush mode to ensure all local files are uploaded before deleting the timeline directory. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-11-08 14:28:55 -05:00
Erik Grinaker	ab47804d00	safekeeper: remove unused `WriteGuardSharedState::skip_update` (#9699 )	2024-11-08 19:25:31 +00:00
Alex Chi Z.	ecca62a45d	feat(pageserver): more log lines around frozen layers (#9697 ) We saw pageserver OOMs https://github.com/neondatabase/cloud/issues/19715 for tenants doing large writes. Add log lines around in-memory layers to hopefully collect some info during my on-call shift next week. ## Summary of changes * Estimate in-memory size of an in-mem layer. * Print frozen layer number if there are too many layers accumulated in memory. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-11-08 18:44:00 +00:00
Tristan Partin	34a4eb6f2a	Switch compute-related locales to C.UTF-8 by default Right now, our environments create databases with the C locale, which is really unfortunate for users who have data stored in other languages that they want to analyze. For instance, show_trgm on Hebrew text currently doesn't work in staging or production. I don't envision this being the final solution. I think this is just a way to set a known value so the pageserver doesn't use its parent environment. The final solution to me is exposing initdb parameters to users in the console. Then they could use a different locale or encoding if they so chose. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-11-08 12:19:18 -06:00
Alexander Bayandin	b6bc954c5d	CI: move check codestyle python to reusable workflow and run on a merge_group (#9683 ) ## Problem To prevent breaking main after the Python 3.11 PR gets merged, we need to enable the merge queue and run the `check-codestyle-python` job on it ## Summary of changes - Move `check-codestyle-python` to a reusable workflow - Run this workflow on `merge_group` event	2024-11-08 17:32:56 +00:00
Vlad Lazar	30680d1f32	tests: use tighter storcon scopes (#9696 ) ## Problem https://github.com/neondatabase/neon/pull/9596 did not update tests because that would've broken the compat tests. ## Summary of Changes Use infra scope where possible.	2024-11-08 17:00:31 +00:00
Alex Chi Z.	f561cbe1c7	fix(pageserver): drain upload queue before detaching ancestor (#9651 ) In INC-317 https://neondb.slack.com/archives/C033RQ5SPDH/p1730815677932209, we saw an interesting series of operations that would remove valid layer files existing in the layer map. * Timeline A starts compaction and generates an image layer Z but does not upload it yet. * Timeline B/C starts ancestor detaching (which should not affect timeline A) * The tenant gets restarted as part of the ancestor detaching process, without increasing the generation number. * Timeline A reloads, discovering the layer Z is a future layer, and schedules a deletion into the deletion queue. This means that the file will be deleted any time in the future. * Timeline A starts compaction and generates layer Z again, adding it to the layer map. Note that because we don't bump generation number during ancestor detach, it has the same filename + generation number as the original Z. * Timeline A deletes layer Z from s3 + disk, and now we have a dangling reference in the layer map, blocking all compaction/logical_size_calculation process. ## Summary of changes * We wait for all layers to be uploaded before shutting down the tenants in `Flush` mode. * Ancestor detach restarts now use this mode. * Ancestor detach also waits for remote queue completion before starting the detaching process. * The patch ensures that we don't have any future image layer (or something similar) after restart, but does not fix the underlying problem around generation numbers. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-11-08 10:35:27 -05:00
Tristan Partin	3525d2e381	Update TimescaleDB to 2.17.1 for PG 17 Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-11-08 09:15:38 -06:00
Konstantin Knizhnik	17c002b660	Do not copy logical replication slots to replica (#9458 ) ## Problem Replication slots are now persisted using the AUX files mechanism and included in the basebackup when a replica is launched. These slots are not used on the replica but hold WAL, which may cause local disk space exhaustion. ## Summary of changes Add `--replica` parameter to basebackup request and do not include replication slot state files in basebackup for replica. ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-11-08 14:54:58 +02:00
John Spray	aa9112efce	pageserver: add `no_sync` for use in regression tests (1/2) (#9677 ) ## Problem In test environments, the `syncfs` that the pageserver does on startup can take a long time, as other tests running concurrently might have many gigabytes of dirty pages. ## Summary of changes - Add a `no_sync` option to the pageserver's config. - Skip syncfs on startup if this is set - A subsequent PR (https://github.com/neondatabase/neon/pull/9678) will enable this by default in tests. We need to wait until after the next release to avoid breaking compat tests, which would fail if we set no_sync & use an old pageserver binary. Q: Why is this a different mechanism than safekeeper, which has a --no-sync CLI flag? A: Because the way we manage pageservers in neon_local depends on the pageserver.toml containing the full configuration, whereas safekeepers have a config file which is neon-local-specific and can drive a CLI flag. Q: Why is the option no_sync rather than sync? A: For boolean configs with a dangerous value, it's preferable to make "false" the safe option, so that any downstream future config tooling that might have a "booleans are false by default" behavior (e.g. golang structs) is safe by default. Q: Why only skip the syncfs, and not all fsyncs? A: Skipping all fsyncs would require more code changes, and the most acute problem isn't fsyncs themselves (these just slow down a running test), it's the syncfs (which makes a pageserver startup slow as a result of _other_ tests)	2024-11-08 10:16:04 +00:00
JC Grünhage	027889b06c	ci: use set-docker-config-dir from dev-actions (#9638 ) set-docker-config-dir was replicated over multiple repositories. The replica of this action was removed from this repository and it's using the version from github.com/neondatabase/dev-actions instead	2024-11-08 10:44:59 +01:00
Heikki Linnakangas	79929bb1b6	Disable `rust_2024_compatibility` lint option (#9615 ) Compiling with nightly rust compiler, I'm getting a lot of errors like this: error: `if let` assigns a shorter lifetime since Edition 2024 --> proxy/src/auth/backend/jwt.rs:226:16 \| 226 \| if let Some(permit) = self.try_acquire_permit() { \| ^^^^^^^^^^^^^^^^^^^------------------------- \| \| \| this value has a significant drop implementation which may observe a major change in drop order and requires your discretion \| = warning: this changes meaning in Rust 2024 = note: for more information, see issue #124085 <https://github.com/rust-lang/rust/issues/124085> help: the value is now dropped here in Edition 2024 --> proxy/src/auth/backend/jwt.rs:241:13 \| 241 \| } else { \| ^ note: the lint level is defined here --> proxy/src/lib.rs:8:5 \| 8 \| rust_2024_compatibility \| ^^^^^^^^^^^^^^^^^^^^^^^ = note: `#[deny(if_let_rescope)]` implied by `#[deny(rust_2024_compatibility)]` and this: error: these values and local bindings have significant drop implementation that will have a different drop order from that of Edition 2021 --> proxy/src/auth/backend/jwt.rs:376:18 \| 369 \| let client = Client::builder() \| ------ these values have significant drop implementation and will observe changes in drop order under Edition 2024 ... 376 \| map: DashMap::default(), \| ^^^^^^^^^^^^^^^^^^ \| = warning: this changes meaning in Rust 2024 = note: for more information, see issue #123739 <https://github.com/rust-lang/rust/issues/123739> = note: `#[deny(tail_expr_drop_order)]` implied by `#[deny(rust_2024_compatibility)]` They are caused by the `rust_2024_compatibility` lint option. When we actually switch to the 2024 edition, it makes sense to go through all these and check that the drop order changes don't break anything, but in the meanwhile, there's no easy way to avoid these errors. Disable it, to allow compiling with nightly again. 
Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2024-11-08 08:35:03 +00:00
Peter Bendel	9132d80aa3	add pgcopydb tool to build tools image (#9658 ) ## Problem build-tools image does not provide superuser, so additional packages can not be installed during GitHub benchmarking workflows but need to be added to the image ## Summary of changes install pgcopydb version 0.17-1 or higher into build-tools bookworm image ```bash docker run -it neondatabase/build-tools:<tag>-bookworm-arm64 /bin/bash ... nonroot@c23c6f4901ce:~$ LD_LIBRARY_PATH=/pgcopydb/lib /pgcopydb/bin/pgcopydb --version; 13:58:19.768 8 INFO Running pgcopydb version 0.17 from "/pgcopydb/bin/pgcopydb" pgcopydb version 0.17 compiled with PostgreSQL 16.4 (Debian 16.4-1.pgdg120+2) on aarch64-unknown-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit compatible with Postgres 11, 12, 13, 14, 15, and 16 ``` Example usage of that image in a workflow https://github.com/neondatabase/neon/actions/runs/11725718371/job/32662681172#step:7:14	2024-11-07 19:00:25 +01:00
Conrad Ludgate	82e3f0ecba	[proxy/authorize]: improve JWKS reliability (#9676 ) While setting up some tests, I noticed that we didn't support keycloak. They make use of encryption JWKs as well as signature ones. Our current jwks crate does not support parsing encryption keys which caused the entire jwk set to fail to parse. Switching to lazy parsing fixes this. Also while setting up tests, I couldn't use localhost jwks server as we require HTTPS and we were using webpki so it was impossible to add a custom CA. Enabling native roots addresses this possibility. I saw some of our current e2e tests against our custom JWKS in s3 were taking a while to fetch. I've added a timeout + retries to address this.	2024-11-07 16:24:38 +00:00
Arpad Müller	75aa19aa2d	Don't attach is_archived to debug output (#9679 ) We are in branches where we know its value already.	2024-11-07 16:13:50 +00:00
Alex Chi Z.	a8d9939ea9	fix(pageserver): reduce aux compaction threshold (#9647 ) ref https://github.com/neondatabase/neon/issues/9441 The metrics from LR publisher testing project: ~300KB aux key deltas per 256MB files. Therefore, I think we can do compaction more aggressively as these deltas are small and compaction can reduce layer download latency. We also have a read path perf fix https://github.com/neondatabase/neon/pull/9631 but I'd still combine the read path fix with the reduction of the compaction threshold. ## Summary of changes * reduce metadata compaction threshold * use num of L1 delta layers as an indicator for metadata compaction * dump more logs Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-11-07 10:38:15 -05:00
Erik Grinaker	f18aa04b90	safekeeper: use `set_len()` to zero out segments (#9665 ) ## Problem When we create a new segment, we zero it out in order to avoid changing the length and fsyncing metadata on every write. However, we zeroed it out by writing 8 KB zero-pages, and Tokio file writes have non-trivial overhead. ## Summary of changes Zero out the segment using [`File::set_len()`](https://docs.rs/tokio/latest/i686-unknown-linux-gnu/tokio/fs/struct.File.html#method.set_len) instead. This will typically (depending on the filesystem) just write a sparse file and omit the 16 MB of data entirely. This improves WAL append throughput for large messages by over 400% with fsync disabled, and 100% with fsync enabled.	2024-11-07 15:09:57 +00:00
Erik Grinaker	01265b7bc6	safekeeper: add basic WAL ingestion benchmarks (#9531 ) ## Problem We don't have any benchmarks for Safekeeper WAL ingestion. ## Summary of changes Add some basic benchmarks for WAL ingestion, specifically for `SafeKeeper::process_msg()` (single append) and `WalAcceptor` (pipelined batch ingestion). Also add some baseline file write benchmarks.	2024-11-07 13:24:03 +00:00
Arseny Sher	f54f0e8e2d	Fix direct reading from WAL buffers. (#9639 ) Fix direct reading from WAL buffers. Pointer wasn't advanced which resulted in sending corrupted WAL if part of read used WAL buffers and part read from the file. Also move it to neon_walreader so that e.g. replication could also make use of it. ref https://github.com/neondatabase/cloud/issues/19567	2024-11-07 11:29:52 +00:00
Erik Grinaker	d6aa26a533	postgres_ffi: make `WalGenerator` generic over record generator (#9614 ) ## Problem Benchmarks need more control over the WAL generated by `WalGenerator`. In particular, they need to vary the size of logical messages. ## Summary of changes * Make `WalGenerator` generic over `RecordGenerator`, which constructs WAL records. * Add `LogicalMessageGenerator` which emits logical messages, with a configurable payload. * Minor tweaks and code reorganization. There are no changes to the core logic or emitted WAL.	2024-11-07 10:38:39 +00:00
Ivan Efremov	f5eec194e7	Merge pull request #9674 from neondatabase/rc/proxy/2024-11-07 Proxy release 2024-11-07	2024-11-07 12:07:12 +02:00
Cheng Chen	e1d0b73824	chore(compute): Bump pg_mooncake to the latest version	2024-11-06 22:41:18 -06:00
Arpad Müller	011c0a175f	Support copying layers in detach_ancestor from before shard splits (#9669 ) We need to use the shard associated with the layer file, not the shard associated with our current tenant shard ID. Due to shard splits, the shard IDs can refer to older files. close https://github.com/neondatabase/neon/issues/9667	2024-11-07 01:53:58 +01:00
Alex Chi Z.	2a95a51a0d	refactor(pageserver): better pageservice command parsing (#9597 ) close https://github.com/neondatabase/neon/issues/9460 ## Summary of changes A full rewrite of pagestream cmdline parsing to make it more robust and readable. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2024-11-06 20:41:01 +00:00
Yuchen Liang	11fc1a4c12	fix(test): use layer map dump in `test_readonly_node_gc` to validate layers protected by leases (#9551 ) Fixes #9518. ## Problem After removing the assertion `layers_removed == 0` in #9506, we could miss breakage if we solely rely on the successful execution of the `SELECT` query to check if the lease is properly protecting layers. Details listed in #9518. Also, in integration tests, we sometimes run into the race condition where a getpage request comes before the lease gets renewed (item 2 of #8817), even if compute_ctl sends a lease renewal as soon as it sees a `/configure` API call that updates the `pageserver_connstr`. In this case, we would observe a getpage request error stating that we `tried to request a page version that was garbage collected` (as seen in [Allure Report](https://neon-github-public-dev.s3.amazonaws.com/reports/pr-8613/11550393107/index.html#suites/3ccffb1d100105b98aed3dc19b717917/d1a1ba47bc180493)). ## Summary of changes - Use layer map dump to verify that the lease protects what it claims: Record all historical layers that have `start_lsn <= lease_lsn` before and after running timeline gc. This is the same check as `ad79f42460/pageserver/src/tenant/timeline.rs (L5025-L5027)` The set recorded after GC should contain every layer in the set recorded before GC. - Wait until the log contains another successful lease request before running the `SELECT` query after GC. We argued in #8817 that the bad request can only exist within a short period after migration/restart, and our test shows that as long as a lease renewal is done before the first getpage request sent after reconfiguration, we will not have a bad request. Signed-off-by: Yuchen Liang <yuchen@neon.tech>	2024-11-06 20:18:21 +00:00
Tristan Partin	93123f2623	Rename compute_backpressure_throttling_ms to compute_backpressure_throttling_seconds This is in line with the Prometheus guidance[0]. We also haven't started using this metric, so renaming is essentially free. Link: https://prometheus.io/docs/practices/naming/ [0] Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-11-06 13:28:23 -06:00
Alex Chi Z.	1d3559d4bc	feat(pageserver): add fast path for sparse keyspace read (#9631 ) In https://github.com/neondatabase/neon/issues/9441, the tenant has a lot of aux keys spread in multiple aux files. The perf tool shows that a significant amount of time is spent on remove_overlapping_keys. For sparse keyspaces, we don't need to report missing key errors anyways, and it's very likely that we will need to read all layers intersecting with the key range. Therefore, this patch adds a new fast path for sparse keyspace reads that we do not track `unmapped_keyspace` in a fine-grained way. We only modify it when we find an image layer. In debug mode, it was ~5min to read the aux files for a dump of the tenant, and now it's only 8s, that's a 60x speedup. ## Summary of changes * Do not add sparse keys into `keys_done` so that remove_overlapping does nothing. * Allow `ValueReconstructSituation::Complete` to be updated again in `ValuesReconstructState::update_key` for sparse keyspaces. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-11-06 18:17:02 +00:00
Conrad Ludgate	73bdc9a2d0	[proxy]: minor changes to endpoint-cache handling (#9666 ) I think I meant to make these changes over 6 months ago. alas, better late than never. 1. should_reject doesn't eagerly intern the endpoint string 2. Rate limiter uses a std Mutex instead of a tokio Mutex. 3. Recently I introduced a `-local-proxy` endpoint suffix. I forgot to add this to normalize. 4. Random but a small cleanup making the ControlPlaneEvent deser directly to the interned strings.	2024-11-06 17:40:40 +00:00
John Spray	d182ff294c	storcon: respect tenant scheduling policy in drain/fill (#9657 ) ## Problem Pinning a tenant by setting Pause scheduling policy doesn't work because drain/fill code moves the tenant around during deploys. Closes: https://github.com/neondatabase/neon/issues/9612 ## Summary of changes - In drain, only move a tenant if it is in Active or Essential mode - In fill, only move a tenant if it is in Active mode. The asymmetry is a bit annoying, but it faithfully respects the purposes of the modes: Essential is meant to endeavor to keep the tenant available, which means it needs to be drained but doesn't need to be migrated during fills.	2024-11-06 15:14:43 +00:00
Vlad Lazar	4dfa0c221b	pageserver: ingest pre-serialized batches of values (#9579 ) ## Problem https://github.com/neondatabase/neon/pull/9524 split the decoding and interpretation step from ingestion. The output of the first phase is a `wal_decoder::models::InterpretedWalRecord`. Before this patch set that struct contained a list of `Value` instances. We wish to lift the decoding and interpretation step to the safekeeper, but it would be nice if the safekeeper gave us a batch containing the raw data instead of actual values. ## Summary of changes Main goal here is to make `InterpretedWalRecord` hold a raw buffer which contains pre-serialized Values. For this we do: 1. Add a `SerializedValueBatch` type. This is `inmemory_layer::SerializedBatch` with some extra functionality for extension, observing values for shard 0 and tests. 2. Replace `inmemory_layer::SerializedBatch` with `SerializedValueBatch` 3. Make `DatadirModification` maintain a `SerializedValueBatch`. ### `DatadirModification` changes `DatadirModification` now maintains a `SerializedValueBatch` and extends it as new WAL records come in (to avoid flushing to disk on every record). In turn, this cascaded into a number of modifications to `DatadirModification`: 1. Replace `pending_data_pages` and `pending_zero_data_pages` with `pending_data_batch`. 2. Removal of `pending_zero_data_pages` and its cousin `on_wal_record_end` 3. Rename `pending_bytes` to `pending_metadata_bytes` since this is what it tracks now. 4. Adapting of various utility methods like `len`, `approx_pending_bytes` and `has_dirty_data_pages`. Removal of `pending_zero_data_pages` and the optimisation associated with it ((1) and (2)) deserves more detail. Previously all zero data pages went through `pending_zero_data_pages`. We wrote zero data pages when filling gaps caused by relation extension (case A) and when handling special wal records (case B). 
If it happened that the same WAL record contained a non-zero write for an entry in `pending_zero_data_pages` we skipped the zero write. Case A: We handle this differently now. When ingesting the `SerializedValueBatch` associated with one PG WAL record, we identify the gaps and fill them in one go. Essentially, we move from a per-key process (gaps were filled after each new key) to a per-record process. Hence, the optimisation is not required anymore. Case B: When the handling of a special record needs to zero out a key, it just adds that to the current batch. I inspected the code, and I don't think the optimisation kicked in here.	2024-11-06 14:10:32 +00:00
Folke Behrens	bdd492b1d8	proxy: Replace "web(auth)" with "console redirect" everywhere (#9655 )	2024-11-06 11:03:38 +00:00
Folke Behrens	5d8284c7fe	proxy: Read cplane JWT with clap arg (#9654 )	2024-11-06 10:27:55 +00:00
Folke Behrens	ebc43efebc	proxy: Refactor cplane types (#9643 ) The overall idea of the PR is to rename a few types to make their purpose clearer, reduce abstraction where not needed, and move types to better-suited modules.	2024-11-05 23:03:53 +01:00
Folke Behrens	754d2950a3	proxy: Revert ControlPlaneEvent back to struct (#9649 ) Due to neondatabase/cloud#19815 we need to be more tolerant when reading events.	2024-11-05 21:32:33 +00:00
Conrad Ludgate	fcde40d600	[proxy] use the proxy protocol v2 command to silence some logs (#9620 ) The PROXY Protocol V2 offers a "command" concept. It can be of two different values. "Local" and "Proxy". The spec suggests that "Local" be used for health-checks. We can thus use this to silence logging for such health checks such as those from NLB. This additionally refactors the flow to be a bit more type-safe, self documenting and using zerocopy deser.	2024-11-05 17:23:00 +00:00
Erik Grinaker	babfeb70ba	safekeeper: don't allocate send buffers on stack (#9644 ) ## Problem While experimenting with `MAX_SEND_SIZE` for benchmarking, I saw stack overflows when increasing it to 1 MB. Turns out a few buffers of this size are stack-allocated rather than heap-allocated. Even at the default 128 KB size, that's a bit large to allocate on the stack. ## Summary of changes Heap-allocate buffers of size `MAX_SEND_SIZE`.	2024-11-05 17:05:30 +00:00
Ivan Efremov	2f1a56c8f9	proxy: Unify local and remote conn pool client structures (#9604 ) Unify client, EndpointConnPool and DbUserConnPool for remote and local conn. - Use new ClientDataEnum for additional client data. - Add ClientInnerCommon client structure. - Remove Client and EndpointConnPool code from local_conn_pool.rs	2024-11-05 17:33:41 +02:00
John Spray	e30f5fb922	scrubber: remove AWS region assumption, tolerate negative max_project_size (#9636 ) ## Problem First issues noticed when trying to run scrubber find-garbage on Azure: - Azure staging contains projects with -1 set for max_project_size: apparently the control plane treats this as a signed field. - Scrubber code assumed that listing projects should filter to aws-$REGION. This is no longer needed (per comment in the code) because we now hit region-local APIs. This PR doesn't make it work all the way (`init_remote` still assumes S3), but these are necessary precursors. ## Summary of changes - Change max_project_size from unsigned to signed - Remove region filtering in favor of simply using the right region's API (which we already do)	2024-11-05 13:32:50 +00:00
Arpad Müller	70ae8c16da	Construct models::TenantConfig only once (#9630 ) Since 5f83c9290b482dc90006c400dfc68e85a17af785/#1504 we've had duplication in construction of models::TenantConfig, where both constructs contained the same code. This PR removes one of the two locations to avoid the duplication.	2024-11-05 13:02:49 +00:00
Erik Grinaker	8840f3858c	pageserver: return 503 during tenant shutdown (#9635 ) ## Problem Tenant operations may return `409 Conflict` if the tenant is shutting down. This status code is not retried by the control plane, causing user-facing errors during pageserver restarts. Operations should instead return `503 Service Unavailable`, which may be retried for idempotent operations. ## Summary of changes Convert `GetActiveTenantError::WillNotBecomeActive(TenantState::Stopping)` to `ApiError::ShuttingDown` rather than `ApiError::Conflict`. This error is returned by `Tenant::wait_to_become_active` in most (all?) tenant/timeline-related HTTP routes.	2024-11-05 13:16:55 +01:00
Tristan Partin	1e16221f82	Update psycopg2 to latest version for complete PG 17 support Update the types to match. Changes the cursor import to match the C bindings[0]. Link: https://github.com/python/typeshed/issues/12578 [0] Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-11-04 18:21:59 -06:00
Tristan Partin	34812a6aab	Improve some typing related to performance testing for LR Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-11-04 15:52:01 -06:00
Arpad Müller	ee68bbf6f5	Add tenant config option to allow timeline_offloading (#9598 ) Allow us to enable timeline offloading for single tenants without having to enable it for the entire pageserver. Part of #8088.	2024-11-04 21:01:18 +01:00
Folke Behrens	1085fe57d3	proxy: Rewrite ControlPlaneEvent as enum (#9627 )	2024-11-04 20:19:26 +01:00
Folke Behrens	59879985b4	proxy: Wrap JWT errors in separate AuthError variant (#9625 ) * Also rename `AuthFailed` variant to `PasswordFailed`. * Before this, all JWT errors ended up in `AuthError::AuthFailed()`, which expects a username and also causes cache invalidation.	2024-11-04 19:56:40 +01:00
Conrad Ludgate	81d1bb1941	quieten aws_config logs (#9626 ) logs during aws authentication are soooo noisy in staging 🙃	2024-11-04 17:28:10 +00:00
Christian Schwarz	06113e94e6	fix(test_regress): always use storcon virtual pageserver API to set tenant config (#9622 ) Problem ------- Tests that directly call the Pageserver Management API to set tenant config are flaky if the Pageserver is managed by Storcon because Storcon is the source of truth and may (theoretically) reconcile a tenant at any time. Solution -------- Switch all users of `set_tenant_config`/`patch_tenant_config_client_side` to use the `env.storage_controller.pageserver_api()` Future Work ----------- Prevent regressions from creeping in. And generally clean up tenant configuration. Maybe we can avoid the Pageserver having a default tenant config at all and put the default into Storcon instead? * => https://github.com/neondatabase/neon/issues/9621 Refs ---- fixes https://github.com/neondatabase/neon/issues/9522	2024-11-04 17:42:08 +01:00
Erik Grinaker	0d5a512825	safekeeper: add walreceiver metrics (#9450 ) ## Problem We don't have any observability for Safekeeper WAL receiver queues. ## Summary of changes Adds a few WAL receiver metrics: * `safekeeper_wal_receivers`: gauge of currently connected WAL receivers. * `safekeeper_wal_receiver_queue_depth`: histogram of queue depths per receiver, sampled every 5 seconds. * `safekeeper_wal_receiver_queue_depth_total`: gauge of total queued messages across all receivers. * `safekeeper_wal_receiver_queue_size_total`: gauge of total queued message sizes across all receivers. There are already metrics for ingested WAL volume: `written_wal_bytes` counter per timeline, and `safekeeper_write_wal_bytes` per-request histogram.	2024-11-04 15:22:46 +00:00
Conrad Ludgate	8ad1dbce72	[proxy]: parse proxy protocol TLVs with aws/azure support (#9610 ) AWS/azure private link shares extra information in the "TLV" values of the proxy protocol v2 header. This code doesn't action on it, but it parses it as appropriate.	2024-11-04 14:04:56 +00:00
Conrad Ludgate	3dcdbcc34d	remove aws-lc-rs dep and fix storage_broker tls (#9613 ) It seems the ecosystem is not so keen on moving to aws-lc-rs as its build setup is more complicated than ring (requiring cmake). Eventually I expect the ecosystem should pivot to https://github.com/ctz/graviola/tree/main/rustls-graviola as it stabilises (it has a very simple build step and license), but for now let's try not to have the headache of juggling two crypto libs. I also noticed that tonic will just fail with tls without a default provider, so I added some defensive code for that.	2024-11-04 13:29:13 +00:00
Matthias van de Meent	d5de63c6b8	Fix a time zone issue in a PG17 test case (#9618 ) The commit was cherry-picked and thus shouldn't cause issues once we merge the release tag for PostgreSQL 17.1	2024-11-04 12:10:32 +00:00
John Spray	4534f5cdc6	pageserver: make local timeline deletion infallible (#9594 ) ## Problem In https://github.com/neondatabase/neon/pull/9589, timeline offload code is modified to return an explicit error type rather than propagating anyhow::Error. One of the 'Other' cases there is I/O errors from local timeline deletion, which shouldn't need to exist, because our policy is not to try and continue running if the local disk gives us errors. ## Summary of changes - Make `delete_local_timeline_directory` infallible and use `.fatal_err(` on I/O errors --------- Co-authored-by: Erik Grinaker <erik@neon.tech>	2024-11-04 09:11:52 +00:00
Erik Grinaker	0058eb09df	test_runner/performance: add sharded ingest benchmark (#9591 ) Adds a Python benchmark for sharded ingestion. This ingests 7 GB of WAL (100M rows) into a Safekeeper and fans out to 10 shards running on 10 different pageservers. The ingest volume and duration is recorded.	2024-11-02 16:42:10 +00:00
Konstantin Knizhnik	8ac523d2ee	Do not assign page LSN to new (uninitialized) page in ClearVisibilityMapFlags redo handler (#9287 ) ## Problem https://neondb.slack.com/archives/C04DGM6SMTM/p1727872045252899 See https://github.com/neondatabase/neon/issues/9240 ## Summary of changes Add `!page_is_new` check before assigning page lsn. ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-11-01 20:31:29 +02:00
John Spray	3c16bd6e0b	storcon: skip non-active projects in chaos injection (#9606 ) ## Problem We may sometimes use scheduling modes like `Pause` to pin a tenant in its current location for operational reasons. It is undesirable for the chaos task to make any changes to such projects. ## Summary of changes - Add a check for scheduling mode - Add a log line when we do choose to do a chaos action for a tenant: this will help us understand which operations originate from the chaos task.	2024-11-01 16:47:20 +00:00
Erik Grinaker	123816e99a	safekeeper: log slow WalAcceptor sends (#9564 ) ## Problem We don't have any observability into full WalAcceptor queues per timeline. ## Summary of changes Logs a message when a WalAcceptor send has blocked for 5 seconds, and another message when the send completes. This implies that the log frequency is at most once every 5 seconds per timeline, so we don't need further throttling.	2024-11-01 13:47:03 +01:00
Peter Bendel	8b3bcf71ee	revert higher token expiration (#9605 ) ## Problem The IAM role associated with our github action runner supports a max token expiration which is lower than the value we tried. ## Summary of changes Since we believe we have understood the performance regression (and addressed it by ensuring availability zone affinity of compute and pageserver), the job should again run in less than 5 hours, so we revert this change instead of increasing the max session token expiration in the IAM role, which would reduce our security.	2024-11-01 12:46:02 +01:00
Erik Grinaker	4c2c8d6708	test_runner: fix `tenant_get_shards` with one pageserver (#9603 ) ## Problem `tenant_get_shards()` does not work with a sharded tenant on 1 pageserver, as it assumes an unsharded tenant in this case. This special case appears to have been added to handle e.g. `test_emergency_mode`, where the storage controller is stopped. This breaks e.g. the sharded ingest benchmark in #9591 when run with a single shard. ## Summary of changes Correctly look up shards even with a single pageserver, but add a special case that assumes an unsharded tenant if the storage controller is stopped and the caller provides an explicit pageserver, in order to accommodate `test_emergency_mode`.	2024-11-01 11:25:04 +00:00
Conrad Ludgate	2d1366c8ee	fix pre-commit hook with python stubs (#9602 ) fix #9601	2024-11-01 11:22:38 +00:00
Vlad Lazar	e589c2e5ec	storage_controller: allow deployment infra to use infra token (#9596 ) ## Problem We wish for the deployment orchestrator to use infra scoped tokens, but the storcon endpoints it's using require admin scoped tokens. ## Summary of Changes Switch over all endpoints that are used by the deployment orchestrator to use an infra scoped token. This causes no breakage during mixed version scenarios because admin scoped tokens allow access to all endpoints. The deployment orchestrator can cut over to the infra token after this commit touches down in prod. Once this commit is released we should also update the test code to use infra scoped tokens where appropriate. Currently it would fail on the [compat tests](`9761b6a64e/test_runner/regress/test_storage_controller.py (L69-L71)`).	2024-10-31 18:29:16 +00:00
Conrad Ludgate	9761b6a64e	update pg_session_jwt to use pgrx 0.12 for pg17 (#9595 ) Updates the extension to use pgrx 0.12. No changes to the extensions have been made, the only difference is the pgrx version.	2024-10-31 15:50:41 +00:00
Conrad Ludgate	897cffb9d8	auth_broker: fix local_proxy conn count (#9593 ) our current metrics for http pool opened connections is always negative :D oops	2024-10-31 14:57:55 +00:00
John Spray	552088ac16	pageserver: fix spurious error logs in timeline lifecycle (#9589 ) ## Problem The final part of https://github.com/neondatabase/neon/issues/9543 will be a chaos test that creates/deletes/archives/offloads timelines while restarting pageservers and migrating tenants. Developing that test showed up a few places where we log errors during normal shutdown. ## Summary of changes - UninitializedTimeline's drop should log at info severity: this is a normal code path when some part of timeline creation encounters a cancellation `?` path. - When offloading and finding a `RemoteTimelineClient` in a non-initialized state, this is not an error and should not be logged as such. - The `offload_timeline` function returned an anyhow error, so callers couldn't gracefully pick out cancellation errors from real errors: update this to have a structured error type and use it throughout.	2024-10-31 14:44:59 +00:00
Peter Bendel	51fda118f6	increase lifetime of AWS session token to 12 hours (#9590 ) ## Problem clickbench regression causes clickbench to run >9 hours and the AWS session token is expired before the run completes ## Summary of changes extend lifetime of session token for this job to 12 hours	2024-10-31 13:34:50 +00:00
Anastasia Lubennikova	e96398a552	Add support of extensions for v17 (part 4) (#9568 ) - pg_jsonschema 0.3.3 - pg_graphql 1.5.9 - rum 65e0a752 - pg_tiktoken a5bc447e update support of extensions for v14-v16: - pg_jsonschema 0.3.1 -> 0.3.3 - pg_graphql 1.5.7 -> 1.5.9 - rum 6ab37053 -> 65e0a752 - pg_tiktoken e64e55aa -> a5bc447e	2024-10-31 15:05:24 +02:00
Erik Grinaker	f9d8256d55	pageserver: don't return option from `DeletionQueue::new` (#9588 ) `DeletionQueue::new()` always returns deletion workers, so the returned `Option` is redundant.	2024-10-31 10:51:58 +00:00
Vlad Lazar	411c3aa0d6	pageserver: lift decoding and interpreting of wal into wal_decoder (#9524 ) ## Problem Decoding and ingestion are still coupled in `pageserver::WalIngest`. ## Summary of changes A new type is added to `wal_decoder::models`, InterpretedWalRecord. This type contains everything that the pageserver requires in order to ingest a WAL record. The highlights are the `metadata_record` which is an optional special record type to be handled and `blocks` which stores key, value pairs to be persisted to storage. This type is produced by `wal_decoder::models::InterpretedWalRecord::from_bytes` from a raw PG wal record. The rest of this commit separates decoding and interpretation of the PG WAL record from its application in `WalIngest::ingest_record`. Related: https://github.com/neondatabase/neon/issues/9335 Epic: https://github.com/neondatabase/neon/issues/9329	2024-10-31 10:47:43 +00:00
Arpad Müller	65b69392ea	Disallow offloaded children during timeline deletion (#9582 ) If we delete a timeline that has children, those children will have their data corrupted. Therefore, extend the already existing safety check to offloaded timelines as well. Part of #8088	2024-10-30 19:37:09 +01:00
Alex Chi Z.	8d70f88b37	refactor(pageserver): use JSON field encoding for consumption metrics cache (#9470 ) In https://github.com/neondatabase/neon/issues/9032, I would like to eventually add a `generation` field to the consumption metrics cache. The current encoding is not backward compatible and it is hard to add another field into the cache. Therefore, this patch refactors the format to store "field -> value", and it's easier to maintain backward/forward compatibility with this new format. ## Summary of changes * Add `NewRawMetric` as the new format. * Add upgrade path. When opening the disk cache, the codepath first inspects the `version` field, and decides how to decode. * Refactor metrics generation code and tests. * Add tests on upgrade / compatibility with the old format. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-10-30 18:13:11 +00:00
Arpad Müller	bcfe013094	Don't keep around the timeline's remote_client (#9583 ) Constructing a remote client is no big deal. Yes, it means an extra download from S3 but it's not that expensive. This simplifies code paths and scenarios to test. This unifies timelines that have been recently offloaded with timelines that have been offloaded in an earlier invocation of the process. Part of #8088	2024-10-30 18:44:29 +01:00
Arpad Müller	d0a02f3649	Disallow archived timelines to be detached or reparented (#9578 ) Disallow a request for timeline ancestor detach if either the to be detached timeline, or any of the to be reparented timelines are offloaded or archived. In theory we could support timelines that are archived but not offloaded, but archived timelines are at the risk of being offloaded, so we treat them like offloaded timelines. As for offloaded timelines, any code to "support" them would amount to unoffloading them, at which point we can just demand to have the timelines be unarchived. Part of #8088	2024-10-30 17:04:57 +01:00
Tristan Partin	8af9412eb2	Collect compute backpressure throttling time This will tell us how much time the compute has spent throttled if pageserver/safekeeper cannot keep up with WAL generation. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-30 09:58:29 -05:00
Erik Grinaker	96e35e11a6	postgres_ffi: add WAL generator for tests/benchmarks (#9503 ) ## Problem We don't have a convenient way to generate WAL records for benchmarks and tests. ## Summary of changes Adds a WAL generator, exposed as an iterator. It currently only generates logical messages (noops), but will be extended to write actual table rows later. Some existing code for WAL generation has been replaced with this generator, to reduce duplication.	2024-10-30 14:46:39 +03:00
Alexey Kondratov	745061ddf8	chore(compute): Bump pg_mooncake to the latest version (#9576 ) ## Problem There were some critical breaking changes made in the upstream since Oct 29th morning. ## Summary of changes Point it to the topmost commit in the `neon` branch at the time of writing this https://github.com/Mooncake-Labs/pg_mooncake/commits/neon/ `c495cd17d6`	2024-10-30 11:07:02 +01:00
Tristan Partin	0c828c57e2	Remove non-gzipped basebackup code path In July of 2023, Bojan and Chi authored `92aee7e07f`. Our in production pageservers are most definitely at a version where they all support gzipped basebackups.	2024-10-29 23:03:45 -05:00
John Spray	8e2e9f0fed	pageserver: generation-aware storage for TenantManifest (#9555 ) ## Problem When tenant manifest objects are written without a generation suffix, concurrently attached pageservers may stamp on each others writes of the manifest and cause undefined behavior. Closes: #9543 ## Summary of changes - Use download_generation_object helper when reading manifests, to search for the most recent generation - Use Tenant::generation as the generation suffix when writing manifests.	2024-10-29 23:24:04 +01:00
Tristan Partin	b77b9bdc9f	Add tests for sql-exporter metrics Should help us keep non-working metrics from hitting staging or production. Co-authored-by: Heikki Linnakangas <heikki@neon.tech> Fixes: https://github.com/neondatabase/neon/issues/8569 Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-29 15:13:06 -05:00
Alex Chi Z.	81f9aba005	fix(pagectl): layer parsing and image layer dump (#9571 ) This patch contains various improvements for the pagectl tool. ## Summary of changes * Rewrite layer name parsing: LayerName now supports all variants we use now. * Drop pagectl's own layer parsing function, use LayerName in the pageserver crate. * Support image layer dumping in the layer dump command using ImageLayer::dump, drop the original implementation. Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-10-29 15:16:23 -04:00
Alex Chi Z.	88ff8a7803	feat(pageserver): support partial gc-compaction for lowest retain lsn (#9134 ) part of https://github.com/neondatabase/neon/issues/8921, https://github.com/neondatabase/neon/issues/9114 ## Summary of changes We start the partial compaction implementation with the image layer partial generation. The partial compaction API now takes a key range. We will only generate images for that key range for now, and remove layers fully included in the key range after compaction. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-10-29 18:25:32 +00:00
Konstantin Knizhnik	0c075fab3a	Add --replica parameter to basebackup (#9553 ) ## Problem See https://github.com/neondatabase/neon/pull/9458 This PR separates PS related changes in #9458 from compute_ctl changes to enforce that PS is deployed before compute. ## Summary of changes This PR adds handling of the `--replica` parameter of basebackup to the pageserver. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-10-29 18:40:10 +02:00
Anastasia Lubennikova	80e1630042	Use pg_mooncake from our fork. (#9565 ) Switch to main repo once https://github.com/Mooncake-Labs/pg_mooncake/pull/3 is merged	2024-10-29 15:57:52 +00:00
Jakub Kołodziejczak	57499640c5	proxy: more granular http status codes for sql-over-http errors (#9549 ) closes #9532	2024-10-29 15:44:45 +00:00
Anastasia Lubennikova	793ad50b7d	fix allow_unstable_extensions GUC - make it USERSET (#9563 ) fix message wording	2024-10-29 14:25:23 +00:00
John Spray	7a1331eee5	pageserver: make concurrent offloaded timeline operations safe wrt manifest uploads (#9557 ) ## Problem Uploads of the tenant manifest could race between different tasks, resulting in unexpected results in remote storage. Closes: https://github.com/neondatabase/neon/issues/9556 ## Summary of changes - Create a central function for uploads that takes a tokio::sync::Mutex - Store the latest upload in that Mutex, so that when there is lots of concurrency (e.g. archive 20 timelines at once) we can coalesce their manifest writes somewhat.	2024-10-29 13:54:48 +00:00
John Spray	4ef74215e1	pageserver: refactor generation-aware loading code into generic (#9545 ) ## Problem Indices used to be the only kind of object where we had to search across generations to find the most recent one. As of https://github.com/neondatabase/neon/issues/9543, manifests will need the same treatment. ## Summary of changes - Refactor download_index_part to a generic download_generation_object function, which will be usable for downloading manifest objects as well.	2024-10-29 13:00:03 +00:00
Conrad Ludgate	7e00be391d	Merge pull request #9558 from neondatabase/rc/proxy/2024-10-29 Auth broker release 2024-10-29	2024-10-29 12:10:50 +00:00
Conrad Ludgate	d4cbc8cfeb	[auth_broker]: regress test (#9541 ) python based regression test setup for auth_broker. This uses a http mock for cplane as well as the JWKs url. complications: 1. We cannot just use local_proxy binary, as that requires the pg_session_jwt extension which we don't have available in the current test suite 2. We cannot use just any old http mock for local_proxy, as auth_broker requires http2 to local_proxy as such, I used the h2 library to implement an echo server - copied from the examples in the h2 docs.	2024-10-29 11:39:09 +00:00
Conrad Ludgate	47c35f67c3	[proxy]: fix JWT handling for AWS cognito. (#9536 ) In the base64 payload of an aws cognito jwt, I saw the following: ``` "iss":"https:\/\/cognito-idp.us-west-2.amazonaws.com\/us-west-2_redacted" ``` issuers are supposed to be URLs, and URLs are always valid un-escaped JSON. However, `\/` is a valid escape character so what AWS is doing is technically correct... sigh... This PR refactors the test suite and adds a new regression test for cognito.	2024-10-29 11:01:09 +00:00
Peter Bendel	45b558f480	temporarily increase timeout for clickbench benchmark until regression is resolved (#9554 ) ## Problem The clickbench job in the benchmarking workflow has a performance regression that causes it to hit the maximum job run timeout. Suspected root cause: the project was migrated from a single pageserver to a storage controller managed project on Oct 14th, and the regression has shown since then. ## Summary of changes Increase the pytest timeout to 12 hours. Increase the job timeout to 12 hours.	2024-10-29 10:53:28 +00:00
Arpad Müller	a73402e646	Offloaded timeline deletion (#9519 ) As pointed out in https://github.com/neondatabase/neon/pull/9489#discussion_r1814699683 , we previously didn't support deletion for offloaded timelines after the timeline has been loaded from the manifest instead of having been offloaded. This was because the upload queue hadn't been initialized yet. This PR thus initializes the timeline and shuts it down immediately. Part of #8088	2024-10-29 10:41:53 +00:00
Vlad Lazar	07b974480c	pageserver: move things around to prepare for decoding logic (#9504 ) ## Problem We wish to have high level WAL decoding logic in `wal_decoder::decoder` module. ## Summary of Changes For this we need the `Value` and `NeonWalRecord` types accessible there, so: 1. Move `Value` and `NeonWalRecord` to `pageserver::value` and `pageserver::record` respectively. 2. Get rid of `pageserver::repository` (follow up from (1)) 3. Move PG specific WAL record types to `postgres_ffi::walrecord`. In theory they could live in `wal_decoder`, but it would create a circular dependency between `wal_decoder` and `postgres_ffi`. Long term it makes sense for those types to be PG version specific, so that will work out nicely. 4. Move higher level WAL record types (to be ingested by pageserver) into `wal_decoder::models` Related: https://github.com/neondatabase/neon/issues/9335 Epic: https://github.com/neondatabase/neon/issues/9329	2024-10-29 10:00:34 +00:00
Arpad Müller	62f5d484d9	Assert the tenant to be active in `unoffload_timeline` (#9539 ) Currently, all callers of `unoffload_timeline` ensure that the tenant the unoffload operation is called on is active. We rely on it being active as we activate the timeline below and don't want to race with the activation code of the tenant (in the worst case, activating a timeline twice). Therefore, add this assertion. Part of #8088	2024-10-29 00:36:05 +00:00
Tristan Partin	4df3987054	Get role name when not a C string We will only have a C string if the specified role is a string. Otherwise, we need to resolve references to public, current_role, current_user, and session_user. Fixes: https://github.com/neondatabase/cloud/issues/19323 Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-28 18:21:45 -05:00
Konstantin Knizhnik	0624565617	Create the notion of unstable extensions As a DBaaS provider, Neon needs to provide a stable platform for customers to build applications upon. At the same time however, we also need to enable customers to use the latest and greatest technology, so they can prototype their work, and we can solicit feedback. If all extensions are treated the same in terms of stability, it is hard to meet that goal. There are now two new GUCs created by the Neon extension: neon.allow_unstable_extensions: This is a session GUC which allows a session to install and load unstable extensions. neon.unstable_extensions: This is a comma-separated list of extension names. We can check if a CREATE EXTENSION statement is attempting to install an unstable extension, and if so, deny the request if neon.allow_unstable_extensions is not set to true. Signed-off-by: Tristan Partin <tristan@neon.tech> Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-10-28 17:47:15 -05:00
George MacKerron	7d5f6b6a52	Build `pgrag` extensions x3 (#8486 ) Build the pgrag extensions (rag, rag_bge_small_en_v15, and rag_jina_reranker_v1_tiny_en) as part of the compute node Dockerfile. --------- Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2024-10-28 20:06:36 +00:00
Alex Chi Z.	f7c61e856f	fix(pageserver): bump tokio-epoll-uring (#9546 ) Includes https://github.com/neondatabase/tokio-epoll-uring/pull/58 that fixes the clippy error. ## Summary of changes Update the version of tokio-epoll-uring Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-10-28 20:03:02 +00:00
Alex Chi Z.	57c21aff9f	refactor(pageserver): remove aux v1 configs (#9494 ) ## Problem Part of https://github.com/neondatabase/neon/issues/8623 ## Summary of changes Removed all aux-v1 config processing code. Note that we persisted it into the index part file, so we cannot really remove the field from index part. I also kept the config item within the tenant config, but we will not read it any more. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-10-28 19:51:14 +00:00
Erik Grinaker	248558dee8	safekeeper: refactor `WalAcceptor` to be event-driven (#9462 ) ## Problem The `WalAcceptor` main loop currently uses two nested loops to consume inbound messages. This makes it hard to slot in periodic events like metrics collection. It also duplicates the event processing code, and assumes all messages in steady state are AppendRequests (other message types may be dropped if following an AppendRequest). ## Summary of changes Refactor the `WalAcceptor` loop to be event driven.	2024-10-28 17:18:37 +00:00
Sergey Melnikov	3bad52543f	We don't have legacy proxies anymore (#9544 ) We don't have legacy scram proxies anymore: cc: https://github.com/neondatabase/cloud/issues/9745	2024-10-28 16:42:35 +00:00
Tristan Partin	3d64a7ddcd	Add pg_mooncake to compute-node.Dockerfile Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-28 11:23:30 -05:00
Conrad Ludgate	25f1e5cfeb	[proxy] demote warnings and remove dead-argument (#9512 ) fixes https://github.com/neondatabase/cloud/issues/19000	2024-10-28 15:02:20 +00:00
Rahul Patil	8dd555d396	ci(proxy): Update GH action flag on proxy deployment (#9535 ) ## Problem Based on a recent proxy deployment issue, we deployed another proxy version (proxy-scram), which was not needed when deploying a specific proxy type. We have a [PR](https://github.com/neondatabase/infra/pull/2142) to update the infra branch and need to update the CI in this repo, which triggers proxy deployment. ## Summary of changes - Update proxy deployment flag	2024-10-28 13:17:09 +01:00
Arthur Petukhovsky	01b6843e12	Route pgbouncer logs to virtio-serial (#9488 ) virtio-serial is much more performant than /dev/console emulation, therefore, is much more suitable for the verbose logs inside vm. This commit changes routing for pgbouncer logs, since we've recently noticed it can emit large volumes of logs. Manually tested on staging by pinning a compute image to my test project. Should help with https://github.com/neondatabase/cloud/issues/19072	2024-10-28 12:09:47 +00:00
John Spray	93987b5a4a	tests: add test_storage_controller_onboard_detached (#9431 ) ## Problem We haven't historically taken this API route where we would onboard a tenant to the controller in detached state. It worked, but we didn't have test coverage. ## Summary of changes - Add a test that onboards a tenant to the storage controller in Detached mode, and checks that deleting it without attaching it works as expected.	2024-10-28 11:11:12 +00:00
John Spray	33baca07b6	storcon: add an API to cancel ongoing reconciler (#9520 ) ## Problem If something goes wrong with a live migration, we currently only have awkward ways to interrupt that: - Restart the storage controller - Ask it to do some other modification/migration on the shard, which we don't really want. ## Summary of changes - Add a new `/cancel` control API, and storcon_cli wrapper for it, which fires the Reconciler's cancellation token. This is just for on-call use and we do not expect it to be used by any other services.	2024-10-28 09:26:01 +00:00
John Spray	923974d4da	safekeeper: don't un-evict timelines during snapshot API handler (#9428 ) ## Problem When we use pull_timeline API on an evicted timeline, it gets downloaded to serve the snapshot API request. That means that to evacuate all the timelines from a node, the node needs enough disk space to download partial segments from all timelines, which may not be physically the case. Closes: #8833 ## Summary of changes - Add a "try" variant of acquiring a residence guard, that returns None if the timeline is offloaded - During snapshot API handler, take a different code path if the timeline isn't resident, where we just read the checkpoint and don't try to read any segments.	2024-10-28 08:47:12 +00:00
Arpad Müller	e7277885b3	Don't consider archived timelines for synthetic size calculation (#9497 ) Archived timelines should not count towards synthetic size. Closes #9384. Part of #8088.	2024-10-26 13:27:57 +00:00
dependabot[bot]	80262e724f	build(deps): bump werkzeug from 3.0.3 to 3.0.6 (#9527 )	2024-10-26 08:24:15 +01:00
Yuchen Liang	85b954f449	pageserver: add tokio-epoll-uring slots waiters queue depth metrics (#9482 ) In complement to https://github.com/neondatabase/tokio-epoll-uring/pull/56. ## Problem We want to make tokio-epoll-uring slots waiters queue depth observable via Prometheus. ## Summary of changes - Add `pageserver_tokio_epoll_uring_slots_submission_queue_depth` metrics as a `Histogram`. - Each thread-local tokio-epoll-uring system is given a `LocalHistogram` to observe the metrics. - Keep a list of `Arc<ThreadLocalMetrics>` used on-demand to flush data to the shared histogram. - Extend `Collector::collect` to report `pageserver_tokio_epoll_uring_slots_submission_queue_depth`. Signed-off-by: Yuchen Liang <yuchen@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-10-25 21:30:57 +01:00
Arpad Müller	76328ada05	Fix unoffload_timeline races with creation (#9525 ) This PR does two things: 1. Obtain a `TimelineCreateGuard` object in `unoffload_timeline`. This prevents two unoffload tasks from racing with each other. While they already obtain locks for `timelines` and `offloaded_timelines`, they aren't sufficient, as we have already constructed an entire timeline at that point. We shouldn't ever have two `Timeline` objects in the same process at the same time. 2. Don't allow timeline creations for timelines that have been offloaded. Obviously they already exist, so we should not allow creation. The previous logic only looked at the timelines list. Part of #8088	2024-10-25 20:06:27 +00:00
Erik Grinaker	b54b632c6a	safekeeper: don't pass conf into storage constructors (#9523 ) ## Problem The storage components take an entire `SafekeeperConf` during construction, but only actually use the `no_sync` field. This makes it hard to understand the storage inputs (which fields do they actually care about?), and is also inconvenient for tests and benchmarks that need to set up a lot of unnecessary boilerplate. ## Summary of changes * Don't take the entire config, but pass in the `no_sync` field explicitly. * Take the timeline dir instead of `ttid` as an input, since it's the only thing it cares about. * Fix a couple of tests to not leak tempdirs. * Various minor tweaks.	2024-10-25 18:19:52 +01:00
Erik Grinaker	9909551f47	safekeeper: fix version in `TimelinePersistentState::empty()` (#9521 ) ## Problem The Postgres version in `TimelinePersistentState::empty()` is incorrect: the major version should be multiplied by 10000. ## Summary of changes Multiply the version by 10000.	2024-10-25 16:22:35 +01:00
Arseny Sher	700b102b0f	safekeeper: retry eviction. (#9485 ) Without this manager may sleep forever after eviction failure without retries.	2024-10-25 17:48:29 +03:00
Conrad Ludgate	dbadb0f9bb	proxy: propagate session IDs (#9509 ) fixes #9367 by sending session IDs to local_proxy, and also returns session IDs to the client for easier debugging.	2024-10-25 14:34:19 +00:00
John Spray	8297f7a181	pageserver: fix N^2 I/O when processing relation drops in transaction abort (#9507 ) ## Problem We have some known N^2 behaviors when it comes to large relation counts, due to the monolithic encoding and full rewrites of RelDirectory each time a relation is added. Ordinarily our backpressure mechanisms give "slow but steady" performance when creating/dropping/truncating relations. However, in the case of a transaction abort, it is possible for a single WAL record to drop an unbounded number of relations. This results in an unavailable compute, as when it sends one of these records, it can stall the pageserver's ingest for many minutes, even though the compute only sent a small amount of WAL. Closes https://github.com/neondatabase/neon/issues/9505 ## Summary of changes - Rewrite relation-dropping code to do one read/modify/write cycle of RelDirectory, instead of doing it separately for each relation in a loop. - Add a test for the bug scenario encountered: `test_tx_abort_with_many_relations` The test has ~40s runtime on my workstation. About 1 second of that is the part where we wait for ingest to catch up after a rollback, the rest is the slowness of creating and truncating a large number of relations. --------- Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2024-10-25 15:09:02 +01:00
Christian Schwarz	2090e928d1	refactor(timeline creation): idempotency checking (#9501 ) # Context In the PGDATA import code (https://github.com/neondatabase/neon/pull/9218) I add a third way to create timelines, namely, by importing from a copy of a vanilla PGDATA directory in object storage. For idempotency, I'm using the PGDATA object storage location specification, which is stored in the IndexPart for the entire lifespan of the timeline. When loading the timeline from remote storage, that value gets stored inside `struct Timeline` and timeline creation compares the creation argument with that value to determine idempotency of the request. # Changes This PR refactors the existing idempotency handling of Timeline bootstrap and branching such that we simply compare the `CreateTimelineIdempotency` struct, using the derive-generated `PartialEq` implementation. Also, by spelling idempotency out in the type names, I find it adds a lot of clarity. The pathway to idempotency via requester-provided idempotency key also becomes very straight-forward, if we ever want to do this in the future. # Refs * platform context: https://github.com/neondatabase/neon/pull/9218 * product context: https://github.com/neondatabase/cloud/issues/17507 * stacks on top of https://github.com/neondatabase/neon/pull/9366	2024-10-25 14:44:20 +01:00
Tristan Partin	05eff3a67e	Move logical replication slot monitor neon.c is getting crowded and the logical replication slot monitor is a good candidate for reorganization. It is very self-contained, and being in a separate file will make it that much easier to find. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-25 08:41:44 -05:00
Arseny Sher	c6cf5e7c0f	Make test_pageserver_lsn_wait_error_safekeeper_stop less aggressive. (#9517 ) Previously it inserted ~150MiB of WAL while expecting page fetching to work in 1s (wait_lsn_timeout=1s). It failed in CI in debug builds. Instead, just directly wait for the wanted condition, i.e. needed safekeepers are reported in pageserver timed out waiting for WAL error message. Also set NEON_COMPUTE_TESTING_BASEBACKUP_RETRIES to 1 in this test and neighbour one, it reduces execution time from 2.5m to ~10s.	2024-10-25 14:13:46 +01:00
Christian Schwarz	e0c7f1ce15	remote_storage(local_fs): return correct file sizes (#9511 ) ## Problem `local_fs` doesn't return file sizes, which I need in PGDATA import (#9218) ## Solution Include file sizes in the result. I would have liked to add a unit test, and started doing that in * https://github.com/neondatabase/neon/pull/9510 by extending the common object storage tests (`libs/remote_storage/tests/common/tests.rs`) to check for sizes as well. But it turns out that localfs is not even covered by the common object storage tests and upon closer inspection, it seems that this area needs more attention. => punt the effort into https://github.com/neondatabase/neon/pull/9510	2024-10-25 12:20:53 +00:00
Christian Schwarz	6f5c262684	pageserver: add testing API to scan layers for disposable keys (#9393 ) This PR adds a pageserver mgmt API to scan a layer file for disposable keys. It hooks it up to the sharding compaction test, demonstrating that we're not filtering out all disposable keys. This is extracted from PGDATA import (https://github.com/neondatabase/neon/pull/9218) where I do the filtering of layer files based on `is_key_disposable`.	2024-10-25 14:16:45 +02:00
Jakub Kołodziejczak	9768f09f6b	proxy: don't follow redirects for user provided JWKS urls + set custom user agent (#9514 ) partially fixes https://github.com/neondatabase/cloud/issues/19249 ref https://docs.rs/reqwest/latest/reqwest/redirect/index.html > By default, a Client will automatically handle HTTP redirects, having a maximum redirect chain of 10 hops. To customize this behavior, a redirect::Policy can be used with a ClientBuilder.	2024-10-25 14:04:41 +02:00
Yuchen Liang	db900ae9d0	fix(test): remove too strict layers_removed==0 check in test_readonly_node_gc (#9506 ) Fixes #9098 ## Problem `test_readonly_node_gc` is flaky. As shown in [Allure Report](https://neon-github-public-dev.s3.amazonaws.com/reports/pr-9469/11444519440/index.html#suites/3ccffb1d100105b98aed3dc19b717917/2c02073738fa2b39), we would get an `AssertionError: No layers should be removed, old layers are guarded by leases.` after the test restarts pageservers or after it reconfigures pageservers. During the investigation, we found that the removed layers have an LSN (`0/1563088`) greater than the LSN (`0x1562000`) protected by the lease. For instance, Layers removed <pre> 000000067F00000005000034540100000000-000000067F00000005000040050100000000__000000000<b><i>1563088</i></b>-00000001 (shard 0002) 000000068000000000000017E20000000001-010000000100000001000000000000000001__000000000<b><i>1563088</i></b>-00000001 (shard 0002) </pre> Lsn Lease Granted <pre> handle_make_lsn_lease{lsn=<b><i>0/1562000</i></b> shard_id=0002 shard_id=0002}: lease created, valid until 2024-10-21 </pre> This means that these layers are not guarded by the leases: they are in the "future", not visible to the static endpoint. ## Summary of changes - Remove the assertion layers_removed == 0 after triggering timeline GC while holding the lease. Instead rely on the successful execution of the `SELECT` query to test lease validity. - Improve test logging Signed-off-by: Yuchen Liang <yuchen@neon.tech>	2024-10-25 12:50:47 +01:00
Arpad Müller	4d9036bf1f	Support offloaded timelines during shard split (#9489 ) Before, we didn't copy over the `index-part.json` of offloaded timelines to the new shard's location, resulting in the new shard not knowing the timeline even exists. In #9444, we copy over the manifest, but we also need to do this for `index-part.json`. As the operations to do are mostly the same between offloaded and non-offloaded timelines, we can iterate over all of them in the same loop, after the introduction of a `TimelineOrOffloadedArcRef` type to generalize over the two cases. This is analogous to the deletion code added in #8907. The added test also ensures that the sharded archival config endpoint works, something that has not yet been ensured by tests. Part of #8088	2024-10-25 12:32:46 +02:00
Vlad Lazar	b3bedda6fd	pageserver/walingest: log on gappy rel extend (#9502 ) ## Problem https://github.com/neondatabase/neon/pull/9492 added a metric to track the total count of block gaps filled on rel extend. More context is needed to understand when this happens. The current theory is that it may only happen on pg 14 and pg 15 since they do not WAL log relation extends. ## Summary of Changes A rate limited log is added.	2024-10-25 11:15:53 +01:00
Christian Schwarz	b782b11b33	refactor(timeline creation): represent bootstrap vs branch using enum (#9366 ) # Problem Timeline creation can either be bootstrap or branch. The distinction is made based on whether the `ancestor_` fields are present or not. In the PGDATA import code (https://github.com/neondatabase/neon/pull/9218), I add a third variant to timeline creation. # Solution The above pushed me to refactor the code in Pageserver to distinguish the different creation requests through enum variants. There is no externally observable effect from this change. On the implementation level, a notable change is that the acquisition of the `TimelineCreationGuard` happens later than before. This is necessary so that we have everything in place to construct the `CreateTimelineIdempotency`. Notably, this moves the acquisition of the creation guard _after_ the acquisition of the `gc_cs` lock in the case of branching. This might appear as if we're at risk of holding `gc_cs` longer than before this PR, but, even before this PR, we were holding `gc_cs` until after the `wait_completion()` that makes the timeline creation durable in S3 returns. I don't see any deadlock risk with reversing the lock acquisition order. As a drive-by change, I found that the `create_timeline()` function in `neon_local` is unused, so I removed it. # Refs platform context: https://github.com/neondatabase/neon/pull/9218 * product context: https://github.com/neondatabase/cloud/issues/17507 * next PR stacked atop this one: https://github.com/neondatabase/neon/pull/9501	2024-10-25 10:04:27 +00:00
Vlad Lazar	5069123b6d	pageserver: refactor ingest inplace to decouple decoding and handling (#9472 ) ## Problem WAL ingest couples decoding of special records with their handling (updates to the storage engine mostly). This is a roadblock for our plan to move WAL filtering (and implicitly decoding) to safekeepers since they cannot do writes to the storage engine. ## Summary of changes This PR decouples the decoding of the special WAL records from their application. The changes are done in place and I've done my best to refrain from refactorings and attempted to preserve the original code as much as possible. Related: https://github.com/neondatabase/neon/issues/9335 Epic: https://github.com/neondatabase/neon/issues/9329	2024-10-24 17:12:47 +01:00
Alex Chi Z.	fb0406e9d2	refactor(pageserver): refactor split writers using batch layer writer (#9493 ) part of https://github.com/neondatabase/neon/issues/9114, https://github.com/neondatabase/neon/issues/8836, https://github.com/neondatabase/neon/issues/8362 The split layer writer code can be used in a more general way: the caller puts unfinished writers into the batch layer writer and lets the batch layer writer ensure the atomicity of the layers produced. ## Summary of changes * Add batch layer writer, which atomically finishes the layers. `BatchLayerWriter::finish` is simply a copy-paste from previous split layer writers. * Refactor split writers to use the batch layer writer. * The current split writer tests cover all code paths of batch layer writer. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-10-24 10:49:54 -04:00
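The "all layers finish or none do" behaviour described for the batch layer writer can be sketched as below. This is a simplified model, not the real `BatchLayerWriter`: `PendingLayer`, `BatchWriter`, and `BatchError` are hypothetical, and a boolean flag stands in for an I/O failure.

```rust
/// Hypothetical unfinished layer; `fail_on_finish` simulates an I/O error.
struct PendingLayer {
    name: String,
    fail_on_finish: bool,
}

/// Error reporting which layer failed and which layers were discarded as a result.
#[derive(Debug, PartialEq, Eq)]
struct BatchError {
    failed: String,
    discarded: Vec<String>,
}

#[derive(Default)]
struct BatchWriter {
    pending: Vec<PendingLayer>,
}

impl BatchWriter {
    fn add(&mut self, layer: PendingLayer) {
        self.pending.push(layer);
    }

    /// Either every pending layer is finished and returned, or none is kept.
    fn finish(self) -> Result<Vec<String>, BatchError> {
        let mut finished = Vec::new();
        let mut iter = self.pending.into_iter();
        while let Some(layer) = iter.next() {
            if layer.fail_on_finish {
                // Discard layers finished so far plus those not yet attempted.
                let mut discarded = finished;
                discarded.extend(iter.map(|l| l.name));
                return Err(BatchError {
                    failed: layer.name,
                    discarded,
                });
            }
            finished.push(layer.name);
        }
        Ok(finished)
    }
}
```

Taking `self` by value in `finish` means a batch cannot be reused after it has been finished or discarded.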
Alexander Bayandin	b8a311131e	CI: remove `git config --add safe.directory` hack (#9391 ) ## Problem We have `git config --global --add safe.directory ...` leftovers from the past, but `actions/checkout` does it by default (since v3.0.2, we use v4) ## Summary of changes - Remove `git config --global --add safe.directory ...` hack	2024-10-24 15:49:26 +01:00
John Spray	d589498c6f	storcon: respect Reconciler::cancel during await_lsn (#9486 ) ## Problem When a pageserver is misbehaving (e.g. we hit an ingest bug or something is pathologically slow), the storage controller could get stuck in the part of live migration that waits for LSNs to catch up. This is a problem, because it can prevent us migrating the troublesome tenant to another pageserver. Closes: https://github.com/neondatabase/cloud/issues/19169 ## Summary of changes - Respect Reconciler::cancel during await_lsn.	2024-10-24 15:23:09 +01:00
Folke Behrens	d56599df2a	Merge pull request #9499 from neondatabase/rc/proxy/2024-10-24 Proxy release 2024-10-24	2024-10-24 10:34:56 +02:00
Christian Schwarz	6f34f97573	refactor(pageserver(load_remote_timeline)) remove dead code handling absence of IndexPart (#9408 ) The code is dead at runtime since we're nowadays always running with remote storage and treat it as the source of truth during attach. Clean it up as a preliminary to https://github.com/neondatabase/neon/pull/9218. Related: https://github.com/neondatabase/neon/pull/9366	2024-10-24 09:00:22 +01:00
Tristan Partin	b86432c29e	Fix buggy sizeof A sizeof on a pointer on a 64 bit machine is 8 bytes whereas Entry::old_name is a 64 byte array of characters. There was most likely no fallout since the string would start with NUL bytes, but best to fix nonetheless. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-23 21:52:22 -06:00
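The fix above is in C, but the pointer-versus-array size mismatch it describes can be illustrated in Rust as well. The `Entry` struct below is a hypothetical stand-in with a 64-byte `old_name` buffer, as described in the commit message.

```rust
/// Hypothetical stand-in for the C struct: `old_name` is a fixed 64-byte buffer.
struct Entry {
    old_name: [u8; 64],
}

/// Returns (size of a pointer to the buffer, size of the buffer itself).
fn sizes() -> (usize, usize) {
    let entry = Entry { old_name: [0; 64] };
    let ptr: *const u8 = entry.old_name.as_ptr();
    // Size of the pointer itself, which is what the buggy `sizeof` measured.
    let pointer_size = std::mem::size_of_val(&ptr);
    // Size of the array the pointer refers to, which is what was intended.
    let array_size = std::mem::size_of_val(&entry.old_name);
    (pointer_size, array_size)
}
```

On a 64-bit target the first value is 8 and the second is 64, so any length derived from the pointer size covers only the first 8 bytes of the name.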
Vlad Lazar	ac1205c14c	pageserver: add metric for number of zeroed pages on rel extend (#9492 ) ## Problem Filling the gap in with zeroes is annoying for sharded ingest. We are not sure it even happens in reality. ## Summary of Changes Add one global counter which tracks how many such gap blocks we filled on relation extends. We can add more metrics once we understand the scope.	2024-10-23 19:58:28 +01:00
John Spray	e3ff87ce3b	tests: avoid using background_process when invoking pg_ctl (#9469 ) ## Problem Occasionally, we get failures to start the storage controller's db with errors like: ``` aborting due to panic at /__w/neon/neon/control_plane/src/background_process.rs:349:67: claim pid file: lock file Caused by: file is already locked ``` e.g. https://neon-github-public-dev.s3.amazonaws.com/reports/pr-9428/11380574562/index.html#/testresult/1c68d413ea9ecd4a This is happening in a stop,start cycle during a test. Presumably the pidfile from the startup background process is still held at the point we stop, because we let pg_ctl keep running in the background. ## Summary of changes - Refactor pg_ctl invocations into a helper - In the controller's `start` function, use pg_ctl & a wait loop for pg_isready, instead of using background_process --------- Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2024-10-23 16:29:55 +00:00
Tristan Partin	0595320c87	Protect call to pg_current_wal_lsn() in retained_wal query We can't call pg_current_wal_lsn() if we are a standby instance (read replica). Any attempt to call this function while in recovery results in: ERROR: recovery is in progress Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-23 09:55:00 -06:00
Folke Behrens	92d5e0e87a	proxy: clear lib.rs of code items (#9479 ) We keep lib.rs for crate configs, lint configs and re-exports for the binaries.	2024-10-23 08:21:28 +02:00
Arpad Müller	3a3bd34a28	Rename IndexPart::{from_s3_bytes,to_s3_bytes} (#9481 ) We support multiple storage backends now, so remove the `_s3_` from the name. Analogous to the names adopted for tenant manifests added in #9444.	2024-10-23 00:34:24 +02:00
Alex Chi Z.	64949a37a9	fix(pageserver): make delta split layer writer finish atomic (#9048 ) similar to https://github.com/neondatabase/neon/pull/8841, we make the delta layer writer atomic when finishing the layers. ## Summary of changes * `put_value` not taking discard fn anymore * `finish` decides what layers to keep --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-10-22 22:06:21 +00:00
Arpad Müller	6f8fcdf9ea	Timeline offloading persistence (#9444 ) Persist timeline offloaded state to S3. Right now, as of #8907, at each restart of the pageserver, all offloaded state is lost, so we load the full timeline again. As it starts with an empty local directory, we might potentially download some files again, leading to downloads that are ultimately wasteful. This patch adds support for persisting the offloaded state, allowing us to never load offloaded timelines in the first place. The persistence feature is facilitated via a new file in S3 that is tenant-global, which contains a list of all offloaded timelines. It is updated each time we offload or unoffload a timeline, and otherwise never touched. This choice means that tenants where no offloading is happening will not immediately get a manifest, keeping the change very minimal at the start. We leave generation support for future work. It is important to support generations, as in the worst case, the manifest might be overwritten by an older generation after a timeline has been unoffloaded (and unarchived), so the next pageserver process instantiation might wrongly believe that some timeline is still offloaded even though it should be active. Part of #9386, #8088	2024-10-22 20:52:30 +00:00
Tristan Partin	fcb55a2aa2	Fix copy-paste error in checkpoints_timed metric Importing the wrong metric. Sigh... Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-22 14:34:26 -06:00
a-masterov	f36cf3f885	Fix local errors for the tests with the versions mix (#9477 ) ## Problem If the environment variables `COMPATIBILITY_NEON_BIN` or `COMPATIBILITY_POSTGRES_DISTRIB_DIR` are not set (this is usual during a local run), the tests with the versions mix cannot run. ## Summary of changes If these variables are not set turn off the version mix. --------- Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2024-10-22 21:58:55 +02:00
John Spray	8dca188974	storage controller: add metrics for tenant shard, node count (#9475 ) ## Problem Previously, figuring out how many tenant shards were managed by a storage controller was typically done by peeking at the database or calling into the API. A metric makes it easier to monitor, as unexpectedly increasing shard counts can be indicative of problems elsewhere in the system. ## Summary of changes - Add metrics `storage_controller_pageserver_nodes` (updated on node CRUD operations from Service) and `storage_controller_tenant_shards` (updated RAII-style from TenantShard)	2024-10-22 19:43:02 +01:00
Tristan Partin	b7fa93f6b7	Use make's builtin RM variable At least as far as removing individual files goes, this is the best pattern for removing. I can't say the same for removing directories, but I went ahead and changed those to `$(RM) -r` anyway. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-22 09:14:29 -06:00
Arseny Sher	1e8e04bb2c	safekeeper: refactor timeline initialization (#9362 ) Always do timeline init through atomic rename of temp directory. Add GlobalTimelines::load_temp_timeline which does this, and use it from both pull_timeline and basic timeline creation. Fixes a collection of issues: - previously timeline creation didn't really flush cfile to disk due to 'nothing to do if state didn't change' check; - even if it did, without tmp dir it is possible to lose the cfile but leave timeline dir in place, making it look corrupted; - tenant directory creation fsync was missing in timeline creation; - pull_timeline is now protected from running concurrently with both itself and timeline creation; - the global timelines map now has a special CreationInProgress entry type which prevents anyone from getting access to a timeline while it is being created (previously one could get access to it, but it was locked during creation, which is valid but confusing if creation failed). fixes #8927	2024-10-22 07:11:36 +01:00
David Gomes	94369af782	chore(compute): bumps pg_session_jwt to latest version (#9474 )	2024-10-21 23:39:30 +00:00
Arpad Müller	34b6bd416a	offloaded timeline list API (#9461 ) Add a way to list the offloaded timelines. Before, one had to look at logs to figure out if a timeline has been offloaded or not, or use the non-presence of a certain timeline in the list of normal timelines. Now, one can list them directly. Part of #8088	2024-10-21 16:33:05 +01:00
Yuchen Liang	49d5e56c08	pageserver: use direct IO for delta and image layer reads (#9326 ) Part of #8130 ## Problem Pageserver previously went through the kernel page cache for all IOs. The kernel page cache makes a lightly loaded pageserver appear deceptively fast. Using direct IO would offer predictable latencies for our virtual file IO operations. In particular for reads, the data pages also have an extremely low temporal locality because the most frequently accessed pages are cached on the compute side. ## Summary of changes This PR enables pageserver to use direct IO for delta layer and image layer reads. We can ship them separately because these layers are write-once, read-many, so we will not be mixing buffered IO with direct IO. - implement `IoBufferMut`, a buffer type with aligned allocation (currently set to 512). - use `IoBufferMut` at all places we are doing reads on image + delta layers. - leverage Rust type system and use `IoBufAlignedMut` marker trait to guarantee that the input buffers for the IO operations are aligned. - page cache allocation is also made aligned. _* in-memory layer reads and the write path will be shipped separately._ ## Testing Integration test suite run with O_DIRECT enabled: https://github.com/neondatabase/neon/pull/9350 ## Performance We evaluated performance based on the `get-page-at-latest-lsn` benchmark. The results demonstrate a decrease in the number of IOps, no significant change in the latency mean, and a slight improvement on the p99.9 and p99.99 latencies. [Benchmark](https://www.notion.so/neondatabase/Benchmark-O_DIRECT-for-image-and-delta-layers-2024-10-01-112f189e00478092a195ea5a0137e706?pvs=4) ## Rollout We will add `virtual_file_io_mode=direct` region by region to enable direct IO on image + delta layers. Signed-off-by: Yuchen Liang <yuchen@neon.tech>	2024-10-21 11:01:25 -04:00
Alex Chi Z.	aca81f5fa4	fix(pageserver): make image split layer writer finish atomic (#8841 ) Part of https://github.com/neondatabase/neon/issues/8836 ## Summary of changes This pull request makes the image layer split writer atomic when finishing the layers. All the produced layers either finish at the same time, or are discarded at the same time. Note that this does not guarantee atomicity across a crash, but in that case the leftovers will be cleaned up on pageserver restart. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-10-21 15:59:48 +01:00
Ivan Efremov	2dcac94194	proxy: Use common error interface for error handling with cplane (#9454 ) - Remove obsolete error handles. - Use one source of truth for cplane errors. #18468	2024-10-21 17:20:09 +03:00
Ivan Efremov	ababa50cce	Use '-f' for make clean in Makefile compute (#9464 ) Use '-f' instead of '--force', because with '--force' it is impossible to clean the targets on macOS	2024-10-21 16:20:39 +03:00
Alexander Bayandin	163beaf9ad	CI: use build-tools on Debian 12 whenever we use Neon artifact (#9463 ) ## Problem ``` + /tmp/neon/pg_install/v16/bin/psql '***' -c 'SELECT version()' /tmp/neon/pg_install/v16/bin/psql: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.33' not found (required by /tmp/neon/pg_install/v16/bin/psql) /tmp/neon/pg_install/v16/bin/psql: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /tmp/neon/pg_install/v16/bin/psql) /tmp/neon/pg_install/v16/bin/psql: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /tmp/neon/pg_install/v16/lib/libpq.so.5) /tmp/neon/pg_install/v16/bin/psql: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.33' not found (required by /tmp/neon/pg_install/v16/lib/libpq.so.5) /tmp/neon/pg_install/v16/bin/psql: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /tmp/neon/pg_install/v16/lib/libpq.so.5) ``` ## Summary of changes - Use `build-tools:pinned-bookworm` whenever we download Neon artefact	2024-10-21 12:14:19 +01:00
Alexander Bayandin	5b37485c99	Rename dockerfiles from `Dockerfile.<something>` to `<something>.Dockerfile` (#9446 ) ## Problem Our dockerfiles, for some historical reason, have unconventional names `Dockerfile.<something>`, and some tools (like GitHub UI) fail to highlight the syntax in them. > Some projects may need distinct Dockerfiles for specific purposes. A common convention is to name these `<something>.Dockerfile` From: https://docs.docker.com/build/concepts/dockerfile/#filename ## Summary of changes - Rename `Dockerfile.build-tools` -> `build-tools.Dockerfile` - Rename `compute/Dockerfile.compute-node` -> `compute/compute-node.Dockerfile`	2024-10-21 09:51:12 +01:00
Folke Behrens	ed958da38a	proxy: Make tests fail fast when test proxy exited early (#9432 ) This currently happens when proxy is not compiled with feature `testing`. Also fix an adjacent function.	2024-10-21 08:29:23 +00:00
Conrad Ludgate	cc25ef7342	bump pg-session-jwt version (#9455 ) forgot to bump this before	2024-10-20 14:42:50 +02:00
Arpad Müller	71d09c78d4	Accept basebackup <tenant> <timeline> --gzip requests (#9456 ) In #9453, we want to remove the non-gzipped basebackup code in the computes, and always request gzipped basebackups. However, right now the pageserver's page service only accepts basebackup requests in the following formats: * `basebackup <tenant_id> <timeline_id>`, lsn is determined by the pageserver as the most recent one (`timeline.get_last_record_rlsn()`) * `basebackup <tenant_id> <timeline_id> <lsn>` * `basebackup <tenant_id> <timeline_id> <lsn> --gzip` We add a fourth case, `basebackup <tenant_id> <timeline_id> --gzip` to allow gzipping the request for the latest lsn as well.	2024-10-19 00:23:49 +02:00
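The four accepted `basebackup` command shapes listed above can be sketched as a small parser. This is an illustrative model, not the pageserver's page service code: `BasebackupCmd` and `parse_basebackup` are hypothetical names, and IDs and LSNs are kept as plain strings rather than validated.

```rust
/// Hypothetical parsed form of a `basebackup` command.
#[derive(Debug, PartialEq, Eq)]
struct BasebackupCmd {
    tenant_id: String,
    timeline_id: String,
    /// `None` means "let the pageserver pick the most recent LSN".
    lsn: Option<String>,
    gzip: bool,
}

fn make_cmd(tenant: &str, timeline: &str, lsn: Option<&str>, gzip: bool) -> BasebackupCmd {
    BasebackupCmd {
        tenant_id: tenant.to_string(),
        timeline_id: timeline.to_string(),
        lsn: lsn.map(|l| l.to_string()),
        gzip,
    }
}

/// Parse the four supported shapes; anything else is rejected.
fn parse_basebackup(query: &str) -> Result<BasebackupCmd, String> {
    let parts: Vec<&str> = query.split_whitespace().collect();
    match parts.as_slice() {
        ["basebackup", tenant, timeline] => Ok(make_cmd(tenant, timeline, None, false)),
        // The new fourth case: gzip without an explicit LSN.
        // It must be matched before the generic `<lsn>` arm below.
        ["basebackup", tenant, timeline, "--gzip"] => Ok(make_cmd(tenant, timeline, None, true)),
        ["basebackup", tenant, timeline, lsn] => Ok(make_cmd(tenant, timeline, Some(*lsn), false)),
        ["basebackup", tenant, timeline, lsn, "--gzip"] => {
            Ok(make_cmd(tenant, timeline, Some(*lsn), true))
        }
        _ => Err(format!("invalid basebackup command: {}", query)),
    }
}
```

The arm order matters in this sketch: without the dedicated `--gzip` arm, a four-token command would treat `--gzip` as the LSN.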
Tristan Partin	62a334871f	Take the collector name as argument when generating sql_exporter configs In neon_collector_autoscaling.jsonnet, the collector name is hardcoded to neon_collector_autoscaling. This issue manifests itself such that sql_exporter would not find the collector configuration. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-18 09:36:29 -05:00
Vlad Lazar	e162ab8b53	storcon: handle ongoing deletions gracefully (#9449 ) ## Problem Pageserver returns 409 (Conflict) if any of the shards are already deleting the timeline. This resulted in an error being propagated out of the HTTP handler and to the client. It's an expected scenario so we should handle it nicely. This caused failures in `test_storage_controller_smoke` [here](https://neon-github-public-dev.s3.amazonaws.com/reports/pr-9435/11390431900/index.html#suites/8fc5d1648d2225380766afde7c428d81/86eee4b002d6572d). ## Summary of Changes Instead of returning an error on 409s, we now bubble the status code up and let the HTTP handler code retry until it gets a 404 or times out.	2024-10-18 15:33:04 +01:00
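The "retry until 404 or timeout" handling described above might look roughly like the loop below. It is a sketch under assumptions: `delete_until_gone` and `DeleteOutcome` are hypothetical, the HTTP call is abstracted as a closure returning a status code, and treating 202 as "keep polling" is an assumption not stated in the commit message.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq, Eq)]
enum DeleteOutcome {
    /// The resource is gone (404).
    Deleted,
    /// Deletion was still in progress when the deadline passed.
    TimedOut,
    /// Any other status is surfaced to the caller.
    Failed(u16),
}

/// Repeatedly issue a delete request until it reports 404 or the timeout elapses.
fn delete_until_gone<F>(mut send_delete: F, timeout: Duration, backoff: Duration) -> DeleteOutcome
where
    F: FnMut() -> u16,
{
    let deadline = Instant::now() + timeout;
    loop {
        match send_delete() {
            404 => return DeleteOutcome::Deleted,
            // 409: a shard is already deleting the timeline, which is expected.
            // 202 (assumed here): deletion accepted but not finished yet.
            202 | 409 => {}
            other => return DeleteOutcome::Failed(other),
        }
        if Instant::now() >= deadline {
            return DeleteOutcome::TimedOut;
        }
        std::thread::sleep(backoff);
    }
}
```

A caller would pass a closure that performs the real request, for example `delete_until_gone(|| client_delete(), Duration::from_secs(30), Duration::from_millis(500))`, where `client_delete` is whatever function issues the HTTP call.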
Conrad Ludgate	5cbdec9c79	[local_proxy]: install pg_session_jwt extension on demand (#9370 ) Follow up on #9344. We want to install the extension automatically. We didn't want to couple the extension into compute_ctl so instead local_proxy is the one to issue requests specific to the extension. depends on #9344 and #9395	2024-10-18 14:41:21 +01:00
Vlad Lazar	ec6d3422a5	pageserver: disconnect when asking client to reconnect (#9390 ) ## Problem Consider the following sequence of events: 1. Shard location gets downgraded to secondary while there's a libpq connection in pagestream mode from the compute 2. There's no active tenant, so we return `QueryError::Reconnect` from `PageServerHandler::handle_get_page_at_lsn_request`. 3. Error bubbles up to `PostgresBackendIO::process_message`, bailing us out of pagestream mode. 4. We instruct the client to reconnect, but continue serving the libpq connection. The client isn't yet aware of the request to reconnect and believes it is still in pagestream mode. Pageserver fails to deserialize get page requests wrapped in `CopyData` since it's not in pagestream mode. ## Summary of Changes When we wish to instruct the client to reconnect, also disconnect from the server side after flushing the error. Closes https://github.com/neondatabase/cloud/issues/17336	2024-10-18 13:38:59 +01:00
Arseny Sher	fecff15f18	walproposer: immediately exit if sync-safekeepers collected 0/0. (#9442 ) Otherwise term history starting with 0/0 is streamed to safekeepers. ref https://github.com/neondatabase/neon/issues/9434	2024-10-18 15:31:50 +03:00
Jere Vaara	3532ae76ef	compute_ctl: Add endpoint that allows extensions to be installed (#9344 ) Adds endpoint to install extensions: POST `/extensions` ``` {"extension":"pg_sessions_jwt","database":"neondb","version":"1.0.0"} ``` Will be used by `local-proxy`. Example, for the JWT authentication to work the database needs to have the pg_session_jwt extension and also to enable JWT to work in RLS policies. --------- Co-authored-by: Conrad Ludgate <conradludgate@gmail.com>	2024-10-18 15:07:36 +03:00
Folke Behrens	15fecffe6b	Update ruff to much newer version (#9433 ) Includes a multidict patch release to fix build with newer cpython.	2024-10-18 12:42:41 +02:00
Arseny Sher	98fee7a97d	Increase shared_buffers in test_subscriber_synchronous_commit. (#9427 ) Might make the test less flaky.	2024-10-18 13:31:14 +03:00
John Spray	b7173b1ef0	storcon: fix case where we might fail to send compute notifications after two opposite migrations (#9435 ) ## Problem If we migrate A->B, then B->A, and the notification of A->B fails, then we might have retained state that makes us think "A" is the last state we sent to the compute hook, whereas when we migrate B->A we should really be sending a fresh notification in case our earlier failed notification has actually mutated the remote compute config. Closes: #9417 ## Summary of changes - Add a reproducer for the bug (`test_storage_controller_compute_hook_revert`) - Refactor compute hook code to represent remote state with `ComputeRemoteState` which stores a boolean for whether the compute has fully applied the change as well as the request that the compute accepted. - The actual bug fix: after sending a compute notification, if we got a 423 response then update our ComputeRemoteState to reflect that we have mutated the remote state. This way, when we later try and notify for our historic location, we will properly see that as a change and send the notification. Co-authored-by: Vlad Lazar <vlad@neon.tech>	2024-10-18 11:29:23 +01:00
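The A->B, B->A scenario above can be modelled with a small state machine. This is a simplified sketch, not the storage controller's code: `NotifyRequest`, `RemoteState`, and `ComputeHook` are hypothetical, and using 200 as the "fully applied" status is an assumption.

```rust
/// Hypothetical notification payload: which node a tenant is attached to.
#[derive(Debug, Clone, PartialEq, Eq)]
struct NotifyRequest {
    tenant: String,
    node: String,
}

/// What we believe the compute currently holds.
#[derive(Debug, Clone, PartialEq, Eq)]
struct RemoteState {
    /// The last request the compute accepted.
    request: NotifyRequest,
    /// Whether the compute fully applied it.
    applied: bool,
}

#[derive(Default)]
struct ComputeHook {
    remote: Option<RemoteState>,
}

impl ComputeHook {
    /// A notification can be skipped only if the same request was fully applied.
    fn needs_notify(&self, req: &NotifyRequest) -> bool {
        match &self.remote {
            Some(state) => !(state.applied && &state.request == req),
            None => true,
        }
    }

    fn record_response(&mut self, req: NotifyRequest, status: u16) {
        match status {
            // Assumed success status: the compute applied the change.
            200 => self.remote = Some(RemoteState { request: req, applied: true }),
            // 423: the remote state was mutated but the change is not fully applied.
            423 => self.remote = Some(RemoteState { request: req, applied: false }),
            // Other failures: keep our previous view of the remote state.
            _ => {}
        }
    }
}
```

After a 423 for B, `needs_notify` returns true for A again, which is the property the bug fix relies on.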
Jere Vaara	24654b8eee	compute_ctl: Add endpoint that allows setting role grants (#9395 ) This PR introduces a `/grants` endpoint which allows setting specific `privileges` to certain `role` for a certain `schema`. Related to #9344 Together these endpoints will be used to configure JWT extension and set correct usage to its schema to specific roles that will need them. --------- Co-authored-by: Conrad Ludgate <conradludgate@gmail.com>	2024-10-18 11:25:45 +01:00
Conrad Ludgate	b8304f90d6	2024 oct new clippy lints (#9448 ) Fixes new lints from `cargo +nightly clippy` (`clippy 0.1.83 (798fb83f 2024-10-16)`)	2024-10-18 10:27:50 +01:00
Conrad Ludgate	d762ad0883	update rustls (#9396 ) The forever ongoing effort of juggling multiple versions of rustls :3 now with new crypto library aws-lc. Because of dependencies, it is currently impossible to not have both ring and aws-lc in the dep tree, therefore our only options are not updating rustls or having both crypto backends enabled... According to benchmarks run by the rustls maintainer, aws-lc is faster than ring in some cases too <https://jbp.io/graviola/>, so it's not without its upsides.	2024-10-17 20:45:37 +01:00
Arpad Müller	928d98b6dc	Update Rust to 1.82.0 and mold to 2.34.0 (#9445 ) We keep the practice of keeping the compiler up to date, pointing to the latest release. This is done by many other projects in the Rust ecosystem as well. [Release notes](https://github.com/rust-lang/rust/blob/master/RELEASES.md#version-1820-2024-10-17). Also update mold. [release notes for 2.34.0](https://github.com/rui314/mold/releases/tag/v2.34.0), [release notes for 2.34.1](https://github.com/rui314/mold/releases/tag/v2.34.1). Prior update was in #8939.	2024-10-17 21:25:51 +02:00
John Spray	24398bf060	pageserver: detect & warn on loading an old index which is probably the result of a bad generation (#9383 ) ## Problem The pageserver generally trusts the storage controller/control plane to give it valid generations. However, sometimes it should be obvious that a generation is bad, and for defense in depth we should detect that on the pageserver. This PR is part 1 of 2: 1. in this PR we detect and warn on such situations, but do not block starting up the tenant. Once we have confidence that the check is not firing unexpectedly in the field 2. part 2 of 2 will introduce a condition that refuses to start a tenant in this situation, and a test for that (maybe, if we can figure out how to spoof an ancient mtime) Related: #6951 ## Summary of changes - When loading an index older than 2 weeks, log an INFO message noting that we will check for other indices - When loading an index older than 2 weeks _and_ a newer-generation index exists, log a warning.	2024-10-17 19:02:24 +01:00
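The two-step check in the summary above (index older than two weeks, then look for a newer-generation index) can be sketched as a pure function. The names `IndexCheck` and `check_index` are hypothetical, and generations are modelled as plain integers.

```rust
use std::time::Duration;

/// Two weeks, the age threshold mentioned in the commit message.
const OLD_INDEX_THRESHOLD: Duration = Duration::from_secs(14 * 24 * 60 * 60);

#[derive(Debug, PartialEq, Eq)]
enum IndexCheck {
    /// Index is recent; nothing to report.
    Fresh,
    /// Index is old but no newer generation exists (INFO-level case).
    OldButLatest,
    /// Index is old and a newer-generation index exists (warning case).
    OldWithNewerGeneration,
}

/// Classify a loaded index given its age and the generations of other indices found.
fn check_index(index_age: Duration, loaded_generation: u32, other_generations: &[u32]) -> IndexCheck {
    if index_age <= OLD_INDEX_THRESHOLD {
        return IndexCheck::Fresh;
    }
    if other_generations.iter().any(|g| *g > loaded_generation) {
        IndexCheck::OldWithNewerGeneration
    } else {
        IndexCheck::OldButLatest
    }
}
```

In this sketch the function only classifies; per the commit message, part 1 logs in these cases and does not block tenant startup.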
Alex Chi Z.	63b3491c1b	refactor(pageserver): remove aux v1 code path (#9424 ) Part of the aux v1 retirement https://github.com/neondatabase/neon/issues/8623 ## Summary of changes Remove write/read path for aux v1, but keeping the config item and the index part field for now. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-10-17 17:22:44 +01:00
Anastasia Lubennikova	858867c627	Add logging of installed_extensions (#9438 ) Simple PR to log installed_extensions statistics in the following format: ``` 2024-10-17T13:53:02.860595Z INFO [NEON_EXT_STAT] {"extensions":[{"extname":"plpgsql","versions":["1.0"],"n_databases":2},{"extname":"neon","versions":["1.5"],"n_databases":1}]} ```	2024-10-17 16:35:19 +01:00
Erik Grinaker	299cde899b	safekeeper: flush WAL on compute disconnect (#9436 ) ## Problem In #9259, we found that the `check_safekeepers_synced` fast path could result in a lower basebackup LSN than the `flush_lsn` reported by Safekeepers in `VoteResponse`, causing the compute to panic once on startup. This would happen if the Safekeeper had unflushed WAL records due to a compute disconnect. The `TIMELINE_STATUS` query would report a `flush_lsn` below these unflushed records, while `VoteResponse` would flush the WAL and report the advanced `flush_lsn`. See https://github.com/neondatabase/neon/issues/9259#issuecomment-2410849032. ## Summary of changes Flush the WAL if the compute disconnects during WAL processing.	2024-10-17 17:19:18 +02:00
Erik Grinaker	4c9835f4a3	storage_controller: delete stale shards when deleting tenant (#9333 ) ## Problem Tenant deletion only removes the current shards from remote storage. Any stale parent shards (before splits) will be left behind. These shards are kept since child shards may reference data from the parent until new image layers are generated. ## Summary of changes * Document a special case for pageserver tenant deletion that deletes all shards in remote storage when given an unsharded tenant ID, as well as any unsharded tenant data. * Pass an unsharded tenant ID to delete all remote storage under the tenant ID prefix. * Split out `RemoteStorage::delete_prefix()` to delete a bucket prefix, with additional test coverage. * Add a `delimiter` argument to `assert_prefix_empty()` to support partial prefix matches (i.e. all shards starting with a given tenant ID).	2024-10-17 14:34:51 +00:00
Alex Chi Z.	f3a3eefd26	feat(pageserver): do space check before gc-compaction (#9250 ) part of https://github.com/neondatabase/neon/issues/9114 ## Summary of changes gc-compaction may take a lot of disk space, and if it does, the caller should do a partial gc-compaction. This patch adds space check for the compaction job. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-10-17 10:29:53 -04:00
Ivan Efremov	a7c05686cc	test_runner: Update the README.md to build neon with 'testing' (#9437 ) Without having the '--features testing' in the cargo build the proxy won't start causing tests to fail.	2024-10-17 17:20:42 +03:00
Anastasia Lubennikova	8b47938140	Add support of extensions for v17 (part 3) (#9430 ) - pgvector 7.4 update support of extensions for v14-v16: - pgvector 7.2 -> 7.4	2024-10-17 13:37:21 +01:00
Arpad Müller	35e7d91bc9	Add config variable for timeline offloading (#9421 ) Adds a configuration variable for timeline offloading support. The added pageserver-global config option controls whether the pageserver automatically offloads timelines during compaction. Therefore, already offloaded timelines are not affected by this, nor is the manual testing endpoint. This allows the rollout of timeline offloading to be driven by the storage team. Part of #8088	2024-10-17 12:07:58 +00:00
Ivan Efremov	22d8834474	proxy: move the connection pools to separate file (#9398 ) First PR for #9284 Start unification of the client and connection pool interfaces: - Exclude the 'global_connections_count' out from the get_conn_entry() - Move remote connection pools to the conn_pool_lib as a reference - Unify clients among all the conn pools	2024-10-17 13:38:24 +03:00
Folke Behrens	9d9aab3680	Merge pull request #9426 from neondatabase/rc/proxy/2024-10-17 Proxy release 2024-10-17	2024-10-17 12:18:51 +02:00
John Spray	db68e82235	storage_scrubber: fixes to garbage commands (#9409 ) ## Problem While running `find-garbage` and `purge-garbage`, I encountered two things that needed updating: - Console API may omit `user_id` since org accounts were added - When we cut over to using GenericRemoteStorage, the object listings we do during purge did not get proper retry handling, so could easily fail on usual S3 errors, and make the whole process drop out. ...and one bug: - We had a `.unwrap` which expects that after finding an object in a tenant path, a listing in that path will always return objects. This is not true, because a pageserver might be deleting the path at the same time as we scan it. ## Summary of changes - When listing objects during purge, use backoff::retry - Make `user_id` an `Option` - Handle the case where a tenant's objects go away during find-garbage.	2024-10-17 10:06:02 +01:00
Konstantin Knizhnik	934dbb61f5	Check access_count in lfc_evict (#9407 ) ## Problem See https://neondb.slack.com/archives/C033A2WE6BZ/p1729007738526309?thread_ts=1722942856.987979&cid=C033A2WE6BZ When a replica receives a WAL record whose target page is not present in shared buffers, we evict this page from the LFC. If all pages from the LFC chunk are evicted, then the chunk is moved to the beginning of the LRU list to force its reuse. Unfortunately access_count is not checked, and if the entry is being accessed at this moment then this operation can cause LRU list corruption. ## Summary of changes Check `access_count` in `lfc_evict` ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-10-17 08:04:57 +03:00
Christian Schwarz	67d5d98b19	readme: fix build instructions for debian 12 (#9371 ) We need libprotobuf-dev for some of the `/usr/include/google/protobuf/...*.proto` referenced by our protobuf decls.	2024-10-16 21:47:53 +02:00
Tristan Partin	e0fa6bcf1a	Fix some sql_exporter metrics for PG 17 Checkpointer related statistics moved from pg_stat_bgwriter to pg_stat_checkpointer, so we need to adjust our queries accordingly. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-16 14:46:33 -05:00
Tristan Partin	409a286eaa	Fix typo in sql_exporter generator Bad copy-paste seemingly. This manifested itself as a failure to start for the sql_exporter, and was just dying on loop in staging. A future PR will have E2E testing of sql_exporter. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-16 13:08:40 -05:00
Arpad Müller	0551cfb6a7	Fix beta clippy warnings (#9419 ) ``` warning: first doc comment paragraph is too long --> compute_tools/src/installed_extensions.rs:35:1 \| 35 \| / /// Connect to every database (see list_dbs above) and get the list of installed extensions. 36 \| \| /// Same extension can be installed in multiple databases with different versions, 37 \| \| /// we only keep the highest and lowest version across all databases. \| \|_ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#too_long_first_doc_paragraph = note: `#[warn(clippy::too_long_first_doc_paragraph)]` on by default help: add an empty line \| 35 ~ /// Connect to every database (see list_dbs above) and get the list of installed extensions. 36 + /// \| ```	2024-10-16 19:04:56 +01:00
Folke Behrens	ed694732e7	proxy: merge AuthError and AuthErrorImpl (#9418 ) Since GetAuthInfoError now boxes the ControlPlaneError message the variant is not big anymore and AuthError is 32 bytes.	2024-10-16 19:10:49 +02:00
Alex Chi Z.	8a114e3aed	refactor(pageserver): upgrade remote_storage to use hyper1 (#9405 ) part of https://github.com/neondatabase/neon/issues/9255 ## Summary of changes Upgrade remote_storage crate to use hyper1. Hyper0 is used when providing the streaming HTTP body to the s3 SDK, and it is refactored to use hyper1. Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-10-16 16:19:45 +01:00
Arpad Müller	55b246085e	Activate timelines during unoffload (#9399 ) The current code has forgotten to activate timelines during unoffload, leading to inability to receive the basebackup, due to the timeline still being in loading state. ``` stderr: command failed: compute startup failed: failed to get basebackup@0/0 from pageserver postgresql://no_user@localhost:15014 Caused by: 0: db error: ERROR: Not found: Timeline 508546c79b2b16a84ab609fdf966e0d3/bfc18c24c4b837ecae5dbb5216c80fce is not active, state: Loading 1: ERROR: Not found: Timeline 508546c79b2b16a84ab609fdf966e0d3/bfc18c24c4b837ecae5dbb5216c80fce is not active, state: Loading ``` Therefore, also activate the timeline during unoffloading. Part of #8088	2024-10-16 16:47:17 +02:00
Anastasia Lubennikova	9668601f46	Add support of extensions for v17 (part 2) (#9389 ) - plv8 3.2.3 - HypoPG 1.4.1 - pgtap 1.3.3 - timescaledb 2.17.0 - pg_hint_plan 17_1_7_0 - rdkit Release_2024_09_1 - pg_uuidv7 1.6.0 - wal2json 2.6 - pg_ivm 1.9 - pg_partman 5.1.0 update support of extensions for v14-v16: - HypoPG 1.4.0 -> 1.4.1 - pgtap 1.2.0 -> 1.3.3 - plpgsql_check 2.5.3 -> 2.7.11 - pg_uuidv7 1.0.1 -> 1.6.0 - wal2json 2.5 -> 2.6 - pg_ivm 1.7 -> 1.9 - pg_partman 5.0.1 -> 5.1.0	2024-10-16 15:29:23 +01:00
Arpad Müller	3140c14d60	Remove allow(clippy::unknown_lints) (#9416 ) the lint stabilized in 1.80.	2024-10-16 16:28:55 +02:00
John Spray	d6281cbe65	tests: stabilize test_timelines_parallel_endpoints (#9413 ) ## Problem This test would get failures like `command failed: Found no timeline id for branch name 'branch_8'` It's because neon_local is being invoked concurrently for branch creation, which is unsafe (they'll step on each others' JSON writes) Example failure: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-9410/11363051979/index.html#testresult/5ddc56c640f5422b/retries ## Summary of changes - Don't do branch creation concurrently with endpoint creation via neon_local	2024-10-16 15:27:46 +01:00
Vlad Lazar	d490ad23e0	storcon: use the same trace fields for reconciler and results (#9410 ) ## Problem The reconciler uses `seq`, but processing of results uses `sequence`. The field order differs too, which makes the logs annoying to read. ## Summary of Changes Use the same tracing fields in both	2024-10-16 14:04:17 +01:00
Folke Behrens	f14e45f0ce	proxy: format imports with nightly rustfmt (#9414 ) ```shell cargo +nightly fmt -p proxy -- -l --config imports_granularity=Module,group_imports=StdExternalCrate,reorder_imports=true ``` These rust-analyzer settings for VSCode should help retain this style: ```json "rust-analyzer.imports.group.enable": true, "rust-analyzer.imports.prefix": "crate", "rust-analyzer.imports.merge.glob": false, "rust-analyzer.imports.granularity.group": "module", "rust-analyzer.imports.granularity.enforce": true, ```	2024-10-16 15:01:56 +02:00
John Spray	89a65a9e5a	pageserver: improve handling of archival_config calls during Timeline shutdown (#9415 ) ## Problem In test `test_timeline_offloading`, we see failures like: ``` PageserverApiException: queue is in state Stopped ``` Example failure: https://neon-github-public-dev.s3.amazonaws.com/reports/main/11356917668/index.html#testresult/ff0e348a78a974ee/retries ## Summary of changes - Amend code paths that handle errors from RemoteTimelineClient to check for cancellation and emit the Cancelled error variant in these cases (will give clients a 503 to retry) - Remove the implicit `#[from]` for the Other error case, to make it harder to add code that accidentally squashes errors into this (500-equivalent) error variant. This would be neater if we made RemoteTimelineClient return a structured error instead of anyhow::Error, but that's a bigger refactor. I'm not sure if the test really intends to hit this path, but the error handling fix makes sense either way.	2024-10-16 13:39:58 +01:00
Cihan Demirci	bc6b8cee01	don't trigger workflows in two repos (#9340 ) https://github.com/neondatabase/cloud/issues/16723	2024-10-16 10:43:48 +01:00
Tristan Partin	061ea0de7a	Add jsonnetfmt targets This should make it a little bit easier for people wanting to check if their files are formatted correctly. Has the added bonus of making the CI check simpler as well. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-15 20:01:13 -05:00
Tristan Partin	be5d6a69dc	Fix jsonnet_files wildcard Just a typo in a path. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-15 16:30:31 -05:00
Matthias van de Meent	18f4e5f10c	Add newly added metrics from neondatabase/neon#9116 to exports (#9402 ) They weren't added in that PR, but should be available immediately on rollout as the neon extension already defaults to 1.5.	2024-10-15 23:13:31 +02:00
Alex Chi Z.	f1eb703256	fix(pageserver): use a buffer for basebackup; add aux basebackup metrics log (#9401 ) Our replication bench project is stuck because generating the basebackup is too slow, which causes compute to disconnect. https://neondb.slack.com/archives/C03438W3FLZ/p1728330685012419 The compute timeout for waiting for basebackup is 10m (is it true?). Generating basebackup directly on pageserver takes ~3min. Therefore, I suspect that too much time is wasted on round trips when writing the 10000+ snapshot aux files. It is also possible that the basebackup process spends so long retrieving all aux files, without writing anything over the wire protocol, that it causes a read timeout. Basebackup size is 800KB gzipped for that project and was 55MB tar before compression. ## Summary of changes * Potentially fix the issue by placing a write buffer for basebackup. * Log how many aux files we read and the time spent on it. Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-10-15 16:35:21 -04:00
Tristan Partin	cf7a596a15	Generate sql_exporter config files with Jsonnet There are quite a few benefits to this approach: - Reduce config duplication - The two sql_exporter configs were super similar with just a few differences - Pull SQL queries into standalone files - That means we could run a SQL formatter on the file in the future - It also means access to syntax highlighting - In the future, run different queries for different PG versions - This is relevant because right now, we have queries that are failing on PG 17 due to catalog updates Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-15 11:18:38 -05:00
Konstantin Knizhnik	614c3aef72	Remove redundant code (#9373 ) ## Problem There is a double update of the resize cache in `put_rel_truncation`. Also, `page_server_request` contains a check that the fork is MAIN_FORKNUM, which 1. is incorrect (because VM/FSM pages are sharded in the same way as MAIN fork pages) and 2. is redundant, because `page_server_request` is never called for `get page` requests, so the first part of the OR condition is always true. ## Summary of changes Remove redundant code --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-10-15 17:18:52 +03:00
Folke Behrens	fb74c21e8c	proxy: Migrate jwt module away from anyhow (#9361 )	2024-10-15 15:24:56 +02:00
Conrad Ludgate	d92d36a315	[local_proxy] update api for pg_session_jwt (#9359 ) pg_session_jwt now: 1. Sets the JWK in a PGU_BACKEND session guc, no longer in the init() function. 2. JWK no longer needs the kid.	2024-10-15 12:13:57 +00:00
Arpad Müller	ec4cc30de9	Shut down timelines during offload and add offload tests (#9289 ) Add a test for timeline offloading, and subsequent unoffloading. Also adds a manual endpoint, and issues a proper timeline shutdown during offloading which prevents a pageserver hang at shutdown. Part of #8088.	2024-10-15 09:46:51 +00:00
John Spray	73c6626b38	pageserver: stabilize & refine controller scale test (#8971 ) ## Problem We were seeing timeouts on migrations in this test. The test unfortunately tends to saturate local storage, which is shared between the pageservers and the control plane database, which makes the test kind of unrealistic. We will also want to increase the scale of this test, so it's worth fixing that. ## Summary of changes - Instead of randomly creating timelines at the same time as the other background operations, explicitly identify a subset of tenants which will have timelines, and create them at the start. This avoids pageservers putting a lot of load on the test node during the main body of the test. - Adjust the tenants created to create some number of 8 shard tenants and the rest 1 shard tenants, instead of just creating a lot of 2 shard tenants. - Use archival_config to exercise tenant-mutating operations, instead of using timeline creation for this. - Adjust reconcile_until_idle calls to avoid waiting 5 seconds between calls, which causes timeouts with large shard count tenants. - Fix a pageserver bug where calls to archival_config during activation get 404	2024-10-15 09:31:18 +01:00
Alexander Bayandin	0fc4ada3ca	Switch CI, Storage and Proxy to Debian 12 (Bookworm) (#9170 ) ## Problem This PR switches CI and Storage to Debian 12 (Bookworm) based images. ## Summary of changes - Add Debian codename (`bookworm`/`bullseye`) to most of docker tags, create un-codenamed images to be used by default - `vm-compute-node-image`: create a separate spec for `bookworm` (we don't need to build cgroups in the future) - `neon-image`: Switch to `bookworm`-based `build-tools` image - Storage components and Proxy use it - CI: run lints and tests on `bookworm`-based `build-tools` image	2024-10-14 21:12:43 +01:00
Matthias van de Meent	dab96a6eb1	Add more timing histogram and gauge metrics to the Neon extension (#9116 ) We now also track: - Number of PS IOs in-flight - Number of pages cached by smgr prefetch implementation - IO timing histograms for LFC reads and writes, per IO issued ## Problem There's little insight into the timing metrics of LFC, and what the prefetch state of each backend is. This changes that, by measuring (and subsequently exposing) these data points. ## Summary of changes - Extract IOHistogram as separate type, rather than a collection of fields on NeonMetrics - others, see items above. Part of https://github.com/neondatabase/neon/issues/8926	2024-10-14 20:30:21 +02:00
Arpad Müller	f54e3e9147	Also consider offloaded timelines for obtaining retain_lsn (#9308 ) Also consider offloaded timelines for obtaining `retain_lsn`. This is required for correctness for all timelines that have not been flattened yet: otherwise we GC data that might still be required for reading. This somewhat counteracts the original purpose of timeline offloading of not having to iterate over offloaded timelines, but sadly it's required. In the future, we can improve the way the offloaded timelines are stored. We also make the `retain_lsn` optional so that in the future, when we implement flattening, we can make it None. This also applies to full timeline objects by the way, where it would probably make most sense to add a bool flag whether the timeline is successfully flattened, and if it is, one can exclude it from `retain_lsn` as well. Also, track whether a timeline was offloaded or not in `retain_lsn` so that the `retain_lsn` can be excluded from visibility and size calculation. Part of #8088	2024-10-14 17:54:03 +02:00
Vlad Lazar	f4f7ea247c	tests: make size comparisons more lenient (#9388 ) The empirically determined threshold doesn't hold for PG 17. Bump the limit to stabilise ci.	2024-10-14 16:50:12 +01:00
Arpad Müller	d92ff578c4	Add test for fixed storage broker issue (#9311 ) Adds a test for the (now fixed) storage broker limit issue, see #9268 for the description and #9299 for the fix. Also fix a race condition with endpoint creation/starts running in parallel, leading to file not found errors.	2024-10-14 14:34:57 +02:00
Alexander Bayandin	31b7703fa8	CI(build-build-tools): fix unexpected cancellations (#9357 ) ## Problem When `Dockerfile.build-tools` gets changed, several PRs catch up with it and some might get unexpectedly cancelled workflows because of GitHub's concurrency model for workflows. See the comment in the code for more details. It should be possible to revert it after https://github.com/orgs/community/discussions/41518 (I don't expect it anytime soon, but I subscribed) ## Summary of changes - Do not queue `build-build-tools-image` workflows in the concurrency group	2024-10-14 11:51:01 +01:00
Konstantin Knizhnik	d056ae9be5	Ignore pg_dynshmem file when comparing directories (#9374 ) ## Problem On macOS, a `pg_dynshmem` file is created in PGDATADIR, which causes a mismatch in the directory comparison. ## Summary of changes Add this file to the ignore list. Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-10-14 13:45:20 +03:00
Conrad Ludgate	cb9ab7463c	proxy: split out the console-redirect backend flow (#9270 ) removes the ConsoleRedirect backend from the main auth::Backends enum, copy-paste the existing crate::proxy::task_main structure to use the ConsoleRedirectBackend exclusively. This makes the logic a bit simpler at the cost of some fairly trivial code duplication.	2024-10-14 12:25:55 +02:00
Conrad Ludgate	ab5bbb445b	proxy: refactor auth backends (#9271 ) preliminary for #9270 The auth::Backend didn't need to be in the mega ProxyConfig object, so I split it off and passed it manually in the few places it was necessary. I've also refined some of the uses of config I saw while doing this small refactor. I've also followed the trend and made the console redirect backend its own struct, same as LocalBackend and ControlPlaneBackend.	2024-10-11 20:14:52 +01:00
Alexander Bayandin	5ef805e12c	CI(run-python-test-set): allow to skip missing compatibility snapshot (#9365 ) ## Problem Action `run-python-test-set` fails if it is not used for `regress_tests` on release PR, because it expects `test_compatibility.py::test_create_snapshot` to generate a snapshot, and the test exists only in `regress_tests` suite. For example, in https://github.com/neondatabase/neon/pull/9291 [`test-postgres-client-libs`](https://github.com/neondatabase/neon/actions/runs/11209615321/job/31155111544) job failed. ## Summary of changes - Add `skip-if-does-not-exist` input to `.github/actions/upload` action (the same way we do for `.github/actions/download`) - Set `skip-if-does-not-exist=true` for "Upload compatibility snapshot" step in `run-python-test-set` action	2024-10-11 16:58:41 +01:00
a-masterov	091a175a3e	Test versions mismatch (#9167 ) ## Problem We faced the problem of incompatibility of the different components of different versions. This should be detected automatically to prevent production bugs. ## Summary of changes The test for this situation was implemented Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2024-10-11 15:29:54 +02:00
Fedor Dikarev	326cd80f0d	ci: gh-workflow-stats-action v0.1.4: remove debug output and proper pagination (#9356 ) ## Problem In the previous version pagination didn't work, so we collected information only for the first 30 jobs in a WorkflowRun	2024-10-11 14:46:45 +02:00
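A minimal sketch of the pagination pattern that 326cd80f0d refers to, assuming a paged listing API that returns 30 items per page unless asked otherwise (the default for GitHub's REST API). This is not the action's code: `fetch_all` and the `fetch_page` callback are hypothetical names used only for illustration.

```python
from typing import Callable, Dict, List


def fetch_all(
    fetch_page: Callable[[int, int], List[Dict]],
    per_page: int = 100,
) -> List[Dict]:
    """Collect every item from a paged API.

    `fetch_page(page, per_page)` is a hypothetical callback that returns
    the items on the given 1-based page. Stopping after the first call is
    the bug described above: only the first page is ever seen.
    """
    items: List[Dict] = []
    page = 1
    while True:
        batch = fetch_page(page, per_page)
        items.extend(batch)
        # A short (or empty) page means there is nothing left to fetch.
        if len(batch) < per_page:
            break
        page += 1
    return items
```

The loop ends on the first short page, so it costs one extra request only when the total is an exact multiple of `per_page`.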
Folke Behrens	6baf1aae33	proxy: Demote some errors to warnings in logs (#9354 )	2024-10-11 11:29:08 +02:00
John Spray	184935619e	tests: stabilize test_storage_controller_heartbeats (#9347 ) ## Problem This could fail with `reconciliation in progress` if running on a slow test node such that background reconciliation happens at the same time as we call consistency_check. Example: https://neon-github-public-dev.s3.amazonaws.com/reports/main/11258171952/index.html#/testresult/54889c9469afb232 ## Summary of changes - Call reconcile_until_idle before calling consistency check once, rather than calling consistency check until it passes	2024-10-11 09:41:08 +01:00
Ivan Efremov	b2ecbf3e80	Introduce "quota" ErrorKind (#9300 ) ## Problem Fixes #8340 ## Summary of changes Introduced ErrorKind::quota to handle quota-related errors	2024-10-11 10:45:55 +03:00
Tristan Partin	53147b51f9	Use valid type hints for Python 3.9 I have no idea how this made it past the linters. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-10 13:00:25 -05:00
Tristan Partin	006d9dfb6b	Add compute_config_dir fixture Allows easy access to various compute config files. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-10 12:43:40 -05:00
Tristan Partin	1f7904c917	Enable cargo caching in check-codestyle-rust This job takes an extraordinary amount of time for what I understand it to do. The obvious win is caching dependencies. Rory disabled caching in `cd5732d9d8`. I assume this was to get gen3 runners up and running. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-10 12:40:30 -05:00
John Spray	07c714343f	tests: allow a log warning in test_cli_start_stop_multi (#9320 ) ## Problem This test restarts services in an undefined order (whatever neon_local does), which means we should be tolerant of warnings that come from restarting the storage controller while a pageserver is running. We can see failures with warnings from dropped requests, e.g. https://neon-github-public-dev.s3.amazonaws.com/reports/pr-9307/11229000712/index.html#/testresult/d33d5cb206331e28 ``` WARN request{method=GET path=/v1/location_config request_id=b7dbda15-6efb-4610-8b19-a3772b65455f}: request was dropped before completing\n') ``` ## Summary of changes - allow-list the `request was dropped before completing` message on pageservers before restarting services	2024-10-10 17:06:42 +01:00
Tristan Partin	264c34dfb7	Move path-related fixtures into their own module (#9304 ) neon_fixtures.py has grown into quite a beast. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-10 10:26:23 -05:00
Erik Grinaker	9dd80b9b4c	storage_scrubber: fix faulty assertion when no timelines (#9345 ) When there are no timelines in remote storage, the storage scrubber would incorrectly trip an assertion with "Must be set if results are present", referring to the last processed tenant ID. When there are no timelines we don't expect there to be a tenant ID either. The assertion was introduced in `37aa6fd`. Only apply the assertion when any timelines are present.	2024-10-10 09:09:53 -04:00
Erik Grinaker	c2623ffef4	CODEOWNERS: assign `storage_scrubber` to storage (#9346 )	2024-10-10 12:40:35 +01:00
John Spray	426b1c5f08	storage controller: use 'infra' JWT scope for node registration (#9343 ) ## Problem Storage controller `/control` API mostly requires admin tokens, for interactive use by engineers. But for endpoints used by scripts, we should not require admin tokens. Discussion at https://neondb.slack.com/archives/C033RQ5SPDH/p1728550081788989?thread_ts=1728548232.265019&cid=C033RQ5SPDH ## Summary of changes - Introduce the 'infra' JWT scope, which was not previously used in the neon repo - For pageserver & safekeeper node registrations, require infra scope instead of admin Note that admin will still work, as the controller auth checks permit admin tokens for all endpoints irrespective of what scope they require.	2024-10-10 12:26:43 +01:00
Folke Behrens	a202b1b5cc	Merge pull request #9341 from neondatabase/rc/proxy/2024-10-10 Proxy release 2024-10-10	2024-10-10 09:17:11 +02:00
Conrad Ludgate	306094a87d	add local-proxy suffix to wake-compute requests, respect the returned port (#9298 ) https://github.com/neondatabase/cloud/issues/18349 Use the `-local-proxy` suffix to make sure we get the 10432 local_proxy port back from cplane.	2024-10-09 22:43:35 +01:00
Tristan Partin	d3464584a6	Improve some typing in test_runner Fixes some types, adds some types, and adds some override annotations. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-09 15:42:22 -05:00
Tristan Partin	878135fe9c	Move PgBenchInitResult.EXTRACTORS to a private module constant This seems to paper over a behavioral difference in Python 3.9 and Python 3.12 with how dataclasses work with mutable variables. On Python 3.12, I get the following error: ValueError: mutable default <class 'dict'> for field EXTRACTORS is not allowed: use default_factory This obviously doesn't occur in our testing environment. When I do what the error tells me, EXTRACTORS doesn't seem to exist as an attribute on the class in at least Python 3.9. The solution provided in this commit seems like the least amount of friction to keep the wheels turning. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-09 14:02:09 -05:00
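The workaround in 878135fe9c can be sketched roughly as follows. This is an illustration under stated assumptions, not the test_runner code: the `InitResult` class, its field, the regular expression, and the `parse` helper are all hypothetical. The point is only that a mutable mapping kept as a private module constant is never treated as a dataclass field, so the "mutable default" check does not apply to it.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Private module-level constant instead of a class attribute on the
# dataclass. The key and pattern are hypothetical placeholders.
_EXTRACTORS = {
    "duration_s": re.compile(r"done in (\d+\.\d+) s"),
}


@dataclass
class InitResult:
    # Only plain, immutable defaults remain on the dataclass itself.
    duration_s: Optional[float] = None

    @classmethod
    def parse(cls, output: str) -> "InitResult":
        """Build a result by running every extractor over the output."""
        values = {}
        for name, pattern in _EXTRACTORS.items():
            match = pattern.search(output)
            if match:
                values[name] = float(match.group(1))
        return cls(**values)
```

The alternative suggested by the error message, `field(default_factory=...)`, makes the mapping a per-instance attribute rather than a class-level one, which matches the commit's observation that it was no longer reachable as an attribute on the class.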
Conrad Ludgate	75434060a5	local_proxy: integrate with pg_session_jwt extension (#9086 )	2024-10-09 18:24:10 +01:00
Anastasia Lubennikova	721803a0e7	Add partial support of extensions for v17: (#9322 ) - PostGIS 3.5.0 - pgrouting 3.6.2 - h3 4.1.3 - unit 7.9 - pgjwt version (f3d82fd) - pg_hashids 1.2.1 - ip4r 2.4.2 - prefix 1.2.10 - postgresql-hll 2.18 - pg_roaringbitmap 0.5.4 - pg-semver 0.40.0 update support of extensions for v14-v16: - unit 7.7 -> 7.9 - pgjwt 9742dab -> f3d82fd --------- Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2024-10-09 17:07:59 +01:00
Fedor Dikarev	108a211917	added workflow Report Workflow Stats (#9330 ) ## Summary of changes CI: Collect stats for Github Workflows Runs	2024-10-09 17:27:41 +02:00
Heikki Linnakangas	72ef0e0fa1	tests: Remove redundant log lines when stopping storage nodes (#9317 ) The neon_cli functions print the command that gets executed, which contains the same information. Before: 2024-10-07 22:32:28.884 INFO [neon_fixtures.py:3927] Stopping safekeeper 1 2024-10-07 22:32:28.884 INFO [neon_cli.py:73] Running command "/tmp/neon/bin/neon_local safekeeper stop 1" 2024-10-07 22:32:28.989 INFO [neon_fixtures.py:3927] Stopping safekeeper 2 2024-10-07 22:32:28.989 INFO [neon_cli.py:73] Running command "/tmp/neon/bin/neon_local safekeeper stop 2" 2024-10-07 22:32:29.93 INFO [neon_fixtures.py:3927] Stopping safekeeper 3 2024-10-07 22:32:29.94 INFO [neon_cli.py:73] Running command "/tmp/neon/bin/neon_local safekeeper stop 3" 2024-10-07 22:32:29.251 INFO [neon_cli.py:450] Stopping pageserver with ['pageserver', 'stop', '--id=1'] 2024-10-07 22:32:29.251 INFO [neon_cli.py:73] Running command "/tmp/neon/bin/neon_local pageserver stop --id=1" After: 2024-10-07 22:32:28.884 INFO [neon_cli.py:73] Running command "/tmp/neon/bin/neon_local safekeeper stop 1" 2024-10-07 22:32:28.989 INFO [neon_cli.py:73] Running command "/tmp/neon/bin/neon_local safekeeper stop 2" 2024-10-07 22:32:29.94 INFO [neon_cli.py:73] Running command "/tmp/neon/bin/neon_local safekeeper stop 3" 2024-10-07 22:32:29.251 INFO [neon_cli.py:73] Running command "/tmp/neon/bin/neon_local pageserver stop --id=1"	2024-10-09 15:51:34 +03:00
Heikki Linnakangas	eb23d355a9	tests: Use ThreadedMotoServer python class to launch mock S3 server (#9313 ) This is simpler than using subprocess. One difference is in how moto's log output is now collected. Previously, moto's logs went to stderr, and were collected and printed at the end of the test by pytest, like this: 2024-10-07T22:45:12.3705222Z ----------------------------- Captured stderr call ----------------------------- 2024-10-07T22:45:12.3705577Z 127.0.0.1 - - [07/Oct/2024 22:35:14] "PUT /pageserver-test-deletion-queue-2e6efa8245ec92a37a07004569c29eb7 HTTP/1.1" 200 - 2024-10-07T22:45:12.3706181Z 127.0.0.1 - - [07/Oct/2024 22:35:15] "GET /pageserver-test-deletion-queue-2e6efa8245ec92a37a07004569c29eb7/?list-type=2&delimiter=/&prefix=/tenants/43da25eac0f41412696dd31b94dbb83c/timelines/ HTTP/1.1" 200 - 2024-10-07T22:45:12.3706894Z 127.0.0.1 - - [07/Oct/2024 22:35:16] "PUT /pageserver-test-deletion-queue-2e6efa8245ec92a37a07004569c29eb7//tenants/43da25eac0f41412696dd31b94dbb83c/timelines/eabba5f0c1c72c8656d3ef1d85b98c1d/initdb.tar.zst?x-id=PutObject HTTP/1.1" 200 - Note the timestamps: the timestamp at the beginning of the line is the time that the stderr was dumped, i.e. the end of the test, which makes those timestamps rather useless. The timestamp in the middle of the line is when the operation actually happened, but it has only 1 s granularity. 
With this change, moto's log lines are printed in the "live log call" section, as they happen, which makes the timestamps more useful: 2024-10-08 12:12:31.129 INFO [_internal.py:97] 127.0.0.1 - - [08/Oct/2024 12:12:31] "GET /pageserver-test-deletion-queue-e24e7525d437e1874d8a52030dcabb4f/?list-type=2&delimiter=/&prefix=/tenants/7b6a16b1460eda5204083fba78bc360f/timelines/ HTTP/1.1" 200 - 2024-10-08 12:12:32.612 INFO [_internal.py:97] 127.0.0.1 - - [08/Oct/2024 12:12:32] "PUT /pageserver-test-deletion-queue-e24e7525d437e1874d8a52030dcabb4f//tenants/7b6a16b1460eda5204083fba78bc360f/timelines/7ab4c2b67fa8c712cada207675139877/initdb.tar.zst?x-id=PutObject HTTP/1.1" 200 -	2024-10-09 15:34:51 +03:00
Yuchen Liang	bee04b8a69	pageserver: add direct io config to virtual file (#9214 ) ## Problem We need a way to incrementally switch to direct IO. During the rollout we might want to switch to O_DIRECT on image and delta layer read path first before others. ## Summary of changes - Revisited and simplified direct io config in `PageserverConf`. - We could add a fallback mode for open, but for read there isn't a reasonable alternative (without creating another buffered virtual file). - Added a wrapper around `VirtualFile`; the current implementation becomes `VirtualFileInner` - Use `open_v2`, `create_v2`, `open_with_options_v2` when we want to use the IO mode specified in PS config. - Once we onboard all IO through VirtualFile using this new API, we will delete the old code path. - Make io mode live configurable for benchmarking. - Only guaranteed for files opened after the config change, so do it before the experiment. As an example, we are using `open_v2` with `virtual_file::IoMode::Direct` in https://github.com/neondatabase/neon/pull/9169 We also remove `io_buffer_alignment` config in `a04cfd754b` and use it as a compile time constant. This way we don't have to carry the alignment around or make frequent calls to retrieve this information from the static variable. Signed-off-by: Yuchen Liang <yuchen@neon.tech>	2024-10-09 08:33:07 -04:00
Anastasia Lubennikova	63e7fab990	Add /installed_extensions endpoint to collect statistics about extension usage. (#8917 ) Add /installed_extensions endpoint to collect statistics about extension usage. It returns a list of installed extensions in the format: ```json { "extensions": [ { "extname": "extension_name", "versions": ["1.0", "1.1"], "n_databases": 5 } ] } ``` --------- Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2024-10-09 13:32:13 +01:00
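A response in the shape described by 63e7fab990 could be consumed along these lines. This is a sketch that assumes only the three fields named in the commit message; the `InstalledExtension` class, the `parse_installed_extensions` helper, and the `SAMPLE` body are hypothetical and not part of compute_tools.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class InstalledExtension:
    extname: str
    versions: List[str]
    n_databases: int


def parse_installed_extensions(body: str) -> List[InstalledExtension]:
    """Decode a response body of the form {"extensions": [...]}."""
    payload = json.loads(body)
    return [
        InstalledExtension(
            extname=entry["extname"],
            versions=list(entry["versions"]),
            n_databases=int(entry["n_databases"]),
        )
        for entry in payload["extensions"]
    ]


# Hypothetical sample body using the field names from the commit message.
SAMPLE = (
    '{"extensions": [{"extname": "extension_name", '
    '"versions": ["1.0", "1.1"], "n_databases": 5}]}'
)
```

Note that a strict parser such as `json.loads` rejects a trailing comma after the last field of an object, so the body must be valid JSON.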
Arseny Sher	a181392738	safekeeper: add evicted_timelines gauge. (#9318 ) showing total number of evicted timelines.	2024-10-09 14:40:30 +03:00
Alexander Bayandin	fc7397122c	test_runner: fix path to tpc-h queries (#9327 ) ## Problem The path to TPC-H queries was incorrectly changed in #9306. This path is used for `test_tpch` parameterization, so all perf tests started to fail: ``` ==================================== ERRORS ==================================== __________ ERROR collecting test_runner/performance/test_perf_olap.py __________ test_runner/performance/test_perf_olap.py:205: in <module> @pytest.mark.parametrize("query", tpch_queuies()) test_runner/performance/test_perf_olap.py:196: in tpch_queuies assert queries_dir.exists(), f"TPC-H queries dir not found: {queries_dir}" E AssertionError: TPC-H queries dir not found: /__w/neon/neon/test_runner/performance/performance/tpc-h/queries E assert False E + where False = <bound method Path.exists of PosixPath('/__w/neon/neon/test_runner/performance/performance/tpc-h/queries')>() E + where <bound method Path.exists of PosixPath('/__w/neon/neon/test_runner/performance/performance/tpc-h/queries')> = PosixPath('/__w/neon/neon/test_runner/performance/performance/tpc-h/queries').exists ``` ## Summary of changes - Fix the path to tpc-h queries	2024-10-09 12:11:06 +01:00
Vlad Lazar	cc599e23c1	storcon: make observed state updates more granular (#9276 ) ## Problem Previously, observed state updates from the reconciler may have clobbered inline changes made to the observed state by other code paths. ## Summary of changes Model observed state changes from reconcilers as deltas. This means that we only update what has changed. Handling for node going off-line concurrently during the reconcile is also added: set observed state to None in such cases to respect the convention. Closes https://github.com/neondatabase/neon/issues/9124	2024-10-09 11:53:29 +01:00
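The idea in cc599e23c1 of modelling reconciler results as deltas can be sketched as below. The storage controller is written in Rust, and this is a simplified model rather than its data structures: `ObservedDelta`, `apply_deltas`, and the string-valued location states are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ObservedDelta:
    node_id: int
    # None stands for "state unknown", e.g. the node went offline
    # while the reconcile was running.
    location: Optional[str]


def apply_deltas(
    observed: Dict[int, Optional[str]],
    deltas: List[ObservedDelta],
) -> None:
    """Update only the entries the reconciler actually changed.

    Replacing `observed` wholesale with the reconciler's snapshot would
    discard any inline changes made by other code paths in the meantime.
    """
    for delta in deltas:
        observed[delta.node_id] = delta.location
```

For example, if another code path records node 2 while a reconcile for node 1 is in flight, applying `[ObservedDelta(1, None)]` afterwards leaves the entry for node 2 intact.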
Folke Behrens	54d1185789	proxy: Unalias hyper1 and replace one use of hyper0 in test (#9324 ) Leaves one final use of hyper0 in proxy for the health service, which requires some coordinated effort with other services.	2024-10-09 12:44:17 +02:00
Heikki Linnakangas	8a138db8b7	tests: Reduce noise from logging renamed files (#9315 ) Instead of printing the full absolute path for every file, print just the filenames. Before: 2024-10-08 13:19:39.98 INFO [test_pageserver_generations.py:669] Found file /home/heikki/git-sandbox/neon/test_output/test_upgrade_generationless_local_file_paths[debug-pg16]/repo/pageserver_1/tenants/0c04a8df7691a367ad0bb1cc1373ba4d/timelines/f41022551e5f96ce8dbefb9b5d35ab45/000000067F0000000100000A8D0100000000-000000067F0000000100000AC10000000002__00000000014F16F0-v1-00000001 2024-10-08 13:19:39.99 INFO [test_pageserver_generations.py:673] Renamed /home/heikki/git-sandbox/neon/test_output/test_upgrade_generationless_local_file_paths[debug-pg16]/repo/pageserver_1/tenants/0c04a8df7691a367ad0bb1cc1373ba4d/timelines/f41022551e5f96ce8dbefb9b5d35ab45/000000067F0000000100000A8D0100000000-000000067F0000000100000AC10000000002__00000000014F16F0-v1-00000001 -> /home/heikki/git-sandbox/neon/test_output/test_upgrade_generationless_local_file_paths[debug-pg16]/repo/pageserver_1/tenants/0c04a8df7691a367ad0bb1cc1373ba4d/timelines/f41022551e5f96ce8dbefb9b5d35ab45/000000067F0000000100000A8D0100000000-000000067F0000000100000AC10000000002__00000000014F16F0 After: 2024-10-08 13:24:39.726 INFO [test_pageserver_generations.py:667] Renaming files in /home/heikki/git-sandbox/neon/test_output/test_upgrade_generationless_local_file_paths[debug-pg16]/repo/pageserver_1/tenants/3439538816c520adecc541cc8b1de21c/timelines/6a7be8ee707b355de48dd91b326d6ae1 2024-10-08 13:24:39.728 INFO [test_pageserver_generations.py:673] Renamed 000000067F0000000100000A8D0100000000-000000067F0000000100000AC10000000002__00000000014F16F0-v1-00000001 -> 000000067F0000000100000A8D0100000000-000000067F0000000100000AC10000000002__00000000014F16F0	2024-10-09 10:55:56 +01:00
Erik Grinaker	211970f0e0	remote_storage: add `DownloadOpts::byte_(start\|end)` (#9293 ) `download_byte_range()` is basically a copy of `download()` with an additional option passed to the backend SDKs. This can cause these code paths to diverge, and prevents combining various options. This patch adds `DownloadOpts::byte_(start\|end)` and move byte range handling into `download()`.	2024-10-09 10:29:06 +01:00
Heikki Linnakangas	f87f5a383e	tests: Remove redundant log lines when starting an endpoint (#9316 ) The "Starting postgres endpoint <name>" message is not needed, because the neon_cli.py prints the neon_local command line used to start the endpoint. That contains the same information. The "Postgres startup took XX seconds" message is not very useful because no one pays attention to those in the python test logs when things are going smoothly, and if you do wonder about the startup speed, the same information and more can be found in the compute log. Before: 2024-10-07 22:32:27.794 INFO [neon_fixtures.py:3492] Starting postgres endpoint ep-1 2024-10-07 22:32:27.794 INFO [neon_cli.py:73] Running command "/tmp/neon/bin/neon_local endpoint start --safekeepers 1 ep-1" 2024-10-07 22:32:27.901 INFO [neon_fixtures.py:3690] Postgres startup took 0.11398935317993164 seconds After: 2024-10-07 22:32:27.794 INFO [neon_cli.py:73] Running command "/tmp/neon/bin/neon_local endpoint start --safekeepers 1 ep-1"	2024-10-09 09:58:50 +01:00
Arpad Müller	e8ae37652b	Add timeline offload mechanism (#8907 ) Implements an initial mechanism for offloading of archived timelines. Offloading is implemented as specified in the RFC. For now, there is no persistence, so a restart of the pageserver will retrigger downloads until the timeline is offloaded again. We trigger offloading in the compaction loop because we need the signal for whether compaction is done and everything has been uploaded or not. Part of #8088	2024-10-09 01:33:39 +02:00
Tristan Partin	5bd8e2363a	Enable all pyupgrade checks in ruff This will help to keep us from using deprecated Python features going forward. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-08 14:32:26 -05:00
Vlad Lazar	618680c299	storcon: apply all node status changes before handling transitions (#9281 ) ## Problem When a node goes offline, we trigger reconciles to migrate shards away from it. If multiple nodes go offline at the same time, we handled them in sequence. Hence, we might migrate shards from the first offline node to the second offline node and increase the unavailability period. ## Summary of changes Refactor heartbeat delta handling to: 1. Update in memory state for all nodes first 2. Handle availability transitions one by one (we have full picture for each node after (1)) Closes https://github.com/neondatabase/neon/issues/9126	2024-10-08 17:55:25 +01:00
Alexander Bayandin	baf27ba6a3	Fix compiler warnings on macOS (#9319 ) ## Problem On macOS: ``` /Users/runner/work/neon/neon//pgxn/neon/file_cache.c:623:19: error: variable 'has_remaining_pages' is used uninitialized whenever 'for' loop exits because its condition is false [-Werror,-Wsometimes-uninitialized] ``` ## Summary of changes - Initialise `has_remaining_pages` with `false`	2024-10-08 17:34:35 +01:00
Tristan Partin	16417d919d	Remove get_self_dir() It didn't serve much value, and was only used twice. Path(__file__).parent is a pretty easy invocation to use. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-08 08:57:11 -05:00
Heikki Linnakangas	18b97150b2	Remove non-existent entries from .dockerignore (#9209 )	2024-10-08 14:55:24 +03:00
Heikki Linnakangas	17c59ed786	Don't override CFLAGS when building neon extension If you override CFLAGS, you also override any flags that PostgreSQL configure script had picked. That includes many options that enable extra compiler warnings, like '-Wall', '-Wmissing-prototypes', and so forth. The override was added in commit `171385ac14`, but the intention of that was to be more strict, by enabling '-Werror', not less strict. The proper way of setting '-Werror', as documented in the docs and mentioned in PR #2405, is to set COPT='-Werror', but leave CFLAGS alone. All the compiler warnings with the standard PostgreSQL flags have now been fixed, so we can do this without adding noise. Part of the cleanup issue #9217.	2024-10-07 23:49:33 +03:00
Heikki Linnakangas	d7b960c9b5	Silence compiler warning about using variable uninitialized It's not a bug, the variable is initialized when it's used, but the compiler isn't smart enough to see that through all the conditions. Part of the cleanup issue #9217.	2024-10-07 23:49:31 +03:00
Heikki Linnakangas	2ff6d2b6b5	Silence compiler warning about variable only used in assertions Part of the cleanup issue #9217.	2024-10-07 23:49:29 +03:00
Heikki Linnakangas	30f7fbc88d	Add pg_attribute_printf to WalProposerLibLog, per gcc's suggestion /pgxn/neon/walproposer_compat.c:192:9: warning: function ‘WalProposerLibLog’ might be a candidate for ‘gnu_printf’ format attribute [-Wsuggest-attribute=format] 192 \| vsnprintf(buf, sizeof(buf), fmt, args); \| ^~~~~~~~~	2024-10-07 23:49:27 +03:00
Heikki Linnakangas	09f2000f91	Silence warnings about shadowed local variables Part of the cleanup issue #9217.	2024-10-07 23:49:24 +03:00
Heikki Linnakangas	e553ca9e4f	Silence warnings about mixed declarations and code The warning: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement] It's PostgreSQL project style to stick to the old C90 style. (Alternatively, we could disable it for our extension.) Part of the cleanup issue #9217.	2024-10-07 23:49:22 +03:00
Heikki Linnakangas	0a80dbce83	neon_write() function is not used on v17 ifdef it out on v17, to silence compiler warning. Part of the cleanup issue #9217.	2024-10-07 23:49:20 +03:00
Heikki Linnakangas	e763256448	Fix warnings about missing function prototypes Prototypes for neon_writev(), neon_readv(), and neon_registersync() were missing. But instead of adding the missing prototypes, mark all the smgr functions 'static'. Part of the cleanup issue #9217.	2024-10-07 23:49:18 +03:00
Heikki Linnakangas	129d4480bb	Move "/* fallthrough */" comments so that GCC recognizes them This silences warnings about implicit fallthroughs. Part of the cleanup issue #9217.	2024-10-07 23:49:16 +03:00
Heikki Linnakangas	776df963ba	Fix function prototypes Silences these compiler warnings: /pgxn/neon_walredo/walredoproc.c:452:1: warning: ‘CreateFakeSharedMemoryAndSemaphores’ was used with no prototype before its definition [-Wmissing-prototypes] 452 \| CreateFakeSharedMemoryAndSemaphores() \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /pgxn/neon/walproposer_pg.c:541:1: warning: no previous prototype for ‘GetWalpropShmemState’ [-Wmissing-prototypes] 541 \| GetWalpropShmemState() \| ^~~~~~~~~~~~~~~~~~~~ Part of the cleanup issue #9217.	2024-10-07 23:49:13 +03:00
Heikki Linnakangas	11dc5feb36	Remove unused static function In v16 merge, we copied much of heap RMGR, to distinguish vanilla Postgres heap records from records generated with neon patches, with the additional CID fields. This function is only used by the HEAP_TRUNCATE records, however, which we didn't need to copy. Part of the cleanup issue #9217.	2024-10-07 23:49:11 +03:00
Heikki Linnakangas	dbbe57a837	Remove unused local vars and a prototype for non-existent function Per compiler warnings. Part of the cleanup issue #9217.	2024-10-07 23:49:09 +03:00
Em Sharnoff	cc29def544	vm-monitor: Ignore LFC in postgres cgroup memory threshold (#8668 ) In short: Currently we reserve 75% of memory for the LFC, meaning that we scale up to keep postgres using less than 25% of the compute's memory. This means that for certain memory-heavy workloads, we end up scaling much higher than is actually needed: in the worst case up to 4x, although in practice it tends not to be quite so bad. Part of neondatabase/autoscaling#1030.	2024-10-07 21:25:34 +01:00
Arpad Müller	912d47ec02	storage_broker: update hyper and tonic again (#9299 ) Update hyper and tonic again in the storage broker, this time with a fix for the issue that made us revert the update last time. The first commit is a revert of #9268, the second a fix for the issue. fixes #9231.	2024-10-07 21:12:13 +02:00
Tristan Partin	6eba29c732	Improve logging on changes in a compute's status I'm trying to debug a situation with the LR benchmark publisher not being in the correct state. This should aid in debugging, while just being generally useful. PR: https://github.com/neondatabase/neon/pull/9265 Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-07 13:19:48 -04:00
Heikki Linnakangas	99d4c1877b	Replace BUFFERTAGS_EQUAL compatibility macro with new-style function (#9294 ) In PostgreSQL v16, BUFFERTAGS_EQUAL was replaced with a static inline function, BufferTagsEqual. Let's use the new name going forward, and have backwards-compatibility glue to allow using the new name on v14 and v15, rather than the other way round. This also makes BufferTagsEqual consistent with InitBufferTag, for which we were already using the new name.	2024-10-07 19:49:27 +03:00
Jere Vaara	2272dc8a48	feat(compute_tools): Create JWKS Postgres roles without attributes (#9031 ) Requires https://github.com/neondatabase/neon/pull/9086 first to have `local_proxy_config`. This logic can still be reviewed implementation wise. Create JWT Auth functionality related roles without attributes and `neon_superuser` group. Read the JWT related roles from `local_proxy_config` `JWKS` settings and handle them differently than other console created roles.	2024-10-07 19:37:32 +03:00
Heikki Linnakangas	323bd018cd	Make sure BufferTag padding bytes are cleared in hash keys (#9292 ) The prefetch-queue hash table uses a BufferTag struct as the hash key, and it's hashed using hash_bytes(). It's important that all the padding bytes in the key are cleared, because hash_bytes() will include them. I was getting compiler warnings like this on v14 and v15, when compiling with -Warray-bounds: In function ‘prfh_lookup_hash_internal’, inlined from ‘prfh_lookup’ at pg_install/v14/include/postgresql/server/lib/simplehash.h:821:9, inlined from ‘neon_read_at_lsnv’ at pgxn/neon/pagestore_smgr.c:2789:11, inlined from ‘neon_read_at_lsn’ at pgxn/neon/pagestore_smgr.c:2904:2: pg_install/v14/include/postgresql/server/storage/relfilenode.h:90:43: warning: array subscript ‘PrefetchRequest[0]’ is partly outside array bounds of ‘BufferTag[1]’ {aka ‘struct buftag[1]’} [-Warray-bounds] 89 \| ((node1).relNode == (node2).relNode && \ \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 90 \| (node1).dbNode == (node2).dbNode && \ \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~ 91 \| (node1).spcNode == (node2).spcNode) \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ pg_install/v14/include/postgresql/server/storage/buf_internals.h:116:9: note: in expansion of macro ‘RelFileNodeEquals’ 116 \| RelFileNodeEquals((a).rnode, (b).rnode) && \ \| ^~~~~~~~~~~~~~~~~ pgxn/neon/neon_pgversioncompat.h:25:31: note: in expansion of macro ‘BUFFERTAGS_EQUAL’ 25 \| #define BufferTagsEqual(a, b) BUFFERTAGS_EQUAL((a), (b)) \| ^~~~~~~~~~~~~~~~ pgxn/neon/pagestore_smgr.c:220:34: note: in expansion of macro ‘BufferTagsEqual’ 220 \| #define SH_EQUAL(tb, a, b) (BufferTagsEqual(&(a)->buftag, &(b)->buftag)) \| ^~~~~~~~~~~~~~~ pg_install/v14/include/postgresql/server/lib/simplehash.h:280:77: note: in expansion of macro ‘SH_EQUAL’ 280 \| #define SH_COMPARE_KEYS(tb, ahash, akey, b) (ahash == SH_GET_HASH(tb, b) && SH_EQUAL(tb, b->SH_KEY, akey)) \| ^~~~~~~~ pg_install/v14/include/postgresql/server/lib/simplehash.h:799:21: note: in 
expansion of macro ‘SH_COMPARE_KEYS’ 799 \| if (SH_COMPARE_KEYS(tb, hash, key, entry)) \| ^~~~~~~~~~~~~~~ pgxn/neon/pagestore_smgr.c: In function ‘neon_read_at_lsn’: pgxn/neon/pagestore_smgr.c:2742:25: note: object ‘buftag’ of size 20 2742 \| BufferTag buftag = {0}; \| ^~~~~~ This commit silences those warnings, although it's not clear to me why the compiler complained like that in the first place. I found the issue with padding bytes while looking into those warnings, but that was coincidental, I don't think the padding bytes explain the warnings as such. In v16, the BUFFERTAGS_EQUAL macro was replaced with a static inline function, and that also silences the compiler warning. Not clear to me why.	2024-10-07 18:04:04 +03:00
Folke Behrens	ad267d849f	proxy: Move module base files into module directory (#9297 )	2024-10-07 16:25:34 +02:00
Conrad Ludgate	8cd7b5bf54	proxy: rename console -> control_plane, rename web -> console_redirect (#9266 ) rename console -> control_plane rename web -> console_redirect I think these names are a little more representative.	2024-10-07 14:09:54 +01:00
Konstantin Knizhnik	47c3c9a413	Fix update of statistic for LFC/prefetch (#9272 ) ## Problem See #9199 ## Summary of changes Fix update of hits/misses for LFC and prefetch introduced in `78938d1b59` Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-10-07 12:21:16 +03:00
Arseny Sher	eae4470bb6	safekeeper: remove local WAL files ignoring peer_horizon_lsn. (#8900 ) If peer safekeeper needs garbage collected segment it will be fetched now from s3 using on-demand WAL download. Reduces danger of running out of disk space when safekeeper fails.	2024-10-04 19:07:39 +03:00
Ivan Efremov	2d248aea6f	proxy: exclude triple logging of connect compute errors (#9277 ) Fixes (#9020) - Use the compute::COULD_NOT_CONNECT for connection error message; - Eliminate logging for one connection attempt; - Typo fix.	2024-10-04 18:21:39 +03:00
Conrad Ludgate	6c05f89f7d	proxy: add local-proxy to compute image (#8823 ) 1. Adds local-proxy to compute image and vm spec 2. Updates local-proxy config processing, writing PID to a file eagerly 3. Updates compute-ctl to understand local proxy compute spec and to send SIGHUP to local-proxy over that pid. closes https://github.com/neondatabase/cloud/issues/16867	2024-10-04 14:52:01 +00:00
Arseny Sher	db53f98725	neon walsender_hooks: take basebackup LSN directly. (#9263 ) NeonWALReader needs to know the LSN before which WAL is not available locally, that is, the basebackup LSN. Previously it was taken from WalpropShmemState, but that's racy, as walproposer sets it there only after a successful election. Get it directly with GetRedoStartLsn. Should fix flakiness of test_ondemand_wal_download_in_replication_slot_funcs etc. ref #9201	2024-10-04 14:56:15 +01:00
Erik Grinaker	04a6222418	remote_storage: add `head_object` integration test (#9274 )	2024-10-04 12:40:41 +01:00
Vlad Lazar	dcf7af5a16	storcon: do timeline creation on all attached location (#9237 ) ## Problem Creation of timelines during a reconciliation can lead to unavailability if the user attempts to start a compute before the storage controller has notified cplane of the cut-over. ## Summary of changes Create timelines on all currently attached locations. For the latest location, we still look at the database (this is as previously). With this change we also look into the observed state to find other attached locations. Related https://github.com/neondatabase/neon/issues/9144	2024-10-04 11:56:43 +01:00
Erik Grinaker	37158d0424	pageserver: use conditional GET for secondary tenant heatmaps (#9236 ) ## Problem Secondary tenant heatmaps were always downloaded, even when they hadn't changed. This can be avoided by using a conditional GET request passing the `ETag` of the previous heatmap. ## Summary of changes The `ETag` was already plumbed down into the heatmap downloader, and just needed further plumbing into the remote storage backends. * Add a `DownloadOpts` struct and pass it to `RemoteStorage::download()`. * Add an optional `DownloadOpts::etag` field, which uses a conditional GET and returns `DownloadError::Unmodified` on match.	2024-10-04 12:29:48 +02:00
Erik Grinaker	60fb840e1f	Cargo.toml: enable `sso` for `aws-config` (#9261 ) ## Problem The S3 tests couldn't use SSO authentication for local tests against S3. ## Summary of changes Enable the `sso` feature of `aws-config`. Also run `cargo hakari generate` which made some updates to `workspace_hack`.	2024-10-04 11:27:06 +01:00
Heikki Linnakangas	52232dd85c	tests: Add a comment explaining the rules of NeonLocalCli wrappers (#9195 )	2024-10-03 22:03:29 +03:00
Heikki Linnakangas	8ef0c38b23	tests: Rename NeonLocalCli functions to match the 'neon_local' commands (#9195 ) This makes it more clear that the functions in NeonLocalCli are just typed wrappers around the corresponding 'neon_local' commands.	2024-10-03 22:03:27 +03:00
Heikki Linnakangas	56bb1ac458	tests: Move NeonCli and friends to separate file (#9195 ) In passing, rename it to NeonLocalCli, to reflect that the binary is called 'neon_local'. Add wrapper for the 'timeline_import' command, eliminating the last raw call to the raw_cli() function from tests, except for a few in test_neon_cli.py which are about testing the 'neon_local' itself. All the other calls are now made through the strongly-typed wrapper functions.	2024-10-03 22:03:25 +03:00
Heikki Linnakangas	19db9e9aad	tests: Replace direct calls to neon_cli with wrappers in NeonEnv (#9195 ) Add wrappers for a few commands that didn't have them before. Move the logic to generate tenant and timeline IDs from NeonCli to the callers, so that NeonCli is more purely just a type-safe wrapper around 'neon_local'.	2024-10-03 22:03:22 +03:00
David Gomes	4e9b32c442	chore: makes some onboarding document improvements (#9216 ) * I had to install `m4` in order to be able to run locally * The docs/docker.md was missing a pointer to where the compute node code is (Was originally on #8888 but I am pulling this out)	2024-10-03 20:58:30 +02:00
David Gomes	2fac0b7fac	chore: remove unnecessary comments in compute/Dockerfile.compute-node (#9253 ) See [this comment](https://github.com/neondatabase/neon/pull/8888#discussion_r1783130082).	2024-10-03 18:26:41 +00:00
Arpad Müller	e3d6ecaeee	Revert hyper and tonic updates (#9268 )	2024-10-03 19:21:22 +01:00
Arseny Sher	d785fcb5ff	safekeeper: fix panic in debug_dump. (#9097 ) Panic was triggered only when dump selected no timelines. sentry report: https://neondatabase.sentry.io/issues/5832368589/	2024-10-03 19:22:22 +03:00
Vlad Lazar	552fa2b972	pageserver: tweak oversized key read path warning (#9221 ) ## Problem `Oversized vectored read [...]` logs are spewing in prod because we have a few keys that are unexpectedly large: * reldir/relblock - these are unbounded, so it's known technical debt * slru block - they can be a bit bigger than 128KiB due to storage format overhead ## Summary of changes * Bump threshold to 130KiB * Don't warn on oversized reldir and dbdir keys Closes https://github.com/neondatabase/neon/issues/8967	2024-10-03 16:40:35 +01:00
Arpad Müller	9d93dd4807	Rename hyper 1.0 to hyper and hyper 0.14 to hyper0 (#9254 ) Follow-up of #9234 to give hyper 1.0 the version-free name, and the legacy version of hyper the one with the version number inside. As we move away from hyper 0.14, we can remove the `hyper0` name piece by piece. Part of #9255	2024-10-03 16:33:43 +02:00
Heikki Linnakangas	53b6e1a01c	vm-monitor: Upgrade axum from 0.6 to 0.7 (#9257 ) Because: - it's nice to be up-to-date, - we already had axum 0.7 in our dependency tree, so this avoids having to compile two versions, and - removes one of the remaining dependencies on hyper version 0 Also bumps the 'tokio-tungstenite' dependency, to avoid having two versions in the dependency tree.	2024-10-03 16:49:39 +03:00
Folke Behrens	90f731f3b1	Merge pull request #9256 from neondatabase/rc/proxy/2024-10-03 Proxy release 2024-10-03	2024-10-03 11:01:41 +02:00
Joonas Koivunen	dbef1b064c	chore: smaller layer changes (#9247 ) Address minor technical debt in Layer inspired by #9224: - layer usage as arg same as in spans - avoid one Weak::upgrade	2024-10-03 09:38:45 +01:00
Heikki Linnakangas	6a9e2d657c	Remove unnecessary dependencies from postgis-build image (#9211 ) The apt install stage before this commit: 0 upgraded, 391 newly installed, 0 to remove and 9 not upgraded. Need to get 261 MB of archives. after: 0 upgraded, 367 newly installed, 0 to remove and 9 not upgraded. Need to get 220 MB of archives.	2024-10-03 10:05:23 +03:00
Arpad Müller	2d8f6d7906	Suppress wal lag timeout warnings right after tenant attachment (#9232 ) As seen in https://github.com/neondatabase/cloud/issues/17335, during releases we can have ingest lags that are above the limits for warnings. However, such lags are part of normal pageserver startup. Therefore, calculate a certain cooldown timestamp until which we accept lags up to a certain size. The heuristic is chosen to grow the later we get to fully load the tenant, and we also add 60 seconds as a grace period after that term.	2024-10-03 02:33:09 +01:00
Arpad Müller	1b176fe74a	Use hyper 1.0 and tonic 0.12 in storage broker (#9234 ) Fixes #9231 . Upgrade hyper to 1.4.0 and use hyper 1.4 instead of 0.14 in the storage broker, together with tonic 0.12. The two upgrades go hand in hand. Thanks to the broker being independent from other components, we can upgrade its hyper version without touching the other components, which makes things easier.	2024-10-03 00:48:12 +02:00
Heikki Linnakangas	1dec93f129	Add compute_tools/ to the list of paths that trigger an E2E run on a PR (#9251 ) compute_ctl is an important part of the interfaces between the control plane and the compute, so it seems important to E2E test any changes there.	2024-10-03 00:31:19 +03:00
Alexander Bayandin	16002f5e45	test_runner: bump `requests` and `psycopg2-binary` (#9248 ) ## Problem ``` Warning: The file chosen for install of requests 2.32.0 (requests-2.32.0-py3-none-any.whl) is yanked. Reason for being yanked: Yanked due to conflicts with CVE-2024-35195 mitigation ``` ## Summary of changes - Update `requests` to fix the warning - Update `psycopg2-binary`	2024-10-02 21:26:45 +01:00
dotdister	09d4bad1be	Change parentheses to clarify conditions in walproposer (#9180 ) Some parentheses in walproposer's conditional expressions are redundant, while others are needed for clarity. ## Summary of changes Change some parentheses to clarify conditions in walproposer. Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2024-10-02 14:49:52 -04:00
Heikki Linnakangas	d20448986c	Fix metric name of the 'getpage_wait_seconds_bucket' metric (#9242 ) Per convention, histogram buckets have the '_bucket' suffix. I got that wrong in commit `0d500bbd5b`. Fixes https://github.com/neondatabase/neon/issues/9241	2024-10-02 20:05:14 +03:00
John Spray	d54624153d	tests: sync_after_each_test -> sync_between_tests (#9239 ) ## Problem We are seeing frequent pageserver startup timeouts while it calls syncfs(). There is an existing fixture that syncs _after_ tests, but not before the first one. We hypothesize that some failures are happening on the first test in a job. ## Summary of changes - extend the existing sync_after_each_test to be a sync between all tests, including sync'ing before running the first test. That should remove any ambiguity about whether the sync is happening on the correct node. This is an alternative to https://github.com/neondatabase/neon/pull/8957 -- I didn't realize until I saw Alexander's comment on that PR that we have an existing hook that syncs filesystems and can be extended.	2024-10-02 17:44:25 +01:00
Alex Chi Z.	700885471f	fix(test): only test num of L1 layers in compaction smoke test (#9186 ) close https://github.com/neondatabase/neon/issues/9160 For whatever reason, pg17's WAL pattern seems different from others, which triggers some flaky behavior within the compaction smoke test. ## Summary of changes * Run L0 compaction before proceeding with the read benchmark. * So that we can ensure the num of L0 layers is 0 and test the compaction behavior only with L1 layers. We have a threshold for triggering L0 compaction. In some cases, the test case did not produce enough L0 layers to do a L0 compaction, therefore leaving the layer map with 3+ L0 layers above the L1 layers. This increases the average read depth for the timeline. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-10-02 17:42:35 +01:00
Vlad Lazar	38a8dcab9f	storcon: add metric for long running reconciles (#9207 ) ## Problem We don't have an alert for long running reconciles. Stuck reconciles are problematic as we've seen in a recent incident. ## Summary of changes Add a new metric `storage_controller_reconcile_long_running_total` with labels: `{tenant_id, shard_number, seq}`. The metric is removed after the long running reconcile finishes. These events should be rare, so we won't break the bank on cardinality. Related https://github.com/neondatabase/neon/issues/9150	2024-10-02 17:25:11 +01:00
Vlad Lazar	8dbfda98d4	storcon: ignore deleted timelines on new location catch-up (#9244 ) ## Problem If a timeline was deleted right before waiting for LSNs to catch up before the cut-over, then we would wait forever. ## Summary of changes Fix the issue and add a test for timeline deletions mid migration. Related https://github.com/neondatabase/neon/issues/9144	2024-10-02 17:23:26 +01:00
John Spray	f875e107aa	pageserver: tweak logging of "became visible" for layers (#9224 ) ## Problem Recent change to avoid the "became visible" log messages from certain tasks missed a task: the logical size calculation that happens as a child of synthetic size calculation. Related: https://github.com/neondatabase/neon/issues/9058 ## Summary of changes - Add OnDemandLogicalSize to the list of permitted tasks for reads making a covered layer visible - Tweak the log message to use layer name instead of key: this is more terse, and easier to use when debugging, as one can search for it elsewhere to see when the layer was written/downloaded etc.	2024-10-02 13:21:04 +01:00
Folke Behrens	1e90e792d6	proxy: Add timeout to webauth confirmation wait (#9227 ) ```shell $ cargo run -p proxy --bin proxy -- --auth-backend=web --webauth-confirmation-timeout=5s ``` ``` $ psql -h localhost -p 4432 NOTICE: Welcome to Neon! Authenticate by visiting within 5s: http://localhost:3000/psql_session/e946900c8a9bc6e9 psql: error: connection to server at "localhost" (::1), port 4432 failed: Connection refused Is the server running on that host and accepting TCP/IP connections? connection to server at "localhost" (127.0.0.1), port 4432 failed: ERROR: Disconnected due to inactivity after 5s. ```	2024-10-02 12:10:56 +02:00
Matthias van de Meent	ea32f1d0a3	Expose more granular wait event data to the user (#9163 ) In PG17, there is this newfangled custom wait events system. This commit adds that feature to Neon, so that users can see what their backends may be waiting for when a PostgreSQL backend is playing the waiting game in Neon code.	2024-10-02 11:12:50 +02:00
Heikki Linnakangas	2e3b7862d0	Fix compute metrics collector config (#9235 )	2024-10-02 09:44:00 +01:00
Arpad Müller	387e569259	Update aws SDK crates (#9233 ) This updates the aws SDK crates to their newest released versions.	2024-10-02 08:00:08 +02:00
Alex Chi Z.	31f12f6426	fix: ignore tonic to resolve advisories (#9230 ) check-rust-style fails because the tonic version is too old. This does not seem to be an easy fix, so ignore it in the deny list. Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-10-01 19:26:54 -04:00
Heikki Linnakangas	8861e8a323	Fix the size of the perf counters shared memory array (#9226 ) MaxBackends doesn't include auxiliary processes. Whenever an aux process made IO operations that updated the counters, it would scribble over shared memory beyond the end of the array. The relsize cache hash table comes after the array, so the symptom was an error about hash table corruption in the relsize cache hash.	2024-10-01 20:07:51 +01:00
Arseny Sher	62e22dfd85	Backpressure: reset ps display after it is done. (#8980 ) Previously we set the 'backpressure throttling' status, but overwrote current one and never reset it back.	2024-10-01 20:55:05 +03:00
Arseny Sher	17672c88ff	tests: wait walreceiver on sks to be gone on 'immediate' ep restart. (#9099 ) When endpoint is stopped in immediate mode and started again there is a chance of old connection delivering some WAL to safekeepers after second start checked need for sync-safekeepers and thus grabbed basebackup LSN. It makes basebackup unusable, so compute panics. Avoid flakiness by waiting for walreceivers on safekeepers to be gone in such cases. A better way would be to bump term on safekeepers if sync-safekeepers is skipped, but it needs more infrastructure. ref https://github.com/neondatabase/neon/issues/9079	2024-10-01 20:54:00 +03:00
Matthias van de Meent	6efdb1d0f3	Fix small memory accounting bug in libpagestore (#9223 ) Found while searching for other issues in shared memory. The bug should be benign, in that it over-allocates memory for this struct, but doesn't allow for out-of-bounds writes.	2024-10-01 17:37:59 +01:00
Erik Grinaker	325de52e73	pageserver: remove `TenantConfOpt::TryFrom<toml_edit::Item>` (#9219 ) Following #7656, `TenantConfOpt::TryFrom<toml_edit::Item>` appears to be dead code. This patch removes `TenantConfOpt::TryFrom<toml_edit::Item>`. The code does appear to be dead, since the TOML config is deserialized into `TenantConfig` (via `LocationConfig`) and then converted into `TenantConfOpt`. This was verified by adding a panic to `try_from()` and running the pageserver unit tests as well as a local end-to-end cluster (including creating a new tenant and restarting the pageserver). This did not fail, so this is not used on the common happy path at least. No explicit `try_from` or `try_into` calls were found either. Resolves #8918.	2024-10-01 16:35:18 +01:00
Anastasia Lubennikova	ce73db9316	Fix post_apply_config() (#9220 ) Bring back post_apply_config() step that was accidentally removed in `78938d1`	2024-10-01 16:28:58 +01:00
Shinya Kato	b675997f48	safekeeper: Fix a log message of HTTP worker (#9213 ) ## Problem There is a wrong log message. ## Summary of changes Fixed the log message.	2024-10-01 17:16:53 +02:00
Alex Chi Z.	49f99eb729	docs: add aux file v2 RFC (#9115 ) aux v2 migration is near the end and I rewrote the RFC based on what I proposed (several months before...) and what I actually implemented. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-10-01 15:56:54 +01:00
Heikki Linnakangas	0d500bbd5b	Add new compute metrics to sql exporter (#9190 ) These are the perf counters added in commit `263dfba6ee`. Note: This relies on 'neon' extension version 1.5. The default was bumped to 1.5 in commit `d696c41807`. --------- Co-authored-by: Matthias van de Meent <matthias@neon.tech>	2024-10-01 17:38:19 +03:00
Heikki Linnakangas	1b8b50755c	Use debian packages for cmake again (#9212 ) On bookworm, 'cmake' is new enough that we can just use it. On bullseye, we can get a new-enough package from backports. By including 'cmake' in the build-deps stage, we don't need to install it separately in all the later build stages that need it. See https://github.com/neondatabase/neon/pull/2699, where we switched to downloading and building a specific version.	2024-10-01 15:09:09 +03:00
Conrad Ludgate	4391b25d01	proxy: ignore typ and use jwt.alg rather than jwk.alg (#9215 ) Microsoft exposes JWKs without the alg header. It's only included on the tokens. Not a problem. Also noticed that wrt the `typ` header: > It will typically not be used by applications when it is already known that the object is a JWT. This parameter is ignored by JWT implementations; any processing of this parameter is performed by the JWT application. Since we know we are expecting JWTs only, I've followed the guidance and removed the validation.	2024-10-01 10:36:49 +01:00
John Spray	40b10b878a	storage_scrubber: retry on index deletion failures (#9204 ) ## Problem In automated tests running on AWS S3, we frequently see scrubber failures when it can't delete an index. `location_conf_churn`: https://neon-github-public-dev.s3.amazonaws.com/reports/main/11076221056/index.html#/testresult/f89b1916b6a693e2 `scrubber_physical_gc`: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-9178/11074269153/index.html#/testresult/9885ed5aa0fe38b6 ## Summary of changes Wrap index deletion in a backoff::retry --------- Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2024-10-01 10:34:39 +01:00
David Gomes	d6c6b0a509	feat(compute): adds pg_session_jwt extension to compute image (#8888 ) ## Problem We need the [pg_session_jwt](https://github.com/neondatabase/pg_session_jwt/) extension in the compute image. This PR adds it. ## Summary of changes I added the `pg_session_jwt` extension in a very similar way to how the pggraphql and pgtiktoken extensions were added (since they're all written with pgrx). Then I tested this. ``` $ cd docker-compose/ $ PG_VERSION=16 TAG=10667533475 docker-compose up --build -d $ psql postgresql://cloud_admin:cloud_admin@localhost:55433/postgres cloud_admin@postgres=# create extension pg_session_jwt; CREATE EXTENSION Time: 43.048 ms cloud_admin@postgres=# \df auth.*; List of functions ┌────────┬──────────────────┬──────────────────┬─────────────────────┬──────┐ │ Schema │ Name │ Result data type │ Argument data types │ Type │ ├────────┼──────────────────┼──────────────────┼─────────────────────┼──────┤ │ auth │ get │ jsonb │ s text │ func │ │ auth │ init │ void │ kid bigint, s jsonb │ func │ │ auth │ jwt_session_init │ void │ s text │ func │ │ auth │ user_id │ text │ │ func │ └────────┴──────────────────┴──────────────────┴─────────────────────┴──────┘ (4 rows) cloud_admin@postgres=# select auth.init(cast('1' as bigint), to_jsonb(TEXT '{ "kty": "EC", "kid": "571683be-33cf-4e67-bccc-8905c0ebb862", "crv": "P-521", "alg": "ES512", "x": "AM_GsnQvKML2yXdn_OsN8PdgO1Sf9XMXih5vQMKLmJkp-Iz_FFWJUt6uyR_qp4brr8Ji2kjGJgN4cQJpg2kskH7V", "y": "AZg-salw24lCmsBP-BCBa5jT6INkTwLtCOC7o0BIxDVvmIEH1-PQAJVYVJPTFvPMi_PLa0QlOm-ufJYkynwa2Mau" }')); ERROR: called `Result::unwrap()` on an `Err` value: Error("invalid type: string \"{ \\\"kty\\\": \\\"EC\\\", \\\"kid\\\": \\\"571683be-33cf-4e67-bccc-8905c0ebb862\\\", \\\"crv\\\": \\\"P-521\\\", \\\"alg\\\": \\\"ES512\\\", \\\"x\\\": \\\"AM_GsnQvKML2yXdn_OsN8PdgO1Sf9XMXih5vQMKLmJkp-Iz_FFWJUt6uyR_qp4brr8Ji2kjGJgN4cQJpg2kskH7V\\\", \\\"y\\\": 
\\\"AZg-salw24lCmsBP-BCBa5jT6INkTwLtCOC7o0BIxDVvmIEH1-PQAJVYVJPTFvPMi_PLa0QlOm-ufJYkynwa2Mau\\\" }\", expected struct JwkEcKey", line: 0, column: 0) Time: 6.991 ms ``` ## Checklist before requesting a review - [x] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Move the download location to a proper URL	2024-10-01 10:29:56 +01:00
John Spray	d515727e94	tests: make test_multi_attach more stable (#9202 ) ## Problem `test_multi_attach` is sometimes failing with `invalid compute status for configuration request: Configuration`. This is likely a result of the test attempting to reconfigure the compute at the same time as the storage controller is doing so. This test was originally written before the storage controller existed, and is not expecting anything else to be reconfiguring computes at the same time. ## Summary of changes - Configure the tenant into scheduling policy `Stop` in the storage controller at the start of the test, so that it won't try to do anything to the tenant while the test is running.	2024-10-01 10:15:18 +01:00
Folke Behrens	2e508b1ff9	Upgrade OpenTelemetry and other tracing crates (#9200 ) * tracing-utils now returns a `Layer` impl. Removes the need for crates to import OTel crates. * Drop the /v1/traces URI check. Verified that the code does the right thing. * Leave a TODO to hook in an error handler for OTel to log errors to when it assumes the regular pipeline cannot be used/is broken.	2024-10-01 11:02:54 +02:00
John Spray	651ae44569	storage controller: drop out of blocking compute notification loop if migration origin becomes unavailable (#9147 ) ## Problem The live migration code waits forever for the compute notification hook, on the basis that until it succeeds, the compute is probably using the old location and we shouldn't detach it. However, if a pageserver stops or restarts in the background, then this original location might no longer be available, so there is no point waiting. Waiting is also actively harmful, because it prevents other reconciliations happening for the tenant shard, such as during an upgrade where a stuck "drain" migration might prevent the later "fill" migration from moving the shard back to its original location. ## Summary of changes - Refactor the notification wait loop into a function - Add checks during the loop, for the origin node's cancellation token and an explicit HTTP request to the origin node to confirm the shard is still attached there. Closes: https://github.com/neondatabase/neon/issues/8901	2024-10-01 07:57:22 +00:00
Heikki Linnakangas	65bda19051	Remove unnecessary dev package from compute image (#9210 ) libcurl4-openssl-dev is needed to build pgxn/, but libcurl4 is enough at runtime.	2024-10-01 01:07:43 +03:00
Conrad Ludgate	94a5ca2817	proxy: auth broker (#8855 ) Opens http2 connection to local-proxy and forwards requests over with all headers and body closes https://github.com/neondatabase/cloud/issues/16039	2024-09-30 20:43:45 +01:00
Arthur Petukhovsky	c07cea80bd	Bump vm-builder v0.29.3 -> v0.35.0 (#9208 ) We haven't updated it for a while. Now I need the update to add quotas support to compute images (https://github.com/neondatabase/cloud/issues/13127). Previous update: https://github.com/neondatabase/neon/pull/7849	2024-09-30 19:18:42 +01:00
Conrad Ludgate	a2e2362ee9	add proxy-protocol header disable option (#9203 ) resolves https://github.com/neondatabase/cloud/issues/18026	2024-09-30 18:11:50 +00:00
Heikki Linnakangas	0a567acdb9	tests: Move comment to more appropriate place There is no 'pg_bin' in NeonEnv.	2024-09-30 17:56:43 +03:00
Heikki Linnakangas	69ea2776e9	tests: Remove creation of extra timelines in some tests neon_cli.create_tenant() creates a new tenant and a timeline on the tenant, with name "main". In most tests, there's no need to create another timeline on the same tenant. There are some more tests that do that, but in the remaining cases, I wasn't 100% sure whether the presence of extra root timelines affects what the tests test, so I left them alone.	2024-09-30 17:56:40 +03:00
Heikki Linnakangas	4dc9cb7cf9	tests: Remove some spurious list_timelines calls These calls seem really out of place. We know what the initial tenant and branch are in these tests, just like in all other tests.	2024-09-30 17:56:37 +03:00
John Spray	7424e7269c	tests: longer timeout in `test_delete_timeline_client_hangup` (#9161 ) ## Problem This test waits for a request to finish, and then expects deletion to complete almost immediately. The request completes, but it's a 202, the timeline is still deleting in the background: we need to be more patient. ## Summary of changes - Adjust iterations from 2 to 10 when waiting for deletion	2024-09-30 15:46:07 +01:00
a-masterov	5dc68e4e6a	test_compatibility: fix the regexes detecting the version (#9205 ) ## Problem The Neon components, built locally and by the GitHub workflow have slightly different version prefixes (git: vs git-env:) This does not allow running tests against local builds correctly. ## Summary of changes The regular expressions were changed to work with both prefixes.	2024-09-30 16:37:14 +02:00
John Spray	7cfd116856	pageserver: refactor immediate_gc into TenantManager (#9183 ) ## Problem Legacy functions that were called as `mgr::` and relied on the static TENANTS, see #5796 ## Summary of changes - Move the last stray function (immediate_gc) into TenantManager Closes: https://github.com/neondatabase/neon/issues/5796	2024-09-30 09:27:28 +01:00
Heikki Linnakangas	d696c41807	Bump default neon extension version to 1.5 (#9188 ) Commit `263dfba6ee` introduced neon extension version 1.5, which included some new functions and views for metrics. It didn't bump the default neon extension number yet, so that we could still safely roll back to the old binary if necessary. This bumps the default version.	2024-09-30 09:20:52 +03:00
Alexander Bayandin	3c72192065	CI(benchmarking): fix setting LD_LIBRARY_PATH (#9191 ) ## Problem `pgbench-pgvector` job from Nightly Benchmarks fails with the error: ``` /__w/_temp/f45bc2eb-4c4c-4f0a-8030-99079303fa65.sh: line 17: LD_LIBRARY_PATH: unbound variable ``` ## Summary of changes - Fix `LD_LIBRARY_PATH: unbound variable` error in benchmarks	2024-09-29 22:27:53 +00:00
Alexander Bayandin	d2d9921761	CI(benchmarking): fix Nightly Benchmarks (#9178 ) ## Problem Nightly Benchmarks have been broken for some time due to various reasons, this PR fixes it ## Summary of changes - Pull `build-tools` image from dockerhub for `benchmarking` workflow - Use `aws-actions/configure-aws-credentials` to upload/download artifacts from S3 - Fix Postgres 16 installation (for pgbench)	2024-09-28 02:44:22 +01:00
Arthur Petukhovsky	ba498a630a	Set disk quotas on bind in compute_ctl (#8936 ) Part of https://github.com/neondatabase/cloud/issues/13127. Resolves #9153 What changed in this PR: 1. Adds `ComputeSpec.disk_quota_bytes: Option<u64>` 2. Adds new arg to compute_ctl: `--set-disk-quota-for-fs <mountpoint>` 3. Implements running `/neonvm/bin/set-disk-quota` with the right value if both cmdline arg AND field in the spec are specified 4. Patches `/etc/sudoers.d` to allow `compute_ctl` to set quota with sudo This PR is very similar to the swap support added earlier, you can take a look at it as prior art: #7434 In theory, it can be implemented outside of compute_ctl when we will have a separate neonvm daemon, but we are not there yet. Current implementation is the simplest possible to unblock computes with larger disks. All code related to usage of `/neonvm/bin/set-disk-quota` is located in `disk_quota.rs`. We need to call this script with the following arguments: `/neonvm/bin/set-disk-quota {size_kb} {mountpoint}`. Quotas are set on the filesystem level, so we need to provide path to the directory that filesystem was mounted to. I tested this change locally with https://github.com/neondatabase/cloud/pull/17270. It should be safe to merge, because this feature is gated by both cmdline arg and field in the spec. If control-plane doesn't set values in both places, compute_ctl won't be affected by this change.	2024-09-27 20:52:22 +01:00
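The double gating described in commit ba498a630a (the quota is only applied when both the command-line argument and the spec field are present) can be sketched as follows. The function name `disk_quota_command` is hypothetical, and the bytes-to-KiB conversion by integer division is an assumption; the commit only states that the script takes `{size_kb} {mountpoint}`.

```rust
/// Hypothetical sketch: build the argv for `/neonvm/bin/set-disk-quota`,
/// or return `None` if either gate is missing.
///
/// `disk_quota_bytes` stands in for `ComputeSpec.disk_quota_bytes`;
/// `mountpoint` stands in for the value of `--set-disk-quota-for-fs`.
pub fn disk_quota_command(
    disk_quota_bytes: Option<u64>,
    mountpoint: Option<&str>,
) -> Option<Vec<String>> {
    // Both the spec field AND the cmdline arg must be set.
    let bytes = disk_quota_bytes?;
    let mountpoint = mountpoint?;

    // Assumption: the script expects KiB, obtained by rounding down.
    let size_kb = bytes / 1024;

    Some(vec![
        "/neonvm/bin/set-disk-quota".to_string(),
        size_kb.to_string(),
        mountpoint.to_string(),
    ])
}
```

Returning `None` when either input is absent mirrors the commit's safety argument: if the control plane does not set both values, nothing is executed.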
Heikki Linnakangas	e989a5e4a2	neon_local: Use clap derive macros to parse the CLI args (#9103 ) This is easier to work with.	2024-09-27 22:08:46 +03:00
Alex Chi Z.	cde1654d7b	fix(pageserver): abort process if fsync fails (#9108 ) close https://github.com/neondatabase/neon/issues/8140 The original issue is rather vague on what we should do. After discussion w/ @problame we decided to narrow down the problems we want to solve in that issue. * read path -- do not panic for now. * write path -- panic only on write errors (i.e., device error, fsync error), but not on no-space for now. The guideline is that if the pageserver behavior could lead to violation of persistent constraints (i.e., return an operation as successful but not actually persisting things), we should panic. Fsync is the place where both of us agree that we should panic, because if fsync fails, the kernel will mark dirty pages as clean, and the next fsync will not necessarily return false. This would make the storage client assume the operation is successful. ## Summary of changes Make fsync panic on fatal errors. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-09-27 19:58:50 +01:00
Heikki Linnakangas	cf6a776fcf	tests: Reduce the # of iterations in safekeeper::test_random_schedules (#9182 ) To make it faster. On my laptop, it takes about 30 s before this commit. In the arm64 debug variant in CI, it takes about 120 s. Reduce it by a factor of 4.	2024-09-27 16:25:35 +00:00
Matthias van de Meent	5c5871111a	WalProposer: Read WAL directly from WAL buffers in PG17 (#9171 ) This reduces the overhead of the WalProposer when it is not being throttled by SK WAL acceptance rate	2024-09-27 17:47:05 +02:00
Yuchen Liang	d56c4e7a38	pageserver: remove AdjacentVectoredReadBuilder and bump minimum io_buffer_alignment to 512 (#9175 ) Part of #8130 ## Problem After deploying https://github.com/neondatabase/infra/pull/1927, we shipped `io_buffer_alignment=512` to all prod regions. The `AdjacentVectoredReadBuilder` code path is no longer taken and we are running pageserver unit tests 6 times in the CI. Removing it would reduce the test duration by 30-60s. ## Summary of changes - Remove `AdjacentVectoredReadBuilder` code. - Bump the minimum `io_buffer_alignment` requirement to at least 512 bytes. - Use default `io_buffer_alignment` for Rust unit tests. Signed-off-by: Yuchen Liang <yuchen@neon.tech>	2024-09-27 16:41:42 +01:00
Conrad Ludgate	43b2445d0b	proxy: add jwks endpoint to control plane and mock providers (#9165 )	2024-09-27 16:08:43 +01:00
Yuchen Liang	42ef08db47	fix(pageserver): LSN lease edge cases around restarts/migrations (#9055 ) Part of #7497, closes #8817. ## Problem See #8817. ## Summary of changes compute_ctl - Renew lsn lease as soon as `/configure` updates pageserver_connstr, use `state_changed` Condvar for synchronization. pageserver As mentioned in https://github.com/neondatabase/neon/issues/8817#issuecomment-2315768076, we still want some permanent error reported if a lease cannot be granted. By considering attachment mode and the added `lsn_lease_deadline` when processing lease requests, we can also bound the case of bad requests to a very short period after migration/restart. - Refactor https://github.com/neondatabase/neon/pull/9024 and move `lsn_lease_deadline` to `AttachedTenantConf` so timeline can easily access it. - Have separate HTTP `init_lsn_lease` and libpq `renew_lsn_lease` API. - Always do LSN verification for the initial HTTP lease request. - LSN verification for the renewal is still done when tenants are not in `AttachedSingle` and we have passed the `lsn_lease_deadline`, which gives plenty of time for compute to renew the lease. neon_local - add and call `timeline_init_lsn_lease` mgmt_api at static endpoint start. The initial lsn lease http request is sent when we run `cargo neon endpoint start <static endpoint>`. ## Testing - Extend `test_readonly_node_gc` to do pageserver restarts and migration. ## Future Work - The control plane should make the initial lease request through HTTP when creating a static endpoint. This is currently only done in `neon_local`. Signed-off-by: Yuchen Liang <yuchen@neon.tech>	2024-09-27 09:56:52 -04:00
Tristan Partin	fc962c9605	Use long options when calling initdb Verbosity in this case is good when reading the code. Short options are better when operating in an interactive shell. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-09-27 08:22:16 -05:00
Heikki Linnakangas	357fa070a3	Add gdb to build-tools (#9125 ) So that compute_ctl can use it to print backtrace on core dumps See issue #2800.	2024-09-27 15:36:24 +03:00
Heikki Linnakangas	02cdd37b56	Dump backtrace if a core dump is called just "core" (#9125 ) I hope this lets us capture backtraces in CI. At least it makes it work on my laptop, which is valuable even if we need to do more for CI. See issue #2800.	2024-09-27 15:36:24 +03:00
Vlad Lazar	fa354a65ab	libs: improve logging on PG connection errors (#9130 ) ## Problem We get some unexpected errors, but don't know who they're happening for. ## Summary of change Add tenant id and peer address to PG connection error logs. Related https://github.com/neondatabase/cloud/issues/17336	2024-09-27 12:36:43 +01:00
Arseny Sher	40f7930a7d	safekeeper: skip syncfs on start if --no-sync is specified. (#9166 ) https://neondb.slack.com/archives/C059ZC138NR/p1727350911890989?thread_ts=1727350211.370869&cid=C059ZC138NR	2024-09-27 09:59:38 +03:00
Conrad Ludgate	ec07a1ecc9	proxy: make local-proxy config by signal with PID, refine JWKS apis with role caching (#9164 )	2024-09-26 19:01:48 +01:00
Arseny Sher	c4cdfe66ac	Fix flakiness of test_timeline_copy. Timeline might be not initialized when timeline_start_lsn is queried. Spotted by CI.	2024-09-26 19:01:45 +03:00
Alex Chi Z.	42e19e952f	fix(pageserver): categorize client error in basebackup metrics (#9110 ) We separated client error from basebackup error log lines in https://github.com/neondatabase/neon/pull/7523, but we didn't do anything for the metrics. In this patch, we fixed it. ref https://github.com/neondatabase/neon/issues/8970 ## Summary of changes We use the same criteria as in `log_query_error` producing an info line (instead of error) for the metrics. We added a `client_error` category for the basebackup query time metrics. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-09-26 11:38:19 -04:00
John Spray	3d255d601b	pageserver: rename control plane client & chunk validation requests (#8997 ) ## Problem - In https://github.com/neondatabase/neon/pull/8784, the validate controller API is modified to check generations directly in the database. It batches tenants into separate queries to avoid generating a huge statement, but - While updating this, I realized that "control_plane_client" is a kind of confusing name for the client code now that it primarily talks to the storage controller (the case of talking to the control plane will go away in a few months). ## Summary of changes - Big rename to "ControllerUpcallClient" -- this reflects the storage controller's api naming, where the paths used by the pageserver are in `/upcall/` - When sending validate requests, break them up into chunks so that we avoid possible edge cases of generating any HTTP requests that require database I/O across many thousands of tenants. This PR mixes a functional change with a refactor, but the commits are cleanly separated -- only the last commit is a functional change. --------- Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-09-26 16:06:34 +01:00
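The chunking of validate requests described in commit 3d255d601b can be sketched with `slice::chunks`. The function name `chunk_validate_requests` and the `max_per_request` parameter are hypothetical; the commit does not state the actual chunk size or types.

```rust
/// Hypothetical sketch: split a long list of tenants to validate into
/// bounded batches, so that no single request covers thousands of tenants.
pub fn chunk_validate_requests<T: Clone>(tenants: &[T], max_per_request: usize) -> Vec<Vec<T>> {
    // `chunks` panics on a chunk size of zero, so clamp to at least 1.
    let size = max_per_request.max(1);
    tenants.chunks(size).map(|chunk| chunk.to_vec()).collect()
}
```

Each inner `Vec` would then be sent as its own request; the last batch may be shorter than the others.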
Arthur Petukhovsky	80e974d05b	fix(compute_ctl): race condition in configurator (#9162 ) There was a tricky race condition in compute_ctl, that sometimes makes configurator skip updates. It causes a deadlock because: - control-plane cannot configure compute, because it's in ConfigurationPending state - compute_ctl doesn't do any reconfiguration because `configurator_main_loop` missed notification for it Full sequence that reproduces the issue: 1. `start_compute` finishes its work and changes status `self.set_status(ComputeStatus::Running);` 2. configurator received update about `Running` state and dropped the mutex lock in the iteration 3. `/configure` request was triggered at the same time as step 1, and got the mutex lock 4. same `/configure` request set the spec and updated the state to `ConfigurationPending`, also sent a notification 5. next iteration in configurator got the mutex lock, but missed the notification There are more details in this slack thread: https://neondb.slack.com/archives/C03438W3FLZ/p1727281028478689?thread_ts=1727261220.483799&cid=C03438W3FLZ --------- Co-authored-by: Alexey Kondratov <kondratov.aleksey@gmail.com>	2024-09-26 15:42:17 +01:00
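The missed-notification sequence in commit 80e974d05b is a classic condition-variable hazard: a notification sent while nobody is waiting is lost, so the waiter must re-check the shared state under the lock before blocking. The sketch below shows that general pattern with `Condvar::wait_while`; it is not the actual fix from the PR, and `Status`, `Shared`, `request_configuration`, and `wait_for_pending` are hypothetical names.

```rust
use std::sync::{Condvar, Mutex};

/// Hypothetical, reduced stand-in for the compute status.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Status {
    Running,
    ConfigurationPending,
}

/// Hypothetical shared state: the status plus a condvar signalling changes.
pub struct Shared {
    pub state: Mutex<Status>,
    pub changed: Condvar,
}

/// What a `/configure`-style request would do: update the state and notify.
pub fn request_configuration(shared: &Shared) {
    let mut state = shared.state.lock().unwrap();
    *state = Status::ConfigurationPending;
    shared.changed.notify_all();
}

/// What a configurator-style loop would do: block until work is pending.
/// `wait_while` checks the predicate BEFORE sleeping, so a state change
/// that happened before this call is still observed.
pub fn wait_for_pending(shared: &Shared) -> Status {
    let guard = shared.state.lock().unwrap();
    let guard = shared
        .changed
        .wait_while(guard, |s| *s != Status::ConfigurationPending)
        .unwrap();
    *guard
}
```

With this shape, step 5 of the sequence above is harmless: even though the notification arrived before the waiter took the lock, the predicate sees `ConfigurationPending` and returns immediately.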
Alexander Bayandin	7fdf1ab5b6	CI: run compatibility tests on Postgres 17 (#9145 ) ## Problem The latest storage release has generated artifacts for Postgres 17, so we can enable compatibility tests for this version ## Summary of changes - Unskip `test_backward_compatibility` / `test_forward_compatibility` on Postgres 17	2024-09-26 15:17:01 +01:00
Conrad Ludgate	7736b748d3	Merge pull request #9159 from neondatabase/rc/proxy/2024-09-26 Proxy release 2024-09-26	2024-09-26 09:22:33 +01:00
Arpad Müller	7bae78186b	Forbid creation of child timelines of archived timeline (#9122 ) We don't want to allow any new child timelines of archived timelines. If you want any new child timelines, you should first un-archive the timeline. Part of #8088	2024-09-26 02:05:25 +02:00
Heikki Linnakangas	7e560dd00e	chore: Silence clippy warning with nightly (#9157 ) The warning: warning: first doc comment paragraph is too long --> pageserver/src/tenant/checks.rs:7:1 \| 7 \| / /// Checks whether a layer map is valid (i.e., is a valid result of the current compaction algorithm if no... 8 \| \| /// The function checks if we can split the LSN range of a delta layer only at the LSNs of the delta layer... 9 \| \| /// 10 \| \| /// ```plain \| \|_ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#too_long_first_doc_paragraph = note: `#[warn(clippy::too_long_first_doc_paragraph)]` on by default help: add an empty line \| 7 ~ /// Checks whether a layer map is valid (i.e., is a valid result of the current compaction algorithm if nothing goes wrong). 8 + /// \| Fix by applying the suggestion.	2024-09-25 21:29:16 +00:00
Tristan Partin	684e924211	Fix compute_logical_snapshot_files for v14 The function, pg_ls_logicalsnapdir(), was added in version 15. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-09-25 16:25:17 -05:00
Tristan Partin	8ace9ea25f	Format long single DATA line in pgxn/Makefile This should be a little more readable. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-09-25 16:25:17 -05:00
Alex Chi Z.	6a4f49b08b	fix(pageserver): passthrough partition cancel error (#9154 ) close https://github.com/neondatabase/neon/issues/9142 ## Summary of changes passthrough CollectKeyspaceError::Cancelled to CompactionError::ShuttingDown Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-09-25 21:35:33 +01:00
Alexander Bayandin	c6e89445e2	CI(promote-images): fix prod ECR auth (#9146 ) A cherry-pick from the previous release (#9131) ## Problem Login to prod ECR doesn't work anymore: ``` Retrieving registries data through * SDK... * ECR detected with eu-central-1 region Error: The security token included in the request is invalid. ``` ## Summary of changes - Fix login to prod ECR by using `aws-actions/configure-aws-credentials`	2024-09-25 18:22:39 +01:00
Vlad Lazar	04f32b9526	tests: remove patching up of az id column (#8968 ) This was required since the compat tests used a snapshot generated from a version of neon local which didn't contain the availability_zone_id column.	2024-09-25 17:22:32 +01:00
Heikki Linnakangas	6f2333f52b	CI: Leave out unnecessary build files from binary artifact (#9135 ) The pg_install/build directory contains .o files and such intermediate results from the build, which are not needed in the final tarball. Except for src/test/regress/regress.so and a few other .so files in that directory; keep those. This reduces the size of the neon-Linux-X64-release-artifact.tar.zst artifact from about 1.5 GB to 700 MB. (I attempted this a long time ago already, by moving the build/ directory out of pg_install altogether, see PR #2127. But I never got around to finishing that work.) Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2024-09-25 19:07:20 +03:00
Yuchen Liang	d447f49bc3	fix(pageserver): handle lsn lease requests for unnormalized lsns (#9137 ) Fixes https://github.com/neondatabase/neon/issues/9098. ## Problem See https://github.com/neondatabase/neon/issues/9098#issuecomment-2372484969. ### Related A similar problem happened with branch creation, which was discussed [here](https://github.com/neondatabase/neon/pull/2143#issuecomment-1199969052) and fixed by https://github.com/neondatabase/neon/pull/2529. ## Summary of changes - Normalize the lsn on pageserver side upon lsn lease request, and store the normalized LSN. Signed-off-by: Yuchen Liang <yuchen@neon.tech>	2024-09-25 14:57:38 +00:00
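Commit d447f49bc3 does not spell out what "normalize" means. One plausible reading, based on how PostgreSQL lays out WAL pages, is that an LSN pointing exactly at a page boundary is moved past the page header so that it points at the first record byte. The sketch below assumes stock PostgreSQL defaults (8 KiB WAL pages, 16 MiB segments, 24-byte short and 40-byte long page headers); the function and constant names are hypothetical, and the pageserver's real implementation may differ.

```rust
// Assumed PostgreSQL defaults; a real build may be configured differently.
pub const XLOG_BLCKSZ: u64 = 8192;
pub const WAL_SEGMENT_SIZE: u64 = 16 * 1024 * 1024;
pub const SHORT_PAGE_HEADER: u64 = 24;
pub const LONG_PAGE_HEADER: u64 = 40;

/// Hypothetical sketch: if `lsn` sits exactly on a WAL page boundary,
/// skip over the page header; otherwise return it unchanged.
pub fn normalize_lsn(lsn: u64) -> u64 {
    if lsn % XLOG_BLCKSZ != 0 {
        // Already inside a page: nothing to adjust.
        return lsn;
    }
    if lsn % WAL_SEGMENT_SIZE == 0 {
        // First page of a segment carries the long header.
        lsn + LONG_PAGE_HEADER
    } else {
        lsn + SHORT_PAGE_HEADER
    }
}
```

Under this reading, two requests that name the same position in different forms (boundary versus first record byte) end up with the same stored LSN.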
Vlad Lazar	c5972389aa	storcon: include timeline ID in LSN waiting logs (#9141 ) ## Problem Hard to tell which timeline is holding the migration. ## Summary of Changes Add timeline id to log.	2024-09-25 15:54:41 +01:00
Matthias van de Meent	c4f5736d5a	Build images for PG17 using Debian 12 "Bookworm" (#9132 ) This increases the support window of the OS used for PG17 by 2 years compared to the previous usage of Debian 11 "Bullseye".	2024-09-25 17:50:05 +03:00
Alexey Kondratov	518f598e2d	docs(rfc): Independent compute release flow (#8881 ) Related to https://github.com/neondatabase/cloud/issues/11698	2024-09-25 16:24:09 +02:00
John Spray	4b711caf5e	storage controller: make proxying of GETs to pageservers more robust (#9065 ) ## Problem These commits are split off from https://github.com/neondatabase/neon/pull/8971/commits where I was fixing this to make a better scale test pass -- Vlad also independently recognized these issues with cloudbench in https://github.com/neondatabase/neon/issues/9062. 1. The storage controller proxies GET requests to pageservers based on their intent, not the ground truth of where they're really attached. 2. Proxied requests can race with scheduling to tenants, resulting in 404 responses if the request hits the wrong pageserver. Closes: https://github.com/neondatabase/neon/issues/9062 ## Summary of changes 1. If a shard has a running reconciler, then use the database generation_pageserver to decide who to proxy the request to 2. If such a request gets a 404 response and its scheduled node has changed since the request was dispatched.	2024-09-25 13:56:39 +00:00
Vlad Lazar	2cf47b1477	storcon: do az aware scheduling (#9083 ) ## Problem Storage controller didn't previously consider AZ locality between compute and pageservers when scheduling nodes. Control plane has this feature, and, since we are migrating tenants away from it, we need feature parity to avoid perf degradations. ## Summary of changes The change itself is fairly simple: 1. Thread az info into the scheduler 2. Add an extra member to the scheduling scores Step (2) deserves some more discussion. Let's break it down by the shard type being scheduled: Attached Shards We wish for attached shards of a tenant to end up in the preferred AZ of the tenant since that is where the compute is likely to be. The AZ member for `NodeAttachmentSchedulingScore` has been placed below the affinity score (so it's got the second biggest weight for picking the node). The rationale for going below the affinity score is to avoid having all shards of a single tenant placed on the same node in 2 node regions, since that would mean that one tenant can drive the general workload of an entire pageserver. I'm not 100% sure this is the right decision, so open to discussing hoisting the AZ up to first place. Secondary Shards We wish for secondary shards of a tenant to be scheduled in a different AZ from the preferred one for HA purposes. The AZ member for `NodeSecondarySchedulingScore` has been placed first, so nodes in different AZs from the preferred one will always be considered first. On small clusters, this can mean that all the secondaries of a tenant are scheduled to the same pageserver, but secondaries don't use up as many resources as the attached location, so IMO the argument made for attached shards doesn't hold. Related: https://github.com/neondatabase/neon/issues/8848	2024-09-25 14:31:04 +01:00
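The "position of the AZ member decides its weight" idea in commit 2cf47b1477 maps naturally onto Rust's derived `Ord`, which compares struct fields in declaration order. The sketch below is a simplified illustration, not the real `NodeAttachmentSchedulingScore` or `NodeSecondarySchedulingScore`: the struct and field names are hypothetical, and it assumes that a lower score is better.

```rust
/// Hypothetical attached-shard score: affinity first, AZ second.
/// `az_mismatch == false` (node is in the preferred AZ) sorts lower, i.e. better.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct AttachedScore {
    pub affinity: u32,
    pub az_mismatch: bool,
    pub shard_count: u32,
}

/// Hypothetical secondary-shard score: AZ first.
/// `az_match == false` (node is outside the preferred AZ) sorts lower, i.e. better.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct SecondaryScore {
    pub az_match: bool,
    pub affinity: u32,
    pub shard_count: u32,
}

/// Pick the node id with the smallest score (assumption: lower is better).
pub fn pick_best<S: Ord + Copy>(candidates: &[(u64, S)]) -> Option<u64> {
    candidates
        .iter()
        .min_by_key(|(_, score)| *score)
        .map(|(node_id, _)| *node_id)
}
```

Because comparison is lexicographic, moving the AZ field up or down in the struct is all it takes to change its priority, which is the trade-off the commit message discusses for attached versus secondary shards.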
Folke Behrens	7dcfcccf7c	Re-export git-version from utils and remove as direct dep (#9138 )	2024-09-25 14:38:35 +02:00
Vlad Lazar	a26cc29d92	storcon: add tags to scheduler logs (#9127 ) We log something at info level each time we schedule a shard to a non-secondary location. Might as well have context for it.	2024-09-25 10:16:06 +01:00
Alex Chi Z.	5f2f31e879	fix(test): storage scrubber should only log to stdout with info (#9067 ) As @koivunej mentioned in the storage channel, for regress test, we don't need to create a log file for the scrubber, and we should reduce noisy logs. ## Summary of changes * Disable log file creation for storage scrubber * Only log at info level --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-09-24 22:33:03 +00:00
Damian972	938b163b42	chore(docker-compose): fix typo in readme (#9133 ) Typo in the readme inside docker-compose folder ## Summary of changes - Update the readme	2024-09-24 18:05:23 -04:00
Heikki Linnakangas	5cbf5b45ae	Remove TenantState::Loading (#9118 ) The last real use was removed in commit `de90bf4663`. It was still used in a few unit tests, but they can use Attaching too.	2024-09-24 20:58:54 +00:00
Heikki Linnakangas	af5c54ed14	test: Make test_lfc_resize more robust (#9117 ) 1. Increase statement_timeout. It defaults to 120 s, which is not quite enough on slow or busy systems with debug build. On my laptop, the index creation takes about 100 s. On buildfarm, we've seen failures, e.g: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-9084/10997888708/index.html#suites/821f97908a487f1d7d3a2a4dd1571e99/db1834bddfe8c5b9/ 2. Keep twiddling the LFC size through the whole test. Before, we would do it for the first 10 seconds, but that only covers a small part of the pgbench initialization phase. Change the loop so that the pgbench run time determines how long the test runs, and we keep changing the LFC for the whole time. In passing, also fix bogus test description, copy-pasted from a completely unrelated test.	2024-09-24 23:38:16 +03:00
Alexander Bayandin	523cf71721	Fix compiler warnings on macOS (#9128 ) ## Problem Compilation of neon extension on macOS produces a warning ``` pgxn/neon/neon_perf_counters.c:50:1: error: non-void function does not return a value [-Werror,-Wreturn-type] ``` ## Summary of changes - Change the return type of `NeonPerfCountersShmemInit` to void	2024-09-24 18:11:31 +00:00
Arpad Müller	c47f355ec1	Catch Cancelled and don't print a warning for it (#9121 ) In the `imitate_synthetic_size_calculation_worker` function, we might obtain the `Cancelled` error variant instead of hitting the cancellation token based path. Therefore, catch `Cancelled` and handle it analogously to the cancellation case. Fixes #8886.	2024-09-24 17:28:56 +00:00
Yuchen Liang	4f67b0225b	pageserver: handle decompression outside vectored `read_blobs` (#8942 ) Part of #8130. ## Problem Currently, decompression is performed within the `read_blobs` implementation and the decompressed blob will be appended to the end of the `BytesMut` buffer. We will lose this flexibility of extending the buffer when we switch to using our own dio-aligned buffer (WIP in https://github.com/neondatabase/neon/pull/8730). To facilitate the adoption of aligned buffer, we need to refactor the code to perform decompression outside `read_blobs`. ## Summary of changes - `VectoredBlobReader::read_blobs` will return `VectoredBlob` without performing decompression and appending decompressed blob. It becomes the caller's responsibility to decompress the buffer. - Added a new `BufView` type that functions as `Cow<Bytes, &[u8]>`. - Perform decompression within `VectoredBlob::read` so that people don't have to explicitly think about compression when using the reader interface. Signed-off-by: Yuchen Liang <yuchen@neon.tech>	2024-09-24 16:41:38 +00:00
Heikki Linnakangas	2f7cecaf6a	test: Poll pageserver availability more aggressively at test startup Even with the 100 ms interval, on my laptop the pageserver always becomes available on second attempt, so this saves about 900 ms at every test startup.	2024-09-24 17:16:43 +03:00
Heikki Linnakangas	589594c2e1	test: Skip fsync when initdb'ing the storage controller db After initdb, we configure it with "fsync=off" anyway.	2024-09-24 17:16:43 +03:00
Heikki Linnakangas	70fe007519	test: Make test_hot_standby_feedback more forgiving of slow initialization (#9113 ) Don't start waiting for the index to appear in the secondary until it has been created in the primary. Before, if the "pgbench -i" step took more than 60 s, we would give up. There was a flaky test failure along those lines at: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-9105/10997477941/index.html#suites/950eff205b552e248417890b8b8f189e/73cf4b5648fa6f74/ Hopefully, this avoids such failures in the future.	2024-09-24 16:41:59 +03:00
a-masterov	b224a5a377	Move the patch to compute (#9120 ) ## Problem All the other patches were moved to the compute directory, and only one was left in the patches subdirectory in the root directory. ## Summary of changes The patch was moved to the compute directory like the others	2024-09-24 15:13:18 +02:00
Christian Schwarz	a65d437930	chore(#9077 ): cleanups & code dedup (#9082 ) Punted from https://github.com/neondatabase/neon/pull/9077	2024-09-24 13:05:07 +00:00
Matthias van de Meent	fc67f8dc60	Update PostgreSQL 17 from 17rc1 to 17.0 (#9119 ) The PostgreSQL 17 vendor module is now based on postgres/postgres @ d7ec59a63d745ba74fba0e280bbf85dc6d1caa3e, presumably the final code change before the V17 tag.	2024-09-24 14:15:52 +02:00
Folke Behrens	2b65a2b53e	proxy: check if IP is allowed during webauth flow (#9101 ) neondatabase/cloud#12018	2024-09-24 11:52:25 +02:00
Vlad Lazar	9490360df4	storcon: improve initial shard scheduling (#9081 ) ## Problem Scheduling on tenant creation uses different heuristics compared to the scheduling done during background optimizations. This results in scenarios where shards are created and then immediately migrated by the optimizer. ## Summary of changes 1. Make scheduler aware of the type of the shard it is scheduling (attached vs secondary). We wish to have different heuristics. 2. For attached shards, include the attached shard count from the context in the node score calculation. This brings initial shard scheduling in line with what the optimization passes do. 3. Add a test for (2). This looks like a bigger change than required, but the refactoring serves as the basis for az-aware shard scheduling where we also need to make the distinction between attached and secondary shards. Closes https://github.com/neondatabase/neon/issues/8969	2024-09-24 09:03:41 +00:00
a-masterov	91d947654e	Add regression tests for a cloud-based Neon instance (#8681 ) ## Problem We need to be able to run the regression tests against a cloud-based Neon staging instance to prepare the migration to the arm architecture. ## Summary of changes Some tests were modified to work on the cloud instance (i.e. added passwords, server-side copy changed to client-side, etc) --------- Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2024-09-24 09:44:45 +02:00
Yuchen Liang	37aa6fd953	scrubber: retry when missing index key in the listing (#8873 ) Part of #8128, fixes #8872. ## Problem See #8872. ## Summary of changes - Retry `list_timeline_blobs` another time if - there are layer file keys listed but not index. - failed to download index. - Instrument code with `analyze-tenant` and `analyze-timeline` span. - Remove `initdb_archive` check, it could have been deleted. - Return with exit code 1 on fatal error if `--exit-code` parameter is set. Signed-off-by: Yuchen Liang <yuchen@neon.tech>	2024-09-23 21:58:12 +00:00
Heikki Linnakangas	3ad567290c	Move metric exporter and pgbouncer config files Instead of adding them to the VM image late in the build process, when putting together the final VM image, include them in the earlier compute image already. That makes it more convenient to edit the files, and to test them.	2024-09-24 00:35:52 +03:00
Heikki Linnakangas	3a110e45ed	Move files related to building compute image into compute/ dir Seems nice to keep all these together. This also provides a nice place for a README file to describe the compute image build process. For now, it briefly describes the contents of the directory, but can be expanded.	2024-09-24 00:35:52 +03:00
Heikki Linnakangas	e7e6319e20	Fix compiler warnings with nightly rustc about elided lifetimes having names (#9105 ) The warnings: warning: elided lifetime has a name --> pageserver/src/metrics.rs:1386:29 \| 1382 \| pub(crate) fn start_timer<'c: 'a, 'a>( \| -- lifetime `'a` declared here ... 1386 \| ) -> Option<impl Drop + '_> { \| ^^ this elided lifetime gets resolved as `'a` \| = note: `#[warn(elided_named_lifetimes)]` on by default warning: elided lifetime has a name --> pageserver/src/metrics.rs:1537:46 \| 1534 \| pub(crate) fn start_recording<'c: 'a, 'a>( \| -- lifetime `'a` declared here ... 1537 \| ) -> BasebackupQueryTimeOngoingRecording<'_, '_> { \| ^^ this elided lifetime gets resolved as `'a` warning: elided lifetime has a name --> pageserver/src/metrics.rs:1537:50 \| 1534 \| pub(crate) fn start_recording<'c: 'a, 'a>( \| -- lifetime `'a` declared here ... 1537 \| ) -> BasebackupQueryTimeOngoingRecording<'_, '_> { \| ^^ this elided lifetime gets resolved as `'a` warning: elided lifetime has a name --> pageserver/src/tenant.rs:3630:25 \| 3622 \| async fn prepare_new_timeline<'a>( \| -- lifetime `'a` declared here ... 3630 \| ) -> anyhow::Result<UninitializedTimeline> { \| ^^^^^^^^^^^^^^^^^^^^^ this elided lifetime gets resolved as `'a`	2024-09-23 23:31:32 +02:00
Matthias van de Meent	d865881d59	NOAI (#9084 ) We can't FlushOneBuffer when we're in redo-only mode on PageServer, so make execution of that function conditional on us not running in pageserver walredo mode.	2024-09-23 21:16:42 +00:00
Konstantin Knizhnik	1c5d6e59a0	Maintain number of used pages for LFC (#9088 ) ## Problem An LFC cache entry is a chunk (right now the chunk size is 1 MB). LFC statistics show the number of chunks, but not the number of used pages. The autoscaling team wants to know how sparse the LFC is: https://neondb.slack.com/archives/C04DGM6SMTM/p1726782793595969 It is possible to obtain this from the view `select count() from local_cache`, but that is an expensive operation, enumerating all entries in the LFC under a lock. ## Summary of changes This PR added "file_cache_used_pages" to `neon_lfc_stats` view: ``` select from neon_lfc_stats; lfc_key \| lfc_value -----------------------+----------- file_cache_misses \| 3139029 file_cache_hits \| 4098394 file_cache_used \| 1024 file_cache_writes \| 3173728 file_cache_size \| 1024 file_cache_used_pages \| 25689 (6 rows) ``` Please note that this PR doesn't change the neon extension API, so there is no need to create a new version of the neon extension. ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-09-23 22:05:32 +03:00
Heikki Linnakangas	263dfba6ee	Add views for metrics about pageserver requests (#9008 ) The metrics include a histogram of how long we need to wait for a GetPage request, number of reconnects, and number of requests among other things. The metrics are not yet exported anywhere, but you can query them manually. Note: This does not bump the default version of the 'neon' extension. We will do that later, as a separate PR. The reason is that this allows us to roll back the compute image smoothly, if necessary. Once the image that includes the new extension .so file with the new functions has been rolled out, and we're confident that we don't need to roll back the image anymore, we can change default extension version and actually start using the new functions and views. This is what the view looks like: ``` postgres=# select * from neon_perf_counters ; metric \| bucket_le \| value ---------------------------------------+-----------+---------- getpage_wait_seconds_count \| \| 300 getpage_wait_seconds_sum \| \| 0.048506 getpage_wait_seconds_bucket \| 2e-05 \| 0 getpage_wait_seconds_bucket \| 3e-05 \| 0 getpage_wait_seconds_bucket \| 6e-05 \| 71 getpage_wait_seconds_bucket \| 0.0001 \| 124 getpage_wait_seconds_bucket \| 0.0002 \| 248 getpage_wait_seconds_bucket \| 0.0003 \| 279 getpage_wait_seconds_bucket \| 0.0006 \| 297 getpage_wait_seconds_bucket \| 0.001 \| 298 getpage_wait_seconds_bucket \| 0.002 \| 298 getpage_wait_seconds_bucket \| 0.003 \| 298 getpage_wait_seconds_bucket \| 0.006 \| 300 getpage_wait_seconds_bucket \| 0.01 \| 300 getpage_wait_seconds_bucket \| 0.02 \| 300 getpage_wait_seconds_bucket \| 0.03 \| 300 getpage_wait_seconds_bucket \| 0.06 \| 300 getpage_wait_seconds_bucket \| 0.1 \| 300 getpage_wait_seconds_bucket \| 0.2 \| 300 getpage_wait_seconds_bucket \| 0.3 \| 300 getpage_wait_seconds_bucket \| 0.6 \| 300 getpage_wait_seconds_bucket \| 1 \| 300 getpage_wait_seconds_bucket \| 2 \| 300 getpage_wait_seconds_bucket \| 3 \| 300 getpage_wait_seconds_bucket 
\| 6 \| 300 getpage_wait_seconds_bucket \| 10 \| 300 getpage_wait_seconds_bucket \| 20 \| 300 getpage_wait_seconds_bucket \| 30 \| 300 getpage_wait_seconds_bucket \| 60 \| 300 getpage_wait_seconds_bucket \| 100 \| 300 getpage_wait_seconds_bucket \| Infinity \| 300 getpage_prefetch_requests_total \| \| 69 getpage_sync_requests_total \| \| 231 getpage_prefetch_misses_total \| \| 0 getpage_prefetch_discards_total \| \| 0 pageserver_requests_sent_total \| \| 323 pageserver_requests_disconnects_total \| \| 0 pageserver_send_flushes_total \| \| 323 file_cache_hits_total \| \| 0 (39 rows) ```	2024-09-23 21:28:50 +03:00
Heikki Linnakangas	df3996265f	test: Downgrade info message on removing empty directories (#9093 ) It was pretty noisy. It changed from debug to info level in commit `78938d1b59`, but I believe that was not on purpose.	2024-09-23 20:10:22 +02:00
Alex Chi Z.	29699529df	feat(pageserver): filter keys with gc-compaction (#9004 ) Part of https://github.com/neondatabase/neon/issues/8002 Close https://github.com/neondatabase/neon/issues/8920 Legacy compaction (as well as gc-compaction) rely on the GC process to remove unused layer files, but this relies on many factors (i.e., key partition) to ensure data in a dropped table can be eventually removed. In gc-compaction, we consider the keyspace information when doing the compaction process. If a key is not in the keyspace, we will skip that key and not include it in the final output. However, this is not easy to implement because gc-compaction considers branch points (i.e., retain_lsns) and the retained keyspaces could change across different LSNs. Therefore, for now, we only remove aux v1 keys in the compaction process. ## Summary of changes * Add `FilterIterator` to filter out keys. * Integrate `FilterIterator` with gc-compaction. * Add `collect_gc_compaction_keyspace` for a spec of keyspaces that can be retained during the gc-compaction process. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-09-23 16:30:44 +00:00
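The key filtering described in the gc-compaction commit above can be illustrated with a small iterator adapter. This is only a sketch under stated assumptions: `KeyspaceFilter`, the `u64` keys, and the sorted, non-overlapping `Range<u64>` list are hypothetical stand-ins, not the pageserver's actual `FilterIterator` or key types.

```rust
use std::ops::Range;

/// Hypothetical adapter: yields only the entries whose key falls inside one
/// of the retained ranges. Both the input keys and the ranges are assumed to
/// be sorted in ascending order, and the ranges must not overlap.
pub struct KeyspaceFilter<I> {
    inner: I,
    retained: Vec<Range<u64>>,
    cursor: usize,
}

impl<I> KeyspaceFilter<I> {
    pub fn new(inner: I, retained: Vec<Range<u64>>) -> Self {
        Self { inner, retained, cursor: 0 }
    }
}

impl<I, V> Iterator for KeyspaceFilter<I>
where
    I: Iterator<Item = (u64, V)>,
{
    type Item = (u64, V);

    fn next(&mut self) -> Option<Self::Item> {
        for (key, value) in self.inner.by_ref() {
            // Advance past ranges that end at or before the current key.
            while self.cursor < self.retained.len() && self.retained[self.cursor].end <= key {
                self.cursor += 1;
            }
            // Input is sorted, so once the ranges are exhausted nothing later can match.
            if self.cursor == self.retained.len() {
                return None;
            }
            if self.retained[self.cursor].contains(&key) {
                return Some((key, value));
            }
            // Otherwise the key lies in a gap between ranges: skip it.
        }
        None
    }
}
```

Because both sides are sorted, the adapter makes a single forward pass over the ranges instead of searching them for every key.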
Nikita Kalyanov	f446e08fb8	change HTTP method to comply with spec (#9100 ) There is a discrepancy with the spec, which has PUT	2024-09-23 15:53:06 +02:00
Christian Schwarz	4d5add9ca0	compact_level0_phase1: remove final traces of value access mode config (#8935 ) refs https://github.com/neondatabase/neon/issues/8184 stacked atop https://github.com/neondatabase/neon/pull/8934 This PR changes from ignoring the config field to rejecting configs that contain it. PR https://github.com/neondatabase/infra/pull/1903 removes the field usage from `pageserver.toml`. It rolls into prod sooner or in the same release as this PR.	2024-09-23 15:05:22 +02:00
Christian Schwarz	59b4c2eaf9	walredo: add a ping method (#8952 ) Not used in production, but in benchmarks, to demonstrate minimal RTT. (It would be nice to not have to copy the 8KiB of zeroes, but, that would require larger protocol changes). Found this useful in investigation https://github.com/neondatabase/neon/pull/8952.	2024-09-23 10:19:37 +00:00
Vlad Lazar	5432155b0d	storcon: update compute hook state on detach (#9045 ) ## Problem Previously, the storage controller may send compute notifications containing stale pageservers (i.e. pageserver serving the shard was detached). This happened because detaches did not update the compute hook state. ## Summary of Changes Update compute hook state on shard detach. Fixes #8928	2024-09-23 10:05:02 +01:00
Heikki Linnakangas	e16e82749f	Remove unused crates from workspace Cargo.toml These were not referenced in any of the other Cargo.toml files in the workspace. They were not being built because of that, so there was little harm in having them listed, but let's be tidy.	2024-09-23 00:37:41 +03:00
Heikki Linnakangas	9f653893b9	Update a few dependencies, removing some indirect dependencies cargo update ciborium iana-time-zone lazy_static schannel uuid cargo update hyper@0.14 cargo update --precise 2.9.7 ureq It might be worthwhile to just update all our dependencies at some point, but this is aimed at pruning the dependency tree, to make the build a little faster. That's also why I didn't update ureq to the latest version: that would've added a dependency to yet another version of rustls.	2024-09-23 00:37:41 +03:00
Heikki Linnakangas	913af44219	Update "memoffset" crate To eliminate one version of it from our dependency tree.	2024-09-23 00:37:41 +03:00
Heikki Linnakangas	ecd615ab6d	Update "hostname" crate We were already building v0.4.0 as an indirect dependency, so this avoids having to build two different versions of it.	2024-09-23 00:37:41 +03:00
Heikki Linnakangas	c9b2ec9ff1	Check submodule forward progress (#8949 ) We frequently mess up our submodule references. This adds one safeguard: it checks that the submodule references are only updated "forwards", not to some older commit, or a commit that's not a descendant of the previous one. As a next step, I'm thinking that we should automate things so that when you merge a PR to the 'neon' repository that updates the submodule references, the REL_*_STABLE_neon branches are automatically updated to match the submodule references. That way, you never need to manually merge PRs in the postgres repository, it's all triggered from commits in the 'neon' repository. But that's not included here.	2024-09-22 21:46:53 +03:00
Arpad Müller	a3800dcb0c	Move load_timeline_metadata into separate function (#9080 ) Moves the per-timeline code to load timeline metadata into a new dedicated function called `load_timeline_metadata`. The old `load_timeline_metadata` becomes `load_timelines_metadata`. Split out of #8907 Part of #8088	2024-09-21 12:36:41 +00:00
Heikki Linnakangas	9a32aa828d	Fix init of WAL page header at startup (#8914 ) If the primary is started at an LSN within the first of a 16 MB WAL segment, the "long XLOG page header" at the beginning of the segment was not initialized correctly. That has gone unnoticed, because under normal circumstances, nothing looks at the page header. The WAL that is streamed to the safekeepers starts at the new record's LSN, not at the beginning of the page, so that bogus page header didn't propagate elsewhere, and a primary server doesn't normally read the WAL it's written. Which is good because the contents of the page would be bogus anyway, as it wouldn't contain any of the records before the LSN where the new record is written. Except that in the following cases a primary does read its own WAL: 1. When there are two-phase transactions in prepared state at checkpoint. The checkpointer reads the two-phase state from the XLOG_XACT_PREPARE record, and writes it to a file in pg_twophase/. 2. Logical decoding reads the WAL starting from the replication slot's restart LSN. This PR fixes the problem with two-phase transactions. For that, it's sufficient to initialize the page header correctly. The checkpointer only needs to read XLOG_XACT_PREPARE records that were generated after the server startup, so it's still OK that older WAL is missing / bogus. I have not investigated if we have a problem with logical decoding, however. Let's deal with that separately. Special thanks to @Lzjing-1997, who independently found the same bug and opened a PR to fix it, although I did not use that PR.	2024-09-21 04:00:38 +03:00
Anastasia Lubennikova	f03f7b3868	Bump vendor/postgres to include extension path fix (#9076 ) This is a prerequisite for https://github.com/neondatabase/neon/pull/8681	2024-09-20 20:24:40 +03:00
Christian Schwarz	ec5dce04eb	pageserver: throttling: per-tenant metrics + more metrics to help understand throttle queue depth (#9077 )	2024-09-20 16:48:26 +00:00
John Spray	6014f15157	pageserver: suppress noisy "layer became visible" logs (#9064 ) ## Problem When layer visibility was added, an info log was included for the situation where actual access to a layer disagrees with the visibility calculation. This situation is safe, but I was interested in seeing when it happens. The log is pretty high volume, so this PR refines it to fire less often. ## Summary of changes - For cases where accessing non-visible layers is normal, don't log at all. - Extend a unit test to increase confidence that the updates to visibility on access are working as expected - During compaction, only call the visibility calculation routine if some image layers were created: previously, frequent calls resulted in the visibility of layers getting reset every time we passed through create_image_layers.	2024-09-20 16:07:09 +00:00
Conrad Ludgate	e675a21346	utils: leaky bucket should only report throttled if the notify queue is blocked on sleep (#9072 ) ## Problem Seems that PS might be too eager in reporting throttled tasks ## Summary of changes Introduce a sleep counter. If the sleep counter increases, then the acquire task was throttled.	2024-09-20 16:09:39 +01:00
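The sleep-counter idea from the leaky bucket commit above can be sketched roughly as follows. This assumes a shared counter that the rate limiter bumps each time it actually has to sleep; `SleepCounter` and `acquire_with` are hypothetical names for illustration, not the real `utils` API.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical counter, incremented every time the limiter really sleeps.
pub struct SleepCounter(AtomicU64);

impl SleepCounter {
    pub fn new() -> Self {
        SleepCounter(AtomicU64::new(0))
    }

    /// Called by the (hypothetical) limiter whenever it blocks on a sleep.
    pub fn record_sleep(&self) {
        self.0.fetch_add(1, Ordering::Relaxed);
    }

    pub fn snapshot(&self) -> u64 {
        self.0.load(Ordering::Relaxed)
    }
}

/// Runs `wait` (a stand-in for waiting on the bucket) and reports whether the
/// acquire was throttled, i.e. whether the sleep counter increased meanwhile.
pub fn acquire_with<F: FnOnce()>(counter: &SleepCounter, wait: F) -> bool {
    let before = counter.snapshot();
    wait();
    counter.snapshot() > before
}
```

An acquire that merely queues behind others without any sleep taking place is not reported as throttled, which matches the intent stated in the commit message.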
Alex Chi Z.	6b93230270	fix(pageserver): receive body error now 500 (#9052 ) close https://github.com/neondatabase/neon/issues/8903 In https://github.com/neondatabase/neon/issues/8903 we observed JSON decoding error to have the following error message in the log: ``` Error processing HTTP request: Resource temporarily unavailable: 3956 (pageserver-6.ap-southeast-1.aws.neon.tech) error receiving body: error decoding response body ``` This is hard to understand. In this patch, we make the error message more reasonable. ## Summary of changes * receive body error is now an internal server error, passthrough the `reqwest::Error` (only decoding error) as `anyhow::Error`. * instead of formatting the error using `to_string`, we use the alternative `anyhow::Error` formatting, so that it prints out the cause of the error (i.e., what exactly cannot serde decode). I would expect seeing something like `error receiving body: error decoding response body: XXX field not found` after this patch, though I didn't set up a testing environment to observe the exact behavior. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-09-20 10:37:28 -04:00
Heikki Linnakangas	797aa4ffaa	Skip running clippy in --release mode. (#9073 ) It's pretty expensive to run, and there is very little difference between debug and release builds that could lead to different clippy warnings. This is extracted from PR #8912. That PR wandered off into various improvements we could make, but we seem to have consensus on this part at least.	2024-09-20 17:22:58 +03:00
Christian Schwarz	c45b56e0bb	pageserver: add counters for started smgr/getpage requests (#9069 ) After this PR ``` curl localhost:9898/metrics \| grep smgr_ \| grep start ``` ``` pageserver_smgr_query_started_count{shard_id="0000",smgr_query_type="get_page_at_lsn",tenant_id="...",timeline_id="..."} 0 pageserver_smgr_query_started_global_count{smgr_query_type="get_db_size"} 0 pageserver_smgr_query_started_global_count{smgr_query_type="get_page_at_lsn"} 0 pageserver_smgr_query_started_global_count{smgr_query_type="get_rel_exists"} 0 pageserver_smgr_query_started_global_count{smgr_query_type="get_rel_size"} 0 pageserver_smgr_query_started_global_count{smgr_query_type="get_slru_segment"} 0 ``` We instantiate the per-tenant counter only for `get_page_at_lsn`.	2024-09-20 14:55:50 +01:00
Alexander Bayandin	3104f0f250	Safekeeper: fix OpenAPI spec (#9066 ) ## Problem Safekeeper's OpenAPI spec is incorrect: ``` Semantic error at paths./v1/tenant/{tenant_id}/timeline/{timeline_id}.get.responses.404.content.application/json.schema.$ref $refs must reference a valid location in the document Jump to line 126 ``` Checked on https://editor.swagger.io ## Summary of changes - Add `NotFoundError` - Add `description` and `license` fields to make Cloud OpenAPI spec linter happy	2024-09-20 12:00:05 +01:00
Arseny Sher	f2c08195f0	Bump vendor/postgres. Includes PRs: - ERROR out instead of segfaulting when walsender slots are full. - logical worker: respond to publisher even under dense stream.	2024-09-20 12:38:42 +03:00
Alex Chi Z.	d0cbfda15c	refactor(pageserver): check layer map valid in one place (#9051 ) We have 3 places where we implement layer map checks. ## Summary of changes Now we have a single check function being called in all places. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-09-19 20:29:28 +00:00
Yuchen Liang	1708743e78	pageserver: wait for lsn lease duration after transition into AttachedSingle (#9024 ) Part of #7497, closes https://github.com/neondatabase/neon/issues/8890. ## Problem Since leases are in-memory objects, we need to take special care of them after pageserver restarts and while doing a live migration. The approach we took for pageserver restart is to wait for at least lease duration before doing first GC. We want to do the same for live migration. Since we do not do any GC when a tenant is in `AttachedStale` or `AttachedMulti` mode, only the transition from `AttachedMulti` to `AttachedSingle` requires this treatment. ## Summary of changes - Added `lsn_lease_deadline` field in `GcBlock::reasons`: the tenant is temporarily blocked from GC until we reach the deadline. This information does not persist to S3. - In `GCBlock::start`, skip the GC iteration if we are blocked by the lsn lease deadline. - In `TenantManager::upsert_location`, set the lsn_lease_deadline to `Instant::now() + lsn_lease_length` so the granted leases have a chance to be renewed before we run GC for the first time after transitioning from AttachedMulti to AttachedSingle. Signed-off-by: Yuchen Liang <yuchen@neon.tech> Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2024-09-19 17:27:10 +01:00
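A minimal sketch of the deadline-based GC block described in the commit above. The struct and method names here are hypothetical simplifications; only the `lsn_lease_deadline` field name and the `Instant::now() + lsn_lease_length` rule come from the commit message.

```rust
use std::time::{Duration, Instant};

/// Hypothetical, simplified stand-in for the in-memory GC block reasons.
#[derive(Default)]
pub struct GcBlockReasons {
    /// GC is skipped until this instant. Kept in memory only, never persisted.
    lsn_lease_deadline: Option<Instant>,
}

impl GcBlockReasons {
    /// Called on the AttachedMulti -> AttachedSingle transition, so that
    /// granted leases have one full lease length in which to be renewed.
    pub fn block_until_leases_renewed(&mut self, now: Instant, lease_length: Duration) {
        self.lsn_lease_deadline = Some(now + lease_length);
    }

    /// Returns true if a GC iteration may run at `now`.
    pub fn gc_allowed(&self, now: Instant) -> bool {
        match self.lsn_lease_deadline {
            Some(deadline) => now >= deadline,
            None => true,
        }
    }
}
```

Passing `now` explicitly instead of calling `Instant::now()` inside the methods is a choice made for this sketch, so that the behaviour can be checked without sleeping.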
Conrad Ludgate	0a1ca7670c	proxy: remove auth info from http conn info & fixup jwt api trait (#9047 ) misc changes split out from #8855 - allow cloning the request context in a read-only fashion for background tasks - propagate endpoint and request context through the jwk cache - only allow password based auth for md5 during testing - remove auth info from conn info	2024-09-19 15:09:30 +00:00
Alex Chi Z.	ff9f065c43	impr(pageserver): log image layer creation (#9050 ) https://github.com/neondatabase/neon/pull/9028 changed the image layer creation log into trace level. However, I personally find logging image layer creation useful when reading the logs -- it makes it clear that the image layer creation is happening and gives a clear idea of the progress. Therefore, I propose to continue logging them for create_image_layers set of functions. ## Summary of changes * Add info logging for all image layers created in legacy compaction. * Add info logging for all layers creation in testing functions. Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-09-19 10:43:12 -04:00
Vlad Lazar	21eeafaaa5	pageserver: simple fix for vectored read image layer skip (#9026 ) ## Problem Different keyspaces may require different floor LSNs in vectored delta layer visits. This patch adds support for such cases. ## Summary of changes Different keyspaces wishing to read the same layer might require different stop lsns (or lsn floor). The start LSN of the read (or the lsn ceil) will always be the same. With this observation, we fix skipping of image layers by indexing the fringe by layer id plus lsn floor. This is very simple, but means that we can visit delta layers twice in certain cases. Still, I think it's very unlikely for any extra merging to have taken place in this case, so perhaps it makes sense to go with the simpler patch. Fixes https://github.com/neondatabase/neon/issues/9012 Alternative to https://github.com/neondatabase/neon/pull/9025	2024-09-19 14:51:00 +01:00
Arseny Sher	32a0e759bd	safekeeper: add wal_last_modified to debug_dump. Adds an option to debug_dump to include the highest modification time among all WAL segments. In passing, replace some str with OsStr to have fewer unwraps.	2024-09-19 16:17:25 +03:00
Heikki Linnakangas	7c489092b7	Remove unused duplicate DEFAULT_INGEST_BATCH_SIZE constant This constant in 'tenant_conf_defaults' was unused, but there's another constant with the same name in the global 'defaults'. I wish the setting was configurable per-tenant, but it isn't, so let's remove the confusing duplicate.	2024-09-19 15:41:35 +03:00
Heikki Linnakangas	06d55a3b12	Clean up concurrent logical size calc semaphore initialization The DEFAULT_CONCURRENT_TENANT_SIZE_LOGICAL_SIZE_QUERIES constant was unused, because we had just hardcoded it to 1 where the constant should've been used. Remove the ConfigurableSemaphore::Default implementation, since it was unused.	2024-09-19 15:41:35 +03:00
Heikki Linnakangas	5c68e6a172	Remove unused constant The code that used it was removed in commit `b9d2c7bdd5`	2024-09-19 15:41:35 +03:00
Heikki Linnakangas	2753abc0d8	Remove leftover enums for configuring vectored get implementation The settings were removed in commit b9d2c7b.	2024-09-19 15:41:35 +03:00
Conrad Ludgate	9c23333cb3	Merge pull request #9056 from neondatabase/rc/proxy/2024-09-19 Proxy release 2024-09-19	2024-09-19 10:41:17 +01:00
Heikki Linnakangas	a523548ed1	Remove unused cleanup_remaining_timeline_fs_traces function There's some more code that still checks for uninit and delete markers, see callers of is_delete_mark and is_uninit_mark, and github issue #5718. But these functions were outright dead.	2024-09-19 11:57:10 +03:00
Heikki Linnakangas	2d4e5af18b	Remove unused code for parsing a postgresql.conf file	2024-09-19 11:57:10 +03:00
Heikki Linnakangas	5da2340e74	Remove misc dead code in control_plane/	2024-09-19 11:57:10 +03:00
Heikki Linnakangas	7b34c2d7af	Remove misc dead code in libs/	2024-09-19 11:57:10 +03:00
Heikki Linnakangas	15ae1fc3df	Remove a few postgres constants that were not used Dead code is generally useless, but with Postgres constants in particular, I'm also worried that if they're not used anywhere, we might fail to update them at a Postgres version update, and get very confused later when they have wrong values.	2024-09-19 11:57:10 +03:00
Heikki Linnakangas	728b79b9dd	Remove some unnecessary derives	2024-09-19 11:57:10 +03:00
Alex Chi Z.	9d1c6f23d3	fix(storage-scrubber): log version after initialize the logger (#9049 ) When I checked the log in Grafana I couldn't find the scrubber version. Then I realized that it should be logged after the logger gets initialized. ## Summary of changes Log after initializing the logger for the scrubber. Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-09-18 14:13:57 -04:00
Christian Schwarz	035a49a6b2	`neon_local start`: parallel startup to break cyclic dependency (#8950 ) (Found this useful during investigation https://github.com/neondatabase/cloud/issues/16886.) Problem ------- Before this PR, `neon_local` sequentially does the following: 1. launch storcon process 2. wait for storcon to signal readiness [here](`75310fe441/control_plane/src/storage_controller.rs (L804-L808)`) 3. start pageserver 4. wait for pageserver to become ready [here](`c43e664ff5/control_plane/src/pageserver.rs (L343-L346)`) 5. etc The problem is that storcon's readiness waits for the [`startup_reconcile`](`cbcd4058ed/storage_controller/src/service.rs (L520-L523)`) to complete. But pageservers aren't started at this point. So, worst case we wait for `STARTUP_RECONCILE_TIMEOUT/2`, i.e., 15s. This is more than the 10s default timeout allowed by neon_local. So, the result is that `neon_local start` fails to start storcon and stops everything. Solution -------- In this PR I chose the radical solution to start everything in parallel. It junks up the output because we do stuff like `print!(".")` to indicate progress. We should just abandon that. And switch to `utils::logging` + `tracing` with separate spans for each component. I can do that in this PR or we leave it as a follow-up. Alternatives Considered ----------------------- The Pageserver's `/v1/status` or in fact any endpoint of the mgmt API will not `accept()` on the mgmt API socket until after the `re-attach` call to storcon returned success. So, it's insufficient to change the startup order to start Pageservers first. We cannot easily change Pageserver startup order because `init_tenant_mgr` must complete before we start serving the mgmt API. Otherwise tenant detach calls et al can race with `init_tenant_mgr`. We'd have to add a "loading" state to tenant mgr and make all API endpoints except `/v1/status` wait for _that_ to complete. 
Related ------- - https://github.com/neondatabase/neon/pull/6475	2024-09-18 18:17:55 +02:00
Folke Behrens	794bd4b866	proxy: mock cplane usable without allowed-ips table (#9046 )	2024-09-18 17:14:53 +02:00
Alexander Bayandin	ac6a1151ae	test_postgres_version: reenable version check for prereleased versions	2024-09-18 14:51:59 +01:00
Tristan Partin	2f37f0384c	Add v17 to revisions.json Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-09-18 14:51:59 +01:00
Alexander Bayandin	e161a2fa42	CI(deploy): fix deploy to staging and prod (#9030 ) ## Problem It turns out the previous approach (with `skip_if` input) doesn't work (from https://github.com/neondatabase/neon/pull/9017). Revert it and use more straightforward if-conditions ## Summary of changes - Revert `efbe8db7f1` - Add if-condition to `promote-compatibility-data` job and relevant comments	2024-09-18 14:26:47 +01:00
Folke Behrens	c5cd8577ff	proxy: make sql-over-http max request/response sizes configurable (#9029 )	2024-09-18 13:58:51 +02:00
Heikki Linnakangas	3454ef7507	Refactor ImageLayerWriter to avoid passing a Timeline to finish() (#9028 ) Commit `ca5390a89d` made a similar change to DeltaLayerWriter. We bumped into this with Stas with our hackathon project, to create a standalone program to create image layers directly from a Postgres data directory. It needs to create image layers without having a Timeline and other pageserver machinery. This downgrades the "created image layer {}" message from INFO to TRACE level. TRACE is used for the corresponding message on delta layer creation too. The path logged in the message is now the temporary path, before the file is renamed to its final name. Again commit `ca5390a89d` made the same change for the message on delta layer creation.	2024-09-18 13:16:51 +03:00
Christian Schwarz	135e7e4306	add `neon_local` subcommand for the broker & use that from regression tests (#8948 ) There's currently no way to just start/stop broker from `neon_local`. This PR * adds a sub-command * uses that sub-command from the test suite instead of the pre-existing Python `subprocess` based approach. Found this useful during investigation https://github.com/neondatabase/cloud/issues/16886.	2024-09-18 09:10:27 +02:00
Christian Schwarz	3cd2a3f931	refactor(walredo): process launch & kill-on-error machinery (#8951 ) Immediate benefit: easier to spot what's going on. Later benefit: use the extracted method in PR - https://github.com/neondatabase/neon/pull/8952 which adds a `ping` command to walredo. Found this useful during investigation https://github.com/neondatabase/cloud/issues/16886.	2024-09-17 19:16:33 +00:00
Alexander Bayandin	d78f5ce6da	CI: don't fetch the whole git history if it's not required (#9021 ) ## Problem We do use `actions/checkout` with `fetch-depth: 0` when it's not required ## Summary of changes - Remove unneeded `fetch-depth: 0` - Add a comment if `fetch-depth: 0` is required	2024-09-17 18:40:05 +01:00
Arpad Müller	a1b71b73fe	Rename some S3 usages to "remote storage" in exposed messages (#8999 ) In exposed messages like log messages we mentioned "S3", which is not entirely accurate as we support Azure blob storage now as well.	2024-09-17 19:15:01 +02:00
Tristan Partin	6138eb50e9	Fix test code related to migrations We added another migration in `5876c441ab`, but didn't bump this value. This had no effect, but best to fix it anyway. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-09-17 15:56:05 +01:00
Heikki Linnakangas	d211f00f05	Remove unnecessary dependencies (#9000 ) Found by "cargo machete"	2024-09-17 17:55:45 +03:00
Alexander Bayandin	cd4276fd65	CI: fix release pipeline (#9017 ) ## Problem We've got 2 non-blocking failures on the release pipeline: - `promote-compatibility-data` job got skipped _presumably_ because one of the dependencies of `deploy` job (`push-to-acr-dev`) got skipped (https://github.com/neondatabase/neon/pull/8940) - `coverage-report` job fails because we don't build debug artifacts in the release branch (https://github.com/neondatabase/neon/pull/8561) ## Summary of changes - Always run `push-to-acr-dev` / `push-to-acr-prod` jobs, but add `skip_if` parameter to the reusable workflow, which can skip the job internally, without skipping externally - Do not run `coverage-report` on release branches	2024-09-17 10:17:48 +01:00
Vlad Lazar	b719d58863	storcon: forward requests from stepped down instance to the current leader (#8954 ) ## Problem It turns out that we can't rely on external orchestration to promptly route traffic to the new leader. This is downtime inducing. Forwarding provides a safe way out. ## Safety We forward when: 1. Request is not one of ["/control/v1/step_down", "/status", "/ready", "/metrics"] 2. Current instance is in [`LeadershipStatus::SteppedDown`] state 3. There is a leader in the database to forward to 4. Leader from step (3) is not the current instance If a storcon instance is persisted in the database, then we know that it is the current leader. There's one exception: the time between handling the step-down request and the new leader updating the database. Let's treat the happy case first. The stepped down node does not produce any side effects, since all request handling happens on the leader. As for the edge case, we are guaranteed to always have a maximum of two running instances. Hence, if we are in the edge case scenario the leader persisted in the database is the stepped down instance that received the request. Condition (4) above covers this scenario. ## Summary of changes * Conversion utilities for reqwest <-> hyper. I'm not happy with these, but I don't see a better way. Open to suggestions. * Add request forwarding logic * Update each request handler. Again, not happy with this. If anyone knows a nice way to wrap the handlers, lmk. Me and Joonas tried :/ * Update each handler to maybe forward * Tweak tests to showcase new behaviour	2024-09-17 09:25:42 +01:00
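The four forwarding conditions listed in the commit above could be expressed as a single predicate, roughly like this. It is a sketch only: the excluded paths and `LeadershipStatus::SteppedDown` come from the commit message, while the `Leader` variant, the `forward_target` function, and the plain string addresses are hypothetical simplifications.

```rust
/// Hypothetical, reduced version of the controller's leadership state.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum LeadershipStatus {
    Leader,
    SteppedDown,
}

/// Paths that are always handled locally (condition 1 in the commit message).
const NEVER_FORWARDED: [&str; 4] = ["/control/v1/step_down", "/status", "/ready", "/metrics"];

/// Returns the leader address to forward to, or `None` if the request
/// should be handled by the current instance.
pub fn forward_target<'a>(
    path: &str,
    status: LeadershipStatus,
    leader_in_db: Option<&'a str>,
    own_address: &str,
) -> Option<&'a str> {
    // (1) Some endpoints must always be answered by this instance.
    if NEVER_FORWARDED.iter().any(|p| *p == path) {
        return None;
    }
    // (2) Only a stepped-down instance forwards.
    if status != LeadershipStatus::SteppedDown {
        return None;
    }
    match leader_in_db {
        // (3) A leader is persisted and (4) it is not this instance.
        Some(leader) if leader != own_address => Some(leader),
        _ => None,
    }
}
```

Condition (4) is what prevents a stepped-down instance from forwarding to itself during the window before the new leader has updated the database.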
Heikki Linnakangas	2db840d8b8	Move a few test functions related to auth tokens to separate file (#9018 ) For readability. neon_fixtures.py is huge.	2024-09-17 06:53:18 +03:00
Heikki Linnakangas	4295ff0f07	Mark a couple of test fixtures as session-scoped (#9018 ) pg_distrib_dir doesn't include the Postgres version and only depends on env variables which cannot change during a test run, so it can be marked as session-scoped. Similarly, the platform cannot change during a test run.	2024-09-17 06:53:18 +03:00
Heikki Linnakangas	c6f56b8462	Remove redundant get_dir_size() function (#9018 ) There was another copy of it in utils.py. The only difference is that the version in utils.py tolerates files that are concurrently removed. That seems fine for the few callers in neon_fixtures.py too.	2024-09-17 06:53:18 +03:00
Heikki Linnakangas	fec9321fc0	Use Path type in a few more places in neon_fixtures.py (#9018 ) This is in preparation of replacing neon_fixtures.get_dir_size with neon_fixtures.utils.get_dir_size() in next commit.	2024-09-17 06:53:18 +03:00
Heikki Linnakangas	3a52e356c1	Remove unused function (#9018 )	2024-09-17 06:53:18 +03:00
Tristan Partin	5e16c7bb0b	Generate pgbench data on the server for most tests This should generally be faster when running tests, especially those that run with higher scales. Ignoring test_lfc_resize since it seems like we are hitting a query timeout for some reason that I have yet to investigate. A little bit of improvement is better than none. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-09-16 23:37:36 +01:00
Heikki Linnakangas	2bbb4d3e1c	Remove misc unused code (#9014 )	2024-09-16 18:45:19 +00:00
Matthias van de Meent	c8bedca582	Fix PG17's extension modifications (#9010 ) This also reduces the GRANT statements to one per created _reset function	2024-09-16 17:06:31 +01:00
Tristan Partin	5876c441ab	Grant access to pg_show_replication_origin_status for neon_superuser Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-09-16 16:38:55 +01:00
Alexander Bayandin	b2c83db54d	CI(gather-rust-build-stats): set PQ_LIB_DIR to Postgres 17 (#9001 ) ## Problem `gather-rust-build-stats` extra CI job fails with ``` "PQ_LIB_DIR" doesn't exist in the configured path: "/__w/neon/neon/pg_install/v16/lib" ``` ## Summary of changes - Use the path to Postgres 17 for the `gather-rust-build-stats` job. The job uses Postgres built by `make walproposer-lib`	2024-09-16 12:44:26 +01:00
Matthias van de Meent	0a8c5e1214	Fix broken image for PG17 (#8998 ) Most extensions are not required to run Neon-based PostgreSQL, but the Neon extension is _quite_ critical, so let's make sure we include it. ## Problem Staging doesn't have working compute images for PG17 ## Summary of changes Disable some PG17 filters so that we get the critical components into the PG17 image	2024-09-13 15:10:52 +01:00
Matthias van de Meent	78938d1b59	[compute/postgres] feature: PostgreSQL 17 (#8573 ) This adds preliminary PG17 support to Neon, based on RC1 / 2024-09-04 `07b828e9d4` NOTICE: The data produced by the included version of the PostgreSQL fork may not be compatible with the future full release of PostgreSQL 17 due to expected or unexpected future changes in magic numbers and internals. DO NOT EXPECT DATA IN V17-TENANTS TO BE COMPATIBLE WITH THE 17.0 RELEASE! Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech> Co-authored-by: Alexander Bayandin <alexander@neon.tech> Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2024-09-12 23:18:41 +01:00
Conrad Ludgate	66a99009ba	Merge pull request #8799 from neondatabase/rc/proxy/2024-08-22 Proxy release 2024-08-22	2024-08-22 10:04:56 +01:00
Conrad Ludgate	5d4c57491f	Merge pull request #8723 from neondatabase/rc/proxy/2024-08-14 Proxy release 2024-08-14	2024-08-14 13:05:51 +01:00
Conrad Ludgate	73935ea3a2	Merge pull request #8647 from neondatabase/rc/proxy/2024-08-08 Proxy release 2024-08-08	2024-08-08 15:37:09 +01:00
Conrad Ludgate	32e595d4dd	Merge branch 'release-proxy' into rc/proxy/2024-08-08	2024-08-08 13:53:33 +01:00
Conrad Ludgate	b0d69acb07	Merge pull request #8505 from neondatabase/rc/proxy/2024-07-25 Proxy release 2024-07-25	2024-07-25 11:07:19 +01:00
Conrad Ludgate	98355a419a	Merge pull request #8351 from neondatabase/rc/proxy/2024-07-11 Proxy release 2024-07-11	2024-07-11 10:40:17 +01:00
Conrad Ludgate	cfb03d6cf0	Merge pull request #8178 from neondatabase/rc/proxy/2024-06-27 Proxy release 2024-06-27	2024-06-27 11:35:30 +01:00
Conrad Ludgate	d81ef3f962	Revert "proxy: update tokio-postgres to allow arbitrary config params (#8076 )" This reverts commit `78d9059fc7`.	2024-06-27 09:46:58 +01:00
Conrad Ludgate	5d62c67e75	Merge pull request #8117 from neondatabase/rc/proxy/2024-06-20 Proxy release 2024-06-20	2024-06-20 11:42:35 +01:00
Anna Khanova	53d53d5b1e	Merge pull request #7980 from neondatabase/rc/proxy/2024-06-06 Proxy release 2024-06-06	2024-06-06 13:14:40 +02:00
Anna Khanova	29fe6ea47a	Merge pull request #7909 from neondatabase/rc/proxy/2024-05-30 Proxy release 2024-05-30	2024-05-30 14:59:41 +02:00
Alexander Bayandin	640327ccb3	Merge pull request #7880 from neondatabase/rc/proxy/2024-05-24 Proxy release 2024-05-24	2024-05-24 18:00:18 +01:00
Anna Khanova	7cf0f6b37e	Merge pull request #7853 from neondatabase/rc/proxy/2024-05-23 Proxy release 2024-05-23	2024-05-23 12:09:13 +02:00
Anna Khanova	03c2c569be	[proxy] Do not fail after parquet upload error (#7858 ) ## Problem If the parquet upload was unsuccessful, it will panic. ## Summary of changes Write error in logs instead.	2024-05-23 11:44:47 +02:00
Conrad Ludgate	eff6d4538a	Merge pull request #7654 from neondatabase/rc/proxy/2024-05-08 Proxy release 2024-05-08	2024-05-08 11:56:20 +01:00
Conrad Ludgate	5ef7782e9c	Merge pull request #7649 from neondatabase/rc/proxy/2024-05-08 Proxy release 2024-05-08	2024-05-08 06:54:03 +01:00
Conrad Ludgate	73101db8c4	Merge branch 'release-proxy' into rc/proxy/2024-05-08	2024-05-08 06:43:57 +01:00
Anna Khanova	bccdfc6d39	Merge pull request #7580 from neondatabase/rc/proxy/2024-05-02 Proxy release 2024-05-02	2024-05-02 12:00:01 +02:00
Anna Khanova	99595813bb	proxy: keep track on the number of events from redis by type. (#7582 ) ## Problem The distribution of messages that proxy consumes from redis is unclear. ## Summary of changes Add counter.	2024-05-02 11:56:19 +02:00
Anna Khanova	fe07b54758	Merge pull request #7507 from neondatabase/rc/proxy/2024-04-25 Proxy release 2024-04-25	2024-04-25 13:50:05 +02:00
Anna Khanova	a42d173e7b	proxy: Fix cancellations (#7510 ) ## Problem Cancellations were published to a channel that was never read. ## Summary of changes Fallback to global redis publishing.	2024-04-25 13:42:25 +02:00
Anna Khanova	e07f689238	Update connect to compute and wake compute retry configs (#7509 ) ## Problem ## Summary of changes Decrease waiting time	2024-04-25 13:20:21 +02:00
Conrad Ludgate	7831eddc88	Merge pull request #7417 from neondatabase/rc/proxy/2024-04-18 Proxy release 2024-04-18	2024-04-18 12:03:07 +01:00
Conrad Ludgate	943b1bc80c	Merge pull request #7366 from neondatabase/proxy-hotfix Release proxy (2024-04-11 hotfix)	2024-04-12 10:15:14 +01:00
Conrad Ludgate	95a184e9b7	proxy: fix overloaded db connection closure (#7364 ) ## Problem possible for the database connections to not close in time. ## Summary of changes force the closing of connections if the client has hung up	2024-04-11 23:38:47 +01:00
Conrad Ludgate	3fa17e9d17	Merge pull request #7357 from neondatabase/rc/proxy/2024-04-11 Proxy release 2024-04-11	2024-04-11 11:49:45 +01:00
Anna Khanova	55e0fd9789	Merge pull request #7304 from neondatabase/rc/proxy/2024-04-04 Proxy release 2024-04-04	2024-04-04 12:40:11 +02:00
Anna Khanova	2a88889f44	Merge pull request #7254 from neondatabase/rc/proxy/2024-03-27 Proxy release 2024-03-27	2024-03-27 11:44:09 +01:00
Conrad Ludgate	5bad8126dc	Merge pull request #7173 from neondatabase/rc/proxy/2024-03-19 Proxy release 2024-03-19	2024-03-19 12:11:42 +00:00
Anna Khanova	27bc242085	Merge pull request #7119 from neondatabase/rc/proxy/2024-03-14 Proxy release 2024-03-14	2024-03-14 14:57:05 +05:00
Anna Khanova	192b49cc6d	Merge branch 'release-proxy' into rc/proxy/2024-03-14	2024-03-14 14:16:36 +05:00
Conrad Ludgate	e1b60f3693	Merge pull request #7041 from neondatabase/rc/proxy/2024-03-07 Proxy release 2024-03-07	2024-03-08 08:19:16 +00:00
Anna Khanova	2804f5323b	Merge pull request #6997 from neondatabase/rc/proxy/2024-03-04 Proxy release 2024-03-04	2024-03-04 17:36:11 +04:00
Anna Khanova	676adc6b32	Merge branch 'release-proxy' into rc/proxy/2024-03-04	2024-03-04 16:41:46 +04:00
Nikita Kalyanov	96a4e8de66	Add /terminate API (#6745 ) (#6853 ) this is to speed up suspends, see https://github.com/neondatabase/cloud/issues/10284 Cherry-pick to release branch to build new compute images	2024-02-22 11:51:19 +02:00
Arseny Sher	01180666b0	Merge pull request #6803 from neondatabase/releases/2024-02-19 Release 2024-02-19	2024-02-19 16:38:35 +04:00
Conrad Ludgate	6c94269c32	Merge pull request #6758 from neondatabase/release-proxy-2024-02-14 2024-02-14 Proxy Release	2024-02-15 09:45:08 +00:00
Anna Khanova	edc691647d	Proxy: remove fail fast logic to connect to compute (#6759 ) ## Problem Flaky tests ## Summary of changes Remove failfast logic	2024-02-15 07:42:12 +00:00
Conrad Ludgate	855d7b4781	hold cancel session (#6750 ) ## Problem In a recent refactor, we accidentally dropped the cancel session early ## Summary of changes Hold the cancel session during proxy passthrough	2024-02-14 14:57:22 +00:00
Anna Khanova	c49c9707ce	Proxy: send cancel notifications to all instances (#6719 ) ## Problem If a cancel request ends up on the wrong proxy instance, it has no effect. ## Summary of changes Send redis notifications to all proxy pods about the cancel request. Related issue: https://github.com/neondatabase/neon/issues/5839, https://github.com/neondatabase/cloud/issues/10262	2024-02-14 14:57:22 +00:00
Anna Khanova	2227540a0d	Proxy refactor auth+connect (#6708 ) ## Problem Not really a problem, just refactoring. ## Summary of changes Separate authenticate from wake compute. Do not call wake compute second time if we managed to connect to postgres or if we got it not from cache.	2024-02-14 14:57:22 +00:00
Conrad Ludgate	f1347f2417	proxy: add more http logging (#6726 ) ## Problem hard to see where time is taken during HTTP flow. ## Summary of changes add a lot more for query state. add a conn_id field to the sql-over-http span	2024-02-14 14:57:22 +00:00
Conrad Ludgate	30b295b017	proxy: some more parquet data (#6711 ) ## Summary of changes add auth_method and database to the parquet logs	2024-02-14 14:57:22 +00:00
Anna Khanova	1cef395266	Proxy: copy bidirectional fork (#6720 ) ## Problem `tokio::io::copy_bidirectional` doesn't close the connection once one of the sides closes it. It's not really suitable for the postgres protocol. ## Summary of changes Fork `copy_bidirectional` and initiate a shutdown for both connections. --------- Co-authored-by: Conrad Ludgate <conradludgate@gmail.com>	2024-02-14 14:57:22 +00:00
John Spray	78d160f76d	Merge pull request #6721 from neondatabase/releases/2024-02-12 Release 2024-02-12	2024-02-12 09:35:30 +00:00
Vlad Lazar	b9238059d6	Merge pull request #6617 from neondatabase/releases/2024-02-05 Release 2024-02-05	2024-02-05 12:50:38 +00:00
Arpad Müller	d0cb4b88c8	Don't preserve temp files on creation errors of delta layers (#6612 ) There is currently no cleanup done after a delta layer creation error, so delta layers can accumulate. The problem gets worse as the operation gets retried and delta layers accumulate on the disk. Therefore, delete them from disk (if something has been written to disk).	2024-02-05 09:58:18 +00:00
John Spray	1ec3e39d4e	Merge pull request #6504 from neondatabase/releases/2024-01-29 Release 2024-01-29	2024-01-29 10:05:01 +00:00
John Spray	a1a74eef2c	Merge pull request #6420 from neondatabase/releases/2024-01-22 Release 2024-01-22	2024-01-22 17:24:11 +00:00
John Spray	90e689adda	pageserver: mark tenant broken when cancelling attach (#6430 ) ## Problem When a tenant is in Attaching state, and waiting for the `concurrent_tenant_warmup` semaphore, it also listens for the tenant cancellation token. When that token fires, Tenant::attach drops out. Meanwhile, Tenant::set_stopping waits forever for the tenant to exit Attaching state. Fixes: https://github.com/neondatabase/neon/issues/6423 ## Summary of changes - In the absence of a valid state for the tenant, it is set to Broken in this path. A more elegant solution will require more refactoring, beyond this minimal fix. (cherry picked from commit `93572a3e99`)	2024-01-22 16:20:57 +00:00
Christian Schwarz	f0b2d4b053	fixup(#6037 ): actually fix the issue, #6388 failed to do so (#6429 ) Before this patch, the select! still returned immediately if `futs` was empty. Must have tested a stale build in my manual testing of #6388. (cherry picked from commit `15c0df4de7`)	2024-01-22 15:23:12 +00:00
Anna Khanova	299d9474c9	Proxy: fix gc (#6426 ) ## Problem Gc currently doesn't work properly. ## Summary of changes Change statement on running gc.	2024-01-22 14:39:09 +01:00
Conrad Ludgate	7234208b36	bump shlex (#6421 ) ## Problem https://rustsec.org/advisories/RUSTSEC-2024-0006 ## Summary of changes `cargo update -p shlex` (cherry picked from commit `5559b16953`)	2024-01-22 09:49:33 +00:00
Christian Schwarz	93450f11f5	Merge pull request #6354 from neondatabase/releases/2024-01-15 Release 2024-01-15 NB: the previous release PR https://github.com/neondatabase/neon/pull/6286 was accidentally merged by merge-by-squash instead of merge-by-merge-commit. See https://github.com/neondatabase/neon/pull/6354#issuecomment-1891706321 for more context.	2024-01-15 14:30:25 +01:00
Christian Schwarz	2f0f9edf33	Merge remote-tracking branch 'origin/release' into releases/2024-01-15	2024-01-15 09:36:42 +00:00
Christian Schwarz	d424f2b7c8	empty commit so we can produce a merge commit	2024-01-15 09:36:22 +00:00
Christian Schwarz	21315e80bc	Merge branch 'releases/2024-01-08--not-squashed' into releases/2024-01-15	2024-01-15 09:31:07 +00:00
vipvap	483b66d383	Merge branch 'release' into releases/2024-01-08 (not-squashed merge of #6286 ) Release PR https://github.com/neondatabase/neon/pull/6286 got accidentally merged-by-squash instead of merge-by-merge-commit. This commit shows how things would look like if 6286 had been merged-by-squash. ``` git reset --hard `9f1327772` git merge --no-ff `5c0264b591` ``` Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-01-15 09:28:08 +00:00
vipvap	aa72a22661	Release 2024-01-08 (#6286 ) Release 2024-01-08	2024-01-08 09:26:27 +00:00
Shany Pozin	5c0264b591	Merge branch 'release' into releases/2024-01-08	2024-01-08 09:34:06 +02:00
Arseny Sher	9f13277729	Merge pull request #6242 from neondatabase/releases/2024-01-02 Release 2024-01-02	2024-01-02 12:04:43 +04:00
Arseny Sher	54aa319805	Don't split WAL record across two XLogData's when sending from safekeepers. As protocol demands. Not following this makes standby complain about corrupted WAL in various ways. https://neondb.slack.com/archives/C05L7D1JAUS/p1703774799114719 closes https://github.com/neondatabase/cloud/issues/9057	2024-01-02 10:54:00 +04:00
Arseny Sher	4a227484bf	Add large insertion and slow WAL sending to test_hot_standby. To exercise MAX_SEND_SIZE sending from safekeeper; we've had a bug with WAL records torn across several XLogData messages. Add failpoint to safekeeper to slow down sending. Also check for corrupted WAL complaints in standby log. Make the test a bit simpler in passing, e.g. we don't need explicit commits as autocommit is enabled by default. https://neondb.slack.com/archives/C05L7D1JAUS/p1703774799114719 https://github.com/neondatabase/cloud/issues/9057	2024-01-02 10:54:00 +04:00
Arseny Sher	2f83f85291	Add failpoint support to safekeeper. Just a copy paste from pageserver.	2024-01-02 10:54:00 +04:00
Arseny Sher	d6cfcb0d93	Move failpoint support code to utils. To enable them in safekeeper as well.	2024-01-02 10:54:00 +04:00
Arseny Sher	392843ad2a	Fix safekeeper START_REPLICATION (term=n). It was giving WAL only up to commit_lsn instead of flush_lsn, so recovery of uncommitted WAL since `cdb08f03` hung. Add test for this.	2024-01-02 10:54:00 +04:00
Arseny Sher	bd4dae8f4a	compute_ctl: kill postgres and sync-safekeepers on exit. Otherwise they are left orphaned when compute_ctl is terminated with a signal. It was invisible most of the time because normally neon_local or k8s kills postgres directly and then compute_ctl finishes gracefully. However, in some tests compute_ctl gets stuck waiting for sync-safekeepers which intentionally never ends because safekeepers are offline, and we want to stop compute_ctl without leaving orphans behind. This is quite a rough approach which doesn't wait for children termination. A better way would be to convert compute_ctl to async which would make waiting easy.	2024-01-02 10:54:00 +04:00
Shany Pozin	b05fe53cfd	Merge pull request #6240 from neondatabase/releases/2024-01-01 Release 2024-01-01	2024-01-01 11:07:30 +02:00
Christian Schwarz	c13a2f0df1	Merge pull request #6192 from neondatabase/releases/2023-12-19 Release 2023-12-19 We need to do a config change that requires restarting the pageservers. Slip in two metrics-related commits that didn't make this week's regular release.	2023-12-19 14:52:47 +01:00
Christian Schwarz	39be366fc5	higher resolution histograms for getpage@lsn (#6177 ) part of https://github.com/neondatabase/cloud/issues/7811	2023-12-19 13:46:59 +00:00
Christian Schwarz	6eda0a3158	[PRE-MERGE] fix metric `pageserver_initial_logical_size_start_calculation` (This is a pre-merge cherry-pick of https://github.com/neondatabase/neon/pull/6191) It wasn't being incremented. Fixup of commit `1c88824ed0` Author: Christian Schwarz <christian@neon.tech> Date: Fri Dec 1 12:52:59 2023 +0100 initial logical size calculation: add a bunch of metrics (#5995)	2023-12-19 13:46:55 +00:00
Shany Pozin	306c7a1813	Merge pull request #6173 from neondatabase/sasha_release_bypassrls_replication Grant BYPASSRLS and REPLICATION explicitly to neon_superuser roles	2023-12-18 22:16:36 +02:00
Sasha Krassovsky	80be423a58	Grant BYPASSRLS and REPLICATION explicitly to neon_superuser roles	2023-12-18 10:22:36 -08:00
Shany Pozin	5dcfef82f2	Merge pull request #6163 from neondatabase/releases/2023-12-18 Release 2023-12-18-2	2023-12-18 15:34:17 +02:00
Christian Schwarz	e67b8f69c0	[PRE-MERGE] pageserver: Reduce tracing overhead in timeline::get #6115 Pre-merge `git merge --squash` of https://github.com/neondatabase/neon/pull/6115 Lowering the tracing level in get_value_reconstruct_data and get_or_maybe_download from info to debug reduces the overhead of span creation in non-debug environments.	2023-12-18 13:39:48 +01:00
Shany Pozin	e546872ab4	Merge pull request #6158 from neondatabase/releases/2023-12-18 Release 2023-12-18	2023-12-18 14:24:34 +02:00
John Spray	322ea1cf7c	pageserver: on-demand activation cleanups (#6157 ) ## Problem #6112 added some logs and metrics: clean these up a bit: - Avoid counting startup completions for tenants launched after startup - exclude no-op cases from timing histograms - remove a rogue log message	2023-12-18 11:14:19 +00:00
Vadim Kharitonov	3633742de9	Merge pull request #6121 from neondatabase/releases/2023-12-13 Release 2023-12-13	2023-12-13 12:39:43 +01:00
Joonas Koivunen	079d3a37ba	Merge remote-tracking branch 'origin/release' into releases/2023-12-13 this handles the hotfix introduced conflict.	2023-12-13 10:07:19 +00:00
Vadim Kharitonov	a46e77b476	Merge pull request #6090 from neondatabase/releases/2023-12-11 Release 2023-12-11	2023-12-12 12:10:35 +01:00
Tristan Partin	a92702b01e	Add submodule paths as safe directories as a precaution The check-codestyle-rust-arm job requires this for some reason, so let's just add them everywhere we do this workaround.	2023-12-11 22:00:35 +00:00
Tristan Partin	8ff3253f20	Fix git ownership issue in check-codestyle-rust-arm We have this workaround for other jobs. Looks like this one was forgotten about.	2023-12-11 22:00:35 +00:00
Joonas Koivunen	04b82c92a7	fix: accidental return Ok (#6106 ) Error indicating request cancellation OR timeline shutdown was deemed as a reason to exit the background worker that calculated synthetic size. Fix it to only be considered for avoiding logging of such errors. This conflicted on tenant_shard_id having already replaced tenant_id on `main`.	2023-12-11 21:41:36 +00:00
Vadim Kharitonov	e5bf423e68	Merge branch 'release' into releases/2023-12-11	2023-12-11 11:55:48 +01:00
Vadim Kharitonov	60af392e45	Merge pull request #6057 from neondatabase/vk/patch_timescale_for_production Revert timescaledb for pg14 and pg15 (#6056)	2023-12-06 16:21:16 +01:00
Vadim Kharitonov	661fc41e71	Revert timescaledb for pg14 and pg15 (#6056 ) ``` could not start the compute node: compute is in state "failed": db error: ERROR: could not access file "$libdir/timescaledb-2.10.1": No such file or directory Caused by: ERROR: could not access file "$libdir/timescaledb-2.10.1": No such file or directory ```	2023-12-06 16:14:07 +01:00
Shany Pozin	702c488f32	Merge pull request #6022 from neondatabase/releases/2023-12-04 Release 2023-12-04	2023-12-05 17:03:28 +02:00
Sasha Krassovsky	45c5122754	Remove trusted from wal2json	2023-12-04 12:36:19 -08:00
Shany Pozin	558394f710	fix merge	2023-12-04 11:41:27 +02:00
Shany Pozin	73b0898608	Merge branch 'release' into releases/2023-12-04	2023-12-04 11:36:26 +02:00
Joonas Koivunen	e65be4c2dc	Merge pull request #6013 from neondatabase/releases/2023-12-01-hotfix fix: use create_new instead of create for mutex file	2023-12-01 15:35:56 +02:00
Joonas Koivunen	40087b8164	fix: use create_new instead of create for mutex file	2023-12-01 12:54:49 +00:00
Shany Pozin	c762b59483	Merge pull request #5986 from neondatabase/Release-11-30-hotfix Notify safekeeper readiness with systemd.	2023-11-30 10:01:05 +02:00
Arseny Sher	5d71601ca9	Notify safekeeper readiness with systemd. To avoid downtime during deploy, as in busy regions initial load can currently take ~30s.	2023-11-30 08:23:31 +03:00
Shany Pozin	a113c3e433	Merge pull request #5945 from neondatabase/release-2023-11-28-hotfix Release 2023 11 28 hotfix	2023-11-28 08:14:59 +02:00
Anastasia Lubennikova	e81fc598f4	Update neon extension relocatable for existing installations (#5943 )	2023-11-28 00:12:39 +00:00
Anastasia Lubennikova	48b845fa76	Make neon extension relocatable to allow SET SCHEMA (#5942 )	2023-11-28 00:12:32 +00:00
Shany Pozin	27096858dc	Merge pull request #5922 from neondatabase/releases/2023-11-27 Release 2023-11-27	2023-11-27 09:58:51 +02:00
Shany Pozin	4430d0ae7d	Merge pull request #5876 from neondatabase/releases/2023-11-17 Release 2023-11-17	2023-11-20 09:11:58 +02:00
Joonas Koivunen	6e183aa0de	Merge branch 'main' into releases/2023-11-17	2023-11-19 15:25:47 +00:00
Vadim Kharitonov	fd6d0b7635	Merge branch 'release' into releases/2023-11-17	2023-11-17 10:51:45 +01:00
Vadim Kharitonov	3710c32aae	Merge pull request #5778 from neondatabase/releases/2023-11-03 Release 2023-11-03	2023-11-03 16:06:58 +01:00
Vadim Kharitonov	be83bee49d	Merge branch 'release' into releases/2023-11-03	2023-11-03 11:18:15 +01:00
Alexander Bayandin	cf28e5922a	Merge pull request #5685 from neondatabase/releases/2023-10-26 Release 2023-10-26	2023-10-27 10:42:12 +01:00
Em Sharnoff	7d384d6953	Bump vm-builder v0.18.2 -> v0.18.4 (#5666 ) Only applicable change was neondatabase/autoscaling#584, setting pgbouncer auth_dbname=postgres in order to fix superuser connections from preventing dropping databases.	2023-10-26 20:15:45 +01:00
Em Sharnoff	4b3b37b912	Bump vm-builder v0.18.1 -> v0.18.2 (#5646 ) Only applicable change was neondatabase/autoscaling#571, removing the postgres_exporter flags `--auto-discover-databases` and `--exclude-databases=...`	2023-10-26 20:15:29 +01:00
Shany Pozin	1d8d200f4d	Merge pull request #5668 from neondatabase/sp/aux_files_cherry_pick Cherry pick: Ignore missed AUX_FILES_KEY when generating image layer (#5660)	2023-10-26 10:08:16 +03:00
Konstantin Knizhnik	0d80d6ce18	Ignore missed AUX_FILES_KEY when generating image layer (#5660 ) ## Problem Logical replication requires the new AUX_FILES_KEY, which is definitely absent in existing databases. We do not have a function to check if a key exists in our KV storage. So I have to handle the error in the `list_aux_files` method. But this key is also included in the key space range and accessed by the `create_image_layer` method. ## Summary of changes Check if AUX_FILES_KEY exists before including it in keyspace. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Shany Pozin <shany@neon.tech> Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2023-10-26 09:30:28 +03:00
Shany Pozin	f653ee039f	Merge pull request #5638 from neondatabase/releases/2023-10-24 Release 2023-10-24	2023-10-24 12:10:52 +03:00
Em Sharnoff	e614a95853	Merge pull request #5610 from neondatabase/sharnoff/rc-2023-10-20-vm-monitor-fixes Release 2023-10-20: vm-monitor memory.high throttling fixes	2023-10-20 00:11:06 -07:00
Em Sharnoff	850db4cc13	vm-monitor: Deny not fail downscale if no memory stats yet (#5606 ) Fixes an issue we observed on staging that happens when the autoscaler-agent attempts to immediately downscale the VM after binding, which is typical for pooled computes. The issue was occurring because the autoscaler-agent was requesting downscaling before the vm-monitor had gathered sufficient cgroup memory stats to be confident in approving it. When the vm-monitor returned an internal error instead of denying downscaling, the autoscaler-agent retried the connection and immediately hit the same issue (in part because cgroup stats are collected per-connection, rather than globally).	2023-10-19 21:56:55 -07:00
Em Sharnoff	8a316b1277	vm-monitor: Log full error on message handling failure (#5604 ) There's currently an issue with the vm-monitor on staging that's not really feasible to debug because the current display impl gives no context to the errors (just says "failed to downscale"). Logging the full error should help. For communications with the autoscaler-agent, it's ok to only provide the outermost cause, because we can cross-reference with the VM logs. At some point in the future, we may want to change that.	2023-10-19 21:56:50 -07:00
Em Sharnoff	4d13bae449	vm-monitor: Switch from memory.high to polling memory.stat (#5524 ) tl;dr it's really hard to avoid throttling from memory.high, and it counts tmpfs & page cache usage, so it's also hard to make sense of. In the interest of fixing things quickly with something that should be good enough, this PR switches to instead periodically fetch memory statistics from the cgroup's memory.stat and use that data to determine if and when we should upscale. This PR fixes #5444, which has a lot more detail on the difficulties we've hit with memory.high. This PR also supersedes #5488.	2023-10-19 21:56:36 -07:00
Vadim Kharitonov	49377abd98	Merge pull request #5577 from neondatabase/releases/2023-10-17 Release 2023-10-17	2023-10-17 12:21:20 +02:00
Christian Schwarz	a6b2f4e54e	limit imitate accesses concurrency, using same semaphore as compactions (#5578 ) Before this PR, when we restarted pageserver, we'd see a rush of `$number_of_tenants` concurrent eviction tasks starting to do imitate accesses building up in the period of `[init_order allows activations, $random_access_delay + EvictionPolicyLayerAccessThreshold::period]`. We simply cannot handle that degree of concurrent IO. We already solved the problem for compactions by adding a semaphore. So, this PR shares that semaphore for use by evictions. Part of https://github.com/neondatabase/neon/issues/5479 Which is again part of https://github.com/neondatabase/neon/issues/4743 Risks / Changes In System Behavior ================================== * we don't do evictions as timely as we currently do * we log a bunch of warnings about eviction taking too long * imitate accesses and compactions compete for the same concurrency limit, so, they'll slow each other down through this shared semaphore Changes ======= - Move the `CONCURRENT_COMPACTIONS` semaphore into `tasks.rs` - Rename it to `CONCURRENT_BACKGROUND_TASKS` - Use it also for the eviction imitate accesses: - Imitate accesses are both per-TIMELINE and per-TENANT - The per-TENANT is done through coalescing all the per-TIMELINE tasks via a tokio mutex `eviction_task_tenant_state`. - We acquire the CONCURRENT_BACKGROUND_TASKS permit early, at the beginning of the eviction iteration, much before the imitate accesses start (and they may not even start at all in the given iteration, as they happen only every $threshold). - Acquiring early is sub-optimal because when the per-timeline tasks coalesce on the `eviction_task_tenant_state` mutex, they are already holding a CONCURRENT_BACKGROUND_TASKS permit. - It's also unfair because tenants with many timelines win the CONCURRENT_BACKGROUND_TASKS more often.
- I don't think there's another way though, without refactoring more of the imitate accesses logic, e.g, making it all per-tenant. - Add metrics for queue depth behind the semaphore. I found these very useful to understand what work is queued in the system. - The metrics are tagged by the new `BackgroundLoopKind`. - On a green slate, I would have used `TaskKind`, but we already had pre-existing labels whose names didn't map exactly to task kind. Also the task kind is kind of a lower-level detail, so, I think it's fine to have a separate enum to identify background work kinds. Future Work =========== I guess I could move the eviction tasks from a ticker to "sleep for $period". The benefit would be that the semaphore automatically "smears" the eviction task scheduling over time, so, we only have the rush on restart but a smeared-out rush afterward. The downside is that this perverts the meaning of "$period", as we'd actually not run the eviction at a fixed period. It also means the "took to long" warning & metric becomes meaningless. Then again, that is already the case for the compaction and gc tasks, which do sleep for `$period` instead of using a ticker. (cherry picked from commit `9256788273`)	2023-10-17 12:16:26 +02:00
Shany Pozin	face60d50b	Merge pull request #5526 from neondatabase/releases/2023-10-11 Release 2023-10-11	2023-10-11 11:16:39 +03:00
Shany Pozin	9768aa27f2	Merge pull request #5516 from neondatabase/releases/2023-10-10 Release 2023-10-10	2023-10-10 14:16:47 +03:00
Shany Pozin	96b2e575e1	Merge pull request #5445 from neondatabase/releases/2023-10-03 Release 2023-10-03	2023-10-04 13:53:37 +03:00
Alexander Bayandin	7222777784	Update checksums for pg_jsonschema & pg_graphql (#5455 ) ## Problem Folks have re-tagged releases for `pg_jsonschema` and `pg_graphql` (to increase timeouts on their CI). For us, these are no-op changes, but unfortunately, this will cause our builds to fail due to a checksum mismatch (this might not strike right away because of the build cache). - `8ba7c7be9d` - `aa7509370a` ## Summary of changes - `pg_jsonschema` update checksum - `pg_graphql` update checksum	2023-10-03 18:44:30 +01:00
Em Sharnoff	5469fdede0	Merge pull request #5422 from neondatabase/sharnoff/rc-2023-09-28-fix-restart-on-postmaster-SIGKILL Release 2023-09-28: Fix (lack of) restart on neonvm postmaster SIGKILL	2023-09-28 10:48:51 -07:00
MMeent	72aa6b9fdd	Fix neon_zeroextend's WAL logging (#5387 ) When you log more than a few blocks, you need to reserve the space in advance. We didn't do that, so we got errors. Now we do that, and shouldn't get errors.	2023-09-28 09:37:28 -07:00
Em Sharnoff	ae0634b7be	Bump vm-builder v0.17.11 -> v0.17.12 (#5407 ) Only relevant change is neondatabase/autoscaling#534 - refer there for more details.	2023-09-28 09:28:04 -07:00
Shany Pozin	70711f32fa	Merge pull request #5375 from neondatabase/releases/2023-09-26 Release 2023-09-26	2023-09-26 15:19:45 +03:00
Vadim Kharitonov	52a88af0aa	Merge pull request #5336 from neondatabase/releases/2023-09-19 Release 2023-09-19	2023-09-19 11:16:43 +02:00
Alexander Bayandin	b7a43bf817	Merge branch 'release' into releases/2023-09-19	2023-09-19 09:07:20 +01:00
Alexander Bayandin	dce91b33a4	Merge pull request #5318 from neondatabase/releases/2023-09-15-1 Postgres 14/15: Use previous extensions versions	2023-09-15 16:30:44 +01:00
Alexander Bayandin	23ee4f3050	Revert plv8 only	2023-09-15 15:45:23 +01:00
Alexander Bayandin	46857e8282	Postgres 14/15: Use previous extensions versions	2023-09-15 15:27:00 +01:00
Alexander Bayandin	368ab0ce54	Merge pull request #5313 from neondatabase/releases/2023-09-15 Release 2023-09-15	2023-09-15 10:39:56 +01:00
Konstantin Knizhnik	a5987eebfd	References to old and new blocks were mixed in xlog_heap_update handler (#5312 ) ## Problem See https://neondb.slack.com/archives/C05L7D1JAUS/p1694614585955029 https://www.notion.so/neondatabase/Duplicate-key-issue-651627ce843c45188fbdcb2d30fd2178 ## Summary of changes Swap old/new block references ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2023-09-15 10:11:41 +01:00
Alexander Bayandin	6686ede30f	Update checksum for pg_hint_plan (#5309 ) ## Problem The checksum for `pg_hint_plan` doesn't match: ``` sha256sum: WARNING: 1 computed checksum did NOT match ``` Ref https://github.com/neondatabase/neon/actions/runs/6185715461/job/16793609251?pr=5307 It seems that the release was retagged yesterday: https://github.com/ossc-db/pg_hint_plan/releases/tag/REL16_1_6_0 I don't see any malicious changes from 15_1.5.1: https://github.com/ossc-db/pg_hint_plan/compare/REL15_1_5_1...REL16_1_6_0, so it should be ok to update. ## Summary of changes - Update checksum for `pg_hint_plan` 16_1.6.0	2023-09-15 09:54:42 +01:00
Em Sharnoff	373c7057cc	vm-monitor: Fix cgroup throttling (#5303 ) I believe this (not actual IO problems) is the cause of the "disk speed issue" that we've had for VMs recently. See e.g.: 1. https://neondb.slack.com/archives/C03H1K0PGKH/p1694287808046179?thread_ts=1694271790.580099&cid=C03H1K0PGKH 2. https://neondb.slack.com/archives/C03H1K0PGKH/p1694511932560659 The vm-informant (and now, the vm-monitor, its replacement) is supposed to gradually increase the `neon-postgres` cgroup's memory.high value, because otherwise the kernel will throttle all the processes in the cgroup. This PR fixes a bug with the vm-monitor's implementation of this behavior. --- Other references, for the vm-informant's implementation: - Original issue: neondatabase/autoscaling#44 - Original PR: neondatabase/autoscaling#223	2023-09-15 09:54:42 +01:00
Shany Pozin	7d6ec16166	Merge pull request #5296 from neondatabase/releases/2023-09-13 Release 2023-09-13	2023-09-13 13:49:14 +03:00
Shany Pozin	0e6fdc8a58	Merge pull request #5283 from neondatabase/releases/2023-09-12 Release 2023-09-12	2023-09-12 14:56:47 +03:00
Christian Schwarz	521438a5c6	fix deadlock around TENANTS (#5285 ) The sequence that can lead to a deadlock: 1. DELETE request gets all the way to `tenant.shutdown(progress, false).await.is_err() ` , while holding TENANTS.read() 2. POST request for tenant creation comes in, calls `tenant_map_insert`, it does `let mut guard = TENANTS.write().await;` 3. Something that `tenant.shutdown()` needs to wait for needs a `TENANTS.read().await`. The only case identified in exhaustive manual scanning of the code base is this one: Imitate size access does `get_tenant().await`, which does `TENANTS.read().await` under the hood. In the above case (1) waits for (3), (3)'s read-lock request is queued behind (2)'s write-lock, and (2) waits for (1). Deadlock. I made a reproducer/proof-that-above-hypothesis-holds in https://github.com/neondatabase/neon/pull/5281 , but, it's not ready for merge yet and we want the fix _now_. fixes https://github.com/neondatabase/neon/issues/5284	2023-09-12 14:13:13 +03:00
Vadim Kharitonov	07d7874bc8	Merge pull request #5202 from neondatabase/releases/2023-09-05 Release 2023-09-05	2023-09-05 12:16:06 +02:00
Anastasia Lubennikova	1804111a02	Merge pull request #5161 from neondatabase/rc-2023-08-31 Release 2023-08-31	2023-08-31 16:53:17 +03:00
Arthur Petukhovsky	cd0178efed	Merge pull request #5150 from neondatabase/release-sk-fix-active-timeline Release 2023-08-30	2023-08-30 11:43:39 +02:00
Shany Pozin	333574be57	Merge pull request #5133 from neondatabase/releases/2023-08-29 Release 2023-08-29	2023-08-29 14:02:58 +03:00
Alexander Bayandin	79a799a143	Merge branch 'release' into releases/2023-08-29	2023-08-29 11:17:57 +01:00
Conrad Ludgate	9da06af6c9	Merge pull request #5113 from neondatabase/release-http-connection-fix Release 2023-08-25	2023-08-25 17:21:35 +01:00
Conrad Ludgate	ce1753d036	proxy: dont return connection pending (#5107 ) ## Problem We were returning Pending when a connection had a notice/notification (introduced recently in #5020). When returning pending, the runtime assumes you will call `cx.waker().wake()` in order to continue processing. We weren't doing that, so the connection task would get stuck ## Summary of changes Don't return pending. Loop instead	2023-08-25 16:42:30 +01:00
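The "loop instead of returning Pending" fix described in ce1753d036 can be sketched with the standard library alone. This is a minimal illustration, not the proxy's actual code: `Message`, `Connection`, `NextRow`, and `poll_once` are hypothetical names introduced here, and the in-memory `inbox` stands in for a real socket.

```rust
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical messages a backend connection can yield.
#[derive(Debug, PartialEq)]
pub enum Message {
    Notice(String),
    Row(String),
}

/// Hypothetical connection backed by an in-memory queue.
pub struct Connection {
    pub inbox: VecDeque<Message>,
}

impl Connection {
    /// Yields the next message, or `None` once the queue is drained.
    /// A real connection would return `Pending` here and register the waker.
    fn poll_message(&mut self, _cx: &mut Context<'_>) -> Poll<Option<Message>> {
        Poll::Ready(self.inbox.pop_front())
    }
}

/// Future that resolves to the next row, collecting notices on the way.
pub struct NextRow<'a> {
    pub conn: &'a mut Connection,
    pub notices: Vec<String>,
}

impl Future for NextRow<'_> {
    type Output = Option<String>;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let this = self.get_mut();
        loop {
            match this.conn.poll_message(cx) {
                // Returning `Poll::Pending` here without arranging a wake-up
                // would stall the task forever. Keep polling instead.
                Poll::Ready(Some(Message::Notice(text))) => this.notices.push(text),
                Poll::Ready(Some(Message::Row(row))) => return Poll::Ready(Some(row)),
                Poll::Ready(None) => return Poll::Ready(None),
                // Only propagate `Pending` when the inner source produced it,
                // because then the inner source owns the wake-up.
                Poll::Pending => return Poll::Pending,
            }
        }
    }
}

/// Waker that does nothing, used to poll a future once by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls `fut` exactly once with a no-op waker.
pub fn poll_once<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}
```

With a notice queued ahead of a row, a single `poll_once` call on `NextRow` returns the row and leaves the notice in `notices`; no second poll (and hence no wake-up) is needed.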
Alek Westover	67db8432b4	Fix cargo deny errors (#5068 ) ## Problem cargo deny lint broken Links to the CVEs: [rustsec.org/advisories/RUSTSEC-2023-0052](https://rustsec.org/advisories/RUSTSEC-2023-0052) [rustsec.org/advisories/RUSTSEC-2023-0053](https://rustsec.org/advisories/RUSTSEC-2023-0053) One is fixed, the other one isn't so we allow it (for now), to unbreak CI. Then later we'll try to get rid of webpki in favour of the rustls fork. ## Summary of changes ``` +ignore = ["RUSTSEC-2023-0052"] ```	2023-08-25 16:42:30 +01:00
Vadim Kharitonov	4e2e44e524	Enable neon-pool-opt-in (#5062 )	2023-08-22 09:06:14 +01:00
Vadim Kharitonov	ed786104f3	Merge pull request #5060 from neondatabase/releases/2023-08-22 Release 2023-08-22	2023-08-22 09:41:02 +02:00
Stas Kelvich	84b74f2bd1	Merge pull request #4997 from neondatabase/sk/proxy-release-23-07-15 Fix lint	2023-08-15 18:54:20 +03:00
Arthur Petukhovsky	fec2ad6283	Fix lint	2023-08-15 18:49:02 +03:00
Stas Kelvich	98eebd4682	Merge pull request #4996 from neondatabase/sk/proxy_release Disable neon-pool-opt-in	2023-08-15 18:37:50 +03:00
Arthur Petukhovsky	2f74287c9b	Disable neon-pool-opt-in	2023-08-15 18:34:17 +03:00
Shany Pozin	aee1bf95e3	Merge pull request #4990 from neondatabase/releases/2023-08-15 Release 2023-08-15	2023-08-15 15:34:38 +03:00
Shany Pozin	b9de9d75ff	Merge branch 'release' into releases/2023-08-15	2023-08-15 14:35:00 +03:00
Stas Kelvich	7943b709e6	Merge pull request #4940 from neondatabase/sk/release-23-05-25-proxy-fixup Release: proxy retry fixup	2023-08-09 13:53:19 +03:00
Conrad Ludgate	d7d066d493	proxy: delay auth on retry (#4929 ) ## Problem When an endpoint is shutting down, it can take a few seconds. Currently when starting a new compute, this causes an "endpoint is in transition" error. We need to add delays before retrying to ensure that we allow time for the endpoint to shutdown properly. ## Summary of changes Adds a delay before retrying in auth. connect_to_compute already has this delay	2023-08-09 12:54:24 +03:00
Felix Prasanna	e78ac22107	release fix: revert vm builder bump from 0.13.1 -> 0.15.0-alpha1 (#4932 ) This reverts commit `682dfb3a31`. hotfix for a CLI arg issue in the monitor	2023-08-08 21:08:46 +03:00
Vadim Kharitonov	76a8f2bb44	Merge pull request #4923 from neondatabase/releases/2023-08-08 Release 2023-08-08	2023-08-08 11:44:38 +02:00
Vadim Kharitonov	8d59a8581f	Merge branch 'release' into releases/2023-08-08	2023-08-08 10:54:34 +02:00
Vadim Kharitonov	b1ddd01289	Define NEON_SMGR to make it possible for extensions to use Neon SMG API (#4889 ) Co-authored-by: Konstantin Knizhnik <knizhnik@garret.ru> Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2023-08-03 16:28:31 +03:00
Alexander Bayandin	6eae4fc9aa	Release 2023-08-02: update pg_embedding (#4877 ) Cherry-picking `ca4d71a954` from `main` into the `release` Co-authored-by: Vadim Kharitonov <vadim2404@users.noreply.github.com>	2023-08-03 08:48:09 +02:00
Christian Schwarz	765455bca2	Merge pull request #4861 from neondatabase/releases/2023-08-01--2-fix-pipeline ci: fix upload-postgres-extensions-to-s3 job	2023-08-01 13:22:07 +02:00
Christian Schwarz	4204960942	ci: fix upload-postgres-extensions-to-s3 job Commit `5f8fd640bf` (Author: Alek Westover <alek.westover@gmail.com>, Date: Wed Jul 26 08:24:03 2023 -0400, "Upload Test Remote Extensions (#4792)") switched to using the release tag instead of `latest`, but, the `promote-images` job only uploads `latest` to the prod ECR. The switch to using release tag was good in principle, but, reverting that part to make the release pipeline work. Note that a proper fix should abandon use of `:latest` tag at all: currently, if a `main` pipeline runs concurrently with a `release` pipeline, the `release` pipeline may end up using the `main` pipeline's images.	2023-08-01 12:01:45 +02:00
Christian Schwarz	67345d66ea	Merge pull request #4858 from neondatabase/releases/2023-08-01 Release 2023-08-01	2023-08-01 10:44:01 +02:00
Shany Pozin	2266ee5971	Merge pull request #4803 from neondatabase/releases/2023-07-25 Release 2023-07-25	2023-07-25 14:21:07 +03:00
Shany Pozin	b58445d855	Merge pull request #4746 from neondatabase/releases/2023-07-18 Release 2023-07-18	2023-07-18 14:45:39 +03:00
Conrad Ludgate	36050e7f3d	Merge branch 'release' into releases/2023-07-18	2023-07-18 12:00:09 +01:00
Alexander Bayandin	33360ed96d	Merge pull request #4705 from neondatabase/release-2023-07-12 Release 2023-07-12 (only proxy)	2023-07-12 19:44:36 +01:00
Conrad Ludgate	39a28d1108	proxy wake_compute loop (#4675 ) ## Problem If we fail to wake up the compute node, a subsequent connect attempt will definitely fail. However, kubernetes won't fail the connection immediately, instead it hangs until we timeout (10s). ## Summary of changes Refactor the loop to allow fast retries of compute_wake and to skip a connect attempt.	2023-07-12 18:40:11 +01:00
Conrad Ludgate	efa6aa134f	allow repeated IO errors from compute node (#4624 ) ## Problem #4598 compute nodes are not accessible some time after wake up due to kubernetes DNS not being fully propagated. ## Summary of changes Update connect retry mechanism to support handling IO errors and sleeping for 100ms ## Checklist before requesting a review - [x] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.	2023-07-12 18:40:06 +01:00
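The retry behaviour described in efa6aa134f (retry on I/O errors, sleeping between attempts) might look roughly like the sketch below. `connect_with_retry` and its parameters are hypothetical names for illustration; the real proxy code is async and is not shown in the commit message.

```rust
use std::io;
use std::thread;
use std::time::Duration;

/// Hypothetical helper: calls `connect` until it succeeds or `max_attempts`
/// is exhausted, sleeping `delay` after each I/O error.
/// At least one attempt is always made, even if `max_attempts` is 0.
pub fn connect_with_retry<T>(
    max_attempts: u32,
    delay: Duration,
    mut connect: impl FnMut(u32) -> io::Result<T>,
) -> io::Result<T> {
    let mut attempt = 1;
    loop {
        match connect(attempt) {
            Ok(conn) => return Ok(conn),
            // Attempt budget spent: surface the last error to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            // Transient failure (e.g. DNS not yet propagated): wait, then retry.
            Err(_) => {
                thread::sleep(delay);
                attempt += 1;
            }
        }
    }
}
```

A caller would pass `Duration::from_millis(100)` to match the 100ms sleep mentioned in the commit; the attempt limit is an assumption, since the message does not state one.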
Alexander Bayandin	2c724e56e2	Merge pull request #4646 from neondatabase/releases/2023-07-06-hotfix Release 2023-07-06 (add pg_embedding extension only)	2023-07-06 12:19:52 +01:00
Alexander Bayandin	feff887c6f	Compile `pg_embedding` extension (#4634 ) ``` CREATE EXTENSION embedding; CREATE TABLE t (val real[]); INSERT INTO t (val) VALUES ('{0,0,0}'), ('{1,2,3}'), ('{1,1,1}'), (NULL); CREATE INDEX ON t USING hnsw (val) WITH (maxelements = 10, dims=3, m=3); INSERT INTO t (val) VALUES (array[1,2,4]); SELECT * FROM t ORDER BY val <-> array[3,3,3]; val --------- {1,2,3} {1,2,4} {1,1,1} {0,0,0} (5 rows) ```	2023-07-06 09:39:41 +01:00
Vadim Kharitonov	353d915fcf	Merge pull request #4633 from neondatabase/releases/2023-07-05 Release 2023-07-05	2023-07-05 15:10:47 +02:00
Vadim Kharitonov	2e38098cbc	Merge branch 'release' into releases/2023-07-05	2023-07-05 12:41:48 +02:00
Vadim Kharitonov	a6fe5ea1ac	Merge pull request #4571 from neondatabase/releases/2023-06-27 Release 2023-06-27	2023-06-27 12:55:33 +02:00
Vadim Kharitonov	05b0aed0c1	Merge branch 'release' into releases/2023-06-27	2023-06-27 12:22:12 +02:00
Alex Chi Z	cd1705357d	Merge pull request #4561 from neondatabase/releases/2023-06-23-hotfix Release 2023-06-23 (pageserver-only)	2023-06-23 15:38:50 -04:00
Christian Schwarz	6bc7561290	don't use MGMT_REQUEST_RUNTIME for consumption metrics synthetic size worker The consumption metrics synthetic size worker does logical size calculation. Logical size calculation currently does synchronous disk IO. This blocks the MGMT_REQUEST_RUNTIME's executor threads, starving other futures. While there's work on the way to move the synchronous disk IO into spawn_blocking, the quickfix here is to use the BACKGROUND_RUNTIME instead of MGMT_REQUEST_RUNTIME. Actually it's not just a quickfix. We simply shouldn't be blocking MGMT_REQUEST_RUNTIME executor threads on CPU or sync disk IO. That work isn't done yet, as many of the mgmt tasks still _do_ disk IO. But it's not as intensive as the logical size calculations that we're fixing here. While we're at it, fix disk-usage-based eviction in a similar way. It wasn't the culprit here, according to prod logs, but it can theoretically be a little CPU-intensive. More context, including graphs from Prod: https://neondb.slack.com/archives/C03F5SM1N02/p1687541681336949 (cherry picked from commit `d6e35222ea`)	2023-06-23 20:54:07 +02:00
Christian Schwarz	fbd3ac14b5	Merge pull request #4544 from neondatabase/releases/2023-06-21-hotfix Release 2023-06-21 (fixup for post-merge failed 2023-06-20)	2023-06-21 16:54:34 +03:00
Christian Schwarz	e437787c8f	cargo update -p openssl (#4542 ) To unblock release https://github.com/neondatabase/neon/pull/4536#issuecomment-1600678054 Context: https://rustsec.org/advisories/RUSTSEC-2023-0044	2023-06-21 15:52:56 +03:00
Christian Schwarz	3460dbf90b	Merge pull request #4536 from neondatabase/releases/2023-06-20 Release 2023-06-20 (actually 2023-06-21)	2023-06-21 14:19:14 +03:00
Vadim Kharitonov	6b89d99677	Merge pull request #4521 from neondatabase/release_2023-06-15 Release 2023 06 15	2023-06-15 17:40:01 +02:00
Vadim Kharitonov	6cc8ea86e4	Merge branch 'main' into release_2023-06-15	2023-06-15 16:50:44 +02:00
Shany Pozin	e62a492d6f	Merge pull request #4486 from neondatabase/releases/2023-06-13 Release 2023-06-13	2023-06-13 15:21:35 +03:00
Alexey Kondratov	a475cdf642	[compute_ctl] Fix logging if catalog updates are skipped (#4480 ) Otherwise, it wasn't clear from the log when Postgres started up completely if catalog updates were skipped. Follow-up for `4936ab6`	2023-06-13 13:37:24 +02:00
Stas Kelvich	7002c79a47	Merge pull request #4447 from neondatabase/release_proxy_08-06-2023 Release proxy 08 06 2023	2023-06-08 21:02:54 +03:00
Vadim Kharitonov	ee6cf357b4	Merge pull request #4427 from neondatabase/releases/2023-06-06 Release 2023-06-06	2023-06-06 14:42:21 +02:00
Vadim Kharitonov	e5c2086b5f	Merge branch 'release' into releases/2023-06-06	2023-06-06 12:33:56 +02:00
Shany Pozin	5f1208296a	Merge pull request #4395 from neondatabase/releases/2023-06-01 Release 2023-06-01	2023-06-01 10:58:00 +03:00
Stas Kelvich	88e8e473cd	Merge pull request #4345 from neondatabase/release-23-05-25-proxy Release 23-05-25, take 3	2023-05-25 19:40:43 +03:00
Stas Kelvich	b0a77844f6	Add SQL-over-HTTP endpoint to Proxy This commit introduces an SQL-over-HTTP endpoint in the proxy, with a JSON response structure resembling that of the node-postgres driver. This method, using HTTP POST, achieves smaller amortized latencies in edge setups due to fewer round trips and an enhanced open connection reuse by the v8 engine. This update involves several intricacies: 1. SQL injection protection: We employed the extended query protocol, modifying the rust-postgres driver to send queries in one roundtrip using a text protocol rather than binary, bypassing potential issues like those identified in https://github.com/sfackler/rust-postgres/issues/1030. 2. Postgres type compatibility: As not all postgres types have binary representations (e.g., acl's in pg_class), we adjusted rust-postgres to respond with text protocol, simplifying serialization and fixing queries with text-only types in response. 3. Data type conversion: Considering JSON supports fewer data types than Postgres, we perform conversions where possible, passing all other types as strings. Key conversions include: - postgres int2, int4, float4, float8 -> json number (NaN and Inf remain text) - postgres bool, null, text -> json bool, null, string - postgres array -> json array - postgres json and jsonb -> json object 4. Alignment with node-postgres: To facilitate integration with js libraries, we've matched the response structure of node-postgres, returning command tags and column oids. Command tag capturing was added to the rust-postgres functionality as part of this change.	2023-05-25 17:59:17 +03:00
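The data type conversion rules listed in b0a77844f6 can be sketched as a small mapping function. This is an illustration under stated assumptions, not the proxy's implementation: `PgValue`, `quote`, and `to_json` are hypothetical names, values are assumed to arrive already parsed from the text protocol, and the strings chosen for NaN and infinity ("NaN", "Infinity", "-Infinity") follow Postgres text output.

```rust
/// Hypothetical, simplified model of a Postgres value.
pub enum PgValue {
    Null,
    Bool(bool),
    Int(i32),            // int2 / int4
    Float(f64),          // float4 / float8
    Text(String),
    Json(String),        // json / jsonb, already serialized
    Array(Vec<PgValue>),
    Other(String),       // any other type, kept as its text representation
}

/// Wraps `s` in double quotes, escaping characters JSON does not allow raw.
fn quote(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Renders a value as JSON text following the rules in the commit message.
pub fn to_json(value: &PgValue) -> String {
    match value {
        PgValue::Null => "null".to_string(),
        PgValue::Bool(b) => b.to_string(),
        PgValue::Int(i) => i.to_string(),
        // JSON has no NaN or infinity literals, so these stay text.
        PgValue::Float(f) if f.is_nan() => quote("NaN"),
        PgValue::Float(f) if f.is_infinite() => {
            quote(if *f > 0.0 { "Infinity" } else { "-Infinity" })
        }
        PgValue::Float(f) => f.to_string(),
        // Text and every unconverted type are passed as JSON strings.
        PgValue::Text(s) | PgValue::Other(s) => quote(s),
        // json/jsonb is embedded as-is.
        PgValue::Json(raw) => raw.clone(),
        PgValue::Array(items) => {
            let parts: Vec<String> = items.iter().map(to_json).collect();
            format!("[{}]", parts.join(","))
        }
    }
}
```

Keeping unknown types as strings means a new Postgres type never breaks serialization; the client just receives its text form.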
Vadim Kharitonov	1baf464307	Merge pull request #4309 from neondatabase/releases/2023-05-23 Release 2023-05-23	2023-05-24 11:56:54 +02:00
Alexander Bayandin	e9b8e81cea	Merge branch 'release' into releases/2023-05-23	2023-05-23 12:54:08 +01:00
Alexander Bayandin	85d6194aa4	Fix regress-tests job for Postgres 15 on release branch (#4254 ) ## Problem Compatibility tests don't support Postgres 15 yet, but we're still trying to upload compatibility snapshot (which we do not collect). Ref https://github.com/neondatabase/neon/actions/runs/4991394158/jobs/8940369368#step:4:38129 ## Summary of changes Add `pg_version` parameter to `run-python-test-set` actions and do not upload compatibility snapshot for Postgres 15	2023-05-16 17:19:12 +01:00
Vadim Kharitonov	333a7a68ef	Merge pull request #4245 from neondatabase/releases/2023-05-16 Release 2023-05-16	2023-05-16 13:38:40 +02:00
Vadim Kharitonov	6aa4e41bee	Merge branch 'release' into releases/2023-05-16	2023-05-16 12:48:23 +02:00
Joonas Koivunen	840183e51f	try: higher page_service timeouts to isolate an issue	2023-05-11 16:24:53 +03:00
Shany Pozin	cbccc94b03	Merge pull request #4184 from neondatabase/releases/2023-05-09 Release 2023-05-09	2023-05-09 15:30:36 +03:00
Stas Kelvich	fce227df22	Merge pull request #4163 from neondatabase/main Release 23-05-05	2023-05-05 15:56:23 +03:00
Stas Kelvich	bd787e800f	Merge pull request #4133 from neondatabase/main Release 23-04-01	2023-05-01 18:52:46 +03:00
Shany Pozin	4a7704b4a3	Merge pull request #4131 from neondatabase/sp/hotfix_adding_sks_us_west Hotfix: Adding 4 new pageservers and two sets of safekeepers to us west 2	2023-05-01 15:17:38 +03:00
Shany Pozin	ff1119da66	Add 2 new sets of safekeepers to us-west2	2023-05-01 14:35:31 +03:00
Shany Pozin	4c3ba1627b	Add 4 new Pageservers for retool launch	2023-05-01 14:34:38 +03:00
Vadim Kharitonov	1407174fb2	Merge pull request #4110 from neondatabase/vk/release_2023-04-28 Release 2023 04 28	2023-04-28 17:43:16 +02:00
Vadim Kharitonov	ec9dcb1889	Merge branch 'release' into vk/release_2023-04-28	2023-04-28 16:32:26 +02:00
Joonas Koivunen	d11d781afc	revert: "Add check for duplicates of generated image layers" (#4104 ) This reverts commit `732acc5`. Reverted PR: #3869 As noted in PR #4094, we do in fact try to insert duplicates to the layer map, if L0->L1 compaction is interrupted. We do not have a proper fix for that right now, and we are in a hurry to make a release to production, so revert the changes related to this to the state that we have in production currently. We know that we have a bug here, but better to live with the bug that we've had in production for a long time, than rush a fix to production without testing it in staging first. Cc: #4094, #4088	2023-04-28 16:31:35 +02:00
Anastasia Lubennikova	4e44565b71	Merge pull request #4000 from neondatabase/releases/2023-04-11 Release 2023-04-11	2023-04-11 17:47:41 +03:00
Stas Kelvich	4ed51ad33b	Add more proxy cnames	2023-04-11 15:59:35 +03:00
Arseny Sher	1c1ebe5537	Merge pull request #3946 from neondatabase/releases/2023-04-04 Release 2023-04-04	2023-04-04 14:38:40 +04:00
Christian Schwarz	c19cb7f386	Merge pull request #3935 from neondatabase/releases/2023-04-03 Release 2023-04-03	2023-04-03 16:19:49 +02:00
Vadim Kharitonov	4b97d31b16	Merge pull request #3896 from neondatabase/releases/2023-03-28 Release 2023-03-28	2023-03-28 17:58:06 +04:00
Shany Pozin	923ade3dd7	Merge pull request #3855 from neondatabase/releases/2023-03-21 Release 2023-03-21	2023-03-21 13:12:32 +02:00
Arseny Sher	b04e711975	Merge pull request #3825 from neondatabase/release-2023-03-15 Release 2023.03.15	2023-03-15 15:38:00 +03:00
Arseny Sher	afd0a6b39a	Forward framed read buf contents to compute before proxy pass. Otherwise they get lost. Normally buffer is empty before proxy pass, but this is not the case with pipeline mode of our npm driver; fixes connection hangup introduced by `b80fe41af3` for it. fixes https://github.com/neondatabase/neon/issues/3822	2023-03-15 15:36:06 +04:00
Lassi Pölönen	99752286d8	Use RollingUpdate strategy also for legacy proxy (#3814 ) ## Describe your changes We have previously changed the neon-proxy to use RollingUpdate. This should be enabled in legacy proxy too in order to avoid breaking connections for the clients and allow for example backups to run even during deployment. (https://github.com/neondatabase/neon/pull/3683) ## Issue ticket number and link https://github.com/neondatabase/neon/issues/3333	2023-03-15 15:35:51 +04:00
Arseny Sher	15df93363c	Merge pull request #3804 from neondatabase/release-2023-03-13 Release 2023.03.13	2023-03-13 20:25:40 +03:00
Vadim Kharitonov	bc0ab741af	Merge pull request #3758 from neondatabase/releases/2023-03-07 Release 2023-03-07	2023-03-07 12:38:47 +01:00
Christian Schwarz	51d9dfeaa3	Merge pull request #3743 from neondatabase/releases/2023-03-03 Release 2023-03-03	2023-03-03 19:20:21 +01:00
Shany Pozin	f63cb18155	Merge pull request #3713 from neondatabase/releases/2023-02-28 Release 2023-02-28	2023-02-28 12:52:24 +02:00
Arseny Sher	0de603d88e	Merge pull request #3707 from neondatabase/release-2023-02-24 Release 2023-02-24 Hotfix for UNLOGGED tables. Contains #3706 Also contains rebase on 14.7 and 15.2 #3581	2023-02-25 00:32:11 +04:00
Heikki Linnakangas	240913912a	Fix UNLOGGED tables. Instead of trying to create missing files on the way, send init fork contents as main fork from pageserver during basebackup. Add test for that. Call put_rel_drop for init forks; previously they weren't removed. Bump vendor/postgres to revert previous approach on Postgres side. Co-authored-by: Arseny Sher <sher-ars@yandex.ru> ref https://github.com/neondatabase/postgres/pull/264 ref https://github.com/neondatabase/postgres/pull/259 ref https://github.com/neondatabase/neon/issues/1222	2023-02-24 23:54:53 +04:00
MMeent	91a4ea0de2	Update vendored PostgreSQL versions to 14.7 and 15.2 (#3581 ) ## Describe your changes Rebase vendored PostgreSQL onto 14.7 and 15.2 ## Issue ticket number and link #3579 ## Checklist before requesting a review - [x] I have performed a self-review of my code. - [x] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [x] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ``` The version of PostgreSQL that we use is updated to 14.7 for PostgreSQL 14 and 15.2 for PostgreSQL 15. ```	2023-02-24 23:54:42 +04:00
Arseny Sher	8608704f49	Merge pull request #3691 from neondatabase/release-2023-02-23 Release 2023-02-23 Hotfix for the unlogged tables with indexes issue. neondatabase/postgres#259 neondatabase/postgres#262	2023-02-23 13:39:33 +04:00
Arseny Sher	efef68ce99	Bump vendor/postgres to include hotfix for unlogged tables with indexes. https://github.com/neondatabase/postgres/pull/259 https://github.com/neondatabase/postgres/pull/262	2023-02-23 08:49:43 +04:00
Joonas Koivunen	8daefd24da	Merge pull request #3679 from neondatabase/releases/2023-02-22 Releases/2023-02-22	2023-02-22 15:56:55 +02:00
Arthur Petukhovsky	46cc8b7982	Remove safekeeper-1.ap-southeast-1.aws.neon.tech (#3671 ) We migrated all timelines to `safekeeper-3.ap-southeast-1.aws.neon.tech`, now old instance can be removed.	2023-02-22 15:07:57 +02:00
Sergey Melnikov	38cd90dd0c	Add -v to ansible invocations (#3670 ) To get more debug output on failures	2023-02-22 15:07:57 +02:00
Joonas Koivunen	a51b269f15	fix: hold permit until GetObject eof (#3663 ) previously we applied the ratelimiting only up to receiving the headers from s3, or somewhere near it. the commit adds an adapter which carries the permit until the AsyncRead has been disposed. fixes #3662.	2023-02-22 15:07:57 +02:00
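The adapter idea in a51b269f15 (keep the rate-limit permit alive until the body reader is disposed) can be sketched with a synchronous `Read` wrapper. This is a simplified analogue, not the pageserver code: the real adapter wraps an `AsyncRead`, and `Limiter`, `Permit`, and `ReadWithPermit` are hypothetical names introduced here.

```rust
use std::io::{self, Read};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical counting limiter standing in for a semaphore.
#[derive(Clone)]
pub struct Limiter {
    available: Arc<AtomicUsize>,
}

impl Limiter {
    pub fn new(permits: usize) -> Self {
        Limiter { available: Arc::new(AtomicUsize::new(permits)) }
    }

    /// Number of permits currently free.
    pub fn available(&self) -> usize {
        self.available.load(Ordering::SeqCst)
    }

    /// Takes a permit if one is free; never blocks.
    pub fn try_acquire(&self) -> Option<Permit> {
        self.available
            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |n| n.checked_sub(1))
            .ok()
            .map(|_| Permit { available: Arc::clone(&self.available) })
    }
}

/// Returns its slot to the limiter when dropped.
pub struct Permit {
    available: Arc<AtomicUsize>,
}

impl Drop for Permit {
    fn drop(&mut self) {
        self.available.fetch_add(1, Ordering::SeqCst);
    }
}

/// Reader that owns a permit, so the permit lives as long as the body does.
pub struct ReadWithPermit<R> {
    inner: R,
    _permit: Permit,
}

impl<R> ReadWithPermit<R> {
    pub fn new(inner: R, permit: Permit) -> Self {
        ReadWithPermit { inner, _permit: permit }
    }
}

impl<R: Read> Read for ReadWithPermit<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Delegate; the permit is released only when `self` is dropped.
        self.inner.read(buf)
    }
}
```

Tying the permit to the reader's lifetime means the limit covers the whole download, not just the time until the response headers arrive.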
Joonas Koivunen	43bf6d0a0f	calculate_logical_size: no longer use spawn_blocking (#3664 ) Calculation of logical size is now async because of layer downloads, so we shouldn't use spawn_blocking for it. Use of `spawn_blocking` exhausted resources which are needed by `tokio::io::copy` when copying from a stream to a file, which led to a deadlock. Fixes: #3657	2023-02-22 15:07:57 +02:00
Joonas Koivunen	15273a9b66	chore: ignore all compaction inactive tenant errors (#3665 ) these are happening in tests because of #3655 but they sure took some time to appear. makes the `Compaction failed, retrying in 2s: Cannot run compaction iteration on inactive tenant` into a globally allowed error, because it has been seen failing on different test cases.	2023-02-22 15:07:57 +02:00
Joonas Koivunen	78aca668d0	fix: log download failed error (#3661 ) Fixes #3659	2023-02-22 15:07:57 +02:00
Vadim Kharitonov	acbf4148ea	Merge pull request #3656 from neondatabase/releases/2023-02-21 Release 2023-02-21	2023-02-21 16:03:48 +01:00
Vadim Kharitonov	6508540561	Merge branch 'release' into releases/2023-02-21	2023-02-21 15:31:16 +01:00
Arthur Petukhovsky	a41b5244a8	Add new safekeeper to ap-southeast-1 prod (#3645 ) (#3646 ) To trigger deployment of #3645 to production.	2023-02-20 15:22:49 +00:00
Shany Pozin	2b3189be95	Merge pull request #3600 from neondatabase/releases/2023-02-14 Release 2023-02-14	2023-02-15 13:31:30 +02:00
Vadim Kharitonov	248563c595	Merge pull request #3553 from neondatabase/releases/2023-02-07 Release 2023-02-07	2023-02-07 14:07:44 +01:00
Vadim Kharitonov	14cd6ca933	Merge branch 'release' into releases/2023-02-07	2023-02-07 12:11:56 +01:00
Vadim Kharitonov	eb36403e71	Release 2023 01 31 (#3497 ) Co-authored-by: Kirill Bulatov <kirill@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech> Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech> Co-authored-by: bojanserafimov <bojan.serafimov7@gmail.com> Co-authored-by: Christian Schwarz <christian@neon.tech> Co-authored-by: Alexey Kondratov <kondratov.aleksey@gmail.com> Co-authored-by: Joonas Koivunen <joonas@neon.tech> Co-authored-by: Konstantin Knizhnik <knizhnik@garret.ru> Co-authored-by: Shany Pozin <shany@neon.tech> Co-authored-by: Sergey Melnikov <sergey@neon.tech> Co-authored-by: Dmitry Rodionov <dmitry@neon.tech> Co-authored-by: Rory de Zoete <33318916+zoete@users.noreply.github.com> Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box> Co-authored-by: Lassi Pölönen <lassi.polonen@iki.fi>	2023-01-31 15:06:35 +02:00
Anastasia Lubennikova	3c6f779698	Merge pull request #3411 from neondatabase/release_2023_01_23 Fix Release 2023 01 23	2023-01-23 20:10:03 +02:00
Joonas Koivunen	f67f0c1c11	More tenant size fixes (#3410 ) Small changes, but hopefully this will help with the panic detected in staging, for which we cannot get the debugging information right now (end-of-branch before branch-point).	2023-01-23 17:46:13 +02:00
Shany Pozin	edb02d3299	Adding pageserver3 to staging (#3403 )	2023-01-23 17:46:13 +02:00
Konstantin Knizhnik	664a69e65b	Fix slru_segment_key_range function: segno was assigned to incorrect Key field (#3354 )	2023-01-23 17:46:13 +02:00
Anastasia Lubennikova	478322ebf9	Fix tenant size orphans (#3377 ) Previously, only the timelines which had passed the `gc_horizon` were processed, which failed with orphans at the tree_sort phase. Example input in added `test_branched_empty_timeline_size` test case. The PR changes iteration to happen through all timelines, and in addition to that, any learned branch points will be calculated as they would have been in the original implementation if the ancestor branch had been over the `gc_horizon`. This also changes how tenants where all timelines are below `gc_horizon` are handled. Previously tenant_size 0 was returned, but now they will have approximately `initdb_lsn` worth of tenant_size. The PR also adds several new tenant size tests that describe various corner cases of branching structure and `gc_horizon` setting. They are currently disabled to not consume time during CI. Co-authored-by: Joonas Koivunen <joonas@neon.tech> Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech>	2023-01-23 17:46:13 +02:00
Joonas Koivunen	802f174072	fix: dont stop pageserver if we fail to calculate synthetic size	2023-01-23 17:46:13 +02:00
Alexey Kondratov	47f9890bae	[compute_ctl] Make role deletion spec processing idempotent (#3380 ) Previously, we were trying to re-assign owned objects of the already deleted role. This was causing a crash loop in the case when compute was restarted with a spec that includes a delta operation for role deletion. To avoid such cases, check that role is still present before calling `reassign_owned_objects`. Resolves neondatabase/cloud#3553	2023-01-23 17:46:13 +02:00
Christian Schwarz	262265daad	Revert "Use actual temporary dir for pageserver unit tests" This reverts commit `826e89b9ce`. The problem with that commit was that it deletes the TempDir while there are still EphemeralFile instances open. At first I thought this could be fixed by simply adding Handle::current().block_on(task_mgr::shutdown(None, Some(tenant_id), None)) to TenantHarness::drop, but it turned out to be insufficient. So, reverting the commit until we find a proper solution. refs https://github.com/neondatabase/neon/issues/3385	2023-01-23 17:46:13 +02:00
bojanserafimov	300da5b872	Improve layer map docstrings (#3382 )	2023-01-23 17:46:13 +02:00
Heikki Linnakangas	7b22b5c433	Switch to 'tracing' for logging, restructure code to make use of spans. Refactors Compute::prepare_and_run. It's split into subroutines differently, to make it easier to attach tracing spans to the different stages. The high-level logic for waiting for Postgres to exit is moved to the caller. Replace 'env_logger' with 'tracing', and add `#instrument` directives to different stages of the startup process. This is a fairly mechanical change, except for the changes in 'spec.rs'. 'spec.rs' contained some complicated formatting, where parts of log messages were printed directly to stdout with `print`s. That was a bit messed up because the log normally goes to stderr, but those lines were printed to stdout. In our docker images, stderr and stdout both go to the same place so you wouldn't notice, but I don't think it was intentional. This changes the log format to the default 'tracing_subscriber::format' format. It's different from the Postgres log format, however, and because both compute_tools and Postgres print to the same log, it's now a mix of two different formats. I'm not sure how the Grafana log parsing pipeline can handle that. If it's a problem, we can build a custom formatter to change the compute_tools log format to be the same as Postgres's, like it was before this commit, or we can change the Postgres log format to match tracing_formatter's, or we can start printing compute_tools' log output to a different destination than Postgres.	2023-01-23 17:46:12 +02:00
Kirill Bulatov	ffca97bc1e	Enable logs in unit tests	2023-01-23 17:46:12 +02:00
Kirill Bulatov	cb356f3259	Use actual temporary dir for pageserver unit tests	2023-01-23 17:46:12 +02:00
Vadim Kharitonov	c85374295f	Change SENTRY_ENVIRONMENT from "development" to "staging"	2023-01-23 17:46:12 +02:00
Anastasia Lubennikova	4992160677	Fix metric_collection_endpoint for prod. It was incorrectly set to staging url	2023-01-23 17:46:12 +02:00
Heikki Linnakangas	bd535b3371	If an error happens while checking for core dumps, don't panic. If we panic, we skip the 30s wait in 'main', and don't give the console a chance to observe the error. Which is not nice. Spotted by @ololobus at https://github.com/neondatabase/neon/pull/3352#discussion_r1072806981	2023-01-23 17:46:12 +02:00
Kirill Bulatov	d90c5a03af	Add more io::Error context when fail to operate on a path (#3254 ) I have a test failure that shows ``` Caused by: 0: Failed to reconstruct a page image: 1: Directory not empty (os error 39) ``` but does not really show where exactly that happens. https://neon-github-public-dev.s3.amazonaws.com/reports/pr-3227/release/3823785365/index.html#categories/c0057473fc9ec8fb70876fd29a171ce8/7088dab272f2c7b7/?attachment=60fe6ed2add4d82d The PR aims to add more context in debugging that issue.	2023-01-23 17:46:12 +02:00
Anastasia Lubennikova	2d02cc9079	Merge pull request #3365 from neondatabase/main Release 2023-01-17	2023-01-17 16:41:34 +02:00
Christian Schwarz	49ad94b99f	Merge pull request #3301 from neondatabase/release-2023-01-10 Release 2023-01-10	2023-01-10 16:42:26 +01:00
Christian Schwarz	948a217398	Merge commit '95bf19b85a06b27a7fc3118dee03d48648efab15' into release-2023-01-10 Conflicts: .github/helm-values/neon-stress.proxy-scram.yaml .github/helm-values/neon-stress.proxy.yaml .github/helm-values/staging.proxy-scram.yaml .github/helm-values/staging.proxy.yaml All of the above were deleted in `main` after we hotfixed them in `release`. Deleting them here storage_broker/src/bin/storage_broker.rs Hotfix toned down logging, but `main` has since implemented a proper fix. Taken `main`'s side, see https://neondb.slack.com/archives/C033RQ5SPDH/p1673354385387479?thread_ts=1673354306.474729&cid=C033RQ5SPDH closes https://github.com/neondatabase/neon/issues/3287	2023-01-10 15:40:14 +01:00
Dmitry Rodionov	125381eae7	Merge pull request #3236 from neondatabase/dkr/retrofit-sk4-sk4-change Move zenith-1-sk-3 to zenith-1-sk-4 (#3164)	2022-12-30 14:13:50 +03:00
Arthur Petukhovsky	cd01bbc715	Move zenith-1-sk-3 to zenith-1-sk-4 (#3164 )	2022-12-30 12:32:52 +02:00
Dmitry Rodionov	d8b5e3b88d	Merge pull request #3229 from neondatabase/dkr/add-pageserver-for-release add pageserver to new region see https://github.com/neondatabase/aws/pull/116 decrease log volume for pageserver	2022-12-30 12:34:04 +03:00
Dmitry Rodionov	06d25f2186	switch to debug from info to produce less noise	2022-12-29 17:48:47 +02:00
Dmitry Rodionov	f759b561f3	add pageserver to new region see https://github.com/neondatabase/aws/pull/116	2022-12-29 17:17:35 +02:00
Sergey Melnikov	ece0555600	Push proxy metrics to Victoria Metrics (#3106 )	2022-12-16 14:44:49 +02:00
Joonas Koivunen	73ea0a0b01	fix(remote_storage): use cached credentials (#3128 ) IMDSv2 has limits, and if we query it on every s3 interaction we are going to go over those limits. Changes the s3_bucket client configuration to use: - ChainCredentialsProvider to handle env variables or imds usage - LazyCachingCredentialsProvider to actually cache any credentials Related: https://github.com/awslabs/aws-sdk-rust/issues/629 Possibly related: https://github.com/neondatabase/neon/issues/3118	2022-12-16 14:44:49 +02:00
Arseny Sher	d8f6d6fd6f	Merge pull request #3126 from neondatabase/broker-lb-release Deploy broker with L4 LB in new env.	2022-12-16 01:25:28 +03:00
Arseny Sher	d24de169a7	Deploy broker with L4 LB in new env. Seems to be fixing issue with missing keepalives.	2022-12-16 01:45:32 +04:00
Arseny Sher	0816168296	Hotfix: terminate subscription if channel is full. Might help as a hotfix, but need to understand root better.	2022-12-15 12:23:56 +03:00
Dmitry Rodionov	277b44d57a	Merge pull request #3102 from neondatabase/main Hotfix. See commits for details	2022-12-14 19:38:43 +03:00
MMeent	68c2c3880e	Merge pull request #3038 from neondatabase/main Release 22-12-14	2022-12-14 14:35:47 +01:00
Arthur Petukhovsky	49da498f65	Merge pull request #2833 from neondatabase/main Release 2022-11-16	2022-11-17 08:44:10 +01:00
Stas Kelvich	2c76ba3dd7	Merge pull request #2718 from neondatabase/main-rc-22-10-28 Release 22-10-28	2022-10-28 20:33:56 +03:00
Arseny Sher	dbe3dc69ad	Merge branch 'main' into main-rc-22-10-28 Release 22-10-28.	2022-10-28 19:10:11 +04:00
Arseny Sher	8e5bb3ed49	Enable etcd compaction in neon_local.	2022-10-27 12:53:20 +03:00
Stas Kelvich	ab0be7b8da	Avoid debian-testing packages in compute Dockerfiles plv8 can only be built with a fairly new gold linker version. We used to install it via binutils packages from testing, but it also updates libc and that causes troubles in the resulting image as different extensions were built against different libc versions. We could either use libc from debian-testing everywhere or restrain from using testing packages and install necessary programs manually. This patch uses the latter approach: gold for plv8 and cmake for h3 are installed manually. In a passing declare h3_postgis as a safe extension (previous omission).	2022-10-27 12:53:20 +03:00
bojanserafimov	b4c55f5d24	Move pagestream api to libs/pageserver_api (#2698 )	2022-10-27 12:53:20 +03:00
mikecaat	ede70d833c	Add a docker-compose example file (#1943 ) (#2666 ) Co-authored-by: Masahiro Ikeda <masahiro.ikeda.us@hco.ntt.co.jp>	2022-10-27 12:53:20 +03:00
Sergey Melnikov	70c3d18bb0	Do not release to new staging proxies on release (#2685 )	2022-10-27 12:53:20 +03:00
bojanserafimov	7a491f52c4	Add draw_timeline binary (#2688 )	2022-10-27 12:53:20 +03:00
Alexander Bayandin	323c4ecb4f	Add data format backward compatibility tests (#2626 )	2022-10-27 12:53:20 +03:00
Anastasia Lubennikova	3d2466607e	Merge pull request #2692 from neondatabase/main-rc Release 2022-10-25	2022-10-25 18:18:58 +03:00
Anastasia Lubennikova	ed478b39f4	Merge branch 'release' into main-rc	2022-10-25 17:06:33 +03:00
Stas Kelvich	91585a558d	Merge pull request #2678 from neondatabase/stas/hotfix_schema Hotfix to disable grant create on public schema	2022-10-22 02:54:31 +03:00
Stas Kelvich	93467eae1f	Hotfix to disable grant create on public schema `GRANT CREATE ON SCHEMA public` fails if there is no schema `public`. Disable it in release for now and make a better fix later (it is needed for v15 support).	2022-10-22 02:26:28 +03:00
Stas Kelvich	f3aac81d19	Merge pull request #2668 from neondatabase/main Release 2022-10-21	2022-10-21 15:21:42 +03:00
Stas Kelvich	979ad60c19	Merge pull request #2581 from neondatabase/main Release 2022-10-07	2022-10-07 16:50:55 +03:00
Stas Kelvich	9316cb1b1f	Merge pull request #2573 from neondatabase/main Release 2022-10-06	2022-10-07 11:07:06 +03:00
Anastasia Lubennikova	e7939a527a	Merge pull request #2377 from neondatabase/main Release 2022-09-01	2022-09-01 20:20:44 +03:00
Arthur Petukhovsky	36d26665e1	Merge pull request #2299 from neondatabase/main * Check for entire range during sasl validation (#2281) * Gen2 GH runner (#2128) * Re-add rustup override * Try s3 bucket * Set git version * Use v4 cache key to prevent problems * Switch to v5 for key * Add second rustup fix * Rebase * Add kaniko steps * Fix typo and set compress level * Disable global run default * Specify shell for step * Change approach with kaniko * Try less verbose shell spec * Add submodule pull * Add promote step * Adjust dependency chain * Try default swap again * Use env * Don't override aws key * Make kaniko build conditional * Specify runs on * Try without dependency link * Try soft fail * Use image with git * Try passing to next step * Fix duplicate * Try other approach * Try other approach * Fix typo * Try other syntax * Set env * Adjust setup * Try step 1 * Add link * Try global env * Fix mistake * Debug * Try other syntax * Try other approach * Change order * Move output one step down * Put output up one level * Try other syntax * Skip build * Try output * Re-enable build * Try other syntax * Skip middle step * Update check * Try first step of dockerhub push * Update needs dependency * Try explicit dir * Add missing package * Try other approach * Try other approach * Specify region * Use with * Try other approach * Add debug * Try other approach * Set region * Follow AWS example * Try github approach * Skip Qemu * Try stdin * Missing steps * Add missing close * Add echo debug * Try v2 endpoint * Use v1 endpoint * Try without quotes * Revert * Try crane * Add debug * Split steps * Fix duplicate * Add shell step * Conform to options * Add verbose flag * Try single step * Try workaround * First request fails hunch * Try bullseye image * Try other approach * Adjust verbose level * Try previous step * Add more debug * Remove debug step * Remove rogue indent * Try with larger image * Add build tag step * Update workflow for testing * Add tag step for test * 
Remove unused * Update dependency chain * Add ownership fix * Use matrix for promote * Force update * Force build * Remove unused * Add new image * Add missing argument * Update dockerfile copy * Update Dockerfile * Update clone * Update dockerfile * Go to correct folder * Use correct format * Update dockerfile * Remove cd * Debug find where we are * Add debug on first step * Changedir to postgres * Set workdir * Use v1 approach * Use other dependency * Try other approach * Try other approach * Update dockerfile * Update approach * Update dockerfile * Update approach * Update dockerfile * Update dockerfile * Add workspace hack * Update Dockerfile * Update Dockerfile * Update Dockerfile * Change last step * Cleanup pull in prep for review * Force build images * Add condition for latest tagging * Use pinned version * Try without name value * Remove more names * Shorten names * Add kaniko comments * Pin kaniko * Pin crane and ecr helper * Up one level * Switch to pinned tag for rust image * Force update for test Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@b04468bf-cdf4-41eb-9c94-aff4ca55e4bf.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@4795e9ee-4f32-401f-85f3-f316263b62b8.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@2f8bc4e5-4ec2-4ea2-adb1-65d863c4a558.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@27565b2b-72d5-4742-9898-a26c9033e6f9.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@ecc96c26-c6c4-4664-be6e-34f7c3f89a3c.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@7caff3a5-bf03-4202-bd0e-f1a93c86bdae.fritz.box> * Add missing step output, revert one deploy step (#2285) * Add missing step output, revert one deploy step * Conform to syntax * Update approach * Add missing value * Add missing needs Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box> * Error for fatal not git repo (#2286) Co-authored-by: Rory 
de Zoete <rdezoete@RorysMacStudio.fritz.box> * Use main, not branch for ref check (#2288) * Use main, not branch for ref check * Add more debug * Count main, not head * Try new approach * Conform to syntax * Update approach * Get full history * Skip checkout * Cleanup debug * Remove more debug Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box> * Fix docker zombie process issue (#2289) * Fix docker zombie process issue * Init everywhere Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box> * Fix 1.63 clippy lints (#2282) * split out timeline metrics, track layer map loading and size calculation * reset rust cache for clippy run to avoid an ICE additionally remove trailing whitespaces * Rename pg_control_ffi.h to bindgen_deps.h, for clarity. The pg_control_ffi.h name implies that it only includes stuff related to pg_control.h. That's mostly true currently, but really the point of the file is to include everything that we need to generate Rust definitions from. * Make local mypy behave like CI mypy (#2291) * Fix flaky pageserver restarts in tests (#2261) * Remove extra type aliases (#2280) * Update cachepot endpoint (#2290) * Update cachepot endpoint * Update dockerfile & remove env * Update image building process * Cannot use metadata endpoint for this * Update workflow * Conform to kaniko syntax * Update syntax * Update approach * Update dockerfiles * Force update * Update dockerfiles * Update dockerfile * Cleanup dockerfiles * Update s3 test location * Revert s3 experiment * Add more debug * Specify aws region * Remove debug, add prefix * Remove one more debug Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box> * workflows/benchmarking: increase timeout (#2294) * Rework `init` in pageserver CLI (#2272) * Do not create initial tenant and timeline (adjust Python tests for that) * Rework config handling during init, add --update-config to manage local config updates * Fix: Always build images (#2296) * Always build images * 
Remove unused Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box> * Move auto-generated 'bindings' to a separate inner module. Re-export only things that are used by other modules. In the future, I'm imagining that we run bindgen twice, for Postgres v14 and v15. The two sets of bindings would go into separate 'bindings_v14' and 'bindings_v15' modules. Rearrange postgres_ffi modules. Move function, to avoid Postgres version dependency in timelines.rs Move function to generate a logical-message WAL record to postgres_ffi. * fix cargo test * Fix walreceiver and safekeeper bugs (#2295) - There was an issue with zero commit_lsn `reason: LaggingWal { current_commit_lsn: 0/0, new_commit_lsn: 1/6FD90D38, threshold: 10485760 } }`. The problem was in `send_wal.rs`, where we initialized `end_pos = Lsn(0)` and in some cases sent it to the pageserver. - IDENTIFY_SYSTEM previously returned `flush_lsn` as a physical end of WAL. Now it returns `flush_lsn` (as it was) to walproposer and `commit_lsn` to everyone else including pageserver. - There was an issue with backoff where connection was cancelled right after initialization: `connected!` -> `safekeeper_handle_db: Connection cancelled` -> `Backoff: waiting 3 seconds`. The problem was in sleeping before establishing the connection. This is fixed by reworking retry logic. - There was an issue with getting `NoKeepAlives` reason in a loop. The issue is probably the same as the previous. - There was an issue with filtering safekeepers based on retry attempts, which could filter some safekeepers indefinetely. This is fixed by using retry cooldown duration instead of retry attempts. - Some `send_wal.rs` connections failed with errors without context. This is fixed by adding a timeline to safekeepers errors. 
New retry logic works like this: - Every candidate has a `next_retry_at` timestamp and is not considered for connection until that moment - When walreceiver connection is closed, we update `next_retry_at` using exponential backoff, increasing the cooldown on every disconnect. - When `last_record_lsn` was advanced using the WAL from the safekeeper, we reset the retry cooldown and exponential backoff, allowing walreceiver to reconnect to the same safekeeper instantly. * on safekeeper registration pass availability zone param (#2292) Co-authored-by: Kirill Bulatov <kirill@neon.tech> Co-authored-by: Rory de Zoete <33318916+zoete@users.noreply.github.com> Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@b04468bf-cdf4-41eb-9c94-aff4ca55e4bf.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@4795e9ee-4f32-401f-85f3-f316263b62b8.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@2f8bc4e5-4ec2-4ea2-adb1-65d863c4a558.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@27565b2b-72d5-4742-9898-a26c9033e6f9.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@ecc96c26-c6c4-4664-be6e-34f7c3f89a3c.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@7caff3a5-bf03-4202-bd0e-f1a93c86bdae.fritz.box> Co-authored-by: Dmitry Rodionov <dmitry@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech> Co-authored-by: bojanserafimov <bojan.serafimov7@gmail.com> Co-authored-by: Alexander Bayandin <alexander@neon.tech> Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech> Co-authored-by: Anton Galitsyn <agalitsyn@users.noreply.github.com>	2022-08-18 15:32:33 +03:00
Arthur Petukhovsky	873347f977	Merge pull request #2275 from neondatabase/main * github/workflows: Fix git dubious ownership (#2223) * Move relation size cache from WalIngest to DatadirTimeline (#2094) * Move relation sie cache to layered timeline * Fix obtaining current LSN for relation size cache * Resolve merge conflicts * Resolve merge conflicts * Reestore 'lsn' field in DatadirModification * adjust DatadirModification lsn in ingest_record * Fix formatting * Pass lsn to get_relsize * Fix merge conflict * Update pageserver/src/pgdatadir_mapping.rs Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> * Update pageserver/src/pgdatadir_mapping.rs Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> * refactor: replace lazy-static with once-cell (#2195) - Replacing all the occurrences of lazy-static with `once-cell::sync::Lazy` - fixes #1147 Signed-off-by: Ankur Srivastava <best.ankur@gmail.com> * Add more buckets to pageserver latency metrics (#2225) * ignore record property warning to fix benchmarks * increase statement timeout * use event so it fires only if workload thread successfully finished * remove debug log * increase timeout to pass test with real s3 * avoid duplicate parameter, increase timeout * Major migration script (#2073) This script can be used to migrate a tenant across breaking storage versions, or (in the future) upgrading postgres versions. See the comment at the top for an overview. Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech> * Fix etcd typos * Fix links to safekeeper protocol docs. (#2188) safekeeper/README_PROTO.md was moved to docs/safekeeper-protocol.md in commit `0b14fdb078`, as part of reorganizing the docs into 'mdbook' format. Fixes issue #1475. Thanks to @banks for spotting the outdated references. In addition to fixing the above issue, this patch also fixes other broken links as a result of `0b14fdb078`. 
See https://github.com/neondatabase/neon/pull/2188#pullrequestreview-1055918480. Co-authored-by: Heikki Linnakangas <heikki@neon.tech> Co-authored-by: Thang Pham <thang@neon.tech> * Update CONTRIBUTING.md * Update CONTRIBUTING.md * support node id and remote storage params in docker_entrypoint.sh * Safe truncate (#2218) * Move relation sie cache to layered timeline * Fix obtaining current LSN for relation size cache * Resolve merge conflicts * Resolve merge conflicts * Reestore 'lsn' field in DatadirModification * adjust DatadirModification lsn in ingest_record * Fix formatting * Pass lsn to get_relsize * Fix merge conflict * Update pageserver/src/pgdatadir_mapping.rs Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> * Update pageserver/src/pgdatadir_mapping.rs Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> * Check if relation exists before trying to truncat it refer #1932 * Add test reporducing FSM truncate problem Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> * Fix exponential backoff values * Update back `vendor/postgres` back; it was changed accidentally. (#2251) Commit `4227cfc96e` accidentally reverted vendor/postgres to an older version. Update it back. * Add pageserver checkpoint_timeout option. To flush inmemory layer eventually when no new data arrives, which helps safekeepers to suspend activity (stop pushing to the broker). Default 10m should be ok. * Share exponential backoff code and fix logic for delete task failure (#2252) * Fix bug when import large (>1GB) relations (#2172) Resolves #2097 - use timeline modification's `lsn` and timeline's `last_record_lsn` to determine the corresponding LSN to query data in `DatadirModification::get` - update `test_import_from_pageserver`. Split the test into 2 variants: `small` and `multisegment`. + `small` is the old test + `multisegment` is to simulate #2097 by using a larger number of inserted rows to create multiple segment files of a relation. 
`multisegment` is configured to only run with a `release` build * Fix timeline physical size flaky tests (#2244) Resolves #2212. - use `wait_for_last_flush_lsn` in `test_timeline_physical_size_` tests ## Context Need to wait for the pageserver to catch up with the compute's last flush LSN because during the timeline physical size API call, it's possible that there are running `LayerFlushThread` threads. These threads flush new layers into disk and hence update the physical size. This results in a mismatch between the physical size reported by the API and the actual physical size on disk. ### Note The `LayerFlushThread` threads are processed concurrently, so it's possible that the above error still persists even with this patch. However, making the tests wait to finish processing all the WALs (not flushing) before calculating the physical size should help reduce the "flakiness" significantly postgres_ffi/waldecoder: validate more header fields * postgres_ffi/waldecoder: remove unused startlsn * postgres_ffi/waldecoder: introduce explicit `enum State` Previously it was emulated with a combination of nullable fields. This change should make the logic more readable. * disable `test_import_from_pageserver_multisegment` (#2258) This test failed consistently on `main` now. It's better to temporarily disable it to avoid blocking others' PRs while investigating the root cause for the test failure. See: #2255, #2256 * get_binaries uses DOCKER_TAG taken from docker image build step (#2260) * [proxy] Rework wire format of the password hack and some errors (#2236) The new format has a few benefits: it's shorter, simpler and human-readable as well. We don't use base64 anymore, since url encoding got us covered. We also show a better error in case we couldn't parse the payload; the users should know it's all about passing the correct project name. 
* test_runner/pg_clients: collect docker logs (#2259) * get_binaries script fix (#2263) * get_binaries uses DOCKER_TAG taken from docker image build step * remove docker tag discovery at all and fix get_binaries for version variable * Better storage sync logs (#2268) * Find end of WAL on safekeepers using WalStreamDecoder. We could make it inside wal_storage.rs, but taking into account that - wal_storage.rs reading is async - we don't need s3 here - error handling is different; error during decoding is normal I decided to put it separately. Test cargo test test_find_end_of_wal_last_crossing_segment prepared earlier by @yeputons passes now. Fixes https://github.com/neondatabase/neon/issues/544 https://github.com/neondatabase/cloud/issues/2004 Supersedes https://github.com/neondatabase/neon/pull/2066 * Improve walreceiver logic (#2253) This patch makes walreceiver logic more complicated, but it should work better in most cases. Added `test_wal_lagging` to test scenarios where alive safekeepers can lag behind other alive safekeepers. - There was a bug which looks like `etcd_info.timeline.commit_lsn > Some(self.local_timeline.get_last_record_lsn())` filtered all safekeepers in some strange cases. I removed this filter, it should probably help with #2237 - Now walreceiver_connection reports status, including commit_lsn. This allows keeping safekeeper connection even when etcd is down. - Safekeeper connection now fails if pageserver doesn't receive safekeeper messages for some time. Usually safekeeper sends messages at least once per second. - `LaggingWal` check now uses `commit_lsn` directly from safekeeper. This fixes the issue with often reconnects, when compute generates WAL really fast. - `NoWalTimeout` is rewritten to trigger only when we know about the new WAL and the connected safekeeper doesn't stream any WAL. This allows setting a small `lagging_wal_timeout` because it will trigger only when we observe that the connected safekeeper has stuck. 
* increase timeout in wait_for_upload to avoid spurious failures when testing with real s3 * Bump vendor/postgres to include XLP_FIRST_IS_CONTRECORD fix. (#2274) * Set up a workflow to run pgbench against captest (#2077) Signed-off-by: Ankur Srivastava <best.ankur@gmail.com> Co-authored-by: Alexander Bayandin <alexander@neon.tech> Co-authored-by: Konstantin Knizhnik <knizhnik@garret.ru> Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> Co-authored-by: Ankur Srivastava <ansrivas@users.noreply.github.com> Co-authored-by: bojanserafimov <bojan.serafimov7@gmail.com> Co-authored-by: Dmitry Rodionov <dmitry@neon.tech> Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech> Co-authored-by: Kirill Bulatov <kirill@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech> Co-authored-by: Thang Pham <thang@neon.tech> Co-authored-by: Stas Kelvich <stas.kelvich@gmail.com> Co-authored-by: Arseny Sher <sher-ars@yandex.ru> Co-authored-by: Egor Suvorov <egor@neon.tech> Co-authored-by: Andrey Taranik <andrey@cicd.team> Co-authored-by: Dmitry Ivanov <ivadmi5@gmail.com>	2022-08-15 21:30:45 +03:00
Arthur Petukhovsky	e814ac16f9	Merge pull request #2219 from neondatabase/main Release 2022-08-04	2022-08-04 20:06:34 +03:00
Heikki Linnakangas	ad3055d386	Merge pull request #2203 from neondatabase/release-uuid-ossp Deploy new storage and compute version to production Release 2022-08-02	2022-08-02 15:08:14 +03:00
Heikki Linnakangas	94e03eb452	Merge remote-tracking branch 'origin/main' into 'release' Release 2022-08-01	2022-08-02 12:43:49 +03:00
Sergey Melnikov	380f26ef79	Merge pull request #2170 from neondatabase/main (Release 2022-07-28) Release 2022-07-28	2022-07-28 14:16:52 +03:00
Arthur Petukhovsky	3c5b7f59d7	Merge pull request #2119 from neondatabase/main Release 2022-07-19	2022-07-19 11:58:48 +03:00
Arthur Petukhovsky	fee89f80b5	Merge pull request #2115 from neondatabase/main-2022-07-18 Release 2022-07-18	2022-07-18 19:21:11 +03:00
Arthur Petukhovsky	41cce8eaf1	Merge remote-tracking branch 'origin/release' into main-2022-07-18	2022-07-18 18:21:20 +03:00
Alexey Kondratov	f88fe0218d	Merge pull request #1842 from neondatabase/release-deploy-hotfix [HOTFIX] Release deploy fix This PR uses this branch neondatabase/postgres#171 and several required commits from the main to use only locally built compute-tools. This should allow us to rollout safekeepers sync issue fix on prod	2022-06-01 11:04:30 +03:00
Alexey Kondratov	cc856eca85	Install missing openssl packages in the Github Actions workflow	2022-05-31 21:31:31 +02:00
Alexey Kondratov	cf350c6002	Use :local compute-tools tag to build compute-node image	2022-05-31 21:31:16 +02:00
Arseny Sher	0ce6b6a0a3	Merge pull request #1836 from neondatabase/release-hotfix-basebackup-lsn-page-boundary Bump vendor/postgres to hotfix basebackup LSN comparison.	2022-05-31 16:54:03 +04:00
Arseny Sher	73f247d537	Bump vendor/postgres to hotfix basebackup LSN comparison.	2022-05-31 16:00:50 +04:00
Andrey Taranik	960be82183	Merge pull request #1792 from neondatabase/main Release 2202-05-25 (second)	2022-05-25 16:37:57 +03:00
Andrey Taranik	806e5a6c19	Merge pull request #1787 from neondatabase/main Release 2022-05-25	2022-05-25 13:34:11 +03:00
Alexey Kondratov	8d5df07cce	Merge pull request #1385 from zenithdb/main Release main 2022-03-22	2022-03-22 05:04:34 -05:00
Andrey Taranik	df7a9d1407	release fix 2022-03-16 (#1375 )	2022-03-17 00:43:28 +03:00

1541 changed files with 195475 additions and 56641 deletions

									
										10

.cargo/config.toml
									
												
				@@ -3,6 +3,16 @@

				# by the RUSTDOCFLAGS env var in CI.

				rustdocflags = ["-Arustdoc::private_intra_doc_links"]

				# Enable frame pointers. This may have a minor performance overhead, but makes it easier and more

				# efficient to obtain stack traces (and thus CPU/heap profiles). It may also avoid seg faults that

				# we've seen with libunwind-based profiling. See also:

				#

				# * <https://www.brendangregg.com/blog/2024-03-17/the-return-of-the-frame-pointers.html>

				# * <https://github.com/rust-lang/rust/pull/122646>

				#

				# NB: the RUSTFLAGS envvar will replace this. Make sure to update e.g. Dockerfile as well.

				rustflags = ["-Cforce-frame-pointers=yes"]

				[alias]

				build_testing = ["build", "--features", "testing"]

				neon = ["run", "--bin", "neon_local"]

									
										3

.config/hakari.toml
									
												
				@@ -46,6 +46,9 @@ workspace-members = [

				    "utils",

				    "wal_craft",

				    "walproposer",

				    "postgres-protocol2",

				    "postgres-types2",

				    "tokio-postgres2",

				]

				# Write out exact versions rather than a semver range. (Defaults to false.)

10

.dockerignore

@@ -4,27 +4,27 @@
 !Cargo.lock
 !Cargo.toml
 !Makefile
 !postgres.mk
 !rust-toolchain.toml
 !scripts/combine_control_files.py
 !scripts/ninstall.sh
 !vm-cgconfig.conf
 !docker-compose/run-tests.sh
 # Directories
 !.cargo/
 !.config/
 !compute/
 !compute_tools/
 !control_plane/
 !docker-compose/ext-src
 !libs/
 !neon_local/
 !pageserver/
 !patches/
 !pgxn/
 !proxy/
 !endpoint_storage/
 !storage_scrubber/
 !safekeeper/
 !storage_broker/
 !storage_controller/
 !trace/
 !vendor/postgres-*/
 !workspace_hack/
 !build_tools/patches

1

.github/ISSUE_TEMPLATE/bug-template.md vendored

@@ -3,6 +3,7 @@ name: Bug Template
 about: Used for describing bugs
 title: ''
 labels: t/bug
 type: Bug
 assignees: ''
 ---

1

.github/ISSUE_TEMPLATE/epic-template.md vendored

@@ -4,6 +4,7 @@ about: A set of related tasks contributing towards specific outcome, comprising
   more than 1 week of work.
 title: 'Epic: '
 labels: t/Epic
 type: Epic
 assignees: ''
 ---

									
										21

.github/PULL_REQUEST_TEMPLATE/release-pr.md
									
										vendored
									
											
				@@ -1,21 +0,0 @@

				## Release 202Y-MM-DD

				**NB: this PR must be merged only by 'Create a merge commit'!**

				### Checklist when preparing for release

				- [ ] Read or refresh [the release flow guide](https://www.notion.so/neondatabase/Release-general-flow-61f2e39fd45d4d14a70c7749604bd70b)

				- [ ] Ask in the [cloud Slack channel](https://neondb.slack.com/archives/C033A2WE6BZ) that you are going to rollout the release. Any blockers?

				- [ ] Does this release contain any db migrations? Destructive ones? What is the rollback plan?

				<!-- List everything that should be done **before** release, any issues / setting changes / etc -->

				### Checklist after release

				- [ ] Make sure instructions from PRs included in this release and labeled `manual_release_instructions` are executed (either by you or by people who wrote them).

				- [ ] Based on the merged commits write release notes and open a PR into `website` repo ([example](https://github.com/neondatabase/website/pull/219/files))

				- [ ] Check [#dev-production-stream](https://neondb.slack.com/archives/C03F5SM1N02) Slack channel

				- [ ] Check [stuck projects page](https://console.neon.tech/admin/projects?sort=last_active&order=desc&stuck=true)

				- [ ] Check [recent operation failures](https://console.neon.tech/admin/operations?action=create_timeline%2Cstart_compute%2Cstop_compute%2Csuspend_compute%2Capply_config%2Cdelete_timeline%2Cdelete_tenant%2Ccreate_branch%2Ccheck_availability&sort=updated_at&order=desc&had_retries=some)

				- [ ] Check [cloud SLO dashboard](https://neonprod.grafana.net/d/_oWcBMJ7k/cloud-slos?orgId=1)

				- [ ] Check [compute startup metrics dashboard](https://neonprod.grafana.net/d/5OkYJEmVz/compute-startup-time)

				<!-- List everything that should be done **after** release, any admin UI configuration / Grafana dashboard / alert changes / setting changes / etc -->

									
.github/actionlint.yml
												
				@@ -4,9 +4,12 @@ self-hosted-runner:

				    - large

				    - large-arm64

				    - small

				    - small-metal

				    - small-arm64

				    - unit-perf

				    - us-east-2

				config-variables:

				  - AWS_ECR_REGION

				  - AZURE_DEV_CLIENT_ID

				  - AZURE_DEV_REGISTRY_NAME

				  - AZURE_DEV_SUBSCRIPTION_ID

				@@ -14,9 +17,30 @@ config-variables:

				  - AZURE_PROD_REGISTRY_NAME

				  - AZURE_PROD_SUBSCRIPTION_ID

				  - AZURE_TENANT_ID

				  - BENCHMARK_INGEST_TARGET_PROJECTID

				  - BENCHMARK_LARGE_OLTP_PROJECTID

				  - BENCHMARK_PROJECT_ID_PUB

				  - BENCHMARK_PROJECT_ID_SUB

				  - DEV_AWS_OIDC_ROLE_ARN

				  - DEV_AWS_OIDC_ROLE_MANAGE_BENCHMARK_EC2_VMS_ARN

				  - HETZNER_CACHE_BUCKET

				  - HETZNER_CACHE_ENDPOINT

				  - HETZNER_CACHE_REGION

				  - NEON_DEV_AWS_ACCOUNT_ID

				  - NEON_PROD_AWS_ACCOUNT_ID

				  - PGREGRESS_PG16_PROJECT_ID

				  - PGREGRESS_PG17_PROJECT_ID

				  - REMOTE_STORAGE_AZURE_CONTAINER

				  - REMOTE_STORAGE_AZURE_REGION

				  - SLACK_CICD_CHANNEL_ID

				  - SLACK_COMPUTE_CHANNEL_ID

				  - SLACK_ON_CALL_DEVPROD_STREAM

				  - SLACK_ON_CALL_QA_STAGING_STREAM

				  - SLACK_ON_CALL_STORAGE_STAGING_STREAM

				  - SLACK_ONCALL_COMPUTE_GROUP

				  - SLACK_ONCALL_PROXY_GROUP

				  - SLACK_ONCALL_STORAGE_GROUP

				  - SLACK_PROXY_CHANNEL_ID

				  - SLACK_RUST_CHANNEL_ID

				  - SLACK_STORAGE_CHANNEL_ID

				  - SLACK_UPCOMING_RELEASE_CHANNEL_ID

				  - DEV_AWS_OIDC_ROLE_ARN

									
.github/actions/allure-report-generate/action.yml
												
				@@ -7,6 +7,9 @@ inputs:

				    type: boolean

				    required: false

				    default: false

				  aws-oidc-role-arn:

				    description: 'OIDC role arn to interract with S3'

				    required: true

				outputs:

				  base-url:

				@@ -35,11 +38,14 @@ runs:

				    #

				    - name: Set variables

				      shell: bash -euxo pipefail {0}

				      env:

				        PR_NUMBER: ${{ github.event.pull_request.number }}

				        BUCKET: neon-github-public-dev

				      run: |

				        PR_NUMBER=$(jq --raw-output .pull_request.number "$GITHUB_EVENT_PATH" || true)

				        if [ "${PR_NUMBER}" != "null" ]; then

				        if [ -n "${PR_NUMBER}" ]; then

				          BRANCH_OR_PR=pr-${PR_NUMBER}

				        elif [ "${GITHUB_REF_NAME}" = "main" ] || [ "${GITHUB_REF_NAME}" = "release" ] || [ "${GITHUB_REF_NAME}" = "release-proxy" ]; then

				        elif [ "${GITHUB_REF_NAME}" = "main" ] || [ "${GITHUB_REF_NAME}" = "release" ] || \

				             [ "${GITHUB_REF_NAME}" = "release-proxy" ] || [ "${GITHUB_REF_NAME}" = "release-compute" ]; then

				          # Shortcut for special branches

				          BRANCH_OR_PR=${GITHUB_REF_NAME}

				        else

				@@ -55,8 +61,6 @@ runs:

				        echo "LOCK_FILE=${LOCK_FILE}"       >> $GITHUB_ENV

				        echo "WORKDIR=${WORKDIR}"           >> $GITHUB_ENV

				        echo "BUCKET=${BUCKET}"             >> $GITHUB_ENV

				      env:

				        BUCKET: neon-github-public-dev

				    # TODO: We can replace with a special docker image with Java and Allure pre-installed

				    - uses: actions/setup-java@v4

				@@ -66,6 +70,7 @@ runs:

				    - name: Install Allure

				      shell: bash -euxo pipefail {0}

				      working-directory: /tmp

				      run: |

				        if ! which allure; then

				          ALLURE_ZIP=allure-${ALLURE_VERSION}.zip

				@@ -76,8 +81,15 @@ runs:

				          rm -f ${ALLURE_ZIP}

				        fi

				      env:

				        ALLURE_VERSION: 2.27.0

				        ALLURE_ZIP_SHA256: b071858fb2fa542c65d8f152c5c40d26267b2dfb74df1f1608a589ecca38e777

				        ALLURE_VERSION: 2.32.2

				        ALLURE_ZIP_SHA256: 3f28885e2118f6317c92f667eaddcc6491400af1fb9773c1f3797a5fa5174953

				    - uses: aws-actions/configure-aws-credentials@v4

				      if: ${{ !cancelled() }}

				      with:

				        aws-region: eu-central-1

				        role-to-assume: ${{ inputs.aws-oidc-role-arn }}

				        role-duration-seconds: 3600 # 1 hour should be more than enough to upload report

				    # Potentially we could have several running build for the same key (for example, for the main branch), so we use improvised lock for this

				    - name: Acquire lock

				@@ -183,7 +195,7 @@ runs:

				      uses: actions/cache@v4

				      with:

				        path: ~/.cache/pypoetry/virtualenvs

				        key: v2-${{ runner.os }}-${{ runner.arch }}-python-deps-${{ hashFiles('poetry.lock') }}

				        key: v2-${{ runner.os }}-${{ runner.arch }}-python-deps-bookworm-${{ hashFiles('poetry.lock') }}

				    - name: Store Allure test stat in the DB (new)

				      if: ${{ !cancelled() && inputs.store-test-results-into-db == 'true' }}

				@@ -221,6 +233,8 @@ runs:

				        REPORT_URL: ${{ steps.generate-report.outputs.report-url }}

				        COMMIT_SHA: ${{ github.event.pull_request.head.sha || github.sha }}

				      with:

				        # Retry script for 5XX server errors: https://github.com/actions/github-script#retries

				        retries: 5

				        script: |

				          const { REPORT_URL, COMMIT_SHA } = process.env

									
.github/actions/allure-report-store/action.yml
												
				@@ -8,6 +8,9 @@ inputs:

				  unique-key:

				    description: 'string to distinguish different results in the same run'

				    required: true

				  aws-oidc-role-arn:

				    description: 'OIDC role arn to interract with S3'

				    required: true

				runs:

				  using: "composite"

				@@ -15,11 +18,14 @@ runs:

				  steps:

				    - name: Set variables

				      shell: bash -euxo pipefail {0}

				      env:

				        PR_NUMBER: ${{ github.event.pull_request.number }}

				        REPORT_DIR: ${{ inputs.report-dir }}

				      run: |

				        PR_NUMBER=$(jq --raw-output .pull_request.number "$GITHUB_EVENT_PATH" || true)

				        if [ "${PR_NUMBER}" != "null" ]; then

				        if [ -n "${PR_NUMBER}" ]; then

				          BRANCH_OR_PR=pr-${PR_NUMBER}

				        elif [ "${GITHUB_REF_NAME}" = "main" ] || [ "${GITHUB_REF_NAME}" = "release" ] || [ "${GITHUB_REF_NAME}" = "release-proxy" ]; then

				        elif [ "${GITHUB_REF_NAME}" = "main" ] || [ "${GITHUB_REF_NAME}" = "release" ] || \

				             [ "${GITHUB_REF_NAME}" = "release-proxy" ] || [ "${GITHUB_REF_NAME}" = "release-compute" ]; then

				          # Shortcut for special branches

				          BRANCH_OR_PR=${GITHUB_REF_NAME}

				        else

				@@ -28,8 +34,13 @@ runs:

				        echo "BRANCH_OR_PR=${BRANCH_OR_PR}" >> $GITHUB_ENV

				        echo "REPORT_DIR=${REPORT_DIR}"     >> $GITHUB_ENV

				      env:

				        REPORT_DIR: ${{ inputs.report-dir }}

				    - uses: aws-actions/configure-aws-credentials@v4

				      if: ${{ !cancelled() }}

				      with:

				        aws-region: eu-central-1

				        role-to-assume: ${{ inputs.aws-oidc-role-arn }}

				        role-duration-seconds: 3600 # 1 hour should be more than enough to upload report

				    - name: Upload test results

				      shell: bash -euxo pipefail {0}

									
.github/actions/download/action.yml
												
				@@ -15,10 +15,19 @@ inputs:

				  prefix:

				    description: "S3 prefix. Default is '${GITHUB_RUN_ID}/${GITHUB_RUN_ATTEMPT}'"

				    required: false

				  aws-oidc-role-arn:

				    description: 'OIDC role arn to interract with S3'

				    required: true

				runs:

				  using: "composite"

				  steps:

				    - uses: aws-actions/configure-aws-credentials@v4

				      with:

				        aws-region: eu-central-1

				        role-to-assume: ${{ inputs.aws-oidc-role-arn }}

				        role-duration-seconds: 3600

				    - name: Download artifact

				      id: download-artifact

				      shell: bash -euxo pipefail {0}

									
.github/actions/neon-branch-create/action.yml
												
				@@ -84,7 +84,13 @@ runs:

				          --header "Authorization: Bearer ${API_KEY}"

				          )

				        role_name=$(echo $roles | jq --raw-output '.roles[] | select(.protected == false) | .name')

				        role_name=$(echo "$roles" | jq --raw-output '

				          (.roles | map(select(.protected == false))) as $roles |

				          if any($roles[]; .name == "neondb_owner")

				          then "neondb_owner"

				          else $roles[0].name

				          end

				        ')

				        echo "role_name=${role_name}" >> $GITHUB_OUTPUT

				      env:

				        API_HOST: ${{ inputs.api_host }}

				@@ -107,13 +113,13 @@ runs:

				            )

				          if [ -z "${reset_password}" ]; then

				            sleep 1

				            sleep $i

				            continue

				          fi

				          password=$(echo $reset_password | jq --raw-output '.role.password')

				          if [ "${password}" == "null" ]; then

				            sleep 1

				            sleep $i # increasing backoff

				            continue

				          fi
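The updated role lookup in this hunk prefers `neondb_owner` among the unprotected roles and otherwise falls back to the first unprotected role. A minimal Python sketch of the same selection, assuming the API response has the `roles` / `protected` / `name` shape that the jq filter implies (the function name `pick_role_name` and the sample payload are hypothetical):

```python
import json
from typing import Optional


def pick_role_name(roles_json: str) -> Optional[str]:
    """Mirror the jq filter: prefer "neondb_owner", else the first unprotected role."""
    roles = json.loads(roles_json).get("roles", [])
    # Equivalent of `.roles | map(select(.protected == false))`.
    unprotected = [r for r in roles if r.get("protected") is False]
    if any(r.get("name") == "neondb_owner" for r in unprotected):
        return "neondb_owner"
    # jq would yield null for an empty list; this sketch returns None instead.
    return unprotected[0].get("name") if unprotected else None


# Hypothetical payload, for illustration only.
sample = json.dumps({"roles": [
    {"name": "web_access", "protected": True},
    {"name": "alice", "protected": False},
    {"name": "neondb_owner", "protected": False},
]})
print(pick_role_name(sample))
```

Unlike the previous one-line filter, which printed every unprotected role name, this form always yields a single name.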

									
.github/actions/neon-project-create/action.yml
												
				@@ -17,6 +17,38 @@ inputs:

				  compute_units:

				    description: '[Min, Max] compute units'

				    default: '[1, 1]'

				  # settings below only needed if you want the project to be sharded from the beginning

				  shard_split_project:

				    description: 'by default new projects are not shard-split initiailly, but only when shard-split threshold is reached, specify true to explicitly shard-split initially'

				    required: false

				    default: 'false'

				  disable_sharding:

				    description: 'by default new projects use storage controller default policy to shard-split when shard-split threshold is reached, specify true to explicitly disable sharding'

				    required: false

				    default: 'false'

				  admin_api_key:

				    description: 'Admin API Key needed for shard-splitting. Must be specified if shard_split_project is true'

				    required: false

				  shard_count:

				    description: 'Number of shards to split the project into, only applies if shard_split_project is true'

				    required: false

				    default: '8'

				  stripe_size:

				    description: 'Stripe size, optional, in 8kiB pages.  e.g. set 2048 for 16MB stripes. Default is 128 MiB, only applies if shard_split_project is true'

				    required: false

				    default: '32768'

				  psql_path:

				    description: 'Path to psql binary - it is caller responsibility to provision the psql binary'

				    required: false

				    default: '/tmp/neon/pg_install/v16/bin/psql'

				  libpq_lib_path:

				    description: 'Path to directory containing libpq library - it is caller responsibility to provision the libpq library'

				    required: false

				    default: '/tmp/neon/pg_install/v16/lib'

				  project_settings:

				    description: 'A JSON object with project settings'

				    required: false

				    default: '{}'

				outputs:

				  dsn:
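The `stripe_size` input is expressed in 8 KiB pages, so converting it to a size is a single multiplication. The sketch below works through the two values named in the inputs above; the helper name `stripe_size_mib` is hypothetical. Under this arithmetic the default of `32768` pages comes to 256 MiB rather than the 128 MiB mentioned in the description, so one of the two may be worth double-checking.

```python
PAGE_SIZE_KIB = 8  # the input description states the unit is 8 KiB pages


def stripe_size_mib(pages: int) -> float:
    """Convert a stripe size given in 8 KiB pages to MiB."""
    return pages * PAGE_SIZE_KIB / 1024


print(stripe_size_mib(2048))   # 16.0, the example from the description
print(stripe_size_mib(32768))  # 256.0, the configured default
```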

				@@ -34,9 +66,9 @@ runs:

				      # A shell without `set -x` to not to expose password/dsn in logs

				      shell: bash -euo pipefail {0}

				      run: |

				        project=$(curl \

				        res=$(curl \

				          "https://${API_HOST}/api/v2/projects" \

				          --fail \

				          -w "%{http_code}" \

				          --header "Accept: application/json" \

				          --header "Content-Type: application/json" \

				          --header "Authorization: Bearer ${API_KEY}" \

				@@ -48,9 +80,18 @@ runs:

				              \"provisioner\": \"k8s-neonvm\",

				              \"autoscaling_limit_min_cu\": ${MIN_CU},

				              \"autoscaling_limit_max_cu\": ${MAX_CU},

				              \"settings\": { }

				              \"settings\": ${PROJECT_SETTINGS}

				            }

				          }")

				        code=${res: -3}

				        if [[ ${code} -ge 400 ]]; then

				          echo Request failed with error code ${code}

				          echo ${res::-3}

				          exit 1

				        else

				          project=${res::-3}

				        fi

				        # Mask password

				        echo "::add-mask::$(echo $project | jq --raw-output '.roles[] | select(.name != "web_access") | .password')"

				@@ -63,6 +104,39 @@ runs:

				        echo "project_id=${project_id}" >> $GITHUB_OUTPUT

				        echo "Project ${project_id} has been created"

				        if [ "${SHARD_SPLIT_PROJECT}" = "true" ]; then

				          # determine tenant ID

				          TENANT_ID=`${PSQL} ${dsn} -t -A -c "SHOW neon.tenant_id"`

				          echo "Splitting project ${project_id} with tenant_id ${TENANT_ID} into $((SHARD_COUNT)) shards with stripe size $((STRIPE_SIZE))"

				          echo "Sending PUT request to https://${API_HOST}/regions/${REGION_ID}/api/v1/admin/storage/proxy/control/v1/tenant/${TENANT_ID}/shard_split"

				          echo "with body {\"new_shard_count\": $((SHARD_COUNT)), \"new_stripe_size\": $((STRIPE_SIZE))}"

				          # we need an ADMIN API KEY to invoke storage controller API for shard splitting (bash -u above checks that the variable is set)

				          curl -X PUT \

				            "https://${API_HOST}/regions/${REGION_ID}/api/v1/admin/storage/proxy/control/v1/tenant/${TENANT_ID}/shard_split" \

				            -H "Accept: application/json" -H "Content-Type: application/json" -H "Authorization: Bearer ${ADMIN_API_KEY}" \

				            -d "{\"new_shard_count\": $SHARD_COUNT, \"new_stripe_size\": $STRIPE_SIZE}"

				        fi

				        if [ "${DISABLE_SHARDING}" = "true" ]; then

				          # determine tenant ID

				          TENANT_ID=`${PSQL} ${dsn} -t -A -c "SHOW neon.tenant_id"`

				          echo "Explicitly disabling shard-splitting for project ${project_id} with tenant_id ${TENANT_ID}"

				          echo "Sending PUT request to https://${API_HOST}/regions/${REGION_ID}/api/v1/admin/storage/proxy/control/v1/tenant/${TENANT_ID}/policy"

				          echo "with body {\"scheduling\": \"Essential\"}"

				          # we need an ADMIN API KEY to invoke storage controller API for shard splitting (bash -u above checks that the variable is set)

				          curl -X PUT \

				            "https://${API_HOST}/regions/${REGION_ID}/api/v1/admin/storage/proxy/control/v1/tenant/${TENANT_ID}/policy" \

				            -H "Accept: application/json" -H "Content-Type: application/json" -H "Authorization: Bearer ${ADMIN_API_KEY}" \

				            -d "{\"scheduling\": \"Essential\"}"

				        fi

				      env:

				        API_HOST: ${{ inputs.api_host }}

				        API_KEY: ${{ inputs.api_key }}

				@@ -70,3 +144,11 @@ runs:

				        POSTGRES_VERSION: ${{ inputs.postgres_version }}

				        MIN_CU: ${{ fromJSON(inputs.compute_units)[0] }}

				        MAX_CU: ${{ fromJSON(inputs.compute_units)[1] }}

				        SHARD_SPLIT_PROJECT: ${{ inputs.shard_split_project }}

				        DISABLE_SHARDING: ${{ inputs.disable_sharding }}

				        ADMIN_API_KEY: ${{ inputs.admin_api_key }}

				        SHARD_COUNT: ${{ inputs.shard_count }}

				        STRIPE_SIZE: ${{ inputs.stripe_size }}

				        PSQL: ${{ inputs.psql_path }}

				        LD_LIBRARY_PATH: ${{ inputs.libpq_lib_path }}

				        PROJECT_SETTINGS: ${{ inputs.project_settings }}
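The project-creation step now appends the HTTP status to the captured response with `-w "%{http_code}"` and then peels the last three characters off with `${res: -3}` and `${res::-3}`. A rough Python equivalent of that split, assuming the status code is always exactly three digits at the end of the captured output (the function names and the sample strings are hypothetical):

```python
from typing import Tuple


def split_body_and_status(res: str) -> Tuple[str, int]:
    """Split curl output captured with -w "%{http_code}" into (body, status)."""
    body, code = res[:-3], res[-3:]
    return body, int(code)


def project_or_error(res: str) -> str:
    """Return the body on success; raise where the action would echo and exit 1."""
    body, status = split_body_and_status(res)
    if status >= 400:
        raise RuntimeError(f"Request failed with error code {status}: {body}")
    return body


# Hypothetical response, for illustration only.
print(project_or_error('{"project": {"id": "example"}}201'))
```

Keeping the body and the status in one captured string avoids a temporary file, at the cost of relying on the fixed width of the status code.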

									
.github/actions/run-python-test-set/action.yml
												
				@@ -36,18 +36,26 @@ inputs:

				    description: 'Region name for real s3 tests'

				    required: false

				    default: ''

				  rerun_flaky:

				    description: 'Whether to rerun flaky tests'

				  rerun_failed:

				    description: 'Whether to rerun failed tests'

				    required: false

				    default: 'false'

				  pg_version:

				    description: 'Postgres version to use for tests'

				    required: false

				    default: 'v16'

				  sanitizers:

				    description: 'enabled or disabled'

				    required: false

				    default: 'disabled'

				    type: string

				  benchmark_durations:

				    description: 'benchmark durations JSON'

				    required: false

				    default: '{}'

				  aws-oidc-role-arn:

				    description: 'OIDC role arn to interract with S3'

				    required: true

				runs:

				  using: "composite"

				@@ -56,8 +64,9 @@ runs:

				      if: inputs.build_type != 'remote'

				      uses: ./.github/actions/download

				      with:

				        name: neon-${{ runner.os }}-${{ runner.arch }}-${{ inputs.build_type }}-artifact

				        name: neon-${{ runner.os }}-${{ runner.arch }}-${{ inputs.build_type }}${{ inputs.sanitizers == 'enabled' && '-sanitized' || '' }}-artifact

				        path: /tmp/neon

				        aws-oidc-role-arn: ${{ inputs.aws-oidc-role-arn }}

				    - name: Download Neon binaries for the previous release

				      if: inputs.build_type != 'remote'

				@@ -66,6 +75,7 @@ runs:

				        name: neon-${{ runner.os }}-${{ runner.arch }}-${{ inputs.build_type }}-artifact

				        path: /tmp/neon-previous

				        prefix: latest

				        aws-oidc-role-arn: ${{ inputs.aws-oidc-role-arn }}

				    - name: Download compatibility snapshot

				      if: inputs.build_type != 'remote'

				@@ -77,6 +87,7 @@ runs:

				        # The lack of compatibility snapshot (for example, for the new Postgres version)

				        # shouldn't fail the whole job. Only relevant test should fail.

				        skip-if-does-not-exist: true

				        aws-oidc-role-arn: ${{ inputs.aws-oidc-role-arn }}

				    - name: Checkout

				      if: inputs.needs_postgres_source == 'true'

				@@ -88,7 +99,7 @@ runs:

				      uses: actions/cache@v4

				      with:

				        path: ~/.cache/pypoetry/virtualenvs

				        key: v2-${{ runner.os }}-${{ runner.arch }}-python-deps-${{ hashFiles('poetry.lock') }}

				        key: v2-${{ runner.os }}-${{ runner.arch }}-python-deps-bookworm-${{ hashFiles('poetry.lock') }}

				    - name: Install Python deps

				      shell: bash -euxo pipefail {0}

				@@ -102,10 +113,9 @@ runs:

				        TEST_OUTPUT: /tmp/test_output

				        BUILD_TYPE: ${{ inputs.build_type }}

				        COMPATIBILITY_SNAPSHOT_DIR: /tmp/compatibility_snapshot_pg${{ inputs.pg_version }}

				        ALLOW_BACKWARD_COMPATIBILITY_BREAKAGE: contains(github.event.pull_request.labels.*.name, 'backward compatibility breakage')

				        ALLOW_FORWARD_COMPATIBILITY_BREAKAGE: contains(github.event.pull_request.labels.*.name, 'forward compatibility breakage')

				        RERUN_FLAKY: ${{ inputs.rerun_flaky }}

				        RERUN_FAILED: ${{ inputs.rerun_failed }}

				        PG_VERSION: ${{ inputs.pg_version }}

				        SANITIZERS: ${{ inputs.sanitizers }}

				      shell: bash -euxo pipefail {0}

				      run: |

				        # PLATFORM will be embedded in the perf test report

				@@ -115,12 +125,15 @@ runs:

				        export DEFAULT_PG_VERSION=${PG_VERSION#v}

				        export LD_LIBRARY_PATH=${POSTGRES_DISTRIB_DIR}/v${DEFAULT_PG_VERSION}/lib

				        export BENCHMARK_CONNSTR=${BENCHMARK_CONNSTR:-}

				        export ASAN_OPTIONS=detect_leaks=0:detect_stack_use_after_return=0:abort_on_error=1:strict_string_checks=1:check_initialization_order=1:strict_init_order=1

				        export UBSAN_OPTIONS=abort_on_error=1:print_stacktrace=1

				        if [ "${BUILD_TYPE}" = "remote" ]; then

				          export REMOTE_ENV=1

				        fi

				        PERF_REPORT_DIR="$(realpath test_runner/perf-report-local)"

				        echo "PERF_REPORT_DIR=${PERF_REPORT_DIR}" >> ${GITHUB_ENV}

				        rm -rf $PERF_REPORT_DIR

				        TEST_SELECTION="test_runner/${{ inputs.test_selection }}"

				@@ -150,15 +163,8 @@ runs:

				          EXTRA_PARAMS="--out-dir $PERF_REPORT_DIR $EXTRA_PARAMS"

				        fi

				        if [ "${RERUN_FLAKY}" == "true" ]; then

				          mkdir -p $TEST_OUTPUT

				          poetry run ./scripts/flaky_tests.py "${TEST_RESULT_CONNSTR}" \

				                                              --days 7 \

				                                              --output "$TEST_OUTPUT/flaky.json" \

				                                              --pg-version "${DEFAULT_PG_VERSION}" \

				                                              --build-type "${BUILD_TYPE}"

				          EXTRA_PARAMS="--flaky-tests-json $TEST_OUTPUT/flaky.json $EXTRA_PARAMS"

				        if [ "${RERUN_FAILED}" == "true" ]; then

				          EXTRA_PARAMS="--reruns 2 $EXTRA_PARAMS"

				        fi

				        # We use pytest-split plugin to run benchmarks in parallel on different CI runners

				@@ -204,11 +210,12 @@ runs:

				          --verbose \

				          -rA $TEST_SELECTION $EXTRA_PARAMS

				        if [[ "${{ inputs.save_perf_report }}" == "true" ]]; then

				          export REPORT_FROM="$PERF_REPORT_DIR"

				          export REPORT_TO="$PLATFORM"

				          scripts/generate_and_push_perf_report.sh

				        fi

				    - name: Upload performance report

				      if: ${{ !cancelled() && inputs.save_perf_report == 'true' }}

				      shell: bash -euxo pipefail {0}

				      run: |

				        export REPORT_FROM="${PERF_REPORT_DIR}"

				        scripts/generate_and_push_perf_report.sh

				    - name: Upload compatibility snapshot

				      # Note, that we use `github.base_ref` which is a target branch for a PR

				@@ -218,10 +225,22 @@ runs:

				        name: compatibility-snapshot-${{ runner.arch }}-${{ inputs.build_type }}-pg${{ inputs.pg_version }}

				        # Directory is created by test_compatibility.py::test_create_snapshot, keep the path in sync with the test

				        path: /tmp/test_output/compatibility_snapshot_pg${{ inputs.pg_version }}/

				        # The lack of compatibility snapshot shouldn't fail the job

				        # (for example if we didn't run the test for non build-and-test workflow)

				        skip-if-does-not-exist: true

				        aws-oidc-role-arn: ${{ inputs.aws-oidc-role-arn }}

				    - uses: aws-actions/configure-aws-credentials@v4

				      if: ${{ !cancelled() }}

				      with:

				        aws-region: eu-central-1

				        role-to-assume: ${{ inputs.aws-oidc-role-arn }}

				        role-duration-seconds: 3600 # 1 hour should be more than enough to upload report

				    - name: Upload test results

				      if: ${{ !cancelled() }}

				      uses: ./.github/actions/allure-report-store

				      with:

				        report-dir: /tmp/test_output/allure/results

				        unique-key: ${{ inputs.build_type }}-${{ inputs.pg_version }}

				        unique-key: ${{ inputs.build_type }}-${{ inputs.pg_version }}-${{ runner.arch }}

				        aws-oidc-role-arn: ${{ inputs.aws-oidc-role-arn }}
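The `name:` expression in the first download step relies on the `cond && a || b` idiom of GitHub Actions expressions to append `-sanitized` only when `inputs.sanitizers` is `enabled`. A small Python sketch of the resulting artifact name, with `artifact_name` and its argument values being hypothetical stand-ins for the `runner.*` and `inputs.*` contexts:

```python
def artifact_name(os_name: str, arch: str, build_type: str, sanitizers: str = "disabled") -> str:
    """Build the artifact name the way the workflow expression does."""
    # Mirrors `inputs.sanitizers == 'enabled' && '-sanitized' || ''`.
    suffix = "-sanitized" if sanitizers == "enabled" else ""
    return f"neon-{os_name}-{arch}-{build_type}{suffix}-artifact"


print(artifact_name("Linux", "X64", "release"))
print(artifact_name("Linux", "X64", "release", "enabled"))
```

The download of the previous release in the same file keeps the plain name, so only the current build is affected by the suffix.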

									
.github/actions/save-coverage-data/action.yml
												
				@@ -14,9 +14,11 @@ runs:

				        name: coverage-data-artifact

				        path: /tmp/coverage

				        skip-if-does-not-exist: true # skip if there's no previous coverage to download

				        aws-oidc-role-arn: ${{ inputs.aws-oidc-role-arn }}

				    - name: Upload coverage data

				      uses: ./.github/actions/upload

				      with:

				        name: coverage-data-artifact

				        path: /tmp/coverage

				        aws-oidc-role-arn: ${{ inputs.aws-oidc-role-arn }}

									
.github/actions/set-docker-config-dir/action.yml
											
				@@ -1,36 +0,0 @@

				name: "Set custom docker config directory"

				description: "Create a directory for docker config and set DOCKER_CONFIG"

				# Use custom DOCKER_CONFIG directory to avoid conflicts with default settings

				runs:

				  using: "composite"

				  steps:

				  - name: Show warning on GitHub-hosted runners

				    if: runner.environment == 'github-hosted'

				    shell: bash -euo pipefail {0}

				    run: |

				      # Using the following environment variables to find a path to the workflow file

				      # ${GITHUB_WORKFLOW_REF} - octocat/hello-world/.github/workflows/my-workflow.yml@refs/heads/my_branch

				      # ${GITHUB_REPOSITORY}   - octocat/hello-world

				      # ${GITHUB_REF}          - refs/heads/my_branch

				      # From https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/variables

				      filename_with_ref=${GITHUB_WORKFLOW_REF#"$GITHUB_REPOSITORY/"}

				      filename=${filename_with_ref%"@$GITHUB_REF"}

				      # https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/workflow-commands-for-github-actions#setting-a-warning-message

				      title='Unnecessary usage of `.github/actions/set-docker-config-dir`'

				      message='No need to use `.github/actions/set-docker-config-dir` action on GitHub-hosted runners'

				      echo "::warning file=${filename},title=${title}::${message}"

				  - uses: pyTooling/Actions/with-post-step@74afc5a42a17a046c90c68cb5cfa627e5c6c5b6b # v1.0.7

				    env:

				      DOCKER_CONFIG: .docker-custom-${{ github.run_id }}-${{ github.run_attempt }}

				    with:

				      main: |

				        mkdir -p "${DOCKER_CONFIG}"

				        echo DOCKER_CONFIG=${DOCKER_CONFIG} | tee -a $GITHUB_ENV

				      post: |

				        if [ -d "${DOCKER_CONFIG}" ]; then

				          rm -r "${DOCKER_CONFIG}"

				        fi

									
.github/actions/upload/action.yml
												
				@@ -7,18 +7,28 @@ inputs:

				  path:

				    description: "A directory or file to upload"

				    required: true

				  skip-if-does-not-exist:

				    description: "Allow to skip if path doesn't exist, fail otherwise"

				    default: false

				    required: false

				  prefix:

				    description: "S3 prefix. Default is '${GITHUB_SHA}/${GITHUB_RUN_ID}/${GITHUB_RUN_ATTEMPT}'"

				    required: false

				  aws-oidc-role-arn:

				    description: "the OIDC role arn for aws auth"

				    required: false

				    default: ""

				runs:

				  using: "composite"

				  steps:

				    - name: Prepare artifact

				      id: prepare-artifact

				      shell: bash -euxo pipefail {0}

				      env:

				        SOURCE: ${{ inputs.path }}

				        ARCHIVE: /tmp/uploads/${{ inputs.name }}.tar.zst

				        SKIP_IF_DOES_NOT_EXIST: ${{ inputs.skip-if-does-not-exist }}

				      run: |

				        mkdir -p $(dirname $ARCHIVE)

				@@ -33,14 +43,29 @@ runs:

				        elif [ -f ${SOURCE} ]; then

				          time tar -cf ${ARCHIVE} --zstd ${SOURCE}

				        elif ! ls ${SOURCE} > /dev/null 2>&1; then

				          echo >&2 "${SOURCE} does not exist"

				          exit 2

				          if [ "${SKIP_IF_DOES_NOT_EXIST}" = "true" ]; then

				            echo 'SKIPPED=true' >> $GITHUB_OUTPUT

				            exit 0

				          else

				            echo >&2 "${SOURCE} does not exist"

				            exit 2

				          fi

				        else

				          echo >&2 "${SOURCE} is neither a directory nor a file, do not know how to handle it"

				          exit 3

				        fi

				        echo 'SKIPPED=false' >> $GITHUB_OUTPUT

				    - name: Configure AWS credentials

				      uses: aws-actions/configure-aws-credentials@v4

				      with:

				        aws-region: eu-central-1

				        role-to-assume: ${{ inputs.aws-oidc-role-arn }}

				        role-duration-seconds: 3600

				    - name: Upload artifact

				      if: ${{ steps.prepare-artifact.outputs.SKIPPED == 'false' }}

				      shell: bash -euxo pipefail {0}

				      env:

				        SOURCE: ${{ inputs.path }}

									
.github/file-filters.yaml
Normal file
												
				@@ -0,0 +1,13 @@

				rust_code: ['**/*.rs', '**/Cargo.toml', '**/Cargo.lock']

				rust_dependencies: ['**/Cargo.lock']

				v14: ['vendor/postgres-v14/**', 'Makefile', 'pgxn/**']

				v15: ['vendor/postgres-v15/**', 'Makefile', 'pgxn/**']

				v16: ['vendor/postgres-v16/**', 'Makefile', 'pgxn/**']

				v17: ['vendor/postgres-v17/**', 'Makefile', 'pgxn/**']

				rebuild_neon_extra:

				    - .github/workflows/neon_extra_builds.yml

				rebuild_macos:

				    - .github/workflows/build-macos.yml

									
.github/pull_request_template.md
												
				@@ -1,14 +1,3 @@

				## Problem

				## Summary of changes

				## Checklist before requesting a review

				- [ ] I have performed a self-review of my code.

				- [ ] If it is a core feature, I have added thorough tests.

				- [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard?

				- [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

				## Checklist before merging

				- [ ] Do not forget to reformat commit message to not include the above checklist

									
.github/scripts/generate_image_maps.py
Normal file
												
				@@ -0,0 +1,69 @@

				import itertools

				import json

				import os

				import sys

				source_tag = os.getenv("SOURCE_TAG")

				target_tag = os.getenv("TARGET_TAG")

				branch = os.getenv("BRANCH")

				dev_acr = os.getenv("DEV_ACR")

				prod_acr = os.getenv("PROD_ACR")

				dev_aws = os.getenv("DEV_AWS")

				prod_aws = os.getenv("PROD_AWS")

				aws_region = os.getenv("AWS_REGION")

				components = {

				    "neon": ["neon"],

				    "compute": [

				        "compute-node-v14",

				        "compute-node-v15",

				        "compute-node-v16",

				        "compute-node-v17",

				        "vm-compute-node-v14",

				        "vm-compute-node-v15",

				        "vm-compute-node-v16",

				        "vm-compute-node-v17",

				    ],

				}

				registries = {

				    "dev": [

				        "docker.io/neondatabase",

				        "ghcr.io/neondatabase",

				        f"{dev_aws}.dkr.ecr.{aws_region}.amazonaws.com",

				        f"{dev_acr}.azurecr.io/neondatabase",

				    ],

				    "prod": [

				        f"{prod_aws}.dkr.ecr.{aws_region}.amazonaws.com",

				        f"{prod_acr}.azurecr.io/neondatabase",

				    ],

				}

				release_branches = ["release", "release-proxy", "release-compute"]

				outputs: dict[str, dict[str, list[str]]] = {}

				target_tags = (

				    [target_tag, "latest"]

				    if branch == "main"

				    else [target_tag, "released"]

				    if branch in release_branches

				    else [target_tag]

				)

				target_stages = ["dev", "prod"] if branch in release_branches else ["dev"]

				for component_name, component_images in components.items():

				    for stage in target_stages:

				        outputs[f"{component_name}-{stage}"] = {

				            f"ghcr.io/neondatabase/{component_image}:{source_tag}": [

				                f"{registry}/{component_image}:{tag}"

				                for registry, tag in itertools.product(registries[stage], target_tags)

				                if not (registry == "ghcr.io/neondatabase" and tag == source_tag)

				            ]

				            for component_image in component_images

				        }

				with open(os.getenv("GITHUB_OUTPUT", "/dev/null"), "a") as f:

				    for key, value in outputs.items():

				        f.write(f"{key}={json.dumps(value)}\n")

				        print(f"Image map for {key}:\n{json.dumps(value, indent=2)}\n\n", file=sys.stderr)
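To make the tag and stage selection in `generate_image_maps.py` easier to follow, the sketch below isolates the two conditionals as functions. The function names, the example tag `12345`, and the branch `feature/x` are hypothetical; the release branch names and the returned values are taken from the script above.

```python
RELEASE_BRANCHES = ["release", "release-proxy", "release-compute"]


def target_tags_for(branch: str, target_tag: str) -> list:
    """Tags pushed for a branch: main adds "latest", release branches add "released"."""
    if branch == "main":
        return [target_tag, "latest"]
    if branch in RELEASE_BRANCHES:
        return [target_tag, "released"]
    return [target_tag]


def target_stages_for(branch: str) -> list:
    """Only release branches are promoted to the prod registries."""
    return ["dev", "prod"] if branch in RELEASE_BRANCHES else ["dev"]


for branch in ["main", "release-proxy", "feature/x"]:
    print(branch, target_tags_for(branch, "12345"), target_stages_for(branch))
```

Note that `main` gets the extra `latest` tag but is still pushed to the dev registries only.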

									
110

.github/scripts/lint-release-pr.sh vendored Executable file

				@@ -0,0 +1,110 @@

				#!/usr/bin/env bash

				set -euo pipefail

				DOCS_URL="https://docs.neon.build/overview/repositories/neon.html"

				message() {

				  if [[ -n "${GITHUB_PR_NUMBER:-}" ]]; then

				    gh pr comment --repo "${GITHUB_REPOSITORY}" "${GITHUB_PR_NUMBER}" --edit-last --body "$1" \

				      || gh pr comment --repo "${GITHUB_REPOSITORY}" "${GITHUB_PR_NUMBER}" --body "$1"

				  fi

				  echo "$1"

				}

				report_error() {

				  message "❌ $1

				  For more details, see the documentation: ${DOCS_URL}"

				  exit 1

				}

				case "$RELEASE_BRANCH" in

				  "release") COMPONENT="Storage" ;;

				  "release-proxy") COMPONENT="Proxy" ;;

				  "release-compute") COMPONENT="Compute" ;;

				  *)

				    report_error "Unknown release branch: ${RELEASE_BRANCH}"

				    ;;

				esac

				# Identify main and release branches

				MAIN_BRANCH="origin/main"

				REMOTE_RELEASE_BRANCH="origin/${RELEASE_BRANCH}"

				# Find merge base

				MERGE_BASE=$(git merge-base "${MAIN_BRANCH}" "${REMOTE_RELEASE_BRANCH}")

				echo "Merge base of ${MAIN_BRANCH} and ${RELEASE_BRANCH}: ${MERGE_BASE}"

				# Get the HEAD commit (last commit in PR, expected to be the merge commit)

				LAST_COMMIT=$(git rev-parse HEAD)

				MERGE_COMMIT_MESSAGE=$(git log -1 --format=%s "${LAST_COMMIT}")

				EXPECTED_MESSAGE_REGEX="^$COMPONENT release [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2} UTC$"

				if ! [[ "${MERGE_COMMIT_MESSAGE}" =~ ${EXPECTED_MESSAGE_REGEX} ]]; then

				  report_error "Merge commit message does not match expected pattern: '<component> release YYYY-MM-DD HH:MM UTC'

				  Expected component: ${COMPONENT}

				  Found: '${MERGE_COMMIT_MESSAGE}'"

				fi

				echo "✅ Merge commit message is correctly formatted: '${MERGE_COMMIT_MESSAGE}'"

				LAST_COMMIT_PARENTS=$(git cat-file -p "${LAST_COMMIT}" | jq -sR '[capture("parent (?<parent>[0-9a-f]{40})"; "g") | .parent]')

				if [[ "$(echo "${LAST_COMMIT_PARENTS}" | jq 'length')" -ne 2 ]]; then

				  report_error "Last commit must be a merge commit with exactly two parents"

				fi

				EXPECTED_RELEASE_HEAD=$(git rev-parse "${REMOTE_RELEASE_BRANCH}")

				if echo "${LAST_COMMIT_PARENTS}" | jq -e --arg rel "${EXPECTED_RELEASE_HEAD}" 'index($rel) != null' > /dev/null; then

				  LINEAR_HEAD=$(echo "${LAST_COMMIT_PARENTS}" | jq -r '[.[] | select(. != $rel)][0]' --arg rel "${EXPECTED_RELEASE_HEAD}")

				else

				  report_error "Last commit must merge the release branch (${RELEASE_BRANCH})"

				fi

				echo "✅ Last commit correctly merges the previous commit and the release branch"

				echo "Top commit of linear history: ${LINEAR_HEAD}"

				MERGE_COMMIT_TREE=$(git rev-parse "${LAST_COMMIT}^{tree}")

				LINEAR_HEAD_TREE=$(git rev-parse "${LINEAR_HEAD}^{tree}")

				if [[ "${MERGE_COMMIT_TREE}" != "${LINEAR_HEAD_TREE}" ]]; then

				  report_error "Tree of merge commit (${MERGE_COMMIT_TREE}) does not match tree of linear history head (${LINEAR_HEAD_TREE})

				  This indicates that the merge of ${RELEASE_BRANCH} into this branch was not performed using the merge strategy 'ours'"

				fi

				echo "✅ Merge commit tree matches the linear history head"

				EXPECTED_PREVIOUS_COMMIT="${LINEAR_HEAD}"

				# Now traverse down the history, ensuring each commit has exactly one parent

				CURRENT_COMMIT="${EXPECTED_PREVIOUS_COMMIT}"

				while [[ "${CURRENT_COMMIT}" != "${MERGE_BASE}" && "${CURRENT_COMMIT}" != "${EXPECTED_RELEASE_HEAD}" ]]; do

				  CURRENT_COMMIT_PARENTS=$(git cat-file -p "${CURRENT_COMMIT}" | jq -sR '[capture("parent (?<parent>[0-9a-f]{40})"; "g") | .parent]')

				  if [[ "$(echo "${CURRENT_COMMIT_PARENTS}" | jq 'length')" -ne 1 ]]; then

				    report_error "Commit ${CURRENT_COMMIT} must have exactly one parent"

				  fi

				  NEXT_COMMIT=$(echo "${CURRENT_COMMIT_PARENTS}" | jq -r '.[0]')

				  if [[ "${NEXT_COMMIT}" == "${MERGE_BASE}" ]]; then

				    echo "✅ Reached merge base (${MERGE_BASE})"

				    PR_BASE="${MERGE_BASE}"

				  elif [[ "${NEXT_COMMIT}" == "${EXPECTED_RELEASE_HEAD}" ]]; then

				    echo "✅ Reached release branch (${EXPECTED_RELEASE_HEAD})"

				    PR_BASE="${EXPECTED_RELEASE_HEAD}"

				  elif [[ -z "${NEXT_COMMIT}" ]]; then

				    report_error "Unexpected end of commit history before reaching merge base"

				  fi

				  # Move to the next commit in the chain

				  CURRENT_COMMIT="${NEXT_COMMIT}"

				done

				echo "✅ All commits are properly ordered and linear"

				echo "✅ Release PR structure is valid"

				echo

				message "Commits that are part of this release:

				$(git log --oneline "${PR_BASE}..${LINEAR_HEAD}")"
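
The commit-subject check in this script can be illustrated outside of bash. This is a rough Python sketch under stated assumptions: the helper name `is_valid_release_subject` is hypothetical, and only the branch-to-component mapping and the regular expression are taken from the script above.

```python
import re

# Same branch-to-component mapping as the `case` statement in the script.
COMPONENTS = {
    "release": "Storage",
    "release-proxy": "Proxy",
    "release-compute": "Compute",
}


def is_valid_release_subject(release_branch: str, subject: str) -> bool:
    """Return True if `subject` looks like '<Component> release YYYY-MM-DD HH:MM UTC'."""
    component = COMPONENTS.get(release_branch)
    if component is None:
        raise ValueError(f"Unknown release branch: {release_branch}")
    pattern = (
        rf"{re.escape(component)} release "
        r"[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2} UTC"
    )
    # fullmatch plays the role of the ^...$ anchors in the bash regex.
    return re.fullmatch(pattern, subject) is not None
```

For example, the subject `Proxy release 2025-07-03 10:07 UTC` passes for the `release-proxy` branch, while a subject that omits the time and the `UTC` suffix is rejected.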

31

.github/scripts/previous-releases.jq vendored Normal file

@@ -0,0 +1,31 @@
 # Expects response from https://docs.github.com/en/rest/releases/releases?apiVersion=2022-11-28#list-releases as input,
 # with tag names `release` for storage, `release-compute` for compute and `release-proxy` for proxy releases.
 # Extract only the `tag_name` field from each release object
 [ .[].tag_name ]
 # Transform each tag name into a structured object using regex capture
 | reduce map(
     capture("^(?<full>release(-(?<component>proxy|compute))?-(?<version>\\d+))$")
     | {
         component: (.component // "storage"),  # Default to "storage" if no component is specified
         version: (.version | tonumber),        # Convert the version number to an integer
         full: .full                            # Store the full tag name for final output
       }
   )[] as $entry  # Loop over the transformed list
 # Accumulate the latest (highest-numbered) version for each component
 ({};
  .[$entry.component] |= (if . == null or $entry.version > .version then $entry else . end))
 # Ensure that each component exists, or fail
 | (["storage", "compute", "proxy"] - (keys)) as $missing
 | if ($missing | length) > 0 then
     "Error: Found no release for \($missing | join(", "))!\n" | halt_error(1)
   else . end
 # Convert the resulting object into an array of formatted strings
 | to_entries
 | map("\(.key)=\(.value.full)")
 # Output each string separately
 | .[]
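
For readers less familiar with jq's `reduce`, the same selection can be sketched in Python. This is an approximation for illustration, not a replacement for the filter: the function name `latest_releases` and the sample tag names are assumptions, and the sketch takes a plain list of tag names rather than the full API response.

```python
import re

# Same shape as the jq capture: optional "-proxy" / "-compute" infix, numeric suffix.
TAG_RE = re.compile(r"release(?:-(?P<component>proxy|compute))?-(?P<version>[0-9]+)")


def latest_releases(tag_names: list[str]) -> dict[str, str]:
    """Return the highest-numbered release tag per component."""
    latest: dict[str, tuple[int, str]] = {}
    for tag in tag_names:
        match = TAG_RE.fullmatch(tag)
        if match is None:
            continue  # tags that do not follow the release naming scheme are ignored
        component = match.group("component") or "storage"
        version = int(match.group("version"))
        if component not in latest or version > latest[component][0]:
            latest[component] = (version, tag)
    missing = [c for c in ("storage", "compute", "proxy") if c not in latest]
    if missing:
        raise ValueError(f"Found no release for {', '.join(missing)}!")
    return {component: tag for component, (_, tag) in latest.items()}


# Hypothetical tag names for illustration.
result = latest_releases(
    ["release-100", "release-proxy-5", "release-compute-7", "release-99", "release-proxy-12", "v1"]
)
lines = [f"{component}={tag}" for component, tag in sorted(result.items())]
```

Versions are compared as integers, so `release-100` is newer than `release-99` even though it sorts lower as a string.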

									
45

.github/scripts/push_with_image_map.py vendored Normal file

				@@ -0,0 +1,45 @@

				import json

				import os

				import subprocess

				RED = "\033[91m"

				RESET = "\033[0m"

				image_map = os.getenv("IMAGE_MAP")

				if not image_map:

				    raise ValueError("IMAGE_MAP environment variable is not set")

				try:

				    parsed_image_map: dict[str, list[str]] = json.loads(image_map)

				except json.JSONDecodeError as e:

				    raise ValueError("Failed to parse IMAGE_MAP as JSON") from e

				failures = []

				pending = [(source, target) for source, targets in parsed_image_map.items() for target in targets]

				while len(pending) > 0:

				    if len(failures) > 10:

				        print("Error: more than 10 failures!")

				        for failure in failures:

				            print(f'"{failure[0]}" failed with the following output:')

				            print(failure[1])

				        raise RuntimeError("Retry limit reached.")

				    source, target = pending.pop(0)

				    cmd = ["docker", "buildx", "imagetools", "create", "-t", target, source]

				    print(f"Running: {' '.join(cmd)}")

				    result = subprocess.run(cmd, text=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)

				    if result.returncode != 0:

				        failures.append((" ".join(cmd), result.stdout, target))

				        pending.append((source, target))

				        print(

				            f"{RED}[RETRY]{RESET} Push failed for {target}. Retrying... (failure count: {len(failures)})"

				        )

				        print(result.stdout)

				if len(failures) > 0 and (github_output := os.getenv("GITHUB_OUTPUT")):

				    failed_targets = [target for _, _, target in failures]

				    with open(github_output, "a") as f:

				        f.write(f"push_failures={json.dumps(failed_targets)}\n")
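
The retry behaviour of this script (a failed push is put back at the end of the queue, and the run aborts once more than ten failures have accumulated in total) can be exercised without Docker. The sketch below is an assumption-laden illustration: `push_all` and the injected `push` callable are hypothetical names, and the callable stands in for the `docker buildx imagetools create` call.

```python
from collections import deque
from typing import Callable


def push_all(
    image_map: dict[str, list[str]],
    push: Callable[[str, str], bool],
    max_failures: int = 10,
) -> list[str]:
    """Push every (source, target) pair, re-queueing failures at the back.

    Returns the targets that failed at least once. Raises RuntimeError once the
    total number of failures across all targets exceeds `max_failures`.
    """
    pending = deque(
        (source, target) for source, targets in image_map.items() for target in targets
    )
    failed_targets: list[str] = []
    while pending:
        if len(failed_targets) > max_failures:
            raise RuntimeError("Retry limit reached.")
        source, target = pending.popleft()
        if not push(source, target):
            failed_targets.append(target)
            pending.append((source, target))  # retry later, after the other pushes
    return failed_targets


# Demo with a fake push that fails the first attempt for target "b" only.
calls: list[str] = []


def flaky_push(source: str, target: str) -> bool:
    calls.append(target)
    return target != "b" or calls.count("b") > 1


failed = push_all({"src": ["a", "b", "c"]}, flaky_push)
```

Note that the limit applies to the total number of failures across all targets, not to each target separately, so a single persistently failing target is enough to exhaust it.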

									
60

.github/workflows/_benchmarking_preparation.yml vendored

				@@ -3,19 +3,26 @@ name: Prepare benchmarking databases by restoring dumps

				on:

				  workflow_call:

				    # no inputs needed

				defaults:

				  run:

				    shell: bash -euxo pipefail {0}

				permissions:

				  contents: read

				jobs:

				  setup-databases:

				    permissions:

				      contents: write

				      statuses: write

				      id-token: write # aws-actions/configure-aws-credentials

				    strategy:

				      fail-fast: false

				      matrix:

				        platform: [ aws-rds-postgres, aws-aurora-serverless-v2-postgres, neon ] 

				        platform: [ aws-rds-postgres, aws-aurora-serverless-v2-postgres, neon, neon_pg17 ]

				        database: [ clickbench, tpch, userexample ]

				    env:

				      LD_LIBRARY_PATH: /tmp/neon/pg_install/v16/lib

				      PLATFORM: ${{ matrix.platform }}

				@@ -23,22 +30,33 @@ jobs:

				    runs-on: [ self-hosted, us-east-2, x64 ]

				    container:

				      image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/build-tools:pinned

				      image: ghcr.io/neondatabase/build-tools:pinned-bookworm

				      credentials:

				        username: ${{ github.actor }}

				        password: ${{ secrets.GITHUB_TOKEN }}

				      options: --init

				    steps:

				    - name: Harden the runner (Audit all outbound calls)

				      uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				      with:

				        egress-policy: audit

				    - name: Set up Connection String

				      id: set-up-prep-connstr

				      run: |

				        case "${PLATFORM}" in

				          neon)

				            CONNSTR=${{ secrets.BENCHMARK_CAPTEST_CONNSTR }} 

				            CONNSTR=${{ secrets.BENCHMARK_CAPTEST_CONNSTR }}

				            ;;

				          neon_pg17)

				            CONNSTR=${{ secrets.BENCHMARK_CAPTEST_CONNSTR_PG17 }}

				            ;;

				          aws-rds-postgres)

				            CONNSTR=${{ secrets.BENCHMARK_RDS_POSTGRES_CONNSTR }} 

				            CONNSTR=${{ secrets.BENCHMARK_RDS_POSTGRES_CONNSTR }}

				            ;;

				          aws-aurora-serverless-v2-postgres)

				            CONNSTR=${{ secrets.BENCHMARK_RDS_AURORA_CONNSTR }} 

				            CONNSTR=${{ secrets.BENCHMARK_RDS_AURORA_CONNSTR }}

				            ;;

				          *)

				            echo >&2 "Unknown PLATFORM=${PLATFORM}"

				@@ -46,9 +64,16 @@ jobs:

				            ;;

				        esac

				        echo "connstr=${CONNSTR}" >> $GITHUB_OUTPUT  

				        echo "connstr=${CONNSTR}" >> $GITHUB_OUTPUT

				    - uses: actions/checkout@v4

				    - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				    - name: Configure AWS credentials

				      uses: aws-actions/configure-aws-credentials@e3dd6a429d7300a6a4c196c26e071d42e0343502 # v4.0.2

				      with:

				        aws-region: eu-central-1

				        role-to-assume: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				        role-duration-seconds: 18000 # 5 hours

				    - name: Download Neon artifact

				      uses: ./.github/actions/download

				@@ -56,24 +81,25 @@ jobs:

				        name: neon-${{ runner.os }}-${{ runner.arch }}-release-artifact

				        path: /tmp/neon/

				        prefix: latest

				        aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				    # we create a table that has one row for each database that we want to restore with the status whether the restore is done    

				    # we create a table that has one row for each database that we want to restore with the status whether the restore is done

				    - name: Create benchmark_restore_status table if it does not exist

				      env:

				        BENCHMARK_CONNSTR: ${{ steps.set-up-prep-connstr.outputs.connstr }}

				        DATABASE_NAME: ${{ matrix.database }}

				      # to avoid a race condition of multiple jobs trying to create the table at the same time, 

				      # to avoid a race condition of multiple jobs trying to create the table at the same time,

				      # we use an advisory lock

				      run: |

				        ${PG_BINARIES}/psql "${{ env.BENCHMARK_CONNSTR }}" -c "

				        SELECT pg_advisory_lock(4711);  

				        SELECT pg_advisory_lock(4711);

				        CREATE TABLE IF NOT EXISTS benchmark_restore_status (

				        databasename text primary key,

				        restore_done boolean

				        );

				        SELECT pg_advisory_unlock(4711);

				        "

				    - name: Check if restore is already done

				      id: check-restore-done

				      env:

				@@ -107,7 +133,7 @@ jobs:

				        DATABASE_NAME: ${{ matrix.database }}

				      run: |

				        mkdir -p /tmp/dumps

				        aws s3 cp s3://neon-github-dev/performance/pgdumps/$DATABASE_NAME/$DATABASE_NAME.pg_dump /tmp/dumps/ 

				        aws s3 cp s3://neon-github-dev/performance/pgdumps/$DATABASE_NAME/$DATABASE_NAME.pg_dump /tmp/dumps/

				    - name: Replace database name in connection string

				      if: steps.check-restore-done.outputs.skip != 'true'

				@@ -126,17 +152,17 @@ jobs:

				        else

				          new_connstr="${base_connstr}/${DATABASE_NAME}"

				        fi

				        echo "database_connstr=${new_connstr}" >> $GITHUB_OUTPUT  

				        echo "database_connstr=${new_connstr}" >> $GITHUB_OUTPUT

				    - name: Restore dump

				      if: steps.check-restore-done.outputs.skip != 'true'

				      env:

				        DATABASE_NAME: ${{ matrix.database }}

				        DATABASE_CONNSTR: ${{ steps.replace-dbname.outputs.database_connstr }}

				        # the following works only with larger computes: 

				        # the following works only with larger computes:

				        # PGOPTIONS: "-c maintenance_work_mem=8388608 -c max_parallel_maintenance_workers=7"

				        # we add the || true because:

				        # the dumps were created with Neon and contain neon extensions that are not 

				        # the dumps were created with Neon and contain neon extensions that are not

				        # available in RDS, so we will always report an error, but we can ignore it

				      run: |

				        ${PG_BINARIES}/pg_restore --clean --if-exists --no-owner --jobs=4 \

									
257

.github/workflows/_build-and-test-locally.yml vendored

				@@ -19,10 +19,30 @@ on:

				        description: 'debug or release'

				        required: true

				        type: string

				      pg-versions:

				        description: 'a json array of postgres versions to run regression tests on'

				      test-cfg:

				        description: 'a json object of postgres versions and lfc states to run regression tests on'

				        required: true

				        type: string

				      sanitizers:

				        description: 'enabled or disabled'

				        required: false

				        default: 'disabled'

				        type: string

				      test-selection:

				        description: 'specification of selected test(s) to run'

				        required: false

				        default: ''

				        type: string

				      test-run-count:

				        description: 'number of runs to perform for selected tests'

				        required: false

				        default: 1

				        type: number

				      rerun-failed:

				        description: 'rerun failed tests to ignore flaky tests'

				        required: false

				        default: true

				        type: boolean

				defaults:

				  run:

				@@ -31,17 +51,21 @@ defaults:

				env:

				  RUST_BACKTRACE: 1

				  COPT: '-Werror'

				  AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_DEV }}

				  AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_KEY_DEV }}

				permissions:

				  contents: read

				jobs:

				  build-neon:

				    runs-on: ${{ fromJson(format('["self-hosted", "{0}"]', inputs.arch == 'arm64' && 'large-arm64' || 'large')) }}

				    runs-on: ${{ fromJSON(format('["self-hosted", "{0}"]', inputs.arch == 'arm64' && 'large-arm64' || 'large')) }}

				    permissions:

				      id-token: write # aws-actions/configure-aws-credentials

				      contents: read

				    container:

				      image: ${{ inputs.build-tools-image }}

				      credentials:

				        username: ${{ secrets.NEON_DOCKERHUB_USERNAME }}

				        password: ${{ secrets.NEON_DOCKERHUB_PASSWORD }}

				        username: ${{ github.actor }}

				        password: ${{ secrets.GITHUB_TOKEN }}

				      # Raise locked memory limit for tokio-epoll-uring.

				      # On 5.10 LTS kernels < 5.10.162 (and generally mainline kernels < 5.12),

				      # io_uring will account the memory of the CQ and SQ as locked.

				@@ -53,21 +77,12 @@ jobs:

				      BUILD_TAG: ${{ inputs.build-tag }}

				    steps:

				      - name: Fix git ownership

				        run: |

				          # Workaround for `fatal: detected dubious ownership in repository at ...`

				          #

				          # Use both ${{ github.workspace }} and ${GITHUB_WORKSPACE} because they're different on host and in containers

				          #   Ref https://github.com/actions/checkout/issues/785

				          #

				          git config --global --add safe.directory ${{ github.workspace }}

				          git config --global --add safe.directory ${GITHUB_WORKSPACE}

				          for r in 14 15 16; do

				            git config --global --add safe.directory "${{ github.workspace }}/vendor/postgres-v$r"

				            git config --global --add safe.directory "${GITHUB_WORKSPACE}/vendor/postgres-v$r"

				          done

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - uses: actions/checkout@v4

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          submodules: true

				@@ -83,93 +98,120 @@ jobs:

				        id: pg_v16_rev

				        run: echo pg_rev=$(git rev-parse HEAD:vendor/postgres-v16) >> $GITHUB_OUTPUT

				      - name: Set pg 17 revision for caching

				        id: pg_v17_rev

				        run: echo pg_rev=$(git rev-parse HEAD:vendor/postgres-v17) >> $GITHUB_OUTPUT

				      # Set some environment variables used by all the steps.

				      #

				      # CARGO_FLAGS is extra options to pass to "cargo build", "cargo test" etc.

				      #   It also includes --features, if any

				      # CARGO_FLAGS is extra options to pass to all "cargo" subcommands.

				      #

				      # CARGO_FEATURES is passed to "cargo metadata". It is separate from CARGO_FLAGS,

				      #   because "cargo metadata" doesn't accept --release or --debug options

				      # CARGO_PROFILE is passed to "cargo build", "cargo test" etc, but not to

				      #   "cargo metadata", because it doesn't accept --release or --debug options.

				      #

				      # We run tests with additional features, that are turned off by default (e.g. in release builds), see

				      # corresponding Cargo.toml files for their descriptions.

				      - name: Set env variables

				        env:

				          ARCH: ${{ inputs.arch }}

				          SANITIZERS: ${{ inputs.sanitizers }}

				        run: |

				          CARGO_FEATURES="--features testing"

				          CARGO_FLAGS="--locked --features testing"

				          if [[ $BUILD_TYPE == "debug" && $ARCH == 'x64' ]]; then

				            cov_prefix="scripts/coverage --profraw-prefix=$GITHUB_JOB --dir=/tmp/coverage run"

				            CARGO_FLAGS="--locked"

				            CARGO_PROFILE=""

				          elif [[ $BUILD_TYPE == "debug" ]]; then

				            cov_prefix=""

				            CARGO_FLAGS="--locked"

				            CARGO_PROFILE=""

				          elif [[ $BUILD_TYPE == "release" ]]; then

				            cov_prefix=""

				            CARGO_FLAGS="--locked --release"

				            CARGO_PROFILE="--release"

				          fi

				          if [[ $SANITIZERS == 'enabled' ]]; then

				            make_vars="WITH_SANITIZERS=yes"

				          else

				            make_vars=""

				          fi

				          {

				            echo "cov_prefix=${cov_prefix}"

				            echo "CARGO_FEATURES=${CARGO_FEATURES}"

				            echo "make_vars=${make_vars}"

				            echo "CARGO_FLAGS=${CARGO_FLAGS}"

				            echo "CARGO_PROFILE=${CARGO_PROFILE}"

				            echo "CARGO_HOME=${GITHUB_WORKSPACE}/.cargo"

				          } >> $GITHUB_ENV

				      - name: Cache postgres v14 build

				        id: cache_pg_14

				        uses: actions/cache@v4

				        uses: tespkg/actions-cache@b7bf5fcc2f98a52ac6080eb0fd282c2f752074b1  # v1.8.0

				        with:

				          endpoint: ${{ vars.HETZNER_CACHE_REGION }}.${{ vars.HETZNER_CACHE_ENDPOINT }}

				          bucket: ${{ vars.HETZNER_CACHE_BUCKET }}

				          accessKey: ${{ secrets.HETZNER_CACHE_ACCESS_KEY }}

				          secretKey: ${{ secrets.HETZNER_CACHE_SECRET_KEY }}

				          use-fallback: false

				          path: pg_install/v14

				          key: v1-${{ runner.os }}-${{ runner.arch }}-${{ inputs.build-type }}-pg-${{ steps.pg_v14_rev.outputs.pg_rev }}-${{ hashFiles('Makefile', 'Dockerfile.build-tools') }}

				          key: v1-${{ runner.os }}-${{ runner.arch }}-${{ inputs.build-type }}-pg-${{ steps.pg_v14_rev.outputs.pg_rev }}-bookworm-${{ hashFiles('Makefile', 'build-tools.Dockerfile') }}

				      - name: Cache postgres v15 build

				        id: cache_pg_15

				        uses: actions/cache@v4

				        uses: tespkg/actions-cache@b7bf5fcc2f98a52ac6080eb0fd282c2f752074b1  # v1.8.0

				        with:

				          endpoint: ${{ vars.HETZNER_CACHE_REGION }}.${{ vars.HETZNER_CACHE_ENDPOINT }}

				          bucket: ${{ vars.HETZNER_CACHE_BUCKET }}

				          accessKey: ${{ secrets.HETZNER_CACHE_ACCESS_KEY }}

				          secretKey: ${{ secrets.HETZNER_CACHE_SECRET_KEY }}

				          use-fallback: false

				          path: pg_install/v15

				          key: v1-${{ runner.os }}-${{ runner.arch }}-${{ inputs.build-type }}-pg-${{ steps.pg_v15_rev.outputs.pg_rev }}-${{ hashFiles('Makefile', 'Dockerfile.build-tools') }}

				          key: v1-${{ runner.os }}-${{ runner.arch }}-${{ inputs.build-type }}-pg-${{ steps.pg_v15_rev.outputs.pg_rev }}-bookworm-${{ hashFiles('Makefile', 'build-tools.Dockerfile') }}

				      - name: Cache postgres v16 build

				        id: cache_pg_16

				        uses: actions/cache@v4

				        uses: tespkg/actions-cache@b7bf5fcc2f98a52ac6080eb0fd282c2f752074b1  # v1.8.0

				        with:

				          endpoint: ${{ vars.HETZNER_CACHE_REGION }}.${{ vars.HETZNER_CACHE_ENDPOINT }}

				          bucket: ${{ vars.HETZNER_CACHE_BUCKET }}

				          accessKey: ${{ secrets.HETZNER_CACHE_ACCESS_KEY }}

				          secretKey: ${{ secrets.HETZNER_CACHE_SECRET_KEY }}

				          use-fallback: false

				          path: pg_install/v16

				          key: v1-${{ runner.os }}-${{ runner.arch }}-${{ inputs.build-type }}-pg-${{ steps.pg_v16_rev.outputs.pg_rev }}-${{ hashFiles('Makefile', 'Dockerfile.build-tools') }}

				          key: v1-${{ runner.os }}-${{ runner.arch }}-${{ inputs.build-type }}-pg-${{ steps.pg_v16_rev.outputs.pg_rev }}-bookworm-${{ hashFiles('Makefile', 'build-tools.Dockerfile') }}

				      - name: Build postgres v14

				        if: steps.cache_pg_14.outputs.cache-hit != 'true'

				        run: mold -run make postgres-v14 -j$(nproc)

				      - name: Cache postgres v17 build

				        id: cache_pg_17

				        uses: tespkg/actions-cache@b7bf5fcc2f98a52ac6080eb0fd282c2f752074b1  # v1.8.0

				        with:

				          endpoint: ${{ vars.HETZNER_CACHE_REGION }}.${{ vars.HETZNER_CACHE_ENDPOINT }}

				          bucket: ${{ vars.HETZNER_CACHE_BUCKET }}

				          accessKey: ${{ secrets.HETZNER_CACHE_ACCESS_KEY }}

				          secretKey: ${{ secrets.HETZNER_CACHE_SECRET_KEY }}

				          use-fallback: false

				          path: pg_install/v17

				          key: v1-${{ runner.os }}-${{ runner.arch }}-${{ inputs.build-type }}-pg-${{ steps.pg_v17_rev.outputs.pg_rev }}-bookworm-${{ hashFiles('Makefile', 'build-tools.Dockerfile') }}

				      - name: Build postgres v15

				        if: steps.cache_pg_15.outputs.cache-hit != 'true'

				        run: mold -run make postgres-v15 -j$(nproc)

				      - name: Build postgres v16

				        if: steps.cache_pg_16.outputs.cache-hit != 'true'

				        run: mold -run make postgres-v16 -j$(nproc)

				      - name: Build neon extensions

				        run: mold -run make neon-pg-ext -j$(nproc)

				      - name: Build all

				        # Note: the Makefile picks up BUILD_TYPE and CARGO_PROFILE from the env variables

				        run: mold -run make ${make_vars} all -j$(nproc) CARGO_BUILD_FLAGS="$CARGO_FLAGS"

				      - name: Build walproposer-lib

				        run: mold -run make walproposer-lib -j$(nproc)

				        run: mold -run make ${make_vars} walproposer-lib -j$(nproc)

				      - name: Run cargo build

				      - name: Build unit tests

				        if: inputs.sanitizers != 'enabled'

				        run: |

				          PQ_LIB_DIR=$(pwd)/pg_install/v16/lib

				          export PQ_LIB_DIR

				          ${cov_prefix} mold -run cargo build $CARGO_FLAGS $CARGO_FEATURES --bins --tests

				          export ASAN_OPTIONS=detect_leaks=0

				          ${cov_prefix} mold -run cargo build $CARGO_FLAGS $CARGO_PROFILE --tests

				      # Do install *before* running rust tests because they might recompile the

				      # binaries with different features/flags.

				      - name: Install rust binaries

				        env:

				          ARCH: ${{ inputs.arch }}

				          SANITIZERS: ${{ inputs.sanitizers }}

				        run: |

				          # Install target binaries

				          mkdir -p /tmp/neon/bin/

				          binaries=$(

				            ${cov_prefix} cargo metadata $CARGO_FEATURES --format-version=1 --no-deps |

				            ${cov_prefix} cargo metadata $CARGO_FLAGS --format-version=1 --no-deps |

				            jq -r '.packages[].targets[] | select(.kind | index("bin")) | .name'

				          )

				          for bin in $binaries; do

				@@ -179,14 +221,14 @@ jobs:

				          done

				          # Install test executables and write list of all binaries (for code coverage)

				          if [[ $BUILD_TYPE == "debug" && $ARCH == 'x64' ]]; then

				          if [[ $BUILD_TYPE == "debug" && $ARCH == 'x64' && $SANITIZERS != 'enabled' ]]; then

				            # Keep bloated coverage data files away from the rest of the artifact

				            mkdir -p /tmp/coverage/

				            mkdir -p /tmp/neon/test_bin/

				            test_exe_paths=$(

				              ${cov_prefix} cargo test $CARGO_FLAGS $CARGO_FEATURES --message-format=json --no-run |

				              ${cov_prefix} cargo test $CARGO_FLAGS $CARGO_PROFILE --message-format=json --no-run |

				              jq -r '.executable | select(. != null)'

				            )

				            for bin in $test_exe_paths; do

				@@ -204,33 +246,41 @@ jobs:

				            done

				          fi

				      - name: Configure AWS credentials

				        uses: aws-actions/configure-aws-credentials@e3dd6a429d7300a6a4c196c26e071d42e0343502 # v4.0.2

				        with:

				          aws-region: eu-central-1

				          role-to-assume: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				          role-duration-seconds: 18000 # 5 hours

				      - name: Run rust tests

				        if: ${{ inputs.sanitizers != 'enabled' }}

				        env:

				          NEXTEST_RETRIES: 3

				        run: |

				          PQ_LIB_DIR=$(pwd)/pg_install/v16/lib

				          export PQ_LIB_DIR

				          LD_LIBRARY_PATH=$(pwd)/pg_install/v16/lib

				          LD_LIBRARY_PATH=$(pwd)/pg_install/v17/lib

				          export LD_LIBRARY_PATH

				          #nextest does not yet support running doctests

				          ${cov_prefix} cargo test --doc $CARGO_FLAGS $CARGO_FEATURES

				          ${cov_prefix} cargo test --doc $CARGO_FLAGS $CARGO_PROFILE

				          # run all non-pageserver tests

				          ${cov_prefix} cargo nextest run $CARGO_FLAGS $CARGO_FEATURES -E '!package(pageserver)'

				          ${cov_prefix} cargo nextest run $CARGO_FLAGS $CARGO_PROFILE -E '!package(pageserver)'

				          # run pageserver tests with different settings

				          for io_engine in std-fs tokio-epoll-uring ; do

				            for io_buffer_alignment in 0 1 512 ; do

				              NEON_PAGESERVER_UNIT_TEST_VIRTUAL_FILE_IOENGINE=$io_engine NEON_PAGESERVER_UNIT_TEST_IO_BUFFER_ALIGNMENT=$io_buffer_alignment ${cov_prefix} cargo nextest run $CARGO_FLAGS $CARGO_FEATURES  -E 'package(pageserver)'

				            done

				          done

				          # run pageserver tests

				          # (When developing new pageserver features gated by config fields, we commonly make the rust

				          # unit tests sensitive to an environment variable NEON_PAGESERVER_UNIT_TEST_FEATURENAME.

				          # Then run the nextest invocation below for all relevant combinations. Singling out the

				          # pageserver tests from non-pageserver tests cuts down the time it takes for this CI step.)

				          NEON_PAGESERVER_UNIT_TEST_VIRTUAL_FILE_IOENGINE=tokio-epoll-uring  \

				          ${cov_prefix} \

				          cargo nextest run $CARGO_FLAGS $CARGO_PROFILE  -E 'package(pageserver)'

				          # Run separate tests for real S3

				          export ENABLE_REAL_S3_REMOTE_STORAGE=nonempty

				          export REMOTE_STORAGE_S3_BUCKET=neon-github-ci-tests

				          export REMOTE_STORAGE_S3_REGION=eu-central-1

				          ${cov_prefix} cargo nextest run $CARGO_FLAGS $CARGO_FEATURES -E 'package(remote_storage)' -E 'test(test_real_s3)'

				          ${cov_prefix} cargo nextest run $CARGO_FLAGS $CARGO_PROFILE -E 'package(remote_storage)' -E 'test(test_real_s3)'

				          # Run separate tests for real Azure Blob Storage

				          # XXX: replace region with `eu-central-1`-like region

				@@ -239,16 +289,46 @@ jobs:

				          export AZURE_STORAGE_ACCESS_KEY="${{ secrets.AZURE_STORAGE_ACCESS_KEY_DEV }}"

				          export REMOTE_STORAGE_AZURE_CONTAINER="${{ vars.REMOTE_STORAGE_AZURE_CONTAINER }}"

				          export REMOTE_STORAGE_AZURE_REGION="${{ vars.REMOTE_STORAGE_AZURE_REGION }}"

				          ${cov_prefix} cargo nextest run $CARGO_FLAGS $CARGO_FEATURES -E 'package(remote_storage)' -E 'test(test_real_azure)'

				          ${cov_prefix} cargo nextest run $CARGO_FLAGS $CARGO_PROFILE -E 'package(remote_storage)' -E 'test(test_real_azure)'

				      - name: Install postgres binaries

				        run: cp -a pg_install /tmp/neon/pg_install

				        run: |

				          # Use tar to copy files matching the pattern, preserving the paths in the destination

				          tar c \

				            pg_install/v* \

				            build/*/src/test/regress/*.so \

				            build/*/src/test/regress/pg_regress \

				            build/*/src/test/isolation/isolationtester \

				            build/*/src/test/isolation/pg_isolation_regress \

				            | tar  x -C /tmp/neon

				      - name: Upload Neon artifact

				        uses: ./.github/actions/upload

				        with:

				          name: neon-${{ runner.os }}-${{ runner.arch }}-${{ inputs.build-type }}-artifact

				          name: neon-${{ runner.os }}-${{ runner.arch }}-${{ inputs.build-type }}${{ inputs.sanitizers == 'enabled' && '-sanitized' || '' }}-artifact

				          path: /tmp/neon

				          aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				      - name: Check diesel schema

				        if: inputs.build-type == 'release' && inputs.arch == 'x64'

				        env:

				          DATABASE_URL: postgresql://localhost:1235/storage_controller

				          POSTGRES_DISTRIB_DIR: /tmp/neon/pg_install

				        run: |

				          export ASAN_OPTIONS=detect_leaks=0

				          /tmp/neon/bin/neon_local init

				          /tmp/neon/bin/neon_local storage_controller start

				          diesel print-schema > storage_controller/src/schema.rs

				          if [ -n "$(git diff storage_controller/src/schema.rs)" ]; then

				            echo >&2 "Uncommitted changes in diesel schema"

				            git diff .

				            exit 1

				          fi

				          /tmp/neon/bin/neon_local storage_controller stop

				      # XXX: keep this after the binaries.list is formed, so the coverage can properly work later

				      - name: Merge and upload coverage data

				@@ -258,27 +338,36 @@ jobs:

				  regress-tests:

				    # Don't run regression tests on debug arm64 builds

				    if: inputs.build-type != 'debug' || inputs.arch != 'arm64'

				    permissions:

				      id-token: write # aws-actions/configure-aws-credentials

				      contents: read

				      statuses: write

				    needs: [ build-neon ]

				    runs-on: ${{ fromJson(format('["self-hosted", "{0}"]', inputs.arch == 'arm64' && 'large-arm64' || 'large')) }}

				    runs-on: ${{ fromJSON(format('["self-hosted", "{0}"]', inputs.arch == 'arm64' && 'large-arm64' || 'large-metal')) }}

				    container:

				      image: ${{ inputs.build-tools-image }}

				      credentials:

				        username: ${{ secrets.NEON_DOCKERHUB_USERNAME }}

				        password: ${{ secrets.NEON_DOCKERHUB_PASSWORD }}

				        username: ${{ github.actor }}

				        password: ${{ secrets.GITHUB_TOKEN }}

				      # for changed limits, see comments on `options:` earlier in this file

				      options: --init --shm-size=512mb --ulimit memlock=67108864:67108864

				    strategy:

				      fail-fast: false

				      matrix:

				        pg_version: ${{ fromJson(inputs.pg-versions) }}

				      matrix: ${{ fromJSON(format('{{"include":{0}}}', inputs.test-cfg)) }}

				    steps:

				      - uses: actions/checkout@v4

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          submodules: true

				      - name: Pytest regression tests

				        continue-on-error: ${{ matrix.lfc_state == 'with-lfc' && inputs.build-type == 'debug' }}

				        uses: ./.github/actions/run-python-test-set

				        timeout-minutes: 60

				        timeout-minutes: ${{ (inputs.build-type == 'release' && inputs.sanitizers != 'enabled') && 75 || 180 }}

				        with:

				          build_type: ${{ inputs.build-type }}

				          test_selection: regress

				@@ -286,13 +375,21 @@ jobs:

				          run_with_real_s3: true

				          real_s3_bucket: neon-github-ci-tests

				          real_s3_region: eu-central-1

				          rerun_flaky: true

				          rerun_failed: ${{ inputs.rerun-failed }}

				          pg_version: ${{ matrix.pg_version }}

				          sanitizers: ${{ inputs.sanitizers }}

				          aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				          # `--session-timeout` is equal to (timeout-minutes - 10 minutes) * 60 seconds.

				          # Attempt to stop tests gracefully to generate test reports

				          # until they are forcibly stopped by the stricter `timeout-minutes` limit.

				          extra_params: --session-timeout=${{ (inputs.build-type == 'release' && inputs.sanitizers != 'enabled') && 3000 || 10200 }} --count=${{ inputs.test-run-count }}

				                        ${{ inputs.test-selection != '' && format('-k "{0}"', inputs.test-selection) || '' }}

				        env:

				          TEST_RESULT_CONNSTR: ${{ secrets.REGRESS_TEST_RESULT_CONNSTR_NEW }}

				          CHECK_ONDISK_DATA_COMPATIBILITY: nonempty

				          BUILD_TAG: ${{ inputs.build-tag }}

				          PAGESERVER_VIRTUAL_FILE_IO_ENGINE: tokio-epoll-uring

				          USE_LFC: ${{ matrix.lfc_state == 'with-lfc' && 'true' || 'false' }}

				      # Temporarily disable this step until we figure out why it's so flaky

				      # Ref https://github.com/neondatabase/neon/issues/4540
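
The comment on `extra_params` above states that `--session-timeout` equals (timeout-minutes - 10 minutes) * 60 seconds. A minimal sketch of that arithmetic follows; the function name and the `grace_minutes` parameter are illustrative and do not appear in the workflow. By this formula the 180-minute limit gives 10200, which matches the diff, while the 75-minute limit would give 3900. The 3000 used in the workflow corresponds to the earlier 60-minute limit; the diff does not say whether that is intentional.

```python
def session_timeout_seconds(timeout_minutes: int, grace_minutes: int = 10) -> int:
    """Seconds the test session may run before it is asked to stop gracefully.

    Hypothetical helper: it only restates the formula from the workflow comment,
    (timeout-minutes - 10 minutes) * 60 seconds.
    """
    if timeout_minutes <= grace_minutes:
        raise ValueError("timeout-minutes must exceed the grace period")
    return (timeout_minutes - grace_minutes) * 60


if __name__ == "__main__":
    # Step limits that appear in the diff: 60 (old), 75 and 180 (new).
    for minutes in (60, 75, 180):
        print(minutes, session_timeout_seconds(minutes))
```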

									
.github/workflows/_check-codestyle-python.yml (new file, 55 lines changed)

				@@ -0,0 +1,55 @@

				name: Check Codestyle Python

				on:

				  workflow_call:

				    inputs:

				      build-tools-image:

				        description: 'build-tools image'

				        required: true

				        type: string

				defaults:

				  run:

				    shell: bash -euxo pipefail {0}

				permissions:

				  contents: read

				jobs:

				  check-codestyle-python:

				    runs-on: [ self-hosted, small ]

				    permissions:

				      packages: read

				    container:

				      image: ${{ inputs.build-tools-image }}

				      credentials:

				        username: ${{ github.actor }}

				        password: ${{ secrets.GITHUB_TOKEN }}

				      options: --init

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				      - name: Cache poetry deps

				        uses: tespkg/actions-cache@b7bf5fcc2f98a52ac6080eb0fd282c2f752074b1  # v1.8.0

				        with:

				          endpoint: ${{ vars.HETZNER_CACHE_REGION }}.${{ vars.HETZNER_CACHE_ENDPOINT }}

				          bucket: ${{ vars.HETZNER_CACHE_BUCKET }}

				          accessKey: ${{ secrets.HETZNER_CACHE_ACCESS_KEY }}

				          secretKey: ${{ secrets.HETZNER_CACHE_SECRET_KEY }}

				          use-fallback: false

				          path: ~/.cache/pypoetry/virtualenvs

				          key: v2-${{ runner.os }}-${{ runner.arch }}-python-deps-bookworm-${{ hashFiles('poetry.lock') }}

				      - run: ./scripts/pysync

				      - run: poetry run ruff check .

				      - run: poetry run ruff format --check .

				      - run: poetry run mypy .

									
.github/workflows/_check-codestyle-rust.yml (new file, 102 lines changed)

				@@ -0,0 +1,102 @@

				name: Check Codestyle Rust

				on:

				  workflow_call:

				    inputs:

				      build-tools-image:

				        description: "build-tools image"

				        required: true

				        type: string

				      archs:

				        description: "Json array of architectures to run on"

				        type: string

				defaults:

				  run:

				    shell: bash -euxo pipefail {0}

				# No permission for GITHUB_TOKEN by default; the **minimal required** set of permissions should be granted in each job.

				permissions: {}

				jobs:

				  check-codestyle-rust:

				    strategy:

				      matrix:

				        arch: ${{ fromJSON(inputs.archs) }}

				    runs-on: ${{ fromJSON(format('["self-hosted", "{0}"]', matrix.arch == 'arm64' && 'small-arm64' || 'small')) }}

				    permissions:

				      packages: read

				    container:

				      image: ${{ inputs.build-tools-image }}

				      credentials:

				        username: ${{ github.actor }}

				        password: ${{ secrets.GITHUB_TOKEN }}

				      options: --init

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - name: Checkout

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          submodules: true

				      - name: Cache cargo deps

				        uses: tespkg/actions-cache@b7bf5fcc2f98a52ac6080eb0fd282c2f752074b1  # v1.8.0

				        with:

				          endpoint: ${{ vars.HETZNER_CACHE_REGION }}.${{ vars.HETZNER_CACHE_ENDPOINT }}

				          bucket: ${{ vars.HETZNER_CACHE_BUCKET }}

				          accessKey: ${{ secrets.HETZNER_CACHE_ACCESS_KEY }}

				          secretKey: ${{ secrets.HETZNER_CACHE_SECRET_KEY }}

				          use-fallback: false

				          path: |

				            ~/.cargo/registry

				            !~/.cargo/registry/src

				            ~/.cargo/git

				            target

				          key: v1-${{ runner.os }}-${{ runner.arch }}-cargo-${{ hashFiles('./Cargo.lock') }}-${{ hashFiles('./rust-toolchain.toml') }}-rust

				      # Some of our rust modules use FFI and need those to be checked

				      - name: Get postgres headers

				        run: make postgres-headers -j$(nproc)

				      # cargo hack runs the given cargo subcommand (clippy in this case) for all feature combinations.

				      # This will catch compiler & clippy warnings in all feature combinations.

				      # TODO: use cargo hack for build and test as well, but, that's quite expensive.

				      # NB: keep clippy args in sync with ./run_clippy.sh

				      #

				      # The only difference between "clippy --debug" and "clippy --release" is that in --release mode,

				      # #[cfg(debug_assertions)] blocks are not built. It's not worth building everything for second

				      # time just for that, so skip "clippy --release".

				      - run: |

				          CLIPPY_COMMON_ARGS="$( source .neon_clippy_args; echo "$CLIPPY_COMMON_ARGS")"

				          if [ "$CLIPPY_COMMON_ARGS" = "" ]; then

				            echo "No clippy args found in .neon_clippy_args"

				            exit 1

				          fi

				          echo "CLIPPY_COMMON_ARGS=${CLIPPY_COMMON_ARGS}" >> $GITHUB_ENV

				      - name: Run cargo clippy (debug)

				        run: cargo hack --features default --ignore-unknown-features --feature-powerset clippy $CLIPPY_COMMON_ARGS

				      - name: Check documentation generation

				        run: cargo doc --workspace --no-deps --document-private-items

				        env:

				          RUSTDOCFLAGS: "-Dwarnings -Arustdoc::private_intra_doc_links"

				      # Use `${{ !cancelled() }}` to run quick tests after the longer clippy run

				      - name: Check formatting

				        if: ${{ !cancelled() }}

				        run: cargo fmt --all -- --check

				      # https://github.com/facebookincubator/cargo-guppy/tree/bec4e0eb29dcd1faac70b1b5360267fc02bf830e/tools/cargo-hakari#2-keep-the-workspace-hack-up-to-date-in-ci

				      - name: Check rust dependencies

				        if: ${{ !cancelled() }}

				        run: |

				          cargo hakari generate --diff  # workspace-hack Cargo.toml is up-to-date

				          cargo hakari manage-deps --dry-run  # all workspace crates depend on workspace-hack

									
.github/workflows/_meta.yml (new file, 169 lines changed)

				@@ -0,0 +1,169 @@

				name: Generate run metadata

				on:

				  workflow_call:

				    inputs:

				      github-event-name:

				        type: string

				        required: true

				      github-event-json:

				        type: string

				        required: true

				    outputs:

				      build-tag:

				        description: "Tag for the current workflow run"

				        value: ${{ jobs.tags.outputs.build-tag }}

				      release-tag:

				        description: "Tag for the release if this is an RC PR run"

				        value: ${{ jobs.tags.outputs.release-tag }}

				      previous-storage-release:

				        description: "Tag of the last storage release"

				        value: ${{ jobs.tags.outputs.storage }}

				      previous-proxy-release:

				        description: "Tag of the last proxy release"

				        value: ${{ jobs.tags.outputs.proxy }}

				      previous-compute-release:

				        description: "Tag of the last compute release"

				        value: ${{ jobs.tags.outputs.compute }}

				      run-kind:

				        description: "The kind of run we're currently in. Will be one of `push-main`, `storage-release`, `compute-release`, `proxy-release`, `storage-rc-pr`, `compute-rc-pr`,  `proxy-rc-pr`, `pr`, or `workflow-dispatch`"

				        value: ${{ jobs.tags.outputs.run-kind }}

				      release-pr-run-id:

				        description: "Only available if `run-kind in [storage-release, proxy-release, compute-release]`. Contains the run ID of the `Build and Test` workflow, assuming one with the current commit can be found."

				        value: ${{ jobs.tags.outputs.release-pr-run-id }}

				      sha:

				        description: "github.event.pull_request.head.sha on release PRs, github.sha otherwise"

				        value: ${{ jobs.tags.outputs.sha }}

				permissions: {}

				defaults:

				  run:

				    shell: bash -euo pipefail {0}

				jobs:

				  tags:

				    runs-on: ubuntu-22.04

				    outputs:

				      build-tag: ${{ steps.build-tag.outputs.build-tag }}

				      release-tag: ${{ steps.build-tag.outputs.release-tag }}

				      compute: ${{ steps.previous-releases.outputs.compute }}

				      proxy: ${{ steps.previous-releases.outputs.proxy }}

				      storage: ${{ steps.previous-releases.outputs.storage }}

				      run-kind: ${{ steps.run-kind.outputs.run-kind }}

				      release-pr-run-id: ${{ steps.release-pr-run-id.outputs.release-pr-run-id }}

				      sha: ${{ steps.sha.outputs.sha }}

				    permissions:

				      contents: read

				    steps:

				      # Need `fetch-depth: 0` to count the number of commits in the branch

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - name: Get run kind

				        id: run-kind

				        env:

				          RUN_KIND: >-

				            ${{

				              false

				              || (inputs.github-event-name == 'push'         && github.ref_name == 'main')            && 'push-main'

				              || (inputs.github-event-name == 'push'         && github.ref_name == 'release')         && 'storage-release'

				              || (inputs.github-event-name == 'push'         && github.ref_name == 'release-compute') && 'compute-release'

				              || (inputs.github-event-name == 'push'         && github.ref_name == 'release-proxy')   && 'proxy-release'

				              || (inputs.github-event-name == 'pull_request' && github.base_ref == 'release')         && 'storage-rc-pr'

				              || (inputs.github-event-name == 'pull_request' && github.base_ref == 'release-compute') && 'compute-rc-pr'

				              || (inputs.github-event-name == 'pull_request' && github.base_ref == 'release-proxy')   && 'proxy-rc-pr'

				              || (inputs.github-event-name == 'pull_request')                                         && 'pr'

				              || (inputs.github-event-name == 'workflow_dispatch')                                    && 'workflow-dispatch'

				              || 'unknown'

				            }}

				        run: |

				          echo "run-kind=$RUN_KIND" | tee -a $GITHUB_OUTPUT

				      - name: Get the right SHA

				        id: sha

				        env:

				          SHA: >

				            ${{

				              contains(fromJSON('["storage-rc-pr", "proxy-rc-pr", "compute-rc-pr"]'), steps.run-kind.outputs.run-kind)

				              && fromJSON(inputs.github-event-json).pull_request.head.sha

				              || github.sha

				            }}

				        run: |

				          echo "sha=$SHA" | tee -a $GITHUB_OUTPUT

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				          ref: ${{ steps.sha.outputs.sha }}

				      - name: Get build tag

				        id: build-tag

				        env:

				          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}

				          CURRENT_BRANCH: ${{ github.head_ref || github.ref_name }}

				          CURRENT_SHA: ${{ github.event.pull_request.head.sha || github.sha }}

				          RUN_KIND: ${{ steps.run-kind.outputs.run-kind }}

				        run: |

				          case $RUN_KIND in

				          push-main)

				            echo "build-tag=$(git rev-list --count HEAD)" | tee -a $GITHUB_OUTPUT

				            ;;

				          storage-release)

				            echo "build-tag=release-$(git rev-list --count HEAD)" | tee -a $GITHUB_OUTPUT

				            ;;

				          proxy-release)

				            echo "build-tag=release-proxy-$(git rev-list --count HEAD)" | tee -a $GITHUB_OUTPUT

				            ;;

				          compute-release)

				            echo "build-tag=release-compute-$(git rev-list --count HEAD)" | tee -a $GITHUB_OUTPUT

				            ;;

				          pr|storage-rc-pr|compute-rc-pr|proxy-rc-pr)

				            BUILD_AND_TEST_RUN_ID=$(gh api --paginate \

				              -H "Accept: application/vnd.github+json" \

				              -H "X-GitHub-Api-Version: 2022-11-28" \

				              "/repos/${GITHUB_REPOSITORY}/actions/runs?head_sha=${CURRENT_SHA}&branch=${CURRENT_BRANCH}" \

				              | jq '[.workflow_runs[] | select(.name == "Build and Test")][0].id // ("Error: No matching workflow run found." | halt_error(1))')

				            echo "build-tag=$BUILD_AND_TEST_RUN_ID" | tee -a $GITHUB_OUTPUT

				            case $RUN_KIND in

				            storage-rc-pr)

				              echo "release-tag=release-$(git rev-list --count HEAD)" | tee -a $GITHUB_OUTPUT

				              ;;

				            proxy-rc-pr)

				              echo "release-tag=release-proxy-$(git rev-list --count HEAD)" | tee -a $GITHUB_OUTPUT

				              ;;

				            compute-rc-pr)

				              echo "release-tag=release-compute-$(git rev-list --count HEAD)" | tee -a $GITHUB_OUTPUT

				              ;;

				            esac

				            ;;

				          workflow-dispatch)

				            echo "build-tag=$GITHUB_RUN_ID" | tee -a $GITHUB_OUTPUT

				            ;;

				          *)

				            echo "Unexpected RUN_KIND ('${RUN_KIND}'), failing to assign build-tag!"

				            exit 1

				          esac

				      - name: Get the previous release-tags

				        id: previous-releases

				        env:

				          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}

				        run: |

				          gh api --paginate \

				            -H "Accept: application/vnd.github+json" \

				            -H "X-GitHub-Api-Version: 2022-11-28" \

				            "/repos/${GITHUB_REPOSITORY}/releases" \

				          | jq -f .github/scripts/previous-releases.jq -r \

				          | tee -a "${GITHUB_OUTPUT}"

				      - name: Get the release PR run ID

				        id: release-pr-run-id

				        if: ${{ contains(fromJSON('["storage-release", "compute-release", "proxy-release"]'), steps.run-kind.outputs.run-kind) }}

				        env:

				          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}

				          CURRENT_SHA: ${{ github.sha }}

				        run: |

				          RELEASE_PR_RUN_ID=$(gh api "/repos/${GITHUB_REPOSITORY}/actions/runs?head_sha=$CURRENT_SHA" | jq '[.workflow_runs[] | select(.name == "Build and Test") | select(.head_branch | test("^rc/release.*$"; "s"))] | first | .id // ("Failed to find Build and Test run from  RC PR!" | halt_error(1))')

				          echo "release-pr-run-id=$RELEASE_PR_RUN_ID" | tee -a $GITHUB_OUTPUT
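
The `RUN_KIND` expression in the `Get run kind` step is a chain of `&&`/`||` clauses evaluated top to bottom. The following is a rough Python restatement of that decision table, intended only as a reading aid; the function name and the `RELEASE_BRANCHES` table are assumptions made for the sketch, while the branch names and result strings are taken from the workflow.

```python
# Release branch -> component, as used by the push and pull_request clauses.
RELEASE_BRANCHES = {
    "release": "storage",
    "release-compute": "compute",
    "release-proxy": "proxy",
}


def run_kind(event_name: str, ref_name: str = "", base_ref: str = "") -> str:
    """Restate the RUN_KIND expression; hypothetical helper, not part of the repo."""
    if event_name == "push":
        if ref_name == "main":
            return "push-main"
        if ref_name in RELEASE_BRANCHES:
            return f"{RELEASE_BRANCHES[ref_name]}-release"
    if event_name == "pull_request":
        # PRs targeting a release branch are release-candidate PRs.
        if base_ref in RELEASE_BRANCHES:
            return f"{RELEASE_BRANCHES[base_ref]}-rc-pr"
        return "pr"
    if event_name == "workflow_dispatch":
        return "workflow-dispatch"
    # Anything else, including a push to an unlisted branch, is 'unknown'.
    return "unknown"


if __name__ == "__main__":
    print(run_kind("push", ref_name="release-proxy"))
    print(run_kind("pull_request", base_ref="release"))
    print(run_kind("pull_request", base_ref="main"))
```

A result of `unknown` reaches the `*)` branch of the `case` statement in the `Get build tag` step, which fails the job.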

									
.github/workflows/_push-to-acr.yml (deleted file, 56 lines changed)

				@@ -1,56 +0,0 @@

				name: Push images to ACR

				on:

				  workflow_call:

				    inputs:

				      client_id:

				        description: Client ID of Azure managed identity or Entra app

				        required: true

				        type: string

				      image_tag:

				        description: Tag for the container image

				        required: true

				        type: string

				      images:

				        description: Images to push

				        required: true

				        type: string

				      registry_name:

				        description: Name of the container registry

				        required: true

				        type: string

				      subscription_id:

				        description: Azure subscription ID

				        required: true

				        type: string

				      tenant_id:

				        description: Azure tenant ID

				        required: true

				        type: string

				jobs:

				  push-to-acr:

				    runs-on: ubuntu-22.04

				    permissions:

				      contents: read  # This is required for actions/checkout

				      id-token: write # This is required for Azure Login to work.

				    steps:

				      - name: Azure login

				        uses: azure/login@6c251865b4e6290e7b78be643ea2d005bc51f69a  # @v2.1.1

				        with:

				          client-id: ${{ inputs.client_id }}

				          subscription-id: ${{ inputs.subscription_id }}

				          tenant-id: ${{ inputs.tenant_id }}

				      - name: Login to ACR

				        run: |

				          az acr login --name=${{ inputs.registry_name }}

				      - name: Copy docker images to ACR ${{ inputs.registry_name }}

				        run: |

				          images='${{ inputs.images }}'

				          for image in ${images}; do

				            docker buildx imagetools create \

				              -t ${{ inputs.registry_name }}.azurecr.io/neondatabase/${image}:${{ inputs.image_tag }} \

				                                        neondatabase/${image}:${{ inputs.image_tag }}

				          done

									
.github/workflows/_push-to-container-registry.yml (new file, 128 lines changed)

				@@ -0,0 +1,128 @@

				name: Push images to Container Registry

				on:

				  workflow_call:

				    inputs:

				      # Example: {"docker.io/neondatabase/neon:13196061314":["${{ vars.NEON_DEV_AWS_ACCOUNT_ID }}.dkr.ecr.${{ vars.AWS_ECR_REGION }}.amazonaws.com/neon:13196061314","neoneastus2.azurecr.io/neondatabase/neon:13196061314"]}

				      image-map:

				        description: JSON map of images, mapping from a source image to an array of target images that should be pushed.

				        required: true

				        type: string

				      aws-region:

				        description: AWS region to log in to. Required when pushing to ECR.

				        required: false

				        type: string

				      aws-account-id:

				        description: AWS account ID to log in to for pushing to ECR. Required when pushing to ECR.

				        required: false

				        type: string

				      aws-role-to-assume:

				        description: AWS role to assume for pushing to ECR. Required when pushing to ECR.

				        required: false

				        type: string

				      azure-client-id:

				        description: Client ID of Azure managed identity or Entra app. Required when pushing to ACR.

				        required: false

				        type: string

				      azure-subscription-id:

				        description: Azure subscription ID. Required when pushing to ACR.

				        required: false

				        type: string

				      azure-tenant-id:

				        description: Azure tenant ID. Required when pushing to ACR.

				        required: false

				        type: string

				      acr-registry-name:

				        description: ACR registry name. Required when pushing to ACR.

				        required: false

				        type: string

				permissions: {}

				defaults:

				  run:

				    shell: bash -euo pipefail {0}

				jobs:

				  push-to-container-registry:

				    runs-on: ubuntu-22.04

				    permissions:

				      id-token: write  # Required for aws/azure login

				      packages: write  # required for pushing to GHCR

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          sparse-checkout: .github/scripts/push_with_image_map.py

				          sparse-checkout-cone-mode: false

				      - name: Print image-map

				        run: echo '${{ inputs.image-map }}' | jq

				      - name: Configure AWS credentials

				        if: contains(inputs.image-map, 'amazonaws.com/')

				        uses: aws-actions/configure-aws-credentials@e3dd6a429d7300a6a4c196c26e071d42e0343502 # v4.0.2

				        with:

				          aws-region: "${{ inputs.aws-region }}"

				          role-to-assume: "arn:aws:iam::${{ inputs.aws-account-id }}:role/${{ inputs.aws-role-to-assume }}"

				          role-duration-seconds: 3600

				      - name: Login to ECR

				        if: contains(inputs.image-map, 'amazonaws.com/')

				        uses: aws-actions/amazon-ecr-login@062b18b96a7aff071d4dc91bc00c4c1a7945b076 # v2.0.1

				        with:

				          registries: "${{ inputs.aws-account-id }}"

				      - name: Configure Azure credentials

				        if: contains(inputs.image-map, 'azurecr.io/')

				        uses: azure/login@6c251865b4e6290e7b78be643ea2d005bc51f69a  # @v2.1.1

				        with:

				          client-id: ${{ inputs.azure-client-id }}

				          subscription-id: ${{ inputs.azure-subscription-id }}

				          tenant-id: ${{ inputs.azure-tenant-id }}

				      - name: Login to ACR

				        if: contains(inputs.image-map, 'azurecr.io/')

				        run: |

				          az acr login --name=${{ inputs.acr-registry-name }}

				      - name: Login to GHCR

				        if: contains(inputs.image-map, 'ghcr.io/')

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: Log in to Docker Hub

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          username: ${{ secrets.NEON_DOCKERHUB_USERNAME }}

				          password: ${{ secrets.NEON_DOCKERHUB_PASSWORD }}

				      - name: Copy docker images to target registries

				        id: push

				        run: python3 .github/scripts/push_with_image_map.py

				        env:

				          IMAGE_MAP: ${{ inputs.image-map }}

				      - name: Notify Slack if container image pushing fails

				        if: steps.push.outputs.push_failures || failure()

				        uses: slackapi/slack-github-action@485a9d42d3a73031f12ec201c457e2162c45d02d # v2.0.0

				        with:

				          method: chat.postMessage

				          token: ${{ secrets.SLACK_BOT_TOKEN }}

				          payload: |

				            channel: ${{ vars.SLACK_ON_CALL_DEVPROD_STREAM }}

				            text: >

				              *Container image pushing ${{

				                steps.push.outcome == 'failure' && 'failed completely' || 'succeeded with some retries'

				              }}* in

				              <${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}|GitHub Run>

				              ${{ steps.push.outputs.push_failures && format(

				                '*Failed targets:*\n• {0}', join(fromJson(steps.push.outputs.push_failures), '\n• ')

				              ) || '' }}
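
The `image-map` input maps each source image to a list of target images, and the login steps are gated on substring checks such as `contains(inputs.image-map, 'azurecr.io/')`. The script `.github/scripts/push_with_image_map.py` is not part of this diff, so the sketch below is not its implementation. It only illustrates the data shape, assuming one `docker buildx imagetools create -t <target> <source>` call per target, which is the command the removed `_push-to-acr.yml` used. The function names, the ECR account ID and the region are placeholders.

```python
import json
import shlex


def copy_commands(image_map_json: str) -> list:
    """Expand {"source": ["target", ...]} into one copy command per target (assumed command)."""
    image_map = json.loads(image_map_json)
    commands = []
    for source, targets in image_map.items():
        for target in targets:
            argv = ["docker", "buildx", "imagetools", "create", "-t", target, source]
            commands.append(shlex.join(argv))
    return commands


def registries_needing_login(image_map_json: str) -> list:
    """Mirror the `contains(inputs.image-map, ...)` gates on the raw JSON string."""
    markers = {"ecr": "amazonaws.com/", "acr": "azurecr.io/", "ghcr": "ghcr.io/"}
    # GitHub's contains() ignores case for strings, hence the lower().
    haystack = image_map_json.lower()
    return [name for name, marker in markers.items() if marker in haystack]


if __name__ == "__main__":
    # Placeholder account ID and region; the ACR target comes from the workflow's example.
    example = json.dumps({
        "docker.io/neondatabase/neon:13196061314": [
            "123456789012.dkr.ecr.eu-central-1.amazonaws.com/neon:13196061314",
            "neoneastus2.azurecr.io/neondatabase/neon:13196061314",
        ]
    })
    for command in copy_commands(example):
        print(command)
    print(registries_needing_login(example))
```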

									
.github/workflows/actionlint.yml (11 lines changed)

				@@ -26,14 +26,19 @@ jobs:

				    needs: [ check-permissions ]

				    runs-on: ubuntu-22.04

				    steps:

				      - uses: actions/checkout@v4

				      - uses: reviewdog/action-actionlint@v1

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				      - uses: reviewdog/action-actionlint@a5524e1c19e62881d79c1f1b9b6f09f16356e281 # v1.65.2

				        env:

				          # SC2046 - Quote this to prevent word splitting. - https://www.shellcheck.net/wiki/SC2046

				          # SC2086 - Double quote to prevent globbing and word splitting. - https://www.shellcheck.net/wiki/SC2086

				          SHELLCHECK_OPTS: --exclude=SC2046,SC2086

				        with:

				          fail_on_error: true

				          fail_level: error

				          filter_mode: nofilter

				          level: error

									
.github/workflows/approved-for-ci-run.yml (29 lines changed)

				@@ -47,6 +47,11 @@ jobs:

				    runs-on: ubuntu-22.04

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - run: gh pr --repo "${GITHUB_REPOSITORY}" edit "${PR_NUMBER}" --remove-label "approved-for-ci-run"

				  create-or-update-pr-for-ci-run:

				@@ -63,13 +68,18 @@ jobs:

				    runs-on: ubuntu-22.04

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - run: gh pr --repo "${GITHUB_REPOSITORY}" edit "${PR_NUMBER}" --remove-label "approved-for-ci-run"

				      - uses: actions/checkout@v4

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: main

				          ref: ${{ github.event.pull_request.head.sha }}

				          token: ${{ secrets.CI_ACCESS_TOKEN }}

				      - name: Look for existing PR

				        id: get-pr

				        env:

				@@ -77,7 +87,7 @@ jobs:

				        run: |

				          ALREADY_CREATED="$(gh pr --repo ${GITHUB_REPOSITORY} list --head ${BRANCH} --base main --json number --jq '.[].number')"

				          echo "ALREADY_CREATED=${ALREADY_CREATED}" >> ${GITHUB_OUTPUT}

				      - name: Get changed labels

				        id: get-labels

				        if: steps.get-pr.outputs.ALREADY_CREATED != ''

				@@ -94,8 +104,6 @@ jobs:

				          echo "LABELS_TO_ADD=${LABELS_TO_ADD}" >> ${GITHUB_OUTPUT}

				          echo "LABELS_TO_REMOVE=${LABELS_TO_REMOVE}" >> ${GITHUB_OUTPUT}

				      - run: gh pr checkout "${PR_NUMBER}"

				      - run: git checkout -b "${BRANCH}"

				      - run: git push --force origin "${BRANCH}"

				@@ -103,7 +111,7 @@ jobs:

				      - name: Create a Pull Request for CI run (if required)

				        if: steps.get-pr.outputs.ALREADY_CREATED == ''

				        env: 

				        env:

				          GH_TOKEN: ${{ secrets.CI_ACCESS_TOKEN }}

				        run: |

				          cat << EOF > body.md

				@@ -140,7 +148,7 @@ jobs:

				      - run: git push --force origin "${BRANCH}"

				        if: steps.get-pr.outputs.ALREADY_CREATED != ''

				  cleanup:

				    # Close PRs and delete branches if the original PR is closed.

				@@ -155,6 +163,11 @@ jobs:

				    runs-on: ubuntu-22.04

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - name: Close PR and delete `ci-run/pr-${{ env.PR_NUMBER }}` branch

				        run: |

				          CLOSED="$(gh pr --repo ${GITHUB_REPOSITORY} list --head ${BRANCH} --json 'closed' --jq '.[].closed')"

556

.github/workflows/benchmarking.yml vendored

File diff suppressed because it is too large

									
178

.github/workflows/build-build-tools-image.yml vendored
												
				@@ -3,32 +3,100 @@ name: Build build-tools image

				on:

				  workflow_call:

				    inputs:

				      image-tag:

				        description: "build-tools image tag"

				        required: true

				      archs:

				        description: "Json array of architectures to build"

				        # Default values are set in `check-image` job, `set-variables` step

				        type: string

				        required: false

				      debians:

				        description: "Json array of Debian versions to build"

				        # Default values are set in `check-image` job, `set-variables` step

				        type: string

				        required: false

				    outputs:

				      image-tag:

				        description: "build-tools tag"

				        value: ${{ inputs.image-tag }}

				        value: ${{ jobs.check-image.outputs.tag }}

				      image:

				        description: "build-tools image"

				        value: neondatabase/build-tools:${{ inputs.image-tag }}

				        value: ghcr.io/neondatabase/build-tools:${{ jobs.check-image.outputs.tag }}

				defaults:

				  run:

				    shell: bash -euo pipefail {0}

				concurrency:

				  group: build-build-tools-image-${{ inputs.image-tag }}

				  cancel-in-progress: false

				# The initial idea was to prevent the waste of resources by not re-building the `build-tools` image

				# for the same tag in parallel workflow runs, and queue them to be skipped once we have

				# the first image pushed to Docker registry, but GitHub's concurrency mechanism is not working as expected.

				# GitHub can't have more than 1 job in a queue and removes the previous one, which causes failures in the dependent jobs.

				#

				# Ref https://github.com/orgs/community/discussions/41518

				#

				# concurrency:

				#   group: build-build-tools-image-${{ inputs.image-tag }}

				#   cancel-in-progress: false

				# No permission for GITHUB_TOKEN by default; the **minimal required** set of permissions should be granted in each job.

				permissions: {}

				jobs:

				  check-image:

				    uses: ./.github/workflows/check-build-tools-image.yml

				    runs-on: ubuntu-22.04

				    outputs:

				      archs: ${{ steps.set-variables.outputs.archs }}

				      debians: ${{ steps.set-variables.outputs.debians }}

				      tag: ${{ steps.set-variables.outputs.image-tag }}

				      everything: ${{ steps.set-more-variables.outputs.everything }}

				      found: ${{ steps.set-more-variables.outputs.found }}

				    permissions:

				      packages: read

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				      - uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: Set variables

				        id: set-variables

				        env:

				          ARCHS: ${{ inputs.archs || '["x64","arm64"]' }}

				          DEBIANS: ${{ inputs.debians || '["bullseye","bookworm"]' }}

				          IMAGE_TAG: |

				            ${{ hashFiles('build-tools.Dockerfile',

				                          '.github/workflows/build-build-tools-image.yml') }}

				        run: |

				          echo "archs=${ARCHS}"           | tee -a ${GITHUB_OUTPUT}

				          echo "debians=${DEBIANS}"       | tee -a ${GITHUB_OUTPUT}

				          echo "image-tag=${IMAGE_TAG}"   | tee -a ${GITHUB_OUTPUT}

				      - name: Set more variables

				        id: set-more-variables

				        env:

				          IMAGE_TAG: ${{ steps.set-variables.outputs.image-tag }}

				          EVERYTHING: |

				            ${{ contains(fromJSON(steps.set-variables.outputs.archs), 'x64') &&

				                contains(fromJSON(steps.set-variables.outputs.archs), 'arm64') &&

				                contains(fromJSON(steps.set-variables.outputs.debians), 'bullseye') &&

				                contains(fromJSON(steps.set-variables.outputs.debians), 'bookworm') }}

				        run: |

				          if docker manifest inspect ghcr.io/neondatabase/build-tools:${IMAGE_TAG}; then

				            found=true

				          else

				            found=false

				          fi

				          echo "everything=${EVERYTHING}" | tee -a ${GITHUB_OUTPUT}

				          echo "found=${found}"           | tee -a ${GITHUB_OUTPUT}

				  build-image:

				    needs: [ check-image ]

				@@ -36,68 +104,100 @@ jobs:

				    strategy:

				      matrix:

				        arch: [ x64, arm64 ]

				        arch: ${{ fromJSON(needs.check-image.outputs.archs) }}

				        debian: ${{ fromJSON(needs.check-image.outputs.debians) }}

				    runs-on: ${{ fromJson(format('["self-hosted", "{0}"]', matrix.arch == 'arm64' && 'large-arm64' || 'large')) }}

				    permissions:

				      packages: write

				    env:

				      IMAGE_TAG: ${{ inputs.image-tag }}

				    runs-on: ${{ fromJSON(format('["self-hosted", "{0}"]', matrix.arch == 'arm64' && 'large-arm64' || 'large')) }}

				    steps:

				      - name: Check `input.tag` is correct

				        env:

				          INPUTS_IMAGE_TAG: ${{ inputs.image-tag }}

				          CHECK_IMAGE_TAG : ${{ needs.check-image.outputs.image-tag }}

				        run: |

				          if [ "${INPUTS_IMAGE_TAG}" != "${CHECK_IMAGE_TAG}" ]; then

				            echo "'inputs.image-tag' (${INPUTS_IMAGE_TAG}) does not match the tag of the latest build-tools image 'needs.check-image.outputs.image-tag' (${CHECK_IMAGE_TAG})"

				            exit 1

				          fi

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - uses: actions/checkout@v4

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				      - uses: ./.github/actions/set-docker-config-dir

				      - uses: docker/setup-buildx-action@v3

				      - uses: neondatabase/dev-actions/set-docker-config-dir@6094485bf440001c94a94a3f9e221e81ff6b6193

				      - uses: docker/setup-buildx-action@b5ca514318bd6ebac0fb2aedd5d36ec1b5c232a2 # v3.10.0

				        with:

				          cache-binary: false

				      - uses: docker/login-action@v3

				      - uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          username: ${{ secrets.NEON_DOCKERHUB_USERNAME }}

				          password: ${{ secrets.NEON_DOCKERHUB_PASSWORD }}

				      - uses: docker/login-action@v3

				      - uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: cache.neon.build

				          username: ${{ secrets.NEON_CI_DOCKERCACHE_USERNAME }}

				          password: ${{ secrets.NEON_CI_DOCKERCACHE_PASSWORD }}

				      - uses: docker/build-push-action@v6

				      - uses: docker/build-push-action@471d1dc4e07e5cdedd4c2171150001c434f0b7a4 # v6.15.0

				        with:

				          file: build-tools.Dockerfile

				          context: .

				          provenance: false

				          push: true

				          pull: true

				          file: Dockerfile.build-tools

				          cache-from: type=registry,ref=cache.neon.build/build-tools:cache-${{ matrix.arch }}

				          cache-to: ${{ github.ref_name == 'main' && format('type=registry,ref=cache.neon.build/build-tools:cache-{0},mode=max', matrix.arch) || '' }}

				          tags: neondatabase/build-tools:${{ inputs.image-tag }}-${{ matrix.arch }}

				          build-args: |

				            DEBIAN_VERSION=${{ matrix.debian }}

				          cache-from: type=registry,ref=cache.neon.build/build-tools:cache-${{ matrix.debian }}-${{ matrix.arch }}

				          cache-to: ${{ github.ref_name == 'main' && format('type=registry,ref=cache.neon.build/build-tools:cache-{0}-{1},mode=max', matrix.debian, matrix.arch) || '' }}

				          tags: |

				            ghcr.io/neondatabase/build-tools:${{ needs.check-image.outputs.tag }}-${{ matrix.debian }}-${{ matrix.arch }}

				  merge-images:

				    needs: [ build-image ]

				    needs: [ check-image, build-image ]

				    runs-on: ubuntu-22.04

				    env:

				      IMAGE_TAG: ${{ inputs.image-tag }}

				    permissions:

				      packages: write

				    steps:

				      - uses: docker/login-action@v3

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          username: ${{ secrets.NEON_DOCKERHUB_USERNAME }}

				          password: ${{ secrets.NEON_DOCKERHUB_PASSWORD }}

				      - uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: Create multi-arch image

				        env:

				          DEFAULT_DEBIAN_VERSION: bookworm

				          ARCHS: ${{ join(fromJSON(needs.check-image.outputs.archs), ' ') }}

				          DEBIANS: ${{ join(fromJSON(needs.check-image.outputs.debians), ' ') }}

				          EVERYTHING: ${{ needs.check-image.outputs.everything }}

				          IMAGE_TAG: ${{ needs.check-image.outputs.tag }}

				        run: |

				          docker buildx imagetools create -t neondatabase/build-tools:${IMAGE_TAG} \

				                                             neondatabase/build-tools:${IMAGE_TAG}-x64 \

				                                             neondatabase/build-tools:${IMAGE_TAG}-arm64

				          for debian in ${DEBIANS}; do

				            tags=("-t" "ghcr.io/neondatabase/build-tools:${IMAGE_TAG}-${debian}")

				            if [ "${EVERYTHING}" == "true" ] && [ "${debian}" == "${DEFAULT_DEBIAN_VERSION}" ]; then

				              tags+=("-t" "ghcr.io/neondatabase/build-tools:${IMAGE_TAG}")

				            fi

				            for arch in ${ARCHS}; do

				              tags+=("ghcr.io/neondatabase/build-tools:${IMAGE_TAG}-${debian}-${arch}")

				            done

				            docker buildx imagetools create "${tags[@]}"

				          done
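The `Create multi-arch image` step above issues one `docker buildx imagetools create` call per Debian version, and adds the unsuffixed tag only for the default Debian version when every architecture and Debian version was built. The following is a minimal dry-run sketch of that argument assembly, not the workflow's own code: the `plan_manifests` function name is hypothetical, the Docker call is replaced by `echo`, and a plain string stands in for the Bash array so the sketch also runs under a POSIX `sh`.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: prints the imagetools commands the step would run.
# Arguments: image tag, "true"/"false" for EVERYTHING,
# space-separated Debian versions, space-separated architectures.
plan_manifests() {
  image_tag="$1"; everything="$2"; debians="$3"; archs="$4"
  default_debian="bookworm"
  repo="ghcr.io/neondatabase/build-tools"

  for debian in ${debians}; do
    # Every Debian version gets its own multi-arch tag.
    args="-t ${repo}:${image_tag}-${debian}"
    # The unsuffixed tag is added only for the default Debian version,
    # and only when the full arch/Debian set was built.
    if [ "${everything}" = "true" ] && [ "${debian}" = "${default_debian}" ]; then
      args="${args} -t ${repo}:${image_tag}"
    fi
    # Source images: one per architecture, as pushed by the build-image job.
    for arch in ${archs}; do
      args="${args} ${repo}:${image_tag}-${debian}-${arch}"
    done
    echo "docker buildx imagetools create ${args}"
  done
}

plan_manifests "abc123" "true" "bullseye bookworm" "x64 arm64"
```

Running the sketch with `everything` set to `false` shows that no unsuffixed tag is produced, which appears to be why partial builds (a subset of `archs` or `debians`) never overwrite the default image tag.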

									
265

.github/workflows/build-macos.yml vendored Normal file
												
				@@ -0,0 +1,265 @@

				name: Check neon with MacOS builds

				on:

				  workflow_call:

				    inputs:

				      pg_versions:

				        description: "Array of the pg versions to build for, for example: ['v14', 'v17']"

				        type: string

				        default: '[]'

				        required: false

				      rebuild_rust_code:

				        description: "Rebuild Rust code"

				        type: boolean

				        default: false

				        required: false

				      rebuild_everything:

				        description: "If true, rebuild for all versions"

				        type: boolean

				        default: false

				        required: false

				env:

				  RUST_BACKTRACE: 1

				  COPT: '-Werror'

				# TODO: move `check-*` and `files-changed` jobs to the "Caller" Workflow

				# We should care about that as GitHub has limitations:

				# - You can connect up to four levels of workflows

				# - You can call a maximum of 20 unique reusable workflows from a single workflow file.

				# https://docs.github.com/en/actions/sharing-automations/reusing-workflows#limitations

				permissions:

				  contents: read

				jobs:

				  build-pgxn:

				    if: |

				      inputs.pg_versions != '[]' || inputs.rebuild_everything ||

				      contains(github.event.pull_request.labels.*.name, 'run-extra-build-macos')  ||

				      contains(github.event.pull_request.labels.*.name, 'run-extra-build-*') ||

				      github.ref_name == 'main'

				    timeout-minutes: 30

				    runs-on: macos-15

				    strategy:

				      matrix:

				        postgres-version: ${{ inputs.rebuild_everything && fromJSON('["v14", "v15", "v16", "v17"]') || fromJSON(inputs.pg_versions) }}

				    env:

				      # Use release build only, to have less debug info around

				      # Hence keeping target/ (and general cache size) smaller

				      BUILD_TYPE: release

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - name: Checkout main repo

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				      - name: Set pg ${{ matrix.postgres-version }} for caching

				        id: pg_rev

				        run: echo pg_rev=$(git rev-parse HEAD:vendor/postgres-${{ matrix.postgres-version }}) | tee -a "${GITHUB_OUTPUT}"

				      - name: Cache postgres ${{ matrix.postgres-version }} build

				        id: cache_pg

				        uses: actions/cache@5a3ec84eff668545956fd18022155c47e93e2684 # v4.2.3

				        with:

				          path: pg_install/${{ matrix.postgres-version }}

				          key: v1-${{ runner.os }}-${{ runner.arch }}-${{ env.BUILD_TYPE }}-pg-${{ matrix.postgres-version }}-${{ steps.pg_rev.outputs.pg_rev }}-${{ hashFiles('Makefile') }}

				      - name: Checkout submodule vendor/postgres-${{ matrix.postgres-version }}

				        if: steps.cache_pg.outputs.cache-hit != 'true'

				        run: |

				          git submodule init vendor/postgres-${{ matrix.postgres-version }}

				          git submodule update --depth 1 --recursive

				      - name: Install build dependencies

				        if: steps.cache_pg.outputs.cache-hit != 'true'

				        run: |

				          brew install flex bison openssl protobuf icu4c

				      - name: Set extra env for macOS

				        if: steps.cache_pg.outputs.cache-hit != 'true'

				        run: |

				          echo 'LDFLAGS=-L/usr/local/opt/openssl@3/lib' >> $GITHUB_ENV

				          echo 'CPPFLAGS=-I/usr/local/opt/openssl@3/include' >> $GITHUB_ENV

				      - name: Build Postgres ${{ matrix.postgres-version }}

				        if: steps.cache_pg.outputs.cache-hit != 'true'

				        run: |

				          make postgres-${{ matrix.postgres-version }} -j$(sysctl -n hw.ncpu)

				      - name: Build Neon Pg Ext ${{ matrix.postgres-version }}

				        if: steps.cache_pg.outputs.cache-hit != 'true'

				        run: |

				          make "neon-pg-ext-${{ matrix.postgres-version }}" -j$(sysctl -n hw.ncpu)

				      - name: Upload "pg_install/${{ matrix.postgres-version }}" artifact

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: pg_install--${{ matrix.postgres-version }}

				          path: pg_install/${{ matrix.postgres-version }}

				          # The artifact is supposed to be used by the next job in the same workflow,

				          # so there’s no need to store it for too long.

				          retention-days: 1

				  build-walproposer-lib:

				    if: |

				      contains(inputs.pg_versions, 'v17') || inputs.rebuild_everything ||

				      contains(github.event.pull_request.labels.*.name, 'run-extra-build-macos')  ||

				      contains(github.event.pull_request.labels.*.name, 'run-extra-build-*') ||

				      github.ref_name == 'main'

				    timeout-minutes: 30

				    runs-on: macos-15

				    needs: [build-pgxn]

				    env:

				      # Use release build only, to have less debug info around

				      # Hence keeping target/ (and general cache size) smaller

				      BUILD_TYPE: release

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - name: Checkout main repo

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				      - name: Set pg v17 for caching

				        id: pg_rev

				        run: echo pg_rev=$(git rev-parse HEAD:vendor/postgres-v17) | tee -a "${GITHUB_OUTPUT}"

				      - name: Download "pg_install/v17" artifact

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: pg_install--v17

				          path: pg_install/v17

				      # `actions/download-artifact` doesn't preserve permissions:

				      # https://github.com/actions/download-artifact?tab=readme-ov-file#permission-loss

				      - name: Make pg_install/v*/bin/* executable

				        run: |

				          chmod +x pg_install/v*/bin/*

				      - name: Cache walproposer-lib

				        id: cache_walproposer_lib

				        uses: actions/cache@5a3ec84eff668545956fd18022155c47e93e2684 # v4.2.3

				        with:

				          path: build/walproposer-lib

				          key: v1-${{ runner.os }}-${{ runner.arch }}-${{ env.BUILD_TYPE }}-walproposer_lib-v17-${{ steps.pg_rev.outputs.pg_rev }}-${{ hashFiles('Makefile') }}

				      - name: Checkout submodule vendor/postgres-v17

				        if: steps.cache_walproposer_lib.outputs.cache-hit != 'true'

				        run: |

				          git submodule init vendor/postgres-v17

				          git submodule update --depth 1 --recursive

				      - name: Install build dependencies

				        if: steps.cache_walproposer_lib.outputs.cache-hit != 'true'

				        run: |

				          brew install flex bison openssl protobuf icu4c

				      - name: Set extra env for macOS

				        if: steps.cache_walproposer_lib.outputs.cache-hit != 'true'

				        run: |

				          echo 'LDFLAGS=-L/usr/local/opt/openssl@3/lib' >> $GITHUB_ENV

				          echo 'CPPFLAGS=-I/usr/local/opt/openssl@3/include' >> $GITHUB_ENV

				      - name: Build walproposer-lib (only for v17)

				        if: steps.cache_walproposer_lib.outputs.cache-hit != 'true'

				        run:

				          make walproposer-lib -j$(sysctl -n hw.ncpu) PG_INSTALL_CACHED=1

				      - name: Upload "build/walproposer-lib" artifact

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: build--walproposer-lib

				          path: build/walproposer-lib

				          # The artifact is supposed to be used by the next job in the same workflow,

				          # so there’s no need to store it for too long.

				          retention-days: 1

				  cargo-build:

				    if: |

				      inputs.pg_versions != '[]' || inputs.rebuild_rust_code || inputs.rebuild_everything ||

				      contains(github.event.pull_request.labels.*.name, 'run-extra-build-macos') ||

				      contains(github.event.pull_request.labels.*.name, 'run-extra-build-*') ||

				      github.ref_name == 'main'

				    timeout-minutes: 30

				    runs-on: macos-15

				    needs: [build-pgxn, build-walproposer-lib]

				    env:

				      # Use release build only, to have less debug info around

				      # Hence keeping target/ (and general cache size) smaller

				      BUILD_TYPE: release

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - name: Checkout main repo

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          submodules: true

				      - name: Download "pg_install/v14" artifact

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: pg_install--v14

				          path: pg_install/v14

				      - name: Download "pg_install/v15" artifact

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: pg_install--v15

				          path: pg_install/v15

				      - name: Download "pg_install/v16" artifact

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: pg_install--v16

				          path: pg_install/v16

				      - name: Download "pg_install/v17" artifact

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: pg_install--v17

				          path: pg_install/v17

				      - name: Download "build/walproposer-lib" artifact

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: build--walproposer-lib

				          path: build/walproposer-lib

				      # `actions/download-artifact` doesn't preserve permissions:

				      # https://github.com/actions/download-artifact?tab=readme-ov-file#permission-loss

				      - name: Make pg_install/v*/bin/* executable

				        run: |

				          chmod +x pg_install/v*/bin/*

				      - name: Cache cargo deps

				        uses: actions/cache@5a3ec84eff668545956fd18022155c47e93e2684 # v4.2.3

				        with:

				          path: |

				            ~/.cargo/registry

				            !~/.cargo/registry/src

				            ~/.cargo/git

				            target

				          key: v1-${{ runner.os }}-${{ runner.arch }}-cargo-${{ hashFiles('./Cargo.lock') }}-${{ hashFiles('./rust-toolchain.toml') }}-rust

				      - name: Install build dependencies

				        run: |

				          brew install flex bison openssl protobuf icu4c

				      - name: Set extra env for macOS

				        run: |

				          echo 'LDFLAGS=-L/usr/local/opt/openssl@3/lib' >> $GITHUB_ENV

				          echo 'CPPFLAGS=-I/usr/local/opt/openssl@3/include' >> $GITHUB_ENV

				      - name: Run cargo build

				        run: cargo build --all --release -j$(sysctl -n hw.ncpu)

				      - name: Check that no warnings are produced

				        run: ./run_clippy.sh
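The `postgres-version` matrix in `build-pgxn` above relies on the `cond && a || b` idiom that GitHub Actions expressions use in place of a ternary operator. As a small sketch of the selection it performs (the `select_pg_versions` function name is hypothetical and exists only for illustration):

```shell
#!/bin/sh
set -eu

# Hypothetical helper mirroring the matrix expression.
# Arguments: rebuild_everything ("true"/"false") and the pg_versions
# input as a JSON array string.
select_pg_versions() {
  rebuild_everything="$1"; pg_versions="$2"
  if [ "${rebuild_everything}" = "true" ]; then
    # rebuild_everything overrides whatever pg_versions was passed.
    echo '["v14", "v15", "v16", "v17"]'
  else
    echo "${pg_versions}"
  fi
}

select_pg_versions "true" '[]'
select_pg_versions "false" '["v17"]'
```

In an Actions expression, `a && b || c` falls through to `c` whenever `b` is falsy. Here `b` is a non-empty array, so the idiom behaves like a true ternary.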

									
121

.github/workflows/build_and_run_selected_test.yml vendored Normal file
												
				@@ -0,0 +1,121 @@

				name: Build and Run Selected Test

				on:

				  workflow_dispatch:

				    inputs:

				      test-selection:

				        description: 'Specification of selected test(s), as accepted by pytest -k'

				        required: true

				        type: string

				      run-count:

				        description: 'Number of test runs to perform'

				        required: true

				        type: number

				      archs:

				        description: 'Archs to run tests on, e.g.: ["x64", "arm64"]'

				        default: '["x64"]'

				        required: true

				        type: string

				      build-types:

				        description: 'Build types to run tests on, e.g.: ["debug", "release"]'

				        default: '["release"]'

				        required: true

				        type: string

				      pg-versions:

				        description: 'Postgres versions to use for testing, e.g.: [{"pg_version":"v16"}, {"pg_version":"v17"}]'

				        default: '[{"pg_version":"v17"}]'

				        required: true

				        type: string

				defaults:

				  run:

				    shell: bash -euxo pipefail {0}

				env:

				  RUST_BACKTRACE: 1

				  COPT: '-Werror'

				jobs:

				  meta:

				    uses: ./.github/workflows/_meta.yml

				    with:

				      github-event-name: ${{ github.event_name }}

				      github-event-json: ${{ toJSON(github.event) }}

				  build-and-test-locally:

				    needs: [ meta ]

				    strategy:

				      fail-fast: false

				      matrix:

				        arch: ${{ fromJson(inputs.archs) }}

				        build-type: ${{ fromJson(inputs.build-types) }}

				    uses: ./.github/workflows/_build-and-test-locally.yml

				    with:

				      arch: ${{ matrix.arch }}

				      build-tools-image: ghcr.io/neondatabase/build-tools:pinned-bookworm

				      build-tag: ${{ needs.meta.outputs.build-tag }}

				      build-type: ${{ matrix.build-type }}

				      test-cfg: ${{ inputs.pg-versions }}

				      test-selection: ${{ inputs.test-selection }}

				      test-run-count: ${{ fromJson(inputs.run-count) }}

				      rerun-failed: false

				    secrets: inherit

				  create-test-report:

				    needs: [ build-and-test-locally ]

				    if: ${{ !cancelled() }}

				    permissions:

				      id-token: write # aws-actions/configure-aws-credentials

				      statuses: write

				      contents: write

				      pull-requests: write

				    outputs:

				      report-url: ${{ steps.create-allure-report.outputs.report-url }}

				    runs-on: [ self-hosted, small ]

				    container:

				      image: ghcr.io/neondatabase/build-tools:pinned-bookworm

				      credentials:

				        username: ${{ github.actor }}

				        password: ${{ secrets.GITHUB_TOKEN }}

				      options: --init

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				      - name: Create Allure report

				        if: ${{ !cancelled() }}

				        id: create-allure-report

				        uses: ./.github/actions/allure-report-generate

				        with:

				          store-test-results-into-db: true

				          aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				        env:

				          REGRESS_TEST_RESULT_CONNSTR_NEW: ${{ secrets.REGRESS_TEST_RESULT_CONNSTR_DEV }}

				      - uses: actions/github-script@v7

				        if: ${{ !cancelled() }}

				        with:

				          # Retry script for 5XX server errors: https://github.com/actions/github-script#retries

				          retries: 5

				          script: |

				            const report = {

				              reportUrl:     "${{ steps.create-allure-report.outputs.report-url }}",

				              reportJsonUrl: "${{ steps.create-allure-report.outputs.report-json-url }}",

				            }

				            const coverage = {}

				            const script = require("./scripts/comment-test-report.js")

				            await script({

				              github,

				              context,

				              fetch,

				              report,

				              coverage,

				            })
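The `workflow_dispatch` inputs of this workflow are plain strings that carry JSON, so quoting matters when triggering it from a terminal. The sketch below only prints a `gh workflow run` command line rather than executing it. The `dispatch_cmd` helper and the `test_example` selection are hypothetical, and the JSON values are simply the defaults declared in the workflow.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: prints (does not execute) a gh CLI invocation that
# would dispatch the workflow. Arguments: pytest -k selection, run count.
dispatch_cmd() {
  selection="$1"; count="$2"
  # JSON inputs are wrapped in single quotes so the inner double quotes
  # survive the shell when the printed command is pasted and run.
  printf '%s\n' "gh workflow run build_and_run_selected_test.yml \
-f test-selection='${selection}' \
-f run-count=${count} \
-f archs='[\"x64\"]' \
-f build-types='[\"release\"]' \
-f pg-versions='[{\"pg_version\":\"v17\"}]'"
}

dispatch_cmd "test_example" 3
```

Printing the command first makes it easy to check the JSON quoting before anything is dispatched.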

1451

.github/workflows/build_and_test.yml vendored

File diff suppressed because it is too large

									
151

.github/workflows/build_and_test_fully.yml vendored Normal file
												
				@@ -0,0 +1,151 @@

				name: Build and Test Fully

				on:

				  schedule:

				    # * is a special character in YAML so you have to quote this string

				    #          ┌───────────── minute (0 - 59)

				    #          │ ┌───────────── hour (0 - 23)

				    #          │ │ ┌───────────── day of the month (1 - 31)

				    #          │ │ │ ┌───────────── month (1 - 12 or JAN-DEC)

				    #          │ │ │ │ ┌───────────── day of the week (0 - 6 or SUN-SAT)

				    - cron:   '0 3 * * *' # run once a day, timezone is utc

				  workflow_dispatch:

				defaults:

				  run:

				    shell: bash -euxo pipefail {0}

				concurrency:

				  # Allow only one workflow per any non-`main` branch.

				  group: ${{ github.workflow }}-${{ github.ref_name }}-${{ github.ref_name == 'main' && github.sha || 'anysha' }}

				  cancel-in-progress: true

				env:

				  RUST_BACKTRACE: 1

				  COPT: '-Werror'

				jobs:

				  tag:

				    runs-on: [ self-hosted, small ]

				    container: ${{ vars.NEON_DEV_AWS_ACCOUNT_ID }}.dkr.ecr.${{ vars.AWS_ECR_REGION }}.amazonaws.com/base:pinned

				    outputs:

				      build-tag: ${{steps.build-tag.outputs.tag}}

				    steps:

				      # Need `fetch-depth: 0` to count the number of commits in the branch

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				      - name: Get build tag

				        run: |

				          echo run:$GITHUB_RUN_ID

				          echo ref:$GITHUB_REF_NAME

				          echo rev:$(git rev-list --count HEAD)

				          if [[ "$GITHUB_REF_NAME" == "main" ]]; then

				            echo "tag=$(git rev-list --count HEAD)" >> $GITHUB_OUTPUT

				          elif [[ "$GITHUB_REF_NAME" == "release" ]]; then

				            echo "tag=release-$(git rev-list --count HEAD)" >> $GITHUB_OUTPUT

				          elif [[ "$GITHUB_REF_NAME" == "release-proxy" ]]; then

				            echo "tag=release-proxy-$(git rev-list --count HEAD)" >> $GITHUB_OUTPUT

				          elif [[ "$GITHUB_REF_NAME" == "release-compute" ]]; then

				            echo "tag=release-compute-$(git rev-list --count HEAD)" >> $GITHUB_OUTPUT

				          else

				            echo "GITHUB_REF_NAME (value '$GITHUB_REF_NAME') is not one of 'main', 'release', 'release-proxy', 'release-compute'"

				            echo "tag=$GITHUB_RUN_ID" >> $GITHUB_OUTPUT

				          fi

				        shell: bash

				        id: build-tag

				  build-build-tools-image:

				    uses: ./.github/workflows/build-build-tools-image.yml

				    secrets: inherit

				  build-and-test-locally:

				    needs: [ tag, build-build-tools-image ]

				    strategy:

				      fail-fast: false

				      matrix:

				        arch: [ x64, arm64 ]

				        build-type: [ debug, release ]

				    uses: ./.github/workflows/_build-and-test-locally.yml

				    with:

				      arch: ${{ matrix.arch }}

				      build-tools-image: ${{ needs.build-build-tools-image.outputs.image }}-bookworm

				      build-tag: ${{ needs.tag.outputs.build-tag }}

				      build-type: ${{ matrix.build-type }}

				      rerun-failed: false

				      test-cfg: '[{"pg_version":"v14", "lfc_state": "with-lfc"},

				                  {"pg_version":"v15", "lfc_state": "with-lfc"},

				                  {"pg_version":"v16", "lfc_state": "with-lfc"},

				                  {"pg_version":"v17", "lfc_state": "with-lfc"},

				                  {"pg_version":"v14", "lfc_state": "without-lfc"},

				                  {"pg_version":"v15", "lfc_state": "without-lfc"},

				                  {"pg_version":"v16", "lfc_state": "without-lfc"},

				                  {"pg_version":"v17", "lfc_state": "without-lfc"}]'

				    secrets: inherit

				  create-test-report:

				    needs: [ build-and-test-locally, build-build-tools-image ]

				    if: ${{ !cancelled() }}

				    permissions:

				      id-token: write # aws-actions/configure-aws-credentials

				      statuses: write

				      contents: write

				      pull-requests: write

				    outputs:

				      report-url: ${{ steps.create-allure-report.outputs.report-url }}

				    runs-on: [ self-hosted, small ]

				    container:

				      image: ${{ needs.build-build-tools-image.outputs.image }}-bookworm

				      credentials:

				        username: ${{ github.actor }}

				        password: ${{ secrets.GITHUB_TOKEN }}

				      options: --init

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				      - name: Create Allure report

				        if: ${{ !cancelled() }}

				        id: create-allure-report

				        uses: ./.github/actions/allure-report-generate

				        with:

				          store-test-results-into-db: true

				          aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				        env:

				          REGRESS_TEST_RESULT_CONNSTR_NEW: ${{ secrets.REGRESS_TEST_RESULT_CONNSTR_NEW }}

				      - uses: actions/github-script@60a0d83039c74a4aee543508d2ffcb1c3799cdea # v7.0.1

				        if: ${{ !cancelled() }}

				        with:

				          # Retry script for 5XX server errors: https://github.com/actions/github-script#retries

				          retries: 5

				          script: |

				            const report = {

				              reportUrl:     "${{ steps.create-allure-report.outputs.report-url }}",

				              reportJsonUrl: "${{ steps.create-allure-report.outputs.report-json-url }}",

				            }

				            const coverage = {}

				            const script = require("./scripts/comment-test-report.js")

				            await script({

				              github,

				              context,

				              fetch,

				              report,

				              coverage,

				            })

									
.github/workflows/build_and_test_with_sanitizers.yml
												
				@@ -0,0 +1,145 @@

				name: Build and Test with Sanitizers

				on:

				  schedule:

				    # * is a special character in YAML so you have to quote this string

				    #          ┌───────────── minute (0 - 59)

				    #          │ ┌───────────── hour (0 - 23)

				    #          │ │ ┌───────────── day of the month (1 - 31)

				    #          │ │ │ ┌───────────── month (1 - 12 or JAN-DEC)

				    #          │ │ │ │ ┌───────────── day of the week (0 - 6 or SUN-SAT)

				    - cron:   '0 1 * * *' # run once a day, timezone is utc

				  workflow_dispatch:

				defaults:

				  run:

				    shell: bash -euxo pipefail {0}

				concurrency:

				  # Allow only one workflow per any non-`main` branch.

				  group: ${{ github.workflow }}-${{ github.ref_name }}-${{ github.ref_name == 'main' && github.sha || 'anysha' }}

				  cancel-in-progress: true

				env:

				  RUST_BACKTRACE: 1

				  COPT: '-Werror'

				jobs:

				  tag:

				    runs-on: [ self-hosted, small ]

				    container: ${{ vars.NEON_DEV_AWS_ACCOUNT_ID }}.dkr.ecr.${{ vars.AWS_ECR_REGION }}.amazonaws.com/base:pinned

				    outputs:

				      build-tag: ${{steps.build-tag.outputs.tag}}

				    steps:

				      # Need `fetch-depth: 0` to count the number of commits in the branch

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				      - name: Get build tag

				        run: |

				          echo run:$GITHUB_RUN_ID

				          echo ref:$GITHUB_REF_NAME

				          echo rev:$(git rev-list --count HEAD)

				          if [[ "$GITHUB_REF_NAME" == "main" ]]; then

				            echo "tag=$(git rev-list --count HEAD)" >> $GITHUB_OUTPUT

				          elif [[ "$GITHUB_REF_NAME" == "release" ]]; then

				            echo "tag=release-$(git rev-list --count HEAD)" >> $GITHUB_OUTPUT

				          elif [[ "$GITHUB_REF_NAME" == "release-proxy" ]]; then

				            echo "tag=release-proxy-$(git rev-list --count HEAD)" >> $GITHUB_OUTPUT

				          elif [[ "$GITHUB_REF_NAME" == "release-compute" ]]; then

				            echo "tag=release-compute-$(git rev-list --count HEAD)" >> $GITHUB_OUTPUT

				          else

				            echo "GITHUB_REF_NAME (value '$GITHUB_REF_NAME') is not set to either 'main' or 'release', 'release-proxy', 'release-compute'"

				            echo "tag=$GITHUB_RUN_ID" >> $GITHUB_OUTPUT

				          fi

				        shell: bash

				        id: build-tag

				  build-build-tools-image:

				    uses: ./.github/workflows/build-build-tools-image.yml

				    secrets: inherit

				  build-and-test-locally:

				    needs: [ tag, build-build-tools-image ]

				    strategy:

				      fail-fast: false

				      matrix:

				        arch: [ x64, arm64 ]

				        build-type: [ release ]

				    uses: ./.github/workflows/_build-and-test-locally.yml

				    with:

				      arch: ${{ matrix.arch }}

				      build-tools-image: ${{ needs.build-build-tools-image.outputs.image }}-bookworm

				      build-tag: ${{ needs.tag.outputs.build-tag }}

				      build-type: ${{ matrix.build-type }}

				      rerun-failed: false

				      test-cfg: '[{"pg_version":"v17"}]'

				      sanitizers: enabled

				    secrets: inherit

				  create-test-report:

				    needs: [ build-and-test-locally, build-build-tools-image ]

				    if: ${{ !cancelled() }}

				    permissions:

				      id-token: write # aws-actions/configure-aws-credentials

				      statuses: write

				      contents: write

				      pull-requests: write

				    outputs:

				      report-url: ${{ steps.create-allure-report.outputs.report-url }}

				    runs-on: [ self-hosted, small ]

				    container:

				      image: ${{ needs.build-build-tools-image.outputs.image }}-bookworm

				      credentials:

				        username: ${{ github.actor }}

				        password: ${{ secrets.GITHUB_TOKEN }}

				      options: --init

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				      - name: Create Allure report

				        if: ${{ !cancelled() }}

				        id: create-allure-report

				        uses: ./.github/actions/allure-report-generate

				        with:

				          store-test-results-into-db: true

				          aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				        env:

				          REGRESS_TEST_RESULT_CONNSTR_NEW: ${{ secrets.REGRESS_TEST_RESULT_CONNSTR_NEW }}

				      - uses: actions/github-script@60a0d83039c74a4aee543508d2ffcb1c3799cdea # v7.0.1

				        if: ${{ !cancelled() }}

				        with:

				          # Retry script for 5XX server errors: https://github.com/actions/github-script#retries

				          retries: 5

				          script: |

				            const report = {

				              reportUrl:     "${{ steps.create-allure-report.outputs.report-url }}",

				              reportJsonUrl: "${{ steps.create-allure-report.outputs.report-json-url }}",

				            }

				            const coverage = {}

				            const script = require("./scripts/comment-test-report.js")

				            await script({

				              github,

				              context,

				              fetch,

				              report,

				              coverage,

				            })
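
The "Get build tag" step above maps the branch name to a tag. A minimal sketch of that mapping as a standalone shell function follows. The function name `compute_build_tag` and its three positional arguments are hypothetical; the workflow reads `GITHUB_REF_NAME` and `GITHUB_RUN_ID` from the environment and calls `git rev-list --count HEAD` itself.

```shell
# Hypothetical helper (not part of the workflow): mirrors the branch-to-tag
# mapping of the "Get build tag" step.
# Arguments: ref name, commit count, run id. They are passed in explicitly so
# the sketch runs without a git checkout.
compute_build_tag() {
  ref_name=$1
  commit_count=$2
  run_id=$3
  case "$ref_name" in
    # On main the tag is the bare commit count.
    main) echo "$commit_count" ;;
    # Release branches prefix the commit count with the branch name.
    release|release-proxy|release-compute) echo "${ref_name}-${commit_count}" ;;
    # Any other ref falls back to the workflow run id.
    *) echo "$run_id" ;;
  esac
}

compute_build_tag main 12345 987654321          # prints 12345
compute_build_tag release-proxy 12345 987654321 # prints release-proxy-12345
compute_build_tag my-feature 12345 987654321    # prints 987654321
```

Passing the inputs as arguments is only a convenience for testing the mapping in isolation; it is not how the workflow step is written.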

									
.github/workflows/cargo-deny.yml
												
				@@ -0,0 +1,69 @@

				name: cargo deny checks

				on:

				  workflow_call:

				    inputs:

				      build-tools-image:

				        required: false

				        type: string

				  schedule:

				    - cron: '0 10 * * *'

				permissions:

				  contents: read

				jobs:

				  cargo-deny:

				    strategy:

				      matrix:

				        ref: >-

				          ${{

				            fromJSON(

				              github.event_name == 'schedule'

				                && '["main","release","release-proxy","release-compute"]'

				                || format('["{0}"]', github.sha)

				            )

				          }}

				    runs-on: [self-hosted, small]

				    permissions:

				      packages: read

				    container:

				      image: ${{ inputs.build-tools-image || 'ghcr.io/neondatabase/build-tools:pinned' }}

				      credentials:

				        username: ${{ github.actor }}

				        password: ${{ secrets.GITHUB_TOKEN }}

				      options: --init

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - name: Checkout

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ matrix.ref }}

				      - name: Check rust licenses/bans/advisories/sources

				        env:

				          CARGO_DENY_TARGET: >-

				            ${{ github.event_name == 'schedule' && 'advisories' || 'all' }}

				        run: cargo deny check --hide-inclusion-graph $CARGO_DENY_TARGET

				      - name: Post to a Slack channel

				        if: ${{ github.event_name == 'schedule' && failure() }}

				        uses: slackapi/slack-github-action@485a9d42d3a73031f12ec201c457e2162c45d02d # v2.0.0

				        with:

				          method: chat.postMessage

				          token: ${{ secrets.SLACK_BOT_TOKEN }}

				          payload: |

				            channel: ${{ vars.SLACK_ON_CALL_DEVPROD_STREAM }}

				            text: |

				              Periodic cargo-deny on ${{ matrix.ref }}: ${{ job.status }}

				              <${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}|GitHub Run>

				              Fixing the problem should be fairly straightforward from the logs. If not, <#${{ vars.SLACK_RUST_CHANNEL_ID }}> is there to help.

				              Pinging <!subteam^S0838JPSH32|@oncall-devprod>.
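
The two expressions in this workflow that branch on `github.event_name` can be read as a small decision table: scheduled runs check four branches for advisories only, while any other trigger checks the triggering commit with the full rule set. The sketch below restates that in shell. The function names `cargo_deny_refs` and `cargo_deny_target` are hypothetical and exist only for illustration.

```shell
# Hypothetical helpers (not part of the workflow): restate the two
# `github.event_name == 'schedule' && ... || ...` expressions.

# Prints the refs to check, one per line. Arguments: event name, commit SHA.
cargo_deny_refs() {
  if [ "$1" = "schedule" ]; then
    printf '%s\n' main release release-proxy release-compute
  else
    printf '%s\n' "$2"
  fi
}

# Prints the check passed to `cargo deny check`. Argument: event name.
cargo_deny_target() {
  if [ "$1" = "schedule" ]; then
    echo advisories
  else
    echo all
  fi
}

cargo_deny_refs schedule deadbeef       # prints the four branch names
cargo_deny_target schedule              # prints advisories
cargo_deny_refs workflow_call deadbeef  # prints deadbeef
cargo_deny_target workflow_call         # prints all
```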

									
.github/workflows/check-build-tools-image.yml
											
				@@ -1,51 +0,0 @@

				name: Check build-tools image

				on:

				  workflow_call:

				    outputs:

				      image-tag:

				        description: "build-tools image tag"

				        value: ${{ jobs.check-image.outputs.tag }}

				      found:

				        description: "Whether the image is found in the registry"

				        value: ${{ jobs.check-image.outputs.found }}

				defaults:

				  run:

				    shell: bash -euo pipefail {0}

				# No permission for GITHUB_TOKEN by default; the **minimal required** set of permissions should be granted in each job.

				permissions: {}

				jobs:

				  check-image:

				    runs-on: ubuntu-22.04

				    outputs:

				      tag: ${{ steps.get-build-tools-tag.outputs.image-tag }}

				      found: ${{ steps.check-image.outputs.found }}

				    steps:

				      - uses: actions/checkout@v4

				      - name: Get build-tools image tag for the current commit

				        id: get-build-tools-tag

				        env:

				          IMAGE_TAG: |

				            ${{ hashFiles('Dockerfile.build-tools',

				                          '.github/workflows/check-build-tools-image.yml',

				                          '.github/workflows/build-build-tools-image.yml') }}

				        run: |

				          echo "image-tag=${IMAGE_TAG}" | tee -a $GITHUB_OUTPUT

				      - name: Check if such tag found in the registry

				        id: check-image

				        env:

				          IMAGE_TAG: ${{ steps.get-build-tools-tag.outputs.image-tag }}

				        run: |

				          if docker manifest inspect neondatabase/build-tools:${IMAGE_TAG}; then

				            found=true

				          else

				            found=false

				          fi

				          echo "found=${found}" | tee -a $GITHUB_OUTPUT

									
.github/workflows/check-permissions.yml
												
				@@ -18,6 +18,11 @@ jobs:

				  check-permissions:

				    runs-on: ubuntu-22.04

				    steps:

				    - name: Harden the runner (Audit all outbound calls)

				      uses: step-security/harden-runner@0634a2670c59f64b4a01f0f96f84700a4088b9f0 # v2.12.0

				      with:

				        egress-policy: audit

				    - name: Disallow CI runs on PRs from forks

				      if: |

				        inputs.github-event-name  == 'pull_request' &&

									
.github/workflows/cleanup-caches-by-a-branch.yml
												
				@@ -11,6 +11,11 @@ jobs:

				  cleanup:

				    runs-on: ubuntu-22.04

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@0634a2670c59f64b4a01f0f96f84700a4088b9f0 # v2.12.0

				        with:

				          egress-policy: audit

				      - name: Cleanup

				        run: |

				          gh extension install actions/gh-actions-cache

									
.github/workflows/cloud-extensions.yml
												
				@@ -0,0 +1,99 @@

				name: Cloud Extensions Test

				on:

				  schedule:

				    # * is a special character in YAML so you have to quote this string

				    #          ┌───────────── minute (0 - 59)

				    #          │ ┌───────────── hour (0 - 23)

				    #          │ │ ┌───────────── day of the month (1 - 31)

				    #          │ │ │ ┌───────────── month (1 - 12 or JAN-DEC)

				    #          │ │ │ │ ┌───────────── day of the week (0 - 6 or SUN-SAT)

				    - cron:  '45 1 * * *' # run once a day, timezone is utc

				  workflow_dispatch: # adds ability to run this manually

				    inputs:

				      region_id:

				        description: 'Project region id. If not set, the default region will be used'

				        required: false

				        default: 'aws-us-east-2'

				defaults:

				  run:

				    shell: bash -euxo pipefail {0}

				permissions:

				  id-token: write # aws-actions/configure-aws-credentials

				  statuses: write

				  contents: write

				jobs:

				  regress:

				    env:

				      POSTGRES_DISTRIB_DIR: /tmp/neon/pg_install

				      TEST_OUTPUT: /tmp/test_output

				      BUILD_TYPE: remote

				    strategy:

				      fail-fast: false

				      matrix:

				        pg-version: [16, 17]

				    runs-on: us-east-2

				    container:

				      # We use the neon-test-extensions image here as it contains the source code for the extensions.

				      image: ghcr.io/neondatabase/neon-test-extensions-v${{ matrix.pg-version }}:latest

				      credentials:

				        username: ${{ github.actor }}

				        password: ${{ secrets.GITHUB_TOKEN }}

				      options: --init

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				      - name: Evaluate the settings

				        id: project-settings

				        run: |

				          if [[ $((${{ matrix.pg-version }})) -lt 17 ]]; then

				            ULID=ulid

				          else

				            ULID=pgx_ulid

				          fi

				          LIBS=timescaledb:rag_bge_small_en_v15,rag_jina_reranker_v1_tiny_en:$ULID

				          settings=$(jq -c -n --arg libs $LIBS '{preload_libraries:{use_defaults:false,enabled_libraries:($libs| split(":"))}}')

				          echo settings=$settings >> $GITHUB_OUTPUT

				      - name: Create Neon Project

				        id: create-neon-project

				        uses: ./.github/actions/neon-project-create

				        with:

				          region_id: ${{ inputs.region_id || 'aws-us-east-2' }}

				          postgres_version: ${{ matrix.pg-version }}

				          project_settings: ${{ steps.project-settings.outputs.settings }}

				          api_key: ${{ secrets.NEON_STAGING_API_KEY }}

				      - name: Run the regression tests

				        run: /run-tests.sh -r /ext-src

				        env:

				          BENCHMARK_CONNSTR: ${{ steps.create-neon-project.outputs.dsn }}

				          SKIP: "pg_hint_plan-src,pg_repack-src,pg_cron-src,plpgsql_check-src"

				      - name: Delete Neon Project

				        if: ${{ always() }}

				        uses: ./.github/actions/neon-project-delete

				        with:

				          project_id: ${{ steps.create-neon-project.outputs.project_id }}

				          api_key: ${{ secrets.NEON_STAGING_API_KEY }}

				      - name: Post to a Slack channel

				        if: ${{ github.event.schedule && failure() }}

				        uses: slackapi/slack-github-action@fcfb566f8b0aab22203f066d80ca1d7e4b5d05b3 # v1.27.1

				        with:

				          channel-id: ${{ vars.SLACK_ON_CALL_QA_STAGING_STREAM }}

				          slack-message: |

				            Periodic extensions test on staging: ${{ job.status }}

				            <${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}|GitHub Run>

				        env:

				          SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}

									
.github/workflows/cloud-regress.yml
												
				@@ -0,0 +1,138 @@

				name: Cloud Regression Test

				on:

				  schedule:

				    # * is a special character in YAML so you have to quote this string

				    #          ┌───────────── minute (0 - 59)

				    #          │ ┌───────────── hour (0 - 23)

				    #          │ │ ┌───────────── day of the month (1 - 31)

				    #          │ │ │ ┌───────────── month (1 - 12 or JAN-DEC)

				    #          │ │ │ │ ┌───────────── day of the week (0 - 6 or SUN-SAT)

				    - cron:  '45 1 * * *' # run once a day, timezone is utc

				  workflow_dispatch: # adds ability to run this manually

				defaults:

				  run:

				    shell: bash -euxo pipefail {0}

				concurrency:

				  # Allow only one workflow

				  group: ${{ github.workflow }}

				  cancel-in-progress: true

				permissions:

				  id-token: write # aws-actions/configure-aws-credentials

				  statuses: write

				  contents: write

				jobs:

				  regress:

				    env:

				      POSTGRES_DISTRIB_DIR: /tmp/neon/pg_install

				      TEST_OUTPUT: /tmp/test_output

				      BUILD_TYPE: remote

				    strategy:

				      fail-fast: false

				      matrix:

				        pg-version: [16, 17]

				    runs-on: us-east-2

				    container:

				      image: ghcr.io/neondatabase/build-tools:pinned-bookworm

				      credentials:

				        username: ${{ github.actor }}

				        password: ${{ secrets.GITHUB_TOKEN }}

				      options: --init

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          submodules: true

				      - name: Patch the test

				        env:

				          PG_VERSION: ${{matrix.pg-version}}

				        run: |

				          cd "vendor/postgres-v${PG_VERSION}"

				          patch -p1 < "../../compute/patches/cloud_regress_pg${PG_VERSION}.patch"

				      - name: Generate a random password

				        id: pwgen

				        run: |

				          set +x

				          DBPASS=$(dd if=/dev/random bs=48 count=1 2>/dev/null | base64)

				          echo "::add-mask::${DBPASS//\//}"

				          echo DBPASS="${DBPASS//\//}" >> "${GITHUB_OUTPUT}"

				      - name: Change tests according to the generated password

				        env:

				          DBPASS: ${{ steps.pwgen.outputs.DBPASS }}

				          PG_VERSION: ${{matrix.pg-version}}

				        run: |

				          cd vendor/postgres-v"${PG_VERSION}"/src/test/regress

				          for fname in sql/*.sql expected/*.out; do

				            sed -i.bak s/NEON_PASSWORD_PLACEHOLDER/"'${DBPASS}'"/ "${fname}"

				          done

				          for ph in $(grep NEON_MD5_PLACEHOLDER expected/password.out | awk '{print $3;}' | sort | uniq); do

				            USER=$(echo "${ph}" | cut -c 22-)

				            MD5=md5$(echo -n "${DBPASS}${USER}" | md5sum | awk '{print $1;}')

				            sed -i.bak "s/${ph}/${MD5}/" expected/password.out

				          done

				      - name: Download Neon artifact

				        uses: ./.github/actions/download

				        with:

				          name: neon-${{ runner.os }}-${{ runner.arch }}-release-artifact

				          path: /tmp/neon/

				          prefix: latest

				          aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				      - name: Create a new branch

				        id: create-branch

				        uses: ./.github/actions/neon-branch-create

				        with:

				          api_key: ${{ secrets.NEON_STAGING_API_KEY }}

				          project_id: ${{ vars[format('PGREGRESS_PG{0}_PROJECT_ID', matrix.pg-version)] }}

				      - name: Run the regression tests

				        uses: ./.github/actions/run-python-test-set

				        with:

				          build_type: ${{ env.BUILD_TYPE }}

				          test_selection: cloud_regress

				          pg_version: ${{matrix.pg-version}}

				          extra_params: -m remote_cluster

				          aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				        env:

				          BENCHMARK_CONNSTR: ${{steps.create-branch.outputs.dsn}}

				      - name: Delete branch

				        if: always()

				        uses: ./.github/actions/neon-branch-delete

				        with:

				          api_key: ${{ secrets.NEON_STAGING_API_KEY }}

				          project_id: ${{ vars[format('PGREGRESS_PG{0}_PROJECT_ID', matrix.pg-version)] }}

				          branch_id: ${{steps.create-branch.outputs.branch_id}}

				      - name: Create Allure report

				        id: create-allure-report

				        if: ${{ !cancelled() }}

				        uses: ./.github/actions/allure-report-generate

				        with:

				          aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				      - name: Post to a Slack channel

				        if: ${{ github.event.schedule && failure() }}

				        uses: slackapi/slack-github-action@fcfb566f8b0aab22203f066d80ca1d7e4b5d05b3 # v1.27.1

				        with:

				          channel-id: ${{ vars.SLACK_ON_CALL_QA_STAGING_STREAM }}

				          slack-message: |

				            Periodic pg_regress on staging: ${{ job.status }}

				            <${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}|GitHub Run>

				            <${{ steps.create-allure-report.outputs.report-url }}|Allure report>

				        env:

				          SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}
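
The password steps in this workflow do three things: strip `/` from the generated password, take the user name from each placeholder token with `cut -c 22-`, and compute a PostgreSQL-style MD5 password hash (`md5` followed by the MD5 of password plus user name). A minimal sketch of those pieces follows. The function names are hypothetical, and the placeholder shape `NEON_MD5_PLACEHOLDER_<user>` is an assumption inferred from the 21-character offset, not something the workflow states.

```shell
# Hypothetical helpers (not part of the workflow).

# Remove '/' characters from a password. The workflow uses the bash expansion
# ${DBPASS//\//}; tr is used here so the sketch also runs under POSIX sh.
sanitize_password() {
  printf '%s' "$1" | tr -d '/'
}

# Extract the user name from a placeholder token, assuming the token has the
# form NEON_MD5_PLACEHOLDER_<user> (a 21-character prefix).
placeholder_user() {
  printf '%s\n' "$1" | cut -c 22-
}

# PostgreSQL-style MD5 password hash: "md5" + md5(password || username).
# Arguments: password, user name.
pg_md5_hash() {
  printf 'md5%s\n' "$(printf '%s%s' "$1" "$2" | md5sum | awk '{print $1}')"
}

sanitize_password 'ab/cd/ef'                        # prints abcdef
placeholder_user 'NEON_MD5_PLACEHOLDER_regress_user' # prints regress_user
pg_md5_hash 'secret' 'regress_user'                 # prints md5 followed by 32 hex digits
```

The sketch relies on `md5sum` from GNU coreutils, as the workflow does.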

									
.github/workflows/fast-forward.yml
												
				@@ -0,0 +1,43 @@

				name: Fast forward merge

				on:

				  pull_request:

				    types: [labeled]

				    branches:

				      - release

				      - release-proxy

				      - release-compute

				jobs:

				  fast-forward:

				    if: ${{ github.event.label.name == 'fast-forward' }}

				    runs-on: ubuntu-22.04

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@0634a2670c59f64b4a01f0f96f84700a4088b9f0 # v2.12.0

				        with:

				          egress-policy: audit

				      - name: Remove fast-forward label from PR

				        env:

				          GH_TOKEN: ${{ secrets.CI_ACCESS_TOKEN }}

				        run: |

				          gh pr edit ${{ github.event.pull_request.number }} --repo "${GITHUB_REPOSITORY}" --remove-label "fast-forward"

				      - name: Fast forwarding

				        uses: sequoia-pgp/fast-forward@ea7628bedcb0b0b96e94383ada458d812fca4979

				        # See https://docs.github.com/en/graphql/reference/enums#mergestatestatus

				        if: ${{ contains(fromJSON('["clean", "unstable"]'), github.event.pull_request.mergeable_state) }}

				        with:

				          merge: true

				          comment: on-error

				          github_token: ${{ secrets.CI_ACCESS_TOKEN }}

				      - name: Comment if mergeable_state is not clean

				        if: ${{ !contains(fromJSON('["clean", "unstable"]'), github.event.pull_request.mergeable_state) }}

				        env:

				          GH_TOKEN: ${{ secrets.CI_ACCESS_TOKEN }}

				        run: |

				          gh pr comment ${{ github.event.pull_request.number }} \

				            --repo "${GITHUB_REPOSITORY}" \

				            --body "Not trying to forward pull-request, because \`mergeable_state\` is \`${{ github.event.pull_request.mergeable_state }}\`, not \`clean\` or \`unstable\`."
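
The last two steps of this workflow are mutually exclusive: the fast-forward runs only when `mergeable_state` is `clean` or `unstable`, and the comment is posted for every other state. A minimal sketch of that gate is shown below. The function name `should_fast_forward` is hypothetical, and the states in the loop are only sample inputs.

```shell
# Hypothetical helper (not part of the workflow): returns success when the
# given mergeable_state allows the fast-forward step to run.
should_fast_forward() {
  case "$1" in
    clean|unstable) return 0 ;;
    *) return 1 ;;
  esac
}

# Sample inputs only; see the MergeStateStatus enum linked in the workflow
# for the values GitHub can report.
for state in clean unstable dirty blocked; do
  if should_fast_forward "$state"; then
    echo "$state: fast-forward"
  else
    echo "$state: comment only"
  fi
done
```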

									
.github/workflows/force-test-extensions-upgrade.yml
												
				@@ -0,0 +1,82 @@

				name: Force Test Upgrading of Extension

				on:

				  schedule:

				    # * is a special character in YAML so you have to quote this string

				    #          ┌───────────── minute (0 - 59)

				    #          │ ┌───────────── hour (0 - 23)

				    #          │ │ ┌───────────── day of the month (1 - 31)

				    #          │ │ │ ┌───────────── month (1 - 12 or JAN-DEC)

				    #          │ │ │ │ ┌───────────── day of the week (0 - 6 or SUN-SAT)

				    - cron:  '45 2 * * *' # run once a day, timezone is utc

				  workflow_dispatch: # adds ability to run this manually

				defaults:

				  run:

				    shell: bash -euxo pipefail {0}

				concurrency:

				  # Allow only one workflow

				  group: ${{ github.workflow }}

				  cancel-in-progress: true

				permissions:

				  id-token: write # aws-actions/configure-aws-credentials

				  statuses: write

				  contents: read

				jobs:

				  regress:

				    strategy:

				      fail-fast: false

				      matrix:

				        pg-version: [16, 17]

				    runs-on: small

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          submodules: false

				      - name: Get the last compute release tag

				        id: get-last-compute-release-tag

				        env:

				          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}

				        run: |

				          tag=$(gh api -q '[.[].tag_name | select(startswith("release-compute"))][0]'\

				            -H "Accept: application/vnd.github+json" \

				            -H "X-GitHub-Api-Version: 2022-11-28" \

				            "/repos/${GITHUB_REPOSITORY}/releases")

				          echo tag=${tag} >> ${GITHUB_OUTPUT}

				      - name: Test extension upgrade

				        timeout-minutes: 60

				        env:

				          NEW_COMPUTE_TAG: latest

				          OLD_COMPUTE_TAG: ${{ steps.get-last-compute-release-tag.outputs.tag }}

				          TEST_EXTENSIONS_TAG: ${{ steps.get-last-compute-release-tag.outputs.tag }}

				          PG_VERSION: ${{ matrix.pg-version }}

				          FORCE_ALL_UPGRADE_TESTS: true

				        run: ./docker-compose/test_extensions_upgrade.sh

				      - name: Print logs and clean up

				        if: always()

				        run: |

				          docker compose --profile test-extensions -f ./docker-compose/docker-compose.yml logs || true

				          docker compose --profile test-extensions -f ./docker-compose/docker-compose.yml down

				      - name: Post to the Slack channel

				        if: ${{ github.event.schedule && failure() }}

				        uses: slackapi/slack-github-action@fcfb566f8b0aab22203f066d80ca1d7e4b5d05b3 # v1.27.1

				        with:

				          channel-id: ${{ vars.SLACK_ON_CALL_QA_STAGING_STREAM }}

				          slack-message: |

				            Test upgrading of extensions: ${{ job.status }}

				            <${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}|GitHub Run>

				        env:

				          SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}

									
.github/workflows/ingest_benchmark.yml
												
				@@ -0,0 +1,200 @@

				name: benchmarking ingest

				on:

				  # uncomment to run on push for debugging your PR

				  # push:

				  #   branches: [ your branch ]

				  schedule:

				    # * is a special character in YAML so you have to quote this string

				    #          ┌───────────── minute (0 - 59)

				    #          │ ┌───────────── hour (0 - 23)

				    #          │ │ ┌───────────── day of the month (1 - 31)

				    #          │ │ │ ┌───────────── month (1 - 12 or JAN-DEC)

				    #          │ │ │ │ ┌───────────── day of the week (0 - 6 or SUN-SAT)

				    - cron:   '0 9 * * *' # run once a day, timezone is utc

				  workflow_dispatch: # adds ability to run this manually

				defaults:

				  run:

				    shell: bash -euxo pipefail {0}

				concurrency:

				  # Allow only one workflow globally because we need dedicated resources which only exist once

				  group: ingest-bench-workflow

				  cancel-in-progress: true

				permissions:

				  contents: read

				jobs:

				  ingest:

				    strategy:

				      fail-fast: false # allow other variants to continue even if one fails

				      matrix:

				        include:

				          - target_project: new_empty_project_stripe_size_2048

				            stripe_size: 2048 # 16 MiB

				            postgres_version: 16

				            disable_sharding: false

				          - target_project: new_empty_project_stripe_size_32768

				            stripe_size: 32768 # 256 MiB # note that this is different from null because using null will shard_split the project only if it reaches the threshold

				                               # while here it is sharded from the beginning with a shard size of 256 MiB

				            disable_sharding: false

				            postgres_version: 16

				          - target_project: new_empty_project

				            stripe_size: null # run with neon defaults which will shard split only when reaching the threshold

				            disable_sharding: false

				            postgres_version: 16

				          - target_project: new_empty_project

				            stripe_size: null # run with neon defaults which will shard split only when reaching the threshold

				            disable_sharding: false

				            postgres_version: 17

				          - target_project: large_existing_project

				            stripe_size: null # cannot re-shard or choose a different stripe size for an existing, already sharded project

				            disable_sharding: false

				            postgres_version: 16

				          - target_project: new_empty_project_unsharded

				            stripe_size: null # run with neon defaults which will shard split only when reaching the threshold

				            disable_sharding: true

				            postgres_version: 16

				      max-parallel: 1 # we want to run each stripe size sequentially to be able to compare the results

				    permissions:

				      contents: write

				      statuses: write

				      id-token: write # aws-actions/configure-aws-credentials

				    env:

				      PG_CONFIG: /tmp/neon/pg_install/v16/bin/pg_config

				      PSQL: /tmp/neon/pg_install/v16/bin/psql

				      PG_16_LIB_PATH: /tmp/neon/pg_install/v16/lib

				      PGCOPYDB: /pgcopydb/bin/pgcopydb

				      PGCOPYDB_LIB_PATH: /pgcopydb/lib

				    runs-on: [ self-hosted, us-east-2, x64 ]

				    container:

				      image: ghcr.io/neondatabase/build-tools:pinned-bookworm

				      credentials:

				        username: ${{ github.actor }}

				        password: ${{ secrets.GITHUB_TOKEN }}

				      options: --init

				    timeout-minutes: 1440

				    steps:

				    - name: Harden the runner (Audit all outbound calls)

				      uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				      with:

				        egress-policy: audit

				    - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				    - name: Configure AWS credentials # necessary to download artefacts

				      uses: aws-actions/configure-aws-credentials@e3dd6a429d7300a6a4c196c26e071d42e0343502 # v4.0.2

				      with:

				        aws-region: eu-central-1

				        role-to-assume: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				        role-duration-seconds: 18000 # 5 hours is currently max associated with IAM role

				    - name: Download Neon artifact

				      uses: ./.github/actions/download

				      with:

				        name: neon-${{ runner.os }}-${{ runner.arch }}-release-artifact

				        path: /tmp/neon/

				        prefix: latest

				        aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				    - name: Create Neon Project

				      if: ${{ startsWith(matrix.target_project, 'new_empty_project') }}

				      id: create-neon-project-ingest-target

				      uses: ./.github/actions/neon-project-create

				      with:

				        region_id: aws-us-east-2

				        postgres_version: ${{ matrix.postgres_version }}

				        compute_units: '[7, 7]' # we want to test large compute here to avoid compute-side bottleneck

				        api_key: ${{ secrets.NEON_STAGING_API_KEY }}

				        shard_split_project: ${{ matrix.stripe_size != null && 'true' || 'false' }}

				        admin_api_key: ${{ secrets.NEON_STAGING_ADMIN_API_KEY }}

				        shard_count: 8

				        stripe_size: ${{ matrix.stripe_size }}

				        disable_sharding: ${{ matrix.disable_sharding }}

				    - name: Initialize Neon project

				      if: ${{ startsWith(matrix.target_project, 'new_empty_project') }}

				      env:

				          BENCHMARK_INGEST_TARGET_CONNSTR: ${{ steps.create-neon-project-ingest-target.outputs.dsn }}

				          NEW_PROJECT_ID: ${{ steps.create-neon-project-ingest-target.outputs.project_id }}

				      run: |

				        echo "Initializing Neon project with project_id: ${NEW_PROJECT_ID}"

				        export LD_LIBRARY_PATH=${PG_16_LIB_PATH}

				        ${PSQL} "${BENCHMARK_INGEST_TARGET_CONNSTR}" -c "CREATE EXTENSION IF NOT EXISTS neon; CREATE EXTENSION IF NOT EXISTS neon_utils;"

				        echo "BENCHMARK_INGEST_TARGET_CONNSTR=${BENCHMARK_INGEST_TARGET_CONNSTR}" >> $GITHUB_ENV

				    - name: Create Neon Branch for large tenant

				      if: ${{ matrix.target_project == 'large_existing_project' }}

				      id: create-neon-branch-ingest-target

				      uses: ./.github/actions/neon-branch-create

				      with:

				        project_id: ${{ vars.BENCHMARK_INGEST_TARGET_PROJECTID }}

				        api_key: ${{ secrets.NEON_STAGING_API_KEY }}

				    - name: Initialize Neon project

				      if: ${{ matrix.target_project == 'large_existing_project' }}

				      env:

				          BENCHMARK_INGEST_TARGET_CONNSTR: ${{ steps.create-neon-branch-ingest-target.outputs.dsn }}

				          NEW_BRANCH_ID: ${{ steps.create-neon-branch-ingest-target.outputs.branch_id }}

				      run: |

				        echo "Initializing Neon branch with branch_id: ${NEW_BRANCH_ID}"

				        export LD_LIBRARY_PATH=${PG_16_LIB_PATH}

				        # Extract the part before the database name

				        base_connstr="${BENCHMARK_INGEST_TARGET_CONNSTR%/*}"

				        # Extract the query parameters (if any) after the database name

				        query_params="${BENCHMARK_INGEST_TARGET_CONNSTR#*\?}"

				        # Reconstruct the new connection string

				        if [ "$query_params" != "$BENCHMARK_INGEST_TARGET_CONNSTR" ]; then

				          new_connstr="${base_connstr}/neondb?${query_params}"

				        else

				          new_connstr="${base_connstr}/neondb"

				        fi

				        ${PSQL} "${new_connstr}" -c "drop database ludicrous;"

				        ${PSQL} "${new_connstr}" -c "CREATE DATABASE ludicrous;"

				        if [ "$query_params" != "$BENCHMARK_INGEST_TARGET_CONNSTR" ]; then

				          BENCHMARK_INGEST_TARGET_CONNSTR="${base_connstr}/ludicrous?${query_params}"

				        else

				          BENCHMARK_INGEST_TARGET_CONNSTR="${base_connstr}/ludicrous"

				        fi

				        ${PSQL} "${BENCHMARK_INGEST_TARGET_CONNSTR}" -c "CREATE EXTENSION IF NOT EXISTS neon; CREATE EXTENSION IF NOT EXISTS neon_utils;"

				        echo "BENCHMARK_INGEST_TARGET_CONNSTR=${BENCHMARK_INGEST_TARGET_CONNSTR}" >> $GITHUB_ENV

				    - name: Invoke pgcopydb

				      uses: ./.github/actions/run-python-test-set

				      with:

				        build_type: remote

				        test_selection: performance/test_perf_ingest_using_pgcopydb.py

				        run_in_parallel: false

				        extra_params: -s -m remote_cluster --timeout 86400 -k test_ingest_performance_using_pgcopydb

				        pg_version: v${{ matrix.postgres_version }}

				        save_perf_report: true

				        aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				      env:

				        BENCHMARK_INGEST_SOURCE_CONNSTR: ${{ secrets.BENCHMARK_INGEST_SOURCE_CONNSTR }}

				        TARGET_PROJECT_TYPE: ${{ matrix.target_project }}

				        # we report PLATFORM in zenbenchmark NeonBenchmarker perf database and want to distinguish between new project and large tenant

				        PLATFORM: "${{ matrix.target_project }}-us-east-2-staging"

				        PERF_TEST_RESULT_CONNSTR: "${{ secrets.PERF_TEST_RESULT_CONNSTR }}"

				    - name: show tables sizes after ingest

				      run: |

				        export LD_LIBRARY_PATH=${PG_16_LIB_PATH}

				        ${PSQL} "${BENCHMARK_INGEST_TARGET_CONNSTR}" -c "\dt+"

				    - name: Delete Neon Project

				      if: ${{ always() && startsWith(matrix.target_project, 'new_empty_project') }}

				      uses: ./.github/actions/neon-project-delete

				      with:

				        project_id: ${{ steps.create-neon-project-ingest-target.outputs.project_id }}

				        api_key: ${{ secrets.NEON_STAGING_API_KEY }}

				    - name: Delete Neon Branch for large tenant

				      if: ${{ always() && matrix.target_project == 'large_existing_project' }}

				      uses: ./.github/actions/neon-branch-delete

				      with:

				        project_id: ${{ vars.BENCHMARK_INGEST_TARGET_PROJECTID }}

				        branch_id: ${{ steps.create-neon-branch-ingest-target.outputs.branch_id }}

				        api_key: ${{ secrets.NEON_STAGING_API_KEY }}
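
The "Initialize Neon project" step for `large_existing_project` swaps the database name in the branch DSN using bash parameter expansion. A minimal sketch of that technique follows. The function name `replace_dbname` and the example hosts are hypothetical, and the sketch assumes a URI of the shape `scheme://user:pass@host/dbname[?params]` whose query parameters contain no `/`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the workflow): swap the database name
# in a Postgres connection URI, using the same expansions as the step above.
replace_dbname() {
  local connstr="$1" new_db="$2"
  local base="${connstr%/*}"      # strip the shortest suffix starting at the last "/"
  local params="${connstr#*\?}"   # strip the shortest prefix ending at the first "?"
  if [ "$params" != "$connstr" ]; then
    # A "?" was present, so re-attach the query parameters.
    printf '%s/%s?%s\n' "$base" "$new_db" "$params"
  else
    # No "?" found: the expansion left the string unchanged.
    printf '%s/%s\n' "$base" "$new_db"
  fi
}

replace_dbname "postgres://user:secret@ep-example.invalid/neondb?sslmode=require" ludicrous
replace_dbname "postgres://user:secret@ep-example.invalid/neondb" ludicrous
```

Comparing the expansion result with the original string is how the step detects whether a `?` was present at all: `${var#pattern}` returns the value unchanged when the pattern does not match.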

									
										10

.github/workflows/label-for-external-users.yml
									
										vendored
									
												
				@@ -27,6 +27,11 @@ jobs:

				      is-member: ${{ steps.check-user.outputs.is-member }}

				    steps:

				    - name: Harden the runner (Audit all outbound calls)

				      uses: step-security/harden-runner@0634a2670c59f64b4a01f0f96f84700a4088b9f0 # v2.12.0

				      with:

				        egress-policy: audit

				    - name: Check whether `${{ github.actor }}` is a member of `${{ github.repository_owner }}`

				      id: check-user

				      env:

				@@ -69,6 +74,11 @@ jobs:

				      issues: write        # for `gh issue edit`

				    steps:

				    - name: Harden the runner (Audit all outbound calls)

				      uses: step-security/harden-runner@0634a2670c59f64b4a01f0f96f84700a4088b9f0 # v2.12.0

				      with:

				        egress-policy: audit

				    - name: Add `${{ env.LABEL }}` label

				      env:

				        GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}

									
										203

.github/workflows/large_oltp_benchmark.yml
									
										vendored
									
										Normal file
									
												
				@@ -0,0 +1,203 @@

				name: large oltp benchmark

				on:

				  # uncomment to run on push for debugging your PR

				  #push:

				  #  branches: [ bodobolero/synthetic_oltp_workload ]

				  schedule:

				    # * is a special character in YAML so you have to quote this string

				    #          ┌───────────── minute (0 - 59)

				    #          │ ┌───────────── hour (0 - 23)

				    #          │ │  ┌───────────── day of the month (1 - 31)

				    #          │ │  │ ┌───────────── month (1 - 12 or JAN-DEC)

				    #          │ │  │ │ ┌───────────── day of the week (0 - 6 or SUN-SAT)

				    - cron:   '0 15 * * 0,2,4' # run on Sunday, Tuesday, Thursday at 3 PM UTC

				  workflow_dispatch: # adds ability to run this manually

				defaults:

				  run:

				    shell: bash -euxo pipefail {0}

				concurrency:

				  # Allow only one workflow globally because we need dedicated resources which only exist once

				  group: large-oltp-bench-workflow

				  cancel-in-progress: false

				permissions:

				  contents: read

				jobs:

				  oltp:

				    strategy:

				      fail-fast: false # allow other variants to continue even if one fails

				      matrix:

				        include:

				          # test only read-only custom scripts in new branch without database maintenance

				          - target: new_branch

				            custom_scripts: select_any_webhook_with_skew.sql@300 select_recent_webhook.sql@397 select_prefetch_webhook.sql@3

				            test_maintenance: false

				          # test all custom scripts in new branch with database maintenance

				          - target: new_branch

				            custom_scripts: insert_webhooks.sql@200 select_any_webhook_with_skew.sql@300 select_recent_webhook.sql@397 select_prefetch_webhook.sql@3 IUD_one_transaction.sql@100

				            test_maintenance: true

				          # test all custom scripts in reuse branch with database maintenance

				          - target: reuse_branch

				            custom_scripts: insert_webhooks.sql@200 select_any_webhook_with_skew.sql@300 select_recent_webhook.sql@397 select_prefetch_webhook.sql@3 IUD_one_transaction.sql@100

				            test_maintenance: true

				      max-parallel: 1 # we want to run each benchmark sequentially to not have noisy neighbors on shared storage (PS, SK)

				    permissions:

				      contents: write

				      statuses: write

				      id-token: write # aws-actions/configure-aws-credentials

				    env:

				      TEST_PG_BENCH_DURATIONS_MATRIX: "1h" # todo update to > 1 h

				      TEST_PGBENCH_CUSTOM_SCRIPTS: ${{ matrix.custom_scripts }}

				      POSTGRES_DISTRIB_DIR: /tmp/neon/pg_install

				      PG_VERSION: 16 # dictated by the pre-existing project

				      TEST_OUTPUT: /tmp/test_output

				      BUILD_TYPE: remote

				      PLATFORM: ${{ matrix.target }}

				    runs-on: [ self-hosted, us-east-2, x64 ]

				    container:

				      image: ghcr.io/neondatabase/build-tools:pinned-bookworm

				      credentials:

				        username: ${{ github.actor }}

				        password: ${{ secrets.GITHUB_TOKEN }}

				      options: --init

				    # Increase timeout to 2 days, default timeout is 6h - database maintenance can take a long time

				    # (normally 1h pgbench, 3h vacuum analyze, 3.5h re-index) x 2 = 15h, leave some buffer for regressions

				    # in one run vacuum didn't finish within 12 hours

				    timeout-minutes: 2880

				    steps:

				    - name: Harden the runner (Audit all outbound calls)

				      uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				      with:

				        egress-policy: audit

				    - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				    - name: Configure AWS credentials # necessary to download artefacts

				      uses: aws-actions/configure-aws-credentials@e3dd6a429d7300a6a4c196c26e071d42e0343502 # v4.0.2

				      with:

				        aws-region: eu-central-1

				        role-to-assume: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				        role-duration-seconds: 18000 # 5 hours is currently max associated with IAM role

				    - name: Download Neon artifact

				      uses: ./.github/actions/download

				      with:

				        name: neon-${{ runner.os }}-${{ runner.arch }}-release-artifact

				        path: /tmp/neon/

				        prefix: latest

				        aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				    - name: Create Neon Branch for large tenant

				      if: ${{ matrix.target == 'new_branch' }}

				      id: create-neon-branch-oltp-target

				      uses: ./.github/actions/neon-branch-create

				      with:

				          project_id: ${{ vars.BENCHMARK_LARGE_OLTP_PROJECTID }}

				          api_key: ${{ secrets.NEON_STAGING_API_KEY }}

				    - name: Set up Connection String

				      id: set-up-connstr

				      run: |

				        case "${{ matrix.target }}" in

				          new_branch)

				          CONNSTR=${{ steps.create-neon-branch-oltp-target.outputs.dsn }}

				          ;;

				          reuse_branch)

				          CONNSTR=${{ secrets.BENCHMARK_LARGE_OLTP_REUSE_CONNSTR }}

				          ;;

				          *)

				          echo >&2 "Unknown target=${{ matrix.target }}"

				          exit 1

				          ;;

				        esac

				        CONNSTR_WITHOUT_POOLER="${CONNSTR//-pooler/}"

				        echo "connstr=${CONNSTR}" >> $GITHUB_OUTPUT

				        echo "connstr_without_pooler=${CONNSTR_WITHOUT_POOLER}" >> $GITHUB_OUTPUT

				    - name: Delete rows from prior runs in reuse branch

				      if: ${{ matrix.target == 'reuse_branch' }}

				      env:

				          BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr_without_pooler }}

				          PG_CONFIG: /tmp/neon/pg_install/v16/bin/pg_config

				          PSQL: /tmp/neon/pg_install/v16/bin/psql

				          PG_16_LIB_PATH: /tmp/neon/pg_install/v16/lib

				      run: |

				        echo "$(date '+%Y-%m-%d %H:%M:%S') - Deleting rows in table webhook.incoming_webhooks from prior runs"

				        export LD_LIBRARY_PATH=${PG_16_LIB_PATH}

				        ${PSQL} "${BENCHMARK_CONNSTR}" -c "SET statement_timeout = 0; DELETE FROM webhook.incoming_webhooks WHERE created_at > '2025-02-27 23:59:59+00';"

				        echo "$(date '+%Y-%m-%d %H:%M:%S') - Finished deleting rows in table webhook.incoming_webhooks from prior runs"

				    - name: Benchmark pgbench with custom-scripts

				      uses: ./.github/actions/run-python-test-set

				      with:

				        build_type: ${{ env.BUILD_TYPE }}

				        test_selection: performance

				        run_in_parallel: false

				        save_perf_report: true

				        extra_params: -m remote_cluster --timeout 7200 -k test_perf_oltp_large_tenant_pgbench

				        pg_version: ${{ env.PG_VERSION }}

				        aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				      env:

				        BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}

				        VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"

				        PERF_TEST_RESULT_CONNSTR: "${{ secrets.PERF_TEST_RESULT_CONNSTR }}"

				    - name: Benchmark database maintenance

				      if: ${{ matrix.test_maintenance }}

				      uses: ./.github/actions/run-python-test-set

				      with:

				        build_type: ${{ env.BUILD_TYPE }}

				        test_selection: performance

				        run_in_parallel: false

				        save_perf_report: true

				        extra_params: -m remote_cluster --timeout 172800 -k test_perf_oltp_large_tenant_maintenance

				        pg_version: ${{ env.PG_VERSION }}

				        aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				      env:

				        BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr_without_pooler }}

				        VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"

				        PERF_TEST_RESULT_CONNSTR: "${{ secrets.PERF_TEST_RESULT_CONNSTR }}"

				    - name: Delete Neon Branch for large tenant

				      if: ${{ always() && matrix.target == 'new_branch' }}

				      uses: ./.github/actions/neon-branch-delete

				      with:

				        project_id: ${{ vars.BENCHMARK_LARGE_OLTP_PROJECTID }}

				        branch_id: ${{ steps.create-neon-branch-oltp-target.outputs.branch_id }}

				        api_key: ${{ secrets.NEON_STAGING_API_KEY }}

				    - name: Configure AWS credentials # again because prior steps could have exceeded 5 hours

				      uses: aws-actions/configure-aws-credentials@e3dd6a429d7300a6a4c196c26e071d42e0343502 # v4.0.2

				      with:

				        aws-region: eu-central-1

				        role-to-assume: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				        role-duration-seconds: 18000 # 5 hours

				    - name: Create Allure report

				      id: create-allure-report

				      if: ${{ !cancelled() }}

				      uses: ./.github/actions/allure-report-generate

				      with:

				        aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				    - name: Post to a Slack channel

				      if: ${{ github.event.schedule && failure() }}

				      uses: slackapi/slack-github-action@fcfb566f8b0aab22203f066d80ca1d7e4b5d05b3 # v1.27.1

				      with:

				        channel-id: "C06KHQVQ7U3" # on-call-qa-staging-stream

				        slack-message: |

				          Periodic large oltp perf testing: ${{ job.status }}

				          <${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}|GitHub Run>

				          <${{ steps.create-allure-report.outputs.report-url }}|Allure report>

				      env:

				        SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}
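
The `custom_scripts` matrix entries in this workflow use a `file@N` form. Assuming the suffix is the relative script weight that pgbench accepts in `-f filename[@weight]`, the mix each entry produces can be estimated with a short sketch. The function name `print_script_mix` is hypothetical, and the sketch assumes every entry carries an explicit integer weight.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the workflow): print each script's share
# of the workload from a space-separated list of "file@weight" specs.
print_script_mix() {
  local specs="$1" total=0 spec weight
  # First pass: sum all weights.
  for spec in $specs; do
    total=$(( total + ${spec##*@} ))
  done
  # Second pass: report each weight as a fraction and in integer per mille,
  # so that very small weights such as 3/700 remain visible.
  for spec in $specs; do
    weight="${spec##*@}"
    printf '%s %d/%d (%d per mille)\n' "${spec%@*}" "$weight" "$total" $(( weight * 1000 / total ))
  done
}

print_script_mix "select_any_webhook_with_skew.sql@300 select_recent_webhook.sql@397 select_prefetch_webhook.sql@3"
```

For the read-only variant the weights sum to 700, so `select_prefetch_webhook.sql@3` would be chosen for roughly 4 in every 1000 transactions under this assumption.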

									
										175

.github/workflows/large_oltp_growth.yml
									
										vendored
									
										Normal file
									
												
				@@ -0,0 +1,175 @@

				name: large oltp growth

				# workflow to grow the reuse branch of large oltp benchmark continuously (about 16 GB per run)

				on:

				  # uncomment to run on push for debugging your PR

				  # push:

				  #  branches: [ bodobolero/increase_large_oltp_workload ]

				  schedule:

				    # * is a special character in YAML so you have to quote this string

				    #        ┌───────────── minute (0 - 59)

				    #        │ ┌───────────── hour (0 - 23)

				    #        │ │  ┌───────────── day of the month (1 - 31)

				    #        │ │  │ ┌───────────── month (1 - 12 or JAN-DEC)

				    #        │ │  │ │ ┌───────────── day of the week (0 - 6 or SUN-SAT)

				    - cron: '0 6 * * *'   # 06:00 UTC

				    - cron: '0 8 * * *'   # 08:00 UTC

				    - cron: '0 10 * * *'  # 10:00 UTC

				    - cron: '0 12 * * *'  # 12:00 UTC

				    - cron: '0 14 * * *'  # 14:00 UTC

				    - cron: '0 16 * * *'  # 16:00 UTC

				  workflow_dispatch: # adds ability to run this manually

				defaults:

				  run:

				    shell: bash -euxo pipefail {0}

				concurrency:

				  # Allow only one workflow globally because we need dedicated resources which only exist once

				  group: large-oltp-growth

				  cancel-in-progress: true

				permissions:

				  contents: read

				jobs:

				  oltp:

				    strategy:

				      fail-fast: false # allow other variants to continue even if one fails

				      matrix:

				        include:

				          # for now only grow the reuse branch, not the other branches.

				          - target: reuse_branch

				            custom_scripts:

				            - grow_action_blocks.sql

				            - grow_action_kwargs.sql

				            - grow_device_fingerprint_event.sql

				            - grow_edges.sql

				            - grow_hotel_rate_mapping.sql

				            - grow_ocr_pipeline_results_version.sql

				            - grow_priceline_raw_response.sql

				            - grow_relabled_transactions.sql

				            - grow_state_values.sql

				            - grow_values.sql

				            - grow_vertices.sql

				            - update_accounting_coding_body_tracking_category_selection.sql

				            - update_action_blocks.sql

				            - update_action_kwargs.sql

				            - update_denormalized_approval_workflow.sql

				            - update_device_fingerprint_event.sql

				            - update_edges.sql

				            - update_heron_transaction_enriched_log.sql

				            - update_heron_transaction_enrichment_requests.sql

				            - update_hotel_rate_mapping.sql

				            - update_incoming_webhooks.sql

				            - update_manual_transaction.sql

				            - update_ml_receipt_matching_log.sql

				            - update_ocr_pipeine_results_version.sql

				            - update_orc_pipeline_step_results.sql

				            - update_orc_pipeline_step_results_version.sql

				            - update_priceline_raw_response.sql

				            - update_quickbooks_transactions.sql

				            - update_raw_finicity_transaction.sql

				            - update_relabeled_transactions.sql

				            - update_state_values.sql

				            - update_stripe_authorization_event_log.sql

				            - update_transaction.sql

				            - update_values.sql

				            - update_vertices.sql

				      max-parallel: 1 # we want to run each growth workload sequentially (for now there is just one)

				    permissions:

				      contents: write

				      statuses: write

				      id-token: write # aws-actions/configure-aws-credentials

				    env:

				      TEST_PG_BENCH_DURATIONS_MATRIX: "1h"

				      TEST_PGBENCH_CUSTOM_SCRIPTS: ${{ join(matrix.custom_scripts, ' ') }}

				      POSTGRES_DISTRIB_DIR: /tmp/neon/pg_install

				      PG_VERSION: 16 # dictated by the pre-existing project

				      TEST_OUTPUT: /tmp/test_output

				      BUILD_TYPE: remote

				      PLATFORM: ${{ matrix.target }}

				    runs-on: [ self-hosted, us-east-2, x64 ]

				    container:

				      image: ghcr.io/neondatabase/build-tools:pinned-bookworm

				      credentials:

				        username: ${{ github.actor }}

				        password: ${{ secrets.GITHUB_TOKEN }}

				      options: --init

				    steps:

				    - name: Harden the runner (Audit all outbound calls)

				      uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				      with:

				        egress-policy: audit

				    - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				    - name: Configure AWS credentials # necessary to download artefacts

				      uses: aws-actions/configure-aws-credentials@e3dd6a429d7300a6a4c196c26e071d42e0343502 # v4.0.2

				      with:

				        aws-region: eu-central-1

				        role-to-assume: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				        role-duration-seconds: 18000 # 5 hours is currently max associated with IAM role

				    - name: Download Neon artifact

				      uses: ./.github/actions/download

				      with:

				        name: neon-${{ runner.os }}-${{ runner.arch }}-release-artifact

				        path: /tmp/neon/

				        prefix: latest

				        aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				    - name: Set up Connection String

				      id: set-up-connstr

				      run: |

				        case "${{ matrix.target }}" in

				          reuse_branch)

				          CONNSTR=${{ secrets.BENCHMARK_LARGE_OLTP_REUSE_CONNSTR }}

				          ;;

				          *)

				          echo >&2 "Unknown target=${{ matrix.target }}"

				          exit 1

				          ;;

				        esac

				        CONNSTR_WITHOUT_POOLER="${CONNSTR//-pooler/}"

				        echo "connstr=${CONNSTR}" >> $GITHUB_OUTPUT

				        echo "connstr_without_pooler=${CONNSTR_WITHOUT_POOLER}" >> $GITHUB_OUTPUT

				    - name: pgbench with custom-scripts

				      uses: ./.github/actions/run-python-test-set

				      with:

				        build_type: ${{ env.BUILD_TYPE }}

				        test_selection: performance

				        run_in_parallel: false

				        save_perf_report: true

				        extra_params: -m remote_cluster --timeout 7200 -k test_perf_oltp_large_tenant_growth

				        pg_version: ${{ env.PG_VERSION }}

				        aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				      env:

				        BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}

				        VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"

				        PERF_TEST_RESULT_CONNSTR: "${{ secrets.PERF_TEST_RESULT_CONNSTR }}"

				    - name: Create Allure report

				      id: create-allure-report

				      if: ${{ !cancelled() }}

				      uses: ./.github/actions/allure-report-generate

				      with:

				        aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				    - name: Post to a Slack channel

				      if: ${{ github.event.schedule && failure() }}

				      uses: slackapi/slack-github-action@fcfb566f8b0aab22203f066d80ca1d7e4b5d05b3 # v1.27.1

				      with:

				        channel-id: "C06KHQVQ7U3" # on-call-qa-staging-stream

				        slack-message: |

				          Periodic large oltp tenant growth increase: ${{ job.status }}

				          <${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}|GitHub Run>

				          <${{ steps.create-allure-report.outputs.report-url }}|Allure report>

				      env:

				        SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}
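
The "Set up Connection String" step derives a direct (non-pooled) connection string with `${CONNSTR//-pooler/}`. A minimal sketch of that expansion is shown below. The function name `strip_pooler` and the example hostname are hypothetical, and the sketch assumes the pooled endpoint differs from the direct one only by a `-pooler` suffix on the endpoint ID.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the workflow): remove "-pooler" from a
# connection string, using the same expansion as the workflow step.
strip_pooler() {
  local connstr="$1"
  # "//" replaces every occurrence, not just the first one.
  printf '%s\n' "${connstr//-pooler/}"
}

strip_pooler "postgres://user:secret@ep-example-123-pooler.us-east-2.aws.example.invalid/neondb"
```

Because `//` is a global replacement, a user name, password, or database name that happens to contain `-pooler` would be altered as well. That is acceptable only as long as the stored connection string is known not to contain the substring elsewhere.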

									
										32

.github/workflows/lint-release-pr.yml
									
										vendored
									
										Normal file
									
												
				@@ -0,0 +1,32 @@

				name: Lint Release PR

				on:

				  pull_request:

				    branches:

				      - release

				      - release-proxy

				      - release-compute

				permissions:

				  contents: read

				jobs:

				  lint-release-pr:

				    runs-on: ubuntu-22.04

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - name: Checkout PR branch

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0  # Fetch full history for git operations

				          ref: ${{ github.event.pull_request.head.ref }}

				      - name: Run lint script

				        env:

				          RELEASE_BRANCH: ${{ github.base_ref }}

				        run: |

				          ./.github/scripts/lint-release-pr.sh

									
										169

.github/workflows/neon_extra_builds.yml
									
										vendored
									
												
				@@ -26,124 +26,75 @@ jobs:

				    with:

				      github-event-name: ${{ github.event_name}}

				  check-build-tools-image:

				    needs: [ check-permissions ]

				    uses: ./.github/workflows/check-build-tools-image.yml

				  build-build-tools-image:

				    needs: [ check-build-tools-image ]

				    needs: [ check-permissions ]

				    uses: ./.github/workflows/build-build-tools-image.yml

				    with:

				      image-tag: ${{ needs.check-build-tools-image.outputs.image-tag }}

				    secrets: inherit

				  check-macos-build:

				    needs: [ check-permissions ]

				    if: |

				      contains(github.event.pull_request.labels.*.name, 'run-extra-build-macos')  ||

				      contains(github.event.pull_request.labels.*.name, 'run-extra-build-*') ||

				      github.ref_name == 'main'

				    timeout-minutes: 90

				    runs-on: macos-14

				    env:

				      # Use release build only, to have less debug info around

				      # Hence keeping target/ (and general cache size) smaller

				      BUILD_TYPE: release

				  files-changed:

				    name: Detect what files changed

				    runs-on: ubuntu-22.04

				    timeout-minutes: 3

				    outputs:

				      v17: ${{ steps.files_changed.outputs.v17 }}

				      postgres_changes: ${{ steps.postgres_changes.outputs.changes }}

				      rebuild_rust_code: ${{ steps.files_changed.outputs.rust_code }}

				      rebuild_everything: ${{ steps.files_changed.outputs.rebuild_neon_extra || steps.files_changed.outputs.rebuild_macos }}

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - name: Checkout

				        uses: actions/checkout@v4

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          submodules: true

				      - name: Install macOS postgres dependencies

				        run: brew install flex bison openssl protobuf icu4c pkg-config

				      - name: Set pg 14 revision for caching

				        id: pg_v14_rev

				        run: echo pg_rev=$(git rev-parse HEAD:vendor/postgres-v14) >> $GITHUB_OUTPUT

				      - name: Set pg 15 revision for caching

				        id: pg_v15_rev

				        run: echo pg_rev=$(git rev-parse HEAD:vendor/postgres-v15) >> $GITHUB_OUTPUT

				      - name: Set pg 16 revision for caching

				        id: pg_v16_rev

				        run: echo pg_rev=$(git rev-parse HEAD:vendor/postgres-v16) >> $GITHUB_OUTPUT

				      - name: Cache postgres v14 build

				        id: cache_pg_14

				        uses: actions/cache@v4

				      - name: Check for Postgres changes

				        uses: dorny/paths-filter@1441771bbfdd59dcd748680ee64ebd8faab1a242  #v3

				        id: files_changed

				        with:

				          path: pg_install/v14

				          key: v1-${{ runner.os }}-${{ runner.arch }}-${{ env.BUILD_TYPE }}-pg-${{ steps.pg_v14_rev.outputs.pg_rev }}-${{ hashFiles('Makefile') }}

				          token: ${{ github.token }}

				          filters: .github/file-filters.yaml

				          base: ${{ github.event_name != 'pull_request' && (github.event.merge_group.base_ref || github.ref_name) || '' }}

				          ref: ${{ github.event_name != 'pull_request' && (github.event.merge_group.head_ref || github.ref) || '' }}

				      - name: Cache postgres v15 build

				        id: cache_pg_15

				        uses: actions/cache@v4

				        with:

				          path: pg_install/v15

				          key: v1-${{ runner.os }}-${{ runner.arch }}-${{ env.BUILD_TYPE }}-pg-${{ steps.pg_v15_rev.outputs.pg_rev }}-${{ hashFiles('Makefile') }}

				      - name: Cache postgres v16 build

				        id: cache_pg_16

				        uses: actions/cache@v4

				        with:

				          path: pg_install/v16

				          key: v1-${{ runner.os }}-${{ runner.arch }}-${{ env.BUILD_TYPE }}-pg-${{ steps.pg_v16_rev.outputs.pg_rev }}-${{ hashFiles('Makefile') }}

				      - name: Set extra env for macOS

				      - name: Filter out only v-string for build matrix

				        id: postgres_changes

				        env:

				          CHANGES: ${{ steps.files_changed.outputs.changes }}

				        run: |

				          echo 'LDFLAGS=-L/usr/local/opt/openssl@3/lib' >> $GITHUB_ENV

				          echo 'CPPFLAGS=-I/usr/local/opt/openssl@3/include' >> $GITHUB_ENV

				          v_strings_only_as_json_array=$(echo ${CHANGES} | jq '.[]|select(test("v\\d+"))' | jq --slurp -c)

				          echo "changes=${v_strings_only_as_json_array}" | tee -a "${GITHUB_OUTPUT}"

				      - name: Cache cargo deps

				        uses: actions/cache@v4

				        with:

				          path: |

				            ~/.cargo/registry

				            !~/.cargo/registry/src

				            ~/.cargo/git

				            target

				          key: v1-${{ runner.os }}-${{ runner.arch }}-cargo-${{ hashFiles('./Cargo.lock') }}-${{ hashFiles('./rust-toolchain.toml') }}-rust

				      - name: Build postgres v14

				        if: steps.cache_pg_14.outputs.cache-hit != 'true'

				        run: make postgres-v14 -j$(sysctl -n hw.ncpu)

				      - name: Build postgres v15

				        if: steps.cache_pg_15.outputs.cache-hit != 'true'

				        run: make postgres-v15 -j$(sysctl -n hw.ncpu)

				      - name: Build postgres v16

				        if: steps.cache_pg_16.outputs.cache-hit != 'true'

				        run: make postgres-v16 -j$(sysctl -n hw.ncpu)

				      - name: Build neon extensions

				        run: make neon-pg-ext -j$(sysctl -n hw.ncpu)

				      - name: Build walproposer-lib

				        run: make walproposer-lib -j$(sysctl -n hw.ncpu)

				      - name: Run cargo build

				        run: PQ_LIB_DIR=$(pwd)/pg_install/v16/lib cargo build --all --release

				      - name: Check that no warnings are produced

				        run: ./run_clippy.sh

				  check-macos-build:

				    needs: [ check-permissions, files-changed ]

				    uses: ./.github/workflows/build-macos.yml

				    with:

				      pg_versions: ${{ needs.files-changed.outputs.postgres_changes }}

				      rebuild_rust_code: ${{ fromJSON(needs.files-changed.outputs.rebuild_rust_code) }}

				      rebuild_everything: ${{ fromJSON(needs.files-changed.outputs.rebuild_everything) }}

				  gather-rust-build-stats:

				    needs: [ check-permissions, build-build-tools-image ]

				    needs: [ check-permissions, build-build-tools-image, files-changed ]

				    permissions:

				      id-token: write # aws-actions/configure-aws-credentials

				      statuses: write

				      contents: write

				    if: |

				      contains(github.event.pull_request.labels.*.name, 'run-extra-build-stats') ||

				      contains(github.event.pull_request.labels.*.name, 'run-extra-build-*') ||

				      github.ref_name == 'main'

				      (needs.files-changed.outputs.v17 == 'true' || needs.files-changed.outputs.rebuild_everything == 'true') && (

				        contains(github.event.pull_request.labels.*.name, 'run-extra-build-stats') ||

				        contains(github.event.pull_request.labels.*.name, 'run-extra-build-*') ||

				        github.ref_name == 'main'

				      )

				    runs-on: [ self-hosted, large ]

				    container:

				      image: ${{ needs.build-build-tools-image.outputs.image }}

				      image: ${{ needs.build-build-tools-image.outputs.image }}-bookworm

				      credentials:

				        username: ${{ secrets.NEON_DOCKERHUB_USERNAME }}

				        password: ${{ secrets.NEON_DOCKERHUB_PASSWORD }}

				        username: ${{ github.actor }}

				        password: ${{ secrets.GITHUB_TOKEN }}

				      options: --init

				    env:

				@@ -153,8 +104,13 @@ jobs:

				      CARGO_INCREMENTAL: 0

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - name: Checkout

				        uses: actions/checkout@v4

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          submodules: true

				@@ -166,26 +122,33 @@ jobs:

				        run: make walproposer-lib -j$(nproc)

				      - name: Produce the build stats

				        run: PQ_LIB_DIR=$(pwd)/pg_install/v16/lib cargo build --all --release --timings -j$(nproc)

				        run: cargo build --all --release --timings -j$(nproc)

				      - name: Configure AWS credentials

				        uses: aws-actions/configure-aws-credentials@e3dd6a429d7300a6a4c196c26e071d42e0343502 # v4.0.2

				        with:

				          aws-region: eu-central-1

				          role-to-assume: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				          role-duration-seconds: 3600

				      - name: Upload the build stats

				        id: upload-stats

				        env:

				          BUCKET: neon-github-public-dev

				          SHA: ${{ github.event.pull_request.head.sha || github.sha }}

				          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_DEV }}

				          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_KEY_DEV }}

				        run: |

				          REPORT_URL=https://${BUCKET}.s3.amazonaws.com/build-stats/${SHA}/${GITHUB_RUN_ID}/cargo-timing.html

				          aws s3 cp --only-show-errors ./target/cargo-timings/cargo-timing.html "s3://${BUCKET}/build-stats/${SHA}/${GITHUB_RUN_ID}/"

				          echo "report-url=${REPORT_URL}" >> $GITHUB_OUTPUT

				      - name: Publish build stats report

				        uses: actions/github-script@v7

				        uses: actions/github-script@60a0d83039c74a4aee543508d2ffcb1c3799cdea # v7.0.1

				        env:

				          REPORT_URL: ${{ steps.upload-stats.outputs.report-url }}

				          SHA: ${{ github.event.pull_request.head.sha || github.sha }}

				        with:

				          # Retry script for 5XX server errors: https://github.com/actions/github-script#retries

				          retries: 5

				          script: |

				            const { REPORT_URL, SHA } = process.env

									
										310

.github/workflows/periodic_pagebench.yml
									
												
				@@ -1,14 +1,14 @@

				name: Periodic pagebench performance test on dedicated EC2 machine in eu-central-1 region

				name: Periodic pagebench performance test on unit-perf hetzner runner

				on:

				  schedule:

				    # * is a special character in YAML so you have to quote this string

				    #          ┌───────────── minute (0 - 59)

				    #          │ ┌───────────── hour (0 - 23)

				    #          │ │ ┌───────────── day of the month (1 - 31)

				    #          │ │ │ ┌───────────── month (1 - 12 or JAN-DEC)

				    #          │ │ │ │ ┌───────────── day of the week (0 - 6 or SUN-SAT)

				    - cron:  '0 18 * * *' # Runs at 6 PM UTC every day

				    #        ┌───────────── minute (0 - 59)

				    #        │   ┌───────────── hour (0 - 23)

				    #        │   │ ┌───────────── day of the month (1 - 31)

				    #        │   │ │ ┌───────────── month (1 - 12 or JAN-DEC)

				    #        │   │ │ │ ┌───────────── day of the week (0 - 6 or SUN-SAT)

				    - cron: '0 */4 * * *' # Runs every 4 hours

				  workflow_dispatch: # Allows manual triggering of the workflow

				    inputs:

				      commit_hash:

				@@ -16,6 +16,11 @@ on:

				        description: 'The long neon repo commit hash for the system under test (pageserver) to be tested.'

				        required: false

				        default: ''

				      recreate_snapshots:

				        type: boolean

				        description: 'Recreate snapshots - !!!WARNING!!! We should only recreate snapshots if the previous ones are no longer compatible. Otherwise benchmarking results are not comparable across runs.'

				        required: false

				        default: false

				defaults:

				  run:

				@@ -25,131 +30,252 @@ concurrency:

				  group: ${{ github.workflow }}

				  cancel-in-progress: false

				permissions:

				  contents: read

				jobs:

				  trigger_bench_on_ec2_machine_in_eu_central_1:

				    runs-on: [ self-hosted, small ]

				  run_periodic_pagebench_test:

				    permissions:

				      id-token: write # aws-actions/configure-aws-credentials

				      statuses: write

				      contents: write

				      pull-requests: write

				    runs-on: [ self-hosted, unit-perf ]

				    container:

				      image: neondatabase/build-tools:pinned

				      image: ghcr.io/neondatabase/build-tools:pinned-bookworm

				      credentials:

				        username: ${{ secrets.NEON_DOCKERHUB_USERNAME }}

				        password: ${{ secrets.NEON_DOCKERHUB_PASSWORD }}

				        username: ${{ github.actor }}

				        password: ${{ secrets.GITHUB_TOKEN }}

				      options: --init

				    timeout-minutes: 360  # Set the timeout to 6 hours

				    env:

				      API_KEY: ${{ secrets.PERIODIC_PAGEBENCH_EC2_RUNNER_API_KEY }}

				      RUN_ID: ${{ github.run_id }}

				      AWS_ACCESS_KEY_ID: ${{ secrets.AWS_EC2_US_TEST_RUNNER_ACCESS_KEY_ID }}

				      AWS_SECRET_ACCESS_KEY : ${{ secrets.AWS_EC2_US_TEST_RUNNER_ACCESS_KEY_SECRET }}

				      AWS_DEFAULT_REGION : "eu-central-1"

				      AWS_INSTANCE_ID : "i-02a59a3bf86bc7e74"

				      DEFAULT_PG_VERSION: 16

				      BUILD_TYPE: release

				      RUST_BACKTRACE: 1

				      # NEON_ENV_BUILDER_USE_OVERLAYFS_FOR_SNAPSHOTS: 1 - doesn't work without root in container

				      S3_BUCKET: neon-github-public-dev

				      PERF_TEST_RESULT_CONNSTR: "${{ secrets.PERF_TEST_RESULT_CONNSTR }}"

				    steps:

				    # we don't need the neon source code because we run everything remotely

				    # however we still need the local github actions to run the allure step below

				    - uses: actions/checkout@v4

				    - name: Harden the runner (Audit all outbound calls)

				      uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				      with:

				        egress-policy: audit

				    - name: Show my own (github runner) external IP address - useful for IP allowlisting

				      run: curl https://ifconfig.me

				    - name: Start EC2 instance and wait for the instance to boot up

				    - name: Set up the environment which depends on $RUNNER_TEMP on nvme drive

				      id: set-env

				      shell: bash -euxo pipefail {0}

				      run: |

				        aws ec2 start-instances --instance-ids $AWS_INSTANCE_ID

				        aws ec2 wait instance-running --instance-ids $AWS_INSTANCE_ID

				        sleep 60 # sleep some time to allow cloudinit and our API server to start up

				        {

				          echo "NEON_DIR=${RUNNER_TEMP}/neon"

				          echo "NEON_BIN=${RUNNER_TEMP}/neon/bin"

				          echo "POSTGRES_DISTRIB_DIR=${RUNNER_TEMP}/neon/pg_install"

				          echo "LD_LIBRARY_PATH=${RUNNER_TEMP}/neon/pg_install/v${DEFAULT_PG_VERSION}/lib"

				          echo "BACKUP_DIR=${RUNNER_TEMP}/instance_store/saved_snapshots"

				          echo "TEST_OUTPUT=${RUNNER_TEMP}/neon/test_output"

				          echo "PERF_REPORT_DIR=${RUNNER_TEMP}/neon/test_output/perf-report-local"

				          echo "ALLURE_DIR=${RUNNER_TEMP}/neon/test_output/allure-results"

				          echo "ALLURE_RESULTS_DIR=${RUNNER_TEMP}/neon/test_output/allure-results/results"

				        } >> "$GITHUB_ENV"

				    - name: Determine public IP of the EC2 instance and set env variable EC2_MACHINE_URL_US

				      run: |

				        public_ip=$(aws ec2 describe-instances --instance-ids $AWS_INSTANCE_ID --query 'Reservations[*].Instances[*].PublicIpAddress' --output text)

				        echo "Public IP of the EC2 instance: $public_ip"

				        echo "EC2_MACHINE_URL_US=https://${public_ip}:8443" >> $GITHUB_ENV

				        echo "allure_results_dir=${RUNNER_TEMP}/neon/test_output/allure-results/results" >> "$GITHUB_OUTPUT"

				    - uses: aws-actions/configure-aws-credentials@e3dd6a429d7300a6a4c196c26e071d42e0343502 # v4.0.2

				      with:

				        aws-region: eu-central-1

				        role-to-assume: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				        role-duration-seconds: 18000 # max 5 hours (needed in case commit hash is still being built)

				    - name: Determine commit hash

				      id: commit_hash

				      shell: bash -euxo pipefail {0}

				      env:

				        INPUT_COMMIT_HASH: ${{ github.event.inputs.commit_hash }}

				      run: |

				        if [ -z "$INPUT_COMMIT_HASH" ]; then

				          echo "COMMIT_HASH=$(curl -s https://api.github.com/repos/neondatabase/neon/commits/main | jq -r '.sha')" >> $GITHUB_ENV

				        if [[ -z "${INPUT_COMMIT_HASH}" ]]; then

				          COMMIT_HASH=$(curl -s https://api.github.com/repos/neondatabase/neon/commits/main | jq -r '.sha')

				          echo "COMMIT_HASH=$COMMIT_HASH" >> $GITHUB_ENV

				          echo "commit_hash=$COMMIT_HASH" >> "$GITHUB_OUTPUT"

				          echo "COMMIT_HASH_TYPE=latest" >> $GITHUB_ENV

				        else

				          echo "COMMIT_HASH=$INPUT_COMMIT_HASH" >> $GITHUB_ENV

				          COMMIT_HASH="${INPUT_COMMIT_HASH}"

				          echo "COMMIT_HASH=$COMMIT_HASH" >> $GITHUB_ENV

				          echo "commit_hash=$COMMIT_HASH" >> "$GITHUB_OUTPUT"

				          echo "COMMIT_HASH_TYPE=manual" >> $GITHUB_ENV

				        fi

				    - name: Checkout the neon repository at given commit hash

				      uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				      with:

				        ref: ${{ steps.commit_hash.outputs.commit_hash }}

				    - name: Start Bench with run_id   

				    # does not reuse ./.github/actions/download because we need to download the artifact for the given commit hash

				    # example artifact

				    # s3://neon-github-public-dev/artifacts/48b870bc078bd2c450eb7b468e743b9c118549bf/15036827400/1/neon-Linux-X64-release-artifact.tar.zst /instance_store/artifacts/neon-Linux-release-artifact.tar.zst

				    - name: Determine artifact S3_KEY for given commit hash and download and extract artifact

				      id: artifact_prefix

				      shell: bash -euxo pipefail {0}

				      env:

				        ARCHIVE: ${{ runner.temp }}/downloads/neon-${{ runner.os }}-${{ runner.arch }}-release-artifact.tar.zst

				        COMMIT_HASH: ${{ env.COMMIT_HASH }}

				        COMMIT_HASH_TYPE: ${{ env.COMMIT_HASH_TYPE }}

				      run: |

				        curl -k -X 'POST' \

				        "${EC2_MACHINE_URL_US}/start_test/${GITHUB_RUN_ID}" \

				        -H 'accept: application/json' \

				        -H 'Content-Type: application/json' \

				        -H "Authorization: Bearer $API_KEY" \

				        -d "{\"neonRepoCommitHash\": \"${COMMIT_HASH}\"}"

				        attempt=0

				        max_attempts=24 # 5 minutes * 24 = 2 hours

				    - name: Poll Test Status

				      id: poll_step

				      run: |

				        status=""

				        while [[ "$status" != "failure" && "$status" != "success" ]]; do

				          response=$(curl -k -X 'GET' \

				          "${EC2_MACHINE_URL_US}/test_status/${GITHUB_RUN_ID}" \

				          -H 'accept: application/json' \

				          -H "Authorization: Bearer $API_KEY")

				          echo "Response: $response"

				          set +x

				          status=$(echo $response | jq -r '.status')

				          echo "Test status: $status"

				          if [[ "$status" == "failure" ]]; then

				            echo "Test failed"

				            exit 1 # Fail the job step if status is failure

				          elif [[ "$status" == "success" || "$status" == "null" ]]; then

				        while [[ $attempt -lt $max_attempts ]]; do

				          # the following command will fail until the artifacts are available ...

				          S3_KEY=$(aws s3api list-objects-v2 --bucket "$S3_BUCKET" --prefix "artifacts/$COMMIT_HASH/" \

				            | jq -r '.Contents[]?.Key' \

				            | grep "neon-${{ runner.os }}-${{ runner.arch }}-release-artifact.tar.zst" \

				            | sort --version-sort \

				            | tail -1) || true # ... thus ignore errors from the command

				          if [[ -n "${S3_KEY}" ]]; then

				            echo "Artifact found: $S3_KEY"

				            echo "S3_KEY=$S3_KEY" >> $GITHUB_ENV

				            break

				          elif [[ "$status" == "too_many_runs" ]]; then

				            echo "Too many runs already running"

				            echo "too_many_runs=true" >> "$GITHUB_OUTPUT"

				            exit 1

				          fi

				          sleep 60 # Poll every 60 seconds

				          # Increment attempt counter and sleep for 5 minutes

				          attempt=$((attempt + 1))

				          echo "Attempt $attempt of $max_attempts to find artifacts in S3 bucket s3://$S3_BUCKET/artifacts/$COMMIT_HASH failed. Retrying in 5 minutes..."

				          sleep 300 # Sleep for 5 minutes

				        done

				    - name: Retrieve Test Logs

				      if: always() && steps.poll_step.outputs.too_many_runs != 'true'

				        if [[ -z "${S3_KEY}" ]]; then

				          echo "Error: artifact not found in S3 bucket s3://$S3_BUCKET/artifacts/$COMMIT_HASH" after 2 hours

				        else

				          mkdir -p $(dirname $ARCHIVE)

				          time aws s3 cp --only-show-errors s3://$S3_BUCKET/${S3_KEY} ${ARCHIVE}

				          mkdir -p ${NEON_DIR}

				          time tar -xf ${ARCHIVE} -C ${NEON_DIR}

				          rm -f ${ARCHIVE}

				        fi

				    - name: Download snapshots from S3

				      if: ${{ github.event_name != 'workflow_dispatch' || github.event.inputs.recreate_snapshots == 'false' || github.event.inputs.recreate_snapshots == '' }}

				      id: download_snapshots

				      shell: bash -euxo pipefail {0}

				      run: |

				        curl -k -X 'GET' \

				        "${EC2_MACHINE_URL_US}/test_log/${GITHUB_RUN_ID}" \

				        -H 'accept: application/gzip' \

				        -H "Authorization: Bearer $API_KEY" \

				        --output "test_log_${GITHUB_RUN_ID}.gz"

				    - name: Unzip Test Log and Print it into this job's log

				      if: always() && steps.poll_step.outputs.too_many_runs != 'true'

				        # Download the snapshots from S3

				        mkdir -p ${TEST_OUTPUT}

				        mkdir -p $BACKUP_DIR

				        cd $BACKUP_DIR

				        mkdir parts

				        cd parts

				        PART=$(aws s3api list-objects-v2 --bucket $S3_BUCKET --prefix performance/pagebench/ \

				          | jq -r '.Contents[]?.Key' \

				          | grep -E 'shared-snapshots-[0-9]{4}-[0-9]{2}-[0-9]{2}' \

				          | sort \

				          | tail -1)

				        echo "Latest PART: $PART"

				        if [[ -z "$PART" ]]; then

				          echo "ERROR: No matching S3 key found" >&2

				          exit 1

				        fi

				        S3_KEY=$(dirname $PART)

				        time aws s3 cp --only-show-errors --recursive s3://${S3_BUCKET}/$S3_KEY/ .

				        cd $TEST_OUTPUT

				        time cat $BACKUP_DIR/parts/* | zstdcat | tar --extract --preserve-permissions

				        rm -rf ${BACKUP_DIR}

				    - name: Cache poetry deps

				      uses: actions/cache@v4

				      with:

				        path: ~/.cache/pypoetry/virtualenvs

				        key: v2-${{ runner.os }}-${{ runner.arch }}-python-deps-bookworm-${{ hashFiles('poetry.lock') }}

				    - name: Install Python deps

				      shell: bash -euxo pipefail {0}

				      run: ./scripts/pysync

				    # we need high number of open files for pagebench

				    - name: show ulimits

				      shell: bash -euxo pipefail {0}

				      run: |

				        gzip -d "test_log_${GITHUB_RUN_ID}.gz"

				        cat "test_log_${GITHUB_RUN_ID}"

				        ulimit -a

				    - name: Run pagebench testcase

				      shell: bash -euxo pipefail {0}

				      env:

				        CI: false  # need to override this env variable set by github to enforce using snapshots

				      run: |

				        export PLATFORM=hetzner-unit-perf-${COMMIT_HASH_TYPE}

				        # report the commit hash of the neon repository in the revision of the test results

				        export GITHUB_SHA=${COMMIT_HASH}

				        rm -rf ${PERF_REPORT_DIR}

				        rm -rf ${ALLURE_RESULTS_DIR}

				        mkdir -p ${PERF_REPORT_DIR}

				        mkdir -p ${ALLURE_RESULTS_DIR}

				        PARAMS="--alluredir=${ALLURE_RESULTS_DIR} --tb=short --verbose -rA"

				        EXTRA_PARAMS="--out-dir ${PERF_REPORT_DIR} --durations-path $TEST_OUTPUT/benchmark_durations.json"

				        # run only two selected tests

				        # environment set by parent:

				        # RUST_BACKTRACE=1 DEFAULT_PG_VERSION=16 BUILD_TYPE=release

				        ./scripts/pytest ${PARAMS} test_runner/performance/pageserver/pagebench/test_pageserver_max_throughput_getpage_at_latest_lsn.py::test_pageserver_characterize_throughput_with_n_tenants ${EXTRA_PARAMS}

				        ./scripts/pytest ${PARAMS} test_runner/performance/pageserver/pagebench/test_pageserver_max_throughput_getpage_at_latest_lsn.py::test_pageserver_characterize_latencies_with_1_client_and_throughput_with_many_clients_one_tenant ${EXTRA_PARAMS}

				    - name: upload the performance metrics to the Neon performance database which is used by grafana dashboards to display the results

				      shell: bash -euxo pipefail {0}

				      run: |

				        export REPORT_FROM="$PERF_REPORT_DIR"

				        export GITHUB_SHA=${COMMIT_HASH}

				        time ./scripts/generate_and_push_perf_report.sh

				    - name: Upload test results

				      if: ${{ !cancelled() }}

				      uses: ./.github/actions/allure-report-store

				      with:

				        report-dir:  ${{ steps.set-env.outputs.allure_results_dir }}

				        unique-key: ${{ env.BUILD_TYPE }}-${{ env.DEFAULT_PG_VERSION }}-${{ runner.arch }}

				        aws-oidc-role-arn:  ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				    - name: Create Allure report

				      env:

				        AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_DEV }}

				        AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_KEY_DEV }}

				      id: create-allure-report

				      if: ${{ !cancelled() }}

				      uses: ./.github/actions/allure-report-generate

				      with:

				        aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				    - name: Upload snapshots

				      if: ${{ github.event_name == 'workflow_dispatch' && github.event.inputs.recreate_snapshots != 'false' && github.event.inputs.recreate_snapshots != '' }}

				      id: upload_snapshots

				      shell: bash -euxo pipefail {0}

				      run: |

				        mkdir -p $BACKUP_DIR

				        cd $TEST_OUTPUT

				        tar --create --preserve-permissions --file - shared-snapshots | zstd -o $BACKUP_DIR/shared_snapshots.tar.zst

				        cd $BACKUP_DIR

				        mkdir parts

				        split -b 1G shared_snapshots.tar.zst ./parts/shared_snapshots.tar.zst.part.

				        SNAPSHOT_DATE=$(date +%F)  # YYYY-MM-DD

				        cd parts

				        time aws s3 cp --recursive . s3://${S3_BUCKET}/performance/pagebench/shared-snapshots-${SNAPSHOT_DATE}/

				    - name: Post to a Slack channel

				      if: ${{ github.event.schedule && failure() }}

				      uses: slackapi/slack-github-action@v1

				      uses: slackapi/slack-github-action@fcfb566f8b0aab22203f066d80ca1d7e4b5d05b3 # v1.27.1

				      with:

				        channel-id: "C033QLM5P7D" # dev-staging-stream

				        channel-id: "C06KHQVQ7U3" # on-call-qa-staging-stream

				        slack-message: "Periodic pagebench testing on dedicated hardware: ${{ job.status }}\n${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"

				      env:

				        SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}

				    - name: Cleanup Test Resources

				      if: always() 

				      if: always()

				      shell: bash -euxo pipefail {0}

				      env:

				        ARCHIVE: ${{ runner.temp }}/downloads/neon-${{ runner.os }}-${{ runner.arch }}-release-artifact.tar.zst

				      run: |

				        curl -k -X 'POST' \

				        "${EC2_MACHINE_URL_US}/cleanup_test/${GITHUB_RUN_ID}" \

				        -H 'accept: application/json' \

				        -H "Authorization: Bearer $API_KEY" \

				        -d ''

				        # Cleanup the test resources

				        if [[ -d "${BACKUP_DIR}" ]]; then

				          rm -rf ${BACKUP_DIR}

				        fi

				        if [[ -d "${TEST_OUTPUT}" ]]; then

				          rm -rf ${TEST_OUTPUT}

				        fi

				        if [[ -d "${NEON_DIR}" ]]; then

				          rm -rf ${NEON_DIR}

				        fi

				        rm -rf $(dirname $ARCHIVE)

				    - name: Stop EC2 instance and wait for the instance to be stopped

				      if: always() && steps.poll_step.outputs.too_many_runs != 'true'

				      run: |

				        aws ec2 stop-instances --instance-ids $AWS_INSTANCE_ID

				        aws ec2 wait instance-stopped --instance-ids $AWS_INSTANCE_ID

									
										60

.github/workflows/pg-clients.yml
									
												
				@@ -12,8 +12,8 @@ on:

				  pull_request:

				    paths:

				      - '.github/workflows/pg-clients.yml'

				      - 'test_runner/pg_clients/**'

				      - 'test_runner/logical_repl/**'

				      - 'test_runner/pg_clients/**/*.py'

				      - 'test_runner/logical_repl/**/*.py'

				      - 'poetry.lock'

				  workflow_dispatch:

				@@ -25,11 +25,13 @@ defaults:

				  run:

				    shell: bash -euxo pipefail {0}

				permissions:

				  id-token: write # aws-actions/configure-aws-credentials

				  statuses: write # require for posting a status update

				env:

				  DEFAULT_PG_VERSION: 16

				  DEFAULT_PG_VERSION: 17

				  PLATFORM: neon-captest-new

				  AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_DEV }}

				  AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_KEY_DEV }}

				  AWS_DEFAULT_REGION: eu-central-1

				jobs:

				@@ -39,15 +41,11 @@ jobs:

				    with:

				      github-event-name: ${{ github.event_name }}

				  check-build-tools-image:

				    needs: [ check-permissions ]

				    uses: ./.github/workflows/check-build-tools-image.yml

				  build-build-tools-image:

				    needs: [ check-build-tools-image ]

				    permissions:

				      packages: write

				    needs: [ check-permissions ]

				    uses: ./.github/workflows/build-build-tools-image.yml

				    with:

				      image-tag: ${{ needs.check-build-tools-image.outputs.image-tag }}

				    secrets: inherit

				  test-logical-replication:

				@@ -55,10 +53,10 @@ jobs:

				    runs-on: ubuntu-22.04

				    container:

				      image: ${{ needs.build-build-tools-image.outputs.image }}

				      image: ${{ needs.build-build-tools-image.outputs.image }}-bookworm

				      credentials:

				        username: ${{ secrets.NEON_DOCKERHUB_USERNAME }}

				        password: ${{ secrets.NEON_DOCKERHUB_PASSWORD }}

				        username: ${{ github.actor }}

				        password: ${{ secrets.GITHUB_TOKEN }}

				      options: --init --user root

				    services:

				      clickhouse:

				@@ -92,7 +90,12 @@ jobs:

				        ports:

				          - 8083:8083

				    steps:

				      - uses: actions/checkout@v4

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				      - name: Download Neon artifact

				        uses: ./.github/actions/download

				@@ -100,6 +103,7 @@ jobs:

				          name: neon-${{ runner.os }}-${{ runner.arch }}-release-artifact

				          path: /tmp/neon/

				          prefix: latest

				          aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				      - name: Create Neon Project

				        id: create-neon-project

				@@ -107,6 +111,8 @@ jobs:

				        with:

				          api_key: ${{ secrets.NEON_STAGING_API_KEY }}

				          postgres_version: ${{ env.DEFAULT_PG_VERSION }}

				          project_settings: >-

				            {"enable_logical_replication": true}

				      - name: Run tests

				        uses: ./.github/actions/run-python-test-set

				@@ -116,6 +122,7 @@ jobs:

				          run_in_parallel: false

				          extra_params: -m remote_cluster

				          pg_version: ${{ env.DEFAULT_PG_VERSION }}

				          aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				        env:

				          BENCHMARK_CONNSTR: ${{ steps.create-neon-project.outputs.dsn }}

				@@ -132,12 +139,13 @@ jobs:

				        uses: ./.github/actions/allure-report-generate

				        with:

				          store-test-results-into-db: true

				          aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				        env:

				          REGRESS_TEST_RESULT_CONNSTR_NEW: ${{ secrets.REGRESS_TEST_RESULT_CONNSTR_NEW }}

				      - name: Post to a Slack channel

				        if: github.event.schedule && failure()

				        uses: slackapi/slack-github-action@v1

				        uses: slackapi/slack-github-action@fcfb566f8b0aab22203f066d80ca1d7e4b5d05b3 # v1.27.1

				        with:

				          channel-id: "C06KHQVQ7U3" # on-call-qa-staging-stream

				          slack-message: |

				@@ -150,14 +158,19 @@ jobs:

				    runs-on: ubuntu-22.04

				    container:

				      image: ${{ needs.build-build-tools-image.outputs.image }}

				      image: ${{ needs.build-build-tools-image.outputs.image }}-bookworm

				      credentials:

				        username: ${{ secrets.NEON_DOCKERHUB_USERNAME }}

				        password: ${{ secrets.NEON_DOCKERHUB_PASSWORD }}

				        username: ${{ github.actor }}

				        password: ${{ secrets.GITHUB_TOKEN }}

				      options: --init --user root

				    steps:

				    - uses: actions/checkout@v4

				    - name: Harden the runner (Audit all outbound calls)

				      uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				      with:

				        egress-policy: audit

				    - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				    - name: Download Neon artifact

				      uses: ./.github/actions/download

				@@ -165,6 +178,7 @@ jobs:

				        name: neon-${{ runner.os }}-${{ runner.arch }}-release-artifact

				        path: /tmp/neon/

				        prefix: latest

				        aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				    - name: Create Neon Project

				      id: create-neon-project

				@@ -181,6 +195,7 @@ jobs:

				        run_in_parallel: false

				        extra_params: -m remote_cluster

				        pg_version: ${{ env.DEFAULT_PG_VERSION }}

				        aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				      env:

				        BENCHMARK_CONNSTR: ${{ steps.create-neon-project.outputs.dsn }}

				@@ -197,12 +212,13 @@ jobs:

				      uses: ./.github/actions/allure-report-generate

				      with:

				        store-test-results-into-db: true

				        aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				      env:

				        REGRESS_TEST_RESULT_CONNSTR_NEW: ${{ secrets.REGRESS_TEST_RESULT_CONNSTR_NEW }}

				    - name: Post to a Slack channel

				      if: github.event.schedule && failure()

				      uses: slackapi/slack-github-action@v1

				      uses: slackapi/slack-github-action@fcfb566f8b0aab22203f066d80ca1d7e4b5d05b3 # v1.27.1

				      with:

				        channel-id: "C06KHQVQ7U3" # on-call-qa-staging-stream

				        slack-message: |

									
										82

.github/workflows/pin-build-tools-image.yml
									
												
				@@ -33,10 +33,6 @@ concurrency:

				# No permission for GITHUB_TOKEN by default; the **minimal required** set of permissions should be granted in each job.

				permissions: {}

				env:

				  FROM_TAG: ${{ inputs.from-tag }}

				  TO_TAG: pinned

				jobs:

				  check-manifests:

				    runs-on: ubuntu-22.04

				@@ -44,13 +40,21 @@ jobs:

				      skip: ${{ steps.check-manifests.outputs.skip }}

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@0634a2670c59f64b4a01f0f96f84700a4088b9f0 # v2.12.0

				        with:

				          egress-policy: audit

				      - name: Check if we really need to pin the image

				        id: check-manifests

				        env:

				          FROM_TAG: ${{ inputs.from-tag }}

				          TO_TAG: pinned

				        run: |

				          docker manifest inspect neondatabase/build-tools:${FROM_TAG} > ${FROM_TAG}.json

				          docker manifest inspect neondatabase/build-tools:${TO_TAG}   > ${TO_TAG}.json

				          docker manifest inspect "ghcr.io/neondatabase/build-tools:${FROM_TAG}" > "${FROM_TAG}.json"

				          docker manifest inspect "ghcr.io/neondatabase/build-tools:${TO_TAG}"   > "${TO_TAG}.json"

				          if diff ${FROM_TAG}.json ${TO_TAG}.json; then

				          if diff "${FROM_TAG}.json" "${TO_TAG}.json"; then

				            skip=true

				          else

				            skip=false

				@@ -64,38 +68,36 @@ jobs:

				    # use format(..) to catch both inputs.force = true AND inputs.force = 'true'

				    if: needs.check-manifests.outputs.skip == 'false' || format('{0}', inputs.force) == 'true'

				    runs-on: ubuntu-22.04

				    permissions:

				      id-token: write # for `azure/login`

				      id-token: write  # Required for aws/azure login

				      packages: write  # required for pushing to GHCR

				    steps:

				      - uses: docker/login-action@v3

				        with:

				          username: ${{ secrets.NEON_DOCKERHUB_USERNAME }}

				          password: ${{ secrets.NEON_DOCKERHUB_PASSWORD }}

				      - uses: docker/login-action@v3

				        with:

				          registry: 369495373322.dkr.ecr.eu-central-1.amazonaws.com

				          username: ${{ secrets.AWS_ACCESS_KEY_DEV }}

				          password: ${{ secrets.AWS_SECRET_KEY_DEV }}

				      - name: Azure login

				        uses: azure/login@6c251865b4e6290e7b78be643ea2d005bc51f69a  # @v2.1.1

				        with:

				          client-id: ${{ secrets.AZURE_DEV_CLIENT_ID }}

				          tenant-id: ${{ secrets.AZURE_TENANT_ID }}

				          subscription-id: ${{ secrets.AZURE_DEV_SUBSCRIPTION_ID }}

				      - name: Login to ACR

				        run: |

				          az acr login --name=neoneastus2

				      - name: Tag build-tools with `${{ env.TO_TAG }}` in Docker Hub, ECR, and ACR

				        run: |

				          docker buildx imagetools create -t 369495373322.dkr.ecr.eu-central-1.amazonaws.com/build-tools:${TO_TAG} \

				                                          -t neoneastus2.azurecr.io/neondatabase/build-tools:${TO_TAG} \

				                                          -t neondatabase/build-tools:${TO_TAG} \

				                                             neondatabase/build-tools:${FROM_TAG}

				    uses: ./.github/workflows/_push-to-container-registry.yml

				    with:

				      image-map: |

				        {

				          "ghcr.io/neondatabase/build-tools:${{ inputs.from-tag }}-bullseye": [

				            "docker.io/neondatabase/build-tools:pinned-bullseye",

				            "ghcr.io/neondatabase/build-tools:pinned-bullseye",

				            "${{ vars.NEON_DEV_AWS_ACCOUNT_ID }}.dkr.ecr.${{ vars.AWS_ECR_REGION }}.amazonaws.com/build-tools:pinned-bullseye",

				            "${{ vars.AZURE_DEV_REGISTRY_NAME }}.azurecr.io/neondatabase/build-tools:pinned-bullseye"

				          ],

				          "ghcr.io/neondatabase/build-tools:${{ inputs.from-tag }}-bookworm": [

				            "docker.io/neondatabase/build-tools:pinned-bookworm",

				            "docker.io/neondatabase/build-tools:pinned",

				            "ghcr.io/neondatabase/build-tools:pinned-bookworm",

				            "ghcr.io/neondatabase/build-tools:pinned",

				            "${{ vars.NEON_DEV_AWS_ACCOUNT_ID }}.dkr.ecr.${{ vars.AWS_ECR_REGION }}.amazonaws.com/build-tools:pinned-bookworm",

				            "${{ vars.NEON_DEV_AWS_ACCOUNT_ID }}.dkr.ecr.${{ vars.AWS_ECR_REGION }}.amazonaws.com/build-tools:pinned",

				            "${{ vars.AZURE_DEV_REGISTRY_NAME }}.azurecr.io/neondatabase/build-tools:pinned-bookworm",

				            "${{ vars.AZURE_DEV_REGISTRY_NAME }}.azurecr.io/neondatabase/build-tools:pinned"

				          ]

				        }

				      aws-region: ${{ vars.AWS_ECR_REGION }}

				      aws-account-id: "${{ vars.NEON_DEV_AWS_ACCOUNT_ID }}"

				      aws-role-to-assume: "gha-oidc-neon-admin"

				      azure-client-id: ${{ vars.AZURE_DEV_CLIENT_ID }}

				      azure-subscription-id: ${{ vars.AZURE_DEV_SUBSCRIPTION_ID }}

				      azure-tenant-id: ${{ vars.AZURE_TENANT_ID }}

				      acr-registry-name: ${{ vars.AZURE_DEV_REGISTRY_NAME }}

				    secrets: inherit
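
To make the shape of the `image-map` input above concrete, here is a minimal Python sketch that flattens such a map into (source, target) copy pairs. The function name `expand_image_map` and the sample registry names are hypothetical. How `_push-to-container-registry.yml` actually consumes the map is not shown in this diff, so treat this as an illustration of the data structure only.

```python
import json


def expand_image_map(image_map_json: str) -> list[tuple[str, str]]:
    """Flatten {"source": ["target", ...]} into (source, target) pairs."""
    image_map = json.loads(image_map_json)
    pairs = []
    for source, targets in image_map.items():
        for target in targets:
            pairs.append((source, target))
    return pairs


# Hypothetical sample, shaped like the `image-map` input above.
SAMPLE = """
{
  "ghcr.io/example/build-tools:123-bookworm": [
    "docker.io/example/build-tools:pinned-bookworm",
    "docker.io/example/build-tools:pinned"
  ]
}
"""

if __name__ == "__main__":
    for source, target in expand_image_map(SAMPLE):
        print(f"{source} -> {target}")
```

Each source image fans out to every tag in its list, which is why the bookworm entry above carries both the `pinned-bookworm` and the plain `pinned` targets.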

									
										177

.github/workflows/pre-merge-checks.yml
									
										vendored
									
										Normal file
									
				@@ -0,0 +1,177 @@

				name: Pre-merge checks

				on:

				  pull_request:

				    paths:

				      - .github/workflows/_check-codestyle-python.yml

				      - .github/workflows/_check-codestyle-rust.yml

				      - .github/workflows/build-build-tools-image.yml

				      - .github/workflows/pre-merge-checks.yml

				  merge_group:

				defaults:

				  run:

				    shell: bash -euxo pipefail {0}

				# No permission for GITHUB_TOKEN by default; the **minimal required** set of permissions should be granted in each job.

				permissions: {}

				jobs:

				  meta:

				    runs-on: ubuntu-22.04

				    permissions:

				      contents: read

				    outputs:

				      python-changed: ${{ steps.python-src.outputs.any_changed }}

				      rust-changed: ${{ steps.rust-src.outputs.any_changed }}

				      branch: ${{ steps.group-metadata.outputs.branch }}

				      pr-number: ${{ steps.group-metadata.outputs.pr-number }}

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				      - uses: tj-actions/changed-files@ed68ef82c095e0d48ec87eccea555d944a631a4c # v46.0.5

				        id: python-src

				        with:

				          files: |

				            .github/workflows/_check-codestyle-python.yml

				            .github/workflows/build-build-tools-image.yml

				            .github/workflows/pre-merge-checks.yml

				            **/**.py

				            poetry.lock

				            pyproject.toml

				      - uses: tj-actions/changed-files@ed68ef82c095e0d48ec87eccea555d944a631a4c # v46.0.5

				        id: rust-src

				        with:

				          files: |

				            .github/workflows/_check-codestyle-rust.yml

				            .github/workflows/build-build-tools-image.yml

				            .github/workflows/pre-merge-checks.yml

				            **/**.rs

				            **/Cargo.toml

				            Cargo.toml

				            Cargo.lock

				      - name: PRINT ALL CHANGED FILES FOR DEBUG PURPOSES

				        env:

				          PYTHON_CHANGED_FILES: ${{ steps.python-src.outputs.all_changed_files }}

				          RUST_CHANGED_FILES: ${{ steps.rust-src.outputs.all_changed_files }}

				        run: |

				          echo "${PYTHON_CHANGED_FILES}"

				          echo "${RUST_CHANGED_FILES}"

				      - name: Merge group metadata

				        if: ${{ github.event_name == 'merge_group' }}

				        id: group-metadata

				        env:

				          MERGE_QUEUE_REF: ${{ github.event.merge_group.head_ref }}

				        run: |

				          echo $MERGE_QUEUE_REF | jq -Rr 'capture("refs/heads/gh-readonly-queue/(?<branch>.*)/pr-(?<pr_number>[0-9]+)-[0-9a-f]{40}") | ["branch=" + .branch, "pr-number=" + .pr_number] | .[]' | tee -a "${GITHUB_OUTPUT}"

				  build-build-tools-image:

				    if: |

				      false

				      || needs.meta.outputs.python-changed == 'true'

				      || needs.meta.outputs.rust-changed == 'true'

				    needs: [ meta ]

				    permissions:

				      contents: read

				      packages: write

				    uses: ./.github/workflows/build-build-tools-image.yml

				    with:

				      # Build only one combination to save time

				      archs: '["x64"]'

				      debians: '["bookworm"]'

				    secrets: inherit

				  check-codestyle-python:

				    if: needs.meta.outputs.python-changed == 'true'

				    needs: [ meta, build-build-tools-image ]

				    permissions:

				      contents: read

				      packages: read

				    uses: ./.github/workflows/_check-codestyle-python.yml

				    with:

				      # `-bookworm-x64` suffix should match the combination in `build-build-tools-image`

				      build-tools-image: ${{ needs.build-build-tools-image.outputs.image }}-bookworm-x64

				    secrets: inherit

				  check-codestyle-rust:

				    if: needs.meta.outputs.rust-changed == 'true'

				    needs: [ meta, build-build-tools-image ]

				    permissions:

				      contents: read

				      packages: read

				    uses: ./.github/workflows/_check-codestyle-rust.yml

				    with:

				      # `-bookworm-x64` suffix should match the combination in `build-build-tools-image`

				      build-tools-image: ${{ needs.build-build-tools-image.outputs.image }}-bookworm-x64

				      archs: '["x64"]'

				    secrets: inherit

				  # To get items from the merge queue merged into main we need to satisfy "Status checks that are required".

				  # Currently we require 2 jobs (checks with exact name):

				  # - conclusion

				  # - neon-cloud-e2e

				  conclusion:

				    # Do not run job on Pull Requests as it interferes with the `conclusion` job from the `build_and_test` workflow

				    if: always() && github.event_name == 'merge_group'

				    permissions:

				      statuses: write # for `github.repos.createCommitStatus(...)`

				      contents: write

				    needs:

				      - meta

				      - check-codestyle-python

				      - check-codestyle-rust

				    runs-on: ubuntu-22.04

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - name: Create fake `neon-cloud-e2e` check

				        uses: actions/github-script@60a0d83039c74a4aee543508d2ffcb1c3799cdea # v7.0.1

				        with:

				          # Retry script for 5XX server errors: https://github.com/actions/github-script#retries

				          retries: 5

				          script: |

				            const { repo, owner } = context.repo;

				            const targetUrl = `${context.serverUrl}/${owner}/${repo}/actions/runs/${context.runId}`;

				            await github.rest.repos.createCommitStatus({

				              owner: owner,

				              repo: repo,

				              sha: context.sha,

				              context: `neon-cloud-e2e`,

				              state: `success`,

				              target_url: targetUrl,

				              description: `fake check for merge queue`,

				            });

				      - name: Fail the job if any of the dependencies do not succeed or skipped

				        run: exit 1

				        if: |

				          false

				          || (github.event_name == 'merge_group' && needs.meta.outputs.branch != 'main')

				          || (needs.check-codestyle-python.result == 'skipped' && needs.meta.outputs.python-changed == 'true')

				          || (needs.check-codestyle-rust.result   == 'skipped' && needs.meta.outputs.rust-changed   == 'true')

				          || contains(needs.*.result, 'failure')

				          || contains(needs.*.result, 'cancelled')

				      - name: Add fast-forward label to PR to trigger fast-forward merge

				        if: >-

				          ${{

				            always()

				            && github.event_name == 'merge_group'

				            && contains(fromJSON('["release", "release-proxy", "release-compute"]'), needs.meta.outputs.branch)

				          }}

				        env:

				          GH_TOKEN: ${{ secrets.CI_ACCESS_TOKEN }}

				        run: >-

				          gh pr edit ${{ needs.meta.outputs.pr-number }} --repo "${GITHUB_REPOSITORY}" --add-label "fast-forward"
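
The `Merge group metadata` step above extracts the target branch and PR number from the merge-queue ref with a `jq` `capture` expression. A rough Python equivalent is sketched below. The helper name `parse_merge_queue_ref` and the sample ref are assumptions for illustration, and Python spells named groups `(?P<name>...)` where `jq` uses `(?<name>...)`.

```python
import re

# Same pattern as the jq `capture(...)` call, with Python-style named groups.
MERGE_QUEUE_REF = re.compile(
    r"refs/heads/gh-readonly-queue/(?P<branch>.*)/pr-(?P<pr_number>[0-9]+)-[0-9a-f]{40}"
)


def parse_merge_queue_ref(ref: str) -> list[str]:
    """Return the `key=value` lines the step appends to GITHUB_OUTPUT.

    An empty list is returned when the ref does not match, mirroring
    `capture`, which emits nothing in that case.
    """
    match = MERGE_QUEUE_REF.search(ref)
    if match is None:
        return []
    return [f"branch={match['branch']}", f"pr-number={match['pr_number']}"]


if __name__ == "__main__":
    # Hypothetical ref: branch `release-proxy`, PR 12345, 40 hex characters.
    sample = "refs/heads/gh-readonly-queue/release-proxy/pr-12345-" + "a" * 40
    for line in parse_merge_queue_ref(sample):
        print(line)
```

The resulting `branch` output is what the `conclusion` job compares against `main` and against the release branch list before adding the `fast-forward` label.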

									
										84

.github/workflows/proxy-benchmark.yml
									
										vendored
									
										Normal file
									
				@@ -0,0 +1,84 @@

				name: Periodic proxy performance test on unit-perf hetzner runner

				on:

				  push: # TODO: remove after testing

				    branches:

				      - test-proxy-bench # Runs on pushes to the test-proxy-bench branch

				  # schedule:

				    # * is a special character in YAML so you have to quote this string

				    #        ┌───────────── minute (0 - 59)

				    #        │ ┌───────────── hour (0 - 23)

				    #        │ │ ┌───────────── day of the month (1 - 31)

				    #        │ │ │ ┌───────────── month (1 - 12 or JAN-DEC)

				    #        │ │ │ │ ┌───────────── day of the week (0 - 6 or SUN-SAT)

				    # - cron: '0 5 * * *' # Runs at 5 UTC once a day

				  workflow_dispatch: # adds an ability to run this manually

				defaults:

				  run:

				    shell: bash -euo pipefail {0}

				concurrency:

				  group: ${{ github.workflow }}

				  cancel-in-progress: false

				permissions:

				  contents: read

				jobs:

				  run_periodic_proxybench_test:

				    permissions:

				      id-token: write # aws-actions/configure-aws-credentials

				      statuses: write

				      contents: write

				      pull-requests: write

				    runs-on: [self-hosted, unit-perf]

				    timeout-minutes: 60  # 1h timeout

				    container:

				      image: ghcr.io/neondatabase/build-tools:pinned-bookworm

				      credentials:

				        username: ${{ github.actor }}

				        password: ${{ secrets.GITHUB_TOKEN }}

				      options: --init

				    steps:

				    - name: Checkout proxy-bench Repo

				      uses: actions/checkout@v4

				      with:

				        repository: neondatabase/proxy-bench

				        path: proxy-bench

				    - name: Set up the environment which depends on $RUNNER_TEMP on nvme drive

				      id: set-env

				      shell: bash -euxo pipefail {0}

				      run: |

				        PROXY_BENCH_PATH=$(realpath ./proxy-bench)

				        {

				          echo "PROXY_BENCH_PATH=$PROXY_BENCH_PATH"

				          echo "NEON_DIR=${RUNNER_TEMP}/neon"

				          echo "TEST_OUTPUT=${PROXY_BENCH_PATH}/test_output"

				          echo ""

				        } >> "$GITHUB_ENV"

				    - name: Run proxy-bench

				      run: ${PROXY_BENCH_PATH}/run.sh

				    - name: Ingest Bench Results # neon repo script

				      if: always()

				      run: |

				        mkdir -p $TEST_OUTPUT

				        python $NEON_DIR/scripts/proxy_bench_results_ingest.py --out $TEST_OUTPUT

				    - name: Push Metrics to Proxy perf database

				      if: always()

				      env:

				        PERF_TEST_RESULT_CONNSTR: "${{ secrets.PROXY_TEST_RESULT_CONNSTR }}"

				        REPORT_FROM: $TEST_OUTPUT

				      run: $NEON_DIR/scripts/generate_and_push_perf_report.sh

				    - name: Docker cleanup

				      if: always()

				      run: docker compose down

				    - name: Notify Failure

				      if: failure()

				      run: echo "Proxy bench job failed" && exit 1

									
										93

.github/workflows/random-ops-test.yml
									
										vendored
									
										Normal file
									
				@@ -0,0 +1,93 @@

				name: Random Operations Test

				on:

				  schedule:

				    # * is a special character in YAML so you have to quote this string

				    #          ┌───────────── minute (0 - 59)

				    #          │  ┌───────────── hour (0 - 23)

				    #          │  │  ┌───────────── day of the month (1 - 31)

				    #          │  │  │ ┌───────────── month (1 - 12 or JAN-DEC)

				    #          │  │  │ │ ┌───────────── day of the week (0 - 6 or SUN-SAT)

				    - cron:  '23 */2 * * *' # runs every 2 hours

				  workflow_dispatch:

				    inputs:

				      random_seed:

				        type: number

				        description: 'The random seed'

				        required: false

				        default: 0

				      num_operations:

				        type: number

				        description: "The number of operations to test"

				        default: 250

				defaults:

				  run:

				    shell: bash -euxo pipefail {0}

				permissions: {}

				env:

				  DEFAULT_PG_VERSION: 16

				  PLATFORM: neon-captest-new

				  AWS_DEFAULT_REGION: eu-central-1

				jobs:

				  run-random-rests:

				    env:

				      POSTGRES_DISTRIB_DIR: /tmp/neon/pg_install

				    runs-on: small

				    permissions:

				      id-token: write

				      statuses: write

				    strategy:

				      fail-fast: false

				      matrix:

				        pg-version: [16, 17]

				    container:

				      image: ghcr.io/neondatabase/build-tools:pinned-bookworm

				      credentials:

				        username: ${{ github.actor }}

				        password: ${{ secrets.GITHUB_TOKEN }}

				      options: --init

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				      - name: Download Neon artifact

				        uses: ./.github/actions/download

				        with:

				          name: neon-${{ runner.os }}-${{ runner.arch }}-release-artifact

				          path: /tmp/neon/

				          prefix: latest

				          aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				      - name: Run tests

				        uses: ./.github/actions/run-python-test-set

				        with:

				          build_type: remote

				          test_selection: random_ops

				          run_in_parallel: false

				          extra_params: -m remote_cluster

				          pg_version: ${{ matrix.pg-version }}

				          aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				        env:

				          NEON_API_KEY: ${{ secrets.NEON_STAGING_API_KEY }}

				          RANDOM_SEED: ${{ inputs.random_seed }}

				          NUM_OPERATIONS: ${{ inputs.num_operations }}

				      - name: Create Allure report

				        if: ${{ !cancelled() }}

				        id: create-allure-report

				        uses: ./.github/actions/allure-report-generate

				        with:

				          store-test-results-into-db: true

				          aws-oidc-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}

				        env:

				          REGRESS_TEST_RESULT_CONNSTR_NEW: ${{ secrets.REGRESS_TEST_RESULT_CONNSTR_NEW }}

									
										46

.github/workflows/regenerate-pg-setting.yml
									
										vendored
									
										Normal file
									
				@@ -0,0 +1,46 @@

				name: Regenerate Postgres Settings

				on:

				  pull_request:

				    types:

				      - opened

				      - synchronize

				      - reopened

				    paths:

				      - pgxn/neon/**.c

				      - vendor/postgres-v*

				      - vendor/revisions.json

				concurrency:

				  group: ${{ github.workflow }}-${{ github.head_ref }}

				  cancel-in-progress: true

				permissions:

				  pull-requests: write

				jobs:

				  regenerate-pg-settings:

				    runs-on: ubuntu-22.04

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - name: Add comment

				        uses: thollander/actions-comment-pull-request@65f9e5c9a1f2cd378bd74b2e057c9736982a8e74 # v3

				        with:

				          comment-tag: ${{ github.job }}

				          pr-number: ${{ github.event.number }}

				          message: |

				            If this PR added a GUC in the Postgres fork or `neon` extension,

				            please regenerate the Postgres settings in the `cloud` repo:

				            ```

				            make NEON_WORKDIR=path/to/neon/checkout \

				              -C goapp/internal/shareddomain/postgres generate

				            ```

				            If you're an external contributor, a Neon employee will assist in

				            making sure this step is done.

									
										12

.github/workflows/release-compute.yml
									
										vendored
									
										Normal file
									
				@@ -0,0 +1,12 @@

				name: Create compute release PR

				on:

				  schedule:

				    - cron: '0 7 * * FRI'

				jobs:

				  create-release-pr:

				    uses: ./.github/workflows/release.yml

				    with:

				      component: compute

				    secrets: inherit

									
										7

.github/workflows/release-notify.yml
									
										vendored
									
				@@ -22,7 +22,12 @@ jobs:

				    runs-on: ubuntu-22.04

				    steps:

				      - uses: neondatabase/dev-actions/release-pr-notify@main

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				      - uses: neondatabase/dev-actions/release-pr-notify@483a843f2a8bcfbdc4c69d27630528a3ddc4e14b # main

				        with:

				          slack-token: ${{ secrets.SLACK_BOT_TOKEN }}

				          slack-channel-id: ${{ vars.SLACK_UPCOMING_RELEASE_CHANNEL_ID || 'C05QQ9J1BRC' }} # if not set, then `#test-release-notifications`

									
										12

.github/workflows/release-proxy.yml
									
										vendored
									
										Normal file
									
				@@ -0,0 +1,12 @@

				name: Create proxy release PR

				on:

				  schedule:

				    - cron: '0 6 * * TUE'

				jobs:

				  create-release-pr:

				    uses: ./.github/workflows/release.yml

				    with:

				      component: proxy

				    secrets: inherit

									
										12

.github/workflows/release-storage.yml
									
										vendored
									
										Normal file
									
				@@ -0,0 +1,12 @@

				name: Create storage release PR

				on:

				  schedule:

				    - cron: '0 6 * * FRI'

				jobs:

				  create-release-pr:

				    uses: ./.github/workflows/release.yml

				    with:

				      component: storage

				    secrets: inherit

									
										129

.github/workflows/release.yml
									
										vendored
									
				@@ -1,20 +1,34 @@

				name: Create Release Branch

				name: Create release PR

				on:

				  schedule:

				    # It should be kept in sync with if-condition in jobs

				    - cron: '0 6 * * MON' # Storage release

				    - cron: '0 6 * * THU' # Proxy release

				  workflow_dispatch:

				    inputs:

				      create-storage-release-branch:

				        type: boolean

				        description: 'Create Storage release PR'

				      component:

				        description: "Component to release"

				        required: true

				        type: choice

				        options:

				          - compute

				          - proxy

				          - storage

				      cherry-pick:

				        description: "Commits to cherry-pick (space separated, makes this a hotfix based on previous release)"

				        required: false

				      create-proxy-release-branch:

				        type: boolean

				        description: 'Create Proxy release PR'

				        type: string

				        default: ''

				  workflow_call:

				    inputs:

				      component:

				        description: "Component to release"

				        required: true

				        type: string

				      cherry-pick:

				        description: "Commits to cherry-pick (space separated, makes this a hotfix based on previous release)"

				        required: false

				        type: string

				        default: ''

				# No permission for GITHUB_TOKEN by default; the **minimal required** set of permissions should be granted in each job.

				permissions: {}

				@@ -24,84 +38,31 @@ defaults:

				    shell: bash -euo pipefail {0}

				jobs:

				  create-storage-release-branch:

				    if: ${{ github.event.schedule == '0 6 * * MON' || format('{0}', inputs.create-storage-release-branch) == 'true' }}

				  create-release-pr:

				    runs-on: ubuntu-22.04

				    permissions:

				      contents: write # for `git push`

				      contents: write

				    steps:

				    - name: Check out code

				      uses: actions/checkout@v4

				      with:

				        ref: main

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				        with:

				          egress-policy: audit

				    - name: Set environment variables

				      run: |

				        echo "RELEASE_DATE=$(date +'%Y-%m-%d')" | tee -a $GITHUB_ENV

				        echo "RELEASE_BRANCH=rc/$(date +'%Y-%m-%d')" | tee -a $GITHUB_ENV

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				    - name: Create release branch

				      run: git checkout -b $RELEASE_BRANCH

				      - name: Configure git

				        run: |

				          git config user.name "github-actions[bot]"

				          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"

				    - name: Push new branch

				      run: git push origin $RELEASE_BRANCH

				    - name: Create pull request into release

				      env:

				        GH_TOKEN: ${{ secrets.CI_ACCESS_TOKEN }}

				      run: |

				        TITLE="Storage & Compute release ${RELEASE_DATE}"

				        cat << EOF > body.md

				          ## ${TITLE}

				          **Please merge this Pull Request using 'Create a merge commit' button**

				        EOF

				        gh pr create --title "${TITLE}" \

				                     --body-file "body.md" \

				                     --head "${RELEASE_BRANCH}" \

				                     --base "release"

				  create-proxy-release-branch:

				    if: ${{ github.event.schedule == '0 6 * * THU' || format('{0}', inputs.create-proxy-release-branch) == 'true' }}

				    runs-on: ubuntu-22.04

				    permissions:

				      contents: write # for `git push`

				    steps:

				    - name: Check out code

				      uses: actions/checkout@v4

				      with:

				        ref: main

				    - name: Set environment variables

				      run: |

				        echo "RELEASE_DATE=$(date +'%Y-%m-%d')" | tee -a $GITHUB_ENV

				        echo "RELEASE_BRANCH=rc/proxy/$(date +'%Y-%m-%d')" | tee -a $GITHUB_ENV

				    - name: Create release branch

				      run: git checkout -b $RELEASE_BRANCH

				    - name: Push new branch

				      run: git push origin $RELEASE_BRANCH

				    - name: Create pull request into release

				      env:

				        GH_TOKEN: ${{ secrets.CI_ACCESS_TOKEN }}

				      run: |

				        TITLE="Proxy release ${RELEASE_DATE}"

				        cat << EOF > body.md

				          ## ${TITLE}

				          **Please merge this Pull Request using 'Create a merge commit' button**

				        EOF

				        gh pr create --title "${TITLE}" \

				                     --body-file "body.md" \

				                     --head "${RELEASE_BRANCH}" \

				                     --base "release-proxy"

				      - name: Create release PR

				        uses: neondatabase/dev-actions/release-pr@290dec821d86fa8a93f019e8c69720f5865b5677

				        with:

				          component: ${{ inputs.component }}

				          cherry-pick: ${{ inputs.cherry-pick }}

				        env:

				          GH_TOKEN: ${{ secrets.CI_ACCESS_TOKEN }}

									
										71

.github/workflows/report-workflow-stats-batch.yml
									
										vendored
									
										Normal file
									
				@@ -0,0 +1,71 @@

				name: Report Workflow Stats Batch

				on:

				  schedule:

				    - cron: '*/15 * * * *'

				    - cron: '25 0 * * *'

				    - cron: '25 1 * * 6'

				permissions:

				  contents: read

				jobs:

				  gh-workflow-stats-batch-2h:

				    name: GitHub Workflow Stats Batch 2 hours

				    if: github.event.schedule == '*/15 * * * *'

				    runs-on: ubuntu-22.04

				    permissions:

				      actions: read

				    steps:

				    - name: Harden the runner (Audit all outbound calls)

				      uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				      with:

				        egress-policy: audit

				    - name: Export Workflow Run for the past 2 hours

				      uses: neondatabase/gh-workflow-stats-action@701b1f202666d0b82e67b4d387e909af2b920127 # v0.2.2

				      with:

				        db_uri: ${{ secrets.GH_REPORT_STATS_DB_RW_CONNSTR }}

				        db_table: "gh_workflow_stats_neon"

				        gh_token: ${{ secrets.GITHUB_TOKEN }}

				        duration: '2h'

				  gh-workflow-stats-batch-48h:

				    name: GitHub Workflow Stats Batch 48 hours

				    if: github.event.schedule == '25 0 * * *'

				    runs-on: ubuntu-22.04

				    permissions:

				      actions: read

				    steps:

				    - name: Harden the runner (Audit all outbound calls)

				      uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				      with:

				        egress-policy: audit

				    - name: Export Workflow Run for the past 48 hours

				      uses: neondatabase/gh-workflow-stats-action@701b1f202666d0b82e67b4d387e909af2b920127 # v0.2.2

				      with:

				        db_uri: ${{ secrets.GH_REPORT_STATS_DB_RW_CONNSTR }}

				        db_table: "gh_workflow_stats_neon"

				        gh_token: ${{ secrets.GITHUB_TOKEN }}

				        duration: '48h'

				  gh-workflow-stats-batch-30d:

				    name: GitHub Workflow Stats Batch 30 days

				    if: github.event.schedule == '25 1 * * 6'

				    runs-on: ubuntu-22.04

				    permissions:

				      actions: read

				    steps:

				    - name: Harden the runner (Audit all outbound calls)

				      uses: step-security/harden-runner@4d991eb9b905ef189e4c376166672c3f2f230481 # v2.11.0

				      with:

				        egress-policy: audit

				    - name: Export Workflow Run for the past 30 days

				      uses: neondatabase/gh-workflow-stats-action@701b1f202666d0b82e67b4d387e909af2b920127 # v0.2.2

				      with:

				        db_uri: ${{ secrets.GH_REPORT_STATS_DB_RW_CONNSTR }}

				        db_table: "gh_workflow_stats_neon"

				        gh_token: ${{ secrets.GITHUB_TOKEN }}

				        duration: '720h'
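
The three jobs above are selected by comparing `github.event.schedule` with the cron string that fired, and each passes a different `duration`. The Python sketch below restates that dispatch as a lookup table and checks that `720h` corresponds to the 30 days in the job name. The names `SCHEDULE_TO_DURATION`, `duration_for`, and `hours` are hypothetical and not part of the workflow or the action.

```python
from typing import Optional

# Cron expression -> `duration` input, as listed in the workflow above.
SCHEDULE_TO_DURATION = {
    "*/15 * * * *": "2h",
    "25 0 * * *": "48h",
    "25 1 * * 6": "720h",
}


def duration_for(schedule: str) -> Optional[str]:
    """Return the duration for the cron string that fired, or None."""
    return SCHEDULE_TO_DURATION.get(schedule)


def hours(duration: str) -> int:
    """Parse durations of the form '<n>h', the only form used above."""
    if not duration.endswith("h"):
        raise ValueError(f"unsupported duration: {duration!r}")
    return int(duration[:-1])


if __name__ == "__main__":
    for schedule, duration in SCHEDULE_TO_DURATION.items():
        print(f"{schedule!r} -> {duration} ({hours(duration) / 24:g} days)")
```

Because the match is on the exact cron string, any change to a `cron:` entry has to be mirrored in the corresponding job's `if:` condition, otherwise that job is silently skipped.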

									
										114

.github/workflows/trigger-e2e-tests.yml
									
										vendored
									
				@@ -5,6 +5,13 @@ on:

				    types:

				      - ready_for_review

				  workflow_call:

				    inputs:

				      github-event-name:

				        type: string

				        required: true

				      github-event-json:

				        type: string

				        required: true

				defaults:

				  run:

				@@ -15,11 +22,23 @@ env:

				  E2E_CONCURRENCY_GROUP: ${{ github.repository }}-e2e-tests-${{ github.ref_name }}-${{ github.ref_name == 'main' && github.sha || 'anysha' }}

				jobs:

				  check-permissions:

				    if: ${{ !contains(github.event.pull_request.labels.*.name, 'run-no-ci') }}

				    uses: ./.github/workflows/check-permissions.yml

				    with:

				      github-event-name: ${{ inputs.github-event-name || github.event_name }}

				  cancel-previous-e2e-tests:

				    needs: [ check-permissions ]

				    if: github.event_name == 'pull_request'

				    runs-on: ubuntu-22.04

				    steps:

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@0634a2670c59f64b4a01f0f96f84700a4088b9f0 # v2.12.0

				        with:

				          egress-policy: audit

				      - name: Cancel previous e2e-tests runs for this PR

				        env:

				          GH_TOKEN: ${{ secrets.CI_ACCESS_TOKEN }}

				@@ -28,45 +47,37 @@ jobs:

				            run cancel-previous-in-concurrency-group.yml \

				              --field concurrency_group="${{ env.E2E_CONCURRENCY_GROUP }}"

				  tag:

				    runs-on: ubuntu-22.04

				    outputs:

				      build-tag: ${{ steps.build-tag.outputs.tag }}

				    steps:

				      - name: Checkout

				        uses: actions/checkout@v4

				        with:

				          fetch-depth: 0

				      - name: Get build tag

				        env:

				          GH_TOKEN: ${{ secrets.CI_ACCESS_TOKEN }}

				          CURRENT_BRANCH: ${{ github.head_ref || github.ref_name }}

				          CURRENT_SHA: ${{ github.event.pull_request.head.sha || github.sha }}

				        run: |

				          if [[ "$GITHUB_REF_NAME" == "main" ]]; then

				            echo "tag=$(git rev-list --count HEAD)" | tee -a $GITHUB_OUTPUT

				          elif [[ "$GITHUB_REF_NAME" == "release" ]]; then

				            echo "tag=release-$(git rev-list --count HEAD)" | tee -a $GITHUB_OUTPUT

				          elif [[ "$GITHUB_REF_NAME" == "release-proxy" ]]; then

				            echo "tag=release-proxy-$(git rev-list --count HEAD)" >> $GITHUB_OUTPUT

				          else

				            echo "GITHUB_REF_NAME (value '$GITHUB_REF_NAME') is not set to either 'main' or 'release'"

				            BUILD_AND_TEST_RUN_ID=$(gh run list -b $CURRENT_BRANCH -c $CURRENT_SHA -w 'Build and Test' -L 1 --json databaseId --jq '.[].databaseId')

				            echo "tag=$BUILD_AND_TEST_RUN_ID" | tee -a $GITHUB_OUTPUT

				          fi

				        id: build-tag

				  meta:

				    uses: ./.github/workflows/_meta.yml

				    with:

				      github-event-name: ${{ inputs.github-event-name || github.event_name }}

				      github-event-json: ${{ inputs.github-event-json || toJSON(github.event) }}

				  trigger-e2e-tests:

				    needs: [ tag ]

				    needs: [ meta ]

				    runs-on: ubuntu-22.04

				    env:

				      EVENT_ACTION: ${{ github.event.action }}

				      GH_TOKEN: ${{ secrets.CI_ACCESS_TOKEN }}

				      TAG: ${{ needs.tag.outputs.build-tag }}

				      TAG: >-

				        ${{

				          contains(fromJSON('["compute-release", "compute-rc-pr"]'), needs.meta.outputs.run-kind)

				          && needs.meta.outputs.previous-storage-release

				          || needs.meta.outputs.build-tag

				        }}

				      COMPUTE_TAG: >-

				        ${{

				          contains(fromJSON('["storage-release", "storage-rc-pr", "proxy-release", "proxy-rc-pr"]'), needs.meta.outputs.run-kind)

				          && needs.meta.outputs.previous-compute-release

				          || needs.meta.outputs.build-tag

				        }}

				    steps:

				      - name: Wait for `promote-images` job to finish

				      - name: Harden the runner (Audit all outbound calls)

				        uses: step-security/harden-runner@0634a2670c59f64b4a01f0f96f84700a4088b9f0 # v2.12.0

				        with:

				          egress-policy: audit

				      - name: Wait for `push-{neon,compute}-image-dev` job to finish

				        # It's important to have a timeout here, the script in the step can run infinitely

				        timeout-minutes: 60

				        run: |

				@@ -77,20 +88,20 @@ jobs:

				          # For PRs we use the run id as the tag

				          BUILD_AND_TEST_RUN_ID=${TAG}

				          while true; do

				            conclusion=$(gh run --repo ${GITHUB_REPOSITORY} view ${BUILD_AND_TEST_RUN_ID} --json jobs --jq '.jobs[] | select(.name == "promote-images") | .conclusion')

				            case "$conclusion" in

				              success)

				                break

				                ;;

				              failure | cancelled | skipped)

				                echo "The 'promote-images' job didn't succeed: '${conclusion}'. Exiting..."

				                exit 1

				                ;;

				              *)

				                echo "The 'promote-images' hasn't succeed yet. Waiting..."

				                sleep 60

				                ;;

				            esac

				            gh run --repo ${GITHUB_REPOSITORY} view ${BUILD_AND_TEST_RUN_ID} --json jobs --jq '[.jobs[] | select((.name | startswith("push-neon-image-dev")) or (.name | startswith("push-compute-image-dev"))) | {"name": .name, "conclusion": .conclusion, "url": .url}]' > jobs.json

				            if [ $(jq '[.[] | select(.conclusion == "success")] | length' jobs.json) -eq 2 ]; then

				              break

				            fi

				            jq -c '.[]' jobs.json | while read -r job; do

				              case $(echo $job | jq .conclusion) in

				                failure | cancelled | skipped)

				                  echo "The '$(echo $job | jq .name)' job didn't succeed: '$(echo $job | jq .conclusion)'. See log in '$(echo $job | jq .url)' Exiting..."

				                  exit 1

				                  ;;

				              esac

				            done

				            echo "The 'push-{neon,compute}-image-dev' jobs haven't succeeded yet. Waiting..."

				            sleep 60

				          done

				      - name: Set e2e-platforms

				@@ -102,12 +113,17 @@ jobs:

				          # Default set of platforms to run e2e tests on

				          platforms='["docker", "k8s"]'

				          # If the PR changes vendor/, pgxn/ or libs/vm_monitor/ directories, or Dockerfile.compute-node, add k8s-neonvm to the list of platforms.

				          # If a PR changes anything that affects computes, add k8s-neonvm to the list of platforms.

				          # If the workflow run is not a pull request, add k8s-neonvm to the list.

				          if [ "$GITHUB_EVENT_NAME" == "pull_request" ]; then

				            for f in $(gh api "/repos/${GITHUB_REPOSITORY}/pulls/${PR_NUMBER}/files" --paginate --jq '.[].filename'); do

				              case "$f" in

				                vendor/*|pgxn/*|libs/vm_monitor/*|Dockerfile.compute-node)

				                # List of directories that contain code which affect compute images.

				                #

				                # This isn't exhaustive, just the paths that are most directly compute-related.

				                # For example, compute_ctl also depends on libs/utils, but we don't trigger

				                # an e2e run on that.

				                vendor/*|pgxn/*|compute_tools/*|libs/vm_monitor/*|compute/compute-node.Dockerfile)

				                  platforms=$(echo "${platforms}" | jq --compact-output '. += ["k8s-neonvm"] | unique')

				                  ;;

				                *)

				@@ -142,6 +158,6 @@ jobs:

				              --raw-field "commit_hash=$COMMIT_SHA" \

				              --raw-field "remote_repo=${GITHUB_REPOSITORY}" \

				              --raw-field "storage_image_tag=${TAG}" \

				              --raw-field "compute_image_tag=${TAG}" \

				              --raw-field "compute_image_tag=${COMPUTE_TAG}" \

				              --raw-field "concurrency_group=${E2E_CONCURRENCY_GROUP}" \

				              --raw-field "e2e-platforms=${E2E_PLATFORMS}"
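Note on the `TAG` / `COMPUTE_TAG` expressions in this workflow: they use the GitHub Actions `cond && a || b` idiom, which behaves like a ternary but falls back to `b` whenever `a` is empty. The following is a minimal Python sketch of that selection, not the workflow's actual implementation. The function name and the sample tag values are hypothetical; only the run-kind strings are taken from the diff.

```python
# Run kinds copied from the workflow expressions in the diff above.
COMPUTE_KINDS = {"compute-release", "compute-rc-pr"}
STORAGE_PROXY_KINDS = {
    "storage-release", "storage-rc-pr", "proxy-release", "proxy-rc-pr",
}


def select_tags(run_kind, build_tag, previous_storage_release,
                previous_compute_release):
    """Return (TAG, COMPUTE_TAG) the way `cond && a || b` would.

    Hypothetical helper: `(cond and a) or b` mirrors the expression,
    including the fallback to `b` when `a` is an empty string.
    """
    tag = (run_kind in COMPUTE_KINDS and previous_storage_release) or build_tag
    compute_tag = (
        run_kind in STORAGE_PROXY_KINDS and previous_compute_release
    ) or build_tag
    return tag, compute_tag


if __name__ == "__main__":
    # Sample values are made up for illustration only.
    print(select_tags("compute-release", "build-2", "storage-1", "compute-1"))
    print(select_tags("proxy-release", "build-2", "storage-1", "compute-1"))
    print(select_tags("pr", "build-2", "storage-1", "compute-1"))
```

Under this reading, a compute release is tested against the previous storage release, while storage and proxy releases are tested against the previous compute release; every other run kind uses the current build tag for both images.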

4

.gitignore vendored


@@ -1,3 +1,5 @@
 /artifact_cache
 /build
 /pg_install
 /target
 /tmp_check
@@ -6,6 +8,8 @@ __pycache__/
 test_output/
 .vscode
 .idea
 *.swp
 tags
 neon.iml
 /.neon
 /integration_tests/.neon

4

.gitmodules vendored


@@ -10,3 +10,7 @@
 	path = vendor/postgres-v16
 	url = https://github.com/neondatabase/postgres.git
 	branch = REL_16_STABLE_neon
 [submodule "vendor/postgres-v17"]
 	path = vendor/postgres-v17
 	url = https://github.com/neondatabase/postgres.git
 	branch = REL_17_STABLE_neon

32

CODEOWNERS


@@ -1,13 +1,29 @@
 /compute_tools/ @neondatabase/control-plane @neondatabase/compute
 # Autoscaling
 /libs/vm_monitor/ @neondatabase/autoscaling
 # DevProd & PerfCorr
 /.github/ @neondatabase/developer-productivity @neondatabase/performance-correctness
 # Compute
 /pgxn/ @neondatabase/compute
 /vendor/ @neondatabase/compute
 /compute/ @neondatabase/compute
 /compute_tools/ @neondatabase/compute
 # Proxy
 /libs/proxy/ @neondatabase/proxy
 /proxy/ @neondatabase/proxy
 # Storage
 /pageserver/ @neondatabase/storage
 /safekeeper/ @neondatabase/storage
 /storage_controller @neondatabase/storage
 /storage_scrubber @neondatabase/storage
 /libs/pageserver_api/ @neondatabase/storage
 /libs/postgres_ffi/ @neondatabase/compute @neondatabase/storage
 /libs/remote_storage/ @neondatabase/storage
 /libs/safekeeper_api/ @neondatabase/storage
 /libs/vm_monitor/ @neondatabase/autoscaling
 /pageserver/ @neondatabase/storage
 /pgxn/ @neondatabase/compute
 # Shared
 /pgxn/neon/ @neondatabase/compute @neondatabase/storage
 /proxy/ @neondatabase/proxy
 /safekeeper/ @neondatabase/storage
 /vendor/ @neondatabase/compute
 /libs/compute_api/ @neondatabase/compute @neondatabase/control-plane
 /libs/postgres_ffi/ @neondatabase/compute @neondatabase/storage
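Note on ordering: in a CODEOWNERS file the last matching pattern takes precedence, which is presumably why the more specific `/pgxn/neon/` entry sits in a "Shared" section after the broader `/pgxn/` entry. The sketch below illustrates that rule with a simplified prefix match. It is an assumption-laden illustration, not GitHub's matcher: real CODEOWNERS patterns follow gitignore-style globbing, and the function name and sample paths are hypothetical.

```python
# A few rules taken from the CODEOWNERS diff above, in file order.
RULES = [
    ("/pgxn/", ["@neondatabase/compute"]),
    ("/vendor/", ["@neondatabase/compute"]),
    ("/pgxn/neon/", ["@neondatabase/compute", "@neondatabase/storage"]),
]


def owners_for(path, rules=RULES):
    """Return the owners of `path` using last-match-wins semantics.

    Simplification: directory patterns are treated as plain prefixes of
    the repository-relative path.
    """
    owners = []
    rooted = "/" + path.lstrip("/")
    for prefix, teams in rules:
        if rooted.startswith(prefix):
            owners = teams  # a later matching rule overrides earlier ones
    return owners


if __name__ == "__main__":
    # Sample paths are hypothetical.
    print(owners_for("pgxn/neon/example.c"))
    print(owners_for("pgxn/other_ext/example.c"))
```

If the two `/pgxn` entries were swapped, the broader rule would win and the storage team would no longer be requested for changes under `/pgxn/neon/`.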

3718

Cargo.lock generated


File diff suppressed because it is too large

									
										191

Cargo.toml
									
												
				@@ -9,21 +9,28 @@ members = [

				    "pageserver/ctl",

				    "pageserver/client",

				    "pageserver/pagebench",

				    "pageserver/page_api",

				    "proxy",

				    "safekeeper",

				    "safekeeper/client",

				    "storage_broker",

				    "storage_controller",

				    "storage_controller/client",

				    "storage_scrubber",

				    "workspace_hack",

				    "libs/compute_api",

				    "libs/http-utils",

				    "libs/pageserver_api",

				    "libs/postgres_ffi",

				    "libs/postgres_ffi_types",

				    "libs/postgres_versioninfo",

				    "libs/safekeeper_api",

				    "libs/desim",

				    "libs/neon-shmem",

				    "libs/utils",

				    "libs/consumption_metrics",

				    "libs/postgres_backend",

				    "libs/posthog_client_lite",

				    "libs/pq_proto",

				    "libs/tenant_size_model",

				    "libs/metrics",

				@@ -33,52 +40,57 @@ members = [

				    "libs/postgres_ffi/wal_craft",

				    "libs/vm_monitor",

				    "libs/walproposer",

				    "libs/wal_decoder",

				    "libs/postgres_initdb",

				    "libs/proxy/postgres-protocol2",

				    "libs/proxy/postgres-types2",

				    "libs/proxy/tokio-postgres2",

				    "endpoint_storage",

				]

				[workspace.package]

				edition = "2021"

				edition = "2024"

				license = "Apache-2.0"

				## All dependency versions, used in the project

				[workspace.dependencies]

				ahash = "0.8"

				anyhow = { version = "1.0", features = ["backtrace"] }

				arc-swap = "1.6"

				arc-swap = "1.7"

				async-compression = { version = "0.4.0", features = ["tokio", "gzip", "zstd"] }

				atomic-take = "1.1.0"

				azure_core = { version = "0.19", default-features = false, features = ["enable_reqwest_rustls", "hmac_rust"] }

				azure_identity = { version = "0.19", default-features = false, features = ["enable_reqwest_rustls"] }

				azure_storage = { version = "0.19", default-features = false, features = ["enable_reqwest_rustls"] }

				azure_storage_blobs = { version = "0.19", default-features = false, features = ["enable_reqwest_rustls"] }

				flate2 = "1.0.26"

				assert-json-diff = "2"

				async-stream = "0.3"

				async-trait = "0.1"

				aws-config = { version = "1.3", default-features = false, features=["rustls"] }

				aws-sdk-s3 = "1.26"

				aws-sdk-iam = "1.15.0"

				aws-config = { version = "1.5", default-features = false, features=["rustls", "sso"] }

				aws-sdk-s3 = "1.52"

				aws-sdk-iam = "1.46.0"

				aws-sdk-kms = "1.47.0"

				aws-smithy-async = { version = "1.2.1", default-features = false, features=["rt-tokio"] }

				aws-smithy-types = "1.1.9"

				aws-smithy-types = "1.2"

				aws-credential-types = "1.2.0"

				aws-sigv4 = { version = "1.2.1", features = ["sign-http"] }

				aws-types = "1.2.0"

				axum = { version = "0.6.20", features = ["ws"] }

				base64 = "0.13.0"

				aws-sigv4 = { version = "1.2", features = ["sign-http"] }

				aws-types = "1.3"

				axum = { version = "0.8.1", features = ["ws"] }

				axum-extra = { version = "0.10.0", features = ["typed-header", "query"] }

				base64 = "0.22"

				bincode = "1.3"

				bindgen = "0.70"

				bindgen = "0.71"

				bit_field = "0.10.2"

				bstr = "1.0"

				byteorder = "1.4"

				bytes = "1.0"

				bytes = "1.9"

				camino = "1.1.6"

				cfg-if = "1.0.0"

				cron = "0.15"

				chrono = { version = "0.4", default-features = false, features = ["clock"] }

				clap = { version = "4.0", features = ["derive"] }

				clap = { version = "4.0", features = ["derive", "env"] }

				clashmap = { version = "1.0", features = ["raw-api"] }

				comfy-table = "7.1"

				const_format = "0.2"

				crc32c = "0.6"

				crossbeam-deque = "0.8.5"

				crossbeam-utils = "0.8.5"

				dashmap = { version = "5.5.0", features = ["raw-api"] }

				diatomic-waker = { version = "0.2.3" }

				either = "1.8"

				enum-map = "2.4.2"

				enumset = "1.0.12"

				@@ -89,154 +101,183 @@ futures = "0.3"

				futures-core = "0.3"

				futures-util = "0.3"

				git-version = "0.3"

				governor = "0.8"

				hashbrown = "0.14"

				hashlink = "0.9.1"

				hdrhistogram = "7.5.2"

				hex = "0.4"

				hex-literal = "0.4"

				hmac = "0.12.1"

				hostname = "0.3.1"

				hostname = "0.4"

				http = {version = "1.1.0", features = ["std"]}

				http-types = { version = "2", default-features = false }

				humantime = "2.1"

				http-body-util = "0.1.2"

				humantime = "2.2"

				humantime-serde = "1.1.1"

				hyper = "0.14"

				tokio-tungstenite = "0.20.0"

				indexmap = "2"

				hyper0 = { package = "hyper", version = "0.14" }

				hyper = "1.4"

				hyper-util = "0.1"

				tokio-tungstenite = "0.21.0"

				indexmap = { version = "2", features = ["serde"] }

				indoc = "2"

				inotify = "0.10.2"

				ipnet = "2.9.0"

				ipnet = "2.10.0"

				itertools = "0.10"

				itoa = "1.0.11"

				jemalloc_pprof = { version = "0.7", features = ["symbolize", "flamegraph"] }

				jsonwebtoken = "9"

				lasso = "0.7"

				libc = "0.2"

				md5 = "0.7.0"

				measured = { version = "0.0.22", features=["lasso"] }

				measured-process = { version = "0.0.22" }

				memoffset = "0.8"

				nix = { version = "0.27", features = ["dir", "fs", "process", "socket", "signal", "poll"] }

				memoffset = "0.9"

				nix = { version = "0.30.1", features = ["dir", "fs", "mman", "process", "socket", "signal", "poll"] }

				# Do not update to >= 7.0.0, at least. The update will have a significant impact

				# on compute startup metrics (start_postgres_ms), >= 25% degradation.

				notify = "6.0.0"

				num_cpus = "1.15"

				num-traits = "0.2.15"

				num-traits = "0.2.19"

				once_cell = "1.13"

				opentelemetry = "0.20.0"

				opentelemetry-otlp = { version = "0.13.0", default-features=false, features = ["http-proto", "trace", "http", "reqwest-client"] }

				opentelemetry-semantic-conventions = "0.12.0"

				opentelemetry = "0.27"

				opentelemetry_sdk = "0.27"

				opentelemetry-otlp = { version = "0.27", default-features = false, features = ["http-proto", "trace", "http", "reqwest-client"] }

				opentelemetry-semantic-conventions = "0.27"

				parking_lot = "0.12"

				parquet = { version = "53", default-features = false, features = ["zstd"] }

				parquet_derive = "53"

				pbkdf2 = { version = "0.12.1", features = ["simple", "std"] }

				pem = "3.0.3"

				pin-project-lite = "0.2"

				pprof = { version = "0.14", features = ["criterion", "flamegraph", "frame-pointer", "prost-codec"] }

				procfs = "0.16"

				prometheus = {version = "0.13", default-features=false, features = ["process"]} # removes protobuf dependency

				prost = "0.11"

				prost = "0.13.5"

				prost-types = "0.13.5"

				rand = "0.8"

				redis = { version = "0.25.2", features = ["tokio-rustls-comp", "keep-alive"] }

				redis = { version = "0.29.2", features = ["tokio-rustls-comp", "keep-alive"] }

				regex = "1.10.2"

				reqwest = { version = "0.12", default-features = false, features = ["rustls-tls"] }

				reqwest-tracing = { version = "0.5", features = ["opentelemetry_0_20"] }

				reqwest-middleware = "0.3.0"

				reqwest-retry = "0.5"

				reqwest-tracing = { version = "0.5", features = ["opentelemetry_0_27"] }

				reqwest-middleware = "0.4"

				reqwest-retry = "0.7"

				routerify = "3"

				rpds = "0.13"

				rustc-hash = "1.1.0"

				rustls = "0.22"

				rustls = { version = "0.23.16", default-features = false }

				rustls-pemfile = "2"

				rustls-split = "0.3"

				rustls-pki-types = "1.11"

				scopeguard = "1.1"

				sysinfo = "0.29.2"

				sd-notify = "0.4.1"

				send-future = "0.1.0"

				sentry = { version = "0.32", default-features = false, features = ["backtrace", "contexts", "panic", "rustls", "reqwest" ] }

				sentry = { version = "0.37", default-features = false, features = ["backtrace", "contexts", "panic", "rustls", "reqwest" ] }

				serde = { version = "1.0", features = ["derive"] }

				serde_json = "1"

				serde_path_to_error = "0.1"

				serde_with = "2.0"

				serde_with = { version = "3", features = [ "base64" ] }

				serde_assert = "0.5.0"

				serde_repr = "0.1.20"

				sha2 = "0.10.2"

				signal-hook = "0.3"

				smallvec = "1.11"

				smol_str = { version = "0.2.0", features = ["serde"] }

				socket2 = "0.5"

				spki = "0.7.3"

				strum = "0.26"

				strum_macros = "0.26"

				"subtle"  = "2.5.0"

				svg_fmt = "0.4.3"

				sync_wrapper = "0.1.2"

				tar = "0.4"

				task-local-extensions = "0.1.4"

				test-context = "0.3"

				thiserror = "1.0"

				tikv-jemallocator = "0.5"

				tikv-jemalloc-ctl = "0.5"

				tokio = { version = "1.17", features = ["macros"] }

				tikv-jemallocator = { version = "0.6", features = ["profiling", "stats", "unprefixed_malloc_on_supported_platforms"] }

				tikv-jemalloc-ctl = { version = "0.6", features = ["stats"] }

				tokio = { version = "1.43.1", features = ["macros"] }

				tokio-epoll-uring = { git = "https://github.com/neondatabase/tokio-epoll-uring.git" , branch = "main" }

				tokio-io-timeout = "1.2.0"

				tokio-postgres-rustls = "0.11.0"

				tokio-rustls = "0.25"

				tokio-postgres-rustls = "0.12.0"

				tokio-rustls = { version = "0.26.0", default-features = false, features = ["tls12", "ring"]}

				tokio-stream = "0.1"

				tokio-tar = "0.3"

				tokio-util = { version = "0.7.10", features = ["io", "rt"] }

				tokio-util = { version = "0.7.10", features = ["io", "io-util", "rt"] }

				toml = "0.8"

				toml_edit = "0.22"

				tonic = {version = "0.9", features = ["tls", "tls-roots"]}

				tower-service = "0.3.2"

				tonic = { version = "0.13.1", default-features = false, features = ["channel", "codegen", "gzip", "prost", "router", "server", "tls-ring", "tls-native-roots", "zstd"] }

				tonic-reflection = { version = "0.13.1", features = ["server"] }

				tower = { version = "0.5.2", default-features = false }

				tower-http = { version = "0.6.2", features = ["auth", "request-id", "trace"] }

				# This revision uses opentelemetry 0.27. There's no tag for it.

				tower-otel = { git = "https://github.com/mattiapenati/tower-otel", rev = "56a7321053bcb72443888257b622ba0d43a11fcd" }

				tower-service = "0.3.3"

				tracing = "0.1"

				tracing-error = "0.2.0"

				tracing-opentelemetry = "0.21.0"

				tracing-error = "0.2"

				tracing-log = "0.2"

				tracing-opentelemetry = "0.28"

				tracing-serde = "0.2.0"

				tracing-subscriber = { version = "0.3", default-features = false, features = ["smallvec", "fmt", "tracing-log", "std", "env-filter", "json"] }

				try-lock = "0.2.5"

				test-log = { version = "0.2.17", default-features = false, features = ["log"] }

				twox-hash = { version = "1.6.3", default-features = false }

				typed-json = "0.1"

				url = "2.2"

				urlencoding = "2.1"

				uuid = { version = "1.6.1", features = ["v4", "v7", "serde"] }

				walkdir = "2.3.2"

				rustls-native-certs = "0.7"

				x509-parser = "0.15"

				rustls-native-certs = "0.8"

				whoami = "1.5.1"

				zerocopy = { version = "0.8", features = ["derive", "simd"] }

				json-structural-diff = { version = "0.2.0" }

				x509-cert = { version = "0.2.5" }

				## TODO replace this with tracing

				env_logger = "0.10"

				env_logger = "0.11"

				log = "0.4"

				## Libraries from neondatabase/ git forks, ideally with changes to be upstreamed

				postgres = { git = "https://github.com/neondatabase/rust-postgres.git", branch = "neon" }

				postgres-protocol = { git = "https://github.com/neondatabase/rust-postgres.git", branch = "neon" }

				postgres-types = { git = "https://github.com/neondatabase/rust-postgres.git", branch = "neon" }

				tokio-postgres = { git = "https://github.com/neondatabase/rust-postgres.git", branch = "neon" }

				# We want to use the 'neon' branch for these, but there's currently one

				# incompatible change on the branch. See:

				#

				# - PR #8076 which contained changes that depended on the new changes in

				#   the rust-postgres crate, and

				# - PR #8654 which reverted those changes and made the code in proxy incompatible

				#   with the tip of the 'neon' branch again.

				#

				# When those proxy changes are re-applied (see PR #8747), we can switch using

				# the tip of the 'neon' branch again.

				postgres = { git = "https://github.com/neondatabase/rust-postgres.git", rev = "20031d7a9ee1addeae6e0968e3899ae6bf01cee2" }

				postgres-protocol = { git = "https://github.com/neondatabase/rust-postgres.git", rev = "20031d7a9ee1addeae6e0968e3899ae6bf01cee2" }

				postgres-types = { git = "https://github.com/neondatabase/rust-postgres.git", rev = "20031d7a9ee1addeae6e0968e3899ae6bf01cee2" }

				tokio-postgres = { git = "https://github.com/neondatabase/rust-postgres.git", rev = "20031d7a9ee1addeae6e0968e3899ae6bf01cee2" }

				## Azure SDK crates

				azure_core = { git = "https://github.com/neondatabase/azure-sdk-for-rust.git", branch = "neon", default-features = false, features = ["enable_reqwest_rustls", "hmac_rust"] }

				azure_identity = { git = "https://github.com/neondatabase/azure-sdk-for-rust.git", branch = "neon", default-features = false, features = ["enable_reqwest_rustls"] }

				azure_storage = { git = "https://github.com/neondatabase/azure-sdk-for-rust.git", branch = "neon", default-features = false, features = ["enable_reqwest_rustls"] }

				azure_storage_blobs = { git = "https://github.com/neondatabase/azure-sdk-for-rust.git", branch = "neon", default-features = false, features = ["enable_reqwest_rustls"] }

				## Local libraries

				compute_api = { version = "0.1", path = "./libs/compute_api/" }

				consumption_metrics = { version = "0.1", path = "./libs/consumption_metrics/" }

				desim = { version = "0.1", path = "./libs/desim" }

				endpoint_storage = { version = "0.0.1", path = "./endpoint_storage/" }

				http-utils = { version = "0.1", path = "./libs/http-utils/" }

				metrics = { version = "0.1", path = "./libs/metrics/" }

				pageserver = { path = "./pageserver" }

				pageserver_api = { version = "0.1", path = "./libs/pageserver_api/" }

				pageserver_client = { path = "./pageserver/client" }

				pageserver_compaction = { version = "0.1", path = "./pageserver/compaction/" }

				pageserver_page_api = { path = "./pageserver/page_api" }

				postgres_backend = { version = "0.1", path = "./libs/postgres_backend/" }

				postgres_connection = { version = "0.1", path = "./libs/postgres_connection/" }

				postgres_ffi = { version = "0.1", path = "./libs/postgres_ffi/" }

				postgres_ffi_types = { version = "0.1", path = "./libs/postgres_ffi_types/" }

				postgres_versioninfo = { version = "0.1", path = "./libs/postgres_versioninfo/" }

				postgres_initdb = { path = "./libs/postgres_initdb" }

				posthog_client_lite = { version = "0.1", path = "./libs/posthog_client_lite" }

				pq_proto = { version = "0.1", path = "./libs/pq_proto/" }

				remote_storage = { version = "0.1", path = "./libs/remote_storage/" }

				safekeeper_api = { version = "0.1", path = "./libs/safekeeper_api" }

				desim = { version = "0.1", path = "./libs/desim" }

				safekeeper_client = { path = "./safekeeper/client" }

				storage_broker = { version = "0.1", path = "./storage_broker/" } # Note: main broker code is inside the binary crate, so linking with the library shouldn't be heavy.

				storage_controller_client = { path = "./storage_controller/client" }

				tenant_size_model = { version = "0.1", path = "./libs/tenant_size_model/" }

				tracing-utils = { version = "0.1", path = "./libs/tracing-utils/" }

				utils = { version = "0.1", path = "./libs/utils/" }

				vm_monitor = { version = "0.1", path = "./libs/vm_monitor/" }

				wal_decoder = { version = "0.1", path = "./libs/wal_decoder" }

				walproposer = { version = "0.1", path = "./libs/walproposer/" }

				## Common library dependency

				@@ -244,21 +285,23 @@ workspace_hack = { version = "0.1", path = "./workspace_hack/" }

				## Build dependencies

				criterion = "0.5.1"

				rcgen = "0.12"

				rcgen = "0.13"

				rstest = "0.18"

				camino-tempfile = "1.0.2"

				tonic-build = "0.9"

				tonic-build = "0.13.1"

				[patch.crates-io]

				# Needed to get `tokio-postgres-rustls` to depend on our fork.

				tokio-postgres = { git = "https://github.com/neondatabase/rust-postgres.git", rev = "20031d7a9ee1addeae6e0968e3899ae6bf01cee2" }

				tokio-postgres = { git = "https://github.com/neondatabase/rust-postgres.git", branch = "neon" }

				################# Binary contents sections

				[profile.release]

				# This is useful for profiling and, to some extent, debug.

				# Besides, debug info should not affect the performance.

				#

				# NB: we also enable frame pointers for improved profiling, see .cargo/config.toml.

				debug = true

				# disable debug symbols for all packages except this one to decrease binaries size

									
										76

Dockerfile
									
												
				@@ -2,9 +2,33 @@

				### The image itself is mainly used as a container for the binaries and for starting e2e tests with custom parameters.

				### By default, the binaries inside the image have some mock parameters and can start, but are not intended to be used

				### inside this image in the real deployments.

				ARG REPOSITORY=neondatabase

				ARG REPOSITORY=ghcr.io/neondatabase

				ARG IMAGE=build-tools

				ARG TAG=pinned

				ARG DEBIAN_VERSION=bookworm

				ARG DEBIAN_FLAVOR=${DEBIAN_VERSION}-slim

				# Here are the INDEX DIGESTS for the images we use.

				# You can get them following next steps for now:

				# 1. Get an authentication token from DockerHub:

				#    TOKEN=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:library/debian:pull" | jq -r .token)

				# 2. Using that token, query index for the given tag:

				#    curl -s -H "Authorization: Bearer $TOKEN" \

				#       -H "Accept: application/vnd.docker.distribution.manifest.list.v2+json" \

				#       "https://registry.hub.docker.com/v2/library/debian/manifests/bullseye-slim" \

				#       -I | grep -i docker-content-digest

				# 3. As a next step, TODO(fedordikarev): create script and schedule workflow to run these checks

				#    and updates on regular bases and in automated way.

				ARG BOOKWORM_SLIM_SHA=sha256:40b107342c492725bc7aacbe93a49945445191ae364184a6d24fedb28172f6f7

				ARG BULLSEYE_SLIM_SHA=sha256:e831d9a884d63734fe3dd9c491ed9a5a3d4c6a6d32c5b14f2067357c49b0b7e1

				# Here we use ${var/search/replace} syntax, to check

				# if base image is one of the images, we pin image index for.

				# If var will match one the known images, we will replace it with the known sha.

				# If no match, than value will be unaffected, and will process with no-pinned image.

				ARG BASE_IMAGE_SHA=debian:${DEBIAN_FLAVOR}

				ARG BASE_IMAGE_SHA=${BASE_IMAGE_SHA/debian:bookworm-slim/debian@$BOOKWORM_SLIM_SHA}

				ARG BASE_IMAGE_SHA=${BASE_IMAGE_SHA/debian:bullseye-slim/debian@$BULLSEYE_SLIM_SHA}

				# Build Postgres

				FROM $REPOSITORY/$IMAGE:$TAG AS pg-build

				@@ -13,16 +37,25 @@ WORKDIR /home/nonroot

				COPY --chown=nonroot vendor/postgres-v14 vendor/postgres-v14

				COPY --chown=nonroot vendor/postgres-v15 vendor/postgres-v15

				COPY --chown=nonroot vendor/postgres-v16 vendor/postgres-v16

				COPY --chown=nonroot vendor/postgres-v17 vendor/postgres-v17

				COPY --chown=nonroot pgxn pgxn

				COPY --chown=nonroot Makefile Makefile

				COPY --chown=nonroot postgres.mk postgres.mk

				COPY --chown=nonroot scripts/ninstall.sh scripts/ninstall.sh

				ENV BUILD_TYPE=release

				RUN set -e \

				    && mold -run make -j $(nproc) -s neon-pg-ext \

				    && rm -rf pg_install/build \

				    && tar -C pg_install -czf /home/nonroot/postgres_install.tar.gz .

				# Prepare cargo-chef recipe

				FROM $REPOSITORY/$IMAGE:$TAG AS plan

				WORKDIR /home/nonroot

				COPY --chown=nonroot . .

				RUN cargo chef prepare --recipe-path recipe.json

				# Build neon binaries

				FROM $REPOSITORY/$IMAGE:$TAG AS build

				WORKDIR /home/nonroot

				@@ -32,12 +65,18 @@ ARG BUILD_TAG

				COPY --from=pg-build /home/nonroot/pg_install/v14/include/postgresql/server pg_install/v14/include/postgresql/server

				COPY --from=pg-build /home/nonroot/pg_install/v15/include/postgresql/server pg_install/v15/include/postgresql/server

				COPY --from=pg-build /home/nonroot/pg_install/v16/include/postgresql/server pg_install/v16/include/postgresql/server

				COPY --from=pg-build /home/nonroot/pg_install/v16/lib                       pg_install/v16/lib

				COPY --from=pg-build /home/nonroot/pg_install/v17/include/postgresql/server pg_install/v17/include/postgresql/server

				COPY --from=plan     /home/nonroot/recipe.json                              recipe.json

				ARG ADDITIONAL_RUSTFLAGS=""

				RUN set -e \

				    && RUSTFLAGS="-Clinker=clang -Clink-arg=-fuse-ld=mold -Clink-arg=-Wl,--no-rosegment -Cforce-frame-pointers=yes ${ADDITIONAL_RUSTFLAGS}" cargo chef cook --locked --release --recipe-path recipe.json

				COPY --chown=nonroot . .

				ARG ADDITIONAL_RUSTFLAGS

				RUN set -e \

				    && PQ_LIB_DIR=$(pwd)/pg_install/v16/lib RUSTFLAGS="-Clinker=clang -Clink-arg=-fuse-ld=mold -Clink-arg=-Wl,--no-rosegment ${ADDITIONAL_RUSTFLAGS}" cargo build \

				    && RUSTFLAGS="-Clinker=clang -Clink-arg=-fuse-ld=mold -Clink-arg=-Wl,--no-rosegment -Cforce-frame-pointers=yes ${ADDITIONAL_RUSTFLAGS}" cargo build \

				      --bin pg_sni_router  \

				      --bin pageserver  \

				      --bin pagectl  \

				@@ -45,21 +84,38 @@ RUN set -e \

				      --bin storage_broker  \

				      --bin storage_controller  \

				      --bin proxy  \

				      --bin endpoint_storage \

				      --bin neon_local \

				      --bin storage_scrubber \

				      --locked --release

				# Build final image

				#

				FROM debian:bullseye-slim

				FROM $BASE_IMAGE_SHA

				WORKDIR /data

				RUN set -e \

				    && echo 'Acquire::Retries "5";' > /etc/apt/apt.conf.d/80-retries \

				    && apt update \

				    && apt install -y \

				        libreadline-dev \

				        libseccomp-dev \

				        ca-certificates \

				        openssl \

				        unzip \

				        curl \

				    && ARCH=$(uname -m) \

				    && if [ "$ARCH" = "x86_64" ]; then \

				        curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"; \

				    elif [ "$ARCH" = "aarch64" ]; then \

				        curl "https://awscli.amazonaws.com/awscli-exe-linux-aarch64.zip" -o "awscliv2.zip"; \

				    else \

				        echo "Unsupported architecture: $ARCH" && exit 1; \

				    fi \

				    && unzip awscliv2.zip \

				    && ./aws/install \

				    && rm -rf aws awscliv2.zip \

				    && rm -f /etc/apt/apt.conf.d/80-retries \

				    && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* \

				    && useradd -d /data neon \

				    && chown -R neon:neon /data

				@@ -71,12 +127,14 @@ COPY --from=build --chown=neon:neon /home/nonroot/target/release/safekeeper

				COPY --from=build --chown=neon:neon /home/nonroot/target/release/storage_broker      /usr/local/bin

				COPY --from=build --chown=neon:neon /home/nonroot/target/release/storage_controller  /usr/local/bin

				COPY --from=build --chown=neon:neon /home/nonroot/target/release/proxy               /usr/local/bin

				COPY --from=build --chown=neon:neon /home/nonroot/target/release/endpoint_storage    /usr/local/bin

				COPY --from=build --chown=neon:neon /home/nonroot/target/release/neon_local          /usr/local/bin

				COPY --from=build --chown=neon:neon /home/nonroot/target/release/storage_scrubber    /usr/local/bin

				COPY --from=pg-build /home/nonroot/pg_install/v14 /usr/local/v14/

				COPY --from=pg-build /home/nonroot/pg_install/v15 /usr/local/v15/

				COPY --from=pg-build /home/nonroot/pg_install/v16 /usr/local/v16/

				COPY --from=pg-build /home/nonroot/pg_install/v17 /usr/local/v17/

				COPY --from=pg-build /home/nonroot/postgres_install.tar.gz /data/

				# By default, pageserver uses `.neon/` working directory in WORKDIR, so create one and fill it with the dummy config.

				@@ -91,15 +149,9 @@ RUN mkdir -p /data/.neon/ && \

				  > /data/.neon/pageserver.toml && \

				  chown -R neon:neon /data/.neon

				# When running a binary that links with libpq, default to using our most recent postgres version.  Binaries

				# that want a particular postgres version will select it explicitly: this is just a default.

				ENV LD_LIBRARY_PATH=/usr/local/v16/lib

				VOLUME ["/data"]

				USER neon

				EXPOSE 6400

				EXPOSE 9898

				CMD ["/usr/local/bin/pageserver", "-D", "/data/.neon"]

									
										229

Dockerfile.build-tools
									
											
				@@ -1,229 +0,0 @@

				FROM debian:bullseye-slim

				# Use ARG as a build-time environment variable here to allow.

				# It's not supposed to be set outside.

				# Alternatively it can be obtained using the following command

				# ```

				# . /etc/os-release && echo "${VERSION_CODENAME}"

				# ```

				ARG DEBIAN_VERSION_CODENAME=bullseye

				# Add nonroot user

				RUN useradd -ms /bin/bash nonroot -b /home

				SHELL ["/bin/bash", "-c"]

				# System deps

				RUN set -e \

				    && apt update \

				    && apt install -y \

				        autoconf \

				        automake \

				        bison \

				        build-essential \

				        ca-certificates \

				        cmake \

				        curl \

				        flex \

				        git \

				        gnupg \

				        gzip \

				        jq \

				        libcurl4-openssl-dev \

				        libbz2-dev \

				        libffi-dev \

				        liblzma-dev \

				        libncurses5-dev \

				        libncursesw5-dev \

				        libreadline-dev \

				        libseccomp-dev \

				        libsqlite3-dev \

				        libssl-dev \

				        libstdc++-10-dev \

				        libtool \

				        libxml2-dev \

				        libxmlsec1-dev \

				        libxxhash-dev \

				        lsof \

				        make \

				        netcat \

				        net-tools \

				        openssh-client \

				        parallel \

				        pkg-config \

				        unzip \

				        wget \

				        xz-utils \

				        zlib1g-dev \

				        zstd \

				    && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

				# protobuf-compiler (protoc)

				ENV PROTOC_VERSION=25.1

				RUN curl -fsSL "https://github.com/protocolbuffers/protobuf/releases/download/v${PROTOC_VERSION}/protoc-${PROTOC_VERSION}-linux-$(uname -m | sed 's/aarch64/aarch_64/g').zip" -o "protoc.zip" \

				    && unzip -q protoc.zip -d protoc \

				    && mv protoc/bin/protoc /usr/local/bin/protoc \

				    && mv protoc/include/google /usr/local/include/google \

				    && rm -rf protoc.zip protoc

				# s5cmd

				ENV S5CMD_VERSION=2.2.2

				RUN curl -sL "https://github.com/peak/s5cmd/releases/download/v${S5CMD_VERSION}/s5cmd_${S5CMD_VERSION}_Linux-$(uname -m | sed 's/x86_64/64bit/g' | sed 's/aarch64/arm64/g').tar.gz" | tar zxvf - s5cmd \

				    && chmod +x s5cmd \

				    && mv s5cmd /usr/local/bin/s5cmd

				# LLVM

				ENV LLVM_VERSION=18

				RUN curl -fsSL 'https://apt.llvm.org/llvm-snapshot.gpg.key' | apt-key add - \

				    && echo "deb http://apt.llvm.org/${DEBIAN_VERSION_CODENAME}/ llvm-toolchain-${DEBIAN_VERSION_CODENAME}-${LLVM_VERSION} main" > /etc/apt/sources.list.d/llvm.stable.list \

				    && apt update \

				    && apt install -y clang-${LLVM_VERSION} llvm-${LLVM_VERSION} \

				    && bash -c 'for f in /usr/bin/clang*-${LLVM_VERSION} /usr/bin/llvm*-${LLVM_VERSION}; do ln -s "${f}" "${f%-${LLVM_VERSION}}"; done' \

				    && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

				# Install docker

				RUN curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg \

				    && echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/debian ${DEBIAN_VERSION_CODENAME} stable" > /etc/apt/sources.list.d/docker.list \

				    && apt update \

				    && apt install -y docker-ce docker-ce-cli \

				    && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

				# Configure sudo & docker

				RUN usermod -aG sudo nonroot && \

				    echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers && \

				    usermod -aG docker nonroot

				# AWS CLI

				RUN curl "https://awscli.amazonaws.com/awscli-exe-linux-$(uname -m).zip" -o "awscliv2.zip" \

				    && unzip -q awscliv2.zip \

				    && ./aws/install \

				    && rm awscliv2.zip

				# Mold: A Modern Linker

				ENV MOLD_VERSION=v2.33.0

				RUN set -e \

				    && git clone https://github.com/rui314/mold.git \

				    && mkdir mold/build \

				    && cd mold/build \

				    && git checkout ${MOLD_VERSION} \

				    && cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_COMPILER=clang++ .. \

				    && cmake --build . -j $(nproc) \

				    && cmake --install . \

				    && cd .. \

				    && rm -rf mold

				# LCOV

				# Build lcov from a fork:

				# It includes several bug fixes on top of the v2.0 release (https://github.com/linux-test-project/lcov/compare/v2.0...master)

				# And patches from us:

				# - Generates json file with code coverage summary (https://github.com/neondatabase/lcov/commit/426e7e7a22f669da54278e9b55e6d8caabd00af0.tar.gz)

				RUN for package in Capture::Tiny DateTime Devel::Cover Digest::MD5 File::Spec JSON::XS Memory::Process Time::HiRes JSON; do yes | perl -MCPAN -e "CPAN::Shell->notest('install', '$package')"; done \

				    && wget https://github.com/neondatabase/lcov/archive/426e7e7a22f669da54278e9b55e6d8caabd00af0.tar.gz -O lcov.tar.gz \

				    && echo "61a22a62e20908b8b9e27d890bd0ea31f567a7b9668065589266371dcbca0992  lcov.tar.gz" | sha256sum --check \

				    && mkdir -p lcov && tar -xzf lcov.tar.gz -C lcov --strip-components=1 \

				    && cd lcov \

				    && make install \

				    && rm -rf ../lcov.tar.gz

				# Compile and install the static OpenSSL library

				ENV OPENSSL_VERSION=1.1.1w

				ENV OPENSSL_PREFIX=/usr/local/openssl

				RUN wget -O /tmp/openssl-${OPENSSL_VERSION}.tar.gz https://www.openssl.org/source/openssl-${OPENSSL_VERSION}.tar.gz && \

				    echo "cf3098950cb4d853ad95c0841f1f9c6d3dc102dccfcacd521d93925208b76ac8 /tmp/openssl-${OPENSSL_VERSION}.tar.gz" | sha256sum --check && \

				    cd /tmp && \

				    tar xzvf /tmp/openssl-${OPENSSL_VERSION}.tar.gz && \

				    rm /tmp/openssl-${OPENSSL_VERSION}.tar.gz && \

				    cd /tmp/openssl-${OPENSSL_VERSION} && \

				    ./config --prefix=${OPENSSL_PREFIX}  -static --static no-shared -fPIC && \

				    make -j "$(nproc)" && \

				    make install && \

				    cd /tmp && \

				    rm -rf /tmp/openssl-${OPENSSL_VERSION}

				# Use the same version of libicu as the compute nodes so that

				# clusters created using initdb on pageserver can be used by computes.

				#

				# TODO: at this time, Dockerfile.compute-node uses the debian bullseye libicu

				# package, which is 67.1. We're duplicating that knowledge here, and also, technically,

				# Debian has a few patches on top of 67.1 that we're not adding here.

				ENV ICU_VERSION=67.1

				ENV ICU_PREFIX=/usr/local/icu

				# Download and build static ICU

				RUN wget -O /tmp/libicu-${ICU_VERSION}.tgz https://github.com/unicode-org/icu/releases/download/release-${ICU_VERSION//./-}/icu4c-${ICU_VERSION//./_}-src.tgz && \

				    echo "94a80cd6f251a53bd2a997f6f1b5ac6653fe791dfab66e1eb0227740fb86d5dc /tmp/libicu-${ICU_VERSION}.tgz" | sha256sum --check && \

				    mkdir /tmp/icu && \

				    pushd /tmp/icu && \

				    tar -xzf /tmp/libicu-${ICU_VERSION}.tgz && \

				    pushd icu/source && \

				    ./configure --prefix=${ICU_PREFIX}  --enable-static --enable-shared=no CXXFLAGS="-fPIC" CFLAGS="-fPIC" && \

				    make -j "$(nproc)" && \

				    make install && \

				    popd && \

				    rm -rf icu && \

				    rm -f /tmp/libicu-${ICU_VERSION}.tgz && \

				    popd

				# Switch to nonroot user

				USER nonroot:nonroot

				WORKDIR /home/nonroot

				# Python

				ENV PYTHON_VERSION=3.9.19 \

				    PYENV_ROOT=/home/nonroot/.pyenv \

				    PATH=/home/nonroot/.pyenv/shims:/home/nonroot/.pyenv/bin:/home/nonroot/.poetry/bin:$PATH

				RUN set -e \

				    && cd $HOME \

				    && curl -sSO https://raw.githubusercontent.com/pyenv/pyenv-installer/master/bin/pyenv-installer \

				    && chmod +x pyenv-installer \

				    && ./pyenv-installer \

				    && export PYENV_ROOT=/home/nonroot/.pyenv \

				    && export PATH="$PYENV_ROOT/bin:$PATH" \

				    && export PATH="$PYENV_ROOT/shims:$PATH" \

				    && pyenv install ${PYTHON_VERSION} \

				    && pyenv global ${PYTHON_VERSION} \

				    && python --version \

				    && pip install --upgrade pip \

				    && pip --version \

				    && pip install pipenv wheel poetry

				# Switch to nonroot user (again)

				USER nonroot:nonroot

				WORKDIR /home/nonroot

				# Rust

				# Please keep the version of llvm (installed above) in sync with rust llvm (`rustc --version --verbose | grep LLVM`)

				ENV RUSTC_VERSION=1.81.0

				ENV RUSTUP_HOME="/home/nonroot/.rustup"

				ENV PATH="/home/nonroot/.cargo/bin:${PATH}"

				ARG RUSTFILT_VERSION=0.2.1

				ARG CARGO_HAKARI_VERSION=0.9.30

				ARG CARGO_DENY_VERSION=0.16.1

				ARG CARGO_HACK_VERSION=0.6.31

				ARG CARGO_NEXTEST_VERSION=0.9.72

				RUN curl -sSO https://static.rust-lang.org/rustup/dist/$(uname -m)-unknown-linux-gnu/rustup-init && whoami && \

					chmod +x rustup-init && \

					./rustup-init -y --default-toolchain ${RUSTC_VERSION} && \

					rm rustup-init && \

				    export PATH="$HOME/.cargo/bin:$PATH" && \

				    . "$HOME/.cargo/env" && \

				    cargo --version && rustup --version && \

				    rustup component add llvm-tools rustfmt clippy && \

				    cargo install rustfilt            --version ${RUSTFILT_VERSION} && \

				    cargo install cargo-hakari        --version ${CARGO_HAKARI_VERSION} && \

				    cargo install cargo-deny --locked --version ${CARGO_DENY_VERSION} && \

				    cargo install cargo-hack          --version ${CARGO_HACK_VERSION} && \

				    cargo install cargo-nextest       --version ${CARGO_NEXTEST_VERSION} && \

				    rm -rf /home/nonroot/.cargo/registry && \

				    rm -rf /home/nonroot/.cargo/git

				# Show versions

				RUN whoami \

				    && python --version \

				    && pip --version \

				    && cargo --version --verbose \

				    && rustup --version --verbose \

				    && rustc --version --verbose \

				    && clang --version

				# Set the following flag so the Makefile can check whether it's running in Docker

				RUN touch /home/nonroot/.docker_build
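
The removed Dockerfile.build-tools above derives download URLs from `uname -m`, rewriting the architecture string with `sed` because each upstream project names its release artifacts differently. A minimal sketch of those mappings follows; the function names `protoc_arch` and `s5cmd_arch` are hypothetical helpers introduced here for illustration and are not part of the repository.

```shell
#!/bin/sh
# Hypothetical helpers (illustrative names, not from the repository).
# Each takes a `uname -m` style string and prints the architecture suffix
# used in the corresponding release artifact name, mirroring the sed
# expressions in the Dockerfile.

protoc_arch() {
    # protobuf release archives spell aarch64 as "aarch_64"; x86_64 is unchanged.
    printf '%s\n' "$1" | sed 's/aarch64/aarch_64/g'
}

s5cmd_arch() {
    # s5cmd release archives use "64bit" for x86_64 and "arm64" for aarch64.
    printf '%s\n' "$1" | sed 's/x86_64/64bit/g' | sed 's/aarch64/arm64/g'
}

# Example: show the suffixes for the machine this script runs on.
echo "protoc suffix: $(protoc_arch "$(uname -m)")"
echo "s5cmd suffix:  $(s5cmd_arch "$(uname -m)")"
```

The AWS CLI download in the same file needs no rewriting, since its artifact names use the `uname -m` value directly.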

1039

Dockerfile.compute-node


File diff suppressed because it is too large.

									
										285

Makefile
									
												
				@@ -1,9 +1,21 @@

				ROOT_PROJECT_DIR := $(dir $(abspath $(lastword $(MAKEFILE_LIST))))

				# Where to install Postgres, default is ./pg_install, maybe useful for package managers

				# Where to install Postgres, default is ./pg_install, maybe useful for package

				# managers.

				POSTGRES_INSTALL_DIR ?= $(ROOT_PROJECT_DIR)/pg_install/

				OPENSSL_PREFIX_DIR := /usr/local/openssl

				# Supported PostgreSQL versions

				POSTGRES_VERSIONS = v17 v16 v15 v14

				# CARGO_BUILD_FLAGS: Extra flags to pass to `cargo build`. `--locked`

				# and `--features testing` are popular examples.

				#

				# CARGO_PROFILE: Set to override the cargo profile to use. By default,

				# it is derived from BUILD_TYPE.

				# All intermediate build artifacts are stored here.

				BUILD_DIR := build

				ICU_PREFIX_DIR := /usr/local/icu

				#

				@@ -11,33 +23,46 @@ ICU_PREFIX_DIR := /usr/local/icu

				# environment variable.

				#

				BUILD_TYPE ?= debug

				WITH_SANITIZERS ?= no

				PG_CFLAGS = -fsigned-char

				ifeq ($(BUILD_TYPE),release)

					PG_CONFIGURE_OPTS = --enable-debug --with-openssl

					PG_CFLAGS = -O2 -g3 $(CFLAGS)

					# Unfortunately, `--profile=...` is a nightly feature

					CARGO_BUILD_FLAGS += --release

					PG_CFLAGS += -O2 -g3 $(CFLAGS)

					PG_LDFLAGS = $(LDFLAGS)

					CARGO_PROFILE ?= --profile=release

				else ifeq ($(BUILD_TYPE),debug)

					PG_CONFIGURE_OPTS = --enable-debug --with-openssl --enable-cassert --enable-depend

					PG_CFLAGS = -O0 -g3 $(CFLAGS)

					PG_CFLAGS += -O0 -g3 $(CFLAGS)

					PG_LDFLAGS = $(LDFLAGS)

					CARGO_PROFILE ?= --profile=dev

				else

					$(error Bad build type '$(BUILD_TYPE)', see Makefile for options)

				endif

				ifeq ($(WITH_SANITIZERS),yes)

					PG_CFLAGS += -fsanitize=address -fsanitize=undefined -fno-sanitize-recover

					COPT += -Wno-error # to avoid failing on warnings induced by sanitizers

					PG_LDFLAGS = -fsanitize=address -fsanitize=undefined -static-libasan -static-libubsan $(LDFLAGS)

					export CC := gcc

					export ASAN_OPTIONS := detect_leaks=0

				endif

				ifeq ($(shell test -e /home/nonroot/.docker_build && echo -n yes),yes)

					# Exclude static build openssl, icu for local build (MacOS, Linux)

					# Only keep for build type release and debug

					PG_CFLAGS += -I$(OPENSSL_PREFIX_DIR)/include

					PG_CONFIGURE_OPTS += --with-icu

					PG_CONFIGURE_OPTS += ICU_CFLAGS='-I/$(ICU_PREFIX_DIR)/include -DU_STATIC_IMPLEMENTATION'

					PG_CONFIGURE_OPTS += ICU_LIBS='-L$(ICU_PREFIX_DIR)/lib -L$(ICU_PREFIX_DIR)/lib64 -licui18n -licuuc -licudata -lstdc++ -Wl,-Bdynamic -lm'

					PG_CONFIGURE_OPTS += LDFLAGS='-L$(OPENSSL_PREFIX_DIR)/lib -L$(OPENSSL_PREFIX_DIR)/lib64 -L$(ICU_PREFIX_DIR)/lib -L$(ICU_PREFIX_DIR)/lib64 -Wl,-Bstatic -lssl -lcrypto -Wl,-Bdynamic -lrt -lm -ldl -lpthread'

				endif

				UNAME_S := $(shell uname -s)

				ifeq ($(UNAME_S),Linux)

					# Seccomp BPF is only available for Linux

					PG_CONFIGURE_OPTS += --with-libseccomp

					ifneq ($(WITH_SANITIZERS),yes)

						PG_CONFIGURE_OPTS += --with-libseccomp

					endif

				else ifeq ($(UNAME_S),Darwin)

					PG_CFLAGS += -DUSE_PREFETCH

					ifndef DISABLE_HOMEBREW

						# macOS with brew-installed openssl requires explicit paths

						# It can be configured with OPENSSL_PREFIX variable

				@@ -66,8 +91,6 @@ CARGO_BUILD_FLAGS += $(filter -j1,$(MAKEFLAGS))

				CARGO_CMD_PREFIX += $(if $(filter n,$(MAKEFLAGS)),,+)

				# Force cargo not to print progress bar

				CARGO_CMD_PREFIX += CARGO_TERM_PROGRESS_WHEN=never CI=1

				# Set PQ_LIB_DIR to make sure `storage_controller` get linked with bundled libpq (through diesel)

				CARGO_CMD_PREFIX += PQ_LIB_DIR=$(POSTGRES_INSTALL_DIR)/v16/lib

				CACHEDIR_TAG_CONTENTS := "Signature: 8a477f597d28d172789f06886806bc55"

				@@ -75,135 +98,29 @@ CACHEDIR_TAG_CONTENTS := "Signature: 8a477f597d28d172789f06886806bc55"

				# Top level Makefile to build Neon and PostgreSQL

				#

				.PHONY: all

				all: neon postgres neon-pg-ext

				all: neon postgres-install neon-pg-ext

				### Neon Rust bits

				#

				# The 'postgres_ffi' depends on the Postgres headers.

				.PHONY: neon

				neon: postgres-headers walproposer-lib cargo-target-dir

				neon: postgres-headers-install walproposer-lib cargo-target-dir

					+@echo "Compiling Neon"

					$(CARGO_CMD_PREFIX) cargo build $(CARGO_BUILD_FLAGS)

					$(CARGO_CMD_PREFIX) cargo build $(CARGO_BUILD_FLAGS) $(CARGO_PROFILE)

				.PHONY: cargo-target-dir

				cargo-target-dir:

					# https://github.com/rust-lang/cargo/issues/14281

					mkdir -p target

					test -e target/CACHEDIR.TAG || echo "$(CACHEDIR_TAG_CONTENTS)" > target/CACHEDIR.TAG

				### PostgreSQL parts

				# Some rules are duplicated for Postgres v14 and 15. We may want to refactor

				# to avoid the duplication in the future, but it's tolerable for now.

				#

				$(POSTGRES_INSTALL_DIR)/build/%/config.status:

					mkdir -p $(POSTGRES_INSTALL_DIR)

					test -e $(POSTGRES_INSTALL_DIR)/CACHEDIR.TAG || echo "$(CACHEDIR_TAG_CONTENTS)" > $(POSTGRES_INSTALL_DIR)/CACHEDIR.TAG

					+@echo "Configuring Postgres $* build"

					@test -s $(ROOT_PROJECT_DIR)/vendor/postgres-$*/configure || { \

						echo "\nPostgres submodule not found in $(ROOT_PROJECT_DIR)/vendor/postgres-$*/, execute "; \

						echo "'git submodule update --init --recursive --depth 2 --progress .' in project root.\n"; \

						exit 1; }

					mkdir -p $(POSTGRES_INSTALL_DIR)/build/$*

					VERSION=$*; \

					EXTRA_VERSION=$$(cd $(ROOT_PROJECT_DIR)/vendor/postgres-$$VERSION && git rev-parse HEAD); \

					(cd $(POSTGRES_INSTALL_DIR)/build/$$VERSION && \

					env PATH="$(EXTRA_PATH_OVERRIDES):$$PATH" $(ROOT_PROJECT_DIR)/vendor/postgres-$$VERSION/configure \

						CFLAGS='$(PG_CFLAGS)' \

						$(PG_CONFIGURE_OPTS) --with-extra-version=" ($$EXTRA_VERSION)" \

						--prefix=$(abspath $(POSTGRES_INSTALL_DIR))/$$VERSION > configure.log)

				# nicer alias to run 'configure'

				# Note: I've been unable to use templates for this part of our configuration.

				# I'm not sure why it wouldn't work, but this is the only place (apart from

				# the "build-all-versions" entry points) where direct mention of PostgreSQL

				# versions is used.

				.PHONY: postgres-configure-v16

				postgres-configure-v16: $(POSTGRES_INSTALL_DIR)/build/v16/config.status

				.PHONY: postgres-configure-v15

				postgres-configure-v15: $(POSTGRES_INSTALL_DIR)/build/v15/config.status

				.PHONY: postgres-configure-v14

				postgres-configure-v14: $(POSTGRES_INSTALL_DIR)/build/v14/config.status

				# Install the PostgreSQL header files into $(POSTGRES_INSTALL_DIR)/<version>/include

				.PHONY: postgres-headers-%

				postgres-headers-%: postgres-configure-%

					+@echo "Installing PostgreSQL $* headers"

					$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/$*/src/include MAKELEVEL=0 install

				# Compile and install PostgreSQL

				.PHONY: postgres-%

				postgres-%: postgres-configure-% \

						  postgres-headers-% # to prevent `make install` conflicts with neon's `postgres-headers`

					+@echo "Compiling PostgreSQL $*"

					$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/$* MAKELEVEL=0 install

					+@echo "Compiling libpq $*"

					$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/$*/src/interfaces/libpq install

					+@echo "Compiling pg_prewarm $*"

					$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/$*/contrib/pg_prewarm install

					+@echo "Compiling pg_buffercache $*"

					$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/$*/contrib/pg_buffercache install

					+@echo "Compiling pageinspect $*"

					$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/$*/contrib/pageinspect install

					+@echo "Compiling amcheck $*"

					$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/$*/contrib/amcheck install

					+@echo "Compiling test_decoding $*"

					$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/$*/contrib/test_decoding install

				.PHONY: postgres-clean-%

				postgres-clean-%:

					$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/$* MAKELEVEL=0 clean

					$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/$*/contrib/pg_buffercache clean

					$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/$*/contrib/pageinspect clean

					$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/$*/src/interfaces/libpq clean

				.PHONY: postgres-check-%

				postgres-check-%: postgres-%

					$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/$* MAKELEVEL=0 check

				.PHONY: neon-pg-ext-%

				neon-pg-ext-%: postgres-%

					+@echo "Compiling neon $*"

					mkdir -p $(POSTGRES_INSTALL_DIR)/build/neon-$*

					$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/$*/bin/pg_config CFLAGS='$(PG_CFLAGS) $(COPT)' \

						-C $(POSTGRES_INSTALL_DIR)/build/neon-$* \

						-f $(ROOT_PROJECT_DIR)/pgxn/neon/Makefile install

					+@echo "Compiling neon_walredo $*"

					mkdir -p $(POSTGRES_INSTALL_DIR)/build/neon-walredo-$*

					$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/$*/bin/pg_config CFLAGS='$(PG_CFLAGS) $(COPT)' \

						-C $(POSTGRES_INSTALL_DIR)/build/neon-walredo-$* \

						-f $(ROOT_PROJECT_DIR)/pgxn/neon_walredo/Makefile install

					+@echo "Compiling neon_rmgr $*"

					mkdir -p $(POSTGRES_INSTALL_DIR)/build/neon-rmgr-$*

					$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/$*/bin/pg_config CFLAGS='$(PG_CFLAGS) $(COPT)' \

						-C $(POSTGRES_INSTALL_DIR)/build/neon-rmgr-$* \

						-f $(ROOT_PROJECT_DIR)/pgxn/neon_rmgr/Makefile install

					+@echo "Compiling neon_test_utils $*"

					mkdir -p $(POSTGRES_INSTALL_DIR)/build/neon-test-utils-$*

					$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/$*/bin/pg_config CFLAGS='$(PG_CFLAGS) $(COPT)' \

						-C $(POSTGRES_INSTALL_DIR)/build/neon-test-utils-$* \

						-f $(ROOT_PROJECT_DIR)/pgxn/neon_test_utils/Makefile install

					+@echo "Compiling neon_utils $*"

					mkdir -p $(POSTGRES_INSTALL_DIR)/build/neon-utils-$*

					$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/$*/bin/pg_config CFLAGS='$(PG_CFLAGS) $(COPT)' \

						-C $(POSTGRES_INSTALL_DIR)/build/neon-utils-$* \

						-f $(ROOT_PROJECT_DIR)/pgxn/neon_utils/Makefile install

				.PHONY: neon-pg-clean-ext-%

				neon-pg-clean-ext-%:

					$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/$*/bin/pg_config \

					-C $(POSTGRES_INSTALL_DIR)/build/neon-$* \

					-f $(ROOT_PROJECT_DIR)/pgxn/neon/Makefile clean

					$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/$*/bin/pg_config \

					-C $(POSTGRES_INSTALL_DIR)/build/neon-walredo-$* \

					-f $(ROOT_PROJECT_DIR)/pgxn/neon_walredo/Makefile clean

					$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/$*/bin/pg_config \

					-C $(POSTGRES_INSTALL_DIR)/build/neon-test-utils-$* \

					-f $(ROOT_PROJECT_DIR)/pgxn/neon_test_utils/Makefile clean

					$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/$*/bin/pg_config \

					-C $(POSTGRES_INSTALL_DIR)/build/neon-utils-$* \

					-f $(ROOT_PROJECT_DIR)/pgxn/neon_utils/Makefile clean

				neon-pg-ext-%: postgres-install-%

					+@echo "Compiling neon-specific Postgres extensions for $*"

					mkdir -p $(BUILD_DIR)/pgxn-$*

					$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/$*/bin/pg_config COPT='$(COPT)' \

						-C $(BUILD_DIR)/pgxn-$*\

						-f $(ROOT_PROJECT_DIR)/pgxn/Makefile  install

				# Build walproposer as a static library. walproposer source code is located

				# in the pgxn/neon directory.

				@@ -215,78 +132,36 @@ neon-pg-clean-ext-%:

				# they depend on openssl and other libraries that are not included in our

				# Rust build.

				.PHONY: walproposer-lib

				walproposer-lib: neon-pg-ext-v16

				walproposer-lib: neon-pg-ext-v17

					+@echo "Compiling walproposer-lib"

					mkdir -p $(POSTGRES_INSTALL_DIR)/build/walproposer-lib

					$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/v16/bin/pg_config CFLAGS='$(PG_CFLAGS) $(COPT)' \

						-C $(POSTGRES_INSTALL_DIR)/build/walproposer-lib \

					mkdir -p $(BUILD_DIR)/walproposer-lib

					$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/v17/bin/pg_config COPT='$(COPT)' \

						-C $(BUILD_DIR)/walproposer-lib \

						-f $(ROOT_PROJECT_DIR)/pgxn/neon/Makefile walproposer-lib

					cp $(POSTGRES_INSTALL_DIR)/v16/lib/libpgport.a $(POSTGRES_INSTALL_DIR)/build/walproposer-lib

					cp $(POSTGRES_INSTALL_DIR)/v16/lib/libpgcommon.a $(POSTGRES_INSTALL_DIR)/build/walproposer-lib

				ifeq ($(UNAME_S),Linux)

					$(AR) d $(POSTGRES_INSTALL_DIR)/build/walproposer-lib/libpgport.a \

					cp $(POSTGRES_INSTALL_DIR)/v17/lib/libpgport.a $(BUILD_DIR)/walproposer-lib

					cp $(POSTGRES_INSTALL_DIR)/v17/lib/libpgcommon.a $(BUILD_DIR)/walproposer-lib

					$(AR) d $(BUILD_DIR)/walproposer-lib/libpgport.a \

						pg_strong_random.o

					$(AR) d $(POSTGRES_INSTALL_DIR)/build/walproposer-lib/libpgcommon.a \

						pg_crc32c.o \

						hmac_openssl.o \

					$(AR) d $(BUILD_DIR)/walproposer-lib/libpgcommon.a \

						checksum_helper.o \

						cryptohash_openssl.o \

						scram-common.o \

						hmac_openssl.o \

						md5_common.o \

						checksum_helper.o

						parse_manifest.o \

						scram-common.o

				ifeq ($(UNAME_S),Linux)

					$(AR) d $(BUILD_DIR)/walproposer-lib/libpgcommon.a \

						pg_crc32c.o

				endif

				.PHONY: walproposer-lib-clean

				walproposer-lib-clean:

					$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/v16/bin/pg_config \

						-C $(POSTGRES_INSTALL_DIR)/build/walproposer-lib \

						-f $(ROOT_PROJECT_DIR)/pgxn/neon/Makefile clean

				# Shorthand to call neon-pg-ext-% target for all Postgres versions

				.PHONY: neon-pg-ext

				neon-pg-ext: \

					neon-pg-ext-v14 \

					neon-pg-ext-v15 \

					neon-pg-ext-v16

				.PHONY: neon-pg-clean-ext

				neon-pg-clean-ext: \

					neon-pg-clean-ext-v14 \

					neon-pg-clean-ext-v15 \

					neon-pg-clean-ext-v16

				# shorthand to build all Postgres versions

				.PHONY: postgres

				postgres: \

					postgres-v14 \

					postgres-v15 \

					postgres-v16

				.PHONY: postgres-headers

				postgres-headers: \

					postgres-headers-v14 \

					postgres-headers-v15 \

					postgres-headers-v16

				.PHONY: postgres-clean

				postgres-clean: \

					postgres-clean-v14 \

					postgres-clean-v15 \

					postgres-clean-v16

				.PHONY: postgres-check

				postgres-check: \

					postgres-check-v14 \

					postgres-check-v15 \

					postgres-check-v16

				# This doesn't remove the effects of 'configure'.

				.PHONY: clean

				clean: postgres-clean neon-pg-clean-ext

					$(CARGO_CMD_PREFIX) cargo clean

				neon-pg-ext: $(foreach pg_version,$(POSTGRES_VERSIONS),neon-pg-ext-$(pg_version))

				# This removes everything

				.PHONY: distclean

				distclean:

					rm -rf $(POSTGRES_INSTALL_DIR)

					$(RM) -r $(POSTGRES_INSTALL_DIR) $(BUILD_DIR)

					$(CARGO_CMD_PREFIX) cargo clean

				.PHONY: fmt

				@@ -295,7 +170,7 @@ fmt:

				postgres-%-pg-bsd-indent: postgres-%

					+@echo "Compiling pg_bsd_indent"

					$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/$*/src/tools/pg_bsd_indent/

					$(MAKE) -C $(BUILD_DIR)/$*/src/tools/pg_bsd_indent/

				# Create typedef list for the core. Note that generally it should be combined with

				# buildfarm one to cover platform specific stuff.

				@@ -314,23 +189,39 @@ postgres-%-pgindent: postgres-%-pg-bsd-indent postgres-%-typedefs.list

					cat $(ROOT_PROJECT_DIR)/vendor/postgres-$*/src/tools/pgindent/typedefs.list |\

						cat - postgres-$*-typedefs.list | sort | uniq > postgres-$*-typedefs-full.list

					+@echo note: you might want to run it on selected files/dirs instead.

					INDENT=$(POSTGRES_INSTALL_DIR)/build/$*/src/tools/pg_bsd_indent/pg_bsd_indent \

					INDENT=$(BUILD_DIR)/$*/src/tools/pg_bsd_indent/pg_bsd_indent \

						$(ROOT_PROJECT_DIR)/vendor/postgres-$*/src/tools/pgindent/pgindent --typedefs postgres-$*-typedefs-full.list \

						$(ROOT_PROJECT_DIR)/vendor/postgres-$*/src/ \

						--excludes $(ROOT_PROJECT_DIR)/vendor/postgres-$*/src/tools/pgindent/exclude_file_patterns

					rm -f pg*.BAK

					$(RM) pg*.BAK

				# Indent pgxn/neon.

				.PHONY: pgindent

				neon-pgindent: postgres-v16-pg-bsd-indent neon-pg-ext-v16

					$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/v16/bin/pg_config CFLAGS='$(PG_CFLAGS) $(COPT)' \

						FIND_TYPEDEF=$(ROOT_PROJECT_DIR)/vendor/postgres-v16/src/tools/find_typedef \

						INDENT=$(POSTGRES_INSTALL_DIR)/build/v16/src/tools/pg_bsd_indent/pg_bsd_indent \

						PGINDENT_SCRIPT=$(ROOT_PROJECT_DIR)/vendor/postgres-v16/src/tools/pgindent/pgindent \

						-C $(POSTGRES_INSTALL_DIR)/build/neon-v16 \

				.PHONY: neon-pgindent

				neon-pgindent: postgres-v17-pg-bsd-indent neon-pg-ext-v17

					$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/v17/bin/pg_config COPT='$(COPT)' \

						FIND_TYPEDEF=$(ROOT_PROJECT_DIR)/vendor/postgres-v17/src/tools/find_typedef \

						INDENT=$(BUILD_DIR)/v17/src/tools/pg_bsd_indent/pg_bsd_indent \

						PGINDENT_SCRIPT=$(ROOT_PROJECT_DIR)/vendor/postgres-v17/src/tools/pgindent/pgindent \

						-C $(BUILD_DIR)/neon-v17 \

						-f $(ROOT_PROJECT_DIR)/pgxn/neon/Makefile pgindent

				.PHONY: setup-pre-commit-hook

				setup-pre-commit-hook:

					ln -s -f $(ROOT_PROJECT_DIR)/pre-commit.py .git/hooks/pre-commit

				# Targets for building PostgreSQL are defined in postgres.mk.

				#

				# But if the caller has indicated that PostgreSQL is already

				# installed, by setting the PG_INSTALL_CACHED variable, skip it.

				ifdef PG_INSTALL_CACHED

				postgres-install: skip-install

				$(foreach pg_version,$(POSTGRES_VERSIONS),postgres-install-$(pg_version)): skip-install

				postgres-headers-install:

					+@echo "Skipping installation of PostgreSQL headers because PG_INSTALL_CACHED is set"

				skip-install:

					+@echo "Skipping PostgreSQL installation because PG_INSTALL_CACHED is set"

				else

				include postgres.mk

				endif
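
The Makefile hunks above replace the hand-written per-version target lists with `$(foreach pg_version,$(POSTGRES_VERSIONS),neon-pg-ext-$(pg_version))`. As a rough illustration of what that expansion produces, here is a shell emulation; `expand_targets` is a hypothetical name used only for this sketch, and the real expansion is done by GNU Make, not by a shell script.

```shell
#!/bin/sh
# Hypothetical emulation of Make's $(foreach ...) for target names.
# Usage: expand_targets PREFIX VERSION...
expand_targets() {
    prefix=$1
    shift
    out=""
    for v in "$@"; do
        # Append "<prefix><version>", separated by single spaces.
        out="${out:+$out }${prefix}${v}"
    done
    printf '%s\n' "$out"
}

# Same version list as POSTGRES_VERSIONS in the Makefile.
expand_targets neon-pg-ext- v17 v16 v15 v14
# prints: neon-pg-ext-v17 neon-pg-ext-v16 neon-pg-ext-v15 neon-pg-ext-v14
```

With this pattern, adding a PostgreSQL version only requires updating `POSTGRES_VERSIONS` instead of editing every aggregate target.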

									
										14

README.md
									
												
				@@ -21,8 +21,10 @@ The Neon storage engine consists of two major components:

				See developer documentation in [SUMMARY.md](/docs/SUMMARY.md) for more information.

				## Running local installation

				## Running a local development environment

				Neon can be run on a workstation for small experiments and to test code changes, by

				following these instructions.

				#### Installing dependencies on Linux

				1. Install build dependencies and other applicable packages

				@@ -31,7 +33,7 @@ See developer documentation in [SUMMARY.md](/docs/SUMMARY.md) for more informati

				```bash

				apt install build-essential libtool libreadline-dev zlib1g-dev flex bison libseccomp-dev \

				libssl-dev clang pkg-config libpq-dev cmake postgresql-client protobuf-compiler \

				libcurl4-openssl-dev openssl python3-poetry lsof libicu-dev

				libprotobuf-dev libcurl4-openssl-dev openssl python3-poetry lsof libicu-dev

				```

				* On Fedora, these packages are needed:

				```bash

				@@ -58,7 +60,7 @@ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

				1. Install XCode and dependencies

				```

				xcode-select --install

				brew install protobuf openssl flex bison icu4c pkg-config

				brew install protobuf openssl flex bison icu4c pkg-config m4

				# add openssl to PATH, required for ed25519 keys generation in neon_local

				echo 'export PATH="$(brew --prefix openssl)/bin:$PATH"' >> ~/.zshrc

				@@ -132,7 +134,7 @@ make -j`sysctl -n hw.logicalcpu` -s

				To run the `psql` client, install the `postgresql-client` package or modify `PATH` and `LD_LIBRARY_PATH` to include `pg_install/bin` and `pg_install/lib`, respectively.

				To run the integration tests or Python scripts (not required to use the code), install

				Python (3.9 or higher), and install the python3 packages using `./scripts/pysync` (requires [poetry>=1.8](https://python-poetry.org/)) in the project directory.

				Python (3.11 or higher), and install the python3 packages using `./scripts/pysync` (requires [poetry>=1.8](https://python-poetry.org/)) in the project directory.

				#### Running neon database

				@@ -238,7 +240,7 @@ postgres=# select * from t;

				> cargo neon stop

				```

				More advanced usages can be found at [Control Plane and Neon Local](./control_plane/README.md).

				More advanced usages can be found at [Local Development Control Plane (`neon_local`)](./control_plane/README.md).

				#### Handling build failures

				@@ -268,7 +270,7 @@ By default, this runs both debug and release modes, and all supported postgres v

				testing locally, it is convenient to run just one set of permutations, like this:

				```sh

				DEFAULT_PG_VERSION=16 BUILD_TYPE=release ./scripts/pytest

				DEFAULT_PG_VERSION=17 BUILD_TYPE=release ./scripts/pytest

				```

				## Flamegraphs

									
										341

build-tools.Dockerfile
									
										Normal file
									
												
				@@ -0,0 +1,341 @@

				ARG DEBIAN_VERSION=bookworm

				ARG DEBIAN_FLAVOR=${DEBIAN_VERSION}-slim

				# Here are the INDEX DIGESTS for the images we use.

				# For now, you can get them by following these steps:

				# 1. Get an authentication token from DockerHub:

				#    TOKEN=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:library/debian:pull" | jq -r .token)

				# 2. Using that token, query index for the given tag:

				#    curl -s -H "Authorization: Bearer $TOKEN" \

				#       -H "Accept: application/vnd.docker.distribution.manifest.list.v2+json" \

				#       "https://registry.hub.docker.com/v2/library/debian/manifests/bullseye-slim" \

				#       -I | grep -i docker-content-digest

				# 3. As a next step, TODO(fedordikarev): create a script and schedule a workflow to run these checks

				#    and updates on a regular basis and in an automated way.

				ARG BOOKWORM_SLIM_SHA=sha256:40b107342c492725bc7aacbe93a49945445191ae364184a6d24fedb28172f6f7

				ARG BULLSEYE_SLIM_SHA=sha256:e831d9a884d63734fe3dd9c491ed9a5a3d4c6a6d32c5b14f2067357c49b0b7e1

				# Here we use ${var/search/replace} syntax to check

				# whether the base image is one of the images we pin an image index for.

				# If var matches one of the known images, we replace it with the known sha.

				# If there is no match, the value is unaffected and the build proceeds with an unpinned image.

				ARG BASE_IMAGE_SHA=debian:${DEBIAN_FLAVOR}

				ARG BASE_IMAGE_SHA=${BASE_IMAGE_SHA/debian:bookworm-slim/debian@$BOOKWORM_SLIM_SHA}

				ARG BASE_IMAGE_SHA=${BASE_IMAGE_SHA/debian:bullseye-slim/debian@$BULLSEYE_SLIM_SHA}
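Inline note: the substitution chain in the `ARG` lines can be tried outside Docker. The following is a minimal bash sketch, not part of the Dockerfile. The digests `sha256:aaaa` and `sha256:bbbb` are placeholders and `pin_base_image` is a hypothetical helper name; only the `${var/search/replace}` pattern itself comes from the diff.

```shell
#!/usr/bin/env bash
set -eu

# Placeholder digests for illustration only; the real values live in the ARG lines.
BOOKWORM_SLIM_SHA="sha256:aaaa"
BULLSEYE_SLIM_SHA="sha256:bbbb"

# pin_base_image is a hypothetical helper, not part of the Dockerfile.
# It applies the same ${var/search/replace} chain as the ARG lines:
# a known flavor is rewritten to a digest reference, anything else passes through.
pin_base_image() {
    local image="debian:$1"
    image="${image/debian:bookworm-slim/debian@${BOOKWORM_SLIM_SHA}}"
    image="${image/debian:bullseye-slim/debian@${BULLSEYE_SLIM_SHA}}"
    printf '%s\n' "$image"
}

pin_base_image bookworm-slim   # known flavor: rewritten to a digest reference
pin_base_image trixie-slim     # unknown flavor: passed through unchanged
```

Because each substitution only fires on an exact tag match, adding a new pinned flavor should only need one more digest variable and one more substitution line.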

				FROM $BASE_IMAGE_SHA AS pgcopydb_builder

				ARG DEBIAN_VERSION

				# Use strict mode for bash to catch errors early

				SHELL ["/bin/bash", "-euo", "pipefail", "-c"]

				# By default, the /bin/sh used in Debian images makes echo treat '\n' as a newline,

				# but we use bash as SHELL, and the built-in echo in bash requires the '-e' flag for that.

				RUN echo 'Acquire::Retries "5";' > /etc/apt/apt.conf.d/80-retries && \

				    echo -e "retry_connrefused=on\ntimeout=15\ntries=5\nretry-on-host-error=on\n" > /root/.wgetrc && \

				    echo -e "--retry-connrefused\n--connect-timeout 15\n--retry 5\n--max-time 300\n" > /root/.curlrc

				COPY build_tools/patches/pgcopydbv017.patch /pgcopydbv017.patch

				RUN if [ "${DEBIAN_VERSION}" = "bookworm" ]; then \

				        set -e && \

				        apt update && \

				        apt install -y --no-install-recommends \

				        ca-certificates wget gpg && \

				        wget -qO - https://www.postgresql.org/media/keys/ACCC4CF8.asc | gpg --dearmor -o /usr/share/keyrings/postgresql-keyring.gpg && \

				        echo "deb [signed-by=/usr/share/keyrings/postgresql-keyring.gpg] http://apt.postgresql.org/pub/repos/apt bookworm-pgdg main" > /etc/apt/sources.list.d/pgdg.list && \

				        apt-get update && \

				        apt install -y --no-install-recommends \

				        build-essential \

				        autotools-dev \

				        libedit-dev \

				        libgc-dev \

				        libpam0g-dev \

				        libreadline-dev \

				        libselinux1-dev \

				        libxslt1-dev \

				        libssl-dev \

				        libkrb5-dev \

				        zlib1g-dev \

				        liblz4-dev \

				        libpq5 \

				        libpq-dev \

				        libzstd-dev \

				        postgresql-16 \

				        postgresql-server-dev-16 \

				        postgresql-common  \

				        python3-sphinx && \

				        wget -O /tmp/pgcopydb.tar.gz https://github.com/dimitri/pgcopydb/archive/refs/tags/v0.17.tar.gz && \

				        mkdir /tmp/pgcopydb && \

				        tar -xzf /tmp/pgcopydb.tar.gz -C /tmp/pgcopydb --strip-components=1 && \

				        cd /tmp/pgcopydb && \

				        patch -p1 < /pgcopydbv017.patch && \

				        make -s clean && \

				        make -s -j12 install && \

				        libpq_path=$(find /lib /usr/lib -name "libpq.so.5" | head -n 1) && \

				        mkdir -p /pgcopydb/lib && \

				        cp "$libpq_path" /pgcopydb/lib/; \

				    else \

				        # copy command below will fail if we don't have dummy files, so we create them for other debian versions

				        mkdir -p /usr/lib/postgresql/16/bin && touch /usr/lib/postgresql/16/bin/pgcopydb && \

				        mkdir -p /pgcopydb/lib && touch /pgcopydb/lib/libpq.so.5; \

				    fi

				FROM $BASE_IMAGE_SHA AS build_tools

				ARG DEBIAN_VERSION

				# Add nonroot user

				RUN useradd -ms /bin/bash nonroot -b /home

				# Use strict mode for bash to catch errors early

				SHELL ["/bin/bash", "-euo", "pipefail", "-c"]

				RUN mkdir -p /pgcopydb/bin && \

				    mkdir -p /pgcopydb/lib && \

				    chmod -R 755 /pgcopydb && \

				    chown -R nonroot:nonroot /pgcopydb

				COPY --from=pgcopydb_builder /usr/lib/postgresql/16/bin/pgcopydb /pgcopydb/bin/pgcopydb

				COPY --from=pgcopydb_builder /pgcopydb/lib/libpq.so.5 /pgcopydb/lib/libpq.so.5

				RUN echo 'Acquire::Retries "5";' > /etc/apt/apt.conf.d/80-retries && \

				    echo -e "retry_connrefused=on\ntimeout=15\ntries=5\nretry-on-host-error=on\n" > /root/.wgetrc && \

				    echo -e "--retry-connrefused\n--connect-timeout 15\n--retry 5\n--max-time 300\n" > /root/.curlrc

				# System deps

				#

				# 'gdb' is included so that we get backtraces of core dumps produced in

				# regression tests

				RUN set -e \

				    && apt update \

				    && apt install -y \

				        autoconf \

				        automake \

				        bison \

				        build-essential \

				        ca-certificates \

				        cmake \

				        curl \

				        flex \

				        gdb \

				        git \

				        gnupg \

				        gzip \

				        jq \

				        jsonnet \

				        libcurl4-openssl-dev \

				        libbz2-dev \

				        libffi-dev \

				        liblzma-dev \

				        libncurses5-dev \

				        libncursesw5-dev \

				        libreadline-dev \

				        libseccomp-dev \

				        libsqlite3-dev \

				        libssl-dev \

				        $([[ "${DEBIAN_VERSION}" = "bullseye" ]] && echo libstdc++-10-dev || echo libstdc++-11-dev) \

				        libtool \

				        libxml2-dev \

				        libxmlsec1-dev \

				        libxxhash-dev \

				        lsof \

				        make \

				        netcat-openbsd \

				        net-tools \

				        openssh-client \

				        parallel \

				        pkg-config \

				        unzip \

				        wget \

				        xz-utils \

				        zlib1g-dev \

				        zstd \

				    && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

				# sql_exporter

				# Keep the version the same as in compute/compute-node.Dockerfile and

				# test_runner/regress/test_compute_metrics.py.

				ENV SQL_EXPORTER_VERSION=0.17.3

				RUN curl -fsSL \

				    "https://github.com/burningalchemist/sql_exporter/releases/download/${SQL_EXPORTER_VERSION}/sql_exporter-${SQL_EXPORTER_VERSION}.linux-$(case "$(uname -m)" in x86_64) echo amd64;; aarch64) echo arm64;; esac).tar.gz" \

				    --output sql_exporter.tar.gz \

				    && mkdir /tmp/sql_exporter \

				    && tar xzvf sql_exporter.tar.gz -C /tmp/sql_exporter --strip-components=1 \

				    && mv /tmp/sql_exporter/sql_exporter /usr/local/bin/sql_exporter \

				    && rm sql_exporter.tar.gz
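Inline note: the `case` embedded in the download URL has no default branch, so an unrecognized `uname -m` value would produce an empty architecture suffix and the download would most likely fail with a less obvious error. A minimal sketch of the same mapping with an explicit failure path follows. `sql_exporter_arch` is a hypothetical helper name, and the sketch hard-codes `x86_64` instead of calling `uname -m` so that it behaves the same on any machine.

```shell
#!/usr/bin/env bash
set -eu

# sql_exporter_arch is a hypothetical helper, not part of the Dockerfile.
# It maps a `uname -m` value to the suffix used in the release tarball name
# and fails loudly for anything it does not recognize.
sql_exporter_arch() {
    case "$1" in
        x86_64)  echo amd64 ;;
        aarch64) echo arm64 ;;
        *)
            echo "unsupported architecture: $1" >&2
            return 1
            ;;
    esac
}

SQL_EXPORTER_VERSION=0.17.3
arch="$(sql_exporter_arch x86_64)"
echo "sql_exporter-${SQL_EXPORTER_VERSION}.linux-${arch}.tar.gz"
```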

				# protobuf-compiler (protoc)

				# Keep the version the same as in compute/compute-node.Dockerfile

				ENV PROTOC_VERSION=25.1

				RUN curl -fsSL "https://github.com/protocolbuffers/protobuf/releases/download/v${PROTOC_VERSION}/protoc-${PROTOC_VERSION}-linux-$(uname -m | sed 's/aarch64/aarch_64/g').zip" -o "protoc.zip" \

				    && unzip -q protoc.zip -d protoc \

				    && mv protoc/bin/protoc /usr/local/bin/protoc \

				    && mv protoc/include/google /usr/local/include/google \

				    && rm -rf protoc.zip protoc

				# s5cmd

				ENV S5CMD_VERSION=2.3.0

				RUN curl -sL "https://github.com/peak/s5cmd/releases/download/v${S5CMD_VERSION}/s5cmd_${S5CMD_VERSION}_Linux-$(uname -m | sed 's/x86_64/64bit/g' | sed 's/aarch64/arm64/g').tar.gz" | tar zxvf - s5cmd \

				    && chmod +x s5cmd \

				    && mv s5cmd /usr/local/bin/s5cmd

				# LLVM

				ENV LLVM_VERSION=20

				RUN curl -fsSL 'https://apt.llvm.org/llvm-snapshot.gpg.key' | apt-key add - \

				    && echo "deb http://apt.llvm.org/${DEBIAN_VERSION}/ llvm-toolchain-${DEBIAN_VERSION}-${LLVM_VERSION} main" > /etc/apt/sources.list.d/llvm.stable.list \

				    && apt update \

				    && apt install -y clang-${LLVM_VERSION} llvm-${LLVM_VERSION} \

				    && bash -c 'for f in /usr/bin/clang*-${LLVM_VERSION} /usr/bin/llvm*-${LLVM_VERSION}; do ln -s "${f}" "${f%-${LLVM_VERSION}}"; done' \

				    && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
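Inline note: the symlink loop in the LLVM step relies on `${f%-${LLVM_VERSION}}` to derive the unversioned tool name. A minimal sketch of just that expansion, with `unversioned_name` as a hypothetical helper name that is not part of the Dockerfile:

```shell
#!/usr/bin/env bash
set -eu

LLVM_VERSION=20

# unversioned_name is a hypothetical helper, not part of the Dockerfile.
# ${path%-SUFFIX} removes the shortest match of "-SUFFIX" from the end of the value,
# which is how the loop turns a versioned binary path into its plain name.
unversioned_name() {
    local path="$1"
    printf '%s\n' "${path%-"${LLVM_VERSION}"}"
}

unversioned_name /usr/bin/clang-20         # /usr/bin/clang
unversioned_name /usr/bin/llvm-objdump-20  # /usr/bin/llvm-objdump
```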

				# Install docker

				RUN curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg \

				    && echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/debian ${DEBIAN_VERSION} stable" > /etc/apt/sources.list.d/docker.list \

				    && apt update \

				    && apt install -y docker-ce docker-ce-cli \

				    && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

				# Configure sudo & docker

				RUN usermod -aG sudo nonroot && \

				    echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers && \

				    usermod -aG docker nonroot

				# AWS CLI

				RUN curl "https://awscli.amazonaws.com/awscli-exe-linux-$(uname -m).zip" -o "awscliv2.zip" \

				    && unzip -q awscliv2.zip \

				    && ./aws/install \

				    && rm awscliv2.zip

				# Mold: A Modern Linker

				ENV MOLD_VERSION=v2.37.1

				RUN set -e \

				    && git clone https://github.com/rui314/mold.git \

				    && mkdir mold/build \

				    && cd mold/build \

				    && git checkout ${MOLD_VERSION} \

				    && cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_COMPILER=clang++ .. \

				    && cmake --build . -j $(nproc) \

				    && cmake --install . \

				    && cd .. \

				    && rm -rf mold

				# LCOV

				# Build lcov from a fork:

				# It includes several bug fixes on top of the v2.0 release (https://github.com/linux-test-project/lcov/compare/v2.0...master)

				# And patches from us:

				# - Generates json file with code coverage summary (https://github.com/neondatabase/lcov/commit/426e7e7a22f669da54278e9b55e6d8caabd00af0.tar.gz)

				RUN set +o pipefail && \

					 for package in Capture::Tiny DateTime Devel::Cover Digest::MD5 File::Spec JSON::XS Memory::Process Time::HiRes JSON; do \

						yes | perl -MCPAN -e "CPAN::Shell->notest('install', '$package')";\

					 done && \

					set -o pipefail

				# Split into separate step to debug flaky failures here

				RUN wget https://github.com/neondatabase/lcov/archive/426e7e7a22f669da54278e9b55e6d8caabd00af0.tar.gz -O lcov.tar.gz \

				    && ls -laht lcov.tar.gz && sha256sum lcov.tar.gz \

				    && echo "61a22a62e20908b8b9e27d890bd0ea31f567a7b9668065589266371dcbca0992  lcov.tar.gz" | sha256sum --check \

				    && mkdir -p lcov && tar -xzf lcov.tar.gz -C lcov --strip-components=1 \

				    && cd lcov \

				    && make install \

				    && rm -rf ../lcov.tar.gz
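Inline note: the lcov step pins the tarball by piping a `<digest>  <file>` line into `sha256sum --check`. A minimal sketch of that verification as a reusable function follows, assuming GNU coreutils `sha256sum` is available. `verify_sha256` is a hypothetical helper name, and the demo computes its digest locally rather than using any digest from the Dockerfile.

```shell
#!/usr/bin/env bash
set -eu

# verify_sha256 is a hypothetical helper, not part of the Dockerfile.
# Usage: verify_sha256 <expected-hex-digest> <file>
# Returns non-zero when the digest of <file> differs from the expected value.
verify_sha256() {
    local expected="$1" file="$2"
    # sha256sum --check reads "<digest>  <path>" lines from stdin;
    # --status suppresses output and reports the result via the exit code.
    printf '%s  %s\n' "$expected" "$file" | sha256sum --check --status
}

# Demo on a throwaway file; the digest is computed locally.
tmp="$(mktemp)"
printf 'hello\n' > "$tmp"
good="$(sha256sum "$tmp" | cut -d' ' -f1)"
verify_sha256 "$good" "$tmp" && echo "checksum ok"
rm -f "$tmp"
```

Failing the build on a mismatch is the point of the pin: a changed upstream archive stops the step before anything is unpacked or installed.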

				# Use the same version of libicu as the compute nodes so that

				# clusters created using initdb on pageserver can be used by computes.

				#

				# TODO: at this time, compute-node.Dockerfile uses the debian bullseye libicu

				# package, which is 67.1. We're duplicating that knowledge here, and also, technically,

				# Debian has a few patches on top of 67.1 that we're not adding here.

				ENV ICU_VERSION=67.1

				ENV ICU_PREFIX=/usr/local/icu

				# Download and build static ICU

				RUN wget -O /tmp/libicu-${ICU_VERSION}.tgz https://github.com/unicode-org/icu/releases/download/release-${ICU_VERSION//./-}/icu4c-${ICU_VERSION//./_}-src.tgz && \

				    echo "94a80cd6f251a53bd2a997f6f1b5ac6653fe791dfab66e1eb0227740fb86d5dc /tmp/libicu-${ICU_VERSION}.tgz" | sha256sum --check && \

				    mkdir /tmp/icu && \

				    pushd /tmp/icu && \

				    tar -xzf /tmp/libicu-${ICU_VERSION}.tgz && \

				    pushd icu/source && \

				    ./configure --prefix=${ICU_PREFIX}  --enable-static --enable-shared=no CXXFLAGS="-fPIC" CFLAGS="-fPIC" && \

				    make -j "$(nproc)" && \

				    make install && \

				    popd && \

				    rm -rf icu && \

				    rm -f /tmp/libicu-${ICU_VERSION}.tgz && \

				    popd
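Inline note: the ICU download URL is assembled from `ICU_VERSION` with two global substitutions, one for the release tag and one for the tarball name. A minimal sketch that only prints the resulting URL, where the variable names `icu_tag` and `icu_tarball` are introduced here for illustration:

```shell
#!/usr/bin/env bash
set -eu

ICU_VERSION=67.1

# ${var//./-} replaces every "." with "-"; ${var//./_} replaces every "." with "_".
# icu_tag and icu_tarball are illustrative names, not used in the Dockerfile.
icu_tag="release-${ICU_VERSION//./-}"
icu_tarball="icu4c-${ICU_VERSION//./_}-src.tgz"

echo "https://github.com/unicode-org/icu/releases/download/${icu_tag}/${icu_tarball}"
```

These expansions are bash features, which is why the `SHELL ["/bin/bash", ...]` instruction earlier in the stage matters for this `RUN` step.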

				# Switch to nonroot user

				USER nonroot:nonroot

				WORKDIR /home/nonroot

				RUN echo -e "--retry-connrefused\n--connect-timeout 15\n--retry 5\n--max-time 300\n" > /home/nonroot/.curlrc

				# Python

				ENV PYTHON_VERSION=3.11.12 \

				    PYENV_ROOT=/home/nonroot/.pyenv \

				    PATH=/home/nonroot/.pyenv/shims:/home/nonroot/.pyenv/bin:/home/nonroot/.poetry/bin:$PATH

				RUN set -e \

				    && cd $HOME \

				    && curl -sSO https://raw.githubusercontent.com/pyenv/pyenv-installer/master/bin/pyenv-installer \

				    && chmod +x pyenv-installer \

				    && ./pyenv-installer \

				    && export PYENV_ROOT=/home/nonroot/.pyenv \

				    && export PATH="$PYENV_ROOT/bin:$PATH" \

				    && export PATH="$PYENV_ROOT/shims:$PATH" \

				    && pyenv install ${PYTHON_VERSION} \

				    && pyenv global ${PYTHON_VERSION} \

				    && python --version \

				    && pip install --upgrade pip \

				    && pip --version \

				    && pip install pipenv wheel poetry

				# Switch to nonroot user (again)

				USER nonroot:nonroot

				WORKDIR /home/nonroot

				# Rust

				# Please keep the version of llvm (installed above) in sync with rust llvm (`rustc --version --verbose | grep LLVM`)

				ENV RUSTC_VERSION=1.88.0

				ENV RUSTUP_HOME="/home/nonroot/.rustup"

				ENV PATH="/home/nonroot/.cargo/bin:${PATH}"

				ARG RUSTFILT_VERSION=0.2.1

				ARG CARGO_HAKARI_VERSION=0.9.36

				ARG CARGO_DENY_VERSION=0.18.2

				ARG CARGO_HACK_VERSION=0.6.36

				ARG CARGO_NEXTEST_VERSION=0.9.94

				ARG CARGO_CHEF_VERSION=0.1.71

				ARG CARGO_DIESEL_CLI_VERSION=2.2.9

				RUN curl -sSO https://static.rust-lang.org/rustup/dist/$(uname -m)-unknown-linux-gnu/rustup-init && whoami && \

					chmod +x rustup-init && \

					./rustup-init -y --default-toolchain ${RUSTC_VERSION} && \

					rm rustup-init && \

				    export PATH="$HOME/.cargo/bin:$PATH" && \

				    . "$HOME/.cargo/env" && \

				    cargo --version && rustup --version && \

				    rustup component add llvm-tools rustfmt clippy && \

				    cargo install rustfilt            --version ${RUSTFILT_VERSION} --locked && \

				    cargo install cargo-hakari        --version ${CARGO_HAKARI_VERSION} --locked && \

				    cargo install cargo-deny          --version ${CARGO_DENY_VERSION} --locked && \

				    cargo install cargo-hack          --version ${CARGO_HACK_VERSION} --locked && \

				    cargo install cargo-nextest       --version ${CARGO_NEXTEST_VERSION} --locked && \

				    cargo install cargo-chef          --version ${CARGO_CHEF_VERSION} --locked && \

				    cargo install diesel_cli          --version ${CARGO_DIESEL_CLI_VERSION} --locked \

				                                      --features postgres-bundled --no-default-features && \

				    rm -rf /home/nonroot/.cargo/registry && \

				    rm -rf /home/nonroot/.cargo/git

				# Show versions

				RUN whoami \

				    && python --version \

				    && pip --version \

				    && cargo --version --verbose \

				    && rustup --version --verbose \

				    && rustc --version --verbose \

				    && clang --version

				RUN if [ "${DEBIAN_VERSION}" = "bookworm" ]; then \

				    LD_LIBRARY_PATH=/pgcopydb/lib /pgcopydb/bin/pgcopydb --version; \

				else \

				    echo "pgcopydb is not available for ${DEBIAN_VERSION}"; \

				fi

				# Set the following flag so that the Makefile can check whether it's running in Docker

				RUN touch /home/nonroot/.docker_build

									
										57

build_tools/patches/pgcopydbv017.patch
									
										Normal file
									
												
				@@ -0,0 +1,57 @@

				diff --git a/src/bin/pgcopydb/copydb.c b/src/bin/pgcopydb/copydb.c

				index d730b03..69a9be9 100644

				--- a/src/bin/pgcopydb/copydb.c

				+++ b/src/bin/pgcopydb/copydb.c

				@@ -44,6 +44,7 @@ GUC dstSettings[] = {

				 	{ "synchronous_commit", "'off'" },

				 	{ "statement_timeout", "0" },

				 	{ "lock_timeout", "0" },

				+	{ "idle_in_transaction_session_timeout", "0" },

				 	{ NULL, NULL },

				 };

				diff --git a/src/bin/pgcopydb/pgsql.c b/src/bin/pgcopydb/pgsql.c

				index 94f2f46..e051ba8 100644

				--- a/src/bin/pgcopydb/pgsql.c

				+++ b/src/bin/pgcopydb/pgsql.c

				@@ -2319,6 +2319,11 @@ pgsql_execute_log_error(PGSQL *pgsql,

				 	LinesBuffer lbuf = { 0 };

				+	if (message != NULL){

				+		// make sure message is writable by splitLines

				+		message = strdup(message);

				+	}

				+

				 	if (!splitLines(&lbuf, message))

				 	{

				 		/* errors have already been logged */

				@@ -2332,6 +2337,7 @@ pgsql_execute_log_error(PGSQL *pgsql,

				 				  PQbackendPID(pgsql->connection),

				 				  lbuf.lines[lineNumber]);

				 	}

				+        free(message); // free copy of message we created above

				 	if (pgsql->logSQL)

				 	{

				@@ -3174,11 +3180,18 @@ pgcopy_log_error(PGSQL *pgsql, PGresult *res, const char *context)

				 		/* errors have already been logged */

				 		return;

				 	}

				-

				 	if (res != NULL)

				 	{

				 		char *sqlstate = PQresultErrorField(res, PG_DIAG_SQLSTATE);

				-		strlcpy(pgsql->sqlstate, sqlstate, sizeof(pgsql->sqlstate));

				+		if (sqlstate == NULL)

				+		{

				+			// PQresultErrorField returned NULL!

				+			pgsql->sqlstate[0] = '\0';  // Set to an empty string to avoid segfault

				+		}

				+		else

				+		{

				+			strlcpy(pgsql->sqlstate, sqlstate, sizeof(pgsql->sqlstate));

				+		}

				 	}

				 	char *endpoint =

									
										2

clippy.toml
									
												
				@@ -12,3 +12,5 @@ disallowed-macros = [

				    # cannot disallow this, because clippy finds uses of it in tokio macros

				    #"tokio::pin",

				]

				allow-unwrap-in-tests = true

8

compute/.gitignore vendored Normal file


@@ -0,0 +1,8 @@
 # sql_exporter config files generated from Jsonnet
 etc/neon_collector.yml
 etc/neon_collector_autoscaling.yml
 etc/sql_exporter.yml
 etc/sql_exporter_autoscaling.yml
 # Node.js dependencies
 node_modules/

									
										58

compute/Makefile
									
										Normal file
									
												
				@@ -0,0 +1,58 @@

				jsonnet_files = $(wildcard \

					etc/*.jsonnet \

					etc/sql_exporter/*.libsonnet)

				.PHONY: all

				all: neon_collector.yml neon_collector_autoscaling.yml sql_exporter.yml sql_exporter_autoscaling.yml

				neon_collector.yml: $(jsonnet_files)

					JSONNET_PATH=jsonnet:etc jsonnet \

						--output-file etc/$@ \

						--ext-str pg_version=$(PG_VERSION) \

						etc/neon_collector.jsonnet

				neon_collector_autoscaling.yml: $(jsonnet_files)

					JSONNET_PATH=jsonnet:etc jsonnet \

						--output-file etc/$@ \

						--ext-str pg_version=$(PG_VERSION) \

						etc/neon_collector_autoscaling.jsonnet

				sql_exporter.yml: $(jsonnet_files)

					JSONNET_PATH=etc jsonnet \

						--output-file etc/$@ \

						--tla-str collector_name=neon_collector \

						--tla-str collector_file=neon_collector.yml \

						--tla-str 'connection_string=postgresql://cloud_admin@127.0.0.1:5432/postgres?sslmode=disable&application_name=sql_exporter&pgaudit.log=none' \

						etc/sql_exporter.jsonnet

				sql_exporter_autoscaling.yml: $(jsonnet_files)

					JSONNET_PATH=etc jsonnet \

						--output-file etc/$@ \

						--tla-str collector_name=neon_collector_autoscaling \

						--tla-str collector_file=neon_collector_autoscaling.yml \

						--tla-str 'connection_string=postgresql://cloud_admin@127.0.0.1:5432/postgres?sslmode=disable&application_name=sql_exporter_autoscaling&pgaudit.log=none' \

						etc/sql_exporter.jsonnet

				.PHONY: clean

				clean:

					$(RM) \

						etc/neon_collector.yml \

						etc/neon_collector_autoscaling.yml \

						etc/sql_exporter.yml \

						etc/sql_exporter_autoscaling.yml

				.PHONY: jsonnetfmt-test

				jsonnetfmt-test:

					jsonnetfmt --test $(jsonnet_files)

				.PHONY: jsonnetfmt-format

				jsonnetfmt-format:

					jsonnetfmt --in-place $(jsonnet_files)

				.PHONY: manifest-schema-validation

				manifest-schema-validation: node_modules

					node_modules/.bin/jsonschema validate -d https://json-schema.org/draft/2020-12/schema manifest.schema.json manifest.yaml

				node_modules: package.json

					npm install

					touch node_modules

21

compute/README.md Normal file


@@ -0,0 +1,21 @@
 This directory contains files that are needed to build the compute
 images, or included in the compute images.
 compute-node.Dockerfile
 	To build the compute image
 vm-image-spec.yaml
 	Instructions for vm-builder, to turn the compute-node image into
 	corresponding vm-compute-node image.
 etc/
 	Configuration files included in /etc in the compute image
 patches/
 	Some extensions need to be patched to work with Neon. This
 	directory contains such patches. They are applied to the extension
 	sources in compute-node.Dockerfile
 In addition to these, postgres itself, the neon postgres extension,
 and compute_ctl are built and copied into the compute image by
 compute-node.Dockerfile.

2044

compute/compute-node.Dockerfile Normal file


File diff suppressed because it is too large

									
										17

compute/etc/README.md
									
										Normal file
									
												
				@@ -0,0 +1,17 @@

				# Compute Configuration

				These files are the configuration files for various other pieces of software

				that will be running in the compute alongside Postgres.

				## `sql_exporter`

				### Adding a `sql_exporter` Metric

				We use `sql_exporter` to export various metrics from Postgres. In order to add

				a metric, you will need to create two files: a `libsonnet` and a `sql` file. You

				will then import the `libsonnet` file in one of the collector files, and the

				`sql` file will be imported in the `libsonnet` file.

				In the event your statistic is an LSN, you may want to cast it to a `float8`

				because Prometheus only supports floats. The cast is lossless as long as the LSN

				stays below `2^53` (about 8 PiB of WAL), because `float8` can store integers from

				`-2^53` to `+2^53` exactly.
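To make the `float8` range concrete, here is a minimal bash sketch that converts a textual LSN into its integer byte position and checks it against the `2^53` bound. `lsn_to_int`, `lsn_fits_float8`, and `FLOAT8_EXACT_MAX` are hypothetical names introduced for illustration, and the sketch assumes the LSN is below `2^63` so that it fits in bash's signed 64-bit arithmetic.

```shell
#!/usr/bin/env bash
set -eu

# lsn_to_int is a hypothetical helper for illustration.
# A textual LSN "HI/LO" holds the two hexadecimal 32-bit halves of a 64-bit position.
# Assumes the value is below 2^63 (bash arithmetic is signed 64-bit).
lsn_to_int() {
    local hi="${1%%/*}" lo="${1##*/}"
    echo $(( (16#$hi << 32) + 16#$lo ))
}

# Every integer up to and including 2^53 is exactly representable as a float8.
FLOAT8_EXACT_MAX=$(( 1 << 53 ))

# Succeeds when the LSN lies in the range where every integer is exact as float8.
lsn_fits_float8() {
    [ "$(lsn_to_int "$1")" -le "$FLOAT8_EXACT_MAX" ]
}

lsn_to_int 0/16B3748
lsn_fits_float8 0/16B3748 && echo "exact as float8"
```

The first LSN at the bound is `200000/0`, since `0x200000` is `2^21` and the high half is shifted left by 32 bits.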

1

compute/etc/ld.so.conf.d/00-neon.conf Normal file


				@@ -0,0 +1 @@

				/usr/local/lib

									
										58

compute/etc/neon_collector.jsonnet
									
										Normal file
									
												
				@@ -0,0 +1,58 @@

				{

				  collector_name: 'neon_collector',

				  metrics: [

				    import 'sql_exporter/checkpoints_req.libsonnet',

				    import 'sql_exporter/checkpoints_timed.libsonnet',

				    import 'sql_exporter/compute_backpressure_throttling_seconds_total.libsonnet',

				    import 'sql_exporter/compute_current_lsn.libsonnet',

				    import 'sql_exporter/compute_logical_snapshot_files.libsonnet',

				    import 'sql_exporter/compute_logical_snapshots_bytes.libsonnet',

				    import 'sql_exporter/compute_max_connections.libsonnet',

				    import 'sql_exporter/compute_receive_lsn.libsonnet',

				    import 'sql_exporter/compute_subscriptions_count.libsonnet',

				    import 'sql_exporter/connection_counts.libsonnet',

				    import 'sql_exporter/db_total_size.libsonnet',

				    import 'sql_exporter/file_cache_read_wait_seconds_bucket.libsonnet',

				    import 'sql_exporter/file_cache_read_wait_seconds_count.libsonnet',

				    import 'sql_exporter/file_cache_read_wait_seconds_sum.libsonnet',

				    import 'sql_exporter/file_cache_write_wait_seconds_bucket.libsonnet',

				    import 'sql_exporter/file_cache_write_wait_seconds_count.libsonnet',

				    import 'sql_exporter/file_cache_write_wait_seconds_sum.libsonnet',

				    import 'sql_exporter/getpage_prefetch_discards_total.libsonnet',

				    import 'sql_exporter/getpage_prefetch_misses_total.libsonnet',

				    import 'sql_exporter/getpage_prefetch_requests_total.libsonnet',

				    import 'sql_exporter/getpage_prefetches_buffered.libsonnet',

				    import 'sql_exporter/getpage_sync_requests_total.libsonnet',

				    import 'sql_exporter/compute_getpage_stuck_requests_total.libsonnet',

				    import 'sql_exporter/compute_getpage_max_inflight_stuck_time_ms.libsonnet',

				    import 'sql_exporter/getpage_wait_seconds_bucket.libsonnet',

				    import 'sql_exporter/getpage_wait_seconds_count.libsonnet',

				    import 'sql_exporter/getpage_wait_seconds_sum.libsonnet',

				    import 'sql_exporter/lfc_approximate_working_set_size.libsonnet',

				    import 'sql_exporter/lfc_approximate_working_set_size_windows.libsonnet',

				    import 'sql_exporter/lfc_cache_size_limit.libsonnet',

				    import 'sql_exporter/lfc_chunk_size.libsonnet',

				    import 'sql_exporter/lfc_hits.libsonnet',

				    import 'sql_exporter/lfc_misses.libsonnet',

				    import 'sql_exporter/lfc_used.libsonnet',

				    import 'sql_exporter/lfc_used_pages.libsonnet',

				    import 'sql_exporter/lfc_writes.libsonnet',

				    import 'sql_exporter/logical_slot_restart_lsn.libsonnet',

				    import 'sql_exporter/max_cluster_size.libsonnet',

				    import 'sql_exporter/pageserver_disconnects_total.libsonnet',

				    import 'sql_exporter/pageserver_requests_sent_total.libsonnet',

				    import 'sql_exporter/pageserver_send_flushes_total.libsonnet',

				    import 'sql_exporter/pageserver_open_requests.libsonnet',

				    import 'sql_exporter/pg_stats_userdb.libsonnet',

				    import 'sql_exporter/replication_delay_bytes.libsonnet',

				    import 'sql_exporter/replication_delay_seconds.libsonnet',

				    import 'sql_exporter/retained_wal.libsonnet',

				    import 'sql_exporter/wal_is_lost.libsonnet',

				  ],

				  queries: [

				    {

				      query_name: 'neon_perf_counters',

				      query: importstr 'sql_exporter/neon_perf_counters.sql',

				    },

				  ],

				}

									
										11

compute/etc/neon_collector_autoscaling.jsonnet
									
										Normal file
									
												
				@@ -0,0 +1,11 @@

				{

				  collector_name: 'neon_collector_autoscaling',

				  metrics: [

				    import 'sql_exporter/lfc_approximate_working_set_size_windows.autoscaling.libsonnet',

				    import 'sql_exporter/lfc_cache_size_limit.libsonnet',

				    import 'sql_exporter/lfc_hits.libsonnet',

				    import 'sql_exporter/lfc_misses.libsonnet',

				    import 'sql_exporter/lfc_used.libsonnet',

				    import 'sql_exporter/lfc_writes.libsonnet',

				  ],

				}

									
										32

compute/etc/pgbouncer.ini
									
										Normal file
									
												
				@@ -0,0 +1,32 @@

				[databases]

				;; pgbouncer propagates application_name (if it's specified) to the server, but some

				;; clients don't set it. We set default application_name=pgbouncer to make it

				;; easier to identify pgbouncer connections in Postgres. If client sets

				;; application_name, it will be used instead.

				*=host=localhost port=5432 auth_user=cloud_admin application_name=pgbouncer

				[pgbouncer]

				listen_port=6432

				listen_addr=0.0.0.0

				auth_type=scram-sha-256

				auth_user=cloud_admin

				auth_dbname=postgres

				client_tls_sslmode=disable

				server_tls_sslmode=disable

				pool_mode=transaction

				max_client_conn=10000

				default_pool_size=64

				max_prepared_statements=0

				admin_users=postgres

				unix_socket_dir=/tmp/

				unix_socket_mode=0777

				; required for pgbouncer_exporter

				ignore_startup_parameters=extra_float_digits

				; pidfile for graceful termination

				pidfile=/tmp/pgbouncer.pid

				;; Disable connection logging. It produces a lot of logs that no one looks at,

				;; and we can get similar log entries from the proxy too. We had incidents in

				;; the past where the logging significantly stressed the log device or pgbouncer

				;; itself.

				log_connections=0

				log_disconnections=0

0

compute/etc/postgres_exporter.yml Normal file


									
										40

compute/etc/sql_exporter.jsonnet
									
										Normal file
									
												
				@@ -0,0 +1,40 @@

				function(collector_name, collector_file, connection_string) {

				  // Configuration for sql_exporter for autoscaling-agent

				  // Global defaults.

				  global: {

				    // If scrape_timeout <= 0, no timeout is set unless Prometheus provides one. The default is 10s.

				    scrape_timeout: '10s',

				    // Subtracted from Prometheus' scrape_timeout to give us some headroom and prevent Prometheus from timing out first.

				    scrape_timeout_offset: '500ms',

				    // Minimum interval between collector runs: by default (0s) collectors are executed on every scrape.

				    min_interval: '0s',

				    // Maximum number of open connections to any one target. Metric queries will run concurrently on multiple connections,

				    // as will concurrent scrapes.

				    max_connections: 1,

				    // Maximum number of idle connections to any one target. Unless you use very long collection intervals, this should

				    // always be the same as max_connections.

				    max_idle_connections: 1,

				    // Maximum amount of time a connection may be reused. Expired connections may be closed lazily before reuse.

				    // If 0, connections are not closed due to a connection's age.

				    max_connection_lifetime: '5m',

				  },

				  // The target to monitor and the collectors to execute on it.

				  target: {

				    // Data source name always has a URI schema that matches the driver name. In some cases (e.g. MySQL)

				    // the schema gets dropped or replaced to match the driver expected DSN format.

				    data_source_name: connection_string,

				    // Collectors (referenced by name) to execute on the target.

				    // Glob patterns are supported (see <https://pkg.go.dev/path/filepath#Match> for syntax).

				    collectors: [

				      collector_name,

				    ],

				  },

				  // Collector files specifies a list of globs. One collector definition is read from each matching file.

				  // Glob patterns are supported (see <https://pkg.go.dev/path/filepath#Match> for syntax).

				  collector_files: [

				    collector_file,

				  ],

				}

									
										1

compute/etc/sql_exporter/checkpoints_req.17.sql
									
										Normal file
									
												
				@@ -0,0 +1 @@

				SELECT num_requested AS checkpoints_req FROM pg_stat_checkpointer;

									
										15

compute/etc/sql_exporter/checkpoints_req.libsonnet
									
										Normal file
									
												
				@@ -0,0 +1,15 @@

				local neon = import 'neon.libsonnet';

				local pg_stat_bgwriter = importstr 'sql_exporter/checkpoints_req.sql';

				local pg_stat_checkpointer = importstr 'sql_exporter/checkpoints_req.17.sql';

				{

				  metric_name: 'checkpoints_req',

				  type: 'gauge',

				  help: 'Number of requested checkpoints',

				  key_labels: null,

				  values: [

				    'checkpoints_req',

				  ],

				  query: if neon.PG_MAJORVERSION_NUM < 17 then pg_stat_bgwriter else pg_stat_checkpointer,

				}
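Inline note: the `query` field picks between two SQL files based on the PostgreSQL major version, because the older file reads the counter from `pg_stat_bgwriter` and the `.17.sql` file reads it from `pg_stat_checkpointer`. A minimal shell sketch of the same selection, with `checkpoints_req_query_file` as a hypothetical helper name that does not exist in the repository:

```shell
#!/usr/bin/env bash
set -eu

# checkpoints_req_query_file is a hypothetical helper, not part of the repo.
# It mirrors the libsonnet conditional: versions below 17 use the
# pg_stat_bgwriter query, version 17 and later use the pg_stat_checkpointer one.
checkpoints_req_query_file() {
    local pg_major="$1"
    if [ "$pg_major" -lt 17 ]; then
        echo "sql_exporter/checkpoints_req.sql"
    else
        echo "sql_exporter/checkpoints_req.17.sql"
    fi
}

checkpoints_req_query_file 16
checkpoints_req_query_file 17
```

Both queries alias their result to `checkpoints_req`, so the metric name and `values` list stay the same regardless of which file is chosen.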

									
										1

compute/etc/sql_exporter/checkpoints_req.sql
									
										Normal file
									
												
				@@ -0,0 +1 @@

				SELECT checkpoints_req FROM pg_stat_bgwriter;

									
										1

compute/etc/sql_exporter/checkpoints_timed.17.sql
									
										Normal file
									
												
				@@ -0,0 +1 @@

				SELECT num_timed AS checkpoints_timed FROM pg_stat_checkpointer;

									
										15

compute/etc/sql_exporter/checkpoints_timed.libsonnet
									
										Normal file
									
												
				@@ -0,0 +1,15 @@

				local neon = import 'neon.libsonnet';

				local pg_stat_bgwriter = importstr 'sql_exporter/checkpoints_timed.sql';

				local pg_stat_checkpointer = importstr 'sql_exporter/checkpoints_timed.17.sql';

				{

				  metric_name: 'checkpoints_timed',

				  type: 'gauge',

				  help: 'Number of scheduled checkpoints',

				  key_labels: null,

				  values: [

				    'checkpoints_timed',

				  ],

				  query: if neon.PG_MAJORVERSION_NUM < 17 then pg_stat_bgwriter else pg_stat_checkpointer,

				}

									
										1

compute/etc/sql_exporter/checkpoints_timed.sql
									
										Normal file
									
												View File
												
				@@ -0,0 +1 @@

				SELECT checkpoints_timed FROM pg_stat_bgwriter;

									
										10

compute/etc/sql_exporter/compute_backpressure_throttling_seconds_total.libsonnet
									
										Normal file
									
												View File
												
				@@ -0,0 +1,10 @@

				{

				  metric_name: 'compute_backpressure_throttling_seconds_total',

				  type: 'counter',

				  help: 'Time compute has spent throttled',

				  key_labels: null,

				  values: [

				    'throttled',

				  ],

				  query: importstr 'sql_exporter/compute_backpressure_throttling_seconds_total.sql',

				}

									
										1

compute/etc/sql_exporter/compute_backpressure_throttling_seconds_total.sql
									
										Normal file
									
												View File
												
				@@ -0,0 +1 @@

				SELECT (neon.backpressure_throttling_time()::float8 / 1000000) AS throttled;

									
										10

compute/etc/sql_exporter/compute_current_lsn.libsonnet
									
										Normal file
									
												View File
												
				@@ -0,0 +1,10 @@

				{

				  metric_name: 'compute_current_lsn',

				  type: 'gauge',

				  help: 'Current LSN of the database',

				  key_labels: null,

				  values: [

				    'lsn',

				  ],

				  query: importstr 'sql_exporter/compute_current_lsn.sql',

				}

									
										4

compute/etc/sql_exporter/compute_current_lsn.sql
									
										Normal file
									
												View File
												
				@@ -0,0 +1,4 @@

				SELECT CASE

				  WHEN pg_catalog.pg_is_in_recovery() THEN (pg_last_wal_replay_lsn() - '0/0')::FLOAT8

				  ELSE (pg_current_wal_lsn() - '0/0')::FLOAT8

				END AS lsn;
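
The query above subtracts `'0/0'` from the LSN to get a byte offset and casts it to `FLOAT8`. As a rough illustration of what that number means, here is a Python sketch that converts the textual `hi/lo` form of an LSN (two hexadecimal halves of a 64-bit position) into the same kind of float. The helper name `lsn_to_float` is hypothetical and is not something the exporter uses.

```python
def lsn_to_float(lsn: str) -> float:
    """Convert a textual LSN such as '0/16B3748' to a byte offset as a float.

    The part before the slash is the high 32 bits and the part after it
    is the low 32 bits, both written in hexadecimal.
    """
    hi, lo = lsn.split("/")
    return float((int(hi, 16) << 32) | int(lo, 16))
```

A float64 represents integers exactly only up to 2^53, so very large LSNs would lose their lowest bits in this form.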


Compare commits

2548 Commits

hackathon/ ... release-pr

10

.cargo/config.toml

View File

3

.config/hakari.toml

View File

10

.dockerignore

View File

1

.github/ISSUE_TEMPLATE/bug-template.md vendored

View File

1

.github/ISSUE_TEMPLATE/epic-template.md vendored

View File

21

.github/PULL_REQUEST_TEMPLATE/release-pr.md vendored

View File

26

.github/actionlint.yml vendored

View File

30

.github/actions/allure-report-generate/action.yml vendored

View File

21

.github/actions/allure-report-store/action.yml vendored

View File

9

.github/actions/download/action.yml vendored

View File

12

.github/actions/neon-branch-create/action.yml vendored

View File

88

.github/actions/neon-project-create/action.yml vendored

View File

63

.github/actions/run-python-test-set/action.yml vendored

View File

2

.github/actions/save-coverage-data/action.yml vendored

View File

36

.github/actions/set-docker-config-dir/action.yml vendored

View File

29

.github/actions/upload/action.yml vendored

View File

13

.github/file-filters.yaml vendored Normal file

View File

11

.github/pull_request_template.md vendored

View File

69

.github/scripts/generate_image_maps.py vendored Normal file

View File

110

.github/scripts/lint-release-pr.sh vendored Executable file

View File

31

.github/scripts/previous-releases.jq vendored Normal file

View File

45

.github/scripts/push_with_image_map.py vendored Normal file

View File

60

.github/workflows/_benchmarking_preparation.yml vendored

View File

257

.github/workflows/_build-and-test-locally.yml vendored

View File

55

.github/workflows/_check-codestyle-python.yml vendored Normal file

View File

102

.github/workflows/_check-codestyle-rust.yml vendored Normal file

View File

169

.github/workflows/_meta.yml vendored Normal file

View File

56

.github/workflows/_push-to-acr.yml vendored

View File

128

.github/workflows/_push-to-container-registry.yml vendored Normal file

View File

11

.github/workflows/actionlint.yml vendored

View File

29

.github/workflows/approved-for-ci-run.yml vendored

View File

556

.github/workflows/benchmarking.yml vendored

View File

178

.github/workflows/build-build-tools-image.yml vendored

View File

265

.github/workflows/build-macos.yml vendored Normal file

View File

121

.github/workflows/build_and_run_selected_test.yml vendored Normal file

View File

1451

.github/workflows/build_and_test.yml vendored

View File

151

.github/workflows/build_and_test_fully.yml vendored Normal file

View File

145

.github/workflows/build_and_test_with_sanitizers.yml vendored Normal file

View File

69

.github/workflows/cargo-deny.yml vendored Normal file

View File

51

.github/workflows/check-build-tools-image.yml vendored

View File

5

.github/workflows/check-permissions.yml vendored

View File

5

.github/workflows/cleanup-caches-by-a-branch.yml vendored

View File

99

.github/workflows/cloud-extensions.yml vendored Normal file

View File

138

.github/workflows/cloud-regress.yml vendored Normal file

View File

43

.github/workflows/fast-forward.yml vendored Normal file

View File

82

.github/workflows/force-test-extensions-upgrade.yml vendored Normal file

View File

200

.github/workflows/ingest_benchmark.yml vendored Normal file

View File

10

.github/workflows/label-for-external-users.yml vendored

View File

203

.github/workflows/large_oltp_benchmark.yml vendored Normal file

View File

175

.github/workflows/large_oltp_growth.yml vendored Normal file

View File

32

.github/workflows/lint-release-pr.yml vendored Normal file

View File

169

.github/workflows/neon_extra_builds.yml vendored

View File

310

.github/workflows/periodic_pagebench.yml vendored

View File

60

.github/workflows/pg-clients.yml vendored

View File

82

.github/workflows/pin-build-tools-image.yml vendored

View File

177

.github/workflows/pre-merge-checks.yml vendored Normal file

View File

84

.github/workflows/proxy-benchmark.yml vendored Normal file

View File

93

.github/workflows/random-ops-test.yml vendored Normal file

View File

46

.github/workflows/regenerate-pg-setting.yml vendored Normal file

View File

12

.github/workflows/release-compute.yml vendored Normal file

View File

7

.github/workflows/release-notify.yml vendored

View File

12

.github/workflows/release-proxy.yml vendored Normal file

View File

12

.github/workflows/release-storage.yml vendored Normal file

View File

129

.github/workflows/release.yml vendored

View File

71

.github/workflows/report-workflow-stats-batch.yml vendored Normal file

View File

114

.github/workflows/trigger-e2e-tests.yml vendored

View File

4

.gitignore vendored

View File

4

.gitmodules vendored

View File

32

CODEOWNERS

View File

3718

Cargo.lock generated

View File

191

Cargo.toml

View File

76

Dockerfile

View File

229

Dockerfile.build-tools

View File

1039

Dockerfile.compute-node

View File

285

Makefile

View File

14

README.md

View File

341

build-tools.Dockerfile Normal file

View File

57

build_tools/patches/pgcopydbv017.patch Normal file

View File

2

clippy.toml

View File