rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-07-04 12:40:37 +00:00

Author	SHA1	Message	Date
Arseny Sher	01180666b0	Merge pull request #6803 from neondatabase/releases/2024-02-19 Release 2024-02-19 release-4916	2024-02-19 16:38:35 +04:00
John Spray	5667372c61	pageserver: during shard split, wait for child to activate (#6789 ) ## Problem test_sharding_split_unsharded was flaky with log errors from tenants not being active. This was happening when the split function enters wait_lsn() while the child shard might still be activating. It's flaky rather than an outright failure because activation is usually very fast. This is also a real bug fix, because in realistic scenarios we could proceed to detach the parent shard before the children are ready, leading to an availability gap for clients. ## Summary of changes - Do a short wait_to_become_active on the child shards before proceeding to wait for their LSNs to advance --------- Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2024-02-18 15:55:19 +00:00
Alexander Bayandin	61f99d703d	test_create_snapshot: do not try to copy pg_dynshmem dir (#6796 ) ## Problem `test_create_snapshot` is flaky[0] on CI and fails constantly on macOS, but with a slightly different error: ``` shutil.Error: [('/Users/bayandin/work/neon/test_output/test_create_snapshot[release-pg15-1-100]/repo/endpoints/ep-1/pgdata/pg_dynshmem', '/Users/bayandin/work/neon/test_output/compatibility_snapshot_pgv15/repo/endpoints/ep-1/pgdata/pg_dynshmem', "[Errno 2] No such file or directory: '/Users/bayandin/work/neon/test_output/test_create_snapshot[release-pg15-1-100]/repo/endpoints/ep-1/pgdata/pg_dynshmem'")] ``` Also (on macOS) `repo/endpoints/ep-1/pgdata/pg_dynshmem` is a symlink to `/dev/shm/`. - [0] https://github.com/neondatabase/neon/issues/6784 ## Summary of changes Ignore `pg_dynshmem` directory while copying a snapshot	2024-02-18 12:16:07 +00:00
John Spray	24014d8383	pageserver: fix sharding emitting empty image layers during compaction (#6776 ) ## Problem Sharded tenants would sometimes try to write empty image layers during compaction: this was more noticeable on larger databases. - https://github.com/neondatabase/neon/issues/6755 Note to reviewers: the last commit is a refactor that de-indents a whole block; I recommend reviewing the earlier commits one by one to see the real changes ## Summary of changes - Fix a case where when we drop a key during compaction, we might fail to write out keys (this was broken when vectored get was added) - If an image layer is empty, then do not try and write it out, but leave `start` where it is so that if the subsequent key range meets criteria for writing an image layer, we will extend its key range to cover the empty area. - Add a compaction test that configures small layers and compaction thresholds, and asserts that we really successfully did image layer generation. This fails before the fix.	2024-02-18 08:51:12 +00:00
Konstantin Knizhnik	e3ded64d1b	Support pg-ivm extension (#6793 ) ## Problem See https://github.com/neondatabase/cloud/issues/10268 ## Summary of changes Add pg_ivm extension ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2024-02-17 22:13:25 +02:00
dependabot[bot]	9b714c8572	build(deps): bump cryptography from 42.0.0 to 42.0.2 (#6792 )	2024-02-17 19:15:21 +00:00
Alex Chi Z	29fb675432	Revert "fix superuser permission check for extensions (#6733 )" (#6791 ) This reverts commit `9ad940086c`. This pull request reverts #6733 to avoid incompatibility with pgvector and I will push further fixes later. Note that after reverting this pull request, the postgres submodule will point to some detached branches.	2024-02-16 20:50:09 +00:00
Christian Schwarz	ca07fa5f8b	per-TenantShard read throttling (#6706 )	2024-02-16 21:26:59 +01:00
John Spray	5d039c6e9b	libs: add 'generations_api' auth scope (#6783 ) ## Problem Even if you're not enforcing auth, the JwtAuth middleware barfs on scopes it doesn't know about. Add `generations_api` scope, which was invented in the cloud control plane for the pageserver's /re-attach and /validate upcalls: this will be enforced in storage controller's implementation of these in a later PR. Unfortunately the scope's naming doesn't match the other scope's naming styles, so needs a manual serde decorator to give it an underscore. ## Summary of changes - Add `Scope::GenerationsApi` variant - Update pageserver + safekeeper auth code to print appropriate message if they see it.	2024-02-16 15:53:09 +00:00
Calin Anca	36e1100949	bench_walredo: use tokio multi-threaded runtime (#6743 ) fixes https://github.com/neondatabase/neon/issues/6648 Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-02-16 16:31:54 +01:00
Alexander Bayandin	59c5b374de	test_pageserver_max_throughput_getpage_at_latest_lsn: disable on CI (#6785 ) ## Problem `test_pageserver_max_throughput_getpage_at_latest_lsn` is flaky, which makes CI status red pretty frequently. `benchmarks` is not a blocking job (doesn't block `deploy`), so having it red might hide failures in other jobs. Ref: https://github.com/neondatabase/neon/issues/6724 ## Summary of changes - Disable `test_pageserver_max_throughput_getpage_at_latest_lsn` on CI until it is fixed	2024-02-16 15:30:04 +00:00
Arpad Müller	0f3b87d023	Add test for pageserver_directory_entries_count metric (#6767 ) Adds a simple test to ensure the metric works. The test creates a bunch of relations to activate the metric. Follow-up of #6736	2024-02-16 14:53:36 +00:00
Konstantin Knizhnik	c19625a29c	Support sharding for compute_ctl (#6787 ) ## Problem See https://github.com/neondatabase/neon/issues/6786 ## Summary of changes Split connection string in compute.rs when requesting basebackup	2024-02-16 14:50:09 +00:00
John Spray	f2e5212fed	storage controller: background reconcile, graceful shutdown, better logging (#6709 ) ## Problem Now that the storage controller is working end to end, we start burning down the robustness aspects. ## Summary of changes - Add a background task that periodically calls `reconcile_all`. This ensures that if earlier operations couldn't succeed (e.g. because a node was unavailable), we will eventually retry. This is a naive initial implementation that can start an unlimited number of reconcile tasks: limiting reconcile concurrency is a later item in #6342 - Add a number of tracing spans in key locations: each background task, each reconciler task. - Add a top level CancellationToken and Gate, and use these to implement a graceful shutdown that waits for tasks to shut down. This is not bulletproof yet, because within these tasks we have remote HTTP calls that aren't wrapped in cancellation/timeouts, but it creates the structure, and if we don't shut down promptly then k8s will kill us. - To protect shard splits from background reconciliation, expose the `SplitState` in memory and use it to guard any APIs that require an attached tenant.	2024-02-16 13:00:53 +00:00
Christian Schwarz	568bc1fde3	fix(build): production flamegraphs are useless (#6764 )	2024-02-16 10:12:34 +00:00
Christian Schwarz	45e929c069	stop reading local `metadata` file (#6777 )	2024-02-16 09:35:11 +00:00
John Spray	6b980f38da	libs: refactor ShardCount.0 to private (#6690 ) ## Problem The ShardCount type has a magic '0' value that represents a legacy single-sharded tenant, whose TenantShardId is formatted without a `-0001` suffix (i.e. formatted as a traditional TenantId). This was error-prone in code locations that wanted the actual number of shards: they had to handle the 0 case specially. ## Summary of changes - Make the internal value of ShardCount private, and expose `count()` and `literal()` getters so that callers have to explicitly say whether they want the literal value (e.g. for storing in a TenantShardId), or the actual number of shards in the tenant. --------- Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2024-02-15 21:59:39 +00:00
MMeent	f0d8bd7855	Update Makefile (#6779 ) This fixes issues where `neon-pg-ext-clean-vYY` is used as target and resolves using the `neon-pg-ext-%` template with `$*` resolving as `clean-vYY`, for older versions of GNU Make, rather than `neon-pg-ext-clean-%` using `$*` = `vYY` ## Problem ``` $ make clean ... rm -f pg_config_paths.h Compiling neon clean-v14 mkdir -p /Users/<user>/neon-build//pg_install//build/neon-clean-v14 /Applications/Xcode.app/Contents/Developer/usr/bin/make PG_CONFIG=/Users/<user>/neon-build//pg_install//clean-v14/bin/pg_config CFLAGS='-O0 -g3 ' \ -C /Users/<user>/neon-build//pg_install//build/neon-clean-v14 \ -f /Users/<user>/neon-build//pgxn/neon/Makefile install make[1]: /Users/<user>/neon-build//pg_install//clean-v14/bin/pg_config: Command not found make[1]: *** No rule to make target `install'. Stop. make: *** [neon-pg-ext-clean-v14] Error 2 ```	2024-02-15 19:48:50 +00:00
Joonas Koivunen	046d9c69e6	fix: require wider jwt for changing the io engine (#6770 ) io-engine should not be changeable with any JWT token, for example the tenant_id scoped token which computes have.	2024-02-15 16:58:26 +00:00
Alexander Bayandin	c72cb44213	test_runner/performance: parametrize benchmarks (#6744 ) ## Problem Currently, we don't store `PLATFORM` for Nightly Benchmarks. It causes them to be merged as reruns in Allure report (because they have the same test name). ## Summary of changes - Parametrize benchmarks by - Postgres Version (14/15/16) - Build Type (debug/release/remote) - PLATFORM (neon-staging/github-actions-selfhosted/...) --------- Co-authored-by: Bodobolero <peterbendel@neon.tech>	2024-02-15 15:53:58 +00:00
Arpad Müller	cd3e4ac18d	Rename TEST_IMG function to test_img (#6762 ) Latter follows the canonical way to naming functions in Rust.	2024-02-15 15:14:51 +00:00
Alex Chi Z	9ad940086c	fix superuser permission check for extensions (#6733 ) close https://github.com/neondatabase/neon/issues/6236 This pull request bumps neon postgres dependencies. The corresponding postgres commits fix the checks for superuser permission when creating an extension. Also, for creating native functions, it now allows neon_superuser only in the extension creation process. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2024-02-15 14:59:13 +00:00
Joonas Koivunen	936f2ee2a5	fix: accidental wide span in tests (#6772 ) introduced in a PR without other #[tracing::instrument] changes.	2024-02-15 13:48:44 +00:00
Heikki Linnakangas	1af047dd3e	Fix typo in CI message (#6749 )	2024-02-15 14:34:19 +02:00
Conrad Ludgate	6c94269c32	Merge pull request #6758 from neondatabase/release-proxy-2024-02-14 2024-02-14 Proxy Release release-4862	2024-02-15 09:45:08 +00:00
John Spray	5fa747e493	pageserver: shard splitting refinements (parent deletion, hard linking) (#6725 ) ## Problem - We weren't deleting parent shard contents once the split was done - Re-downloading layers into child shards is wasteful ## Summary of changes - Hard-link layers into child shard local storage during split - Delete parent shard's content at the end --------- Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2024-02-15 10:21:53 +02:00
Anna Khanova	edc691647d	Proxy: remove fail fast logic to connect to compute (#6759 ) ## Problem Flaky tests ## Summary of changes Remove failfast logic	2024-02-15 07:42:12 +00:00
Joonas Koivunen	80854b98ff	move timeouts and cancellation handling to remote_storage (#6697 ) Cancellation and timeouts are handled at remote_storage callsites, if they are. However they should always be handled, because we've had transient problems with remote storage connections. - Add cancellation token to the `trait RemoteStorage` methods - For `download`, `list` methods there is `DownloadError::{Cancelled,Timeout}` - For the rest now using `anyhow::Error`, it will have root cause `remote_storage::TimeoutOrCancel::{Cancel,Timeout}` - Both types have `::is_permanent` equivalent which should be passed to `backoff::retry` - New generic RemoteStorageConfig option `timeout`, defaults to 120s - Start counting timeouts only after acquiring concurrency limiter permit - Cancellable permit acquiring - Download stream timeout or cancellation is communicated via an `std::io::Error` - Exit backoff::retry by marking cancellation errors permanent Fixes: #6096 Closes: #4781 Co-authored-by: arpad-m <arpad-m@users.noreply.github.com>	2024-02-14 23:24:07 +00:00
Christian Schwarz	024372a3db	Revert "refactor(VirtualFile::crashsafe_overwrite): avoid Handle::block_on in callers" (#6765 ) Reverts neondatabase/neon#6731 On high tenant count Pageservers in staging, memory and CPU usage shoots to 100% with this change. (NB: staging currently has tokio-epoll-uring enabled) Will analyze tomorrow. https://neondb.slack.com/archives/C03H1K0PGKH/p1707933875639379?thread_ts=1707929541.125329&cid=C03H1K0PGKH	2024-02-14 19:17:12 +00:00
Shayan Hosseini	fff2468aa2	Add resource consume test funcs (#6747 ) ## Problem Building on #5875 to add handy test functions for autoscaling. Resolves #5609 ## Summary of changes This PR makes the following changes to #5875: - Enable `neon_test_utils` extension in the compute node docker image, so we could use it in the e2e tests (as discussed with @kelvich). - Removed test functions related to disk as we don't use them for autoscaling. - Fix the warning with printf-ing unsigned long variables. --------- Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2024-02-14 18:45:05 +00:00
Anna Khanova	c7538a2c20	Proxy: remove fail fast logic to connect to compute (#6759 ) ## Problem Flaky tests ## Summary of changes Remove failfast logic	2024-02-14 18:43:52 +00:00
Arpad Müller	a2d0d44b42	Remove unused allow's (#6760 ) These allow's became redundant some time ago so remove them, or address them if addressing is very simple.	2024-02-14 18:16:05 +00:00
Christian Schwarz	7d3cdc05d4	fix(pageserver): pagebench doesn't work with released artifacts (#6757 ) The canonical release artifact of neon.git is the Docker image with all the binaries in them: ``` docker pull neondatabase/neon:release-4854 docker create --name extract neondatabase/neon:release-4854 docker cp extract:/usr/local/bin/pageserver ./pageserver.release-4854 chmod +x pageserver.release-4854 cp -a pageserver.release-4854 ./target/release/pageserver ``` Before this PR, these artifacts didn't expose the `keyspace` API, thereby preventing `pagebench get-page-latest-lsn` from working. Having working pagebench is useful, e.g., for experiments in staging. So, expose the API, but don't document it, as it's not part of the interface with control plane.	2024-02-14 17:01:15 +00:00
John Spray	840abe3954	pageserver: store aux files as deltas (#6742 ) ## Problem Aux files were stored with an O(N^2) cost, since on each modification the entire map is re-written as a page image. This addresses one axis of the inefficiency in logical replication's use of storage (https://github.com/neondatabase/neon/issues/6626). It will still be writing a large amount of duplicative data if writing the same slot's state every 15 seconds, but the impact will be O(N) instead of O(N^2). ## Summary of changes - Introduce `NeonWalRecord::AuxFile` - In `DatadirModification`, if the AUX_FILES_KEY has already been set, then write a delta instead of an image	2024-02-14 15:01:16 +00:00
Christian Schwarz	774a6e7475	refactor(virtual_file) make write_all_at take owned buffers (#6673 ) context: https://github.com/neondatabase/neon/issues/6663 Building atop #6664, this PR switches `write_all_at` to take owned buffers. The main challenge here is the `EphemeralFile::mutable_tail`, for which I'm picking the ugly solution of an `Option` that is `None` while the IO is in flight. After this, we will be able to switch `write_at` to take owned buffers and call tokio-epoll-uring's `write` function with that owned buffer. That'll be done in #6378.	2024-02-14 15:59:06 +01:00
Conrad Ludgate	855d7b4781	hold cancel session (#6750 ) ## Problem In a recent refactor, we accidentally dropped the cancel session early ## Summary of changes Hold the cancel session during proxy passthrough	2024-02-14 14:57:22 +00:00
Anna Khanova	c49c9707ce	Proxy: send cancel notifications to all instances (#6719 ) ## Problem If a cancel request ends up on the wrong proxy instance, it doesn't take effect. ## Summary of changes Send redis notifications to all proxy pods about the cancel request. Related issue: https://github.com/neondatabase/neon/issues/5839, https://github.com/neondatabase/cloud/issues/10262	2024-02-14 14:57:22 +00:00
Anna Khanova	2227540a0d	Proxy refactor auth+connect (#6708 ) ## Problem Not really a problem, just refactoring. ## Summary of changes Separate authenticate from wake compute. Do not call wake compute second time if we managed to connect to postgres or if we got it not from cache.	2024-02-14 14:57:22 +00:00
Conrad Ludgate	f1347f2417	proxy: add more http logging (#6726 ) ## Problem hard to see where time is taken during HTTP flow. ## Summary of changes add a lot more for query state. add a conn_id field to the sql-over-http span	2024-02-14 14:57:22 +00:00
Conrad Ludgate	30b295b017	proxy: some more parquet data (#6711 ) ## Summary of changes add auth_method and database to the parquet logs	2024-02-14 14:57:22 +00:00
Anna Khanova	1cef395266	Proxy: copy bidirectional fork (#6720 ) ## Problem `tokio::io::copy_bidirectional` doesn't close the connection once one of the sides closes it. It's not really suitable for the postgres protocol. ## Summary of changes Fork `copy_bidirectional` and initiate a shutdown for both connections. --------- Co-authored-by: Conrad Ludgate <conradludgate@gmail.com>	2024-02-14 14:57:22 +00:00
Christian Schwarz	df5d588f63	refactor(VirtualFile::crashsafe_overwrite): avoid Handle::block_on in callers (#6731 ) Some callers of `VirtualFile::crashsafe_overwrite` call it on the executor thread, thereby potentially stalling it. Others are more diligent and wrap it in `spawn_blocking(..., Handle::block_on, ... )` to avoid stalling the executor thread. However, because `crashsafe_overwrite` uses VirtualFile::open_with_options internally, we spawn a new thread-local `tokio-epoll-uring::System` in the blocking pool thread that's used for the `spawn_blocking` call. This PR refactors the situation such that we do the `spawn_blocking` inside `VirtualFile::crashsafe_overwrite`. This unifies the situation for the better: 1. Callers who didn't wrap in `spawn_blocking(..., Handle::block_on, ...)` before no longer stall the executor. 2. Callers who did it before now can avoid the `block_on`, resolving the problem with the short-lived `tokio-epoll-uring::System`s in the blocking pool threads. A future PR will build on top of this and divert to tokio-epoll-uring if it's configured as the IO engine. Changes ------- - Convert implementation to std::fs and move it into `crashsafe.rs` - Yes, I know, Safekeepers (cc @arssher ) added `durable_rename` and `fsync_async_opt` recently. However, `crashsafe_overwrite` is different in the sense that it's higher level, i.e., it's more like `std::fs::write` and the Safekeeper team's code is more building block style. - The consequence is that we don't use the VirtualFile file descriptor cache anymore. - I don't think it's a big deal because we have plenty of slack wrt production file descriptor limit rlimit (see [this dashboard](https://neonprod.grafana.net/d/e4a40325-9acf-4aa0-8fd9-f6322b3f30bd/pageserver-open-file-descriptors?orgId=1)) - Use `tokio::task::spawn_blocking` in `VirtualFile::crashsafe_overwrite` to call the new `crashsafe::overwrite` API. 
- Inspect all callers to remove any double-`spawn_blocking` - spawn_blocking requires the captured data to be 'static + Send. So, refactor the callers. We'll need this for future tokio-epoll-uring support anyway, because tokio-epoll-uring requires owned buffers. Related Issues -------------- - overall epic to enable write path to tokio-epoll-uring: #6663 - this is also kind of relevant to the tokio-epoll-uring System creation failures that we encountered in staging, investigation being tracked in #6667 - why is it relevant? Because this PR removes two uses of `spawn_blocking+Handle::block_on`	2024-02-14 14:22:41 +00:00
John Spray	f39b0fce9b	Revert #6666 "tests: try to make restored-datadir comparison tests not flaky" (#6751 ) The #6666 change appears to have made the test fail more often. PR https://github.com/neondatabase/neon/pull/6712 should re-instate this change, along with its change to make the overall flow more reliable. This reverts commit `568f91420a`.	2024-02-14 10:57:01 +00:00
Conrad Ludgate	a9ec4eb4fc	hold cancel session (#6750 ) ## Problem In a recent refactor, we accidentally dropped the cancel session early ## Summary of changes Hold the cancel session during proxy passthrough	2024-02-14 10:26:32 +00:00
Heikki Linnakangas	a97b54e3b9	Cherry-pick Postgres bugfix to 'mmap' DSM implementation Cherry-pick Upstream commit fbf9a7ac4d to neon stable branches. We'll get it in the next PostgreSQL minor release anyway, but we need it now, if we want to start using the 'mmap' implementation. See https://github.com/neondatabase/autoscaling/issues/800 for the plans on doing that.	2024-02-14 11:37:52 +02:00
Heikki Linnakangas	a5114a99b2	Create a symlink from pg_dynshmem to /dev/shm See included comment and issue https://github.com/neondatabase/autoscaling/issues/800 for details. This has no effect, unless you set "dynamic_shared_memory_type = mmap" in postgresql.conf.	2024-02-14 11:37:52 +02:00
Arpad Müller	ee7bbdda0e	Create new metric for directory counts (#6736 ) There are O(n^2) issues due to how we store these directories (#6626), so it's good to keep an eye on them and ensure the numbers stay low. The new per-timeline metric `pageserver_directory_entries_count` isn't perfect, namely we don't calculate it every time we attach the timeline, but only if there is an actual change. Also, it is a collective metric over multiple scalars. Lastly, we only emit the metric if it is above a certain threshold. However, the metric still gives a feel for the general size of the timeline. We care less for small values as the metric is mainly there to detect and track tenants with large directory counts. We also expose the directory counts in `TimelineInfo` so that one can get the detailed size distribution directly via the pageserver's API. Related: #6642 , https://github.com/neondatabase/cloud/issues/10273	2024-02-14 02:12:00 +01:00
Konstantin Knizhnik	b6e070bf85	Do not perform fast exit for catalog pages in redo filter (#6730 ) ## Problem See https://github.com/neondatabase/neon/issues/6674 Current implementation of `neon_redo_read_buffer_filter` performs a fast exit for catalog pages: ``` /* * Out of an abundance of caution, we always run redo on shared catalogs, * regardless of whether the block is stored in shared buffers. See also * this function's top comment. */ if (!OidIsValid(NInfoGetDbOid(rinfo))) return false; ``` As a result, the last written LSN and relation size for the FSM fork are not correctly updated for catalog relations. ## Summary of changes Do not perform fast path return for catalog relations. ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-02-13 20:41:17 +02:00
Christian Schwarz	7fa732c96c	refactor(virtual_file): take owned buffer in VirtualFile::write_all (#6664 ) Building atop #6660 , this PR converts VirtualFile::write_all to owned buffers. Part of https://github.com/neondatabase/neon/issues/6663	2024-02-13 18:46:25 +01:00
Anna Khanova	331935df91	Proxy: send cancel notifications to all instances (#6719 ) ## Problem If a cancel request ends up on the wrong proxy instance, it doesn't take effect. ## Summary of changes Send redis notifications to all proxy pods about the cancel request. Related issue: https://github.com/neondatabase/neon/issues/5839, https://github.com/neondatabase/cloud/issues/10262	2024-02-13 17:58:58 +01:00
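
Note on commit 6b980f38da (`ShardCount.0` made private): the message describes a magic `0` value for legacy single-sharded tenants and two getters, `count()` and `literal()`. The following is a minimal sketch of that idea, not the actual neon definition. The `u8` inner type, the `new` constructor, and the `shard` module name are assumptions made for illustration only.

```rust
mod shard {
    /// Hypothetical stand-in for the type described in the commit message.
    /// The inner value is private, so callers must choose a getter.
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub struct ShardCount(u8); // u8 is an assumption, not taken from the source

    impl ShardCount {
        /// Assumed constructor taking the literal (stored) value.
        pub const fn new(literal: u8) -> Self {
            ShardCount(literal)
        }

        /// Actual number of shards: the legacy `0` value means one shard.
        pub fn count(&self) -> u8 {
            if self.0 == 0 {
                1
            } else {
                self.0
            }
        }

        /// Raw stored value, e.g. for embedding in an identifier.
        pub fn literal(&self) -> u8 {
            self.0
        }
    }
}

use shard::ShardCount;

fn main() {
    let legacy = ShardCount::new(0);
    let sharded = ShardCount::new(4);
    println!("legacy: literal={} count={}", legacy.literal(), legacy.count());
    println!("sharded: literal={} count={}", sharded.literal(), sharded.count());
}
```

Keeping the field private means code outside the module cannot read `.0` directly, so the special case for `0` lives in one place instead of at every call site.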

4916 Commits
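
Note on commit 840abe3954 (aux files stored as deltas): the message contrasts an O(N^2) cost, where every modification rewrites the entire map as an image, with an O(N) cost when only a delta is written. The toy model below illustrates that arithmetic under simplifying assumptions that are not from the source: each modification adds exactly one entry of a fixed size, and the function names are hypothetical.

```rust
/// Total bytes written if every modification rewrites the whole map as an image.
/// After the k-th modification the map holds k entries, so the k-th write costs k * entry_size.
fn bytes_written_as_images(modifications: u64, entry_size: u64) -> u64 {
    (1..=modifications).map(|entries| entries * entry_size).sum()
}

/// Total bytes written if every modification writes only its own delta.
fn bytes_written_as_deltas(modifications: u64, entry_size: u64) -> u64 {
    modifications * entry_size
}

fn main() {
    let entry_size = 100; // bytes per entry, an arbitrary illustrative value
    for n in [10u64, 100, 1000] {
        println!(
            "n={}: images={} bytes, deltas={} bytes",
            n,
            bytes_written_as_images(n, entry_size),
            bytes_written_as_deltas(n, entry_size)
        );
    }
}
```

In this model the image-based total is `entry_size * N * (N + 1) / 2`, which grows quadratically, while the delta-based total is `entry_size * N`.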