rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-05 20:42:54 +00:00

Author	SHA1	Message	Date
Vlad Lazar	3f91ea28d9	tests: add infra and test for storcon leadership transfer (#8587 ) ## Problem https://github.com/neondatabase/neon/pull/8588 implemented the mechanism for storage controller leadership transfers. However, there are no tests that exercise the behaviour. ## Summary of changes 1. Teach `neon_local` how to handle multiple storage controller instances. Each storage controller instance gets its own subdirectory (`storage_controller_1, ...`). `storage_controller start\|stop` subcommands have also been extended to optionally accept an instance id. 2. Add a storage controller proxy test fixture. It's a basic HTTP server that forwards requests from pageserver and test env to the currently configured storage controller. 3. Add a test which exercises storage controller leadership transfer. 4. Finally fix a couple of bugs that the test surfaced	2024-08-16 13:05:04 +01:00
Heikki Linnakangas	7fdc3ea162	Add retroactive RFC about physical replication (#8546 ) We've had physical replication support for a long time, but we never created an RFC for the feature. This RFC does that after the fact. Even though we've already implemented the feature, let's have a design discussion as if we hadn't done that. It can still be quite insightful. This is written from a pretty compute-centric viewpoint, not much on how it works in the control plane.	2024-08-16 11:30:53 +01:00
Joonas Koivunen	4763a960d1	chore: log if we have an open layer or any frozen on shutdown (#8740 ) Some benchmarks are failing with a "long" flushing, which might be because there is a queue of in-memory layers, or something else. Add logging to narrow it down. Private slack DM ref: https://neondb.slack.com/archives/D049K7HJ9JM/p1723727305238099	2024-08-16 06:10:05 +01:00
Sasha Krassovsky	df086cd139	Add logical replication test to exercise snapfiles (#8364 )	2024-08-15 15:34:45 -07:00
Alexander Bayandin	69cb1ee479	CI(replication-tests): store test results & change notification channel (#8687 ) ## Problem We want to store Nightly Replication test results in the database and notify the relevant Slack channel about failures ## Summary of changes - Store test results in the database - Notify `on-call-compute-staging-stream` about failures	2024-08-15 22:41:58 +01:00
Alexander Bayandin	4e58fd9321	CI(label-for-external-users): use CI_ACCESS_TOKEN (#8738 ) ## Problem `secrets.GITHUB_TOKEN` (with any permissions) is not enough to get a user's membership info if they decide to hide it. ## Summary of changes - Use `secrets.CI_ACCESS_TOKEN` for `gh api` call - Use `pull_request_target` instead of `pull_request` event to get access to secrets	2024-08-15 18:37:15 +01:00
Konstantin Knizhnik	f087423a01	Handle reload config file request in LR monitor (#8732 ) ## Problem Logical replication BGW checking replication lag is not reloading config ## Summary of changes Add handling of reload config request Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-08-15 16:28:25 +03:00
Joonas Koivunen	24d347f50b	storcon: use tracing for logging panics (#8734 ) this gives spans for panics, and does not clobber loki output by writing to stderr while all of the other logging is to stdout. See: #3475	2024-08-15 16:27:07 +03:00
Joonas Koivunen	52641eb853	storcon: add spans to drain/fill ops (#8735 ) this way we do not need to repeat the %node_id everywhere, and we get no stray messages in logs from within the op.	2024-08-15 15:30:04 +03:00
Joonas Koivunen	d9a57aeed9	storcon: deny external node configuration if an operation is ongoing (#8727 ) Per #8674, disallow node configuration while drain/fill are ongoing. Implement it by adding a only-http wrapper `Service::external_node_configure` which checks for operation existing before configuring. Additionally: - allow cancelling drain/fill after a pageserver has restarted and transitioned to WarmingUp Fixes: #8674	2024-08-15 10:54:05 +01:00
Alexander Bayandin	a9c28be7d0	fix(pageserver): allow unused_imports in download.rs on macOS (#8733 ) ## Problem On macOS, clippy fails with the following error: ``` error: unused import: `crate::virtual_file::owned_buffers_io::io_buf_ext::IoBufExt` --> pageserver/src/tenant/remote_timeline_client/download.rs:26:5 \| 26 \| use crate::virtual_file::owned_buffers_io::io_buf_ext::IoBufExt; \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ \| = note: `-D unused-imports` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(unused_imports)]` ``` Introduced in https://github.com/neondatabase/neon/pull/8717 ## Summary of changes - allow `unused_imports` for `crate::virtual_file::owned_buffers_io::io_buf_ext::IoBufExt` on macOS in download.rs	2024-08-15 10:06:28 +01:00
Vlad Lazar	fef77b0cc9	safekeeper: consider partial uploads when pulling timeline (#8628 ) ## Problem The control file contains the id of the safekeeper that uploaded it. Previously, when sending a snapshot of the control file to another sk, it would eventually be gc-ed by the receiving sk. This is incorrect because the original sk might still need it later. ## Summary of Changes When sending a snapshot and the control file contains an uploaded segment: * Create a copy of the segment in s3 with the destination sk in the object name * Tweak the streamed control file to point to the object created in the previous step Note that the snapshot endpoint now has to know the id of the requestor, so the api has been extended to include the node id of the destination sk. Closes https://github.com/neondatabase/neon/issues/8542	2024-08-15 09:02:33 +01:00
Christian Schwarz	168913bdf0	refactor(write path): newtype to enforce use of fully initialized slices (#8717 ) The `tokio_epoll_uring::Slice` / `tokio_uring::Slice` type is weird. The new `FullSlice` newtype is better. See the doc comment for details. The naming is not ideal, but we'll clean that up in a future refactoring where we move the `FullSlice` into `tokio_epoll_uring`. Then, we'll do the following: * tokio_epoll_uring::Slice is removed * `FullSlice` becomes `tokio_epoll_uring::IoBufView` * new type `tokio_epoll_uring::IoBufMutView` for the current `tokio_epoll_uring::Slice<IoBufMut>` Context ------- I did this work in preparation for https://github.com/neondatabase/neon/pull/8537. There, I'm changing the type that the `inmemory_layer.rs` passes to `DeltaLayerWriter::put_value_bytes` and thus it seemed like a good opportunity to make this cleanup first.	2024-08-14 21:57:17 +02:00
Alexander Bayandin	aa2e16f307	CI: misc cleanup & fixes (#8559 ) ## Problem A bunch of small fixes and improvements for CI that are too small to have a separate PR for them ## Summary of changes - CI(build-and-test): fix parenthesis - CI(actionlint): fix path to workflow file - CI: remove default args from actions/checkout - CI: remove `gen3` label, using a couple `self-hosted` + `small{,-arm64}`/`large{,-arm64}` is enough - CI: prettify Slack messages, hide links behind text messages - CI(build-and-test): add more dependencies to `conclusion` job	2024-08-14 17:56:59 +01:00
Alexander Bayandin	70b18ff481	CI(neon-image): add ARM-specific RUSTFLAGS (#8566 ) ## Problem It's recommended that a couple of additional RUSTFLAGS be set up to improve the performance of Rust applications on AWS Graviton. See `57dc813626/rust.md` Note: Apple Silicon is compatible with neoverse-n1: ``` $ clang --version Apple clang version 15.0.0 (clang-1500.3.9.4) Target: arm64-apple-darwin23.6.0 Thread model: posix InstalledDir: /Applications/Xcode_15.4.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin $ $ clang --print-supported-cpus 2>&1 \| grep neoverse- neoverse-512tvb neoverse-e1 neoverse-n1 neoverse-n2 neoverse-v1 neoverse-v2 ``` ## Summary of changes - Add `-Ctarget-feature=+lse -Ctarget-cpu=neoverse-n1` to RUSTFLAGS for ARM images	2024-08-14 17:03:21 +01:00
Joonas Koivunen	60fc1e8cc8	chore: even more responsive compaction cancellation (#8725 ) Some benchmarks and tests might still fail because of #8655 (tracked in #8708) because we are not fast enough to shut down ([one evidence]). Partially this is explained by the current validation mode of streaming k-merge, but otherwise because that is where we use a lot of time in compaction. Outside of L0 => L1 compaction, the image layer generation is already guarded by vectored reads doing cancellation checks. 32768 is a wild guess based on looking how many keys we put in each layer in a bench (1-2 million), but I assume it will be good enough divisor. Doing checks more often will start showing up as contention which we cannot currently measure. Doing checks less often might be reasonable. [one evidence]: https://neon-github-public-dev.s3.amazonaws.com/reports/main/10384136483/index.html#suites/9681106e61a1222669b9d22ab136d07b/96e6d53af234924/ Earlier PR: #8706.	2024-08-14 14:48:15 +01:00
Alexander Bayandin	36c1719a07	CI(build-neon): fix accidental neon rebuild on `cargo test` (#8721 ) ## Problem During `Run rust tests` step (for debug builds), we accidentally rebuild neon twice (by `cargo test --doc` and by `cargo nextest run`). It happens because we don't set `cov_prefix` for the `cargo test --doc` command, which triggers rebuilding with different build flags, and one more rebuild by `cargo nextest run`. ## Summary of changes - Set `cov_prefix` for `cargo test --doc` to prevent unneeded rebuilds	2024-08-14 13:38:25 +01:00
John Spray	abb53ba36d	storcon_cli: don't clobber heatmap interval when setting eviction (#8722 ) ## Problem This command is kind of a hack, used when we're migrating large tenants and want to get their resident size down. It sets the tenant config to a fixed value, which omitted heatmap_period, so caused secondaries to get out of date. ## Summary of changes - Set heatmap period to the same 300s default that we use elsewhere when updating eviction settings This is not as elegant as some general purpose partial modification of the config, but it practically makes the command safer to use.	2024-08-14 13:37:03 +01:00
Conrad Ludgate	a7028d92b7	proxy: start of jwk cache (#8690 ) basic JWT implementation that caches JWKs and verifies signatures. this code is currently not reachable from proxy, I just wanted to get something merged in.	2024-08-14 13:35:29 +01:00
Joonas Koivunen	6c9e3c9551	refactor: error/anyhow::Error wrapping (#8697 ) We can get CompactionError::Other(Cancelled) via the error handling with a few ways. [evidence](https://neon-github-public-dev.s3.amazonaws.com/reports/pr-8655/10301613380/index.html#suites/cae012a1e6acdd9fdd8b81541972b6ce/653a33de17802bb1/). Hopefully fix it by: 1. replace the `map_err` which hid the `GetReadyAncestorError::Cancelled` with `From<GetReadyAncestorError> for GetVectoredError` conversion 2. simplifying the code in pgdatadir_mapping to eliminate the token anyhow wrapping for deserialization errors 3. stop wrapping GetVectoredError as anyhow errors 4. stop wrapping PageReconstructError as anyhow errors Additionally, produce warnings if we treat any other error (as was legal before this PR) as missing key. Cc: #8708.	2024-08-14 12:45:56 +01:00
Alexander Bayandin	fc3d372f3a	CI(label-for-external-users): check membership using GitHub API (#8724 ) ## Problem `author_association` doesn't properly work if a GitHub user decides not to show affiliation with the org in their profile (i.e. if it's private) ## Summary of changes - Call `/orgs/ORG/members/USERNAME` API to check whether a PR/issue author is a member of the org	2024-08-14 12:27:52 +01:00
John Spray	19d69d515c	pageserver: evict covered layers earlier (#8679 ) ## Problem When pageservers do compaction, they frequently create image layers that make earlier layers un-needed for reads, but then keep those earlier layers around for 24 hours waiting for time-based eviction to expire them. Now that we track layer visibility, we can use it as an input to eviction, and avoid the 24 hour "disk bump" that happens around pageserver restarts. ## Summary of changes - During time-based eviction, if a layer is marked Covered, use the eviction period as the threshold: i.e. these layers get to remain resident for at least one iteration of the eviction loop, but then get evicted. With current settings this means they get evicted after 1h instead of 24h. - During disk usage eviction, prioritize evicting covered layers above all other layers. Caveats: - Using the period as the threshold for time based eviction in this case is a bit of a hack, but it avoids adding yet another configuration property, and in any case the value of a new property would be somewhat arbitrary: there's no "right" length of time to keep covered layers around just in case. - We had previously planned on removing time-based eviction: this change would motivate us to keep it around, but we can still simplify the code later to just do the eviction of covered layers, rather than applying a TTL policy to all layers.	2024-08-14 12:10:15 +01:00
Joonas Koivunen	485d76ac62	timeline_detach_ancestor: adjust error handling (#8528 ) With additional phases from #8430 the `detach_ancestor::Error` became untenable. Split it up into phases, and introduce laundering for remaining `anyhow::Error` to propagate them as most often `Error::ShuttingDown`. Additionally, complete FIXMEs. Cc: #6994	2024-08-14 10:16:18 +01:00
John Spray	4049d2b7e1	scrubber: fix spurious "Missed some shards" errors (#8661 ) ## Problem The storage scrubber was reporting warnings for lots of timelines like: ``` WARN Missed some shards at count ShardCount(0) tenant_id=25eb7a83d9a2f90ac0b765b6ca84cf4c ``` These were spurious: these tenants are fine. There was a bug in accumulating the ShardIndex for each tenant, whereby multiple timelines would lead us to add the same ShardIndex more than once. Closes: #8646 ## Summary of changes - Accumulate ShardIndex in a BTreeSet instead of a Vec - Extend the test to reproduce the issue	2024-08-14 09:29:06 +01:00
Konstantin Knizhnik	7a1736ddcf	Preserve HEAP_COMBOCID when restoring t_cid from WAL (#8503 ) ## Problem See https://github.com/neondatabase/neon/issues/8499 ## Summary of changes Save HEAP_COMBOCID flag in WAL and do not clear it in redo handlers. Related Postgres PRs: https://github.com/neondatabase/postgres/pull/457 https://github.com/neondatabase/postgres/pull/458 https://github.com/neondatabase/postgres/pull/459 --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2024-08-14 08:13:20 +03:00
Tristan Partin	c624317b0e	Decode the database name in SQL/HTTP connections A url::Url does not hand you back a URL decoded value for path values, so we must decode them ourselves. Link: https://docs.rs/url/2.5.2/url/struct.Url.html#method.path Link: https://docs.rs/url/2.5.2/url/struct.Url.html#method.path_segments Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-08-13 16:32:58 -05:00
Tristan Partin	0f43b7c51b	Loosen type on PgProtocol::safe_psql(queries:) Using Iterable allows us to also use tuples, among other things. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-08-13 16:32:58 -05:00
Joonas Koivunen	6d6e2c6a39	feat(detach_ancestor): better retries with persistent gc blocking (#8430 ) With the persistent gc blocking, we can now retry reparenting timelines which had failed for whatever reason on the previous attempt(s). Restructure the detach_ancestor into three phases: - prepare (insert persistent gc blocking, copy lsn prefix, layers) - detach and reparent - reparenting can fail, so we might need to retry this portion - complete (remove persistent gc blocking) Cc: #6994	2024-08-13 18:51:51 +01:00
Joonas Koivunen	87a5d7db9e	test: do better job of shutting everything down (#8714 ) After #8655 we've had a few issues (mostly tracked on #8708) with the graceful shutdown. In order to shutdown more of the processes and catch more errors, for example, from all pageservers, do an immediate shutdown for those nodes which fail the initial (possibly graceful) shutdown. Cc: #6485	2024-08-13 18:49:50 +01:00
Peter Bendel	9d2276323d	Benchmarking tests: automatically restore Neon reuse databases, too and migrate to pg16 (#8707 ) ## Problem We use a set of Neon reuse databases in benchmarking.yml which are still using pg14. Because we want to compare apples to apples and have migrated the AWS reuse clusters to pg16 we should also use pg16 for Neon. ## Summary of changes - Automatically restore the test databases for Neon project	2024-08-13 19:36:39 +02:00
Joonas Koivunen	ae6e27274c	refactor(test): unify how we clear shared buffers (#8634 ) so that we can easily plug in LFC clearing as well. Private discussion reference: <https://neondb.slack.com/archives/C033A2WE6BZ/p1722942856987979>	2024-08-13 20:14:42 +03:00
Joonas Koivunen	8f170c5105	fix: make compaction more sensitive to cancellation (#8706 ) A few of the benchmarks have started failing after #8655 where they are waiting for compactor task. Reads done by image layer creation should already be cancellation sensitive because vectored get does a check each time, but try sprinkling additional cancellation points to: - each partition - after each vectored read batch	2024-08-13 18:00:54 +01:00
Joonas Koivunen	e0946e334a	bench: stop immediately in some benches (#8713 ) It seems that some benchmarks are failing because they are simply not stopping to ingest wal on shutdown. It might mean that the tests were never run on a stable pageserver situation and WAL has always been left to be ingested on safekeepers, but let's see if this silences the failures and "stops the bleeding". Cc: https://github.com/neondatabase/neon/issues/8712	2024-08-13 17:07:51 +01:00
Alexander Bayandin	852a6a7a5a	CI: mark PRs and issues created by external users (#8694 ) ## Problem We want to mark new PRs and issues created by external users ## Summary of changes - Add a new workflow which adds `external` label for issues and PRs created by external users	2024-08-13 15:28:26 +01:00
John Spray	ecb01834d6	pageserver: implement utilization score (#8703 ) ## Problem When the utilization API was added, it was just a stub with disk space information. Disk space information isn't a very good metric for assigning tenants to pageservers, because pageservers making full use of their disks would always just have 85% utilization, irrespective of how much pressure they had for disk space. ## Summary of changes - Use the new layer visibility metric to calculate a "wanted size" per tenant, and sum these to get a total local disk space wanted per pageserver. This acts as the primary signal for utilization. - Also use the shard count to calculate a utilization score, and take the max of this and the disk-driven utilization. The shard count limit is currently set as a constant 20,000, which matches contemporary operational practices when loading pageservers. The shard count limit means that for tiny/empty tenants, on a machine with 3.84TB disk, each tiny tenant influences the utilization score as if it had size 160MB.	2024-08-13 15:15:55 +01:00
Konstantin Knizhnik	afb68b0e7e	Report search_path to make it possible to use it in pgbouncer track_extra_parameters (#8303 ) ## Problem When pooled connections are used, session semantics are not preserved, including GUC settings. Many customers have a particular problem with setting search_path. But pgbouncer 1.20 has a `track_extra_parameters` setting which allows tracking parameters included in the startup packet which are reported by Postgres. Postgres has [an official list of parameters that it reports to the client](https://www.postgresql.org/docs/15/protocol-flow.html#PROTOCOL-ASYNC). This PR makes Postgres also report `search_path` and so allows including it in `track_extra_parameters`. ## Summary of changes Set GUC_REPORT flag for `search_path`. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-08-13 15:07:24 +03:00
Vlad Lazar	b9d2c7bdd5	pageserver: remove vectored get related configs (#8695 ) ## Problem Pageserver exposes some vectored get related configs which are not in use. ## Summary of changes Remove the following pageserver configs: get_impl, get_vectored_impl, and `validate_get_vectored`. They are not used in the pageserver since https://github.com/neondatabase/neon/pull/8601. Manual overrides have been removed from the aws repo in https://github.com/neondatabase/aws/pull/1664.	2024-08-13 12:45:54 +01:00
John Spray	3379cbcaa4	pageserver: add CompactKey, use it in InMemoryLayer (#8652 ) ## Problem This follows a PR that insists all input keys are representable in 16 bytes: - https://github.com/neondatabase/neon/pull/8648 & a PR that prevents postgres from sending us keys that use the high bits of field2: - https://github.com/neondatabase/neon/pull/8657 Motivation for this change: 1. Ingest is bottlenecked on CPU 2. InMemoryLayer can create huge (~1M value) BTreeMap<Key,_> for its index. 3. Maps over i128 are much faster than maps over an arbitrary 18 byte struct. It may still be worthwhile to make the index two-tier to optimize for the case where only the last 4 bytes (blkno) of the key vary frequently, but simply using the i128 representation of keys has a big impact for very little effort. Related: #8452 ## Summary of changes - Introduce `CompactKey` type which contains an i128 - Use this instead of Key in InMemoryLayer's index, converting back and forth as needed. ## Performance All the small-value `bench_ingest` cases show improved throughput. The one that exercises this index most directly shows a 35% throughput increase: ``` ingest-small-values/ingest 128MB/100b seq, no delta time: [374.29 ms 378.56 ms 383.38 ms] thrpt: [333.88 MiB/s 338.13 MiB/s 341.98 MiB/s] change: time: [-26.993% -26.117% -25.111%] (p = 0.00 < 0.05) thrpt: [+33.531% +35.349% +36.974%] Performance has improved. ```	2024-08-13 11:48:23 +01:00
Arseny Sher	d24f1b6c04	Allow logical_replication_max_snap_files = -1 which disables the mechanism.	2024-08-13 09:42:16 +03:00
Sasha Krassovsky	32aa1fc681	Add on-demand WAL download to slot funcs (#8705 ) ## Problem Currently we can have an issue where if someone does `pg_logical_slot_advance`, it could fail because it doesn't have the WAL locally. ## Summary of changes Adds on-demand WAL download and a test to these slot funcs. Before adding these, the test fails with ``` requested WAL segment pg_wal/000000010000000000000001 has already been removed ``` After the changes, the test passes Relies on: - https://github.com/neondatabase/postgres/pull/466 - https://github.com/neondatabase/postgres/pull/467 - https://github.com/neondatabase/postgres/pull/468	2024-08-12 20:54:42 -08:00
Peter Bendel	f57c2fe8fb	Automatically prepare/restore Aurora and RDS databases from pg_dump in benchmarking workflow (#8682 ) ## Problem We use infrastructure as code (TF) to deploy AWS Aurora and AWS RDS Postgres database clusters. Whenever we have a change in TF (e.g. every year to upgrade to a higher Postgres version or when we change the cluster configuration) TF will apply the change and create a new AWS database cluster. However our benchmarking testcase also expects databases in these clusters and tables loaded with data. So we add auto-detection - if the AWS RDS instances are "empty" we create the necessary databases and restore a pg_dump. Important Notes: - These steps are NOT run in each benchmarking run, but only after a new RDS instance has been deployed. - the benchmarking workflows use GitHub secrets to find the connection string for the database. These secrets still need to be (manually or programmatically using git cli) updated if some port of the connection string (e.g. user, password or hostname) changes. ## Summary of changes In each benchmarking run check if - database has already been created - if not create it - database has already been restored - if not restore it Supported databases - tpch - clickbench - user example Supported platforms: - AWS RDS Postgres - AWS Aurora serverless Postgres Sample workflow run - but this one uses Neon database to test the restore step and not real AWS databases https://github.com/neondatabase/neon/actions/runs/10321441086/job/28574350581 Sample workflow run - with real AWS database clusters https://github.com/neondatabase/neon/actions/runs/10346816389/job/28635997653 Verification in second run - with real AWS database clusters - that second time the restore is skipped https://github.com/neondatabase/neon/actions/runs/10348469517/job/28640778223	2024-08-12 21:46:35 +02:00
Christian Schwarz	ce0d0a204c	fix(walredo): shutdown can complete too early (#8701 ) Problem ------- The following race is possible today: ``` walredo_extraordinary_shutdown_thread: shutdown gets until Poll::Pending of self.launched_processes.close().await call other thread: drops the last Arc<Process> = 1. drop(_launched_processes_guard) runs, this ... walredo_extraordinary_shutdown_thread: ... wakes self.launched_processes.close().await walredo_extraordinary_shutdown_thread: logs `done` other thread: = 2. drop(process): this kill & waits ``` Solution -------- Change drop order so that `process` gets dropped first. Context ------- https://neondb.slack.com/archives/C06Q661FA4C/p1723478188785719?thread_ts=1723456706.465789&cid=C06Q661FA4C refs https://github.com/neondatabase/neon/pull/8572 refs https://github.com/neondatabase/cloud/issues/11387	2024-08-12 18:15:48 +01:00
Vlad Lazar	ae527ef088	storcon: implement graceful leadership transfer (#8588 ) ## Problem Storage controller restarts cause temporary unavailability from the control plane POV. See RFC for more details. ## Summary of changes * A couple of small refactors of the storage controller start-up sequence to make extending it easier. * A leader table is added to track the storage controller instance that's currently the leader (if any) * A peer client is added such that storage controllers can send `step_down` requests to each other (implemented in https://github.com/neondatabase/neon/pull/8512). * Implement the leader cut-over as described in the RFC * Add `start-as-candidate` flag to the storage controller to gate the rolling restart behaviour. When the flag is `false` (the default), the only change from the current start-up sequence is persisting the leader entry to the database.	2024-08-12 13:58:46 +01:00
Joonas Koivunen	9dc9a9b2e9	test: do graceful shutdown by default (#8655 ) It should give us all possible allowed_errors more consistently. While getting the workflows to pass on https://github.com/neondatabase/neon/pull/8632 it was noticed that allowed_errors are rarely hit (1/4). This made me realize that we always do an immediate stop by default. Doing a graceful shutdown would have made the draining more apparent and likely we would not have needed the #8632 hotfix. Downside of doing this is that we will see more timeouts if tests are randomly leaving pause failpoints which fail the shutdown. The net outcome should however be positive, we could even detect too slow shutdowns caused by a bug or deadlock.	2024-08-12 15:37:15 +03:00
John Spray	1b9a27d6e3	tests: reinstate test_bulk_insert (#8683 ) ## Problem This test was disabled. ## Summary of changes - Remove the skip marker. - Explicitly avoid doing compaction & gc during checkpoints (the default scale doesn't do anything here, but when experimenting with larger scales it messes things up) - Set a data size that gives a ~20s runtime on a Hetzner dev machine, previous one gave very noisy results because it was so small For reference on a Hetzner AX102: ``` ------------------------------ Benchmark results ------------------------------- test_bulk_insert[neon-release-pg16].insert: 25.664 s test_bulk_insert[neon-release-pg16].pageserver_writes: 5,428 MB test_bulk_insert[neon-release-pg16].peak_mem: 577 MB test_bulk_insert[neon-release-pg16].size: 0 MB test_bulk_insert[neon-release-pg16].data_uploaded: 1,922 MB test_bulk_insert[neon-release-pg16].num_files_uploaded: 8 test_bulk_insert[neon-release-pg16].wal_written: 1,382 MB test_bulk_insert[neon-release-pg16].wal_recovery: 25.373 s test_bulk_insert[neon-release-pg16].compaction: 0.035 s ```	2024-08-12 13:33:09 +01:00
Shinya Kato	41b5ee491e	Fix a comment in walproposer_pg.c (#8583 ) ## Problem Perhaps there is an error in the source code comment. ## Summary of changes Fix "walsender" to "walproposer"	2024-08-12 13:24:25 +01:00
Arseny Sher	06df6ca52e	proto changes	2024-08-12 14:48:05 +03:00
Arseny Sher	930763cad2	s/jsonb/array	2024-08-12 14:48:05 +03:00
Arseny Sher	28ef1522d6	cosmetic fixes	2024-08-12 14:48:05 +03:00
Arseny Sher	c9d2b61195	fix term uniqueness	2024-08-12 14:48:05 +03:00
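
The scrubber fix in 4049d2b7e1 above (accumulate ShardIndex in a BTreeSet instead of a Vec) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `ShardIndex` struct is a hypothetical stand-in rather than the type from the neon codebase, and the `all_shards_seen` length comparison is a guess at the kind of check that a duplicated entry would upset, since the commit message does not show the actual check.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for the scrubber's shard identifier (not neon's real type).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct ShardIndex {
    pub shard_number: u8,
    pub shard_count: u8,
}

/// Pre-fix shape: each timeline of a tenant pushes its shard index,
/// so a tenant with several timelines records the same shard repeatedly.
pub fn accumulate_vec(observed: &[ShardIndex]) -> Vec<ShardIndex> {
    let mut seen = Vec::new();
    for shard in observed {
        seen.push(*shard);
    }
    seen
}

/// Post-fix shape: a set keeps one entry per distinct shard index,
/// no matter how many timelines reported it.
pub fn accumulate_set(observed: &[ShardIndex]) -> BTreeSet<ShardIndex> {
    observed.iter().copied().collect()
}

/// Assumed completeness check: the number of recorded shards must equal
/// the number of shards the tenant is expected to have.
pub fn all_shards_seen(recorded: usize, expected: usize) -> bool {
    recorded == expected
}
```

With three timelines on one unsharded tenant, the Vec version records three entries and the assumed check fails, while the set version records one entry and the check passes. A BTreeSet also iterates in sorted order, which keeps any later reporting deterministic.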

5880 Commits