mirror of
https://github.com/neondatabase/neon.git
synced 2026-01-06 13:02:55 +00:00
storage controller: make proxying of GETs to pageservers more robust (#9065)
## Problem

These commits are split off from https://github.com/neondatabase/neon/pull/8971/commits where I was fixing this to make a better scale test pass -- Vlad also independently recognized these issues with cloudbench in https://github.com/neondatabase/neon/issues/9062.

1. The storage controller proxies GET requests to pageservers based on their intent, not the ground truth of where they're really attached.
2. Proxied requests can race with scheduling to tenants, resulting in 404 responses if the request hits the wrong pageserver.

Closes: https://github.com/neondatabase/neon/issues/9062

## Summary of changes

1. If a shard has a running reconciler, then use the database generation_pageserver to decide who to proxy the request to
2. If such a request gets a 404 response and its scheduled node has changed since the request was dispatched.
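The target selection described in the summary can be sketched roughly as follows. This is only an illustration, not the storage controller's actual code: `NodeId`, `ShardView`, `proxy_target`, and `is_racy_404` are hypothetical names, and the shard state is reduced to three fields. What the controller does once it detects a racy 404 is not stated in the commit message, so the sketch only detects the condition.

```rust
/// Hypothetical node identifier; the real controller has its own type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct NodeId(pub u64);

/// Hypothetical, simplified view of one shard's state.
pub struct ShardView {
    /// Where the scheduler intends the shard to be attached.
    pub intent_attached: Option<NodeId>,
    /// Pageserver recorded in the database (`generation_pageserver`).
    pub generation_pageserver: Option<NodeId>,
    /// Whether a reconciler is currently running for this shard.
    pub reconciler_running: bool,
}

/// Pick the pageserver a proxied GET should be sent to.
pub fn proxy_target(shard: &ShardView) -> Option<NodeId> {
    if shard.reconciler_running {
        // While reconciling, intent may be ahead of reality, so fall back
        // to the location recorded in the database.
        shard.generation_pageserver
    } else {
        // No reconciler running: assume intent matches where the shard is.
        shard.intent_attached
    }
}

/// Returns true when a 404 may be an artifact of the scheduling race:
/// the response was a 404 and the node we would pick now differs from
/// the node the request was dispatched to. (Assumption: "scheduled node
/// has changed" is modelled as a change in `proxy_target`.)
pub fn is_racy_404(status: u16, dispatched_to: NodeId, shard: &ShardView) -> bool {
    status == 404 && proxy_target(shard) != Some(dispatched_to)
}
```

A caller would record the result of `proxy_target` before dispatching, then pass it as `dispatched_to` together with a fresh `ShardView` once the response arrives.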
@@ -541,6 +541,8 @@ impl Reconciler {
             }
         }
 
+        pausable_failpoint!("reconciler-live-migrate-pre-generation-inc");
+
         // Increment generation before attaching to new pageserver
         self.generation = Some(
             self.persistence
@@ -617,6 +619,8 @@ impl Reconciler {
             },
         );
 
+        pausable_failpoint!("reconciler-live-migrate-post-detach");
+
         tracing::info!("🔁 Switching to AttachedSingle mode on node {dest_ps}",);
         let dest_final_conf = build_location_config(
             &self.shard,