mirror of
https://github.com/neondatabase/neon.git
synced 2026-01-10 06:52:55 +00:00
storcon: return 503 instead of 500 if there is no new leader yet (#11417)
The leadership transfer protocol between storage controller instances is as follows.

The new pod does these things:

1. The new pod comes online and looks in the database for a leader. If there is one, it asks that leader to step down.
2. The new pod performs some operations to come online. These should be fairly short, but they do not take zero time.
3. The new pod updates the leader entry in the database.

The old pod, once it gets the step-down request, changes its internal state to stepped down. From then on it treats all incoming requests specially: instead of processing them, it tries to forward them to the new pod. Forwarding only works if the new pod is already online, so before forwarding, the old pod reads the leader from the database (which is also how it gets the address to forward to in the first place).

If the new pod is not online yet, i.e. during step 2 above, the old pod might legitimately land in the branch which this patch is editing: the leader in the database is a stepped-down instance, namely the old pod itself.

Before, we returned an `ApiError::InternalServerError`, which prints the full backtrace plus an error log. With this patch, we cut down on the noise, as a short storcon downtime is expected while we are cutting over to the new instance.

A `ResourceUnavailable` error is not just more fitting, it also doesn't print a backtrace when encountered, and only logs at the INFO level (see the `api_error_handler` function).

Fixes #11320
cc #8954
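The routing decision of a stepped-down pod, as described above, can be sketched roughly as follows. This is a simplified illustration, not the storage controller's actual code: the `Route` enum and the `route_when_stepped_down` function are hypothetical names, and the handling of an empty leader entry is an assumption, since the commit message does not cover that case.

```rust
/// Hypothetical outcome of a stepped-down pod's routing decision.
#[derive(Debug, PartialEq)]
enum Route {
    /// Proxy the request to the new leader at this address.
    ForwardTo(String),
    /// No usable leader yet; the caller should answer with a 503.
    Unavailable(&'static str),
}

/// Decide what a stepped-down pod does with an incoming request.
///
/// `self_addr` is this pod's own address; `db_leader` is the leader
/// address read from the database, if any.
fn route_when_stepped_down(self_addr: &str, db_leader: Option<&str>) -> Route {
    match db_leader {
        // Assumption: with no leader entry at all, there is nothing to forward to.
        None => Route::Unavailable("No leader in database"),
        // The new pod is still in step 2, so the database still names this
        // (stepped-down) pod as leader. This is the branch the patch edits.
        Some(addr) if addr == self_addr => {
            Route::Unavailable("Leader is stepped down instance")
        }
        // The new pod has finished step 3: forward to it.
        Some(addr) => Route::ForwardTo(addr.to_string()),
    }
}
```

Calling `route_when_stepped_down("old:1234", Some("old:1234"))` yields the `Unavailable` variant, which corresponds to the window during step 2 in which clients are expected to retry.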
@@ -1733,9 +1733,9 @@ async fn maybe_forward(req: Request<Body>) -> ForwardOutcome {
         };
 
         if *self_addr == leader_addr {
-            return ForwardOutcome::Forwarded(Err(ApiError::InternalServerError(anyhow::anyhow!(
-                "Leader is stepped down instance"
-            ))));
+            return ForwardOutcome::Forwarded(Err(ApiError::ResourceUnavailable(
+                "Leader is stepped down instance".into(),
+            )));
         }
     }
 
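To illustrate why the variant swap reduces log noise, here is a minimal sketch of the behavior the commit message attributes to `api_error_handler`. It is a stand-in under stated assumptions, not the real handler: the `Handling` struct, the `LogLevel` enum, and the `handling_for` function are hypothetical, and the `ApiError` enum below is a cut-down two-variant version whose payload types are assumed.

```rust
use std::borrow::Cow;

/// Cut-down stand-in for the real `ApiError`; payload types are assumptions.
#[allow(dead_code)]
#[derive(Debug)]
enum ApiError {
    InternalServerError(String),
    ResourceUnavailable(Cow<'static, str>),
}

/// Hypothetical log levels, just enough for this sketch.
#[derive(Debug, PartialEq)]
enum LogLevel {
    Info,
    Error,
}

/// Hypothetical summary of how an error is reported to client and logs.
#[derive(Debug, PartialEq)]
struct Handling {
    status: u16,
    level: LogLevel,
    backtrace: bool,
}

/// Map an error to its HTTP status and logging behavior, following the
/// commit message: 500 logs an error with a backtrace, 503 logs at INFO
/// without one.
fn handling_for(err: &ApiError) -> Handling {
    match err {
        ApiError::InternalServerError(_) => Handling {
            status: 500,
            level: LogLevel::Error,
            backtrace: true,
        },
        ApiError::ResourceUnavailable(_) => Handling {
            status: 503,
            level: LogLevel::Info,
            backtrace: false,
        },
    }
}
```

Under these assumptions, the error built in the patched branch maps to status 503 with an INFO-level log line and no backtrace, which is the quieter behavior wanted for an expected cutover window.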