pull_timeline and sk logging fixes (#11786)

mirror of https://github.com/neondatabase/neon.git synced 2026-01-06 04:52:55 +00:00

This patch contains some fixes of issues I ran into for #11712:

* make `pull_timeline` return success for a timeline that already exists.
This follows general API design of storage components: API endpoints are
retryable and converge to a status code, instead of starting to error.
We change the `pull_timeline`'s return type a little bit, because we
might not actually have a source sk to pull from. Note that the fix is
not enough: there is still a race when two `pull_timeline` instances
happen in parallel: we might try to enter both pulled timelines at the
same time. That can be fixed later.
* make `pull_timeline` support one safekeeper being down. In general, if
one safekeeper is down, that's not a problem. The added comment explains
a potential situation (found in the `test_lagging_sk` test, for example).
* don't log very long errors when computes try to connect to safekeepers
that don't have the timeline yet, if `allow_timeline_creation` is false.
That flag is enabled when a sk connection string with generation numbers
is passed to the compute, so we'll hit this code path more often. E.g.
when a safekeeper missed a timeline creation, but the compute connects
to it first before the `pull_timeline` gets requested by the storcon
reconciler: this is a perfectly normal situation. So don't log the whole
error backtrace, and don't log it on the error log level, but only on
info.
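
The logging change in the last bullet can be sketched as follows. This is a minimal, std-only illustration and not the safekeeper's actual code: the `Level` and `ConnectError` types and the `log_line` function are hypothetical names made up for this sketch, and the handling of the non-"timeline missing" cases is an assumption.

```rust
/// Log levels used in this sketch (hypothetical, not the `tracing` crate's types).
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Level {
    Info,
    Error,
}

/// Hypothetical error for a compute connection attempt.
#[derive(Debug)]
pub enum ConnectError {
    /// The safekeeper does not have the timeline (yet).
    TimelineNotFound { timeline_id: String },
    /// Any other failure, carrying a long error chain / backtrace.
    Other { message: String, backtrace: String },
}

/// Decide the level and the text to log for a failed connection.
///
/// A missing timeline with `allow_timeline_creation == false` is treated as a
/// normal situation: one short line at info level, no backtrace.
/// Everything else is logged in full at error level (an assumption of this sketch).
pub fn log_line(err: &ConnectError, allow_timeline_creation: bool) -> (Level, String) {
    match err {
        ConnectError::TimelineNotFound { timeline_id } if !allow_timeline_creation => (
            Level::Info,
            format!("timeline {timeline_id} not found and creation is not allowed"),
        ),
        ConnectError::TimelineNotFound { timeline_id } => (
            Level::Error,
            format!("timeline {timeline_id} not found"),
        ),
        ConnectError::Other { message, backtrace } => {
            (Level::Error, format!("{message}\n{backtrace}"))
        }
    }
}
```

The point of the sketch is only the branch in the first match arm: the expected "timeline not here yet" case produces a single short line at a lower level, while unexpected failures keep their full detail.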

part of #11670

Arpad Müller

2025-04-30 18:24:01 +02:00

committed by

GitHub

parent e2db76b9be

commit bec7427d9e

5 changed files with 47 additions and 14 deletions

									
libs/safekeeper_api/src/models.rs
									
												
@@ -303,7 +303,8 @@ pub struct PullTimelineRequest {
 #[derive(Debug, Serialize, Deserialize)]
 pub struct PullTimelineResponse {
-    // Donor safekeeper host
-    pub safekeeper_host: String,
+    /// Donor safekeeper host.
+    /// None if no pull happened because the timeline already exists.
+    pub safekeeper_host: Option<String>,
     // TODO: add more fields?
 }
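
Callers that previously read `safekeeper_host` as a plain `String` now have to handle the `None` case. Below is a minimal, std-only sketch of the idempotent behavior described in the commit message. The `pull_timeline` and `describe` functions, the `existing` set, and the host names are hypothetical stand-ins for illustration, not the safekeeper's implementation, and the serde derives of the real struct are omitted to keep the sketch dependency-free.

```rust
use std::collections::HashSet;

/// Simplified stand-in for the response type in the diff above.
#[derive(Debug, PartialEq)]
pub struct PullTimelineResponse {
    /// Donor safekeeper host.
    /// None if no pull happened because the timeline already exists.
    pub safekeeper_host: Option<String>,
}

/// Hypothetical idempotent pull. `existing` models the set of timelines
/// already present on this safekeeper.
pub fn pull_timeline(
    existing: &mut HashSet<String>,
    timeline_id: &str,
    donor_host: &str,
) -> PullTimelineResponse {
    if existing.contains(timeline_id) {
        // Retry of an already completed pull: report success, but no donor.
        return PullTimelineResponse { safekeeper_host: None };
    }
    existing.insert(timeline_id.to_string());
    PullTimelineResponse {
        safekeeper_host: Some(donor_host.to_string()),
    }
}

/// Turn a response into a human-readable summary, covering both cases.
pub fn describe(resp: &PullTimelineResponse) -> String {
    match &resp.safekeeper_host {
        Some(host) => format!("pulled from {host}"),
        None => "timeline already exists, nothing pulled".to_string(),
    }
}
```

Calling the sketched `pull_timeline` twice for the same timeline yields `Some(donor)` the first time and `None` the second, and both count as success. This sketch is single-threaded and does not address the parallel-request race that the commit message says remains open.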
