storcon: avoid multiple initdbs when shard 0 has stale locations (#11760)

## Problem

In #11727 I overlooked the case of multiple attached locations for shard
0.

I misread the code and thought `create_one` acts on one location, but it
actually acts on one _shard_, which is potentially multiple locations.

This was not a regression, but it meant that the fix was incomplete.
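
To make the distinction concrete, here is a minimal sketch of the shape `create_one` receives, using simplified stand-in types (the real storcon definitions differ): one shard maps to a latest attached location plus zero or more stale "other" locations, so a per-shard operation can fan out to several pageservers.

```rust
// Simplified, assumed types for illustration only; not the real storcon structs.
struct MutationLocation {
    node: String, // pageserver hosting this attached location (reduced to a name)
}

struct ShardMutationLocations {
    latest: MutationLocation,     // attached location in the newest generation
    other: Vec<MutationLocation>, // stale attached locations, possibly non-empty
}

fn main() {
    let locations = ShardMutationLocations {
        latest: MutationLocation { node: "pageserver-1".into() },
        other: vec![MutationLocation { node: "pageserver-2".into() }],
    };
    // One shard, but 1 + N attached locations: the overlooked case was N > 0 for
    // shard 0, where every location would otherwise run its own initdb.
    println!(
        "latest on {}, {} stale location(s)",
        locations.latest.node,
        locations.other.len()
    );
}
```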

## Summary of changes

- In `create_one`, when updating shard zero, have any "other" locations
use the initdb from shard 0

```diff
@@ -3663,7 +3663,7 @@ impl Service {
         locations: ShardMutationLocations,
         http_client: reqwest::Client,
         jwt: Option<String>,
-        create_req: TimelineCreateRequest,
+        mut create_req: TimelineCreateRequest,
     ) -> Result<TimelineInfo, ApiError> {
         let latest = locations.latest.node;
@@ -3682,6 +3682,15 @@ impl Service {
             .await
             .map_err(|e| passthrough_api_error(&latest, e))?;
 
+        // If we are going to create the timeline on some stale locations for shard 0, then ask them to re-use
+        // the initdb generated by the latest location, rather than generating their own. This avoids racing uploads
+        // of initdb to S3 which might not be binary-identical if different pageservers have different postgres binaries.
+        if tenant_shard_id.is_shard_zero() {
+            if let models::TimelineCreateRequestMode::Bootstrap { existing_initdb_timeline_id, .. } = &mut create_req.mode {
+                *existing_initdb_timeline_id = Some(create_req.new_timeline_id);
+            }
+        }
+
         // We propagate timeline creations to all attached locations such that a compute
         // for the new timeline is able to start regardless of the current state of the
         // tenant shard reconciliation.
```
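
The effect, as described by the new comment: when shard 0 has stale attached locations, only the latest location generates and uploads initdb, and the stale locations are pointed at that upload via `existing_initdb_timeline_id` (set to the new timeline's own id) rather than each running their own initdb, whose racing uploads might not be binary-identical across pageservers with different postgres binaries.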