From 14e1f89053b87a38f0dd697f5c32721380d19a9a Mon Sep 17 00:00:00 2001
From: Erik Grinaker
Date: Tue, 21 Jan 2025 23:01:27 +0100
Subject: [PATCH] pageserver: eagerly notify flush waiters (#10469)

## Problem

Currently, the layer flush loop will continue flushing layers as long as
any are pending, and only notify waiters once there are no further
layers to flush. This can cause waiters to wait longer than necessary,
and potentially starve them if pending layers keep arriving faster than
they can be flushed. The impact of this will increase when we add
compaction backpressure and propagate it up into the WAL receiver.

Extracted from #10405.

## Summary of changes

Break out of the layer flush loop once we've flushed up to the requested
LSN. If further flush requests have arrived in the meanwhile, flushing
will resume immediately after.
---
 pageserver/src/tenant/timeline.rs | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/pageserver/src/tenant/timeline.rs b/pageserver/src/tenant/timeline.rs
index 5f4272fb2b..3245f23a28 100644
--- a/pageserver/src/tenant/timeline.rs
+++ b/pageserver/src/tenant/timeline.rs
@@ -3617,6 +3617,12 @@ impl Timeline {
                     return;
                 }
 
+                // Break to notify potential waiters as soon as we've flushed the requested LSN. If
+                // more requests have arrived in the meanwhile, we'll resume flushing afterwards.
+                if flushed_to_lsn >= frozen_to_lsn {
+                    break Ok(());
+                }
+
                 let timer = self.metrics.flush_time_histo.start_timer();
 
                 let num_frozen_layers;