pageserver: gate previous heatmap behind config flag (#11088)

## Problem

On unarchival, we update the previous heatmap with all visible layers.
When the primary generates a new heatmap, it includes all those layers,
so the secondary downloads them. Since they are not actually resident
on the primary (we didn't call the warm-up API), they are never
evicted, so they remain in the heatmap.

This leads to oversized secondary locations like we saw in pre-prod.

## Summary of changes

Gate the loading of the previous heatmaps and the heatmap generation on
unarchival behind configuration flags. They are disabled by default,
but enabled in tests.
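
A minimal sketch of how the test fixture opts in to the two flags. The helper name `build_pageserver_test_config` and its parameter are hypothetical; only the config keys come from the diff in this commit. Skipping the flags for compatibility-snapshot runs presumably avoids handing older binaries keys they do not recognize, which is an inference from the log message, not something the commit states.

```python
def build_pageserver_test_config(may_use_compat_binaries: bool) -> dict[str, bool]:
    """Hypothetical helper mirroring the fixture logic changed in this commit."""
    # Settings applied to every test run.
    ps_cfg: dict[str, bool] = {
        "no_sync": True,
        "validate_wal_contiguity": True,
    }
    # The heatmap flags default to off in the pageserver; tests turn them on
    # unless the run may use compatibility snapshot binaries.
    if not may_use_compat_binaries:
        ps_cfg["load_previous_heatmap"] = True
        ps_cfg["generate_unarchival_heatmap"] = True
    return ps_cfg
```

With `may_use_compat_binaries=True` the returned dict omits both heatmap keys, so the pageserver falls back to its disabled-by-default behaviour.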
Vlad Lazar
2025-03-05 12:20:18 +00:00
committed by GitHub
parent abae7637d6
commit 8c12ccf729
4 changed files with 29 additions and 3 deletions

@@ -1169,6 +1169,8 @@ class NeonEnv:
# Disable pageserver disk syncs in tests: when running tests concurrently, this avoids
# the pageserver taking a long time to start up due to syncfs flushing other tests' data
"no_sync": True,
# Look for gaps in WAL received from safekeepeers
"validate_wal_contiguity": True,
}
# Batching (https://github.com/neondatabase/neon/issues/9377):
@@ -1181,11 +1183,12 @@ class NeonEnv:
if config.test_may_use_compatibility_snapshot_binaries:
log.info(
"Skipping WAL contiguity validation to avoid forward-compatibility related test failures"
"Skipping prev heatmap settings to avoid forward-compatibility related test failures"
)
else:
# Look for gaps in WAL received from safekeepeers
ps_cfg["validate_wal_contiguity"] = True
ps_cfg["load_previous_heatmap"] = True
ps_cfg["generate_unarchival_heatmap"] = True
get_vectored_concurrent_io = self.pageserver_get_vectored_concurrent_io
if get_vectored_concurrent_io is not None: