tests: stabilize shard locations earlier in test_scrubber_tenant_snapshot (#10606)

## Problem

This test would sometimes emit unexpected logs from the storage
controller's migration requests. Some time after a shard split, the
controller moves load around by migrating shards, and those migrations
could overlap with the test's restarts of pageservers.

Example:
https://neon-github-public-dev.s3.amazonaws.com/reports/pr-10602/13067323736/index.html#testresult/f66f1329557a1fc5/retries

## Summary of changes

- Do a reconcile_until_idle after shard split, so that the rest of the
test doesn't run concurrently with migrations
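
For illustration, the "wait until idle" pattern behind this change can be sketched as below. This is a minimal sketch and not the storage controller's actual implementation: `wait_until_idle` and `reconcile_once` are hypothetical names, and the assumption is that one reconcile pass returns the number of operations still pending. Only the `max_interval` and `timeout_secs` parameter names are taken from the call in this commit.

```python
import time
from typing import Callable


def wait_until_idle(
    reconcile_once: Callable[[], int],  # hypothetical: returns the count of pending operations
    max_interval: float = 0.1,
    timeout_secs: float = 120.0,
) -> int:
    """Call reconcile_once() until it reports zero pending operations.

    Returns the number of calls made. Raises TimeoutError if work is still
    pending once timeout_secs has elapsed.
    """
    deadline = time.monotonic() + timeout_secs
    # Start with a short delay and back off, never exceeding max_interval.
    delay = min(0.01, max_interval)
    calls = 0
    while True:
        calls += 1
        pending = reconcile_once()
        if pending == 0:
            return calls
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"still {pending} pending operations after {timeout_secs}s"
            )
        time.sleep(delay)
        delay = min(delay * 2, max_interval)
```

A small `max_interval` keeps the polling tight, so the test resumes soon after the last migration completes, while `timeout_secs` bounds how long a stuck migration can stall the test.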
John Spray
2025-02-03 09:02:21 +00:00
committed by GitHub
parent 4dfe60e2ad
commit f071800979


@@ -71,6 +71,10 @@ def test_scrubber_tenant_snapshot(neon_env_builder: NeonEnvBuilder, shard_count:
    else:
        tenant_shard_ids = [TenantShardId(tenant_id, 0, 0)]
    # Let shards finish rescheduling to other pageservers: this makes the rest of the test more stable
    # as it won't overlap with migrations
    env.storage_controller.reconcile_until_idle(max_interval=0.1, timeout_secs=120)
    output_path = neon_env_builder.test_output_dir / "snapshot"
    os.makedirs(output_path)