From 56da62487015f78c2cfbb48132bc85cd6f1f93d3 Mon Sep 17 00:00:00 2001
From: Peter Bendel
Date: Wed, 19 Jun 2024 15:04:29 +0200
Subject: [PATCH] allow storage_controller error during pagebench (#8109)

## Problem

`test_pageserver_max_throughput_getpage_at_latest_lsn` is a pagebench
test case which creates several tenants/timelines to verify pageserver
performance.

The test swaps environments around in the tenant duplication stage, so
the storage controller uses two separate db instances (one in the
duplication stage and another one in the benchmarking stage).
In the benchmarking stage, the storage controller starts without any
knowledge of nodes, but with knowledge of tenants (via attachments.json).
When we re-attach and attempt to update the scheduler stats, the
scheduler rightfully complains about the node not being known.

The setup should preserve the storage controller across the two envs,
but I think it's fine to just allowlist the error in this case.

## Summary of changes

Add the error message
`2024-06-19T09:38:27.866085Z ERROR Scheduler missing node 1`
to the list of allowed errors for storage_controller.
---
 ...est_pageserver_max_throughput_getpage_at_latest_lsn.py | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/test_runner/performance/pageserver/pagebench/test_pageserver_max_throughput_getpage_at_latest_lsn.py b/test_runner/performance/pageserver/pagebench/test_pageserver_max_throughput_getpage_at_latest_lsn.py
index 772a39fe35..68f3d9dcbe 100644
--- a/test_runner/performance/pageserver/pagebench/test_pageserver_max_throughput_getpage_at_latest_lsn.py
+++ b/test_runner/performance/pageserver/pagebench/test_pageserver_max_throughput_getpage_at_latest_lsn.py
@@ -209,3 +209,11 @@ def run_benchmark_max_throughput_latest_lsn(
         unit="ms",
         report=MetricReport.LOWER_IS_BETTER,
     )
+
+    env.storage_controller.allowed_errors.append(
+        # The test setup swaps NeonEnv instances, hence different
+        # pg instances are used for the storage controller db. This means
+        # the storage controller doesn't know about the nodes mentioned
+        # in attachments.json at start-up.
+        ".* Scheduler missing node 1",
+    )
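
For reviewers who want to sanity-check the new pattern, here is a minimal sketch of how `.* Scheduler missing node 1` behaves against the log line quoted in the commit message. It assumes the allowed-errors list is treated as regular expressions matched from the start of each log line; the helper `is_allowed` is hypothetical and is not the test fixture's actual implementation.

```python
import re
from typing import Iterable

# Pattern added by this patch to the storage controller's allowed errors.
ALLOWED_ERRORS = [".* Scheduler missing node 1"]


def is_allowed(line: str, patterns: Iterable[str]) -> bool:
    """Hypothetical helper: True if any pattern matches from the start of the line.

    Assumption: patterns are applied with re.match semantics, which anchor
    at the beginning of the string but do not require a full-line match.
    """
    return any(re.match(p, line) is not None for p in patterns)


# Log line taken from the commit message.
log_line = "2024-06-19T09:38:27.866085Z ERROR Scheduler missing node 1"

print(is_allowed(log_line, ALLOWED_ERRORS))
print(is_allowed(log_line.replace("node 1", "node 2"), ALLOWED_ERRORS))
```

Under these assumptions the pattern has no trailing anchor, so it would also accept a line ending in `node 10`. That is probably harmless for this test, but worth keeping in mind if the node count grows.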