mirror of
https://github.com/neondatabase/neon.git
synced 2025-12-22 21:59:59 +00:00
tests: allow-list a controller heartbeat error (#8471)
## Problem `test_change_pageserver` stops pageservers in a way that can overlap with the controller's heartbeats: the controller can get a heartbeat success and then immediately find the node unavailable. This particular situation triggers a log that isn't in our current allow-list of messages for nodes offline Example: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-8339/10048487700/index.html#testresult/19678f27810231df/retries ## Summary of changes - Add the message to the allow list
This commit is contained in:
@@ -828,6 +828,8 @@ impl Service {
|
||||
);
|
||||
}
|
||||
Err(err) => {
|
||||
// Transition to active involves reconciling: if a node responds to a heartbeat then
|
||||
// becomes unavailable again, we may get an error here.
|
||||
tracing::error!(
|
||||
"Failed to update node {} after heartbeat round: {}",
|
||||
node_id,
|
||||
|
||||
@@ -102,6 +102,7 @@ DEFAULT_STORAGE_CONTROLLER_ALLOWED_ERRORS = [
|
||||
# failing to connect to them.
|
||||
".*Call to node.*management API.*failed.*receive body.*",
|
||||
".*Call to node.*management API.*failed.*ReceiveBody.*",
|
||||
".*Failed to update node .+ after heartbeat round.*error sending request for url.*",
|
||||
# Many tests will start up with a node offline
|
||||
".*startup_reconcile: Could not scan node.*",
|
||||
# Tests run in dev mode
|
||||
|
||||
Reference in New Issue
Block a user