fix(compute_ctl): Wait for rsyslog longer and with backoff (#12002)

## Problem

https://github.com/neondatabase/neon/pull/11988 waits only for max
~200ms, so we still see failures, which self-resolve after several
operation retries.

## Summary of changes

Change it to waiting for at least 5 seconds, starting with 2 ms sleep
between iterations and x2 sleep on each next iteration. It could be that
it's not a problem with a slow `rsyslog` start, but a longer wait won't
hurt. If it won't start, we should debug why `inittab` doesn't start it,
or maybe there is another problem.
This commit is contained in:
Alexey Kondratov
2025-05-22 21:15:05 +02:00
committed by GitHub
parent e69ae739ff
commit cf81330fbc

View File

@@ -28,20 +28,37 @@ fn get_rsyslog_pid() -> Option<String> {
}
fn wait_for_rsyslog_pid() -> Result<String, anyhow::Error> {
for attempt in 1..=50 {
const MAX_WAIT: Duration = Duration::from_secs(5);
const INITIAL_SLEEP: Duration = Duration::from_millis(2);
let mut sleep_duration = INITIAL_SLEEP;
let start = std::time::Instant::now();
let mut attempts = 1;
for attempt in 1.. {
attempts = attempt;
match get_rsyslog_pid() {
Some(pid) => return Ok(pid),
None => {
if start.elapsed() >= MAX_WAIT {
break;
}
info!(
"rsyslogd is not running, attempt {}/50. Waiting...",
attempt
"rsyslogd is not running, attempt {}. Sleeping for {} ms",
attempt,
sleep_duration.as_millis()
);
std::thread::sleep(std::time::Duration::from_millis(2));
std::thread::sleep(sleep_duration);
sleep_duration *= 2;
}
}
}
Err(anyhow::anyhow!("rsyslogd did not start after 50 attempts"))
Err(anyhow::anyhow!(
"rsyslogd is not running after waiting for {} seconds and {} attempts",
attempts,
start.elapsed().as_secs()
))
}
// Restart rsyslogd to apply the new configuration.