neon/pageserver
Christian Schwarz 74df4a7b76 fix(walredo): walredo process that causes errors is never killed
Before this PR, if walredo failed, we would still update `last_redo_at`.
This meant that `maybe_quiesce()` would never kill that process, even though
something is clearly wrong.

However, we don't want to kill the process immediately on each redo failure,
because the failure could be caused by corrupted or malicious data rather
than by a broken process.

So, change `maybe_quiesce()` to determine inactivity based on the
`last_redo_at` of the last *successful* redo.
The result is that a transient redo failure won't cause a kill.
If there are *only* redo failures, we'll spawn & kill the walredo process
at a frequency of 1/(10*compaction_period).
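
A minimal sketch of the idea, not the actual pageserver code: `RedoManager`,
`record_redo`, `idle_since`, and `process_running` are hypothetical names, and
time is passed in explicitly so the behavior is easy to check. The sketch also
assumes that spawning a process starts the idle clock, so a process that only
ever fails is still killed once the timeout elapses.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the walredo process manager.
struct RedoManager {
    /// Whether a (simulated) walredo process is currently alive.
    process_running: bool,
    /// Start of the current idle period: the last *successful* redo,
    /// or the spawn time if nothing has succeeded yet (assumption).
    idle_since: Instant,
}

impl RedoManager {
    fn new(now: Instant) -> Self {
        RedoManager { process_running: false, idle_since: now }
    }

    /// Record the outcome of one redo request handled at time `now`.
    fn record_redo(&mut self, now: Instant, succeeded: bool) {
        if !self.process_running {
            // Lazily (re)spawn the process; spawning resets the idle clock.
            self.process_running = true;
            self.idle_since = now;
        }
        if succeeded {
            // Only a successful redo counts as activity.
            self.idle_since = now;
        }
        // A failed redo deliberately leaves `idle_since` untouched.
    }

    /// Kill the process if nothing has succeeded for `idle_timeout`.
    /// Returns true if the process was killed.
    fn maybe_quiesce(&mut self, now: Instant, idle_timeout: Duration) -> bool {
        if !self.process_running {
            return false;
        }
        if now.saturating_duration_since(self.idle_since) >= idle_timeout {
            self.process_running = false;
            return true;
        }
        false
    }
}
```

With a 10-second timeout, a success at t=0 followed by a failure at t=8 leads
to a kill at t=11 in this sketch, because the failure does not reset the idle
clock. If the failure did count as activity, the process would survive that
check.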
2024-02-01 14:38:53 +01:00