rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-07-05 13:10:37 +00:00

Files

Christian Schwarz dd6990567f walredo: apply_batch_postgres: get a backtrace whenever it encounters an error (#5541)

For 2 weeks we've seen rare, spurious, non-reproducible page
reconstruction failures with PG16 in prod.

One of the commits we deployed this week was

    commit fc467941f9
    Author: Joonas Koivunen <joonas@neon.tech>
    Date:   Wed Oct 4 16:19:19 2023 +0300

        walredo: log retryed error (#546)

With the logs from that commit, we learned that some read() or write()
system call that walredo does fails with `EAGAIN`, aka
`Resource temporarily unavailable (os error 11)`.
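As a rough illustration of the error described above, the following std-only sketch builds an `std::io::Error` from raw OS error code 11 and checks for it. The constant `EAGAIN_LINUX` and the helpers `eagain_error` and `is_eagain` are hypothetical names introduced here, not part of walredo; the numeric value 11 is taken from the log message and is assumed to be the Linux value of `EAGAIN`.

```rust
use std::io;

/// Raw OS error code from the log message above.
/// Assumption: this is the Linux value of EAGAIN; other platforms may differ.
const EAGAIN_LINUX: i32 = 11;

/// Build the kind of error a failing read() or write() would surface.
fn eagain_error() -> io::Error {
    io::Error::from_raw_os_error(EAGAIN_LINUX)
}

/// Check whether an I/O error carries the raw code seen in the logs.
fn is_eagain(e: &io::Error) -> bool {
    e.raw_os_error() == Some(EAGAIN_LINUX)
}

fn main() {
    let e = eagain_error();
    // The Display output includes the OS error number, but says nothing
    // about which call site produced the error.
    println!("{}", e);
    println!("is_eagain = {}", is_eagain(&e));
}
```

The printed message identifies the error code only, which is why a backtrace is needed to find the failing call.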

But we have no idea where exactly in the code we get back that error.

So, use anyhow instead of fake `std::io::Error`s as an easy way to get
a backtrace when the error happens, and change the logging to print
that backtrace (i.e., use `{:?}` instead of
`utils::error::report_compact_sources(e)`).

The `WalRedoError` type had to go because we add additional `.context()`
further up the call chain before we `{:?}`-print it. That additional
`.context()` further up doesn't see that there's already an
`anyhow::Error` inside the `WalRedoError::ApplyWalRecords` variant, and
hence captures another backtrace and prints that one on `{:?}`-print
instead of the original one inside `WalRedoError::ApplyWalRecords`.
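The double-capture problem described above can be modelled without anyhow. The sketch below is a std-only stand-in, not the walredo code: it uses `std::panic::Location` with `#[track_caller]` in place of a real backtrace, and the names `Traced`, `Wrapper`, `failing_io`, `direct`, and `wrapped` are all hypothetical. `Traced` plays the role of `anyhow::Error`, and `Wrapper` plays the role of `WalRedoError::ApplyWalRecords`.

```rust
use std::error::Error;
use std::fmt;
use std::panic::Location;

/// Hypothetical stand-in for `anyhow::Error`: remembers where it was created.
#[derive(Debug)]
struct Traced {
    msg: String,
    origin: &'static Location<'static>,
}

impl Traced {
    /// Record the caller's source location as the origin.
    #[track_caller]
    fn new(msg: impl Into<String>) -> Self {
        Traced { msg: msg.into(), origin: Location::caller() }
    }

    /// Adding context to a `Traced` keeps the original origin.
    fn context(self, ctx: &str) -> Self {
        Traced { msg: format!("{}: {}", ctx, self.msg), origin: self.origin }
    }

    /// Adding context to an opaque error records a NEW origin, because the
    /// wrapper does not expose the origin stored inside it.
    #[track_caller]
    fn from_opaque(err: &dyn Error, ctx: &str) -> Self {
        Traced { msg: format!("{}: {}", ctx, err), origin: Location::caller() }
    }
}

/// Hypothetical wrapper type that hides the `Traced` it contains.
#[derive(Debug)]
struct Wrapper(Traced);

impl fmt::Display for Wrapper {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0.msg)
    }
}

impl Error for Wrapper {}

/// The place where the error really happens.
fn failing_io() -> Traced {
    Traced::new("os error 11")
}

/// No wrapper: the context call preserves the origin inside `failing_io`.
fn direct() -> Traced {
    failing_io().context("apply batch")
}

/// With a wrapper: the origin now points into this function instead.
fn wrapped() -> Traced {
    let w = Wrapper(failing_io());
    Traced::from_opaque(&w, "apply batch")
}

fn main() {
    let origin = failing_io().origin.line();
    let d = direct();
    let w = wrapped();
    println!("{} (origin preserved: {})", d.msg, d.origin.line() == origin);
    println!("{} (origin preserved: {})", w.msg, w.origin.line() == origin);
}
```

In this model both paths produce the same message, but only the unwrapped path still points at the place where the error was created, which is the motivation for removing the wrapper type.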

If we ever switch back to `report_compact_sources`, we should make sure
the error message itself uniquely identifies each place where we
return an error.

2023-10-13 14:08:23 +00:00

bench_layer_map.rs

refactor: remove is_incremental=true for ImageLayers footgun (#5061)

2023-08-22 22:12:05 +03:00

bench_walredo.rs

walredo: apply_batch_postgres: get a backtrace whenever it encounters an error (#5541)

2023-10-13 14:08:23 +00:00

large-layer-map-layernames.txt

Add layer map search benchmark (#2957)

2022-11-30 13:48:07 -05:00

odd-brook-layernames.txt

Add layer map search benchmark (#2957)