Files
neon/libs/postgres_ffi
Heikki Linnakangas 2855c73990 Fix race condition after attaching tenant with branches. (#4170)
After tenant attach, there is a window where the child timeline is
loaded and accepts GetPage requests, but its parent is not. If a
GetPage request needs to traverse to the parent, it needs to wait for
the parent timeline to become active, or it might miss some records on
the parent timeline.

It's also possible that the parent timeline is active, but it hasn't
yet received all the WAL up to the branch point from the safekeeper.
This happens if a pageserver crashes soon after creating a timeline,
so that the WAL leading to the branch point has not yet been uploaded
to remote storage. After restart, the WAL will be re-streamed and
ingested from the safekeeper, but that takes a while. Because of that,
it's not enough to check that the parent timeline is active, we also
need to wait for the WAL to arrive on the parent timeline, just like
at the beginning of GetPage handling. We probably should change the
behavior at create_timeline so that a timeline can only be created
after all the WAL up to the branch point has been uploaded to remote
storage, but that's not currently the case and out of scope for this
PR (see github issue #4218).

@NanoBjorn encountered this while working on tenant migration. After
migrating a tenant with a parent and child branch, connecting to the
child branch failed with an error like:

```
FATAL:  "base/16385" is not a valid data directory
DETAIL:  File "base/16385/PG_VERSION" is missing.
```

This commit adds two tests that reproduce the bug, with slightly
different symptoms.
2023-05-13 10:44:11 +03:00
..

This module contains utilities for working with PostgreSQL file formats. It's a collection of structs that are auto-generated from the PostgreSQL header files using bindgen, and Rust functions to read and manipulate them.

There are also a bunch of constants in pg_constants.rs that are copied from various PostgreSQL headers, rather than auto-generated. They mostly should be auto-generated too, but that's a TODO.

The PostgreSQL on-disk file format is not portable across different CPU architectures and operating systems. It is also subject to change in each major PostgreSQL version. Currently, this module supports PostgreSQL v14 and v15: bindings and code that depends on them are version-specific. This code is organized in modules: postgres_ffi::v14 and postgres_ffi::v15 Version independend code is explicitly exported into shared postgres_ffi.

TODO: Currently, there is also some code that deals with WAL records in pageserver/src/waldecoder.rs. That should be moved into this module. The rest of the codebase should not have intimate knowledge of PostgreSQL file formats or WAL layout, that knowledge should be encapsulated in this module.