mirror of
https://github.com/neondatabase/neon.git
synced 2026-01-15 01:12:56 +00:00
initdb takes about 1 s. Our tests create and destroy a lot of tenants, so that adds up. Cache the initdb result to speed it up.

This is currently only enabled in tests. Out of caution, mostly. But also because when you reuse the initdb result, all the postgres clusters end up having the same system_identifier, which is supposed to be unique. It's not necessary for it to be unique for correctness, nothing critical relies on it and you can easily end up with duplicate system_identifiers in standalone PostgreSQL too, if you e.g. create a backup and restore it on a different system. But it is used in various checks to reduce the chance that you e.g. accidentally apply WAL belonging to a different cluster.

Because this is aimed at tests, there are a few things that might be surprising:

- The initdb cache directory is configurable, and can be outside the pageserver's repo directory. This allows reuse across different pageservers running on the same host. In production use, that'd be pointless, but our tests create a lot of pageservers.

- The cache is not automatically purged at start / shutdown. For production use, we'd probably want that, so that we'd pick up any changes in what an empty cluster looks like after a Postgres minor version upgrade, for example. But again tests create and destroy a lot of pageservers, so it's important to retain the cache.

- The locking on the cache directory relies purely on filesystem operations and atomic rename(). Using e.g. a rust Mutex() would be more straightforward, but that's not enough because the cache needs to be shared between different pageservers running on the same system.
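
The rename-based approach to sharing the cache between processes can be sketched roughly as below. This is a minimal illustration, not the code from this commit: the function name `get_or_populate`, the `key` naming, and the temporary-directory naming scheme are all hypothetical. It assumes a POSIX-like filesystem where rename() of a directory onto an existing non-empty directory fails, and that the populated entry is never empty.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::process;
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical helper: return the path of the cache entry `key` under
/// `cache_dir`, running `populate` to create it if it does not exist yet.
///
/// No in-process lock is taken. The entry is built in a private temporary
/// directory and published with a single rename(), so other processes
/// either see no entry or a complete one.
pub fn get_or_populate<F>(cache_dir: &Path, key: &str, populate: F) -> io::Result<PathBuf>
where
    F: FnOnce(&Path) -> io::Result<()>,
{
    let final_path = cache_dir.join(key);
    if final_path.is_dir() {
        // Fast path: some process already published this entry.
        return Ok(final_path);
    }
    fs::create_dir_all(cache_dir)?;

    // Build the entry under a name that is unlikely to collide with other
    // writers (pid plus timestamp; an assumption made for this sketch).
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos())
        .unwrap_or(0);
    let tmp_path = cache_dir.join(format!("{}.tmp.{}.{}", key, process::id(), nanos));
    fs::create_dir(&tmp_path)?;

    if let Err(e) = populate(&tmp_path) {
        let _ = fs::remove_dir_all(&tmp_path);
        return Err(e);
    }

    // Publish. If another process won the race, rename() fails because the
    // target is a non-empty directory; discard our copy and use theirs.
    match fs::rename(&tmp_path, &final_path) {
        Ok(()) => Ok(final_path),
        Err(_) if final_path.is_dir() => {
            let _ = fs::remove_dir_all(&tmp_path);
            Ok(final_path)
        }
        Err(e) => {
            let _ = fs::remove_dir_all(&tmp_path);
            Err(e)
        }
    }
}
```

A caller would pass a closure that runs initdb into the given directory. Two pageservers racing on the same key may both do the work, but only one result is kept, and neither ever observes a half-written entry.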