mirror of
https://github.com/neondatabase/neon.git
synced 2026-01-07 05:22:56 +00:00
## Problem I noticed when onboarding lots of tenants that the AZ scheduling violation stat was climbing, before falling later as optimisations happened. This was happening because we first add the tenant with PlacementPolicy::Secondary, and then later go to PlacementPolicy::Attached, and the scheduler's behavior led to a bad AZ choice: 1. Create a secondary location in the non-preferred AZ 2. Upgrade to Attached where we promote that non-preferred-AZ location to attached and then create another secondary 3. Optimiser later realises we're in the wrong AZ and moves us ## Summary of changes - Extend some logging to give more information about AZs - When scheduling secondary location in PlacementPolicy::Secondary, select it as if we were attached: in this mode, our business goal is to have a warm pageserver location that we can make available as attached quickly if needed, therefore we want it to be in the preferred AZ. - Make optimize_secondary logic the same, so that it will consider a secondary location in the preferred AZ to be optimal when in PlacementPolicy::Secondary - When transitioning to from PlacementPolicy::Attached(N) to PlacementPolicy::Secondary, instead of arbitrarily picking a location to keep, prefer to keep the location in the preferred AZ