rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-05 12:32:54 +00:00

Files

John Spray 63213fc814 storage controller: scheduling optimization for sharded tenants (#7181 )

## Problem

- When we scheduled locations, we were doing it without any context
about other shards in the same tenant
- After a shard split, there wasn't an automatic mechanism to migrate
the attachments away from the split location
- After a shard split and the migration away from the split location,
there wasn't an automatic mechanism to pick new secondary locations so
that the end state has no concentration of locations on the nodes where
the split happened.

Partially completes: https://github.com/neondatabase/neon/issues/7139

## Summary of changes

- Scheduler now takes a `ScheduleContext` object that can be populated
with information about other shards
- During tenant creation and shard split, we incrementally build up the
ScheduleContext, updating it for each shard as we proceed.
- When scheduling new locations, the ScheduleContext is used to apply a
soft anti-affinity to nodes where a tenant already has shards.
- The background reconciler task now has an extra phase `optimize_all`,
which runs only if the primary `reconcile_all` phase didn't generate any
work. The separation is that `reconcile_all` is needed for availability,
but optimize_all is purely "nice to have" work to balance work across
the nodes better.
- optimize_all calls into two new TenantState methods called
optimize_attachment and optimize_secondary, which seek out opportunities
to improve placment:
- optimize_attachment: if the node where we're currently attached has an
excess of attached shard locations for this tenant compared with the
node where we have a secondary location, then cut over to the secondary
location.
- optimize_secondary: if the node holding our secondary location has an
excessive number of locations for this tenant compared with some other
node where we don't currently have a location, then create a new
secondary location on that other node.
- a new debug API endpoint is provided to run background tasks
on-demand. This returns a number of reconciliations in progress, so
callers can keep calling until they get a `0` to advance the system to
its final state without waiting for many iterations of the background
task.

Optimization is run at an implicitly low priority by:
- Omitting the phase entirely if reconcile_all has work to do
- Skipping optimization of any tenant that has reconciles in flight
- Limiting the total number of optimizations that will be run from one
call to optimize_all to a constant (currently 2).

The idea of that low priority execution is to minimize the operational
risk that optimization work overloads any part of the system. It happens
to also make the system easier to observe and debug, as we avoid running
large numbers of concurrent changes. Eventually we may relax these
limitations: there is no correctness problem with optimizing lots of
tenants concurrently, and optimizing multiple shards in one tenant just
requires housekeeping changes to update ShardContext with the result of
one optimization before proceeding to the next shard.

2024-03-28 18:48:52 +00:00

attachment_service

storage controller: scheduling optimization for sharded tenants (#7181 )

2024-03-28 18:48:52 +00:00

src

pageserver: check for new image layers based on ingested WAL (#7230 )

2024-03-28 17:44:55 +00:00

.gitignore

Revert "Use actual temporary dir for pageserver unit tests"

2023-01-19 20:16:56 +01:00

Cargo.toml

storage_controller: periodic pageserver heartbeats (#7092 )

2024-03-14 15:21:36 +00:00

README.md

neon_local: improved docs and fix wrong connstr (#6954 )

2024-03-01 14:54:07 -05:00

safekeepers.conf

Separate mgmt and libpq authentication configs in pageserver. (#3773 )

2023-03-15 13:52:29 +02:00

simple.conf

tests: enable multiple pageservers in neon_local and neon_fixture (#5231 )

2023-09-08 16:19:57 +01:00

README.md

Control Plane and Neon Local

This crate contains tools to start a Neon development environment locally. This utility can be used with the cargo neon command.

Example: Start with Postgres 16

To create and start a local development environment with Postgres 16, you will need to provide --pg-version flag to 3 of the start-up commands.

cargo neon init --pg-version 16
cargo neon start
cargo neon tenant create --set-default --pg-version 16
cargo neon endpoint create main --pg-version 16
cargo neon endpoint start main

Example: Create Test User and Database

By default, cargo neon starts an endpoint with cloud_admin and postgres database. If you want to have a role and a database similar to what we have on the cloud service, you can do it with the following commands when starting an endpoint.

cargo neon endpoint create main --pg-version 16 --update-catalog true
cargo neon endpoint start main --create-test-user true

The first command creates neon_superuser and necessary roles. The second command creates test user and neondb database. You will see a connection string that connects you to the test user after running the second command.