Part of #7497, closes #8817. ## Problem See #8817. ## Summary of changes **compute_ctl** - Renew lsn lease as soon as `/configure` updates pageserver_connstr, use `state_changed` Condvar for synchronization. **pageserver** As mentioned in https://github.com/neondatabase/neon/issues/8817#issuecomment-2315768076, we still want some permanent error reported if a lease cannot be granted. By considering attachment mode and the added `lsn_lease_deadline` when processing lease requests, we can also bound the case of bad requests to a very short period after migration/restart. - Refactor https://github.com/neondatabase/neon/pull/9024 and move `lsn_lease_deadline` to `AttachedTenantConf` so timeline can easily access it. - Have separate HTTP `init_lsn_lease` and libpq `renew_lsn_lease` API. - Always do LSN verification for the initial HTTP lease request. - LSN verification for the renewal is **still done** when tenants are not in `AttachedSingle` and we have pass the `lsn_lease_deadline`, which give plenty of time for compute to renew the lease. **neon_local** - add and call `timeline_init_lsn_lease` mgmt_api at static endpoint start. The initial lsn lease http request is sent when we run `cargo neon endpoint start <static endpoint>`. ## Testing - Extend `test_readonly_node_gc` to do pageserver restarts and migration. ## Future Work - The control plane should make the initial lease request through HTTP when creating a static endpoint. This is currently only done in `neon_local`. Signed-off-by: Yuchen Liang <yuchen@neon.tech>
Compute node tools
Postgres wrapper (compute_ctl) is intended to be run as a Docker entrypoint or as a systemd
ExecStart option. It will handle all the Neon specifics during compute node
initialization:
compute_ctlaccepts cluster (compute node) specification as a JSON file.- Every start is a fresh start, so the data directory is removed and initialized again on each run.
- Next it will put configuration files into the
PGDATAdirectory. - Sync safekeepers and get commit LSN.
- Get
basebackupfrom pageserver using the returned on the previous step LSN. - Try to start
postgresand wait until it is ready to accept connections. - Check and alter/drop/create roles and databases.
- Hang waiting on the
postmasterprocess to exit.
Also compute_ctl spawns two separate service threads:
compute-monitorchecks the last Postgres activity timestamp and saves it into the sharedComputeNode;http-endpointruns a Hyper HTTP API server, which serves readiness and the last activity requests.
If AUTOSCALING environment variable is set, compute_ctl will start the
vm-monitor located in [neon/libs/vm_monitor]. For VM compute nodes,
vm-monitor communicates with the VM autoscaling system. It coordinates
downscaling and requests immediate upscaling under resource pressure.
Usage example:
compute_ctl -D /var/db/postgres/compute \
-C 'postgresql://cloud_admin@localhost/postgres' \
-S /var/db/postgres/specs/current.json \
-b /usr/local/bin/postgres
State Diagram
Computes can be in various states. Below is a diagram that details how a compute moves between states.
%% https://mermaid.js.org/syntax/stateDiagram.html
stateDiagram-v2
[*] --> Empty : Compute spawned
Empty --> ConfigurationPending : Waiting for compute spec
ConfigurationPending --> Configuration : Received compute spec
Configuration --> Failed : Failed to configure the compute
Configuration --> Running : Compute has been configured
Empty --> Init : Compute spec is immediately available
Empty --> TerminationPending : Requested termination
Init --> Failed : Failed to start Postgres
Init --> Running : Started Postgres
Running --> TerminationPending : Requested termination
TerminationPending --> Terminated : Terminated compute
Failed --> [*] : Compute exited
Terminated --> [*] : Compute exited
Tests
Cargo formatter:
cargo fmt
Run tests:
cargo test
Clippy linter:
cargo clippy --all --all-targets -- -Dwarnings -Drust-2018-idioms
Cross-platform compilation
Imaging that you are on macOS (x86) and you want a Linux GNU (x86_64-unknown-linux-gnu platform in rust terminology) executable.
Using docker
You can use a throw-away Docker container (rustlang/rust image) for doing that:
docker run --rm \
-v $(pwd):/compute_tools \
-w /compute_tools \
-t rustlang/rust:nightly cargo build --release --target=x86_64-unknown-linux-gnu
or one-line:
docker run --rm -v $(pwd):/compute_tools -w /compute_tools -t rust:latest cargo build --release --target=x86_64-unknown-linux-gnu
Using rust native cross-compilation
Another way is to add x86_64-unknown-linux-gnu target on your host system:
rustup target add x86_64-unknown-linux-gnu
Install macOS cross-compiler toolchain:
brew tap SergioBenitez/osxct
brew install x86_64-unknown-linux-gnu
And finally run cargo build:
CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_LINKER=x86_64-unknown-linux-gnu-gcc cargo build --target=x86_64-unknown-linux-gnu --release