## Problem

Currently, the activity monitor in `compute_ctl` has a 500 ms polling interval. It also scans the list of current client backends, looking for an active one or the one with the most recent state change. This means we can miss short-lived connections. Yet, while testing this PR I realized that it's usually not a problem with pooled connections, as pgbouncer maintains connections to Postgres even though client connections are short-lived. We can still miss direct connections.

## Summary of changes

This commit introduces another way to detect user activity on the compute. It polls the sum of `active_time` and the sum of `sessions` from all non-system databases in `pg_stat_database` [1]. If the user runs some queries or just opens a direct connection, the sums will rise; if the user drops a database, they can go down, but that is still a change and will be detected as activity.

The new statistics-based logic seems to be working fine. Yet, after having it running for a couple of hours I've seen several odd cases with connections via pgbouncer:

1. Sometimes, if you run just `psql pooler_connstr -c 'select 1;'`, `active_time` is not updated immediately, and it may take a couple of dozen seconds. This doesn't seem critical, though.
2. With the same query via the pooler, `active_time` can be bumped a bit, then pgbouncer keeps the connection to Postgres open for ~10 minutes, then it disconnects, and `active_time` *could be* bumped a bit again. 'Could be' because I've seen it once, but it didn't reproduce on a second try.

I think this can create false positives (hopefully rare), where we will not suspend some computes because of a lagged statistics update OR because some non-user processes try to connect to user databases. Currently, we don't touch them outside of startup, and `postgres_exporter` is configured not to discover other databases, but this can change in the future.
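The statistics-based check described above can be sketched roughly as follows. This is only an illustration, not the code from this PR: `DbStats`, `ActivityTracker`, and `STATS_QUERY` are hypothetical names, and the list of databases excluded as "system" in the query is an assumption. The point is that any change in the aggregated counters, up or down, counts as activity.

```rust
/// Aggregated counters, as they might be read from `pg_stat_database`.
/// (Hypothetical type, used only for this sketch.)
#[derive(Debug, Clone, Copy, PartialEq)]
struct DbStats {
    /// Sum of `active_time` over non-system databases.
    active_time: f64,
    /// Sum of `sessions` over non-system databases.
    sessions: i64,
}

/// Hypothetical query. Which databases count as "system" is an assumption.
const STATS_QUERY: &str = "\
SELECT coalesce(sum(active_time), 0.0)::float8 AS active_time, \
       coalesce(sum(sessions), 0)::bigint AS sessions \
FROM pg_stat_database \
WHERE datname NOT IN ('postgres', 'template0', 'template1')";

/// Remembers the previous sample and reports whether the counters changed.
struct ActivityTracker {
    last_seen: Option<DbStats>,
}

impl ActivityTracker {
    fn new() -> Self {
        Self { last_seen: None }
    }

    /// Returns `true` if `current` differs from the previous sample.
    /// The first sample only establishes a baseline and reports no activity.
    fn observe(&mut self, current: DbStats) -> bool {
        let changed = match self.last_seen {
            Some(prev) => prev != current,
            None => false,
        };
        self.last_seen = Some(current);
        changed
    }
}
```

In a polling loop, each result of running `STATS_QUERY` would be passed to `observe`; a `true` return would then refresh the last-activity timestamp. Treating a decrease as activity mirrors the dropped-database case mentioned above.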
The new behavior is covered by the feature flag `activity_monitor_experimental`, which should be provided by the control plane via neondatabase/cloud#9171

[1] https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-PG-STAT-DATABASE-VIEW

Related to neondatabase/cloud#7966, neondatabase/cloud#7198
# Compute node tools
Postgres wrapper (`compute_ctl`) is intended to be run as a Docker entrypoint or as a systemd
`ExecStart` option. It will handle all the Neon specifics during compute node
initialization:
- `compute_ctl` accepts cluster (compute node) specification as a JSON file.
- Every start is a fresh start, so the data directory is removed and initialized again on each run.
- Next it will put configuration files into the `PGDATA` directory.
- Sync safekeepers and get commit LSN.
- Get `basebackup` from pageserver using the LSN returned by the previous step.
- Try to start `postgres` and wait until it is ready to accept connections.
- Check and alter/drop/create roles and databases.
- Hang waiting on the `postmaster` process to exit.
Also `compute_ctl` spawns two separate service threads:
- `compute-monitor` checks the last Postgres activity timestamp and saves it into the shared `ComputeNode`;
- `http-endpoint` runs a Hyper HTTP API server, which serves readiness and the last activity requests.
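The interaction between the two threads can be illustrated with a minimal sketch using only the standard library. `SharedState`, `record_activity`, `last_activity`, and `run_once` are hypothetical names standing in for the shared `ComputeNode` and its accessors. The real structure and locking strategy may differ, and no HTTP server is involved here.

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::SystemTime;

/// Hypothetical stand-in for the state shared between the service threads.
#[derive(Default)]
struct SharedState {
    last_active: Mutex<Option<SystemTime>>,
}

/// Monitor side: store an observed activity timestamp, keeping the most recent one.
fn record_activity(state: &SharedState, seen_at: SystemTime) {
    let mut last = state.last_active.lock().unwrap();
    let is_newer = match *last {
        Some(prev) => seen_at > prev,
        None => true,
    };
    if is_newer {
        *last = Some(seen_at);
    }
}

/// API side: read the last recorded activity timestamp, if any.
fn last_activity(state: &SharedState) -> Option<SystemTime> {
    *state.last_active.lock().unwrap()
}

/// Runs one "monitor" thread that writes a timestamp, then one "reader"
/// thread that returns what it sees in the shared state.
fn run_once(seen_at: SystemTime) -> Option<SystemTime> {
    let state = Arc::new(SharedState::default());

    let monitor_state = Arc::clone(&state);
    thread::spawn(move || record_activity(&monitor_state, seen_at))
        .join()
        .unwrap();

    let reader_state = Arc::clone(&state);
    thread::spawn(move || last_activity(&reader_state))
        .join()
        .unwrap()
}
```

Keeping only the newest timestamp means a late or out-of-order observation cannot move the recorded activity backwards.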
If the `AUTOSCALING` environment variable is set, `compute_ctl` will start the
`vm-monitor` located in [neon/libs/vm_monitor]. For VM compute nodes,
`vm-monitor` communicates with the VM autoscaling system. It coordinates
downscaling and requests immediate upscaling under resource pressure.
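A check like the one described above might look as follows. This is a sketch under the assumption that "is set" means "present with any value, including an empty one"; the exact condition used by `compute_ctl` may differ. The function names are hypothetical.

```rust
use std::env;
use std::ffi::OsStr;

/// Decides whether autoscaling was requested, given the raw variable value.
/// Assumption: any value, even an empty string, counts as "set".
fn autoscaling_requested(value: Option<&OsStr>) -> bool {
    value.is_some()
}

/// Reads `AUTOSCALING` from the process environment and applies the rule above.
fn autoscaling_enabled() -> bool {
    autoscaling_requested(env::var_os("AUTOSCALING").as_deref())
}
```

Separating the decision from the environment lookup keeps the rule testable without mutating process-wide state.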
Usage example:
compute_ctl -D /var/db/postgres/compute \
-C 'postgresql://cloud_admin@localhost/postgres' \
-S /var/db/postgres/specs/current.json \
-b /usr/local/bin/postgres
## Tests
Cargo formatter:
cargo fmt
Run tests:
cargo test
Clippy linter:
cargo clippy --all --all-targets -- -Dwarnings -Drust-2018-idioms
## Cross-platform compilation
Imagine that you are on macOS (x86) and you want a Linux GNU (`x86_64-unknown-linux-gnu` platform in `rust` terminology) executable.
### Using docker
You can use a throw-away Docker container (rustlang/rust image) for doing that:
docker run --rm \
-v $(pwd):/compute_tools \
-w /compute_tools \
-t rustlang/rust:nightly cargo build --release --target=x86_64-unknown-linux-gnu
or as a one-liner:
docker run --rm -v $(pwd):/compute_tools -w /compute_tools -t rust:latest cargo build --release --target=x86_64-unknown-linux-gnu
### Using rust native cross-compilation
Another way is to add the `x86_64-unknown-linux-gnu` target on your host system:
rustup target add x86_64-unknown-linux-gnu
Install the macOS cross-compiler toolchain:
brew tap SergioBenitez/osxct
brew install x86_64-unknown-linux-gnu
And finally run cargo build:
CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_LINKER=x86_64-unknown-linux-gnu-gcc cargo build --target=x86_64-unknown-linux-gnu --release