Add a condition to switch walreceiver connection to safekeeper that is
located in the same availability zone. Switch happens when commit_lsn of
a candidate is not less than commit_lsn from the active connection. This
condition is expected not to trigger instantly, because commit_lsn of a
current connection is usually greater than commit_lsn of updates from
the broker. That means that if WAL is written continuously, switch can
take a lot of time, but it should happen eventually.
Now protoc 3.15+ is required for building neon.
Fixes https://github.com/neondatabase/neon/issues/3200
- Add support for splitting async postgres_backend into read and write halfes.
Safekeeper needs this for bidirectional streams. To this end, encapsulate
reading-writing postgres messages to framed.rs with split support without any
additional changes (relying on BufRead for reading and BytesMut out buffer for
writing).
- Use async postgres_backend throughout safekeeper (and in proxy auth link
part).
- In both safekeeper COPY streams, do read-write from the same thread/task with
select! for easier error handling.
- Tidy up finishing CopyBoth streams in safekeeper sending and receiving WAL
-- join split parts back catching errors from them before returning.
Initially I hoped to do that read-write without split at all, through polling
IO:
https://github.com/neondatabase/neon/pull/3522
However that turned out to be more complicated than I initially expected
due to 1) borrow checking and 2) anon Future types. 1) required Rc<Refcell<...>>
which is Send construct just to satisfy the checker; 2) can be workaround with
transmute. But this is so messy that I decided to leave split.
There will be different scopes for those two, so authorization code should be different.
The `check_permission` function is now not in the shared library. Its implementation
is very similar to the one which will be added for Safekeeper. In fact, we may reuse
the same existing root-like 'PageServerApi' scope, but I would prefer to have separate
root-like scopes for services.
Also, generate_management_token in tests is generate_pageserver_token now.
With the ability to pass commit_lsn. This allows to perform project WAL recovery
through different (from the original) set of safekeepers (or under different
ttid) by
1) moving WAL files to s3 under proper ttid;
2) explicitly creating timeline on safekeepers, setting commit_lsn to the
latest point;
3) putting the lastest .parital file to the timeline directory on safekeepers, if
desired.
Extend test_s3_wal_replay to exersise this behaviour.
Also extends timeline_status endpoint to return postgres information.
Creates new `pageserver_api` and `safekeeper_api` crates to serve as the
shared dependencies. Should reduce both recompile times and cold compile
times.
Decreases the size of the optimized `neon_local` binary: 380M -> 179M.
No significant changes for anything else (mostly as expected).
Part of the general work on improving pageserver logs.
Brief summary of changes:
* Remove `ApiError::from_err`
* Remove `impl From<anyhow::Error> for ApiError`
* Convert `ApiError::{BadRequest, NotFound}` to use `anyhow::Error`
* Note: `NotFound` has more verbose formatting because it's more
likely to have useful information for the receiving "user"
* Explicitly convert from `tokio::task::JoinError`s into
`InternalServerError`s where appropriate
Also note: many of the places where errors were implicitly converted to
500s have now been updated to return a more appropriate error. Some
places where it's not yet possible to distinguish the error types have
been left as 500s.
* `control_plane` crate (used by `neon_local`) now parses an `auth_enabled` bool for each Safekeeper
* If auth is enabled, a Safekeeper is passed a path to a public key via a new command line argument
* Added TODO comments to other places needing auth
Separate task is launched for each timeline and stopped when timeline doesn't
need offloading. Decision who offloads is done through etcd leader election;
currently there is no pre condition for participating, that's a TODO.
neon_local and tests infrastructure for remote storage in safekeepers added,
along with the test itself.
ref #1009
Co-authored-by: Anton Shyrabokau <ahtoxa@Antons-MacBook-Pro.local>
* There is no auth in Safekeeper HTTP at all currently,
so simply calling `check_permission` is not enough.
* There are no checks of Safekeeper still working with the data,
as "still working" is burry now: a timeline may be "active"
while there are no compute nodes and all data is propagated.
* Still, callmemaybe is deactivated, and timeline is removed from the
internal map. It can easily sneak back in case of race conditions
and implicit creations, though.
Make it remember when timeline starts in general and on this safekeeper in
particular (the point might be later on new safekeeper replacing failed one).
Bumps control file and walproposer protocol versions.
While protocol is bumped, also add safekeeper node id to
AcceptorProposerGreeting.
ref #1561
Remove when it is consumed by all of 1) pageserver (remote_consistent_lsn) 2)
safekeeper peers 3) s3 WAL offloading.
In test s3 offloading for now is mocked by directly bumping s3_wal_lsn.
ref #1403