Current state with authentication.
Page server validates JWT token passed as a password during connection
phase and later when performing an action such as create branch tenant
parameter of an operation is validated to match one submitted in token.
To allow access from console there is dedicated scope: PageServerApi,
this scope allows access to all tenants. See code for access validation in:
PageServerHandler::check_permission.
Because we are in progress of refactoring of communication layer
involving wal proposer protocol, and safekeeper<->pageserver. Safekeeper
now doesn’t check token passed from compute, and uses “hardcoded” token
passed via environment variable to communicate with pageserver.
Compute postgres now takes token from environment variable and passes it
as a password field in pageserver connection. It is not passed through
settings because then user will be able to retrieve it using pg_settings
or SHOW ..
I’ve added basic test in test_auth.py. Probably after we add
authentication to remaining network paths we should enable it by default
and switch all existing tests to use it.
The pieces are:
base Connection
SendWal
ReplicationHandler
There are lots of other changes here:
- Put the replication reader in a background thread; this gets rid
of some hacks with nonblocking mode.
- Stop manually buffering input data; use BufReader instead.
- Use BytesMut a lot less; use Read/Write traits where possible.
Multiple fds writing to the same file doesn't work. One fd will
overwrite the output of the other fd. We were opening log files three
times (stdout, stderr, and slog).
The symptoms can be seen when the program panics; the final file will
have truncated or lost messages. After this change, all messages are
preserved. If panicking and logging are concurrent (and they definitely
can be), some of the messages may be interleaved in slightly
inconvenient ways.
File::try_clone() is essentially `dup` underneath, meaning the two will
share the same file offset.
- remove needless return
- remove needless format!
- remove a few more needless clone()
- from_str_radix(_, 10) -> .parse()
- remove needless reference
- remove needless `mut`
Also manually replaced a match statement with map_err() because after
clippy was done with it, there was almost nothing left in the match
expression.
This replaces the page server's "datadir" concept. The Page Server now
always works with a "Zenith Repository". When you initialize a new
repository with "zenith init", it runs initdb and loads an initial
basebackup of the freshly-created cluster into the repository, on "main"
branch. Repository can hold multiple "timelines", which can be given
human-friendly names, making them "branches". One page server simultaneously
serves all timelines stored in the repository, and you can have multiple
Postgres compute nodes connected to the page server, as long they all
operate on a different timeline.
There is a new command "zenith branch", which can be used to fork off
new branches from existing branches.
The repository uses the directory layout desribed as Repository format
v1 in https://github.com/zenithdb/rfcs/pull/5. It it *highly* inefficient:
- we never create new snapshots. So in practice, it's really just a base
backup of the initial empty cluster, and everything else is reconstructed
by redoing all WAL
- when you create a new timeline, the base snapshot and *all* WAL is copied
from the new timeline to the new one. There is no smarts about
referencing the old snapshots/wal from the ancestor timeline.
To support all this, this commit includes a bunch of other changes:
- Implement "basebackup" funtionality in page server. When you initialize
a new compute node with "zenith pg create", it connects to the page
server, and requests a base backup of the Postgres data directory on
that timeline. (the base backup excludes user tables, so it's not
as bad as it sounds).
- Have page server's WAL receiver write the WAL into timeline dir. This
allows running a Page Server and Compute Nodes without a WAL safekeeper,
until we get around to integrate that properly into the system. (Even
after we integrate WAL safekeeper, this is perhaps how this will operate
when you want to run the system on your laptop.)
- restore_datadir.rs was renamed to restore_local_repo.rs, and heavily
modified to use the new format. It now also restores all WAL.
- Page server no longer scans and restores everything into memory at startup.
Instead, when the first request is made for a timeline, the timeline is
slurped into memory at that point.
- The responsibility for telling page server to "callmemaybe" was moved
into Postgres libpqpagestore code. Also, WAL producer connstring cannot
be specified in the pageserver's command line anymore.
- Having multiple "system identifiers" in the same page server is no
longer supported. I repurposed much of that code to support multiple
timelines, instead.
- Implemented very basic, incomplete, support for PostgreSQL's Extended
Query Protocol in page_service.rs. Turns out that rust-postgres'
copy_out() function always uses the extended query protocol to send
out the command, and I'm using that to stream the base backup from the
page server.
TODO: I haven't fixed the WAL safekeeper for this scheme, so all the
integration tests involving safekeepers are failing. My plan is to modify
the safekeeper to know about Zenith timelines, too, and modify it to work
with the same Zenith repository format. It only needs to care about the
'.zenith/timelines/<timeline>/wal' directories.
Resolve some basic warnings from clippy:
- useless conversion to the same type
- redundant field names in struct initialization
- redundant single-component path imports