This patch started as an effort to support CLI working against remote
pageserver, but turned into a pretty big refactoring.
* CLI now does not look into repository files directly. New commands
'branch_create' and 'identify_system' were introduced into page_service to
support that.
* Branch management that was scattered between local_env and
zenith/main.rs is moved into pageserver/branches.rs. That code could better fit
in Repository/Timeline impl, but I'll leave that for a different patch.
* All tests-related code from local_env went into integration_tests/src/lib.rs as an
extension to PostgresNode trait.
* Paths-generating functions were concentrated around corresponding config
types (LocalEnv and PageserverConf).
When creating a new branch, we copied all WAL from the source timeline
to the new one, and it was being picked up and digested into the
repository on first use of the timeline. Fix by copying the WAL only
up to the branch's starting point.
We should probably move the branch-creation code from the CLI to page
server itself - that's what I was starting to hack on when I noticed this
bug - but let's fix this first.
Add a regression test. To test multiple branches, enhance the python
test fixture to manage multiple running Postgres instances. Also, for
convenience, add a function to the postgres fixture to open a connection
to the server with psycopg2.
This isn't just cosmetic, this also fixes one bug: the code in
parse_point_in_time() function used str::parse::<u64>() to parse the
parts of the LSN string (e.g. 0/1A2B3C4D). That's wrong, because the
LSN consists of hex digits, not base-10.
Commit 9ece1e863d used `slice.fill`, which isn't available until Rust
v1.50.0. I have 1.48.0 installed, so it was failing to compile for me.
We haven't really standardized on any particular Rust version, and if
there's a good feature we need in a recent version, let's bump up the
minimum requirement. But this is simple enough to work around.
Turn WalRedoManager into an abstract trait, so that it can be easily
mocked in unit tests.
One change here is that the WAL redo manager is no longer tied to a
specific zenith timeline. It didn't do anything with that information
aside from using it in the dummy datadir's name. We could use any
random string for that purpose, it's just to prevent two WAL redo
managers from stepping over each other. But this commit actually
changes things so that all timelines use the same WAL redo manager, so
that's not necessary. We will probably want to maintain a pool of WAL
redo processes in the future, but for now let's keep it simple.
In the passing, fix some comments.
We used to create them under .zenith/.zenith/<timelineid>. The double
.zenith was clearly not intentional. Change it to
.zenith/timelines/<timelineid>.
Fixes https://github.com/zenithdb/zenith/issues/127
It's a bit annoying that the .zenith state can show up in multiple
places, but since this is how the regression tests run if you launch
them from the git root directory, ignore this one too.
The module comment should use "//!" instead of "///". Otherwise, it is
considered to apply to the *next* thing, in this case the "use" statement
that follows, not the file as whole. "cargo fmt" revealed this by insisting
to move the "use crate::pg_constants" line to before the comment.
Multiple fds writing to the same file doesn't work. One fd will
overwrite the output of the other fd. We were opening log files three
times (stdout, stderr, and slog).
The symptoms can be seen when the program panics; the final file will
have truncated or lost messages. After this change, all messages are
preserved. If panicking and logging are concurrent (and they definitely
can be), some of the messages may be interleaved in slightly
inconvenient ways.
File::try_clone() is essentially `dup` underneath, meaning the two will
share the same file offset.
If timeline doesn't have a valid "last valid LSN", refuse WAL streaming.
The previous behavior was to start streaming from the very beginning of
time. That was needed to support bootstrapping the page server with no
data at all (see commit bd606ab37a), but we no longer do that.
Somehow I never learned this part correctly: relative imports use the
syntax "import .file" for a file sitting in the same directory.
This error wasn't terribly obvious, but the Pylance linter is yelling at
me so I'll fix it now before anyone else notices.
I didn't think this mattered, but it does: if you add a dependency to
zenith_utils, but forget to request a feature you need, the crate will
build from the workspace root, but not by itself.
It's probably better to pull in the whole dependency tree.
This leaves one problem unsolved: the missing feature above will now be
a latent bug. If that feature gets removed later by other crates, and
then the workspace_hack Cargo.toml is updated, this missing feature will
become a build failure.
Add fixes suggested in code review.
In a previous commit, I changed the NodeId field order and types to try
to preserve the exact serialization that was happening. Unfortunately,
that serialization was incorrect and the original struct was mostly
correct.
Change uuid to be a [u8; 16] as it was intended to be a byte array; that
will clearly indicate to serde serializers that no endian swaps will
ever be needed.
Replace XLogRecPtr with Lsn in wal_service.rs .
This removes the last use of XLogSegmentOffset and XLByteToSeg, so
delete them. (replaced by Lsn::segment_offset and Lsn::segment_number.)
The C memory representation is only needed if we want to guarantee the
same memory layout as some other program. Since we're using serde to
serialize these data structures, we can let the compiler do what it
wants.
This version validates on every call that our result is exactly the same
as the previous result.
NodeId is a strange corner case: one field is serialized little-endian
and one field is serialized big-endian. Hopefully we can fix that in the
future.
This module adds two traits that implement bincode-based serialization.
BeSer implements methods for big-endian encoding/decoding.
LeSer implements methods for little-endian encoding/decoding.
Right now, the BeSer and LeSer methods have the same names, meaning you
can't `use` them both at the same time. This is intended to be a safety
mechanism: mixing big-endian and little-endian encoding in the same file
is error-prone. There are ways around this, but the easiest fix is to
put the big-endian code and little-endian code in different files or
submodules.
This is a big async -> sync conversion. Most of it is a pretty
straightforward conversion of removing `async` and `.await` and swapping
in the right std modules.
I didn't find a thread-blocking version of `Notify` so I wrote one, and
then realized that there was already a Mutex being used there, so I
deleted my Notify and just used Condvar instead.
There is one part that seems odd to me: in `handle_start_replication`
there is a place where the previous code was doing a non-blocking read;
there is no TcpStream::try_read() so I fell back on manually flipping
the socket to non-blocking mode and then back again. This seems pretty
gross, but I'm not sure exactly what to replace this with: a background
thread? Extract the fd and run select() on it to first test if it's
readable?
Switch over to a newer version of rust-postgres PR752. A few
minor changes are required:
- PgLsn::UNDEFINED -> PgLsn::from(0)
- PgTimestamp -> SystemTime