It's created once early in server startup, after parsing the
command-line options, and never modified afterwards. To simplify
things, pass it around as static ref, instead of making copies in all
the different structs. We still pass around a reference to it, rather
than putting it in a global variable, to allow unit testing with
different configs in the same process.
I'm going nuts with the pattern:
let k = iter.key().unwrap();
buf.clear();
buf.extend_from_slice(&k);
let key = CacheKey::unpack(&mut buf);
Introduce helper functions to convert a CacheKey into BytesMut, and
from [u8] into CacheKey. Reduces the boilerplate code a lot.
The helper functions create a new BytesMut on each call, whereas the old
coding could reuse a single BytesMut, so this could be a bit slower. I
haven't tried measuring it, but at least it's not immediately noticeable,
and readability is much more imporatant at this point. We can optimize
later
This isn't very exciting with the current RocksDB implementation, because
it doesn't care about the PostgreSQL 1 GB segment boundaries at all.
But I think we will care about this in the future, and more tests is
generally better anyway.
Commit 746f667311 added the 'workdir' field and the get_*_path()
functions, with the idea that we cd into the directory at page server
startup, so that the get_*_path() functions can always return paths
relative to '.', but 'workdir' shows the original path to it. Change it
so that 'conf.workdir' is always set to '.', too, and the get_*_path()
functions include 'workdir' in the returned paths. Why? Because that
allows writing unit tests without changing the current directory.
When I was working on commit 97992226d3, I initially wrote the test so
that it changed the current working directory, just like commit 746f667311
did. But that was problematic, when I tried to add another unit test that
*also* wants to change the current working dir, because they could then
not run concurrently. In fact, they could not even run serially, unless
the current directory was carefully reset after the test. So it is better
to avoid changing the current directory in tests.
This patch started as an effort to support CLI working against remote
pageserver, but turned into a pretty big refactoring.
* CLI now does not look into repository files directly. New commands
'branch_create' and 'identify_system' were introduced into page_service to
support that.
* Branch management that was scattered between local_env and
zenith/main.rs is moved into pageserver/branches.rs. That code could better fit
in Repository/Timeline impl, but I'll leave that for a different patch.
* All tests-related code from local_env went into integration_tests/src/lib.rs as an
extension to PostgresNode trait.
* Paths-generating functions were concentrated around corresponding config
types (LocalEnv and PageserverConf).
Currently, truncation is implemented in the RocksDB repository by storing
a special sentinel entry for each page that was truncated away. Hide that
implementation detail better in the abstract Repository interface, so
that caller doesn't need to construct the special sentinel WAL record.
While we're at it, refactor the CacheEntryContent struct to an enum.
This moves things around:
- The PageCache is split into two structs: Repository and Timeline. A
Repository holds multiple Timelines. In order to get a page version,
you must first get a reference to the Repository, then the Timeline
in the repository, and finally call the get_page_at_lsn() function
on the Timeline object. This sounds complicated, but because each
connection from a compute node, and each WAL receiver, only deals
with one timeline at a time, the callers can get the reference to
the Timeline object once and hold onto it. The Timeline corresponds
most closely to the old PageCache object.
- Repository and Timeline are now abstract traits, so that we can
support multiple implementations. I don't actually expect us to have
multiple implementations for long. We have the RocksDB
implementation now, but as soon as we have a different
implementation that's usable, I expect that we will retire the
RocksDB implementation. But I think this abstraction works as good
documentation in any case: it's now easier to see what the interface
for storing and loading pages from the repository is, by looking at
the Repository/Timeline traits. They abstract traits are in
repository.rs, and the RocksDB implementation of them is in
repository/rocksdb.rs.
- page_cache.rs is now a "switchboard" to get a handle to the
repository. Currently, the page server can only handle one
repository at a time, so there isn't much there, but in the future
we might do multi-tenancy there.