This clarifies, I hope, the abstraction boundary between Repository and ObjectRepository. The ObjectTag struct was a mix of objects that could be accessed directly through the public Timeline interface, and objects that were created and used internally by the ObjectRepository implementation and were not supposed to be accessed directly by callers. With RelishTag separate from ObjectTag, the distinction is clearer: RelishTag is used in the public interface, while ObjectTag is used internally between object_repository.rs and object_store.rs, and it contains the internal metadata object types.

One awkward thing with the ObjectTag struct was that the Repository implementation had to distinguish between ObjectTags for relations, whose sizes it had to track, and other ObjectTags that were used to store "blobs". With RelishTag, some relishes are considered "non-blocky" and are stored as blobs, while for the others the Repository implementation is expected to track their sizes. I'm not 100% happy with how RelishTag captures that either: it just knows that some relish kinds are blocky and some non-blocky, and there's an is_block() function to check that. But this does enable size tracking for SLRUs, allowing us to treat them more like relations.

This changes the way SLRUs are stored in the repository. Each SLRU segment, e.g. "pg_clog/0000" and "pg_clog/0001", is now handled as a separate relish. This removes the need for the SLRU-specific put_slru_truncate() function in the Timeline trait; SLRU truncation is now handled by calling put_unlink() on the segment. This is more in line with how PostgreSQL stores SLRUs and handles their truncation. The SLRUs are "blocky", so they are accessed one 8 kB page at a time, and the repository tracks their size.

I considered an alternative design where we would treat each SLRU segment as non-blocky and just store the whole file as one blob. Each SLRU segment is at most 256 kB in size (32 pages of 8 kB each), which isn't that large, so that might have worked fine, too. One reason I didn't do that is that it seems better to keep the WAL redo routines as close as possible to the PostgreSQL routines. It doesn't matter much in the repository, though; we have to track the size for relations anyway, so there's not much difference in whether we also do it for SLRUs.

While working on this, I noticed that the CLOG and MultiXact redo code did not handle wraparound correctly. We need to fix that, but for now, I just commented them out with a FIXME comment.
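To make the distinction concrete, here is a rough sketch of what such a tag could look like. This is an illustrative assumption, not the actual pageserver code: the variant names, fields, and the SlruKind type are invented for the example, and only the blocky/non-blocky split and the is_block() check come from the description above.

// Hypothetical sketch of the RelishTag idea described above; the real
// definitions in the pageserver differ.

/// Identifies a PostgreSQL relation fork (assumed layout).
pub struct RelTag {
    pub spcnode: u32,
    pub dbnode: u32,
    pub relnode: u32,
    pub forknum: u8,
}

/// SLRUs kept by PostgreSQL (illustrative subset).
pub enum SlruKind {
    Clog,
    MultiXactMembers,
    MultiXactOffsets,
}

/// Public tag used through the Timeline interface.
pub enum RelishTag {
    /// A relation fork: blocky, accessed one 8 kB page at a time,
    /// with its size tracked by the repository.
    Relation(RelTag),
    /// One SLRU segment, e.g. "pg_clog/0000": also blocky.
    Slru { kind: SlruKind, segno: u32 },
    /// A small per-database file stored as a single blob (non-blocky).
    FileNodeMap { spcnode: u32, dbnode: u32 },
}

impl RelishTag {
    /// The is_block() check mentioned above: true for relish kinds that
    /// are read page by page and whose size the repository tracks.
    pub fn is_block(&self) -> bool {
        matches!(self, RelishTag::Relation(_) | RelishTag::Slru { .. })
    }
}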
Zenith
Zenith replaces the PostgreSQL storage layer and redistributes data across a cluster of nodes.
Running a local installation
- Install build dependencies and other useful packages
On Ubuntu or Debian this set of packages should be sufficient to build the code:
apt install build-essential libtool libreadline-dev zlib1g-dev flex bison libseccomp-dev \
libssl-dev clang
Rust 1.48 or later is also required.
To run the psql client, install the postgresql-client package or modify PATH and LD_LIBRARY_PATH to include tmp_install/bin and tmp_install/lib, respectively.
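For example, assuming you build in the repository root (where ./tmp_install is created, as noted under "Running tests" below):

export PATH="$(pwd)/tmp_install/bin:$PATH"
export LD_LIBRARY_PATH="$(pwd)/tmp_install/lib:$LD_LIBRARY_PATH"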
To run the integration tests (not required to use the code), install
Python 3.6 or higher and install the required packages with pip (called pip3 on some systems):
pip install pytest psycopg2
- Build zenith and patched postgres
git clone --recursive https://github.com/zenithdb/zenith.git
cd zenith
make -j5
- Start the pageserver and postgres on top of it (run from the repo root):
# Create repository in .zenith with proper paths to binaries and data
# Later this will be the responsibility of a package install script
> ./target/debug/zenith init
pageserver init succeeded
# start pageserver
> ./target/debug/zenith start
Starting pageserver at '127.0.0.1:64000' in .zenith
Pageserver started
# start postgres on top of the pageserver
> ./target/debug/zenith pg start main
Starting postgres node at 'host=127.0.0.1 port=55432 user=stas'
waiting for server to start.... done
# check list of running postgres instances
> ./target/debug/zenith pg list
BRANCH ADDRESS LSN STATUS
main 127.0.0.1:55432 0/1609610 running
- Now it is possible to connect to postgres and run some queries:
> psql -p55432 -h 127.0.0.1 postgres
postgres=# CREATE TABLE t(key int primary key, value text);
CREATE TABLE
postgres=# insert into t values(1,1);
INSERT 0 1
postgres=# select * from t;
key | value
-----+-------
1 | 1
(1 row)
- You can also create branches and run postgres on them:
# create branch named migration_check
> ./target/debug/zenith branch migration_check main
Created branch 'migration_check' at 0/1609610
# check branches tree
> ./target/debug/zenith branch
main
┗━ @0/1609610: migration_check
# start postgres on that branch
> ./target/debug/zenith pg start migration_check
Starting postgres node at 'host=127.0.0.1 port=55433 user=stas'
waiting for server to start.... done
# this new postgres instance has all the data from the 'main' postgres,
# but modifications here will not affect the data in the original instance
> psql -p55433 -h 127.0.0.1 postgres
postgres=# select * from t;
key | value
-----+-------
1 | 1
(1 row)
postgres=# insert into t values(2,2);
INSERT 0 1
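The same isolation check can be scripted with the psycopg2 package installed above. This is a minimal sketch: the ports and the table come from the transcripts above, and the connection parameters (e.g. user) may need adjusting for your setup.

import psycopg2

# Connect to the 'main' node and the 'migration_check' branch started above.
main = psycopg2.connect("host=127.0.0.1 port=55432 dbname=postgres")
branch = psycopg2.connect("host=127.0.0.1 port=55433 dbname=postgres")

# Insert a row on the branch only.
with branch, branch.cursor() as cur:
    cur.execute("INSERT INTO t VALUES (3, '3')")

# The 'main' node still sees only its own rows: the branch is isolated.
with main, main.cursor() as cur:
    cur.execute("SELECT count(*) FROM t")
    print(cur.fetchone()[0])  # prints 1, the single row inserted on 'main'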
Running tests
git clone --recursive https://github.com/libzenith/zenith.git
make # also builds postgres and installs it to ./tmp_install
cd test_runner
pytest
Documentation
Currently, we use README files to cover the design ideas and overall architecture of each module, along with rustdoc-style documentation comments in the code.
To view your documentation in a browser, try running cargo doc --no-deps --open
Source tree layout
/control_plane:
Local control plane. Functions to start, configure and stop pageserver and postgres instances running as local processes. Intended to be used in integration tests and in CLI tools for local installations.
/zenith:
Main entry point for the 'zenith' CLI utility. TODO: Doesn't it belong to control_plane?
/postgres_ffi:
Utility functions for interacting with PostgreSQL file formats, plus miscellaneous constants copied from PostgreSQL headers.
/zenith_utils:
Helpers that are shared between other crates in this repository.
/walkeeper:
WAL safekeeper (also known as WAL acceptor). Written in Rust.
/pageserver:
Page Server. Written in Rust.
Depends on the modified 'postgres' binary for WAL redo.
/vendor/postgres:
PostgreSQL source tree, with the modifications needed for Zenith.
/vendor/postgres/contrib/zenith:
PostgreSQL extension that implements storage manager API and network communications with remote page server.
/test_runner:
Integration tests, written in Python using the pytest framework.
test_runner/zenith_regress:
A quick way to add new SQL regression tests to the integration test suite.