Heikki Linnakangas 9ff122835f Refactor ObjectTags, introducing a new concept called "relish"
This clarifies - I hope - the abstractions between Repository and
ObjectRepository. The ObjectTag struct was a mix of objects that could
be accessed directly through the public Timeline interface, and also
objects that were created and used internally by the ObjectRepository
implementation and not supposed to be accessed directly by the
callers.  With the RelishTag separate from ObjectTag, the distinction
is clearer: RelishTag is used in the public interface, and
ObjectTag is used internally between object_repository.rs and
object_store.rs, and it contains the internal metadata object types.

One awkward thing about the ObjectTag struct was that the Repository
implementation had to distinguish ObjectTags for relations, whose
sizes it had to track, from others that were used to store
"blobs".  With RelishTags, some relishes are considered
"blocky", and the Repository implementation is expected to track
their sizes, while others are stored as blobs. I'm not 100% happy with
how RelishTag captures that either: it just knows that some relish
kinds are blocky and some non-blocky, and there's an is_block()
function to check that.  But this does enable size-tracking for SLRUs,
allowing us to treat them more like relations.
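To make the blocky/non-blocky distinction concrete, here is a minimal
sketch of what such a RelishTag could look like. The variant names and
fields are hypothetical stand-ins, not the exact definitions in this
commit:

// Illustrative stand-ins; the real definitions live in the pageserver.
pub struct RelTag {
    pub spcnode: u32,
    pub dbnode: u32,
    pub relnode: u32,
    pub forknum: u8,
}

pub enum SlruKind {
    Clog,
    MultiXactMembers,
    MultiXactOffsets,
}

// A hypothetical RelishTag: the public handle for anything stored in
// the repository, be it a relation, an SLRU segment, or a plain blob.
pub enum RelishTag {
    Relation(RelTag),                    // blocky: accessed page by page
    Slru { kind: SlruKind, segno: u32 }, // blocky: e.g. "pg_clog/0000"
    Checkpoint,                          // non-blocky: stored as one blob
}

impl RelishTag {
    // Blocky relishes are accessed one 8k page at a time, and the
    // Repository implementation tracks their size; non-blocky relishes
    // are stored as single blobs.
    pub fn is_block(&self) -> bool {
        match self {
            RelishTag::Relation(_) | RelishTag::Slru { .. } => true,
            RelishTag::Checkpoint => false,
        }
    }
}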

This changes the way SLRUs are stored in the repository. Each SLRU
segment, e.g. "pg_clog/0000" or "pg_clog/0001", is now handled as a
separate relish.  This removes the need for the SLRU-specific
put_slru_truncate() function in the Timeline trait. SLRU truncation is
now handled by calling put_unlink() on the segment. This is more in
line with how PostgreSQL stores SLRUs and handles their truncation.
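For example, reusing the hypothetical RelishTag and SlruKind from the
sketch above, CLOG truncation could boil down to a loop like this. The
Timeline trait and the put_unlink() signature here are simplified
assumptions, not the exact API:

#[derive(Clone, Copy)]
pub struct Lsn(pub u64);

pub trait Timeline {
    // Record that the given relish was unlinked (dropped) at this LSN.
    fn put_unlink(&mut self, tag: RelishTag, lsn: Lsn);
}

// Hypothetical CLOG truncation: instead of a dedicated
// put_slru_truncate(), unlink each whole segment below the cutoff,
// just as PostgreSQL removes whole SLRU segment files.
fn truncate_clog(timeline: &mut dyn Timeline, lsn: Lsn, cutoff_segno: u32) {
    for segno in 0..cutoff_segno {
        timeline.put_unlink(RelishTag::Slru { kind: SlruKind::Clog, segno }, lsn);
    }
}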

The SLRUs are "blocky", so they are accessed one 8k page at a time,
and the repository tracks their size. I considered an alternative design
where we would treat each SLRU segment as non-blocky, and just store
the whole file as one blob. Each SLRU segment is up to 256 kB in size,
which isn't that large, so that might've worked fine, too. One reason
I didn't do that is that it seems better to have the WAL redo
routines be as close as possible to the PostgreSQL routines. It
doesn't matter much in the repository, though; we have to track the
size for relations anyway, so there's not much difference in whether
we also do it for SLRUs.

While working on this, I noticed that the CLOG and MultiXact redo code
did not handle wraparound correctly. We need to fix that, but for now,
I just commented them out with a FIXME comment.
2021-08-03 14:01:05 +03:00

Zenith

Zenith replaces the PostgreSQL storage layer and redistributes data across a cluster of nodes.

Running local installation

  1. Install build dependencies and other useful packages

On Ubuntu or Debian this set of packages should be sufficient to build the code:

apt install build-essential libtool libreadline-dev zlib1g-dev flex bison libseccomp-dev \
libssl-dev clang

Rust 1.48 or later is also required.

To run the psql client, install the postgresql-client package or modify PATH and LD_LIBRARY_PATH to include tmp_install/bin and tmp_install/lib, respectively.

To run the integration tests (not required to use the code), install Python (3.6 or higher), and install the required Python packages with pip (called pip3 on some systems):

pip install pytest psycopg2
  2. Build zenith and patched postgres
git clone --recursive https://github.com/zenithdb/zenith.git
cd zenith
make -j5
  3. Start pageserver and postgres on top of it (commands should be run from the repo root):
# Create repository in .zenith with proper paths to binaries and data
# Later that would be responsibility of a package install script
> ./target/debug/zenith init
pageserver init succeeded

# start pageserver
> ./target/debug/zenith start
Starting pageserver at '127.0.0.1:64000' in .zenith
Pageserver started

# start postgres on top of the pageserver
> ./target/debug/zenith pg start main
Starting postgres node at 'host=127.0.0.1 port=55432 user=stas'
waiting for server to start.... done

# check list of running postgres instances
> ./target/debug/zenith pg list
BRANCH	ADDRESS		LSN		STATUS
main	127.0.0.1:55432	0/1609610	running
  4. Now it is possible to connect to postgres and run some queries:
> psql -p55432 -h 127.0.0.1 postgres
postgres=# CREATE TABLE t(key int primary key, value text);
CREATE TABLE
postgres=# insert into t values(1,1);
INSERT 0 1
postgres=# select * from t;
 key | value
-----+-------
   1 | 1
(1 row)
  5. And create branches and run postgres on them:
# create branch named migration_check
> ./target/debug/zenith branch migration_check main
Created branch 'migration_check' at 0/1609610

# check branches tree
> ./target/debug/zenith branch
 main
 ┗━ @0/1609610: migration_check

# start postgres on that branch
> ./target/debug/zenith pg start migration_check
Starting postgres node at 'host=127.0.0.1 port=55433 user=stas'
waiting for server to start.... done

# this new postgres instance will have all the data from the 'main' postgres,
# but modifications will not affect the data in the original postgres
> psql -p55433 -h 127.0.0.1 postgres
postgres=# select * from t;
 key | value
-----+-------
   1 | 1
(1 row)

postgres=# insert into t values(2,2);
INSERT 0 1

Running tests

git clone --recursive https://github.com/libzenith/zenith.git
make # also builds postgres and installs it to ./tmp_install
cd test_runner
pytest

Documentation

We currently use README files to cover the design ideas and overall architecture of each module, along with rustdoc-style documentation comments in the code.

To view your documentation in a browser, try running cargo doc --no-deps --open

Source tree layout

/control_plane:

Local control plane. Functions to start, configure and stop pageserver and postgres instances running as local processes. Intended to be used in integration tests and in CLI tools for local installations.

/zenith

Main entry point for the 'zenith' CLI utility. TODO: Doesn't it belong in control_plane?

/postgres_ffi:

Utility functions for interacting with PostgreSQL file formats. Misc constants, copied from PostgreSQL headers.

/zenith_utils:

Helpers that are shared between other crates in this repository.

/walkeeper:

WAL safekeeper (also known as WAL acceptor). Written in Rust.

/pageserver:

Page Server. Written in Rust.

Depends on the modified 'postgres' binary for WAL redo.

/vendor/postgres:

PostgreSQL source tree, with the modifications needed for Zenith.

/vendor/postgres/contrib/zenith:

PostgreSQL extension that implements the storage manager API and network communication with the remote page server.

/test_runner:

Integration tests, written in Python using the pytest framework.

/test_runner/zenith_regress:

A quick way to add new SQL regression tests to the integration test suite.
