Heikki Linnakangas acc0f41985 Don't try to launch duplicate WAL redo thread if tenant already exists.
The codepath for tenant_create command first launched the WAL redo
thread, and then called branches::create_repo() which checked if the
tenant's directory already exists. That's problematic, because
launching the WAL redo thread will run initdb if the directory doesn't
already exist. Race condition: If the tenant already exists, it will
have a WAL redo thread already running, and the old and new WAL redo
thread might try to run initdb at the same time, causing all kinds of
weird failures.

The test_pageserver_api test was failing 100% repeatably on my laptop
because of this. I'm not sure why this doesn't occur on the CI:

    Jul 31 18:05:48.877 INFO running initdb in "./tenants/5227e4eb90894775ac6b8a8c76f24b2e/wal-redo-datadir", location: pageserver::walredo, pageserver/src/walredo.rs:483
    thread 'WAL redo thread' panicked at 'initdb failed: The files belonging to this database system will be owned by user "heikki".
    This user must also own the server process.

    The database cluster will be initialized with locale "C".
    The default database encoding has accordingly been set to "SQL_ASCII".
    The default text search configuration will be set to "english".

    Data page checksums are disabled.

    creating directory ./tenants/0305b1326f3ea33add0929d516da7cb6/wal-redo-datadir ... ok
    creating subdirectories ... ok
    selecting dynamic shared memory implementation ... posix
    selecting default max_connections ... 100
    selecting default shared_buffers ... 128MB
    selecting default time zone ... Europe/Helsinki
    creating configuration files ... ok
    running bootstrap script ...
    stderr:
    2021-07-31 15:05:48.875 GMT [282569] LOG:  could not open configuration file "/home/heikki/git-sandbox/zenith/test_output/test_tenant_list/repo/./tenants/0305b1326f3ea33add0929d516da7cb6/wal-redo-datadir/postgresql.conf": No such file or directory
    2021-07-31 15:05:48.875 GMT [282569] FATAL:  configuration file "/home/heikki/git-sandbox/zenith/test_output/test_tenant_list/repo/./tenants/0305b1326f3ea33add0929d516da7cb6/wal-redo-datadir/postgresql.conf" contains errors
    child process exited with exit code 1
    initdb: removing data directory "./tenants/0305b1326f3ea33add0929d516da7cb6/wal-redo-datadir"
2021-07-31 18:13:21 +03:00

Zenith

Zenith replaces the PostgreSQL storage layer and redistributes data across a cluster of nodes.

Running local installation

  1. Install build dependencies and other useful packages

On Ubuntu or Debian this set of packages should be sufficient to build the code:

apt install build-essential libtool libreadline-dev zlib1g-dev flex bison libseccomp-dev \
libssl-dev clang

Rust 1.48 or later is also required.

To run the psql client, install the postgresql-client package or modify PATH and LD_LIBRARY_PATH to include tmp_install/bin and tmp_install/lib, respectively.
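
As a minimal sketch of the second option, the two variables could be set as follows. It assumes the current directory is the repository root and that the build in the next step has already populated tmp_install; adjust the paths if your checkout lives elsewhere.

```shell
# Assumption: run from the repository root, after the build has created ./tmp_install.
# Prepend the locally built postgres binaries to PATH.
export PATH="$PWD/tmp_install/bin:$PATH"
# Prepend the locally built libraries; the ${VAR:+...} form avoids a trailing ':'
# when LD_LIBRARY_PATH was previously unset or empty.
export LD_LIBRARY_PATH="$PWD/tmp_install/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

The settings only last for the current shell session, so they need to be repeated (or added to a shell profile) in new terminals.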

To run the integration tests (not required to use the code), install Python (3.6 or higher), and install python3 packages with pip (called pip3 on some systems):

pip install pytest psycopg2
  2. Build zenith and patched postgres
git clone --recursive https://github.com/zenithdb/zenith.git
cd zenith
make -j5
  3. Start pageserver and postgres on top of it (should be called from repo root):
# Create a repository in .zenith with proper paths to binaries and data
# Later this will be the responsibility of a package install script
> ./target/debug/zenith init
pageserver init succeeded

# start pageserver
> ./target/debug/zenith start
Starting pageserver at '127.0.0.1:64000' in .zenith
Pageserver started

# start postgres on top of the pageserver
> ./target/debug/zenith pg start main
Starting postgres node at 'host=127.0.0.1 port=55432 user=stas'
waiting for server to start.... done

# check list of running postgres instances
> ./target/debug/zenith pg list
BRANCH	ADDRESS		LSN		STATUS
main	127.0.0.1:55432	0/1609610	running
  4. Now it is possible to connect to postgres and run some queries:
> psql -p55432 -h 127.0.0.1 postgres
postgres=# CREATE TABLE t(key int primary key, value text);
CREATE TABLE
postgres=# insert into t values(1,1);
INSERT 0 1
postgres=# select * from t;
 key | value
-----+-------
   1 | 1
(1 row)
  5. And create branches and run postgres on them:
# create branch named migration_check
> ./target/debug/zenith branch migration_check main
Created branch 'migration_check' at 0/1609610

# check branches tree
> ./target/debug/zenith branch
 main
 ┗━ @0/1609610: migration_check

# start postgres on that branch
> ./target/debug/zenith pg start migration_check
Starting postgres node at 'host=127.0.0.1 port=55433 user=stas'
waiting for server to start.... done

# this new postgres instance has all the data from the 'main' postgres,
# but modifications made here do not affect data in the original postgres
> psql -p55433 -h 127.0.0.1 postgres
postgres=# select * from t;
 key | value
-----+-------
   1 | 1
(1 row)

postgres=# insert into t values(2,2);
INSERT 0 1
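
To confirm that the insert on the branch did not touch 'main', the row counts on both instances can be compared. The helper below is only an illustrative sketch: count_rows is a hypothetical name, and it assumes both postgres instances from the steps above are still running on ports 55432 and 55433 and that psql is on PATH.

```shell
# Hypothetical helper: print the number of rows in table t on the given port.
# -A (unaligned) and -t (tuples only) make psql print just the bare value.
count_rows() {
    psql -At -p "$1" -h 127.0.0.1 -c 'select count(*) from t' postgres
}
```

Calling `count_rows 55432` and then `count_rows 55433` after the session above should report one row on main and two on migration_check if branching behaves as described.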

Running tests

git clone --recursive https://github.com/libzenith/zenith.git
cd zenith
make # also builds postgres and installs it to ./tmp_install
cd test_runner
pytest

Documentation

Currently, we use README files to cover design ideas and the overall architecture of each module, along with rustdoc-style documentation comments.

To view your documentation in a browser, try running cargo doc --no-deps --open

Source tree layout

/control_plane:

Local control plane. Functions to start, configure and stop pageserver and postgres instances running as local processes. Intended to be used in integration tests and in CLI tools for local installations.

/zenith:

Main entry point for the 'zenith' CLI utility. TODO: Doesn't it belong to control_plane?

/postgres_ffi:

Utility functions for interacting with PostgreSQL file formats. Misc constants, copied from PostgreSQL headers.

/zenith_utils:

Helpers that are shared between other crates in this repository.

/walkeeper:

WAL safekeeper (also known as WAL acceptor). Written in Rust.

/pageserver:

Page Server. Written in Rust.

Depends on the modified 'postgres' binary for WAL redo.

/vendor/postgres:

PostgreSQL source tree, with the modifications needed for Zenith.

/vendor/postgres/contrib/zenith:

PostgreSQL extension that implements storage manager API and network communications with remote page server.

/test_runner:

Integration tests, written in Python using the pytest framework.

/test_runner/zenith_regress:

A quick way to add a new SQL regression test to the integration test set.
