rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-06-03 13:30:38 +00:00


Heikki Linnakangas 960c7d69a8 Remove 'predecessor' reference from in-memory and delta layers.

The caller is now responsible for looking up the predecessor layer,
instead. This makes the code simpler, as you don't need to update the
predecessor reference when a layer is frozen or written to disk.

There was a bug in that, as Konstantin noted on Discord:

    Assume that freeze doesn't create new inmem layer
    (maybe_new_open=None). Then we temporary place in historics frozen
    layer. Assume that now new put_wal_record request arrives. There is
    no open in-mem layer, so it has to create new one. It is looking for
    previous layer for read and set it as new in-mem layer
    predecessor. But as far as I understand, prev layer should be our
    temporary frozen layer. Which will be then removed from
    historics.

That leaves the predecessor field of the new in-memory layer pointing
at the frozen in-memory layer that has been removed from the layer map,
preventing it from being removed from memory.

This makes two subtle changes:

1. When the first new layer is created on a branch for a segment that
   existed on the ancestor branch, the start_lsn of the new layer is now
   the branch point + 1. We were previously slightly confused on what
   the branch point LSN meant. It means that all the WAL up to and
   *including* the LSN on the old branch is visible to the new branch.
   If we mark the start LSN of the new layer as equal to the branch point,
   that's wrong, because if there is a WAL record with that LSN on the
   predecessor layer, the new layer would hide it. This bug was hidden
   when the layer on the new branch contained a direct reference to the
   layer in the old branch, as get_page_reconstruct_data() followed that
   reference directly when it didn't find the page version in the new
   layer. But now that the caller performs the lookup, it will look up
   the new layer that doesn't contain the record, and you get an error.

2. InMemoryLayer now always stores the segment size at the beginning
   of the layer's LSN range. Previously, get_seg_size() might have
   recursed into the predecessor layer to get the size, but now we
   avoid that by always copying over the last size from the previous
   layer, when a new layer is created.
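
To illustrate change 1, here is a minimal sketch of why the start LSN has to be the branch point + 1. It is a toy model, not the pageserver code: `Layer`, `find_record`, and the plain `u64` LSNs are hypothetical names invented for this example, and the lookup rule (the new layer owns every LSN at or above its start LSN) is an assumption that mirrors the description above.

```rust
// Toy model only: these types are NOT the pageserver's real types.
type Lsn = u64;

struct Layer {
    start_lsn: Lsn,
    records: Vec<Lsn>, // LSNs of the WAL records stored in this layer
}

impl Layer {
    fn contains(&self, lsn: Lsn) -> bool {
        self.records.contains(&lsn)
    }
}

/// Caller-side lookup (assumed rule): the child layer covers every LSN at or
/// above its start_lsn; anything older is looked up in the ancestor layer.
fn find_record(child: &Layer, ancestor: &Layer, lsn: Lsn) -> bool {
    if lsn >= child.start_lsn {
        child.contains(lsn)
    } else {
        ancestor.contains(lsn)
    }
}

fn main() {
    let branch_point: Lsn = 100;
    // The ancestor branch has a WAL record exactly at the branch point.
    let ancestor = Layer { start_lsn: 0, records: vec![90, 100] };

    // Wrong: the new layer starts at the branch point and hides record 100.
    let hiding = Layer { start_lsn: branch_point, records: vec![] };
    // Right: the new layer starts at branch point + 1.
    let correct = Layer { start_lsn: branch_point + 1, records: vec![] };

    println!(
        "start = branch point:     found = {}",
        find_record(&hiding, &ancestor, branch_point)
    );
    println!(
        "start = branch point + 1: found = {}",
        find_record(&correct, &ancestor, branch_point)
    );
}
```

With the start LSN equal to the branch point, the lookup for LSN 100 lands in the empty new layer and fails. With branch point + 1, it falls through to the ancestor layer and finds the record.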

2021-10-08 00:54:13 +03:00

.circleci

Improve build system: (#703)

2021-10-06 14:37:27 +02:00

.github/workflows

[postgres] Enable seccomp bpf

2021-07-09 14:59:45 +03:00

control_plane

Fix clippy errors on nightly (2021-09-29) (#691)

2021-10-01 15:45:42 -07:00

docs

docs/README: fix link to walkeeper's README (#677)

2021-09-29 14:40:16 +03:00

monitoring

Add some prometheus metrics to pageserver

2021-08-03 21:42:24 +03:00

pageserver

Remove 'predecessor' reference from in-memory and delta layers.

2021-10-08 00:54:13 +03:00

postgres_ffi

Fix clippy lints and enable clippy checking in CI

2021-09-16 15:09:16 +03:00

proxy

Allow usage of the compute hostname in the proxy

2021-10-01 16:24:35 +03:00

test_runner

Add test case that demonstrates Write Amplification.

2021-10-08 00:34:29 +03:00

vendor

Store unlogged tables locally, and replace PD_WAL_LOGGED.

2021-10-06 10:58:15 +03:00

walkeeper

Add more details to pageserver and safekeeper docs (#680)

2021-10-05 19:10:50 +03:00

workspace_hack

add workspace_hack crate

2021-05-07 13:08:31 -07:00

zenith

Support parallel test running for python tests

2021-09-15 14:02:15 +03:00

zenith_metrics

zenith_metrics: exit process on config errors (#706)

2021-10-08 00:14:56 +03:00

zenith_utils

Make pageserver_ prefix for common metric names configurable (#681)

2021-10-05 19:06:44 +03:00

.dockerignore

Docker build now also uses BUILD_TYPE=release. (#712)

2021-10-06 23:42:00 +02:00

.gitignore

.gitignore integration_tests/.zenith

2021-05-13 13:47:22 -07:00

.gitmodules

Replace old git urls with the current ones

2021-08-30 23:51:47 +03:00

Cargo.lock

Make pageserver_ prefix for common metric names configurable (#681)

2021-10-05 19:06:44 +03:00

Cargo.toml

Add some prometheus metrics to pageserver

2021-08-03 21:42:24 +03:00

cli-v2-story.md

Implement "timelines" in page server

2021-04-20 19:11:27 +03:00

CONTRIBUTING.md

Add CONTRIBUTING.md with some ground rules for submitting PRs.

2021-05-27 23:07:37 +03:00

Add LICENSE and COPYRIGHT files.

2021-05-27 15:33:08 +03:00

docker-entrypoint.sh

Adjust docker container for console's CI pipeline

2021-08-25 17:28:42 +03:00

Dockerfile

Docker build now also uses BUILD_TYPE=release. (#712)

2021-10-06 23:42:00 +02:00

Dockerfile.alpine

Build zenithdb/zenith:latest in CI (zenithdb/console#18)

2021-08-19 15:12:35 +03:00

Dockerfile.build

Cleanup Dockerfile.

2021-09-28 18:26:20 +03:00

LICENSE

Add LICENSE and COPYRIGHT files.

2021-05-27 15:33:08 +03:00

Makefile

Improve build system: (#703)

2021-10-06 14:37:27 +02:00

Pipfile

Symlink Pipfile (& Pipfile.lock) at the top level

2021-07-12 21:30:52 +03:00

Pipfile.lock

Symlink Pipfile (& Pipfile.lock) at the top level

2021-07-12 21:30:52 +03:00

pre-commit.py

add ability to disable colors, use argparse for arguments

2021-08-23 17:28:45 +03:00

README.md

Readme updates based on a fresher Ubuntu installation experience (#627)

2021-10-05 19:19:25 +03:00

run_clippy.sh

Fix clippy lints and enable clippy checking in CI

2021-09-16 15:09:16 +03:00

README.md

Zenith

Zenith replaces the PostgreSQL storage layer and redistributes data across a cluster of nodes.

Architecture overview

A Zenith installation consists of compute nodes and a storage engine.

Compute nodes are stateless PostgreSQL nodes, backed by Zenith storage.

The Zenith storage engine consists of two major components:

Pageserver. A scalable storage backend for compute nodes.
WAL service. The service that receives WAL from the compute nodes and ensures that it is stored durably.

The pageserver consists of:

Repository - the Zenith storage implementation.
WAL receiver - a service that receives WAL from the WAL service and stores it in the repository.
Page service - a service that communicates with compute nodes and responds with pages from the repository.
WAL redo - a service that builds pages from base images and WAL records at the page service's request.
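
As a rough illustration of what "building pages from base images and WAL records" means, the sketch below replays records onto a base image up to a requested LSN. It is a toy model under stated assumptions, not the pageserver's implementation: `Page`, `WalRecord`, and `reconstruct_page` are hypothetical names, a record here simply overwrites one byte, and records are assumed to be sorted by LSN and newer than the base image.

```rust
// Toy model only: none of these types come from the Zenith source.
type Lsn = u64;

#[derive(Clone, Debug, PartialEq)]
struct Page(Vec<u8>);

/// A simplified WAL record: "set the byte at `offset` to `value`".
struct WalRecord {
    lsn: Lsn,
    offset: usize,
    value: u8,
}

/// Rebuild the page as of `request_lsn` by replaying records onto the base image.
/// Assumes `records` is sorted by LSN and every record is newer than `base`.
fn reconstruct_page(base: &Page, records: &[WalRecord], request_lsn: Lsn) -> Page {
    let mut page = base.clone();
    for rec in records.iter().take_while(|r| r.lsn <= request_lsn) {
        page.0[rec.offset] = rec.value;
    }
    page
}

fn main() {
    let base = Page(vec![0; 4]);
    let records = vec![
        WalRecord { lsn: 10, offset: 0, value: 1 },
        WalRecord { lsn: 20, offset: 1, value: 2 },
        WalRecord { lsn: 30, offset: 0, value: 3 },
    ];

    // Only the records at LSN 10 and 20 are applied; the one at 30 is too new.
    let page = reconstruct_page(&base, &records, 20);
    println!("{:?}", page); // prints Page([1, 2, 0, 0])
}
```

Requesting a later LSN from the same base image and records yields a newer version of the page, which is what lets the storage serve page versions at different points in the WAL.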

Running local installation

Install build dependencies and other useful packages

On Ubuntu or Debian this set of packages should be sufficient to build the code:

apt install build-essential libtool libreadline-dev zlib1g-dev flex bison libseccomp-dev \
libssl-dev clang pkg-config libpq-dev

Rust 1.52 or later is also required.

To run the psql client, install the postgresql-client package or modify PATH and LD_LIBRARY_PATH to include tmp_install/bin and tmp_install/lib, respectively.

To run the integration tests (not required to use the code), install Python (3.6 or higher), then install the python3 packages by running pipenv install in the project directory.

Build zenith and patched postgres

git clone --recursive https://github.com/zenithdb/zenith.git
cd zenith
make -j5

Start the pageserver and postgres on top of it (run these commands from the repo root):

# Create a repository in .zenith with proper paths to binaries and data
# Later, this will be the responsibility of a package install script
> ./target/debug/zenith init
pageserver init succeeded

# start pageserver
> ./target/debug/zenith start
Starting pageserver at '127.0.0.1:64000' in .zenith
Pageserver started

# start postgres on top of the pageserver
> ./target/debug/zenith pg start main
Starting postgres node at 'host=127.0.0.1 port=55432 user=stas'
waiting for server to start.... done

# check list of running postgres instances
> ./target/debug/zenith pg list
BRANCH	ADDRESS		LSN		STATUS
main	127.0.0.1:55432	0/1609610	running

Now it is possible to connect to postgres and run some queries:

> psql -p55432 -h 127.0.0.1 -U zenith_admin postgres
postgres=# CREATE TABLE t(key int primary key, value text);
CREATE TABLE
postgres=# insert into t values(1,1);
INSERT 0 1
postgres=# select * from t;
 key | value
-----+-------
   1 | 1
(1 row)

You can also create branches and run postgres on them:

# create branch named migration_check
> ./target/debug/zenith branch migration_check main
Created branch 'migration_check' at 0/1609610

# check branches tree
> ./target/debug/zenith branch
 main
 ┗━ @0/1609610: migration_check

# start postgres on that branch
> ./target/debug/zenith pg start migration_check
Starting postgres node at 'host=127.0.0.1 port=55433 user=stas'
waiting for server to start.... done

# this new postgres instance has all the data from the 'main' postgres,
# but modifications made here do not affect data in the original postgres
> psql -p55433 -h 127.0.0.1 -U zenith_admin postgres
postgres=# select * from t;
 key | value
-----+-------
   1 | 1
(1 row)

postgres=# insert into t values(2,2);
INSERT 0 1

If you want to run tests afterwards (see below), you have to stop the pageserver and all postgres instances you have just started:

> ./target/debug/zenith pg stop migration_check
> ./target/debug/zenith pg stop main
> ./target/debug/zenith stop

Running tests

git clone --recursive https://github.com/zenithdb/zenith.git
cd zenith
make # also builds postgres and installs it to ./tmp_install
cd test_runner
pytest

Documentation

We currently use README files to cover the design ideas and overall architecture of each module, along with rustdoc-style documentation comments. See also /docs/ for a top-level overview of all available markdown documentation.

/docs/sourcetree.md contains an overview of the source tree layout.

To view your rustdoc documentation in a browser, try running cargo doc --no-deps --open

Postgres-specific terms

Because Zenith is closely tied to PostgreSQL internals, it uses many PostgreSQL-specific terms. The same applies to certain spellings: for example, we use MB to denote 1024 * 1024 bytes. MiB would be technically more correct, but it is inconsistent with what the PostgreSQL code and documentation use.

To get more familiar with this aspect, refer to:

Zenith glossary
PostgreSQL glossary
Other PostgreSQL documentation and sources (Zenith fork sources can be found here)

Join the development

Read CONTRIBUTING.md to learn about project code style and practices.
To get familiar with the source tree layout, use /docs/sourcetree.md.
To learn more about PostgreSQL internals, see http://www.interdb.jp/pg/index.html
