Heikki Linnakangas acc0f41985 Don't try to launch duplicate WAL redo thread if tenant already exists.
The codepath for tenant_create command first launched the WAL redo
thread, and then called branches::create_repo() which checked if the
tenant's directory already exists. That's problematic, because
launching the WAL redo thread will run initdb if the directory doesn't
already exist. Race condition: If the tenant already exists, it will
have a WAL redo thread already running, and the old and new WAL redo
thread might try to run initdb at the same time, causing all kinds of
weird failures.

The test_pageserver_api test was failing 100% repeatably on my laptop
because of this. I'm not sure why this doesn't occur on the CI:

    Jul 31 18:05:48.877 INFO running initdb in "./tenants/5227e4eb90894775ac6b8a8c76f24b2e/wal-redo-datadir", location: pageserver::walredo, pageserver/src/walredo.rs:483
    thread 'WAL redo thread' panicked at 'initdb failed: The files belonging to this database system will be owned by user "heikki".
    This user must also own the server process.

    The database cluster will be initialized with locale "C".
    The default database encoding has accordingly been set to "SQL_ASCII".
    The default text search configuration will be set to "english".

    Data page checksums are disabled.

    creating directory ./tenants/0305b1326f3ea33add0929d516da7cb6/wal-redo-datadir ... ok
    creating subdirectories ... ok
    selecting dynamic shared memory implementation ... posix
    selecting default max_connections ... 100
    selecting default shared_buffers ... 128MB
    selecting default time zone ... Europe/Helsinki
    creating configuration files ... ok
    running bootstrap script ...
    stderr:
    2021-07-31 15:05:48.875 GMT [282569] LOG:  could not open configuration file "/home/heikki/git-sandbox/zenith/test_output/test_tenant_list/repo/./tenants/0305b1326f3ea33add0929d516da7cb6/wal-redo-datadir/postgresql.conf": No such file or directory
    2021-07-31 15:05:48.875 GMT [282569] FATAL:  configuration file "/home/heikki/git-sandbox/zenith/test_output/test_tenant_list/repo/./tenants/0305b1326f3ea33add0929d516da7cb6/wal-redo-datadir/postgresql.conf" contains errors
    child process exited with exit code 1
    initdb: removing data directory "./tenants/0305b1326f3ea33add0929d516da7cb6/wal-redo-datadir"
2021-07-31 18:13:21 +03:00

Zenith test runner

This directory contains integration tests.

Prerequisites:

  • Python 3.6 or later
  • Dependencies: install them via pipenv install. Note that Debian/Ubuntu packages tend to be stale, so manual installation is not recommended. Run pipenv shell to activate the venv.
  • Zenith and Postgres binaries
    • See the root README.md for build directions
    • Tests can be run from the git tree; or see the environment variables below to run from other directories.
  • The zenith git repo, including the postgres submodule (for some tests, e.g. pg_regress)

Test Organization

The tests are divided into a few batches, such that each batch takes roughly the same amount of time. The batches can be run in parallel, to minimize total runtime. Currently, there are only two batches:

  • test_batch_pg_regress: Runs PostgreSQL regression tests
  • test_others: All other tests

Running the tests

Because pytest will search all subdirectories for tests, it's easiest to run the tests from within the test_runner directory.

Test state (postgres data, pageserver state, and log files) will be stored under a directory test_output.

You can run all the tests with:

pytest

If you want to run all the tests in a particular file:

pytest test_pgbench.py

If you want to run all tests that have the string "bench" in their names:

pytest -k bench

Useful environment variables:

  • ZENITH_BIN: The directory where zenith binaries can be found.
  • POSTGRES_DISTRIB_DIR: The directory where postgres distribution can be found.
  • TEST_OUTPUT: Set the directory where test state and test output files should go.
  • TEST_SHARED_FIXTURES: Try to re-use a single pageserver for all the tests.
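As a rough illustration of how these variables could be consumed, here is a minimal sketch. It is not the code used by the fixtures: the function name resolve_test_dirs, the repo_root parameter and the fallback paths for ZENITH_BIN and POSTGRES_DISTRIB_DIR are assumptions made for the example, and treating TEST_SHARED_FIXTURES as "enabled when set" is likewise a guess. Only the test_output fallback comes from the text above.

```python
import os
from pathlib import Path


def resolve_test_dirs(environ=os.environ, repo_root=Path("..")):
    """Hypothetical helper: map the environment variables to settings.

    The fallbacks for ZENITH_BIN and POSTGRES_DISTRIB_DIR are assumptions
    for illustration, not the values the real fixtures use.
    """
    repo_root = Path(repo_root)
    return {
        "zenith_bin": Path(environ.get("ZENITH_BIN", repo_root / "target" / "debug")),
        "postgres_distrib_dir": Path(
            environ.get("POSTGRES_DISTRIB_DIR", repo_root / "tmp_install")
        ),
        # Test state goes under test_output unless TEST_OUTPUT overrides it.
        "test_output": Path(environ.get("TEST_OUTPUT", "test_output")),
        # Assumption: the variable acts as a flag that is on whenever it is set.
        "shared_fixtures": "TEST_SHARED_FIXTURES" in environ,
    }


# Pass a plain dict instead of os.environ to try out different settings.
settings = resolve_test_dirs({"TEST_OUTPUT": "/tmp/zenith_out", "TEST_SHARED_FIXTURES": "1"})
print(settings["test_output"], settings["shared_fixtures"])
```

Taking the environment as a parameter keeps the helper easy to exercise without touching the real process environment.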

Let stdout and stderr go to the terminal instead of capturing them: pytest -s ... (Note many tests capture subprocess outputs separately, so this may not show much.)

Exit after the first test failure: pytest -x ... (There are many more pytest options; run pytest -h to see them.)

Building new tests

The tests make heavy use of pytest fixtures. You can read about how they work here: https://docs.pytest.org/en/stable/fixture.html

Essentially, this means that each time a test function names a fixture as an input parameter, pytest runs the fixture function with that name and passes its return value to the test function as that parameter.

So this code:

def test_something(zenith_cli, pg_bin):
    pass

... will run the fixtures called zenith_cli and pg_bin and deliver those results to the test function.

Fixtures can't be imported using the normal Python import syntax. Instead, use this:

pytest_plugins = ("fixtures.something",)

That will make all the fixtures in the fixtures/something.py file available.

Anything that's likely to be used in multiple tests should be built into a fixture.

Note that fixtures can clean up after themselves if they use the yield syntax. Cleanup will happen even if the test fails (raises an unhandled exception). Python destructors, e.g. __del__(), aren't recommended for cleanup.
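A minimal sketch of a yield-style fixture follows. The fixture name scratch_dir and the test are hypothetical and exist only to show the shape: everything before the yield is setup, the yielded value is what the test receives, and everything after the yield is cleanup.

```python
import shutil
import tempfile
from pathlib import Path

import pytest


@pytest.fixture
def scratch_dir():
    """Hypothetical fixture: a temporary directory that is removed afterwards."""
    path = Path(tempfile.mkdtemp(prefix="zenith_test_"))
    yield path           # the test runs while the fixture is suspended here
    shutil.rmtree(path)  # cleanup: runs after the test, even if it failed


def test_writes_file(scratch_dir):
    marker = scratch_dir / "marker.txt"
    marker.write_text("hello")
    assert marker.read_text() == "hello"
```

For plain temporary directories pytest's built-in tmp_path fixture already does this; the same pattern is mainly useful for resources such as running processes that need an explicit shutdown.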

Code quality

Before submitting a patch, please consider:

  • Writing a couple of docstrings to clarify the reasoning behind a new test.
  • Running flake8 (or a linter of your choice, e.g. pycodestyle) and fixing possible defects, if any.
  • Formatting the code with yapf -r -i . (TODO: implement an opt-in pre-commit hook for that).
  • (Optional) Typechecking the code with mypy . (currently this mostly affects fixtures/zenith_fixtures.py).

The tools can be installed with pipenv install --dev.