This PR allows setting the
`PAGESERVER_DEFAULT_TENANT_CONFIG_COMPACTION_ALGORITHM` env var to
override the `tenant_config.compaction_algorithm` field in the initial
`pageserver.toml` for all tests.
I tested manually that this works by halting a test using pdb and
inspecting the `effective_config` in the tenant status managment API.
If the env var is set, the tests are parametrized by the `kind` tag
field, allowing to do a matrix build in CI and let Allure summarize
everything in a nice report.
If the env var is not set, the tests are not parametrized. So, merging
this PR doesn't cause problems for flaky test detection. In fact, it
doesn't cause any runtime change if the env var is not set.
There are some tests in the test suite that set used to override
the entire tenant_config using
`NeonEnvBuilder.pageserver_config_override`.
Since config overrides are merged non-recursively, such overrides
that don't specify `kind = ` cause a fallback to pageserver's built-in
`DEFAULT_COMPACTION_ALGORITHM`.
Such cases can be found using
```
["']tenant_config\s*[='"]
```
We'll deal with these tests in a future PR.
closes https://github.com/neondatabase/neon/issues/7555
This pull request adds the new basebackup read path + aux file write
path. In the regression test, all logical replication tests are run with
matrix aux_file_v2=false/true.
Also fixed the vectored get code path to correctly return missing key
error when being called from the unified sequential get code path.
---------
Signed-off-by: Alex Chi Z <chi@neon.tech>
# Problem
While investigating #7124, I noticed that the benchmark was always using
the `DEFAULT_*` `virtual_file_io_engine` , i.e., `tokio-epoll-uring` as
of https://github.com/neondatabase/neon/pull/7077.
The fundamental problem is that the `control_plane` code has its own
view of `PageServerConfig`, which, I believe, will always be a subset of
the real pageserver's `pageserver/src/config.rs`.
For the `virtual_file_io_engine` and `get_vectored_impl` parametrization
of the test suite, we were constructing a dict on the Python side that
contained these parameters, then handed it to
`control_plane::PageServerConfig`'s derived `serde::Deserialize`.
The default in serde is to ignore unknown fields, so, the Deserialize
impl silently ignored the fields.
In consequence, the fields weren't propagated to the `pageserver --init`
call, and the tests ended up using the
`pageserver/src/config.rs::DEFAULT_` values for the respective options
all the time.
Tests that explicitly used overrides in `env.pageserver.start()` and
similar were not affected by this.
But, it means that all the test suite runs where with parametrization
didn't properly exercise the code path.
# Changes
- use `serde(deny_unknown_fields)` to expose the problem
- With this change, the Python tests that override
`virtual_file_io_engine` and
`get_vectored_impl` fail on `pageserver --init`, exposing the problem.
- use destructuring to uncover the issue in the future
- fix the issue by adding the missing fields to the `control_plane`
crate's `PageServerConf`
- A better solution would be for control plane to re-use a struct
provided
by the pageserver crate, so that everything is in one place in
`pageserver/src/config.rs`, but, our config parsing code is (almost)
beyond repair anyways.
- fix the `pageserver_virtual_file_io_engine` to be responsive to the
env var
- => required to make parametrization work in benchmarks
# Testing
Before merging this PR, I re-ran the regression tests & CI with the full
matrix of `virtual_file_io_engine` and `tokio-epoll-uring`, see
9c7ea364e0
All of production is using it now as of
https://github.com/neondatabase/aws/pull/1121
The change in `flaky_tests.py` resets the flakiness detection logic.
The alternative would have been to repeat the choice of io engine in
each test name, which would junk up the various test reports too much.
---------
Co-authored-by: Alexander Bayandin <alexander@neon.tech>
## Problem
Currently, we don't store `PLATFORM` for Nightly Benchmarks. It
causes them to be merged as reruns in Allure report (because they have
the same test name).
## Summary of changes
- Parametrize benchmarks by
- Postgres Version (14/15/16)
- Build Type (debug/release/remote)
- PLATFORM (neon-staging/github-actions-selfhosted/...)
---------
Co-authored-by: Bodobolero <peterbendel@neon.tech>
## Problem
All tests have already been parametrised by Postgres version and build
type (to have them distinguishable in the Allure report), but despite
it, it's anyway required to have DEFAULT_PG_VERSION and BUILD_TYPE env
vars set to corresponding values, for example to
run`test_timeline_deletion_with_files_stuck_in_upload_queue[release-pg14-local_fs]`
test it's required to set `DEFAULT_PG_VERSION=14` and
`BUILD_TYPE=release`.
This PR makes the test framework pick up parameters from the test name
itself.
## Summary of changes
- Postgres version and build type related fixtures now are
function-scoped (instead of being sessions scoped before)
- Deprecate `--pg-version` argument in favour of DEFAULT_PG_VERSION env
variable (it's easier to parse)
- GitHub autocomment now includes only one command with all the failed
tests + runs them in parallel