# Problem
While investigating #7124, I noticed that the benchmark was always using
the `DEFAULT_*` `virtual_file_io_engine` , i.e., `tokio-epoll-uring` as
of https://github.com/neondatabase/neon/pull/7077.
The fundamental problem is that the `control_plane` code has its own
view of `PageServerConfig`, which, I believe, will always be a subset of
the real pageserver's `pageserver/src/config.rs`.
For the `virtual_file_io_engine` and `get_vectored_impl` parametrization
of the test suite, we were constructing a dict on the Python side that
contained these parameters, then handed it to
`control_plane::PageServerConfig`'s derived `serde::Deserialize`.
The default in serde is to ignore unknown fields, so, the Deserialize
impl silently ignored the fields.
In consequence, the fields weren't propagated to the `pageserver --init`
call, and the tests ended up using the
`pageserver/src/config.rs::DEFAULT_` values for the respective options
all the time.
Tests that explicitly used overrides in `env.pageserver.start()` and
similar were not affected by this.
But, it means that all the test suite runs where with parametrization
didn't properly exercise the code path.
# Changes
- use `serde(deny_unknown_fields)` to expose the problem
- With this change, the Python tests that override
`virtual_file_io_engine` and
`get_vectored_impl` fail on `pageserver --init`, exposing the problem.
- use destructuring to uncover the issue in the future
- fix the issue by adding the missing fields to the `control_plane`
crate's `PageServerConf`
- A better solution would be for control plane to re-use a struct
provided
by the pageserver crate, so that everything is in one place in
`pageserver/src/config.rs`, but, our config parsing code is (almost)
beyond repair anyways.
- fix the `pageserver_virtual_file_io_engine` to be responsive to the
env var
- => required to make parametrization work in benchmarks
# Testing
Before merging this PR, I re-ran the regression tests & CI with the full
matrix of `virtual_file_io_engine` and `tokio-epoll-uring`, see
9c7ea364e0
All of production is using it now as of
https://github.com/neondatabase/aws/pull/1121
The change in `flaky_tests.py` resets the flakiness detection logic.
The alternative would have been to repeat the choice of io engine in
each test name, which would junk up the various test reports too much.
---------
Co-authored-by: Alexander Bayandin <alexander@neon.tech>
## Problem
Currently, we don't store `PLATFORM` for Nightly Benchmarks. It
causes them to be merged as reruns in Allure report (because they have
the same test name).
## Summary of changes
- Parametrize benchmarks by
- Postgres Version (14/15/16)
- Build Type (debug/release/remote)
- PLATFORM (neon-staging/github-actions-selfhosted/...)
---------
Co-authored-by: Bodobolero <peterbendel@neon.tech>
## Problem
All tests have already been parametrised by Postgres version and build
type (to have them distinguishable in the Allure report), but despite
it, it's anyway required to have DEFAULT_PG_VERSION and BUILD_TYPE env
vars set to corresponding values, for example to
run`test_timeline_deletion_with_files_stuck_in_upload_queue[release-pg14-local_fs]`
test it's required to set `DEFAULT_PG_VERSION=14` and
`BUILD_TYPE=release`.
This PR makes the test framework pick up parameters from the test name
itself.
## Summary of changes
- Postgres version and build type related fixtures now are
function-scoped (instead of being sessions scoped before)
- Deprecate `--pg-version` argument in favour of DEFAULT_PG_VERSION env
variable (it's easier to parse)
- GitHub autocomment now includes only one command with all the failed
tests + runs them in parallel