## Problem
neondatabase/neon#12601 didn't completely disable writing `*.profraw`
files; instead of `/tmp/coverage`, they started being written into the
current directory.
## Summary of changes
- Set `LLVM_PROFILE_FILE=/dev/null` to avoid writing `*.profraw` files at all
## Problem
We don't use code coverage produced by `regress-tests`
(neondatabase/neon#6798), so there's no need to collect it. Disabling it
should reduce the load on disks and potentially improve the stability of
debug builds.
## Summary of changes
- Disable code coverage collection for regression tests
## Problem
We run benchmarks in batches (five parallel jobs on different runners).
If any test in a batch fails, we won’t upload any results for that
batch, even for the tests that passed.
## Summary of changes
- Move the results upload to a separate step in the run-python-test-set
action, and execute this step even if tests fail.
## Problem
There are mentions of `ALLOW_BACKWARD_COMPATIBILITY_BREAKAGE` and
`ALLOW_FORWARD_COMPATIBILITY_BREAKAGE`, but in reality, this mechanism
doesn't work, so let's remove it to avoid confusion.
The idea behind it was to allow some breaking changes by adding a
special label to a PR that would `xfail` the test. However, in practice,
this means we would need to carry this label through all subsequent PRs
until the release (and artifact regeneration). This approach isn't
really viable, as it increases the risk of missing a compatibility break
in another PR.
## Summary of changes
- Remove mentions and handling of
`ALLOW_BACKWARD_COMPATIBILITY_BREAKAGE` /
`ALLOW_FORWARD_COMPATIBILITY_BREAKAGE`
## Problem
When a build is made with sanitizers, this is not reflected in the
artifact name, which can lead to overwriting normal builds with sanitized
ones.
## Summary of changes
Take this property of a build into account when constructing the
artifact name.
## Problem
There are a bunch of minor improvements that are each too small to
warrant a separate PR, so they are collected in one PR here.
## Summary of changes
- Add runner arch to artifact name to make it easier to distinguish
files on S3
([ref](https://neondb.slack.com/archives/C059ZC138NR/p1739365938371149))
- Use `github.event.pull_request.number` instead of parsing
`$GITHUB_EVENT_PATH` file
- Update Allure CLI and `allure-pytest`
## Problem
The `benchmarking` job fails because the `aws-oicd-role-arn` input is not set
## Summary of changes
- Set `aws-oicd-role-arn` for the `benchmarking` job
- Always require `aws-oicd-role-arn` to be set
- Rename `aws_oicd_role_arn` to `aws-oicd-role-arn` for consistency
## Problem
CI currently uses static credentials in some places. These are less
secure and hard to maintain, so we are going to deprecate them and use
OIDC auth.
## Summary of changes
- ci(fix): Use OIDC auth to upload artifact on s3
- ci(fix): Use OIDC auth to login on ECR
## Problem
Currently, we rerun only known flaky tests. This approach was chosen to
reduce the number of failing tests that go unnoticed (by forcing people to
take a look at failed tests and rerun the job manually), but it has some
drawbacks:
- In PRs, people tend to push new changes without checking failed tests
(that's ok)
- On the main branch, tests are just restarted without checking
(understandable)
- Parametrised tests become flaky one by one, i.e. if `test[1]` is flaky,
`test[2]` is not marked as flaky automatically (which may or may not
be the case).
I suggest rerunning all failed tests to increase the stability of GitHub
jobs and using the Grafana Dashboard with flaky tests for deeper
analysis.
## Summary of changes
- Rerun all failed tests twice at max
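A minimal sketch of the mechanism, assuming `pytest-rerunfailures` stays in
use (the actual plugin code in the repo may differ): instead of marking only
the known-flaky tests, every collected test gets a `flaky` marker, so any
failure is retried at most twice.

```python
# conftest.py sketch (illustrative, not the exact change): with
# pytest-rerunfailures installed, the "flaky" marker makes pytest rerun a
# failed test; applying it to every item reruns all failed tests, twice at most.
import pytest

def pytest_collection_modifyitems(config, items):
    for item in items:
        item.add_marker(pytest.mark.flaky(reruns=2))
```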
## Problem
The first version of the ingest benchmark had some parsing and reporting
logic in shell script inside GitHub workflow.
It is better to move that logic into a Python test case so that we can
also run it locally.
## Summary of changes
- Create new python testcase
- invoke pgcopydb inside python test case
- move the following logic into python testcase
- determine backpressure
- invoke pgcopydb and report its progress
- parse pgcopydb log and extract metrics
- insert metrics into perf test database
- add an additional column to the perf test database for the endpoint ID
used in the pgcopydb run, so that it's available in the Grafana dashboard
when retrieving other metrics for an endpoint
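A rough sketch of the shape of the new test; the fixture, env-var names, and
the metric-recording call are assumptions, not the actual implementation:

```python
# Hypothetical sketch of the ingest benchmark as a Python test case.
import os
import re
import subprocess

def test_ingest_with_pgcopydb(tmp_path, zenbenchmark):
    log_file = tmp_path / "pgcopydb.log"
    env = {
        **os.environ,
        # connection-string env var names are assumptions
        "PGCOPYDB_SOURCE_PGURI": os.environ["INGEST_SOURCE_CONNSTR"],
        "PGCOPYDB_TARGET_PGURI": os.environ["INGEST_TARGET_CONNSTR"],
    }
    # invoke pgcopydb from the test instead of a shell step in the workflow,
    # capturing its log so metrics can be parsed afterwards
    with open(log_file, "w") as log:
        subprocess.run(["pgcopydb", "clone"], env=env,
                       stdout=log, stderr=subprocess.STDOUT, check=True)

    # parse the pgcopydb log and extract a metric (regex is illustrative only)
    match = re.search(r"in\s+(\d+)s", log_file.read_text())
    assert match is not None, "could not find a duration in the pgcopydb log"

    # report the metric so it lands in the perf test database
    # (fixture name and signature are assumed)
    zenbenchmark.record("pgcopydb_duration", int(match.group(1)), unit="s")
```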
## Example run
https://github.com/neondatabase/neon/actions/runs/11860622170/job/33056264386
## Problem
This PR switches CI and Storage to Debian 12 (Bookworm) based images.
## Summary of changes
- Add the Debian codename (`bookworm`/`bullseye`) to most Docker tags, and
create un-codenamed images to be used by default
- `vm-compute-node-image`: create a separate spec for `bookworm` (we
don't need to build cgroups in the future)
- `neon-image`: Switch to `bookworm`-based `build-tools` image
- Storage components and Proxy use it
- CI: run lints and tests on `bookworm`-based `build-tools` image
## Problem
The `run-python-test-set` action fails if it is not used for the
`regress_tests` suite on a release PR, because it expects
`test_compatibility.py::test_create_snapshot` to generate a snapshot,
and that test exists only in the `regress_tests` suite.
For example, in https://github.com/neondatabase/neon/pull/9291 the
[`test-postgres-client-libs`](https://github.com/neondatabase/neon/actions/runs/11209615321/job/31155111544)
job failed.
## Summary of changes
- Add `skip-if-does-not-exist` input to `.github/actions/upload` action
(the same way we do for `.github/actions/download`)
- Set `skip-if-does-not-exist=true` for "Upload compatibility snapshot"
step in `run-python-test-set` action
## Problem
`promote-compatibility-data` job got broken and slightly outdated after
- https://github.com/neondatabase/neon/pull/8552 -- we don't upload
artifacts for ARM64
- https://github.com/neondatabase/neon/pull/8561 -- we don't prepare
`debug` artifacts in the release branch anymore
## Summary of changes
- Promote artifacts from release PRs to the latest version (but do it
from `release` branch)
- Upload artifacts for both X64 and ARM64
## Problem
We want to run our regression test suite on ARM.
## Summary of changes
- run regression tests on release ARM builds
- run `build-neon` (including rust tests) on debug ARM builds
- add an `arch` parameter to tests to distinguish them in the Allure report
and in the database (see the sketch below)
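A minimal sketch of one way to attach the architecture to every test (the
mechanism actually used may differ): a single-valued parametrized autouse
fixture puts it into the test ID, which both Allure and the results database
pick up.

```python
# conftest.py sketch (assumed approach, not necessarily the real one)
import platform
import pytest

@pytest.fixture(autouse=True, params=[platform.machine()],
                ids=lambda arch: f"arch-{arch}")
def arch(request) -> str:
    # e.g. test_foo[arch-x86_64] on X64 runners, test_foo[arch-aarch64] on ARM64
    return request.param
```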
## Problem
The default Postgres version is set to 15 in code, while we use 16 in
most of the other places (and Postgres 17 is coming)
## Summary of changes
- Run `benchmarks` job with Postgres 16 (instead of Postgres 14)
- Set `DEFAULT_PG_VERSION` to 16 in all places
- Remove deprecated `--pg-version` pytest argument
- Update `test_metadata_bincode_serde_ensure_roundtrip` for Postgres 16
## Problem
A bunch of small fixes and improvements for CI that are too small to
have separate PRs.
## Summary of changes
- CI(build-and-test): fix parenthesis
- CI(actionlint): fix path to workflow file
- CI: remove default args from actions/checkout
- CI: remove the `gen3` label; using a combination of `self-hosted` +
`small{,-arm64}`/`large{,-arm64}` is enough
- CI: prettify Slack messages, hide links behind text messages
- CI(build-and-test): add more dependencies to `conclusion` job
## Problem
This is an experiment to see if 16x concurrency is actually helping, or
if it's just giving us very noisy results. If the total runtime with a
lower concurrency is similar, then a lower concurrency is preferable to
reduce the impact of resource-hungry tests running concurrently.
These tests will help verify that replication, both physical and
logical, works as expected in Neon.
Co-authored-by: Sasha Krassovsky <sasha@neon.tech>
## Problem
Some of the Nightly benchmarks fail with the error
```
+ /tmp/neon/pg_install/v14/bin/pgbench --version
/tmp/neon/pg_install/v14/bin/pgbench: error while loading shared libraries: libpq.so.5: cannot open shared object file: No such file or directory
```
Originally, we added the `pgbench --version` call to check that
`pgbench` is installed and to fail earlier if it's not.
The failure happens because we don't have `LD_LIBRARY_PATH` set for
every job, and it also affects the `psql` commands.
We can move the `psql` commands to `actions/run-python-test-set` so as not
to duplicate code (as it already has `LD_LIBRARY_PATH` set).
## Summary of changes
- Remove `pgbench --version` call
- Move `psql` commands to common `actions/run-python-test-set`
## Problem
We use `build-tools` image as a base image to build other images, and it
has a pretty old `libpq-dev` installed (v13; it wasn't that old until I
removed system Postgres 14 from `build-tools` image in
https://github.com/neondatabase/neon/pull/6540)
## Summary of changes
- Remove `libpq-dev` from `build-tools` image
- Set `LD_LIBRARY_PATH` for tests (for different Postgres binaries that
we use, like psql and pgbench)
- Set `PQ_LIB_DIR` to build Storage Controller
- Set `LD_LIBRARY_PATH`/`DYLD_LIBRARY_PATH` in the Storage Controller
where it calls Postgres binaries
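For the Python tests, the gist is prepending the vendored Postgres `lib`
directory to the dynamic-linker search path before spawning client binaries;
a hedged sketch (paths and helper name are illustrative):

```python
# Sketch only: point psql/pgbench at the vendored libpq now that libpq-dev
# is no longer installed in the build-tools image.
import os
import subprocess

def pg_client_env(pg_install_dir: str) -> dict:
    lib_dir = os.path.join(pg_install_dir, "lib")
    env = dict(os.environ)
    # LD_LIBRARY_PATH on Linux, DYLD_LIBRARY_PATH on macOS
    for var in ("LD_LIBRARY_PATH", "DYLD_LIBRARY_PATH"):
        env[var] = lib_dir + os.pathsep + env.get(var, "")
    return env

pg_dir = "/tmp/neon/pg_install/v16"  # illustrative install path
subprocess.run([os.path.join(pg_dir, "bin", "pgbench"), "--version"],
               env=pg_client_env(pg_dir), check=True)
```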
## Problem
The cache keys that we use on CI are the same for X64 and ARM64
(`runner.arch`)
## Summary of changes
- Include `runner.arch` along with `runner.os` into cache keys
## Problem
There are a couple of small CI cleanups that seem too small for dedicated PRs
## Summary of changes
- Create the release PR with a title that matches the title in the description
- Tune the error message for disallowing `ubuntu-latest` to explicitly
mention what to do
- Remove JUnit output from pytest; we use Allure instead
## Problem
We use a bunch of deprecated actions.
See https://github.com/neondatabase/neon/actions/runs/7958569728
(Annotations section)
```
Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/checkout@v3, actions/setup-java@v3, actions/cache@v3, actions/github-script@v6. For more information see: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/.
```
## Summary of changes
- `actions/cache@v3` -> `actions/cache@v4`
- `actions/checkout@v3` -> `actions/checkout@v4`
- `actions/github-script@v6` -> `actions/github-script@v7`
- `actions/setup-java@v3` -> `actions/setup-java@v4`
- `actions/upload-artifact@v3` -> `actions/upload-artifact@v4`
## Problem
We use an outdated version of Python (3.9.2)
## Summary of changes
- Update Python to the latest patch version (3.9.18)
- Unify the usage of python caches where possible
## Problem
We've got several issues with the current `benchmarks` job setup:
- The `benchmark_durations.json` file (which we generate at runtime to
split tests into several jobs[0]) is not consistent between these jobs
(and not consistent at all with the file from a rerun of the job), i.e.
the test selection for each job can differ, which could result in missed
tests in a test run.
- `scripts/benchmark_durations` doesn't fetch all tests from the
database (it doesn't expect any extra directories inside
`test_runner/performance`)
- For some reason, the current split into 4 groups ends up with the 4th
group having no tests to run, which fails the job[1]
- [0] https://github.com/neondatabase/neon/pull/4683
- [1] https://github.com/neondatabase/neon/issues/6629
## Summary of changes
- Generate `benchmark_durations.json` file once before we start
`benchmarks` jobs (this makes it consistent across the jobs) and pass
the file content through the GitHub Actions input (this makes it
consistent for reruns)
- Fix the SQL query in `scripts/benchmark_durations` to fetch all required
tests
- Split benchmarks into 5 jobs instead of 4 jobs.
This adds PostgreSQL 16 as a vendored PostgreSQL version and adapts the
code to support this version.
The important change compared to the PostgreSQL 15 changeset is the
addition of a `neon_rmgr` resource manager instead of altering Postgres's
original WAL format.
Co-authored-by: Alexander Bayandin <alexander@neon.tech>
Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
## Problem
We still need to rerun some builds manually because flaky tests weren't
detected automatically.
I found two reasons for it:
- If a test is flaky on a particular build type and Postgres version,
there's a high chance that it is flaky in all configurations, but we don't
detect such cases automatically.
- We detect flaky tests only on the main branch, which requires manually
retriggering runs for freshly introduced flaky tests.
Both of these are fixed in this PR.
## Summary of changes
- Spread flakiness of a single test to all configurations
- Detect flaky tests in all branches (not only on the main branch)
- Look back only at 7 days of test history (instead of 10)
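A hedged sketch of the detection idea (the table and column names are
assumptions, not the real schema): a test that failed in any branch within
the last 7 days is treated as flaky for every build type and Postgres
version.

```python
# Illustrative only; the real scripts/flaky_tests.py query and schema differ.
FLAKY_TESTS_QUERY = """
    SELECT DISTINCT parent_suite, suite, name
    FROM results                                  -- assumed table name
    WHERE started_at > now() - interval '7 days'  -- was 10 days before
      AND status IN ('failed', 'broken')          -- any branch, not only main
"""

def spread_to_all_configurations(flaky_names: set) -> dict:
    # the same set of flaky test names is applied to every configuration,
    # instead of tracking flakiness per (build type, Postgres version) pair
    configurations = [f"{bt}-pg{pg}"
                      for bt in ("debug", "release")
                      for pg in (14, 15, 16)]
    return {conf: set(flaky_names) for conf in configurations}
```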
## Problem
We've got an example of Allure reports from 2 different runners for the
same build that started uploading at the exact same second, causing one to
overwrite the other.
## Summary of changes
- Use the Postgres version to distinguish artifacts (along with the
build type)
## Problem
The benchmarks run takes about an hour on the main branch (in a single
job), which delays pipeline results, and it takes another hour if we need
to restart the job due to failures.
## Summary of changes
- Use `pytest-split` plugin to run benchmarks on separate CI runners in
4 parallel jobs
- Add `scripts/benchmark_durations.py` for getting benchmark durations
from the database to help `pytest-split` schedule tests more evenly. It
uses the p99 of durations over the last 10 days.
The current distribution could be better; each worker's total duration
varies from 9m to 35m, but this can be improved in subsequent PRs.
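A hedged sketch of what `scripts/benchmark_durations.py` computes (table and
column names are assumptions): the p99 duration of each benchmark over the
last 10 days, dumped into the JSON file that `pytest-split` reads via
`--durations-path`.

```python
# Illustrative sketch; the actual script's query and schema differ.
import json
import psycopg2

DURATIONS_QUERY = """
    SELECT test_name,
           percentile_cont(0.99) WITHIN GROUP (ORDER BY duration_seconds) AS p99
    FROM benchmark_results                      -- assumed table name
    WHERE started_at > now() - interval '10 days'
    GROUP BY test_name
"""

def main(connstr: str, output_path: str) -> None:
    with psycopg2.connect(connstr) as conn, conn.cursor() as cur:
        cur.execute(DURATIONS_QUERY)
        durations = {name: float(p99) for name, p99 in cur.fetchall()}
    with open(output_path, "w") as f:
        json.dump(durations, f, indent=2)
```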
## Problem
Compatibility tests don't support Postgres 15 yet, but we're still
trying to upload the compatibility snapshot (which we do not collect).
Ref
https://github.com/neondatabase/neon/actions/runs/4991394158/jobs/8940369368#step:4:38129
## Summary of changes
Add a `pg_version` parameter to the `run-python-test-set` action and do not
upload the compatibility snapshot for Postgres 15
This PR adds test runs on Postgres 15 and creates a unified Allure report
with results for all tests.
- Split `.github/actions/allure-report` into
`.github/actions/allure-report-store` and
`.github/actions/allure-report-generate`
- Add debug or release pytest parameter for all tests (depending on
`BUILD_TYPE` env variable)
- Add Postgres version as a pytest parameter for all tests (depending on
`DEFAULT_PG_VERSION` env variable)
- Fix `test_wal_restore` and `restore_from_wal.sh` to support paths with
`[`/`]` in them (fixed by applying shellcheck to the script and fixing all
warnings); `restore_from_wal_archive.sh` is deleted as unused.
- All known failures on Postgres 15 are marked with xfail
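A minimal sketch of the two "pytest parameter" bullets above (fixture names
are assumptions; the real conftest is more involved): the build type and
Postgres version are read from the environment and become visible
parameters of every test ID.

```python
# conftest.py sketch: produces IDs like test_foo[release-pg15]
# (exact ID layout in the real suite may differ)
import os
import pytest

@pytest.fixture(autouse=True, params=[os.environ["BUILD_TYPE"]])
def build_type(request) -> str:
    return request.param

@pytest.fixture(autouse=True, params=[f"pg{os.environ['DEFAULT_PG_VERSION']}"])
def pg_version(request) -> str:
    return request.param
```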
This PR adds a plugin that automatically reruns (up to 3 times) flaky
tests. Internally, it uses data from the `TEST_RESULT_CONNSTR` database and
the `pytest-rerunfailures` plugin.
As a first approximation, we consider a test flaky if it has failed on
the main branch in the last 10 days.
Flaky tests are fetched by the `scripts/flaky_tests.py` script (which can
also be used standalone to learn which tests are flaky), stored in a JSON
file, and the file is then passed to the pytest plugin.
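A hedged sketch of the plugin's core (the option name and JSON layout are
assumptions): tests found in the JSON file get the `pytest-rerunfailures`
`flaky` marker and are retried up to three times.

```python
# Illustrative conftest/plugin sketch; the real plugin handles more details.
import json
import pytest

def pytest_addoption(parser):
    parser.addoption("--flaky-tests-json", default=None,
                     help="JSON file produced by scripts/flaky_tests.py")

def pytest_collection_modifyitems(config, items):
    path = config.getoption("--flaky-tests-json")
    if not path:
        return
    with open(path) as f:
        flaky_nodeids = set(json.load(f))  # assumed layout: list of test node IDs
    for item in items:
        if item.nodeid in flaky_nodeids:
            item.add_marker(pytest.mark.flaky(reruns=3))
```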
This reverts commit 56a4466d0a.
It seems that flakiness increased after this commit, while the time saved
was only a couple of seconds.
With every regular Python test spawning 1 etcd, 3 safekeepers, 1
pageserver, a few CLI commands, and post-run cleanup hooks, it might be
hard to run many such tests in parallel.
We could return to this later, after we consider an alternative test
structure and/or CI runner structure.
I have experimented with the number of runner threads, and it looks like 8
threads win us a few seconds.
Bumping the thread count further did not improve the situation much:
* 20 threads were not allowed by pytest
* 16 threads were flaking quite noticeably
My guess is that all the pageservers, safekeepers, and other nodes we
start consume too much CPU and other resources for this approach to scale
further.
Add `test_forward_compatibility`, which checks if it's going to
be possible to roll back a release to the previous version.
The test uses artifacts (Neon & Postgres binaries) from the previous
release to start Neon on the repo created by the current version. It
performs exactly the same checks as `test_backward_compatibility` does.
The single `ALLOW_BREAKING_CHANGES` env var is replaced by
`ALLOW_BACKWARD_COMPATIBILITY_BREAKAGE` &
`ALLOW_FORWARD_COMPATIBILITY_BREAKAGE`, which can be set via the `backward
compatibility breakage` and `forward compatibility breakage` labels
respectively.
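A hedged sketch of the test's shape and the label-driven escape hatch
(fixture and helper names are assumptions):

```python
# Illustrative sketch only; the real test lives next to test_backward_compatibility.
import os
import pytest

def _breakage_allowed(var: str) -> bool:
    # CI sets the env var when the corresponding PR label is present
    return os.environ.get(var, "false").lower() == "true"

@pytest.mark.xfail(_breakage_allowed("ALLOW_FORWARD_COMPATIBILITY_BREAKAGE"),
                   reason="forward compatibility breakage allowed via PR label")
def test_forward_compatibility(neon_env_from_previous_release):
    # the fixture (name assumed) starts Neon & Postgres binaries from the
    # previous release against a repo created by the current version, then
    # the test runs the same checks as test_backward_compatibility
    ...
```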