rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-09 14:32:57 +00:00

Author	SHA1	Message	Date
Em Sharnoff	44202eeb3b	Bump vm-builder v0.18.1 -> v0.18.2 (#5646 ) Only applicable change was neondatabase/autoscaling#571, removing the postgres_exporter flags `--auto-discover-databases` and `--exclude-databases=...`	2023-10-24 16:04:28 -07:00
Alexander Bayandin	a8a800af51	Run real Azure tests on CI (#5627 ) ## Problem We do not run real Azure-related tests on CI ## Summary of changes - Set required env variables to run real Azure blob storage tests on CI	2023-10-24 12:12:11 +01:00
Arthur Petukhovsky	ba856140e7	Fix neon_extra_build.yml (#5605 ) Build walproposer-lib in gather-rust-build-stats, fix nproc usage, fix walproposer-lib on macos.	2023-10-19 22:20:39 +01:00
Shany Pozin	893b7bac9a	Fix neon_extra_builds.yml : nproc is not supported in mac os (#5598 ) ## Problem nproc is not supported in mac os, use sysctl -n hw.ncpu instead	2023-10-19 15:24:23 +01:00
Arthur Petukhovsky	66f8f5f1c8	Call walproposer from Rust (#5403 ) Create Rust bindings for C functions from walproposer. This allows writing better tests with real walproposer code without spawning multiple processes and starting up the whole environment. A `make walproposer-lib` stage was added to build static libraries `libwalproposer.a`, `libpgport.a`, `libpgcommon.a`. These libraries can be statically linked to any executable to call walproposer functions. `libs/walproposer/src/walproposer.rs` contains `test_simple_sync_safekeepers` to test that walproposer can be called from Rust to emulate sync_safekeepers logic. It can also be used as a usage example.	2023-10-19 14:17:15 +01:00
Em Sharnoff	16c87b5bda	Bump vm-builder v0.17.12 -> v0.18.1 (#5583 ) Only applicable change was neondatabase/autoscaling#566, updating pgbouncer to 1.21.0 and enabling support for prepared statements.	2023-10-18 11:10:01 +02:00
Alexander Bayandin	522aaca718	Temporary deploy staging preprod region from main (#5477 ) ## Problem Staging preprod region can't use `release-XXX` right now, the config is unified across all regions, it supports only `XXX`. Ref https://neondb.slack.com/archives/C03H1K0PGKH/p1696506459720909?thread_ts=1696437812.365249&cid=C03H1K0PGKH ## Summary of changes - Deploy staging-preprod from main	2023-10-05 14:02:20 +00:00
Alexander Bayandin	7a2cafb34d	Use zstd to compress large allure artifacts (#5458 ) ## Problem - Because we compress artifacts file by file, we don't need to put them into `tar` containers (i.e. instead of `tar.gz` we can use just `gz`). - Python's gzip is single-threaded and pretty slow. A benchmark has shown ~20 times speedup (19.876176291 vs 0.8748335830000009) on my laptop (for a 1.3M pageserver.log) ## Summary of changes - Replace tarfile with zstandard - Update allure to 2.24.0	2023-10-04 16:20:16 +01:00
Em Sharnoff	5fdc80db03	Bump vm-builder v0.17.11 -> v0.17.12 (#5407 ) Only relevant change is neondatabase/autoscaling#534 - refer there for more details.	2023-09-28 09:52:39 +02:00
Em Sharnoff	a24cd69589	Bump vm-builder v0.17.10 -> v0.17.11 (#5371 ) This only includes the changes from neondatabase/autoscaling#525, which improves graceful VM shutdown.	2023-09-25 19:49:07 +01:00
Alexander Bayandin	3048a5f0e2	Deploy releases to staging-preprod first (#5308 ) ## Problem Before releasing new version to production, we'd like to run a set of required checks on the incoming release. The simplest approach, which doesn't require many changes — dedicate one staging region to `preprod` installation. The proposed changes to the release flow are the following: - When a release PR is merged into the release branch — trigger deployment from the release branch to a dedicated staging-preprod region (for now, it's going to be `eu-west-1` — Ireland) Corresponding infrastructure PR: https://github.com/neondatabase/aws/pull/585 ## Summary of changes - Trigger `deploy.dev` workflow with `-f deployPreprodRegion=true` for release branch	2023-09-22 14:17:43 +01:00
Em Sharnoff	18f3a706da	Bump vm-builder v0.17.5 -> v0.17.10 (#5334 ) Only notable change is including neondatabase/autoscaling#523, which we hope will help with making sure that TCP connections are properly terminated before shutdown (which hopefully fixes a leak in the pageserver).	2023-09-18 17:30:34 +00:00
Alexander Bayandin	bd36d1c44a	approved-for-ci-run.yml: fix variable name and permissions (#5307 ) ## Problem - `gh pr list` fails with `unknown argument "main"; please quote all values that have spaces` due to using a variable with the wrong name - `permissions: write-all` is too broad for the job ## Summary of changes - For variable name `HEAD` -> `BRANCH` - Grant only required permissions for each job --------- Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-09-14 20:18:49 +03:00
Alexander Bayandin	2641ff3d1a	Use CI_ACCESS_TOKEN to create release PR (#5286 ) ## Problem If @github-actions creates release PR, the CI pipeline is not triggered (but we have `release-notify.yml` workflow that we expect to run on this event). I suspect this happened because @github-actions is not a repository member. Ref https://github.com/neondatabase/neon/pull/5283#issuecomment-1715209291 ## Summary of changes - Use `CI_ACCESS_TOKEN` to create a PR - Use `gh` instead of `thomaseizinger/create-pull-request` - Restrict permissions for GITHUB_TOKEN to `contents: write` only (required for `git push`)	2023-09-12 20:01:21 +01:00
Alexander Bayandin	e1661c3c3c	approved-for-ci-run.yml: fix ci-run/pr-* branch deletion (#5278 ) ## Problem `ci-run/pr-*` branches (and attached PRs) should be deleted automatically when their parent PRs get closed. But they are not ## Summary of changes - Fix if-condition	2023-09-12 19:29:26 +03:00
MMeent	83e7e5dbbd	Feat/postgres 16 (#4761 ) This adds PostgreSQL 16 as a vendored postgresql version, and adapts the code to support this version. The important changes to PostgreSQL 16 compared to the PostgreSQL 15 changeset include the addition of a neon_rmgr instead of altering Postgres's original WAL format. Co-authored-by: Alexander Bayandin <alexander@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2023-09-12 15:11:32 +02:00
Alexander Bayandin	d33e1b1b24	approved-for-ci-run.yml: use token to checkout the repo (#5266 ) ## Problem Another thing I overlooked regarding `approved-for-ci-run`: - When we create a PR, the action is associated with @vipvap and this triggers the pipeline — this is good. - When we update the PR by force-pushing to the branch, the action is associated with @github-actions, which doesn't trigger a pipeline — this is bad. Initially spotted in #5239 / #5211 ([link](https://github.com/neondatabase/neon/actions/runs/6122249456/job/16633919558?pr=5239)) — `check-permissions` should not fail. ## Summary of changes - Use `CI_ACCESS_TOKEN` to check out the repo (I expect this token will be reused in the following `git push`)	2023-09-10 20:12:38 +01:00
Alexander Bayandin	34e39645c4	GitHub Workflows: add actionlint (#5265 ) ## Problem Add a CI pipeline that checks GitHub Workflows with https://github.com/rhysd/actionlint (it uses `shellcheck` for shell scripts in steps) To run it locally: `SHELLCHECK_OPTS=--exclude=SC2046,SC2086 actionlint` ## Summary of changes - Add `.github/workflows/actionlint.yml` - Fix actionlint warnings	2023-09-10 20:05:07 +01:00
Alexander Bayandin	1ea93af56c	Create GitHub release from release tag (#5246 ) ## Problem This PR creates a GitHub release from a release tag with an autogenerated changelog: https://github.com/neondatabase/neon/releases ## Summary of changes - Call GitHub API to create a release	2023-09-09 22:02:28 +01:00
Alexander Bayandin	028fbae161	Miscellaneous fixes for tests-related things (#5259 ) ## Problem A bunch of fixes for different test-related things ## Summary of changes - Fix test_runner/pg_clients (`subprocess_capture` return value has changed) - Do not run create-test-report if check-permissions failed for not cancelled jobs - Fix Code Coverage comment layout after flaky tests. Add another healing "\n" - test_compatibility: add an instruction for local run Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-09-08 16:28:09 +01:00
Alexander Bayandin	d15563f93b	Misc workflows: fix quotes in bash (#5235 )	2023-09-07 19:39:42 +03:00
Alexander Bayandin	f8a91e792c	Even better handling of `approved-for-ci-run` label (#5227 ) ## Problem We've got `approved-for-ci-run` to work 🎉 But it's still a bit rough, this PR should improve the UX for external contributors. ## Summary of changes - `build_and_test.yml`: add `check-permissions` job, which fails if PR is created from a fork. Make all jobs in the workflow depend on `check-permissions` to fail fast - `approved-for-ci-run.yml`: add `cleanup` job to close `ci-run/pr-` PRs and delete linked branches when the parent PR is closed - `approved-for-ci-run.yml`: fix the layout for the `ci-run/pr-` PR description - GitHub Autocomment: add a comment with tests result to the original PR (instead of a PR from `ci-run/pr-*` )	2023-09-07 14:21:01 +01:00
Alexander Bayandin	e4b1d6b30a	Misc post-merge fixes (#5219 ) ## Problem - `SCALE: unbound variable` from https://github.com/neondatabase/neon/pull/5079 - The layout of the GitHub auto-comment is broken if the code coverage section follows flaky test section from https://github.com/neondatabase/neon/pull/4999 ## Summary of changes - `benchmarking.yml`: Rename `SCALE` to `TEST_OLAP_SCALE` - `comment-test-report.js`: Add an extra new-line before Code coverage section	2023-09-06 20:11:44 +03:00
Alexander Bayandin	76a96b0745	Notify Slack channel about upcoming releases (#5197 ) ## Problem When the next release is coming, we want to let everyone know about it by posting a message to the Slack channel with a list of commits. ## Summary of changes - `.github/workflows/release-notify.yml` is added - the workflow sends a message to `vars.SLACK_UPCOMING_RELEASE_CHANNEL_ID` (or [#test-release-notifications](https://neondb.slack.com/archives/C05QQ9J1BRC) if not configured) - On each PR update, the workflow updates the list of commits in the message (it doesn't send additional messages)	2023-09-06 17:52:21 +01:00
Alexander Bayandin	d5f1858f78	approved-for-ci-run.yml: use different tokens (#5218 ) ## Problem `CI_ACCESS_TOKEN` has quite limited access (which is good), but this doesn't allow it to remove labels from PRs (which is bad) ## Summary of changes - Use `GITHUB_TOKEN` to remove labels - Use `CI_ACCESS_TOKEN` to create PRs	2023-09-06 18:50:59 +03:00
Alexander Bayandin	da60f69909	approved-for-ci-run.yml: use our bot (#5216 ) ## Problem Pull Requests created by GitHub Actions bot don't have access to secrets, so we need to use our bot for it to auto-trigger a test run See previous PRs #4663, #5210, #5212 ## Summary of changes - Use our bot to create PRs	2023-09-06 14:55:11 +03:00
Alexander Bayandin	8e25d3e79e	test_runner: add scale parameter to tpc-h tests (#5079 ) ## Problem It's hard to find out which DB size we use for OLAP benchmarks (TPC-H in particular). This PR adds handling of `TEST_OLAP_SCALE` env var, which gets added to a test name as a parameter. This is required for performing larger periodic benchmarks. ## Summary of changes - Handle `TEST_OLAP_SCALE` in `test_runner/performance/test_perf_olap.py` - Set `TEST_OLAP_SCALE` in `.github/workflows/benchmarking.yml` to a TPC-H scale	2023-09-06 13:22:57 +03:00
Vadim Kharitonov	88b1ac48bd	Create Release PR at 7:00 UTC every Tuesday (#5213 )	2023-09-06 13:17:52 +03:00
Alexander Bayandin	15ff4e5fd1	approved-for-ci-run.yml: trigger on pull_request_target (#5212 ) ## Problem Continuation of #4663, #5210 We're still getting an error: ``` GraphQL: Resource not accessible by integration (removeLabelsFromLabelable) ``` ## Summary of changes - trigger `approved-for-ci-run.yml` workflow on `pull_request_target` instead of `pull_request`	2023-09-06 13:14:07 +03:00
Alexander Bayandin	dbfb4ea7b8	Make CI more friendly for external contributors. Second try (#5210 ) ## Problem `approved-for-ci-run` label logic doesn't work as expected: - https://github.com/neondatabase/neon/pull/4722#issuecomment-1636742145 - https://github.com/neondatabase/neon/pull/4722#issuecomment-1636755394 Continuation of https://github.com/neondatabase/neon/pull/4663 Closes #2222 (hopefully) ## Summary of changes - Create a twin PR automatically - Allow `GITHUB_TOKEN` to manipulate labels	2023-09-06 10:06:55 +01:00
Alexander Bayandin	c222320a2a	Generate lcov coverage report (#4999 ) ## Problem We want to display coverage information for each PR. - an example of a full coverage report: https://neon-github-public-dev.s3.amazonaws.com/code-coverage/abea64800fb390c32a3efe6795d53d8621115c83/lcov/index.html - an example of GitHub auto-comment with coverage information: https://github.com/neondatabase/neon/pull/4999#issuecomment-1679344658 ## Summary of changes - Use patched[*](`426e7e7a22`) lcov to generate coverage report - Upload HTML coverage report to S3 - `scripts/comment-test-report.js`: add coverage information	2023-09-06 00:48:15 +01:00
Alexander Bayandin	7ceddadb37	Merge custom extension CI jobs (#5194 ) ## Problem When a remote custom extension build fails, it looks a bit confusing on neon CI: - `trigger-custom-extensions-build` is green - `wait-for-extensions-build` is red - `build-and-upload-extensions` is red But to restart the build (to get everything green), you need to restart the only passed `trigger-custom-extensions-build`. ## Summary of changes - Merge `trigger-custom-extensions-build` and `wait-for-extensions-build` jobs into `trigger-custom-extensions-build-and-wait`	2023-09-05 14:02:37 +01:00
Alexander Bayandin	e9f2c64322	Wait for custom extensions build before deploy (#5170 ) ## Problem Currently, the `deploy` job doesn't wait for the custom extension job (in another repo) and can be started even with failed extensions build. This PR adds another job that polls the status of the extension build job and fails if the extension build fails. ## Summary of changes - Add `wait-for-extensions-build` job, which waits for a custom extension build in another repo.	2023-09-01 12:59:19 +01:00
Alexander Bayandin	7e39a96441	scripts/flaky_tests.py: Improve flaky tests detection (#5094 ) ## Problem We still need to rerun some builds manually because flaky tests weren't detected automatically. I found two reasons for it: - If a test is flaky on a particular build type, on a particular Postgres version, there's a high chance that this test is flaky on all configurations, but we don't automatically detect such cases. - We detect flaky tests only on the main branch, which requires manual retrigger runs for freshly made flaky tests. Both of them are fixed in the PR. ## Summary of changes - Spread flakiness of a single test to all configurations - Detect flaky tests in all branches (not only in the main) - Look back only at 7 days of test history (instead of 10)	2023-08-29 11:53:24 +01:00
Felix Prasanna	7b5489a0bb	compute_ctl: start pg in cgroup for vms (#4920 ) Starts `postgres` in cgroup directly from `compute_ctl` instead of from `vm-builder`. This is required because the `vm-monitor` cannot be in the cgroup it is managing. Otherwise, it itself would be frozen when freezing the cgroup. Requires https://github.com/neondatabase/cloud/pull/6331, which adds the `AUTOSCALING` environment variable letting `compute_ctl` know to start `postgres` in the cgroup. Requires https://github.com/neondatabase/autoscaling/pull/468, which prevents `vm-builder` from starting the monitor and putting postgres in a cgroup. This will require a `VM_BUILDER_VERSION` bump.	2023-08-25 15:59:12 -04:00
Alek Westover	f71c82e5de	remove obsolete `need` dependency (#5087 )	2023-08-25 09:10:26 -04:00
Felix Prasanna	3128eeff01	compute_ctl: add vm-monitor (#4946 ) Co-authored-by: Em Sharnoff <sharnoff@neon.tech>	2023-08-24 15:54:37 -04:00
Alek Westover	e8f9aaf78c	Don't use non-existent docker tags (#5096 )	2023-08-24 19:45:23 +03:00
Alek Westover	99a1be6c4e	remove upload step from neon, it is in private repo now (#5085 )	2023-08-24 17:14:40 +03:00
Christian Schwarz	71ccb07a43	ci: fix upload-postgres-extensions-to-s3 job (#5063 ) This is a cherry-picked-then-improved version of release branch commit `4204960942` (PR #4861). The commit `5f8fd640bf` (Author: Alek Westover <alek.westover@gmail.com>, Date: Wed Jul 26 08:24:03 2023 -0400, "Upload Test Remote Extensions (#4792)") switched to using the release tag instead of `latest`, but the `promote-images` job only uploads `latest` to the prod ECR. The switch to using release tag was good in principle, but it broke the release pipeline. So, switch release pipeline back to using `latest`. Note that a proper fix should abandon use of `:latest` tag at all: currently, if a `main` pipeline runs concurrently with a `release` pipeline, the `release` pipeline may end up using the `main` pipeline's images. --------- Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2023-08-22 22:45:25 +03:00
Alek Westover	bf303a6575	Trigger workflow in remote (private) repo to build and upload private extensions (#4944 )	2023-08-22 13:32:29 -04:00
Felix Prasanna	4a8bd866f6	bump vm-builder version to v0.16.3 (#5055 ) This change to autoscaling allows agents to connect directly to the monitor, completely removing the informant.	2023-08-21 13:29:16 -04:00
Felix Prasanna	5c6a692cf1	bump `VM_BUILDER_VERSION` to v0.16.2 (#5031 ) A very slight change that allows us to configure the UID of the neon-postgres cgroup owner. We start postgres in this cgroup so we can scale it with the cgroups v2 api. Currently, the control plane overwrites the entrypoint set by `vm-builder`, so `compute_ctl` (and thus postgres) is not started in the neon-postgres cgroup. Having `compute_ctl` start postgres in the cgroup should fix this. However, at the moment it appears not to have the correct permissions. Configuring the neon-postgres UID to `postgres` (which is the UID `compute_ctl` runs under) should hopefully fix this. See #4920 - the PR to modify `compute_ctl` to start postgres in the cgroup. See: neondatabase/autoscaling#480, neondatabase/autoscaling#477. Both of these PRs are part of an effort to increase `vm-builder`'s configurability and allow us to adjust it as we integrate in the monitor.	2023-08-18 14:29:20 -04:00
Alexander Bayandin	207919f5eb	Upload test results to DB right after generation (#4967 ) ## Problem While adding new test results format, I've also changed the way we upload Allure reports to S3 (`722c7956bb`) to avoid duplicated results from previous runs. But it broke links at earlier results (results are still available but on different URLs). This PR fixes this (by reverting logic in `722c7956bb` changes), and moves the logic for storing test results into db to allure generate step. It allows us to avoid test results duplicates in the db and saves some time on extra s3 downloads that happened in a different job before the PR. Ref https://neondb.slack.com/archives/C059ZC138NR/p1691669522160229 ## Summary of changes - Move test results storing logic from a workflow to `actions/allure-report-generate`	2023-08-15 15:32:30 +01:00
Felix Prasanna	cc2d00fea4	bump vm-builder version to v0.15.4 (#4980 ) Patches a bug in vm-builder where it did not include enough parameters in the query string. These parameters are `host=localhost port=5432`. These parameters were not necessary for the monitor because the `pq` go postgres driver included them by default.	2023-08-11 14:26:53 -04:00
Felix Prasanna	6661f4fd44	bump vm-builder version to v0.15.0-alpha1 (#4934 )	2023-08-08 15:22:10 -05:00
Alexander Bayandin	b9f84b9609	Improve test results format (#4549 ) ## Problem The current test history format is a bit inconvenient: - It stores all test results in one row, so all queries should include subqueries which expand the test - It includes duplicated test results if the rerun is triggered manually for one of the test jobs (for example, if we rerun `debug-pg14`, then the report will include duplicates for other build types/postgres versions) - It doesn't have a reference to run_id, which we use to create a link to allure report Here's the proposed new format: ``` id BIGSERIAL PRIMARY KEY, parent_suite TEXT NOT NULL, suite TEXT NOT NULL, name TEXT NOT NULL, status TEXT NOT NULL, started_at TIMESTAMPTZ NOT NULL, stopped_at TIMESTAMPTZ NOT NULL, duration INT NOT NULL, flaky BOOLEAN NOT NULL, build_type TEXT NOT NULL, pg_version INT NOT NULL, run_id BIGINT NOT NULL, run_attempt INT NOT NULL, reference TEXT NOT NULL, revision CHAR(40) NOT NULL, raw JSONB COMPRESSION lz4 NOT NULL, ``` ## Summary of changes - Misc allure changes: - Update allure to 2.23.1 - Delete files from previous runs in HTML report (by using `sync --delete` instead of `mv`) - Use `test-cases/*.json` instead of `suites.json`, using this directory allows us to catch all reruns. - Until we migrate `scripts/flaky_tests.py` and `scripts/benchmark_durations.py`, store test results in 2 formats (in 2 different databases).	2023-08-08 20:09:38 +01:00
Felix Prasanna	459253879e	Revert "bump vm-builder to v0.15.0-alpha1 (#4895 )" (#4931 ) This reverts commit `682dfb3a31`.	2023-08-08 20:21:39 +03:00
Felix Prasanna	682dfb3a31	bump vm-builder to v0.15.0-alpha1 (#4895 )	2023-08-03 14:26:14 -04:00
Alexander Bayandin	b98419ee56	Fix allure report overwriting for different Postgres versions (#4806 ) ## Problem We've got an example of Allure reports from 2 different runners for the same build that started to upload at the exact second, making one overwrite another ## Summary of changes - Use the Postgres version to distinguish artifacts (along with the build type)	2023-07-26 15:19:18 +01:00
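
The flaky-test detection change listed above (#5094: spread the flakiness of a single test to all configurations, and look back only 7 days) can be illustrated with a minimal sketch. This is not the code from `scripts/flaky_tests.py`; the function name `spread_flaky` and the in-memory row layout are assumptions made for illustration, with field names borrowed from the result format proposed in #4549.

```python
from datetime import datetime, timedelta, timezone


def spread_flaky(results, now, window_days=7):
    """Return (name, build_type, pg_version) triples to treat as flaky.

    `results` is a hypothetical list of dicts with the keys
    name, build_type, pg_version, flaky, stopped_at.
    """
    # Only consider results that finished inside the look-back window.
    cutoff = now - timedelta(days=window_days)
    flaky_names = {
        r["name"] for r in results if r["flaky"] and r["stopped_at"] >= cutoff
    }
    # Every configuration seen in the data, regardless of test outcome.
    configs = {(r["build_type"], r["pg_version"]) for r in results}
    # A test that was flaky anywhere is marked flaky on every configuration.
    return {(name, bt, pg) for name in flaky_names for (bt, pg) in configs}


if __name__ == "__main__":
    now = datetime(2023, 8, 29, tzinfo=timezone.utc)
    rows = [
        {"name": "test_a", "build_type": "debug", "pg_version": 14,
         "flaky": True, "stopped_at": now - timedelta(days=1)},
        {"name": "test_a", "build_type": "release", "pg_version": 15,
         "flaky": False, "stopped_at": now - timedelta(days=1)},
    ]
    for item in sorted(spread_flaky(rows, now)):
        print(item)
```

In this sketch a test that was flaky only on `debug`/14 is also reported for `release`/15, which is the behaviour the commit message describes; how the real script queries the results database is not shown in this listing.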

452 Commits