rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-09 06:22:57 +00:00

Author	SHA1	Message	Date
Alexander Bayandin	e65f0fe874	CI(benchmarks): make job split consistent across reruns (#6614 ) ## Problem We've got several issues with the current `benchmarks` job setup: - `benchmark_durations.json` file (that we generate at runtime to split tests into several jobs[0]) is not consistent between these jobs (and very inconsistent with the file if we rerun the job). I.e. test selection for each job can be different, which could end up in missed tests in a test run. - `scripts/benchmark_durations` doesn't fetch all tests from the database (it doesn't expect any extra directories inside `test_runner/performance`) - For some reason, the current split into 4 groups leaves the 4th group with no tests to run, which fails the job[1] - [0] https://github.com/neondatabase/neon/pull/4683 - [1] https://github.com/neondatabase/neon/issues/6629 ## Summary of changes - Generate `benchmark_durations.json` file once before we start `benchmarks` jobs (this makes it consistent across the jobs) and pass the file content through the GitHub Actions input (this makes it consistent for reruns) - `scripts/benchmark_durations`: fix SQL query for getting all required tests - Split benchmarks into 5 jobs instead of 4 jobs.	2024-02-06 17:00:55 +00:00
Joonas Koivunen	bb92721168	build: migrate check-style-rust to small runners (#6588 ) We have more small runners than large runners, and often a shortage of large runners. Migrate `check-style-rust` to run on small runners.	2024-02-06 15:53:04 +00:00
Em Sharnoff	d820d64e38	Bump vm-builder v0.21.0 -> v0.23.2 (#6480 ) Relevant changes were all from v0.23.0: - neondatabase/autoscaling#724 - neondatabase/autoscaling#726 - neondatabase/autoscaling#732 Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2024-02-02 22:39:20 +00:00
Alexander Bayandin	30c9e145d7	check-macos-build: switch job to macos-14 (M1) (#6539 ) ## Problem - GitHub made available `macos-14` runners, and they run on M1 processors[0] - The price is the same as Intel-based runners — "macOS \| 3 or 4 (M1 or Intel) \| $0.08"[1], but runners on Apple Silicon should be significantly faster than their Intel counterparts. - Most developers who use macOS use Apple Silicon-based Macs nowadays. - [0] https://github.blog/changelog/2024-01-30-github-actions-introducing-the-new-m1-macos-runner-available-to-open-source/ - [1] https://docs.github.com/en/billing/managing-billing-for-github-actions/about-billing-for-github-actions#per-minute-rates ## Summary of changes - Run `check-macos-build` on `macos-14`	2024-02-02 10:51:20 +00:00
Alexander Bayandin	fa52cd575e	Remove old tests results and old coverage collection (#6376 ) ## Problem We have switched to new test results and new coverage results, so no need to collect these data in old formats. ## Summary of changes - Remove "Upload coverage report" for old coverage report - Remove "Store Allure test stat in the DB" for old test results format	2024-02-01 13:36:55 +00:00
Christian Schwarz	3a82430432	fixup(#6492 ): also switch the benchmarks that run on merge-to-main back to std-fs (#6501 )	2024-01-28 00:15:11 +01:00
Arpad Müller	734755eaca	Enable nextest retries for the arm build (#6496 ) Also make the NEXTEST_RETRIES declaration more local. Requested in https://github.com/neondatabase/neon/pull/6493#issuecomment-1912110202	2024-01-27 05:16:11 +01:00
Christian Schwarz	e34166a28f	CI: switch back to std-fs io engine for soak time before next release (#6492 ) PR #5824 introduced the concept of io engines in pageserver and implemented `tokio-epoll-uring` in addition to our current method, `std-fs`. We used `tokio-epoll-uring` in CI for a day to get more exposure to the code. Now it's time to switch CI back so that we test with `std-fs` as well, because that's what we're (still) using in production.	2024-01-26 22:48:34 +01:00
Alexander Bayandin	4c245b0f5a	update_build_tools_image.yml: Push build-tools image to Docker Hub (#6481 ) ## Problem - `docker.io/neondatabase/build-tools:pinned` image is frequently outdated on Docker Hub because there's no automated way to update it. - `update_build_tools_image.yml` workflow contains legacy roll-back logic, which is not required anymore because it updates only a single image. ## Summary of changes - Make `update_build_tools_image.yml` workflow push images to both ECR and Docker Hub - Remove unneeded roll-back logic	2024-01-26 16:12:49 +00:00
Christian Schwarz	918b03b3b0	integrate tokio-epoll-uring as alternative VirtualFile IO engine (#5824 )	2024-01-26 09:25:07 +01:00
Alexander Bayandin	d36623ad74	CI: cancel old e2e-tests on new commits (#6463 ) ## Problem Triggered `e2e-tests` job is not cancelled along with other jobs in a PR if the PR gets new commits. We can improve the situation by setting `concurrency_group` for the remote workflow (https://github.com/neondatabase/cloud/pull/9622 adds a `concurrency_group` input to the remote workflow). Ref https://neondb.slack.com/archives/C059ZC138NR/p1706087124297569 Cloud's part added in https://github.com/neondatabase/cloud/pull/9622 ## Summary of changes - Set `concurrency_group` parameter when triggering `e2e-tests` - At the beginning of a CI pipeline, trigger Cloud's `cancel-previous-in-concurrency-group.yml` workflow which cancels previously triggered e2e-tests	2024-01-25 19:25:29 +00:00
Arpad Müller	d52b81340f	S3 based recovery (#6155 ) Adds a new `time_travel_recover` function to the `RemoteStorage` trait that allows time travel like functionality for S3 buckets, regardless of their content (it is not even pageserver related). It takes a different approach from [this post](https://aws.amazon.com/blogs/storage/point-in-time-restore-for-amazon-s3-buckets/) that is more complicated. It takes as input a prefix, a target timestamp, and a limit timestamp: * executes [`ListObjectVersions`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectVersions.html) * obtains the latest version that comes before the target timestamp * copies that latest version to the same prefix * if there are versions newer than the limit timestamp, it doesn't do anything for the file The limit timestamp is meant to be some timestamp before the start of the recovery operation and after any changes that one wants to revert. For example, it might be the time point after a tenant was detached from all involved pageservers. The limiting mechanism ensures that the operation is idempotent and can be retried without causing additional writes/copies. The approach fulfills all the requirements laid out in 8233, and is a recoverable operation. Nothing is deleted permanently, only new entries added to the version log. I also enable [nextest retries](https://nexte.st/book/retries.html) to help with some general S3 flakiness (on top of low level retries). Part of https://github.com/neondatabase/cloud/issues/8233	2024-01-25 18:23:18 +01:00
Cihan Demirci	d34adf46b4	do not provide disclaimer input for the deploy-prod workflow (#6360 ) We've removed this input from the deploy-prod workflow.	2024-01-15 16:15:34 +00:00
Alexander Bayandin	7de829e475	test_runner: replace black with ruff format (#6268 ) ## Problem `black` is slow sometimes, we can replace it with `ruff format` (a new feature in 0.1.2 [0]), which produces a style pretty similar to black's [1]. On my local machine (MacBook M1 Pro 16GB): ``` # `black` on main $ hyperfine "BLACK_CACHE_DIR=/dev/null poetry run black ." Benchmark 1: BLACK_CACHE_DIR=/dev/null poetry run black . Time (mean ± σ): 3.131 s ± 0.090 s [User: 5.194 s, System: 0.859 s] Range (min … max): 3.047 s … 3.354 s 10 runs ``` ``` # `ruff format` on the current PR $ hyperfine "RUFF_NO_CACHE=true poetry run ruff format" Benchmark 1: RUFF_NO_CACHE=true poetry run ruff format Time (mean ± σ): 300.7 ms ± 50.2 ms [User: 259.5 ms, System: 76.1 ms] Range (min … max): 267.5 ms … 420.2 ms 10 runs ``` ## Summary of changes - Replace `black` with `ruff format` everywhere - [0] https://docs.astral.sh/ruff/formatter/ - [1] https://docs.astral.sh/ruff/formatter/#black-compatibility	2024-01-05 15:35:07 +00:00
Abhijeet Patil	f28bdb6528	Use nextest for rust unittests (#6223 ) ## Problem `cargo test` doesn't support timeouts or junit output format ## Summary of changes - Add `nextest` to `build-tools` image - Switch `cargo test` with `cargo nextest` on CI - Set timeout	2023-12-30 13:45:31 +00:00
Arpad Müller	a21b719770	Use neon-github-ci-tests S3 bucket for remote_storage tests (#6216 ) This bucket is already used by the pytests. The current bucket github-public-dev is more meant for longer living artifacts. slack thread: https://neondb.slack.com/archives/C039YKBRZB4/p1703124944669009 Part of https://github.com/neondatabase/cloud/issues/8233 / #6155	2023-12-21 17:28:28 +01:00
Alexander Bayandin	1dff98be84	CI: fix build-tools image tag for PRs (#6217 ) ## Problem Fix build-tools image tag calculation for PRs. Broken in https://github.com/neondatabase/neon/pull/6195 ## Summary of changes - Use `pinned` tag instead of `$GITHUB_RUN_ID` if there's no changes in the dockerfile (and we don't build such image)	2023-12-21 14:55:24 +00:00
Abhijeet Patil	61b6c4cf30	Build dockerfile from neon repo (#6195 ) ## Fixing GitHub workflow issue related to build and push images ## Summary of changes Followup of PR#608[move docker file from build repo to neon to solve some issues. The build started failing because it missed a validation in the logic that determines changes in the docker file. Also, all the dependent jobs were skipped because of the build and push image job. To address the above issues, the following changes were made: - We are adding validation to generate an image tag even if it's a merge to the repo. - All the dependent jobs won't skip even if the build and push image job is skipped. - We have moved the logic to generate a tag into the sub-workflow. As the tag name had to be passed to the sub-workflow, it made sense to abstract that away where it was needed and then store it as an output variable so that downstream dependent jobs could access the value. - This made the dependency logic easy, and we don't need complex expressions to check the condition on which it will run. - An earlier PR that tried solving a similar problem was closed; it has some feedback and context from before this PR was created: https://github.com/neondatabase/neon/pull/6175 ## Checklist before requesting a review - [x] Move the tag generation logic from the main workflow to the sub-workflow that builds and pushes the image - [x] Add a condition to generate an image tag for a non-PR-related run - [x] Remove the complex condition from the job `if` conditions --------- Co-authored-by: Alexander Bayandin <alexander@neon.tech> Co-authored-by: Abhijeet Patil <abhijeet@neon.tech>	2023-12-21 12:46:51 +00:00
Em Sharnoff	58dbca6ce3	Bump vm-builder v0.19.0 -> v0.21.0 (#6197 ) Only applicable change was neondatabase/autoscaling#650, reducing the vector scrape interval (inside the VM) from 15 seconds to 1 second.	2023-12-19 23:48:41 +00:00
Bodobolero	73d247c464	Analyze clickbench performance with explain plans and pg_stat_statements (#6161 ) ## Problem To understand differences in performance between neon, aurora and rds we want to collect explain analyze plans and pg_stat_statements for selected benchmarking runs ## Summary of changes Add workflow input options to collect explain and pg_stat_statements for benchmarking workflow Co-authored-by: BodoBolero <bodobolero@gmail.com>	2023-12-19 11:44:25 +00:00
Alexander Bayandin	9bdc25f0af	Revert "CI: build build-tools image" (#6156 ) It turns out the issue with skipped jobs is not so trivial (because GitHub checks jobs transitively), and a possible workaround with `if: always() && contains(fromJSON('["success", "skipped"]'), needs.build-buildtools-image.result)` would tangle the workflow really badly. We'll need to come up with a better solution. To unblock main I'm going to revert https://github.com/neondatabase/neon/pull/6082.	2023-12-16 12:32:00 +00:00
Abhijeet Patil	8619e6295a	CI: build build-tools image (#6082 ) ## Currently our build docker file is located in the build repo; it makes sense to have it as a part of our neon repo ## Summary of changes The docker file that we use to build our binary and other tools resided in the build repo. It made sense to bring the docker file to the repo where it is used, so that contributors can also view it and amend it if required. It will reduce maintenance. Docker file changes and code changes can be accommodated in the same PR. Also, building the image and pushing it to ECR is abstracted in a reusable workflow. Ideally, that workflow is used for any other jobs too. ## Checklist before requesting a review - [x] Moved the docker file used to build the binary from the build repo to the neon repo - [x] adding gh workflow to build and push the image - [x] adding gh workflow to tag the pushed image - [x] update readMe file --------- Co-authored-by: Abhijeet Patil <abhijeet@neon.tech> Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2023-12-16 10:33:52 +00:00
Tristan Partin	5ab9592a2d	Add submodule paths as safe directories as a precaution The check-codestyle-rust-arm job requires this for some reason, so let's just add them everywhere we do this workaround.	2023-12-11 13:08:37 -06:00
Tristan Partin	036558c956	Fix git ownership issue in check-codestyle-rust-arm We have this workaround for other jobs. Looks like this one was forgotten about.	2023-12-11 13:08:37 -06:00
Anastasia Lubennikova	5289f341ce	Use test specific directory in test_remote_extensions (#5938 )	2023-11-27 18:57:58 +00:00
Shany Pozin	35f243e787	Move weekly release PR trigger to Monday morning (#5908 )	2023-11-23 19:09:34 +02:00
Em Sharnoff	d0a842a509	Update vm-builder to v0.19.0 and move its customization here (#5783 ) ref neondatabase/autoscaling#600 for more	2023-11-16 18:17:42 +01:00
Alexander Bayandin	f84ac2b98d	Fix baseline commit and branch for code coverage (#5769 ) ## Problem `HEAD` commit for a PR is a phantom merge commit which skews the baseline commit for coverage reports. See https://github.com/neondatabase/neon/pull/5751#issuecomment-1790717867 ## Summary of changes - Use commit hash instead of `HEAD` for finding baseline commits for code coverage - Use the base branch for PRs or the current branch for pushes	2023-11-15 12:40:21 +01:00
Arpad Müller	31a54d663c	Migrate links from wiki to notion (#5862 ) See the slack discussion: https://neondb.slack.com/archives/C033A2WE6BZ/p1696429688621489?thread_ts=1695647103.117499	2023-11-14 15:36:47 +00:00
Joonas Koivunen	b7f45204a2	build: deny async-std and friends (#5849 ) rationale: some crates pull these in as default; hopefully these hints will require less cleanup-after and Cargo.lock file watching. follow-up to #5848.	2023-11-10 18:02:22 +01:00
Alexander Bayandin	71b380f90a	Set BUILD_TAG for build-neon job (#5847 ) ## Problem I've added `BUILD_TAG` to docker images. (https://github.com/neondatabase/neon/pull/5812), but forgot to add it to services that we build for tests ## Summary of changes - Set `BUILD_TAG` in `build-neon` job	2023-11-10 12:49:52 +00:00
Alexander Bayandin	6e145a44fa	workflows/neon_extra_builds: run check-codestyle-rust & build-neon on arm64 (#5832 ) ## Problem Some developers use workstations with arm CPUs, and sometimes x86-64 code is not fully compatible with it (for example, https://github.com/neondatabase/neon/pull/5827). Although we don't have arm CPUs in the prod (yet?), it is worth having some basic checks for this architecture to have a better developer experience. Closes https://github.com/neondatabase/neon/issues/5829 ## Summary of changes - Run `check-codestyle-rust`-like & `build-neon`-like jobs on Arm runner - Add `run-extra-build-*` label to run all available extra builds	2023-11-10 12:45:41 +00:00
Anna Stepanyan	893616051d	Update epic-template.md (#5709 ) replace the checkbox list with a proper task list in the epic template NB: this PR does not change the code, it only touches the github issue templates	2023-11-09 15:24:43 +01:00
Alexander Bayandin	4cd47b7d4b	Dockerfile: Set BUILD_TAG for storage services (#5812 ) ## Problem https://github.com/neondatabase/neon/pull/5576 added `build-tag` reporting to `libmetrics_build_info`, but it's not reported because we didn't set the corresponding env variable in the build process. ## Summary of changes - Add `BUILD_TAG` env var while building services	2023-11-07 13:45:59 +00:00
Shany Pozin	1588601503	Move release PR creation to Friday (#5721 ) Prepare for a new release workflow * Release PR is created on Fridays * The discussion/approval happens during Friday * Sunday morning the deployment will be done in central-il and perf tests will be run * On Monday early IST morning gradually start rolling (starting from US regions as they are still in weekend time) See slack for discussion: https://neondb.slack.com/archives/C04P81J55LK/p1698565305607839?thread_ts=1698428241.031979&cid=C04P81J55LK	2023-10-30 22:10:24 +01:00
Em Sharnoff	39b148b74e	Bump vm-builder v0.18.2 -> v0.18.4 (#5666 ) Only applicable change was neondatabase/autoscaling#584, setting pgbouncer auth_dbname=postgres in order to stop superuser connections from preventing dropping databases.	2023-10-26 20:04:57 +01:00
Alexander Bayandin	85f4514e7d	Get env var for real Azure tests from GitHub (#5662 ) ## Problem We'll need to switch `REMOTE_STORAGE_AZURE_REGION` from the current `eastus2` region to something `eu-central-1`-like. This may require changing `AZURE_STORAGE_ACCESS_KEY`. To make it possible to switch from one place (not to break a lot of builds on CI), move `REMOTE_STORAGE_AZURE_CONTAINER` and `REMOTE_STORAGE_AZURE_REGION` to GitHub Variables. See https://github.com/neondatabase/neon/settings/variables/actions ## Summary of changes - Get values for `REMOTE_STORAGE_AZURE_CONTAINER` & `REMOTE_STORAGE_AZURE_REGION` from GitHub Variables	2023-10-25 22:54:23 +01:00
Alexander Bayandin	4778b6a12e	Switch to querying new tests results DB (#5616 ) ## Problem We started to store test results in a new format in https://github.com/neondatabase/neon/pull/4549. This PR switches scripts to query this db. (we can completely remove old DB/ingestion scripts a couple of weeks after the PR is merged) ## Summary of changes - `scripts/benchmark_durations.py` query new database - `scripts/flaky_tests.py` query new database	2023-10-25 14:25:13 +01:00
Em Sharnoff	44202eeb3b	Bump vm-builder v0.18.1 -> v0.18.2 (#5646 ) Only applicable change was neondatabase/autoscaling#571, removing the postgres_exporter flags `--auto-discover-databases` and `--exclude-databases=...`	2023-10-24 16:04:28 -07:00
Alexander Bayandin	a8a800af51	Run real Azure tests on CI (#5627 ) ## Problem We do not run real Azure-related tests on CI ## Summary of changes - Set required env variables to run real Azure blob storage tests on CI	2023-10-24 12:12:11 +01:00
Arthur Petukhovsky	ba856140e7	Fix neon_extra_build.yml (#5605 ) Build walproposer-lib in gather-rust-build-stats, fix nproc usage, fix walproposer-lib on macos.	2023-10-19 22:20:39 +01:00
Shany Pozin	893b7bac9a	Fix neon_extra_builds.yml : nproc is not supported in mac os (#5598 ) ## Problem nproc is not supported in mac os, use sysctl -n hw.ncpu instead	2023-10-19 15:24:23 +01:00
Arthur Petukhovsky	66f8f5f1c8	Call walproposer from Rust (#5403 ) Create Rust bindings for C functions from walproposer. This allows writing better tests with real walproposer code without spawning multiple processes and starting up the whole environment. `make walproposer-lib` stage was added to build static libraries `libwalproposer.a`, `libpgport.a`, `libpgcommon.a`. These libraries can be statically linked to any executable to call walproposer functions. `libs/walproposer/src/walproposer.rs` contains `test_simple_sync_safekeepers` to test that walproposer can be called from Rust to emulate sync_safekeepers logic. It can also be used as a usage example.	2023-10-19 14:17:15 +01:00
Em Sharnoff	16c87b5bda	Bump vm-builder v0.17.12 -> v0.18.1 (#5583 ) Only applicable change was neondatabase/autoscaling#566, updating pgbouncer to 1.21.0 and enabling support for prepared statements.	2023-10-18 11:10:01 +02:00
Alexander Bayandin	522aaca718	Temporary deploy staging preprod region from main (#5477 ) ## Problem Staging preprod region can't use `release-XXX` right now, the config is unified across all regions, it supports only `XXX`. Ref https://neondb.slack.com/archives/C03H1K0PGKH/p1696506459720909?thread_ts=1696437812.365249&cid=C03H1K0PGKH ## Summary of changes - Deploy staging-preprod from main	2023-10-05 14:02:20 +00:00
Alexander Bayandin	7a2cafb34d	Use zstd to compress large allure artifacts (#5458 ) ## Problem - Because we compress artifacts file by file, we don't need to put them into `tar` containers (i.e. instead of `tar.gz` we can use just `gz`). - Python's gz is single-threaded and pretty slow. A benchmark has shown ~20 times speedup (19.876176291 vs 0.8748335830000009) on my laptop (for a 1.3M pageserver.log) ## Summary of changes - Replace tarfile with zstandard - Update allure to 2.24.0	2023-10-04 16:20:16 +01:00
Em Sharnoff	5fdc80db03	Bump vm-builder v0.17.11 -> v0.17.12 (#5407 ) Only relevant change is neondatabase/autoscaling#534 - refer there for more details.	2023-09-28 09:52:39 +02:00
Em Sharnoff	a24cd69589	Bump vm-builder v0.17.10 -> v0.17.11 (#5371 ) This only includes the changes from neondatabase/autoscaling#525, which improves graceful VM shutdown.	2023-09-25 19:49:07 +01:00
Alexander Bayandin	3048a5f0e2	Deploy releases to staging-preprod first (#5308 ) ## Problem Before releasing a new version to production, we'd like to run a set of required checks on the incoming release. The simplest approach, which doesn't require many changes, is to dedicate one staging region to a `preprod` installation. The proposed changes to the release flow are the following: - When a release PR is merged into the release branch, trigger deployment from the release branch to a dedicated staging-preprod region (for now, it's going to be `eu-west-1`, Ireland) Corresponding infrastructure PR: https://github.com/neondatabase/aws/pull/585 ## Summary of changes - Trigger `deploy.dev` workflow with `-f deployPreprodRegion=true` for release branch	2023-09-22 14:17:43 +01:00
Em Sharnoff	18f3a706da	Bump vm-builder v0.17.5 -> v0.17.10 (#5334 ) Only notable change is including neondatabase/autoscaling#523, which we hope will help with making sure that TCP connections are properly terminated before shutdown (which hopefully fixes a leak in the pageserver).	2023-09-18 17:30:34 +00:00

490 Commits