## Problem
We want to store Nightly Replication test results in the database and
notify the relevant Slack channel about failures
## Summary of changes
- Store test results in the database
- Notify `on-call-compute-staging-stream` about failures
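A minimal sketch of how the failure notification could be wired up as a final workflow job, assuming `slackapi/slack-github-action` and a bot token stored in a secret (the job name, `needs` list, and secret name are illustrative, not the actual ones):
```
  notify-failure:
    needs: [ replication-tests ]   # hypothetical upstream job name
    if: ${{ failure() }}           # runs only when an upstream job failed
    runs-on: ubuntu-22.04
    steps:
      - uses: slackapi/slack-github-action@v1
        with:
          channel-id: on-call-compute-staging-stream
          slack-message: >
            Nightly Replication tests failed:
            ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
        env:
          SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}
```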
## Problem
A bunch of small CI fixes and improvements that are too small to warrant
separate PRs
## Summary of changes
- CI(build-and-test): fix parenthesis
- CI(actionlint): fix path to workflow file
- CI: remove default args from actions/checkout
- CI: remove `gen3` label; a combination of `self-hosted` +
`small{,-arm64}`/`large{,-arm64}` labels is enough
- CI: prettify Slack messages, hide raw links behind message text
- CI(build-and-test): add more dependencies to `conclusion` job
## Problem
We use infrastructure as code (TF) to deploy AWS Aurora and AWS RDS
Postgres database clusters.
Whenever we have a change in TF (e.g. **every year** to upgrade to a
higher Postgres version or when we change the cluster configuration) TF
will apply the change and create a new AWS database cluster.
However, our benchmarking test cases also expect databases in these
clusters and tables loaded with data.
So we add auto-detection: if the AWS RDS instances are "empty", we
create the necessary databases and restore a pg_dump.
**Important Notes:**
- These steps are NOT run in each benchmarking run, but only after a new
RDS instance has been deployed.
- The benchmarking workflows use GitHub secrets to find the connection
string for the database. These secrets still need to be updated
(manually, or programmatically using the GitHub CLI) if some part of the
connection string (e.g. user, password or hostname) changes.
## Summary of changes
In each benchmarking run, check whether
- the database has already been created; if not, create it
- the database has already been restored; if not, restore it
A sketch of this detection step is shown after the lists below.
Supported databases:
- tpch
- clickbench
- user example
Supported platforms:
- AWS RDS Postgres
- AWS Aurora serverless Postgres
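A minimal sketch of the detection-and-restore step, assuming custom-format `pg_dump` files and hypothetical secret and file names (the real workflow may differ):
```
      - name: Create and restore databases if missing
        env:
          # hypothetical secret names; connection details come from GitHub secrets
          PGHOST: ${{ secrets.RDS_HOST }}
          PGUSER: ${{ secrets.RDS_USER }}
          PGPASSWORD: ${{ secrets.RDS_PASSWORD }}
        run: |
          for db in tpch clickbench; do
            # the instance is "empty" for this benchmark if the database doesn't exist yet
            if [ "$(psql -d postgres -tAc "SELECT 1 FROM pg_database WHERE datname = '${db}'")" != "1" ]; then
              createdb "${db}"
              # restore a previously captured custom-format pg_dump into the new database
              pg_restore --no-owner --dbname="${db}" "dumps/${db}.dump"
            fi
          done
```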
Sample workflow run (this one uses a Neon database to test the restore
step, not real AWS databases):
https://github.com/neondatabase/neon/actions/runs/10321441086/job/28574350581
Sample workflow run with real AWS database clusters:
https://github.com/neondatabase/neon/actions/runs/10346816389/job/28635997653
Verification in a second run with real AWS database clusters that the
restore is skipped the second time:
https://github.com/neondatabase/neon/actions/runs/10348469517/job/28640778223
## Problem
Latency from one cloud provider to another is higher than within the
same cloud provider.
Some of our benchmarks are latency-sensitive: we run pgbench or psql in
the GitHub Actions runner, and the system under test is running in a
Neon database project.
For realistic TPS and latency results we need to compare apples to
apples and run the database client at the same "latency distance" in
all tests.
## Summary of changes
Move job steps that test Neon databases deployed on Azure into Azure
action runners.
- bench strategy variant using azure database
- pgvector strategy variant using azure database
- pgbench-compare strategy variants using azure database
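Conceptually, the change pairs each strategy variant with a runner in the matching cloud; a sketch with illustrative platform names and runner labels:
```
    strategy:
      matrix:
        include:
          - platform: neonvm-captest-new        # Neon project in AWS
            runner: [ self-hosted, x64 ]        # AWS-hosted runner (labels illustrative)
          - platform: azure-captest-new         # Neon project in Azure
            runner: [ self-hosted, azure, x64 ] # Azure-hosted runner (labels illustrative)
    runs-on: ${{ matrix.runner }}
```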
## Test run
https://github.com/neondatabase/neon/actions/runs/10314848502
## Problem
> Currently, long-running LR tests recreate endpoints every night. We'd
like to have a long-running buildup of history to exercise the
pageserver in this case (instead of "unit-testing" the same behavior
every night).
Closes #8317
## Summary of changes
- Update Postgres version for replication tests
- Set `BENCHMARK_PROJECT_ID_PUB`/`BENCHMARK_PROJECT_ID_SUB` env vars to
projects that were created for this purpose
---------
Co-authored-by: Sasha Krassovsky <krassovskysasha@gmail.com>
## Problem
We don't allow regular end-users to use `k8s-pod` provisioner,
but we still use it in nightly benchmarks
## Summary of changes
- Remove `provisioner` input from `neon-create-project` action, use
`k8s-neonvm` as the default provisioner (see the sketch after this list)
- Change `neon-` platform prefix to `neonvm-`
- Remove `neon-captest-freetier` and `neon-captest-new` as we already
have their `neonvm` counterparts
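From the caller's side the change looks roughly like this (a sketch; the surrounding inputs are illustrative):
```
      - uses: ./.github/actions/neon-create-project
        with:
          region_id: aws-us-east-2       # illustrative input
          postgres_version: 16           # illustrative input
          # provisioner: k8s-pod         <- input removed; the action now
          #                                 always uses k8s-neonvm
```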
## Problem
The rds-aurora endpoint connection cannot be reached from GitHub Actions
runners.
Temporarily remove this DBMS from the pgbench comparison runs.
## Summary of changes
On Saturday we normally run Neon in comparison with AWS RDS-Postgres and
AWS RDS-Aurora.
Remove Aurora until we have a working setup
## Problem
My prior PR https://github.com/neondatabase/neon/pull/8422
left files behind in the GitHub Actions runner work directory with root
permissions.
As an example, see
https://github.com/neondatabase/neon/actions/runs/10001857641/job/27646237324#step:3:37
To work around this, we install vanilla Postgres as non-root using deb
packages in the /home/nonroot user directory.
## Summary of changes
- Since we cannot use root, we install the deb packages directly and
create symbolic links for psql, pgbench and the libraries in the
expected places
- Continue AWS jobs even if Azure jobs fail (because the Azure region is
currently unreliable)
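A sketch of the non-root installation, assuming Debian packages for Postgres 16 (package names, paths, and version are illustrative):
```
      - name: Install vanilla Postgres as non-root
        run: |
          mkdir -p "$HOME/pg" "$HOME/.local/bin"
          cd "$HOME/pg"
          # download the debs without installing them system-wide (no root needed)
          apt-get download postgresql-16 postgresql-client-16 libpq5
          # unpack each package under the home directory instead of /
          for deb in *.deb; do dpkg-deb -x "$deb" "$HOME/pg"; done
          # symlink psql and pgbench into the places later steps expect
          ln -sf "$HOME/pg/usr/lib/postgresql/16/bin/psql"    "$HOME/.local/bin/psql"
          ln -sf "$HOME/pg/usr/lib/postgresql/16/bin/pgbench" "$HOME/.local/bin/pgbench"
          # make the bundled libpq discoverable in subsequent steps
          echo "LD_LIBRARY_PATH=$HOME/pg/usr/lib/$(uname -m)-linux-gnu" >> "$GITHUB_ENV"
```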
## Problem
We want to run performance tests on all supported cloud providers.
We want to run most tests on the Postgres version that is the default
for new projects in production; currently (July 2024) this is Postgres
version 16.
## Summary of changes
- change default postgres version for some (performance) tests to 16
(which is our default for new projects in prod anyhow)
- add azure region to pgbench_compare jobs
- add azure region to pgvector benchmarking jobs
- The re-used project `weathered-snowflake-88107345` was prepared with 1
million embeddings and runs on 7 minCU / 7 maxCU in the Azure region to
compare with the AWS region (pgvector indexing and HNSW queries) - see
the pgbench-pgvector job
- Note that we now have 11 environment combinations where we run
pgbench-compare; 5 of them are for k8s-pod (deprecated), which we can
remove in the future once the autoscaling team approves.
## Logs
A current run with the changes from this pull request is running here
https://github.com/neondatabase/neon/actions/runs/9972096222
Note that we currently expect some failures due to
- https://github.com/neondatabase/neon/issues/8275
- instability of projects in the Azure region
These tests will help verify that replication, both physical and
logical, works as expected in Neon.
Co-authored-by: Sasha Krassovsky <sasha@neon.tech>
## Problem
Some of the Nightly benchmarks fail with the error
```
+ /tmp/neon/pg_install/v14/bin/pgbench --version
/tmp/neon/pg_install/v14/bin/pgbench: error while loading shared libraries: libpq.so.5: cannot open shared object file: No such file or directory
```
Originally, we added the `pgbench --version` call to check that
`pgbench` is installed and to fail earlier if it's not.
The failure happens because we don't have `LD_LIBRARY_PATH` set for
every job; it also affects the `psql` command.
We can move it to `actions/run-python-test-set` so as not to duplicate
code (as it already has `LD_LIBRARY_PATH` set).
## Summary of changes
- Remove `pgbench --version` call
- Move `psql` commands to common `actions/run-python-test-set`
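Inside the shared action, the client commands then run with the library path already set; a sketch of the relevant fragment (the input wiring and paths are illustrative):
```
    # fragment of .github/actions/run-python-test-set/action.yml (sketch)
    - name: Check Postgres client binaries
      shell: bash
      env:
        LD_LIBRARY_PATH: /tmp/neon/pg_install/v14/lib   # path illustrative
      run: |
        # libpq.so.5 now resolves via LD_LIBRARY_PATH, so this no longer fails
        /tmp/neon/pg_install/v14/bin/psql --version
```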
## Problem
We use `build-tools` image as a base image to build other images, and it
has a pretty old `libpq-dev` installed (v13; it wasn't that old until I
removed system Postgres 14 from `build-tools` image in
https://github.com/neondatabase/neon/pull/6540)
## Summary of changes
- Remove `libpq-dev` from `build-tools` image
- Set `LD_LIBRARY_PATH` for tests (for different Postgres binaries that
we use, like psql and pgbench)
- Set `PQ_LIB_DIR` to build Storage Controller
- Set `LD_LIBRARY_PATH`/`DYLD_LIBRARY_PATH` in the Storage Controller
where it calls Postgres binaries
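A sketch of the environment wiring, assuming an in-tree libpq under `pg_install` (paths, crate, and binary names are illustrative):
```
      - name: Build storage controller against the in-tree libpq
        env:
          PQ_LIB_DIR: ${{ github.workspace }}/pg_install/v16/lib
        run: cargo build --bin storage_controller
      - name: Test storage controller (it shells out to Postgres binaries)
        env:
          LD_LIBRARY_PATH: ${{ github.workspace }}/pg_install/v16/lib
        run: cargo test -p storage_controller
```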
## Problem
The cache keys that we use on CI are the same for X64 and ARM64
(`runner.arch`)
## Summary of changes
- Include `runner.arch` along with `runner.os` into cache keys
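With `runner.arch` in the key, X64 and ARM64 runners stop sharing (and overwriting) each other's caches; for example:
```
      - uses: actions/cache@v4
        with:
          path: ~/.cargo/registry   # cached path illustrative
          # include both OS and architecture so each runner type gets its own cache
          key: v1-${{ runner.os }}-${{ runner.arch }}-cargo-${{ hashFiles('**/Cargo.lock') }}
```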
## Problem
The halfvec data type was introduced in pgvector 0.7.0 and is popular
because it allows smaller vectors, smaller indexes and potentially
better performance.
So far we have not tested halfvec in our periodic performance tests.
This PR adds halfvec indexing and halfvec queries to the test.
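A sketch of the added coverage, following pgvector's documented half-precision indexing pattern (the table and column names, and `$CONNSTR`, are illustrative):
```
      - name: halfvec HNSW indexing and queries
        run: |
          # build an HNSW index over the half-precision cast of the embeddings
          psql "$CONNSTR" -c "CREATE INDEX ON documents
            USING hnsw ((embedding::halfvec(1536)) halfvec_cosine_ops);"
          # similarity query that can use the halfvec index
          psql "$CONNSTR" -c "SELECT id FROM documents
            ORDER BY embedding::halfvec(1536)
              <=> (SELECT embedding::halfvec(1536) FROM documents LIMIT 1)
            LIMIT 10;"
```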
## Problem
We use ubuntu-latest as the default OS for running jobs. It can cause
problems due to instability, so we should pin a specific Ubuntu LTS
version instead.
## Summary of changes
The image `ubuntu-latest` was replaced with `ubuntu-22.04` in workflows.
## Problem
We want to regularly verify the performance of pgvector HNSW parallel
index builds and parallel similarity search using HNSW indexes.
The first release that considerably improved index-build parallelism was
pgvector 0.7.0, and we want to make sure that our Neon compute VM
settings (swap, memory overcommit, pg conf etc.) do not cause
regressions.
## Summary of changes
Prepare a Neon project with 1 million OpenAI vector embeddings (vector
size 1536).
Run HNSW indexing operations in the regression test for the various
distance metrics.
Run similarity queries using pgbench with 100 concurrent clients.
I have also added the relevant metrics to the Grafana dashboards
`pgbench` and `olape`.
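The concurrent-clients part could look roughly like this (a sketch; the script file name and durations are illustrative):
```
      - name: Run HNSW similarity queries with 100 concurrent clients
        run: |
          # custom pgbench script with a top-k similarity query per transaction
          pgbench --file=hnsw_queries.sql \
            --client=100 --jobs=8 --time=300 --progress=10 "$CONNSTR"
```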
---------
Co-authored-by: Alexander Bayandin <alexander@neon.tech>
## Problem
During Nightly Benchmarks, we want to collect pgbench results for
sharded tenants as well.
## Summary of changes
- Add pre-created sharded project for pgbench
## Problem
Currently, after updating `Dockerfile.build-tools` in a PR, a manual
action is required to make the new image `pinned`, i.e., the default for
everyone. Pinning also makes all open PRs use such an image (even one
created in a PR that doesn't contain related changes).
This PR overhauls the way we build and use `build-tools` image (and uses
the image from Docker Hub).
## Summary of changes
- The `neondatabase/build-tools` image gets tagged with the latest
commit sha for the `Dockerfile.build-tools` file
- Each PR calculates the tag for `neondatabase/build-tools`, tries to
pull it, and rebuilds the image with that tag if it doesn't exist (see
the sketch after this list)
- Use `neondatabase/build-tools` as a default image
- When running on the `main` branch, create a `pinned` tag and push it
to ECR
- Use `concurrency` to ensure we don't build `build-tools` image for the
same commit in parallel from different PRs
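A sketch of the tag computation and pull-or-build logic (step layout and build flags are illustrative):
```
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0   # full history, to find the last commit touching the file
      - name: Compute image tag from the Dockerfile's latest commit
        id: tag
        run: echo "tag=$(git log -1 --format=%h -- Dockerfile.build-tools)" >> "$GITHUB_OUTPUT"
      - name: Pull the image, or build and push it if the tag doesn't exist yet
        run: |
          img="neondatabase/build-tools:${{ steps.tag.outputs.tag }}"
          docker pull "$img" || {
            docker build -f Dockerfile.build-tools -t "$img" .
            docker push "$img"
          }
```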
## Problem
We use a bunch of deprecated actions.
See https://github.com/neondatabase/neon/actions/runs/7958569728
(Annotations section)
```
Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/checkout@v3, actions/setup-java@v3, actions/cache@v3, actions/github-script@v6. For more information see: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/.
```
## Summary of changes
- `actions/cache@v3` -> `actions/cache@v4`
- `actions/checkout@v3` -> `actions/checkout@v4`
- `actions/github-script@v6` -> `actions/github-script@v7`
- `actions/setup-java@v3` -> `actions/setup-java@v4`
- `actions/upload-artifact@v3` -> `actions/upload-artifact@v4`
## Problem
To understand differences in performance between Neon, Aurora, and RDS,
we want to collect `EXPLAIN ANALYZE` plans and `pg_stat_statements`
output for selected benchmarking runs.
## Summary of changes
Add workflow input options to collect explain plans and
`pg_stat_statements` output for the benchmarking workflow.
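A sketch of what these `workflow_dispatch` inputs could look like (the input names are illustrative, not the actual ones):
```
on:
  workflow_dispatch:
    inputs:
      collect_explain:                 # illustrative input name
        type: boolean
        default: false
        description: Collect EXPLAIN ANALYZE plans for selected queries
      collect_pg_stat_statements:      # illustrative input name
        type: boolean
        default: false
        description: Dump pg_stat_statements at the end of the run
```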
Co-authored-by: BodoBolero <bodobolero@gmail.com>
## Problem
Add a CI pipeline that checks GitHub Workflows with
https://github.com/rhysd/actionlint (it uses `shellcheck` for shell
scripts in steps)
To run it locally: `SHELLCHECK_OPTS=--exclude=SC2046,SC2086 actionlint`
## Summary of changes
- Add `.github/workflows/actionlint.yml`
- Fix actionlint warnings
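A minimal version of such a workflow, using the download script from the actionlint repository (the trigger paths are illustrative):
```
# .github/workflows/actionlint.yml (sketch)
name: actionlint
on:
  pull_request:
    paths: [ ".github/workflows/**" ]
jobs:
  actionlint:
    runs-on: ubuntu-22.04
    steps:
      - uses: actions/checkout@v4
      - name: Check workflow files
        env:
          # same exclusions as the local invocation above
          SHELLCHECK_OPTS: --exclude=SC2046,SC2086
        run: |
          bash <(curl -sSfL https://raw.githubusercontent.com/rhysd/actionlint/main/scripts/download-actionlint.bash)
          ./actionlint -color
```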
## Problem
- `SCALE: unbound variable` from
https://github.com/neondatabase/neon/pull/5079
- The layout of the GitHub auto-comment is broken if the code coverage
section follows flaky test section from
https://github.com/neondatabase/neon/pull/4999
## Summary of changes
- `benchmarking.yml`: Rename `SCALE` to `TEST_OLAP_SCALE`
- `comment-test-report.js`: Add an extra new-line before Code coverage
section
## Problem
It's hard to find out which DB size we use for OLAP benchmarks (TPC-H in
particular).
This PR adds handling of the `TEST_OLAP_SCALE` env var, which gets added
to a test name as a parameter.
This is required for performing larger periodic benchmarks.
## Summary of changes
- Handle `TEST_OLAP_SCALE` in
`test_runner/performance/test_perf_olap.py`
- Set `TEST_OLAP_SCALE` in `.github/workflows/benchmarking.yml` to a
TPC-H scale
## Problem
In the test environment, vacuum duration fluctuates from ~1h to ~5h;
together with two other 1h benchmarks (`select-only` and
`simple-update`), the total can be up to 7h, which is longer than the 6h
timeout.
## Summary of changes
- Increase timeout for pgbench-compare job to 8h
- Remove 6h timeouts from Nightly Benchmarks (this is a default value)
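In workflow terms this is a one-line change on the job (runner labels are illustrative):
```
  pgbench-compare:
    runs-on: [ self-hosted, small ]   # labels illustrative
    timeout-minutes: 480              # 8h; vacuum alone can take ~1h to ~5h
```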
This PR adds test runs on Postgres 15 and creates a unified Allure
report with results for all tests.
- Split `.github/actions/allure-report` into
`.github/actions/allure-report-store` and
`.github/actions/allure-report-generate`
- Add debug or release pytest parameter for all tests (depending on
`BUILD_TYPE` env variable)
- Add Postgres version as a pytest parameter for all tests (depending on
`DEFAULT_PG_VERSION` env variable)
- Fix `test_wal_restore` and `restore_from_wal.sh` to support paths with
`[`/`]` in them (fixed by applying shellcheck to the script and fixing
all warnings); `restore_from_wal_archive.sh` is deleted as unused.
- All known failures on Postgres 15 are marked with xfail
We have enabled prefetch by default; let's use it in Nightly
Benchmarks:
- effective_io_concurrency=100 by default (instead of 32)
- maintenance_io_concurrency=100 by default (instead of 32)
Rename `neon-captest-prefetch` to `neon-captest-new` (for pgbench with
initialisation) and `neon-captest-reuse` (for OLAP scenarios)
Do not run Nightly Benchmarks on `neon-captest-new`.
This is a temporary solution to avoid spikes in the storage we consume
during the test run. To collect data for the default instance, we could
run tests weekly (i.e. not daily).
Migrate Nightly Benchmarks from captest to staging.
- Migrate GitHub Workflows
- Replace `zenith-benchmarker` with regular runners
- Remove `environment` parameter from Neon GitHub Actions, add
`postgres_version`
- The only job left on captest is `neon-captest-reuse`, which will be
moved to staging after its project migration.
Ref https://github.com/neondatabase/cloud/issues/2836
- Replace `seqscan_prefetch_buffers` with `effective_io_concurrency` and
`maintenance_io_concurrency` for `clickbench-compare` job (see
https://github.com/neondatabase/neon/pull/2876)
- Get the database name at runtime (it can be `main` or `neondb` or
something else)
- **Enable `enable_seqscan_prefetch` by default**
- Drop use of `seqscan_prefetch_buffers` in favor of
`[maintenance,effective]_io_concurrency`
This includes adding some fields to the HeapScan execution node and to
the vacuum state.
- Cleanup some conditionals in vacuumlazy.c
- Clarify enable_seqscan_prefetch GUC description
- Fix issues in heap SeqScan prefetching where synchronize_seqscan
machinery wasn't handled properly.
Add ClickBench benchmark, an OLAP-style benchmark, to Nightly
Benchmarks.
The full run of 43 queries on the original dataset takes more than 6h
(only 34 queries were processed in 6h) on our default-sized compute.
Running the full set now would mean having some really unstable tests
because of our regular deployments to the staging/captest environment
(see https://github.com/neondatabase/cloud/issues/1872).
I've reduced the dataset size to the first 10^7 rows from the original
10^8 rows. Now it takes ~30-40 minutes to pass.
Ref https://github.com/ClickHouse/ClickBench/tree/main/aurora-postgresql
Ref https://benchmark.clickhouse.com/