rust/neon - neon - Gitea: Git with a cup of tea

rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-16 01:42:55 +00:00

Author	SHA1	Message	Date
Heikki Linnakangas	b33ca57839	WIP: Share the storage env between some tests This introduces a new 'neon_tenant: NeonTestTenant' fixture, which can be used instead of a NeonEnv. All tests using 'neon_tenant' share the same NeonEnv. NeonTestTenant provides many of the same members as NeonEnv, like 'create_branch', 'endpoints' etc. But not the ones that deal with storage, nor create_tenant. I converted some existing tests to use the new fixture. More could be converted, if we add more features to NeonTestTenant: - Many tests grab the pageserver HTTP client and call GC / compact functions on the pageserver. They are per-tenant operations, so they could safely be done one a shared pageserver too. - Some tests use failpoints to add delays or pauses to various operations in the pageserver. If the failpoints were tenant- or timeline-scoped, they could be used on a shared pageserver too. - Some tests print intoduce errors in the pageserver logs. They set allowlists for those error messages, because any unexpected errors in the logs cause the test to fail. If the allowlist was tenant- or timleine-scoped, we could allow those tests to share the env. - Some tests use helper functions like check_restored_datadir_content or fork_at_current_lsn, which take a NeonEnv as argument. They could use NeonTestTenant instead (or become methods in NeonTestTenant). - Some tests want to use extra tenant config for the initial tenant. We could expand the neon_tenant fixture to allow that. (Perhaps introduce NeonTestTenantBuilder which allows setting config before creating the initial tenant) - Some tests create multiple tenants. That could be allowed in a shared env, as long as we track which tenants belong to the test. See https://github.com/neondatabase/neon/issues/9193	2024-10-05 13:59:38 +03:00
Vlad Lazar	3f91ea28d9	tests: add infra and test for storcon leadership transfer (#8587 ) ## Problem https://github.com/neondatabase/neon/pull/8588 implemented the mechanism for storage controller leadership transfers. However, there's no tests that exercise the behaviour. ## Summary of changes 1. Teach `neon_local` how to handle multiple storage controller instances. Each storage controller instance gets its own subdirectory (`storage_controller_1, ...`). `storage_controller start\|stop` subcommands have also been extended to optionally accept an instance id. 2. Add a storage controller proxy test fixture. It's a basic HTTP server that forwards requests from pageserver and test env to the currently configured storage controller. 3. Add a test which exercises storage controller leadership transfer. 4. Finally fix a couple bugs that the test surfaced	2024-08-16 13:05:04 +01:00
John Spray	44f42627dd	pageserver/controller: error handling for shard splitting (#7074 ) ## Problem Shard splits worked, but weren't safe against failures (e.g. node crash during split) yet. Related: #6676 ## Summary of changes - Introduce async rwlocks at the scope of Tenant and Node: - exclusive tenant lock is used to protect splits - exclusive node lock is used to protect new reconciliation process that happens when setting node active - exclusive locks used in both cases when doing persistent updates (e.g. node scheduling conf) where the update to DB & in-memory state needs to be atomic. - Add failpoints to shard splitting in control plane and pageserver code. - Implement error handling in control plane for shard splits: this detaches child chards and ensures parent shards are re-attached. - Crash-safety for storage controller restarts requires little effort: we already reconcile with nodes over a storage controller restart, so as long as we reset any incomplete splits in the DB on restart (added in this PR), things are implicitly cleaned up. - Implement reconciliation with offline nodes before they transition to active: - (in this context reconciliation means something like startup_reconcile, not literally the Reconciler) - This covers cases where split abort cannot reach a node to clean it up: the cleanup will eventually happen when the node is marked active, as part of reconciliation. - This also covers the case where a node was unavailable when the storage controller started, but becomes available later: previously this allowed it to skip the startup reconcile. - Storage controller now terminates on panics. We only use panics for true "should never happen" assertions, and these cases can leave us in an un-usable state if we keep running (e.g. panicking in a shard split). In the unlikely event that we get into a crashloop as a result, we'll rely on kubernetes to back us off. - Add `test_sharding_split_failures` which exercises a variety of failure cases during shard split.	2024-03-14 09:11:57 +00:00
Joonas Koivunen	ffd146c3e5	refactor: globals in tests (#5298 ) Refactor tests to have less globals. This will allow to hopefully write more complex tests for our new metric collection requirements in #5297. Includes reverted work from #4761 related to test globals. Co-authored-by: Alexander Bayandin <alexander@neon.tech> Co-authored-by: MMeent <matthias@neon.tech>	2023-09-13 22:05:30 +03:00
Alexander Bayandin	78a7f68902	Make pg_version and build_type regular parameters (#4311 ) ## Problem All tests have already been parametrised by Postgres version and build type (to have them distinguishable in the Allure report), but despite it, it's anyway required to have DEFAULT_PG_VERSION and BUILD_TYPE env vars set to corresponding values, for example to run`test_timeline_deletion_with_files_stuck_in_upload_queue[release-pg14-local_fs]` test it's required to set `DEFAULT_PG_VERSION=14` and `BUILD_TYPE=release`. This PR makes the test framework pick up parameters from the test name itself. ## Summary of changes - Postgres version and build type related fixtures now are function-scoped (instead of being sessions scoped before) - Deprecate `--pg-version` argument in favour of DEFAULT_PG_VERSION env variable (it's easier to parse) - GitHub autocomment now includes only one command with all the failed tests + runs them in parallel	2023-07-03 13:51:40 +01:00
Alexander Bayandin	bb06d281ea	Run regressions tests on both Postgres 14 and 15 (#4192 ) This PR adds tests runs on Postgres 15 and created unified Allure report with results for all tests. - Split `.github/actions/allure-report` into `.github/actions/allure-report-store` and `.github/actions/allure-report-generate` - Add debug or release pytest parameter for all tests (depending on `BUILD_TYPE` env variable) - Add Postgres version as a pytest parameter for all tests (depending on `DEFAULT_PG_VERSION` env variable) - Fix `test_wal_restore` and `restore_from_wal.sh` to support path with `[`/`]` in it (fixed by applying spellcheck to the script and fixing all warnings), `restore_from_wal_archive.sh` is deleted as unused. - All known failures on Postgres 15 marked with xfail	2023-05-12 15:28:51 +01:00
Alexander Bayandin	653e633c59	test_runner: add --pg-version pytest argument (#4037 ) - allows setting Postgres version for testing using --pg-version argument - fixes tests for the non-default Postgres version.	2023-05-05 02:57:47 +03:00
Alexander Bayandin	105b8bb9d3	test_runner: automatically rerun flaky tests (#3880 ) This PR adds a plugin that automatically reruns (up to 3 times) flaky tests. Internally, it uses data from `TEST_RESULT_CONNSTR` database and `pytest-rerunfailures` plugin. As the first approximation we consider the test flaky if it has failed on the main branch in the last 10 days. Flaky tests are fetched by `scripts/flaky_tests.py` script (it's possible to use it in a standalone mode to learn which tests are flaky), stored to a JSON file, and then the file is passed to the pytest plugin.	2023-04-04 12:21:54 +01:00
Alexander Bayandin	4c2bb43775	Reformat all python files by black & isort	2022-08-22 14:57:09 +01:00
Heikki Linnakangas	f4233fde39	Silence "Module already imported" warning in python tests We were getting a warning like this from the pg_regress tests: =================== warnings summary =================== /usr/lib/python3/dist-packages/_pytest/config/__init__.py:663 /usr/lib/python3/dist-packages/_pytest/config/__init__.py:663: PytestAssertRewriteWarning: Module already imported so cannot be rewritten: fixtures.pg_stats self.import_plugin(import_spec) -- Docs: https://docs.pytest.org/en/stable/warnings.html ------------------ Benchmark results ------------------- To fix, reorder the imports in conftest.py. I'm not sure what exactly the problem was or why the order matters, but the warning is gone and that's good enough for me.	2022-07-20 16:55:41 +03:00
Thang Pham	6cfebc096f	Add read/write throughput performance tests (#1883 ) Part of #1467 This PR adds several performance tests that compare the [PG statistics](https://www.postgresql.org/docs/current/monitoring-stats.html) obtained when running PG benchmarks against Neon and vanilla PG to measure the read/write throughput of the DB.	2022-06-06 12:32:10 -04:00
bojanserafimov	90e2c9ee1f	Rename zenith to neon in python tests (#1871 )	2022-06-02 16:21:28 -04:00
bojanserafimov	335abfcc28	Add slow seqscan perf test (#1283 )	2022-02-16 10:59:51 -05:00
Dmitry Rodionov	c75bc9b8b0	Change benchmark plugin layout so pytest loads it properly when running all tests (not necessary performance ones) resolves #837	2021-11-04 16:33:31 +03:00
Heikki Linnakangas	ab2f0ad1a8	Fix and reorganize python tests. - The 'pageserver' fixture now sets up the repository and starts up the Page Server automatically. In other words, the 'pageserver' fixture provides a Page Server that's up and running and ready to use in tests. - The 'pageserver' fixture now also creates a branch called 'empty', right after initializing the repository. By convention, all the tests start by createing a new branch off 'empty' for the test. This allows running all the tests against the same Page Server concurrently. (I haven't tested that though. pytest doensn't provide an option to run tests in parallel but there are extensions for that.) - Remove the 'zen_simple' fixture. Now that 'pageserver' provides server that's up and running, it's pretty simple to use the 'pageserver' and 'postgres' fixtures directly. - Don't assume host name or ports in the tests. They now use the fields in the fixtures for that. That allows assigning the ports dynamically, making it possible to run multiple page servers in parallel, or running the tests in parallel with another page server. This commit still hard codes the Page Server's port in the fixture, though, so more work is needed to actually make it possible. - I made some changes to the 'postgres' fixture in commit `532918e13d`, which broke the other tests. Fix them. - Divide the tests into two "batches" of roughly equal runtime, which can be run in parallel - Merge the 'test_file' and 'test_filter' options in CircleCI config into one 'test_selection' option, for simplicity.	2021-05-17 20:44:00 +03:00

15 Commits