rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-05 12:32:54 +00:00

Author	SHA1	Message	Date
Alexander Bayandin	486a985629	mypy: enable check_untyped_defs (#3142) Enable `check_untyped_defs` and fix warnings.	2022-12-21 09:38:42 +00:00
Alexander Bayandin	12e6f443da	test_perf_pgbench: switch to server-side data generation (#3058) To offload the network and reduce its impact, I suggest switching to server-side data generation for the pgbench initialize workflow.	2022-12-18 00:02:04 +00:00
Alexander Bayandin	8fcba150db	test_seqscans: temporarily disable remote test (#3101) Temporarily disable `test_seqscans` for remote projects; they take up too much space and time. We can try to re-enable it after switching to per-test projects.	2022-12-14 18:05:05 +00:00
Kirill Bulatov	4d201619ed	Remove large database files after every test suite (#3090) Closes https://github.com/neondatabase/neon/issues/1984 Closes https://github.com/neondatabase/neon/pull/2830 A follow-up of https://github.com/neondatabase/neon/pull/2830, I've noticed that benchmarks failed again due to out of space issues. Removes most of the pageserver and safekeeper files from disk after every pytest suite run. ``` $ poetry run pytest -vvsk "test_tenant_redownloads_truncated_file_on_startup[local_fs]" # ... $ du -h test_output/test_tenant_redownloads_truncated_file_on_startup\[local_fs\] # ... 104K test_output/test_tenant_redownloads_truncated_file_on_startup[local_fs] $ poetry run pytest -vvsk "test_tenant_redownloads_truncated_file_on_startup[local_fs]" --preserve-database-files # ... $ du -h test_output/test_tenant_redownloads_truncated_file_on_startup\[local_fs\] # ... 123M test_output/test_tenant_redownloads_truncated_file_on_startup[local_fs] ``` Co-authored-by: Bojan Serafimov <bojan.serafimov7@gmail.com>	2022-12-14 13:09:08 +00:00
Alexander Bayandin	0f445827f5	test_seqscans: increase table size for remote test (#3057) Increase table size four times to fix the following error: ``` ______________________ test_seqscans[remote-100000-100-0] ______________________ test_runner/performance/test_seqscans.py:57: in test_seqscans assert int(shared_buffers) < int(table_size) E assert 536870912 < 181239808 E + where 536870912 = int(536870912) E + and 181239808 = int(181239808) ``` 536870912 / 181239808 ≈ 2.96	2022-12-10 23:35:05 +00:00
Alexander Bayandin	a19c487766	Nightly Benchmarks: add TPC-H benchmark (#2978) Ref: https://www.tpc.org/tpch/	2022-12-08 15:32:49 +00:00
Kirill Bulatov	6a57d5bbf9	Make the request tracing test more useful	2022-12-06 23:52:16 +02:00
Alexander Bayandin	ab073696d0	test_bulk_update: use new prefetch settings (#3007) Replace `seqscan_prefetch_buffers` with `effective_io_concurrency` & `maintenance_io_concurrency` in one more place (the last one!)	2022-12-05 10:56:01 +00:00
Alexander Bayandin	480175852f	Nightly Benchmarks: add OLAP-style benchmark (clickbench) (#2855) Add ClickBench benchmark, an OLAP-style benchmark, to Nightly Benchmarks. The full run of 43 queries on the original dataset takes more than 6h (only 34 queries were processed in 6h) on our default-sized compute. Having this, currently, would mean having some really unstable tests because of our regular deployment to staging/captest environment (see https://github.com/neondatabase/cloud/issues/1872). I've reduced the dataset size to the first 10^7 rows from the original 10^8 rows. Now it takes ~30-40 minutes to pass. Ref https://github.com/ClickHouse/ClickBench/tree/main/aurora-postgresql Ref https://benchmark.clickhouse.com/	2022-11-25 18:41:26 +00:00
Heikki Linnakangas	15db566420	Allow setting gc/compaction_period to 0, to disable automatic GC/compaction Many python tests were setting the GC/compaction period to large values, to effectively disable GC / compaction. Reserve value 0 to mean "explicitly disabled". We also set them to 0 in unit tests now, although currently, unit tests don't launch the background jobs at all, so it won't have any effect. Fixes https://github.com/neondatabase/neon/issues/2917	2022-11-25 20:14:06 +02:00
Alexander Bayandin	1a316a264d	Disable statement timeout for performance tests (#2891) Fix `test_seqscans` by disabling statement timeout. Also, replace increasing statement timeout with disabling it for performance tests. This should make tests more stable and allow us to observe performance degradation instead of test failures.	2022-11-25 16:05:45 +00:00
Konstantin Knizhnik	21ec28d9bc	Add bulk update test (#2902)	2022-11-23 17:51:35 +02:00
bojanserafimov	c6f095a821	Fix remote seqscan test (#2878)	2022-11-21 17:21:47 -05:00
Alexander Bayandin	cb9b26776e	Fix test_seqscans on remote cluster (#2869) A remote project is reused between tests, so we need to ensure that we don't have a table with the same name already created.	2022-11-19 23:39:42 +00:00
bojanserafimov	2655bdbb2e	Add remote seqscans test (#2840)	2022-11-18 09:05:13 -05:00
andres	c11cbf0f5c	fix test_compare_child_and_root_pgbench_perf to do a fair comparison	2022-11-13 21:03:54 +02:00
bojanserafimov	7fd88fab59	Trace read requests (#2762)	2022-11-10 16:43:04 -05:00
bojanserafimov	7edc098c40	Add perf test instructions (#2777)	2022-11-10 16:05:57 -05:00
Alexander Bayandin	d5b7832c21	Fix test_wal_backpressure tests (#2792) Fix expected return type for `fetchone`: ``` AssertionError: assert False + where False = isinstance((Decimal('56048'), '55 kB', '0/1CF52D8', '0/1CE77E8'), list) ```	2022-11-10 16:15:04 +00:00
Vadim Kharitonov	f720dd735e	Stricter mypy linters for `test_runner/fixtures/*`	2022-11-10 12:47:27 +01:00
Andrés	9211923bef	Pageserver Python tests should not fail if the server is built with no testing feature (#2636) Co-authored-by: andres <andres.rodriguez@outlook.es>	2022-10-20 10:46:57 +03:00
Heikki Linnakangas	a22165d41e	Add tests for comparing root and child branch performance. Author: Thang Pham <thang@neon.tech>	2022-10-08 10:07:33 +03:00
Alexander Bayandin	ebab89ebd2	test_runner: pass password to pgbench via PGPASSWORD (#2468)	2022-09-23 12:51:33 +00:00
Konstantin Knizhnik	f3073a4db9	R-Tree layer map (#2317) Replace the layer array and linear search with R-tree So far, the in-memory layer map that holds information about layer files that exist, has used a simple Vec, in no particular order, to hold information about all the layers. That obviously doesn't scale very well; with thousands of layer files the linear search was consuming a lot of CPU. Replace it with a two-dimensional R-tree, with Key and LSN ranges as the dimensions. For the R-tree, use the 'rstar' crate. To be able to use that, we convert the Keys and LSNs into 256-bit integers. 64 bits would be enough to represent LSNs, and 128 bits would be enough to represent Keys. However, we use 256 bits, because rstar internally performs multiplication to calculate the area of rectangles, and the result of multiplying two 128 bit integers doesn't necessarily fit in 128 bits, causing integer overflow and, if overflow-checks are enabled, panic. To avoid that, we use 256 bit integers. Add a performance test that creates a lot of layer files, to demonstrate the benefit.	2022-09-22 08:35:06 +03:00
Heikki Linnakangas	a5019bf771	Use a simpler way to set extra options for benchmark test. Commit `43a4f7173e` fixed the case that there are extra options in the connection string, but broke it in the case when there are not. Fix that. But on second thoughts, it's more straightforward to set the options with ALTER DATABASE, so change the workflow yaml file to do that instead.	2022-09-20 13:48:50 +03:00
Heikki Linnakangas	e4f775436f	Don't override other options than statement_timeout in test conn string. In commit `6985f6cd6c`, I tried passing extra GUCs in the 'options' part of the connection string, but it didn't work because the pgbench test overrode it with the statement_timeout. Change it so that it adds the statement_timeout to any other options, instead of replacing them.	2022-09-20 09:46:15 +03:00
Egor Suvorov	e968b5e502	tests: do not set num_safekeepers = 1, it's the default (#2457) Also get rid of the `with_safekeepers` parameter in tests. Its meaning has changed: `False` meant "no safekeepers" which is not supported anymore, so we assume it's always `True`. See #1648	2022-09-15 21:43:51 +03:00
Kirill Bulatov	b8eb908a3d	Rename old project name references	2022-09-14 08:14:05 +03:00
Konstantin Knizhnik	eef7475408	Add tests for measuring effect of lsn caching (#2384) * Add tests for measuring effect of lsn caching * Fix formatting of test_latency.py * Fix test_lsn_mapping test	2022-09-03 17:06:19 +03:00
Heikki Linnakangas	47bd307cb8	Add python types to represent LSNs, tenant IDs and timeline IDs. (#2351) For better ergonomics. I always found it weird that we used UUID to actually mean a tenant or timeline ID. It worked because it happened to have the same length, 16 bytes, but it was hacky.	2022-09-02 10:16:47 +03:00
Alexander Bayandin	39a3bcac36	test_runner: fix flake8 warnings	2022-08-22 14:57:09 +01:00
Alexander Bayandin	4c2bb43775	Reformat all python files by black & isort	2022-08-22 14:57:09 +01:00
Alexander Bayandin	4cddb0f1a4	Set up a workflow to run pgbench against captest (#2077)	2022-08-15 18:54:31 +01:00
Dmitry Rodionov	cdfa9fe705	avoid duplicate parameter, increase timeout	2022-08-08 12:15:16 +03:00
Dmitry Rodionov	9430abae05	use event so it fires only if workload thread successfully finished	2022-08-08 12:15:16 +03:00
Dmitry Rodionov	4da4c7f769	increase statement timeout	2022-08-08 12:15:16 +03:00
Dmitry Rodionov	092a9b74d3	use only s3 in boto3-stubs and update mypy Newer version of mypy fixes buggy error when trying to update only boto3 stubs. However it brings new checks and starts to yell when we index into cursor.fetchone without checking for None first. So this introduces a wrapper to simplify querying for scalar values. I tried to use cursor_factory connection argument but without success. There can be a better way to do that, but this looks the simplest	2022-08-01 18:28:49 +03:00
Alexander Bayandin	9dcb9ca3da	test/performance: ensure we don't have tables that we're creating (#2135)	2022-07-22 11:00:05 +01:00
Thang Pham	ed102f44d9	Reduce memory allocations for page server (#2010) ## Overview This patch reduces the number of memory allocations when running the page server under a heavy write workload. This mostly helps improve the speed of WAL record ingestion. ## Changes - modified `DatadirModification` to allow reusing the struct's allocated memory after each modification - modified `decode_wal_record` to allow passing a `DecodedWALRecord` reference. This helps reuse the struct in each `decode_wal_record` call - added a reusable buffer for serializing objects inside the `InMemoryLayer::put_value` function - added a performance test simulating a heavy write workload for testing the changes in this patch ### Semi-related changes - removed redundant serializations when calling `DeltaLayer::put_value` during `InMemoryLayer::write_to_disk` function call [1] - removed the info span `info_span!("processing record", lsn = %lsn)` during each WAL ingestion [2] ## Notes - [1]: in `InMemoryLayer::write_to_disk`, a deserialization is called ``` let val = Value::des(&buf)?; delta_layer_writer.put_value(key, *lsn, val)?; ``` `DeltaLayer::put_value` then creates a serialization based on the previous deserialization ``` let off = self.blob_writer.write_blob(&Value::ser(&val)?)?; ``` - [2]: related: https://github.com/neondatabase/neon/issues/733	2022-07-21 12:08:26 -04:00
Konstantin Knizhnik	572ae74388	More precisely control size of inmem layer (#1927) * More precisely control size of inmem layer * Force recompaction of L0 layers if they contain large non-wallogged BLOBs to avoid too large layers * Add modified version of test_hot_update test (test_dup_key.py) which should generate large layers without large number of tables * Change test name in test_dup_key * Add Layer::get_max_key_range function * Add layer::key_iter method and implement new approach of splitting layers during compaction based on total size of all key values * Add test_large_schema test for checking layer file size after compaction * Make clippy happy * Restore checking LSN distance threshold for checkpoint in-memory layer * Optimize storage keys iterator * Update pageserver/src/layered_repository.rs Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> * Fix code style * Reduce number of tables in test_large_schema to make it fit in timeout with debug build * Fix style of test_large_schema.py * Fix handling of duplicate layers Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>	2022-07-21 07:45:11 +03:00
Thang Pham	160e52ec7e	Optimize branch creation (#2101) Resolves #2054 Context: branch creation needs to wait for GC to acquire `gc_cs` lock, which prevents creating new timelines during GC. However, because individual timeline GC iteration also requires `compaction_cs` lock, branch creation may also need to wait for compactions of multiple timelines. This results in large latency when creating a new branch, which we advertised as "instant". This PR optimizes the latency of branch creation by separating GC into two phases: 1. Collect GC data (branching points, cutoff LSNs, etc) 2. Perform GC for each timeline The GC bottleneck comes from step 2, which must wait for compaction of multiple timelines. This PR modifies the branch creation and GC functions to allow GC to hold the GC lock only in step 1. As a result, branch creation doesn't need to wait for compaction to finish but only needs to wait for GC data collection step, which is fast.	2022-07-19 14:56:25 -04:00
Alexander Bayandin	00c26ff3a3	Bring periodic perf tests on GitHub back (#2037) * test/fixtures: fix DeprecationWarning * workflows/benchmarking: increase timeout * test: switch pgbench to default (simple) query mode * test/performance: ensure we don't have tables that we're creating * workflows/pg_clients: remove unused env var * workflows/benchmarking: change platform name	2022-07-07 19:53:23 +01:00
bojanserafimov	84b9fcbbd5	Increase a few test timeouts (#1977)	2022-06-23 11:51:56 -04:00
Thang Pham	37465dafe3	Add wal backpressure tests (#1919) Resolves #1889. This PR adds new tests to measure the WAL backpressure's performance under different workloads. ## Changes - add new performance tests in `test_wal_backpressure.py` - allow safekeeper's fsync to be configurable when running tests	2022-06-20 11:40:55 -04:00
Thang Pham	6cfebc096f	Add read/write throughput performance tests (#1883) Part of #1467 This PR adds several performance tests that compare the [PG statistics](https://www.postgresql.org/docs/current/monitoring-stats.html) obtained when running PG benchmarks against Neon and vanilla PG to measure the read/write throughput of the DB.	2022-06-06 12:32:10 -04:00
bojanserafimov	90e2c9ee1f	Rename zenith to neon in python tests (#1871)	2022-06-02 16:21:28 -04:00
Kirill Bulatov	e5cb727572	Replace callmemaybe with etcd subscriptions on safekeeper timeline info	2022-06-01 16:07:04 +03:00
Heikki Linnakangas	ffbb9dd155	Add a 5 minute timeout to python tests. The CI times out after 10 minutes of no output. It's annoying if a test hangs and is killed by the CI timeout, because you don't get information about which test was running. Try to avoid that, by adding a slightly smaller timeout in pytest itself. You can override it on a per-test basis if needed, but let's try to keep our tests shorter than that. For the Postgres regression tests, use a longer 30 minute timeout. They're not really a single test, but many tests wrapped in a single pytest test. It's OK for them to run longer in aggregate, each Postgres test is still fairly short.	2022-05-19 14:04:14 +03:00
Anastasia Lubennikova	a2561f0a78	Use tenant's pitr_interval instead of hardcoded 0 in the command. Adjust python tests that use the	2022-05-13 18:32:14 +03:00
Thang Pham	ae20751724	update `ZenithCli::create_tenant` return signature (#1692) to include the initial timeline's ID in addition to the new tenant's ID. Context: follow-up of https://github.com/neondatabase/neon/pull/1689	2022-05-12 17:27:08 -04:00
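
Note on commit f3073a4db9 (R-Tree layer map): the message argues that 128-bit coordinates are not wide enough because the product of two 128-bit values may overflow when computing a rectangle's area. The following Python sketch only illustrates that arithmetic. The helper names `fits_in_bits` and `rectangle_area` are hypothetical and do not come from the neon code base or the 'rstar' crate.

```python
# Illustrative arithmetic only; the names here are hypothetical, not neon or rstar APIs.
U128_MAX = (1 << 128) - 1


def fits_in_bits(value: int, bits: int) -> bool:
    """Return True if a non-negative integer is representable in `bits` unsigned bits."""
    return 0 <= value < (1 << bits)


def rectangle_area(key_extent: int, lsn_extent: int) -> int:
    """Area of a rectangle whose sides are a key range and an LSN range."""
    return key_extent * lsn_extent


# Worst case when both sides are as large as a 128-bit integer allows.
worst_case_area = rectangle_area(U128_MAX, U128_MAX)

print("fits in 128 bits:", fits_in_bits(worst_case_area, 128))
print("fits in 256 bits:", fits_in_bits(worst_case_area, 256))
```

Python integers have arbitrary precision, so the product is computed exactly here; in a fixed-width 128-bit type the same multiplication would wrap around or panic, which is the situation the commit message describes.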

93 Commits
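
Note on commit 092a9b74d3: the message mentions a wrapper that simplifies querying for scalar values, introduced because mypy rejects indexing into the result of `fetchone` without a `None` check. A minimal sketch of such a wrapper is shown below. It uses the standard-library `sqlite3` module so that it runs on its own, whereas the tests in the repository talk to Postgres; the function name `query_scalar` and its signature are assumptions made for this illustration.

```python
import sqlite3
from typing import Any


def query_scalar(cur: sqlite3.Cursor, query: str) -> Any:
    """Run `query` and return the first column of the first row.

    Hypothetical helper: checking for None here means callers never
    index into an Optional result, which is what the type checker objects to.
    """
    cur.execute(query)
    row = cur.fetchone()
    if row is None:
        raise ValueError(f"query returned no rows: {query}")
    return row[0]


conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (x INTEGER)")

print(query_scalar(cur, "SELECT 41 + 1"))
```

Raising on an empty result is one possible design choice; returning `None` would push the check back onto every caller.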