
rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-08 14:02:55 +00:00

Author	SHA1	Message	Date
Egor Suvorov	0b5b2e8e0b	postgres_ffi/wal_craft: extract trait Crafter Make the intent of the code clearer.	2022-07-08 18:30:56 +02:00
Egor Suvorov	60e5dc10e6	postgres_ffi/wal_generate: use 'craft' instead of 'generate' It does very fine-tuned byte-to-byte WAL crafting, not a sloppy generation. Hence 'craft' sounds like a better description.	2022-07-08 18:30:56 +02:00
Thang Pham	1f5918b36d	Delay calculating the starting LSN when doing timeline branching (#2053) Previously, upon branching, if no starting LSN is specified, we determine the start LSN based on the source timeline's last record LSN in `timelines::create_timeline` function, which then calls `Repository::branch_timeline` to create the timeline. Inside the `LayeredRepository::branch_timeline` function, to start branching, we try to acquire a GC lock to prevent GC from removing data needed for the new timeline. However, a GC iteration takes time, so the GC lock can be held for a long period of time. As a result, the previously determined starting LSN can become invalid because of GC. This PR fixes the above issue by delaying the LSN calculation part and moving it to be inside `LayeredRepository::branch_timeline` function.	2022-07-08 10:29:29 -04:00
Egor Suvorov	80b7a3b51a	Test what happens when XLOG_SWITCH ends on page boundary, fix #1991	2022-07-08 15:37:26 +02:00
Egor Suvorov	85bda437de	postgres_ffi/wal_generate: add last_wal_record_xlog_switch and use it in tests Fix #1190: WalDecoder did not return correct LSN of the next record after processing a XLOG_SWITCH record	2022-07-08 15:37:26 +02:00
MMeent	52f445094a	Update vendor/postgres to 14.4 (#2049) Co-authored-by: Matthias van de Meent <matthias@neon.tech>	2022-07-08 14:51:44 +02:00
Egor Suvorov	bcdee3d3b5	test_runner: add test_crafted_wal_end.py For some reason both non-`simple` tests spend about 10 seconds in the post-restart `INSERT INTO` query on my machine, see #2023	2022-07-08 13:56:37 +02:00
Egor Suvorov	c08fa9d562	postgres_ffi/wal_generate: support generating WAL for an already running Postgres server * ensure_server_config() function is added to ensure the server does not have background processes which interfere with WAL generation * Rework command line syntax * Add `print-postgres-config` subcommand which prints the required server configuration	2022-07-08 13:56:37 +02:00
Alexander Bayandin	00c26ff3a3	Bring periodic perf tests on GitHub back (#2037) * test/fixtures: fix DeprecationWarning * workflows/benchmarking: increase timeout * test: switch pgbench to default(simple) query mode * test/performance: ensure we don't have tables that we're creating * workflows/pg_clients: remove unused env var * workflows/benchmarking: change platform name	2022-07-07 19:53:23 +01:00
Dmitry Rodionov	ec0faf3ac6	retry timeline delete	2022-07-07 21:20:04 +03:00
Dmitry Rodionov	1a5af6d7a5	extend detach/delete tests	2022-07-07 21:20:04 +03:00
Dmitry Rodionov	520ffb341b	fix pageserver openapi spec	2022-07-07 21:20:04 +03:00
Dmitry Rodionov	9f2b40645d	review cleanup, point timeline/detach to timeline/delete	2022-07-07 21:20:04 +03:00
Dmitry Rodionov	168214e0b6	use tenant status endpoint to check whether timelines were downloaded or not	2022-07-07 21:20:04 +03:00
Dmitry Rodionov	d9d4ef12c3	review cleanup	2022-07-07 21:20:04 +03:00
Dmitry Rodionov	e1e24336b7	review adjustments, bring back timeline_detach and rename it to timeline_delete	2022-07-07 21:20:04 +03:00
Dmitry Rodionov	4c54e4b37d	switch to per-tenant attach/detach download operations of all timelines for one tenant are now grouped together so when attach is invoked pageserver downloads all of them and registers them in a single apply_sync_status_update call so branches can be used safely with attach/detach	2022-07-07 21:20:04 +03:00
Andrey Taranik	ae116ff0a9	update timeout for proxy deploy (#2047)	2022-07-07 18:09:57 +03:00
Heikki Linnakangas	e6ea049165	If an error happens during import of base backup or WAL, log it. We only sent the error to the client, with no trace in the pageserver log. Log it, similar to how we log errors in GetPage@LSN requests.	2022-07-07 16:05:13 +03:00
Alexey Kondratov	747d009bb4	Fix panic while waiting for Postgres readiness in the compute_ctl (#2021) We were reading Postgres pid file and looking for the 'ready' status, but it could be empty or we could not read it. So add all the checks.	2022-07-07 11:56:58 +02:00
Alexander Bayandin	cb5df3c627	github/actions: set missing VIP_VAP_ACCESS_TOKEN (#2045)	2022-07-07 10:47:03 +01:00
Heikki Linnakangas	0e3456351f	Shrink thread pools used for WAL receivers and background tasks. I noticed that the pageserver has a very large virtual memory size, several GB, even though it doesn't actually use that much memory. That's not much of a problem normally, but I hit it because I wanted to run tests with a limited virtual memory size, by calling setrlimit(RLIMIT_AS), but the highest limit you can set is 2 GB. I was not able to start pageserver with a limit of 2 GB. On Linux, each thread allocates 32 MB of virtual memory. I read this on some random forum on the Internet, but unfortunately could not find the source again now. Empirically, reducing the number of threads clearly helps to bring down the virtual memory size. Aside from the virtual memory usage, it seems excessive to launch 40 threads in both of those thread pools. The tokio default is to have as many worker threads as there are CPU cores in the system. That seems like a fine heuristic for us, too, so remove the explicit setting of the pool size and rely on the default. Note that the GC and compaction tasks are actually run with tokio spawn_blocking, so the threads that are actually doing the work, and possibly waiting on I/O, are not consuming threads from the thread pool. The WAL receiver work is done in the tokio worker threads, but the WAL receivers are more CPU bound so that seems OK. Also remove the explicit maximum on blocking tasks. I'm not sure what the right value for that would be, or whether the value we set (100) would be better than the tokio default (512). Since the value was arbitrary, let's just rely on the tokio default for that, too.	2022-07-06 22:36:38 +03:00
Alexander Bayandin	1faf49da0f	github/actions: set PERF_TEST_RESULT_CONNSTR from secrets (#2040)	2022-07-06 19:24:06 +01:00
bojanserafimov	4a96259bdd	Add export/import test (#2036)	2022-07-06 13:45:26 -04:00
bojanserafimov	242af75653	Fix signal file parsing (#2042)	2022-07-06 13:45:02 -04:00
Arthur Petukhovsky	8fabdc6708	Add tests with concurrent computes. Removes test_restart_compute, as added test_compute_restarts is stronger.	2022-07-06 18:07:29 +04:00
Alexander Bayandin	07df7c2edd	github/actions: fix storing perf data for main (#2038)	2022-07-06 13:15:15 +01:00
Kirill Bulatov	50821c0a3c	Return download stream directly from the remote storage API	2022-07-05 21:45:15 +03:00
Andrey Taranik	68adfe0fc8	inventory file fix for neon-stress env	2022-07-05 21:29:03 +04:00
Dmitry Rodionov	cfdf79aceb	harden create_empty_timeline Reorder checks so it checks whether the timeline exists before writing something to disk, possibly replacing valid content	2022-07-05 16:44:18 +03:00
bojanserafimov	32560e75d2	Enable relocation test (#1974)	2022-07-05 08:27:57 -04:00
Heikki Linnakangas	bb69e0920c	Do not overwrite an existing image layer. See github issues #1594 and #1690 Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2022-07-05 14:45:31 +03:00
Alexander Bayandin	05f6a1394d	Add tests for different Postgres client libraries (#2008) * Add tests for different postgres clients * test/fixtures: sanitize test name for test_output_dir * test/fixtures: do not look for etcd before runtime * Add workflow for testing Postgres client libraries	2022-07-05 12:22:58 +01:00
Heikki Linnakangas	844832ffe4	Bump vendor/postgres Contains changes from two PRs in vendor/postgres: - https://github.com/neondatabase/postgres/pull/163 - https://github.com/neondatabase/postgres/pull/176	2022-07-05 10:55:03 +03:00
bojanserafimov	d29c545b5d	Gc/compaction thread pool, take 2 (#1933) Decrease the number of pageserver threads by running gc and compaction in a blocking tokio thread pool	2022-07-05 02:06:40 -04:00
Kirill Bulatov	6abdb12724	Fix 1.62 Clippy errors	2022-07-04 23:46:37 +03:00
Alexander Bayandin	7898e72990	Remove duplicated checks from LocalEnv	2022-07-04 22:35:00 +03:00
Dmitry Rodionov	65704708fa	remove unused imports, make more use of pathlib.Path	2022-07-01 18:56:51 +03:00
Arseny Sher	6100a02d0f	Prefix WAL files in s3 with environment name. It wasn't merged to prod yet, so safe to enable.	2022-07-01 19:21:28 +04:00
Arseny Sher	97fed38213	Fix `cadaca010c` for older ssh clients.	2022-07-01 19:20:59 +04:00
Arseny Sher	cadaca010c	Make ansible work with storage nodes through teleport from local box.	2022-07-01 16:58:34 +03:00
Bojan Serafimov	f09c09438a	Fix gc after import	2022-07-01 11:10:49 +03:00
Dmitry Rodionov	00fc696606	replace extra urlencode dependency with already present url library	2022-06-30 14:32:15 +03:00
Kirill Bulatov	1d0706cf25	Fix walreceiver connection selection mechanism * Avoid reconnecting to safekeeper immediately after its failure by limiting candidates to those with fewest connection attempts. Thus we don't have to wait lagging_wal_timeout (10s by default) before switch happens even if no new changes are generated, and current test_restarts_under_load expects some commits to happen within 4s. * Make default max_lsn_wal_lag larger, otherwise constant reconnections happen during normal work. * Fix wal_connection_attempts maintenance, preventing busy loop of reconnections.	2022-06-30 00:40:12 +03:00
Dmitry Ivanov	5ee19b0758	Fix bloated coverage uploads (#2005) Move coverage data to a better directory, merge it better and don't publish it from CircleCI pipeline	2022-06-29 17:59:19 +03:00
Kirill Bulatov	cef90d9220	Disable cachepot for GH Actions builds (#2007)	2022-06-29 17:56:02 +03:00
Kirill Bulatov	4a05413a4c	More code coverage fixes in GH Actions (#2002)	2022-06-27 22:40:20 +03:00
Kirill Bulatov	dd61f3558f	Fix coverage upload credentials retrieval (#2001)	2022-06-27 20:41:09 +03:00
Kirill Bulatov	8a714f1ebf	Add coverage to GH actions and rework part of them (#1987)	2022-06-27 19:15:56 +03:00
Arseny Sher	137291dc24	Push to etcd from safekeeper many timelines concurrently. Mitigates latency fee, making push throughput 1-1.5 orders of magnitude higher. Also make leases per timeline, not per whole safekeeper, avoiding storing garbage in etcd for deleted timelines while safekeeper is alive.	2022-06-27 16:30:21 +03:00
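
Commit 747d009bb4 in the list above describes hardening a readiness check that reads the Postgres pid file and looks for the 'ready' status, where the file may be empty or unreadable. The following is only a minimal sketch of that kind of defensive check, not the actual compute_ctl code: the function names `status_is_ready` and `postgres_is_ready` are hypothetical, and it assumes the status appears as its own line somewhere in the file.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper (not from the neon code base): returns true if any
/// line of the pid file content, after trimming whitespace, is "ready".
/// Empty content simply yields false instead of panicking.
fn status_is_ready(content: &str) -> bool {
    content.lines().any(|line| line.trim() == "ready")
}

/// Hypothetical wrapper: treats a missing pid file as "not ready yet" and
/// returns any other I/O error to the caller rather than unwrapping it.
fn postgres_is_ready(pid_file: &Path) -> io::Result<bool> {
    match fs::read_to_string(pid_file) {
        Ok(content) => Ok(status_is_ready(&content)),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(false),
        Err(e) => Err(e),
    }
}
```

A caller polling for readiness could invoke `postgres_is_ready` in a loop with a timeout, logging and retrying on `Ok(false)` and deciding separately how to handle `Err`.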


1785 Commits