This calculation is not that heavy, but it is only needed in tests, and
when the number of tenants/timelines is high it can take noticeable
time.
Resolves https://github.com/zenithdb/zenith/issues/804
The 'zenith' CLI utility can now be used to launch safekeepers. By
default, one safekeeper is configured. There are new 'safekeeper
start/stop' subcommands to manage the safekeepers. Each safekeeper is
given a name, which identifies it to the 'zenith safekeeper start/stop'
subcommands. The safekeeper data is stored in
'.zenith/safekeepers/<name>'.
The 'zenith start' command now starts the pageserver and also all
safekeepers. 'zenith stop' stops the pageserver, all safekeepers, and
all postgres nodes.
Introduce new 'zenith pageserver start/stop' subcommands for
starting/stopping just the page server.
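For illustration, a Python test helper might drive the new subcommands
like this (a minimal sketch; the safekeeper name 'sk1' and the exact
argument syntax are assumptions, not the authoritative CLI surface):

    import subprocess

    def zenith(*args: str) -> None:
        # Run the 'zenith' CLI, raising if the command fails.
        subprocess.run(["zenith", *args], check=True)

    zenith("start")                      # pageserver plus all safekeepers
    zenith("safekeeper", "stop", "sk1")  # one safekeeper, by name
    zenith("safekeeper", "start", "sk1")
    zenith("pageserver", "stop")         # just the page server
    zenith("pageserver", "start")
    zenith("stop")                       # pageserver, safekeepers, postgres nodes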
The biggest change here is to the 'zenith init' command. This adds a
new 'zenith init --config=<path to toml file>' option. It takes a toml
config file that describes the environment. In the config file, you
can specify options for the pageserver, like the pg and http ports,
and authentication. For each safekeeper, you can define a name and the
pg and http ports. If you don't use the --config option, you get a
default configuration with a pageserver and one safekeeper. Note that
that's different from the previous default of no safekeepers. Any
fields that are omitted in the configuration file are filled with
defaults. You can also specify the initial tenant ID in the config
file. A couple of sample config files are added in the control_plane/
directory.
The --pageserver-pg-port, --pageserver-http-port, and
--pageserver-auth options to 'zenith init' are removed. Use a config
file instead.
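As a rough sketch of what driving 'zenith init' with a config file can
look like, in Python (the TOML key names and all values below are
illustrative guesses, not the authoritative schema):

    import subprocess
    from textwrap import dedent

    # Hypothetical config: pageserver ports and auth, one named safekeeper,
    # and an initial tenant ID. Omitted fields fall back to defaults.
    config = dedent("""\
        initial_tenant_id = '1f2d3e4c5b6a79881726354495a6b7c8'

        [pageserver]
        pg_port = 64000
        http_port = 9898
        auth_type = 'Trust'

        [[safekeepers]]
        name = 'sk1'
        pg_port = 5454
        http_port = 7676
        """)

    with open("env.toml", "w") as f:
        f.write(config)

    subprocess.run(["zenith", "init", "--config=env.toml"], check=True)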
Finally, change the python test fixtures to use the new 'zenith'
commands and the config file to describe the environment.
We've seen some failures with "Address already in use" errors in the
tests. It's not clear why; perhaps some server processes are not cleaned
up properly after a test, or maybe the socket is still in the TIME_WAIT
state. In any case, let's make the tests more robust by checking that
the port is free before trying to use it.
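A minimal form of that check (the helper name here is ours, not
necessarily what the fixtures call it):

    import socket

    def assert_port_is_free(port: int, host: str = "127.0.0.1") -> None:
        # Try to bind the port: if something is still listening on it (or
        # it is otherwise unavailable), bind() fails and we get a clear
        # error up front instead of a confusing failure later in the test.
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            try:
                s.bind((host, port))
            except OSError as err:
                raise RuntimeError(f"port {port} is not free: {err}") from err

    assert_port_is_free(64000)  # example port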
We generate the initial tenantid and store it in the file, so it shouldn't
be missing. But let's cope with it if it is. (This comes in handy with the
bigger changes I'm working on at https://github.com/zenithdb/zenith/pull/788)
Instead of having a lot of separate fixtures for setting up the page
server, the compute nodes, the safekeepers etc., have one big ZenithEnv
object that encapsulates the whole environment. Most tests use the
shared "zenith_simple_env" fixture, which contains the default setup:
a pageserver with no authentication and no safekeepers. Tests that
want to use safekeepers or authentication set up a custom test-specific
ZenithEnv fixture.
Gathering information about the whole environment into one object makes
some things simpler. For example, when a new compute node is created,
you no longer need to pass the 'wal_acceptors' connection string as
argument to the 'postgres.create_start' function. The 'create_start'
function fetches that information directly from the ZenithEnv object.
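In outline, a test against the shared environment now reads roughly
like this (any details beyond the names mentioned above are guesses):

    # 'zenith_simple_env' is the shared fixture described above: a ZenithEnv
    # with a pageserver, no authentication, and no safekeepers.
    def test_foobar(zenith_simple_env):
        env = zenith_simple_env
        # No 'wal_acceptors' argument any more: create_start fetches the
        # safekeeper connection information from the ZenithEnv itself.
        pg = env.postgres.create_start('test_foobar')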
Each test now gets its own test output directory, like
'test_output/test_foobar', even when TEST_SHARED_FIXTURES is used.
When TEST_SHARED_FIXTURES is not used, the zenith repo for each test
is created under a 'repo' subdir inside the test output dir, e.g.
'test_output/test_foobar/repo'.
The -D option to specify the working directory was broken:

    $ mkdir foobar
    $ ./target/debug/safekeeper -D foobar
    Error: failed to open "foobar/safekeeper.log"

    Caused by:
        No such file or directory (os error 2)
This was because we both chdir'd into the specified directory, and also
prepended the directory to all the paths. So in the above example, it
actually tried to create the log file in "foobar/foobar/safekeeper.log".
Change it to work the same way as in the pageserver: chdir to the
specified directory, and leave 'workdir' always set to ".".
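The logic error, reduced to a few lines of Python for brevity (the
safekeeper itself is Rust):

    import os

    os.makedirs("foobar", exist_ok=True)
    workdir = "foobar"
    os.chdir(workdir)  # the current directory is now ./foobar ...
    # ... but 'workdir' was still prepended to every path, so relative to
    # the original directory this points at ./foobar/foobar/safekeeper.log:
    broken = os.path.join(workdir, "safekeeper.log")
    # The fix: after the chdir, leave 'workdir' as "." so that paths
    # resolve inside the data directory itself.
    workdir = "."
    fixed = os.path.join(workdir, "safekeeper.log")
    print(broken, fixed)  # foobar/safekeeper.log vs ./safekeeper.log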
We wouldn't necessarily need the 'workdir' variable in the config at all,
and could assume that the current working directory is always the
safekeeper data directory, but I'd like to keep this consistent with the
pageserver. The page server doesn't assume that, for the sake of unit
tests. We don't currently have unit tests in the safekeeper that write
to disk, but we might want to in the future.
* We actually need Python 3.7 because of dataclasses (see the sketch after this list)
* Rerun 'pipenv lock' under Python 3.7 and add 'pipenv' to dev deps
* Update docs on developing for Python 3.7
* CircleCI: use Python 3.7 via Docker image instead of Orb
* Fix bugs found by mypy
* Add some missing types and runtime checks, remove unused code
* Make ZenithPageserver start right away for better type safety
* Add `types-*` packages to Pipfile
* Pin mypy version and run it on CircleCI
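On the dataclasses point in the first bullet: the module only entered
the standard library in Python 3.7, so code like the following (a
made-up example in the fixtures' style) needs the newer interpreter:

    from dataclasses import dataclass

    @dataclass
    class Safekeeper:
        # A hypothetical record type; @dataclass generates __init__,
        # __repr__ and __eq__ for us, but only exists on Python 3.7+.
        name: str
        pg_port: int
        http_port: int

    sk = Safekeeper(name='sk1', pg_port=5454, http_port=7676)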
This change causes the writer halves of a TLS stream to always flush after
a portion of bytes has been written by `std::io::copy`. Furthermore, some
cosmetic and minor functional changes are made to facilitate debugging.
* Add yapf run to CircleCI
* Pin yapf version
* Enable `SPLIT_ALL_TOP_LEVEL_COMMA_SEPARATED_VALUES` setting
* Reformat all existing code with slight manual adjustments
* test_runner/README: note that yapf is forced
Change 'zenith.signal' file to a human-readable format, similar to
backup_label. It can contain a "PREV LSN: %X/%X" line, or a special
value to indicate that it's OK to start with invalid LSN ('none'), or
that it's a read-only node and generating WAL is forbidden
('invalid').
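For illustration, parsing the new format could look like this (a sketch
that assumes the special values appear on the 'PREV LSN:' line itself;
the example LSN is made up):

    def parse_zenith_signal(contents: str):
        # zenith.signal is human-readable, like backup_label. 'none' means
        # it is OK to start with an invalid prev-LSN; 'invalid' marks a
        # read-only node where generating WAL is forbidden.
        for line in contents.splitlines():
            if line.startswith('PREV LSN: '):
                value = line[len('PREV LSN: '):].strip()
                if value in ('none', 'invalid'):
                    return value
                hi, lo = value.split('/')
                return (int(hi, 16), int(lo, 16))
        raise ValueError('no PREV LSN line found')

    print(parse_zenith_signal('PREV LSN: 0/16B5A50'))  # -> (0, 23812688)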
The 'zenith pg create' and 'zenith pg start' commands now take a node
name parameter, separate from the branch name. If the node name is not
given, it defaults to the branch name, so this doesn't break existing
scripts.
If you pass "foo@<lsn>" as the branch name, a read-only node anchored
at that LSN is created. The anchoring is performed by setting the
'recovery_target_lsn' option in the postgresql.conf file, and putting
the server into standby mode with 'standby.signal'.
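What the anchoring amounts to on disk, sketched in Python (the data
directory path and the example LSN are illustrative):

    from pathlib import Path

    def anchor_read_only_node(pgdata: Path, lsn: str) -> None:
        # Pin the node to a point in history: recovery stops at the target
        # LSN, and standby.signal keeps the server in standby mode so it
        # never starts generating WAL of its own.
        with (pgdata / 'postgresql.conf').open('a') as conf:
            conf.write(f"recovery_target_lsn = '{lsn}'\n")
        (pgdata / 'standby.signal').touch()

    pgdata = Path('pgdata')
    pgdata.mkdir(exist_ok=True)
    anchor_read_only_node(pgdata, '0/16B5A50')  # hypothetical LSN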
We no longer store the synthetic checkpoint record in the WAL segment.
The postgres startup code has been changed to use the copy of the
checkpoint record in the pg_control file, when starting in zenith
mode.
This is in preparation for supporting read-only nodes. You can launch
multiple read-only nodes on the same branch, so we need an identifier
for each node, separate from the branch name.
Previously, the first WAL record on the 'main' branch overwrote the
initial checkpoint record, with invalid 'xl_prev'. That's harmless, but
also pretty ugly. I bumped into this while I was trying to tighten up the
checks for when a valid 'prev_lsn' is required. With this patch, the
first WAL record gets a valid 'xl_prev' value. It doesn't matter much
currently, but let's be tidy.