rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-05 20:42:54 +00:00

Author	SHA1	Message	Date
Kirill Bulatov	a5e10c4f64	Tidy up pageserver's endpoints	2022-03-10 19:38:58 +02:00
Kirill Bulatov	7b5482bac0	Properly store the branch name mappings	2022-03-10 19:38:58 +02:00
Kirill Bulatov	c7569dce47	Allow passing initial timeline id into zenith CLI commands	2022-03-10 19:38:58 +02:00
Kirill Bulatov	4d0f7fd1e4	Update Zenith CLI config between runs	2022-03-10 19:38:58 +02:00
Kirill Bulatov	f49990ed43	Allow creating timelines by branching off ancestors	2022-03-10 19:38:58 +02:00
Kirill Bulatov	0c91091c63	Avoid point in time concept on pageserver level	2022-03-10 19:38:58 +02:00
Kirill Bulatov	10f811e886	Use `timeline` instead of `branch` in pageserver's API	2022-03-10 19:38:58 +02:00
Arseny Sher	f86cf93435	Refactor timeline creation on safekeepers, allowing storing peer ids. Have separate routine and http endpoint to create timeline on safekeepers. It is not used yet, i.e. timeline is still created implicitly, but we'll change that once infrastructure for learning which tlis are assigned to which safekeepers will be ready, preventing accidental creation by compute. Changes format of safekeeper control file, allowing to store set of peers. Knowing peers provides a part of foundation for peer recovery (calculating min horizons like truncate_lsn for WAL truncation and commit_lsn for sync-safekeepers replacement) and proper membership change; similarly, we don't yet use it for now. Employing cf file version bump, extracts tenant_id and timeline_id to top level where it is more suitable. Also adds a bunch of LSNs there and rename truncate_lsn to more specific peer_horizon_lsn.	2022-03-06 08:06:38 +03:00
Kirill Bulatov	9424bfae22	Use a separate newtype for ZId that (de)serialize as hex strings	2022-03-04 10:58:40 +02:00
Dmitry Rodionov	1d90b1b205	add node id to pageserver (#1310) * Add --id argument to safekeeper setting its unique u64 id. In preparation for storage node messaging. IDs are supposed to be monotonically assigned by the console. In tests it is issued by ZenithEnv; at the zenith cli level and fixtures, string name is completely replaced by integer id. Example TOML configs are adjusted accordingly. Sequential ids are chosen over Zid mainly because they are compact and easy to type/remember. * add node id to pageserver This adds node id parameter to pageserver configuration. Also I use a simple builder to construct pageserver config struct to avoid setting node id to some temporary invalid value. Some of the changes in test fixtures are needed to split init and start operations for environment. Co-authored-by: Arseny Sher <sher-ars@yandex.ru>	2022-03-04 01:10:42 +03:00
Heikki Linnakangas	993b544ad0	Change default parameters for back pressure Fixes issue #1238 and #1189. Extracted from PR #1194, with some comment editorialization by me. Author: Konstantin Knizhnik <knizhnik@zenith.tech>	2022-02-22 13:56:21 +03:00
Kirill Bulatov	76b74349cb	Bump pageserver dependencies	2022-02-10 08:33:22 -05:00
Dhammika Pathirana	1e8ca497e0	Fix safekeeper loopback addr (#1247) Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>	2022-02-10 09:23:53 +03:00
anastasia	5abe2129c6	Extend replication protocol with ZenithFeedback message to pass current_timeline_size to compute node Put standby_status_update fields into ZenithFeedback and send them as one message. Pass values sizes together with keys in ZenithFeedback message.	2022-01-27 11:20:45 +03:00
Dmitry Rodionov	e6f2d70517	use 2021 rust edition	2022-01-25 18:48:49 +03:00
Dmitry Ivanov	d3542c34f1	Refactoring: use anyhow::Context's methods where possible	2022-01-19 16:33:48 +03:00
Kirill Bulatov	8ab4c8a050	Code review fixes	2022-01-11 15:44:23 +02:00
Kirill Bulatov	7c4a653230	Propagate Zenith CLI's RUST_LOG env var to subprocesses	2022-01-11 15:44:23 +02:00
Kirill Bulatov	384b2a91fa	Pass generic pageserver params through zenith cli	2022-01-11 15:44:23 +02:00
anastasia	980f5f8440	Propagate remote_consistent_lsn to safekeepers. Change meaning of lsns in HOT_STANDBY_FEEDBACK: flush_lsn = disk_consistent_lsn, apply_lsn = remote_consistent_lsn Update compute node backpressure configuration respectively. Update compute node configuration: set 'synchronous_commit=remote_write' in setup without safekeepers. This way compute node doesn't have to wait for data checkpoint on pageserver. This doesn't guarantee data durability, but we only use this setup for tests, so it's fine.	2021-12-24 15:32:54 +03:00
Kirill Bulatov	114a757d1c	Use generic config parameters in pageserver cli Co-authored-by: Heikki Linnakangas <heikki.linnakangas@iki.fi>	2021-12-23 18:58:28 +02:00
Kirill Bulatov	673c297949	Download timelines on demand	2021-12-10 17:23:35 +02:00
Heikki Linnakangas	cb4a8396fb	Use rustls rather than native-tls in all dependencies. We depend on rustls in postgres_backend anyway, so might as well use it for all TLS stuff. Seems better to depend on only one library both from a security point of view, and because fewer dependencies means less code to compile. With this commit, we no longer depend on OpenSSL.	2021-12-10 15:14:27 +02:00
Heikki Linnakangas	6ecd442fb9	Remove a bunch of unnecessary dependencies.	2021-12-10 14:24:33 +02:00
Dmitry Rodionov	557e3024cd	Forward pageserver connection string from compute to safekeeper This is needed for implementation of tenant rebalancing. With this change safekeeper becomes aware of which pageserver is supposed to be used for replication from this particular compute.	2021-12-06 21:28:49 +03:00
Dmitry Ivanov	7cec13d1df	Improve shutdown story for code coverage This patch introduces fixes for several problems affecting LLVM-based code coverage: * Daemonizing parent processes should call _exit() to prevent coverage data file corruption (.profraw) due to concurrent writes. Implement proper shutdown handlers in safekeeper.	2021-12-06 13:27:52 +03:00
anastasia	b7685eb6ba	Enable backpressure	2021-12-06 12:49:42 +03:00
Heikki Linnakangas	1bc917324d	Use -m immediate for 'immediate' shutdown	2021-10-27 10:49:38 +03:00
Heikki Linnakangas	af429fb401	Improve 'zenith' CLI utility for safekeepers and a config file. The 'zenith' CLI utility can now be used to launch safekeepers. By default, one safekeeper is configured. There are new 'safekeeper start/stop' subcommands to manage the safekeepers. Each safekeeper is given a name that can be used to identify the safekeeper to start/stop with the 'zenith start/stop' commands. The safekeeper data is stored in '.zenith/safekeepers/<name>'. The 'zenith start' command now starts the pageserver and also all safekeepers. 'zenith stop' stops pageserver, all safekeepers, and all postgres nodes. Introduce new 'zenith pageserver start/stop' subcommands for starting/stopping just the page server. The biggest change here is to the 'zenith init' command. This adds a new 'zenith init --config=<path to toml file>' option. It takes a toml config file that describes the environment. In the config file, you can specify options for the pageserver, like the pg and http ports, and authentication. For each safekeeper, you can define a name and the pg and http ports. If you don't use the --config option, you get a default configuration with a pageserver and one safekeeper. Note that that's different from the previous default of no safekeepers. Any fields that are omitted in the configuration file are filled with defaults. You can also specify the initial tenant ID in the config file. A couple of sample config files are added in the control_plane/ directory. The --pageserver-pg-port, --pageserver-http-port, and --pageserver-auth options to 'zenith init' are removed. Use a config file instead. Finally, change the python test fixtures to use the new 'zenith' commands and the config file to describe the environment.	2021-10-27 10:49:38 +03:00
Heikki Linnakangas	710fe02d0b	Return success on 'zenith stop' if the page server is already stopped.	2021-10-27 01:10:24 +03:00
Heikki Linnakangas	de87aad990	Remove a few unused functions	2021-10-27 01:10:24 +03:00
Heikki Linnakangas	a064ebb64c	Cope with missing 'tenantid' in '.zenith/config' file. We generate the initial tenantid and store it in the file, so it shouldn't be missing. But let's cope with it. (This comes handy with the bigger changes I'm working on at https://github.com/zenithdb/zenith/pull/788)	2021-10-25 21:24:11 +03:00
Heikki Linnakangas	4726870e8d	Remove obsolete comment. We store the pageserver port in the .zenith/config file.	2021-10-25 21:16:58 +03:00
Heikki Linnakangas	3bbc106c70	Prefer long CLI option name for clarity.	2021-10-25 21:16:58 +03:00
Heikki Linnakangas	66eb081876	Improve comment on 'base_dir'	2021-10-25 21:16:58 +03:00
Alexey Kondratov	9070a4dc02	Turn off back pressure by default	2021-10-22 01:40:43 +03:00
Egor Suvorov	c058d04250	Rename WalAcceptor to Safekeeper in most places (#741)	2021-10-21 18:26:43 +03:00
Konstantin Knizhnik	c310932121	Implement backpressure for compute node to avoid WAL overflow Co-authored-by: Arseny Sher <sher-ars@yandex.ru> Co-authored-by: Alexey Kondratov <kondratov.aleksey@gmail.com>	2021-10-21 18:15:50 +03:00
anastasia	7f9d2a7d05	Change 'zenith tenant list' API to return tenant state added in `0dc7a3fc`	2021-10-21 11:04:22 +03:00
Heikki Linnakangas	feae7f39c1	Support read-only nodes Change 'zenith.signal' file to a human-readable format, similar to backup_label. It can contain a "PREV LSN: %X/%X" line, or a special value to indicate that it's OK to start with invalid LSN ('none'), or that it's a read-only node and generating WAL is forbidden ('invalid'). The 'zenith pg create' and 'zenith pg start' commands now take a node name parameter, separate from the branch name. If the node name is not given, it defaults to the branch name, so this doesn't break existing scripts. If you pass "foo@<lsn>" as the branch name, a read-only node anchored at that LSN is created. The anchoring is performed by setting the 'recovery_target_lsn' option in the postgresql.conf file, and putting the server into standby mode with 'standby.signal'. We no longer store the synthetic checkpoint record in the WAL segment. The postgres startup code has been changed to use the copy of the checkpoint record in the pg_control file, when starting in zenith mode.	2021-10-19 09:48:12 +03:00
Heikki Linnakangas	c2b468c958	Separate node name from the branch name in ComputeControlPlane This is in preparation for supporting read-only nodes. You can launch multiple read-only nodes on the same branch, so we need an identifier for each node, separate from the branch name.	2021-10-19 09:48:10 +03:00
anastasia	d7c9dd06f4	Implement graceful shutdown at 'pageserver stop': - perform checkpoint for each tenant repository. - wait for the completion of all threads. Add new option 'immediate' to 'pageserver stop' command to terminate the pageserver immediately.	2021-10-11 13:35:01 +03:00
Max Sharnoff	84f7dcd052	Fix clippy errors on nightly (2021-09-29) (#691) Most of the changes are for the new if-then-panic lint added in https://github.com/rust-lang/rust-clippy/pull/7669.	2021-10-01 15:45:42 -07:00
Heikki Linnakangas	d4eed61f57	Refactor code for parsing and creating postgresql.conf. There's surely more that could be done, but this makes it a bit more readable at least.	2021-09-23 19:34:27 +03:00
Kirill Bulatov	1aa7218fd6	Show underlying pageserver error details	2021-09-17 16:16:05 +03:00
Kirill Bulatov	7dda9f2894	Fix clippy lints and enable clippy checking in CI	2021-09-16 15:09:16 +03:00
Dmitry Rodionov	9563336d9a	Bring back check for interfering processes, add more comments and descriptive errors	2021-09-15 14:02:15 +03:00
Dmitry Rodionov	4ebe643d0c	Support parallel test running for python tests Support is done via pytest-xdist plugin. To use the feature add -n<concurrency> to pytest invocation e.g. pytest -n8 to run 8 tests in parallel. Changes in code are mostly about ports assigning. Previously port for pageserver was hardcoded without the ability to override through zenith cli and ports for started compute nodes were calculated twice, in zenith cli and in test code. Now zenith cli supports port arguments for pageserver and compute nodes to be passed explicitly. Tests are modified in such a way that each worker gets a non overlapping port range which can be configured and now contains 100 ports. These ports are distributed to test services (pageserver, wal acceptors, compute nodes) so they can work independently.	2021-09-15 14:02:15 +03:00
Dmitry Rodionov	dc897fb864	remove pageserver remotes support since we do not have tests for that and feature itself is delayed (#136)	2021-09-15 13:24:35 +03:00
Arseny Sher	0aec60938a	Make flush_lsn reported by safekeepers point to record boundary. Otherwise we produce corrupted record holes in WAL during compute node restart in case there was an unfinished record from the old compute, as these reports advance commit_lsn -- reliably persisted part of WAL. ref #549. Mostly by @knizhnik. I adjusted to make sure proposer always starts streaming since record beginning so we don't need special quirks for decoding in safekeeper.	2021-09-11 06:10:10 +03:00

167 Commits