rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2025-12-27 16:12:56 +00:00

Author	SHA1	Message	Date
Arseny Sher	86045ac36c	Prefix per-cluster directory with ztenant_id in safekeeper. Currently ztimelineids are unique, but all APIs accept the pair, so let's keep it everywhere for uniformity. Carry around ZTTId containing both ZTenantId and ZTimelineId for simplicity. (existing clusters on staging ought to be preprocessed for that)	2022-01-27 17:22:07 +03:00
anastasia	5abe2129c6	Extend replication protocol with ZenithFeedback message to pass current_timeline_size to compute node Put standby_status_update fields into ZenithFeedback and send them as one message. Pass values sizes together with keys in ZenithFeedback message.	2022-01-27 11:20:45 +03:00
Dmitry Rodionov	e6f2d70517	use 2021 rust edition	2022-01-25 18:48:49 +03:00
Dmitry Rodionov	458bc0c838	walkeeper: use named type as a key in callmemaybe subscriptions hashmap	2022-01-24 17:20:15 +03:00
Dmitry Rodionov	37c440c5d3	Introduce first version of tenant migration between pageservers This patch includes attach/detach http endpoints in pageservers. Some changes in callmemaybe handling inside safekeeper and an integrational test to check migration with and without load. There are still some rough edges that will be addressed in follow up patches	2022-01-24 17:20:15 +03:00
anastasia	81e94d1897	Add LSN and Backpressure descriptions to glossary.md	2022-01-24 12:52:30 +03:00
Kirill Bulatov	65290b2e96	Ensure every submodule compiles on its own	2022-01-21 17:34:15 +03:00
Dmitry Ivanov	d3542c34f1	Refactoring: use anyhow::Context's methods where possible	2022-01-19 16:33:48 +03:00
Heikki Linnakangas	dab30c27b6	Refactor thread management and shutdown This introduces a new module to handle thread creation and shutdown. All page server threads are now registered in a global hash map, and there's a function to request individual threads to shut down gracefully. Thread shutdown request is signalled to the thread with a flag, as well as a Future that can be used to wake up async operations if shutdown is requested. Use that facility to have the libpq listener thread respond to pageserver shutdown, based on Kirill's earlier prototype (https://github.com/zenithdb/zenith/pull/1088). That addresses https://github.com/zenithdb/zenith/issues/1036, previously the libpq listener thread would not exit until one more connection arrives. This also eliminates a resource leak in the accept() loop. Previously, we added the JoinHandle of each new thread to a vector but old handles for threads that had already exited were never removed.	2022-01-14 18:36:10 +02:00
Heikki Linnakangas	772d853dcf	Fix race condition leading to panic in walkeeper. The walkeeper launches two threads for each connection, and uses a guard object to remove entry from 'replicas' array, when finishes. But only the background thread held onto the guard object, so if the background thread finished before the other thread, the array entry would be removed prematurely, which led to panic in the check_stop_streaming() call. Fixes https://github.com/zenithdb/zenith/issues/1103	2022-01-13 11:21:11 +02:00
Arseny Sher	ab4d272149	Add safekeeper --dump-control-file option. Hexalize zids there for better output; since Serde doesn't support several formats for one struct, on-disk representation is changed as well, make upgrade.rs cope with it.	2022-01-12 19:47:24 +03:00
Arseny Sher	233c4811db	Fix default safekeeper http port.	2022-01-11 10:13:27 +03:00
Arseny Sher	25a515b968	Don't call immediately on resume in callmemaybe. It creates busy loop if pageserver <-> safekeeper connection fails after it was established (e.g. currently due to 'segment checkpoint not found' error on pageserver). Also wake up callmemaybe thread regularly once in recall_period regardless of channel activity.	2022-01-03 20:44:36 +03:00
bojanserafimov	24eca8d58b	Parse cancel message in pq_proto (#1060)	2021-12-28 16:43:44 -05:00
anastasia	eba897ffe7	send CallmeEvent::Unsubscribe request only when pageserver is caught up with safekeeper and it's time to stop streaming	2021-12-28 17:50:48 +03:00
Arseny Sher	a163650a99	Refactor Postgres command parsing in safekeeper. Do it separately with SafekeeperPostgresCommand enum as a result. Since query is always C string, switch postgres_backend process_query argument from Bytes to &str. Make passing ztli/ztenant id in safekeeper connection string optional; this is needed for upcoming intra-safekeeper heartbeat cmd which is not bound to any timeline.	2021-12-24 15:48:13 +03:00
anastasia	980f5f8440	Propagate remote_consistent_lsn to safekeepers. Change meaning of lsns in HOT_STANDBY_FEEDBACK: flush_lsn = disk_consistent_lsn, apply_lsn = remote_consistent_lsn Update compute node backpressure configuration respectively. Update compute node configuration: set 'synchronous_commit=remote_write' in setup without safekeepers. This way compute node doesn't have to wait for data checkpoint on pageserver. This doesn't guarantee data durability, but we only use this setup for tests, so it's fine.	2021-12-24 15:32:54 +03:00
Arseny Sher	56312522f9	Make safekeeper namings more consistent with reality. s/send_wal.rs/handler.rs s/SendWalHandler/SafekeeperPostgresHandler s/replication.rs/send_wal.rs	2021-12-21 13:24:23 +03:00
anastasia	3b61f364f7	Stop WAL streaming threads, when compute node is shut down. WAL stream uses the 2 connections: 1. Compute node (walproposer) -> Safekeeper (ReceiveWalConn module) When compute node is shut down, safekeeper needs to stop the respective receiving thread. Prior to this PR it didn't work because PostgresBackend haven't handled disconnection properly. 2. Safekeeper (ReplicationConn module) -> pageserver (walreceiver thread) When incoming WAL stream is gone, safekeeper can stop streaming WAL and cancel connection as soon as replica is caught up. Note that the WAL can be streamed to multiple replicas simultaneously, only disconnect ones that are caught up to the last_recieved_lsn.	2021-12-20 12:34:28 +03:00
Arseny Sher	8f0cafd508	Grab safekeeper.lock on the whole directory instead of per tli. closes #976	2021-12-13 22:11:04 +03:00
Heikki Linnakangas	cb4a8396fb	Use rustls rather than native-tls in all dependencies. We depends on rustls in postgres_backend anyway, so might as well use it for all TLS stuff. Seems better to depend on only one library both from a security point of view, and because fewer dependencies means less code to compile. With this commit, we no longer depend on OpenSSL.	2021-12-10 15:14:27 +02:00
Heikki Linnakangas	c77e30116e	Split waldecoder.rs into two source files. Move the code for decoding a WAL stream into WAL records into 'postgres_ffi', and keep the code to parse the WAL records deeper in 'pageserver' crate, renamed to walrecord.rs. This tidies up the dependencies a bit. 'walkeeper' reuses the same waldecoder routines, and it used to depend on 'pageserver' because of that. Now it only depends on 'postgres_ffi'. (The comment in walkeeper/Cargo.toml that claimed that the dependency was needed for ZTimelineId was obsolete. ZTimelineId is defined in 'zenith_utils', the dependency was actually needed for the waldecoder.)	2021-12-10 15:14:13 +02:00
Heikki Linnakangas	9d369f158c	Update rust-s3 to version 0.28.0 0.28.0 includes two changes I submitted to upstream: - Add support for older ListObjects API, needed to use rust-s3 with Google Cloud Storage: https://github.com/durch/rust-s3/pull/229 - If file is smaller than one chunk, don't initiate multi-part upload. https://github.com/durch/rust-s3/pull/228 These are not critical for Zenith right now, but let's stay up-to-date.	2021-12-10 14:52:08 +02:00
Heikki Linnakangas	6ecd442fb9	Remove a bunch of unnecessary dependencies.	2021-12-10 14:24:33 +02:00
anastasia	5293e183c5	callmemaybe. review code cleanup	2021-12-09 13:31:49 +03:00
anastasia	93ff5f7ff0	Add default value for safekeeper --recall option. DEFAULT_RECALL_PERIOD is 1 second.	2021-12-09 13:31:49 +03:00
anastasia	41dce68bdd	callmemaybe refactoring - Don't spawn a separate thread for each connection. Instead use one thread per safekeeper, that iterates over all connections and sends callback requests for them. -Use tokio postgres to connect to the pageserver, to avoid spawning a new thread for each connection. callmemaybe review fixes: - Spawn all request_callback tasks separately. - Remember 'last_call_time' and only send request_callback if 'recall_period' has passed. - If task hasn't finished till next recall, abort it and try again. - Add pause/resume CallmeEvents to avoid spamming pageserver when connection already established.	2021-12-09 13:31:49 +03:00
Arseny Sher	37c85d5fd9	Switch safekeeper from log to tracing logging. Add context to wal acceptor and wal sender threads showing timeline id and unique id differentiating them.	2021-12-09 06:57:46 +03:00
Arthur Petukhovsky	450fb9eafe	Don't persist control file without sync (#966)	2021-12-07 15:02:44 +03:00
Dmitry Rodionov	557e3024cd	Forward pageserver connection string from compute to safekeeper This is needed for implementation of tenant rebalancing. With this change safekeeper becomes aware of which pageserver is supposed to be used for replication from this particular compute.	2021-12-06 21:28:49 +03:00
Arseny Sher	bd34d7ecfc	Bump safekeeper control file version and allow reading the previous one. Should have been a part of `cba4da3f4d` to provide upgrade for previously existing clusters. Separates version independent header (magic + version) out of SafeKeeperState to choose what to deserialize.	2021-12-06 19:47:55 +03:00
Dmitry Ivanov	7cec13d1df	Improve shutdown story for code coverage This patch introduces fixes for several problems affecting LLVM-based code coverage: * Daemonizing parent processes should call _exit() to prevent coverage data file corruption (.profraw) due to concurrent writes. Implement proper shutdown handlers in safekeeper.	2021-12-06 13:27:52 +03:00
anastasia	c7f3b4e62c	Clarify the meaning of StandbyReply LSNs: write_lsn - The last LSN received and processed by pageserver's walreceiver. flush_lsn - same as write_lsn. At pageserver it doesn't guarantees data persistence, but it's fine. We rely on safekeepers. apply_lsn - The LSN at which pageserver guaranteed persistence of all received data (disk_consistent_lsn).	2021-12-06 12:49:42 +03:00
Arseny Sher	cba4da3f4d	Add term history to safekeepers. Persist full history of term switches on safekeepers instead of storing only the single term of the highest entry (called epoch). This allows easily and correctly find the divergence point of two logs and truncate the obsolete part before overwriting it with entries of the newer proposer(s). Full history of the proposer is transferred in separate message before proposer starts streaming; it is immediately persisted by safekeeper, though he might not yet have entries for some older terms there. That's because we can't atomically append to WAL and update the control file anyway, so locally available WAL must be taken into account when looking at the history. We should sometimes purge term history entries beyond truncate_lsn; this is not done here. Per https://github.com/zenithdb/rfcs/pull/12 Closes #296. Bumps vendor/postgres.	2021-12-03 12:43:57 +03:00
Arthur Petukhovsky	93cc40584d	Shutdown socket on CopyFail (#938) Fixes #935	2021-11-26 16:48:27 +03:00
Arseny Sher	e7ca8ef5a8	Use PG timelineid 1 everywhere. As changing it doesn't have useful meaning in Zenith. ref #824	2021-11-11 13:53:39 +03:00
Arseny Sher	5603259c53	In wal_proposer_recovery, don't wait outcoming WAL to be committed. Otherwise we're deadlocking ourselves. Oversight of `33007cc`.	2021-11-10 01:38:25 +03:00
Arseny Sher	ce15c62f35	Fix 'send WAL up to' debug logging.	2021-11-10 01:38:25 +03:00
Dmitry Rodionov	07dddfed28	Use more robust way to persist safekeeper control file. Now safekeeper control file updated in a following way: 1. Write data to temp file 2. Fsync the temporary file (if sync option is specified) 3. Rename temporary file to actual control file 4. Fsync containing directory (if sync option is specified) 5. Fsync file after rename (if sync option is specified). Note that action 5 is not mentioned anywhere as required but it is done in postgres this way (see durable_rename). Also because of the rename machinery switch to use dedicated lock file to prevent running several safekeepers concurrently on the same data cleanup fsync control file after rename to match postgres behaviour	2021-11-09 17:51:46 +03:00
Egor Suvorov	33007cc0bb	Safekeeper's `START_REPLICATION` handler: remove `stop_point`, do not handle `start_point == 0` (#777)	2021-11-04 14:50:33 +03:00
Dmitry Rodionov	987833e0b9	Propagate git SHA to zenith binaries Git commit sha is displayed when --version flag is used and is written to logs during service startup. Uses git_version crate when git is available, and GIT_VERSION environment variable otherwise which is the case for docker builds.	2021-11-04 14:22:29 +03:00
anastasia	85f8bf97f5	Name walkeeper threads to make debugging more convenient	2021-11-01 19:09:57 +03:00
Patrick Insinger	b532470792	Set SO_REUSEADDR for all TCP listeners	2021-10-29 12:45:26 -07:00
Egor Suvorov	7e552b645f	Add disk write/sync metrics to Safekeeper (#745)	2021-10-28 18:38:36 +03:00
Heikki Linnakangas	8c42dcc041	Fix safekeeper -D option. The -D option to specify working directory was broken: $ mkdir foobar $ ./target/debug/safekeeper -D foobar Error: failed to open "foobar/safekeeper.log" Caused by: No such file or directory (os error 2) This was because we both chdir'd into the specified directory, and also prepended the directory to all the paths. So in the above example, it actually tried to create the log file in "foobar/foobar/safekeeper.log" Change it to work the same way as in the pageserver: chdir to the specified directory, and leave 'workdir' always set to ".". We wouldn't necessarily need the 'workdir' variable in the config at all, and could assume that the current working directory is always the safekeeper data directory, but I'd like to keep this consistent with the pageserver. The page server doesn't assume that for the sake of unit tests. We don't currently have unit tests in the safekeeper that write to disk but we might want to in the future.	2021-10-22 08:39:58 +03:00
Egor Suvorov	c058d04250	Rename WalAcceptor to Safekeeper in most places (#741)	2021-10-21 18:26:43 +03:00
Konstantin Knizhnik	c310932121	Implement backpressure for compute node to avoid WAL overflow Co-authored-by: Arseny Sher <sher-ars@yandex.ru> Co-authored-by: Alexey Kondratov <kondratov.aleksey@gmail.com>	2021-10-21 18:15:50 +03:00
Arthur Petukhovsky	13f4e173c9	Wait for safekeepers to catch up in test_restarts_under_load (#776)	2021-10-20 14:42:53 +03:00
Arseny Sher	de744a44dd	Add /timeline http request to safekeeper returning its status. Which is mainly generational state (terms) and useful LSNs. Also add /status basic healthcheck request which is now used in tests to determine the safekeeper is up; this fixes #726. ref #115	2021-10-14 19:02:38 +03:00
Egor Suvorov	6b6b3f68be	Safekeeper metrics refactor (#747)	2021-10-13 16:28:24 +03:00
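
Commit 07dddfed28 above describes a five-step sequence for persisting the safekeeper control file (write to a temp file, fsync it, rename, fsync the directory, fsync the renamed file). A minimal sketch of that sequence follows, using only the Rust standard library. It is not the code from the commit: the function name `persist_control_file`, the `.partial` suffix, and the file name used in the usage note are hypothetical, and the directory fsync assumes a Unix-like system where a directory can be opened as a file.

```rust
use std::fs::{self, File, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: replace `dir/name` with `data` via write-temp-then-rename.
/// `sync` mirrors the "if sync option is specified" condition in the commit message.
fn persist_control_file(dir: &Path, name: &str, data: &[u8], sync: bool) -> io::Result<()> {
    let final_path = dir.join(name);
    // The ".partial" suffix is an assumption made for this sketch.
    let tmp_path = dir.join(format!("{}.partial", name));

    // 1. Write data to a temporary file in the same directory.
    let mut tmp = OpenOptions::new()
        .create(true)
        .write(true)
        .truncate(true)
        .open(&tmp_path)?;
    tmp.write_all(data)?;

    // 2. Fsync the temporary file so its contents are on disk before the rename.
    if sync {
        tmp.sync_all()?;
    }
    drop(tmp);

    // 3. Rename the temporary file over the actual control file.
    fs::rename(&tmp_path, &final_path)?;

    if sync {
        // 4. Fsync the containing directory so the rename itself is durable
        //    (assumes a Unix-like OS where opening a directory works).
        File::open(dir)?.sync_all()?;
        // 5. Fsync the file after the rename, as the commit message says
        //    Postgres does in durable_rename.
        File::open(&final_path)?.sync_all()?;
    }
    Ok(())
}
```

Calling `persist_control_file(dir, "safekeeper.control", bytes, true)` twice with different contents should leave only the second version at the final path and no `.partial` file behind. Because the rename is the only step that touches the final path, a reader of that path sees either the old or the new contents.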

177 Commits