
rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2025-12-27 08:09:58 +00:00

Author	SHA1	Message	Date
Arseny Sher	5865f85ae2	Add --id argument to safekeeper setting its unique u64 id. In preparation for storage node messaging. IDs are supposed to be monotonically assigned by the console. In tests it is issued by ZenithEnv; at the zenith cli level and fixtures, string name is completely replaced by integer id. Example TOML configs are adjusted accordingly. Sequential ids are chosen over Zid mainly because they are compact and easy to type/remember.	2022-02-23 08:33:50 +03:00
Dhammika Pathirana	b815f5fb9f	Add no_sync check in storage Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>	2022-02-22 12:01:12 -08:00
Arthur Petukhovsky	dba1d36a4a	Refactor WAL utils in safekeeper (#1290 ) wal_storage.rs was split up from timeline.rs, safekeeper.rs and send_wal.rs, and now contains all WAL related code from the safekeeper. Now there are PhysicalStorage for persisting WAL to disk and WalReader for reading it. This allows optimizing PhysicalStorage without affecting too much of other code. Also there is a separate structure for persisting control file now in control_file.rs.	2022-02-21 17:20:53 +03:00
Dhammika Pathirana	d2b896381a	Add safekeeper tenant tags in lsn/wal metrics Signed-off-by: Dhammika Pathirana <dhammika@gmail.com> Add tenant_id in lsn/wal metrics (#1234)	2022-02-18 08:26:37 -08:00
Dhammika Pathirana	009f6d4ae8	Fix safekeeper metric tags Signed-off-by: Dhammika Pathirana <dhammika@gmail.com> Use separate tags in sk storage file histo (#1234)	2022-02-18 08:26:37 -08:00
Kirill Bulatov	6eef401602	Move routerify behind zenith_utils	2022-02-10 08:33:22 -05:00
Kirill Bulatov	76b74349cb	Bump pageserver dependencies	2022-02-10 08:33:22 -05:00
Heikki Linnakangas	fa8a6c0e94	Reduce logging of walkeeper normal operations. It was printing a lot of stuff to the log with INFO level, for routine things like receiving or sending messages. Reduce the noise. The amount of logging was excessive, and it was also consuming a fair amount of CPU (about 20% of safekeeper's CPU usage in a little test I ran).	2022-02-10 08:34:30 +02:00
Egor Suvorov	2866a9e82e	Fix safekeeper LSN metrics (#1216 ) * Always initialize flush_lsn/commit_lsn metrics on a specific timeline, no more `n/a` * Update flush_lsn metrics missing from `cba4da3f4d` * Ensure that flush_lsn found on load is >= than both commit_lsn and truncate_lsn * Add some debug logging	2022-02-07 20:05:16 +03:00
Heikki Linnakangas	2d93b129a0	Avoid eprintln() in pageserver and walkeeper. Use log::error!() instead. I spotted a few of these "connection error" lines in the logs, without timestamps and the other stuff we print for all other log messages.	2022-02-05 17:59:31 +02:00
Arseny Sher	729ac38ea8	Centralize suspending/resuming timeline activity on safekeepers. Timeline is active whenever there is at least 1 connection from compute or pageserver is not caught up. Currently 'active' means callmemaybes are being sent. Fixes race: now suspend condition checking and callmemaybe unsubscribe happen under the same lock.	2022-02-03 02:34:10 +03:00
Arseny Sher	86045ac36c	Prefix per-cluster directory with ztenant_id in safekeeper. Currently ztimelineids are unique, but all APIs accept the pair, so let's keep it everywhere for uniformity. Carry around ZTTId containing both ZTenantId and ZTimelineId for simplicity. (existing clusters on staging ought to be preprocessed for that)	2022-01-27 17:22:07 +03:00
anastasia	5abe2129c6	Extend replication protocol with ZenithFeedback message to pass current_timeline_size to compute node. Put standby_status_update fields into ZenithFeedback and send them as one message. Pass value sizes together with keys in ZenithFeedback message.	2022-01-27 11:20:45 +03:00
Dmitry Rodionov	e6f2d70517	use 2021 rust edition	2022-01-25 18:48:49 +03:00
Dmitry Rodionov	458bc0c838	walkeeper: use named type as a key in callmemaybe subscriptions hashmap	2022-01-24 17:20:15 +03:00
Dmitry Rodionov	37c440c5d3	Introduce first version of tenant migration between pageservers. This patch includes attach/detach http endpoints in pageservers, some changes in callmemaybe handling inside safekeeper, and an integration test to check migration with and without load. There are still some rough edges that will be addressed in follow-up patches.	2022-01-24 17:20:15 +03:00
anastasia	81e94d1897	Add LSN and Backpressure descriptions to glossary.md	2022-01-24 12:52:30 +03:00
Kirill Bulatov	65290b2e96	Ensure every submodule compiles on its own	2022-01-21 17:34:15 +03:00
Dmitry Ivanov	d3542c34f1	Refactoring: use anyhow::Context's methods where possible	2022-01-19 16:33:48 +03:00
Heikki Linnakangas	dab30c27b6	Refactor thread management and shutdown. This introduces a new module to handle thread creation and shutdown. All page server threads are now registered in a global hash map, and there's a function to request individual threads to shut down gracefully. Thread shutdown request is signalled to the thread with a flag, as well as a Future that can be used to wake up async operations if shutdown is requested. Use that facility to have the libpq listener thread respond to pageserver shutdown, based on Kirill's earlier prototype (https://github.com/zenithdb/zenith/pull/1088). That addresses https://github.com/zenithdb/zenith/issues/1036; previously the libpq listener thread would not exit until one more connection arrived. This also eliminates a resource leak in the accept() loop. Previously, we added the JoinHandle of each new thread to a vector, but old handles for threads that had already exited were never removed.	2022-01-14 18:36:10 +02:00
Heikki Linnakangas	772d853dcf	Fix race condition leading to panic in walkeeper. The walkeeper launches two threads for each connection, and uses a guard object to remove the entry from the 'replicas' array when it finishes. But only the background thread held onto the guard object, so if the background thread finished before the other thread, the array entry would be removed prematurely, which led to a panic in the check_stop_streaming() call. Fixes https://github.com/zenithdb/zenith/issues/1103	2022-01-13 11:21:11 +02:00
Arseny Sher	ab4d272149	Add safekeeper --dump-control-file option. Hexalize zids there for better output; since Serde doesn't support several formats for one struct, on-disk representation is changed as well, make upgrade.rs cope with it.	2022-01-12 19:47:24 +03:00
Arseny Sher	233c4811db	Fix default safekeeper http port.	2022-01-11 10:13:27 +03:00
Arseny Sher	25a515b968	Don't call immediately on resume in callmemaybe. It creates busy loop if pageserver <-> safekeeper connection fails after it was established (e.g. currently due to 'segment checkpoint not found' error on pageserver). Also wake up callmemaybe thread regularly once in recall_period regardless of channel activity.	2022-01-03 20:44:36 +03:00
bojanserafimov	24eca8d58b	Parse cancel message in pq_proto (#1060 )	2021-12-28 16:43:44 -05:00
anastasia	eba897ffe7	send CallmeEvent::Unsubscribe request only when pageserver is caught up with safekeeper and it's time to stop streaming	2021-12-28 17:50:48 +03:00
Arseny Sher	a163650a99	Refactor Postgres command parsing in safekeeper. Do it separately with SafekeeperPostgresCommand enum as a result. Since query is always C string, switch postgres_backend process_query argument from Bytes to &str. Make passing ztli/ztenant id in safekeeper connection string optional; this is needed for upcoming intra-safekeeper heartbeat cmd which is not bound to any timeline.	2021-12-24 15:48:13 +03:00
anastasia	980f5f8440	Propagate remote_consistent_lsn to safekeepers. Change meaning of lsns in HOT_STANDBY_FEEDBACK: flush_lsn = disk_consistent_lsn, apply_lsn = remote_consistent_lsn Update compute node backpressure configuration respectively. Update compute node configuration: set 'synchronous_commit=remote_write' in setup without safekeepers. This way compute node doesn't have to wait for data checkpoint on pageserver. This doesn't guarantee data durability, but we only use this setup for tests, so it's fine.	2021-12-24 15:32:54 +03:00
Arseny Sher	56312522f9	Make safekeeper namings more consistent with reality. s/send_wal.rs/handler.rs s/SendWalHandler/SafekeeperPostgresHandler s/replication.rs/send_wal.rs	2021-12-21 13:24:23 +03:00
anastasia	3b61f364f7	Stop WAL streaming threads when compute node is shut down. WAL stream uses 2 connections: 1. Compute node (walproposer) -> Safekeeper (ReceiveWalConn module). When compute node is shut down, safekeeper needs to stop the respective receiving thread. Prior to this PR it didn't work because PostgresBackend didn't handle disconnection properly. 2. Safekeeper (ReplicationConn module) -> pageserver (walreceiver thread). When incoming WAL stream is gone, safekeeper can stop streaming WAL and cancel connection as soon as replica is caught up. Note that the WAL can be streamed to multiple replicas simultaneously; only disconnect the ones that are caught up to the last_recieved_lsn.	2021-12-20 12:34:28 +03:00
Arseny Sher	8f0cafd508	Grab safekeeper.lock on the whole directory instead of per tli. closes #976	2021-12-13 22:11:04 +03:00
Heikki Linnakangas	cb4a8396fb	Use rustls rather than native-tls in all dependencies. We depend on rustls in postgres_backend anyway, so might as well use it for all TLS stuff. Seems better to depend on only one library both from a security point of view, and because fewer dependencies means less code to compile. With this commit, we no longer depend on OpenSSL.	2021-12-10 15:14:27 +02:00
Heikki Linnakangas	c77e30116e	Split waldecoder.rs into two source files. Move the code for decoding a WAL stream into WAL records into 'postgres_ffi', and keep the code to parse the WAL records deeper in 'pageserver' crate, renamed to walrecord.rs. This tidies up the dependencies a bit. 'walkeeper' reuses the same waldecoder routines, and it used to depend on 'pageserver' because of that. Now it only depends on 'postgres_ffi'. (The comment in walkeeper/Cargo.toml that claimed that the dependency was needed for ZTimelineId was obsolete. ZTimelineId is defined in 'zenith_utils', the dependency was actually needed for the waldecoder.)	2021-12-10 15:14:13 +02:00
Heikki Linnakangas	9d369f158c	Update rust-s3 to version 0.28.0 0.28.0 includes two changes I submitted to upstream: - Add support for older ListObjects API, needed to use rust-s3 with Google Cloud Storage: https://github.com/durch/rust-s3/pull/229 - If file is smaller than one chunk, don't initiate multi-part upload. https://github.com/durch/rust-s3/pull/228 These are not critical for Zenith right now, but let's stay up-to-date.	2021-12-10 14:52:08 +02:00
Heikki Linnakangas	6ecd442fb9	Remove a bunch of unnecessary dependencies.	2021-12-10 14:24:33 +02:00
anastasia	5293e183c5	callmemaybe. review code cleanup	2021-12-09 13:31:49 +03:00
anastasia	93ff5f7ff0	Add default value for safekeeper --recall option. DEFAULT_RECALL_PERIOD is 1 second.	2021-12-09 13:31:49 +03:00
anastasia	41dce68bdd	callmemaybe refactoring - Don't spawn a separate thread for each connection. Instead use one thread per safekeeper, that iterates over all connections and sends callback requests for them. -Use tokio postgres to connect to the pageserver, to avoid spawning a new thread for each connection. callmemaybe review fixes: - Spawn all request_callback tasks separately. - Remember 'last_call_time' and only send request_callback if 'recall_period' has passed. - If task hasn't finished till next recall, abort it and try again. - Add pause/resume CallmeEvents to avoid spamming pageserver when connection already established.	2021-12-09 13:31:49 +03:00
Arseny Sher	37c85d5fd9	Switch safekeeper from log to tracing logging. Add context to wal acceptor and wal sender threads showing timeline id and unique id differentiating them.	2021-12-09 06:57:46 +03:00
Arthur Petukhovsky	450fb9eafe	Don't persist control file without sync (#966 )	2021-12-07 15:02:44 +03:00
Dmitry Rodionov	557e3024cd	Forward pageserver connection string from compute to safekeeper This is needed for implementation of tenant rebalancing. With this change safekeeper becomes aware of which pageserver is supposed to be used for replication from this particular compute.	2021-12-06 21:28:49 +03:00
Arseny Sher	bd34d7ecfc	Bump safekeeper control file version and allow reading the previous one. Should have been a part of `cba4da3f4d` to provide upgrade for previously existing clusters. Separates version independent header (magic + version) out of SafeKeeperState to choose what to deserialize.	2021-12-06 19:47:55 +03:00
Dmitry Ivanov	7cec13d1df	Improve shutdown story for code coverage This patch introduces fixes for several problems affecting LLVM-based code coverage: * Daemonizing parent processes should call _exit() to prevent coverage data file corruption (.profraw) due to concurrent writes. Implement proper shutdown handlers in safekeeper.	2021-12-06 13:27:52 +03:00
anastasia	c7f3b4e62c	Clarify the meaning of StandbyReply LSNs: write_lsn - The last LSN received and processed by pageserver's walreceiver. flush_lsn - same as write_lsn. At pageserver it doesn't guarantee data persistence, but it's fine. We rely on safekeepers. apply_lsn - The LSN at which pageserver guaranteed persistence of all received data (disk_consistent_lsn).	2021-12-06 12:49:42 +03:00
Arseny Sher	cba4da3f4d	Add term history to safekeepers. Persist full history of term switches on safekeepers instead of storing only the single term of the highest entry (called epoch). This allows easily and correctly find the divergence point of two logs and truncate the obsolete part before overwriting it with entries of the newer proposer(s). Full history of the proposer is transferred in separate message before proposer starts streaming; it is immediately persisted by safekeeper, though he might not yet have entries for some older terms there. That's because we can't atomically append to WAL and update the control file anyway, so locally available WAL must be taken into account when looking at the history. We should sometimes purge term history entries beyond truncate_lsn; this is not done here. Per https://github.com/zenithdb/rfcs/pull/12 Closes #296. Bumps vendor/postgres.	2021-12-03 12:43:57 +03:00
Arthur Petukhovsky	93cc40584d	Shutdown socket on CopyFail (#938 ) Fixes #935	2021-11-26 16:48:27 +03:00
Arseny Sher	e7ca8ef5a8	Use PG timelineid 1 everywhere. As changing it doesn't have useful meaning in Zenith. ref #824	2021-11-11 13:53:39 +03:00
Arseny Sher	5603259c53	In wal_proposer_recovery, don't wait outcoming WAL to be committed. Otherwise we're deadlocking ourselves. Oversight of `33007cc`.	2021-11-10 01:38:25 +03:00
Arseny Sher	ce15c62f35	Fix 'send WAL up to' debug logging.	2021-11-10 01:38:25 +03:00
Dmitry Rodionov	07dddfed28	Use more robust way to persist safekeeper control file. Now the safekeeper control file is updated in the following way: 1. Write data to temp file 2. Fsync the temporary file (if sync option is specified) 3. Rename temporary file to actual control file 4. Fsync containing directory (if sync option is specified) 5. Fsync file after rename (if sync option is specified). Note that action 5 is not mentioned anywhere as required, but it is done in postgres this way (see durable_rename). Also, because of the rename machinery, switch to using a dedicated lock file to prevent running several safekeepers concurrently on the same data. Cleanup: fsync control file after rename to match postgres behaviour.	2021-11-09 17:51:46 +03:00
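
The control-file update sequence listed in commit 07dddfed28 can be sketched roughly as below. This is a minimal illustration using only the Rust standard library, not the safekeeper's actual code: the function name `persist_durably`, the `.partial` temporary suffix, and the `sync` parameter are hypothetical names chosen for the example, and fsyncing a directory by opening it with `File::open` is assumed to run on a Unix-like system.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: replace the file at `path` with `data`, following the
/// five steps from the commit message. `sync` stands in for the "sync option".
fn persist_durably(path: &Path, data: &[u8], sync: bool) -> io::Result<()> {
    // Directory containing the target file ("." if the path has no parent).
    let dir = path
        .parent()
        .filter(|p| !p.as_os_str().is_empty())
        .unwrap_or(Path::new("."));
    // Hypothetical temp-file name: same path with a ".partial" extension.
    let tmp = path.with_extension("partial");

    // 1. Write data to the temporary file.
    let mut file = File::create(&tmp)?;
    file.write_all(data)?;
    // 2. Fsync the temporary file.
    if sync {
        file.sync_all()?;
    }
    drop(file);

    // 3. Rename the temporary file to the actual file.
    fs::rename(&tmp, path)?;

    if sync {
        // 4. Fsync the containing directory so the rename itself is persisted
        //    (assumes a Unix-like OS where a directory can be opened and synced).
        File::open(dir)?.sync_all()?;
        // 5. Fsync the file again after the rename, as the commit message says
        //    postgres does in durable_rename.
        File::open(path)?.sync_all()?;
    }
    Ok(())
}
```

A caller would invoke something like `persist_durably(Path::new("safekeeper.control"), &bytes, true)`; the file name here is only an example. Renaming within a single directory is what lets readers see either the old or the new contents, never a partially written file.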


188 Commits