rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2025-12-27 16:12:56 +00:00

Author	SHA1	Message	Date
Arthur Petukhovsky	5a6405848d	Bump vendor/postgres (#1086)	2022-01-05 14:27:51 +03:00
Alexey Kondratov	f64074c609	Move compute_tools from console repo (zenithdb/console#383 ) Currently it's included with minimal changes and lives aside of the main workspace. Later we may re-use and combine common parts with zenith control_plane. This change is mostly needed to unify cloud deployment pipeline: 1.1. build compute-tools image 1.2. build compute-node image based on the freshly built compute-tools 2. build zenith image So we can roll new compute image and new storage required by it to operate properly. Also it becomes easier to test console against some specific version of compute-node/-tools.	2021-12-28 20:17:29 +03:00
anastasia	980f5f8440	Propagate remote_consistent_lsn to safekeepers. Change meaning of lsns in HOT_STANDBY_FEEDBACK: flush_lsn = disk_consistent_lsn, apply_lsn = remote_consistent_lsn Update compute node backpressure configuration respectively. Update compute node configuration: set 'synchronous_commit=remote_write' in setup without safekeepers. This way compute node doesn't have to wait for data checkpoint on pageserver. This doesn't guarantee data durability, but we only use this setup for tests, so it's fine.	2021-12-24 15:32:54 +03:00
Arthur Petukhovsky	f56db3da68	Bump vendor/postgres (#996)	2021-12-21 16:53:08 +03:00
anastasia	bb5aba42eb	bump vendor/postgres to use correct backpressure commit	2021-12-08 18:57:18 +03:00
Dmitry Rodionov	557e3024cd	Forward pageserver connection string from compute to safekeeper This is needed for implementation of tenant rebalancing. With this change safekeeper becomes aware of which pageserver is supposed to be used for replication from this particular compute.	2021-12-06 21:28:49 +03:00
anastasia	b7685eb6ba	Enable backpressure	2021-12-06 12:49:42 +03:00
anastasia	c7f3b4e62c	Clarify the meaning of StandbyReply LSNs: write_lsn - The last LSN received and processed by pageserver's walreceiver. flush_lsn - same as write_lsn. At pageserver it doesn't guarantee data persistence, but it's fine. We rely on safekeepers. apply_lsn - The LSN at which pageserver guaranteed persistence of all received data (disk_consistent_lsn).	2021-12-06 12:49:42 +03:00
Arseny Sher	cba4da3f4d	Add term history to safekeepers. Persist full history of term switches on safekeepers instead of storing only the single term of the highest entry (called epoch). This allows easily and correctly find the divergence point of two logs and truncate the obsolete part before overwriting it with entries of the newer proposer(s). Full history of the proposer is transferred in separate message before proposer starts streaming; it is immediately persisted by safekeeper, though he might not yet have entries for some older terms there. That's because we can't atomically append to WAL and update the control file anyway, so locally available WAL must be taken into account when looking at the history. We should sometimes purge term history entries beyond truncate_lsn; this is not done here. Per https://github.com/zenithdb/rfcs/pull/12 Closes #296. Bumps vendor/postgres.	2021-12-03 12:43:57 +03:00
Konstantin Knizhnik	f72d4814b1	Extract page images from FPI WAL records (#949) * Extract page images from FPI WAL records * Fix issues reported in review	2021-11-30 12:57:26 +03:00
Alexey Kondratov	0d6bf14ecb	Use vendor/postgres rebased on top of REL_14_1	2021-11-12 16:27:43 +03:00
Arthur Petukhovsky	9aaa02bc9a	Fix high CPU usage in walproposer (#860) * Bump vendor/postgres * Update time limits for test_restarts_under_load	2021-11-10 17:18:07 +03:00
Arseny Sher	229dc7704f	Bump vendor/postgres.	2021-11-08 17:32:13 +03:00
Heikki Linnakangas	7ed39655dc	Bump vendor/postgres	2021-11-04 10:35:50 +02:00
Arseny Sher	1877bbc7cb	bump vendor/postgres to fix reconnection busy loop	2021-10-26 15:43:19 +03:00
Alexey Kondratov	9070a4dc02	Turn off back pressure by default	2021-10-22 01:40:43 +03:00
Konstantin Knizhnik	c310932121	Implement backpressure for compute node to avoid WAL overflow Co-authored-by: Arseny Sher <sher-ars@yandex.ru> Co-authored-by: Alexey Kondratov <kondratov.aleksey@gmail.com>	2021-10-21 18:15:50 +03:00
Heikki Linnakangas	feae7f39c1	Support read-only nodes Change 'zenith.signal' file to a human-readable format, similar to backup_label. It can contain a "PREV LSN: %X/%X" line, or a special value to indicate that it's OK to start with invalid LSN ('none'), or that it's a read-only node and generating WAL is forbidden ('invalid'). The 'zenith pg create' and 'zenith pg start' commands now take a node name parameter, separate from the branch name. If the node name is not given, it defaults to the branch name, so this doesn't break existing scripts. If you pass "foo@<lsn>" as the branch name, a read-only node anchored at that LSN is created. The anchoring is performed by setting the 'recovery_target_lsn' option in the postgresql.conf file, and putting the server into standby mode with 'standby.signal'. We no longer store the synthetic checkpoint record in the WAL segment. The postgres startup code has been changed to use the copy of the checkpoint record in the pg_control file, when starting in zenith mode.	2021-10-19 09:48:12 +03:00
Heikki Linnakangas	e3945d94fd	Store unlogged tables locally, and replace PD_WAL_LOGGED. All the changes are in the vendor/postgres side. However, because we now generate fewer Full Page Writes, the 'branch_behind' test needs to be modified so that it still generates enough WAL to consume a few WAL segments.	2021-10-06 10:58:15 +03:00
Heikki Linnakangas	86e14f2f1a	Bump vendor/postgres	2021-09-30 20:36:57 +03:00
Heikki Linnakangas	c846a824de	Bump vendor/postgres, to use buffered I/O in WAL redo process. Greatly reduces the CPU overhead in the WAL redo process.	2021-09-24 21:48:30 +03:00
Max Sharnoff	139936197a	bump vendor/postgres: Catch walkeeper ErrorResponse (#650 ) Postgres commit message: PQgetCopyData can sometimes indicate that the copy is done if the backend returns an error response. So while we still expect that the walkeeper never sends CopyDone, we can't expect it to never produce errors.	2021-09-23 14:55:38 -07:00
Arseny Sher	97c4cd4434	bump vendor/postgres	2021-09-23 12:22:53 +03:00
Heikki Linnakangas	296586b7ce	bump vendor/postgres	2021-09-20 18:52:55 +03:00
Heikki Linnakangas	ad5f16f724	Improve the protocol between Postgres and page server. - Use different message formats for different kinds of response messages. - Add an Error message, for passing errors from page server to Postgres. Previously, we would respond to 'exists' request with 'false', and to 'nblocks' request with 0, if an error happened. Fix those to return an error message to the client. GetPage requests had a mechanism to return an error, but it was just a flag with no error message. - Add a flag to requests, to indicate that we actually want the latest page version on the timeline, and the LSN is just a hint that we know that there haven't been any modifications since that LSN. The flag isn't used for anything yet, but I'm planning to use it to fix https://github.com/zenithdb/zenith/issues/567	2021-09-17 16:38:14 +03:00
Arseny Sher	87bc18972f	bump vendor/postgres	2021-09-16 11:41:29 +03:00
Arseny Sher	6dc66eefb6	bump vendor/postgres	2021-09-11 06:10:10 +03:00
Konstantin Knizhnik	f83108002b	Revert "Bump postgres version" This reverts commit `511873aaed`.	2021-09-07 15:06:43 +03:00
Konstantin Knizhnik	511873aaed	Bump postgres version	2021-09-07 15:05:08 +03:00
Arseny Sher	51d36b9930	bump vendor/postgres	2021-09-06 13:06:20 +03:00
Konstantin Knizhnik	b227c63edf	Set proper xl_prev in basebackup, when possible. In passing, fix two minor issues with basebackup: * check that we can't create branches with pre-initdb LSN's * normalize branch LSN's that are pointing to the segment boundary patch by @knizhnik closes #506	2021-09-03 14:58:59 +03:00
Stas Kelvich	8c07a36fda	Remove last_valid_lsn tracking in wal_receiver. There are two main reasons for that: a) Latest unfinished record may disapper after compute node restart, so let's try not leak volatile part of the WAL into the repository. Always use last_valid_record instead. That change requires different getPage@LSN logic in postgres -- we need to ask LSN's that point to some complete record instead of GetFlushRecPtr() that can point in the middle of the record. That was already done by @knizhnik to deal with the same problem during the work on `postgres --sync-safekeepers`. Postgres will use LSN's aligned on 0x8 boundary in get_page requests, so we also need to be sure that last_valid_record is aligned. b) Switch to get_last_record_lsn() in basebackup@no_lsn. When compute node is running without safekeepers and streams WAL directly to pageserver it is important to match basebackup LSN and LSN of replication start. Before this commit basebackup@no_lsn was waiting for last_valid_lsn and walreceiver started replication with last_record_lsn, which can be less. So replication was failing since compute node doesn't have requested WAL.	2021-09-02 12:06:12 +03:00
Max Sharnoff	625abf3c52	Bump vendor/postgres for walproposer cleanup ref zenithdb/postgres#69	2021-08-31 13:09:16 -07:00
anastasia	c0ace1efff	Bump vendor/postgres to use relsize cache.	2021-08-31 14:10:50 +03:00
Arseny Sher	6cbc08f1fb	bump pg version	2021-08-27 15:22:10 +03:00
Heikki Linnakangas	81dd4bc41e	Fix decoding XLOG_HEAP_DELETE and XLOG_HEAP_UPDATE records. Because the t_cid field was missing from the XlHeapDelete struct that corresponds to the PostgreSQL xl_heap_delete struct, the check for the XLH_DELETE_ALL_VISIBLE_CLEARED flag did not work correctly. Decoding XlHeapUpdate struct was also missing the t_cid field, but that didn't cause any immediate problems because in that struct, the t_cid field is after all the fields that the page server cares about. But fix that too, as it was an accident waiting to happen. The bug was mostly hidden by the VM page handling in zenith_wallog_page, where it forcibly generates a FPW record whenever a VM page is evicted: else if (forknum == VISIBILITYMAP_FORKNUM && !RecoveryInProgress()) { /* * Always WAL-log vm. * We should never miss clearing visibility map bits. * * TODO Is it too bad for performance? * Hopefully we do not evict actively used vm too often. */ XLogRecPtr recptr; recptr = log_newpage_copy(&reln->smgr_rnode.node, forknum, blocknum, buffer, false); XLogFlush(recptr); lsn = recptr; But that was just hiding the issue: it's still visible if you had a read-only node relying on the data in the page server, or you killed and restarted the primary node, or you started a branch. In the included test case, I used a new branch to expose this. Fixes https://github.com/zenithdb/zenith/issues/461	2021-08-24 15:59:25 +03:00
anastasia	ad8b5c3845	use updated vendor/postgres	2021-08-23 18:19:59 +03:00
Heikki Linnakangas	2450f82de5	Introduce a new "layered" repository implementation. This replaces the RocksDB based implementation with an approach using "snapshot files" on disk, and in-memory btreemaps to hold the recent changes. This make the repository implementation a configuration option. You can choose 'layered' or 'rocksdb' with "zenith init --repository-format=<format>" The unit tests have been refactored to exercise both implementations. 'layered' is now the default. Push/pull is not implemented. The 'test_history_inmemory' test has been commented out accordingly. It's not clear how we will implement that functionality; probably by copying the snapshot files directly.	2021-08-16 10:06:48 +03:00
Max Sharnoff	5eb1738e8b	Rework walkeeper protocol to use libpq (#366 ) Most of the work here was done on the postgres side. There's more information in the commit message there. (see: `04cfa326a5`) On the WAL acceptor side, we're now expecting 'START_WAL_PUSH' to initialize the WAL keeper protocol. Everything else is mostly the same, with the only real difference being that protocol messages are now discrete CopyData messages sent over the postgres protocol. For the sake of documentation, the full set of these messages is: <- recv: START_WAL_PUSH query <- recv: server info from postgres (type `ServerInfo`) -> send: walkeeper info (type `SafeKeeperInfo`) <- recv: vote info (type `RequestVote`) if node id mismatch: -> send: self node id (type `NodeId`); exit -> send: confirm vote (with node id) (type `NodeId`) loop: <- recv: info and maybe WAL block (type `SafeKeeperRequest` + bytes) (break loop if done) -> send: confirm receipt (type `SafeKeeperResponse`)	2021-08-13 11:25:16 -07:00
Heikki Linnakangas	f8de71eab0	Update vendor/postgres to fix race condition leading to CRC errors. Fixes https://github.com/zenithdb/zenith/issues/413	2021-08-13 14:02:26 +03:00
Dmitry Rodionov	ce5333656f	Introduce authentication v0.1. Current state with authentication. Page server validates JWT token passed as a password during connection phase and later when performing an action such as create branch tenant parameter of an operation is validated to match one submitted in token. To allow access from console there is dedicated scope: PageServerApi, this scope allows access to all tenants. See code for access validation in: PageServerHandler::check_permission. Because we are in progress of refactoring of communication layer involving wal proposer protocol, and safekeeper<->pageserver. Safekeeper now doesn’t check token passed from compute, and uses “hardcoded” token passed via environment variable to communicate with pageserver. Compute postgres now takes token from environment variable and passes it as a password field in pageserver connection. It is not passed through settings because then user will be able to retrieve it using pg_settings or SHOW .. I’ve added basic test in test_auth.py. Probably after we add authentication to remaining network paths we should enable it by default and switch all existing tests to use it.	2021-08-11 20:05:54 +03:00
anastasia	949ac54401	Add test of clog (pg_xact) truncation	2021-08-11 05:49:24 +03:00
Arseny Sher	1dc2ae6968	Point vendor/postgres to main.	2021-08-04 14:21:01 +03:00
Stas Kelvich	04ae63a5c4	use proper postgres version	2021-08-04 14:15:07 +03:00
Dmitry Rodionov	767590bbd5	support tenants this patch adds support for tenants. This touches mostly pageserver. Directory layout on disk is changed to contain new layer of indirection. Now path to particular repository has the following structure: <pageserver workdir>/tenants/<tenant id>. Tenant id has the same format as timeline id. Tenant id is included in pageserver commands when needed. Also new commands are available in pageserver: tenant_list, tenant_create. This is also reflected CLI. During init default tenant is created and it's id is saved in CLI config, so following commands can use it without extra options. Tenant id is also included in compute postgres configuration, so it can be passed via ServerInfo to safekeeper and in connection string to pageserver. For more info see docs/multitenancy.md.	2021-07-22 20:54:20 +03:00
Konstantin Knizhnik	3cded20662	Refactoring after Heikki review	2021-07-16 18:43:07 +03:00
Heikki Linnakangas	c5509b05de	Revert accidental change to vendor/postgres. I accidentally changed it in `befefe8d84`.	2021-07-16 12:37:10 +03:00
Heikki Linnakangas	befefe8d84	Run 'cargo fmt'. Fixes a few formatting discrepancies that had crept in recently.	2021-07-14 22:03:14 +03:00
Heikki Linnakangas	bfc27bee5e	Revert the fix to allegedly inaccurate comment. I misread the code. It does indeed only call checkpoint() every 10 segments. Revert that change, but keep the rest of the comment fixes.	2021-07-10 18:53:47 +03:00
Dmitry Ivanov	2712eaee15	[postgres] Enable seccomp bpf	2021-07-09 14:59:45 +03:00
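
Commit cba4da3f4d says that keeping the full history of term switches makes it easy to find the divergence point of two logs. The following is a minimal Rust sketch of that idea, not the safekeeper's actual code: `TermSwitch`, `divergence_point`, and the use of plain `u64` values for terms and LSNs are assumptions made for illustration. It compares the two histories entry by entry, takes the last entry they share, and returns the lower of the two positions where that term ends, capped by the WAL each side actually has locally.

```rust
/// One term switch: `term` began at position `lsn`.
/// (Hypothetical type; the names are assumptions for this sketch.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TermSwitch {
    pub term: u64,
    pub lsn: u64,
}

/// Return `(term, lsn)` of the highest point shared by two logs,
/// or `None` if their histories have no entry in common.
///
/// `a_end` and `b_end` are the ends of the WAL locally available on
/// each side; a history may list terms for which no WAL has arrived yet.
pub fn divergence_point(
    a: &[TermSwitch],
    a_end: u64,
    b: &[TermSwitch],
    b_end: u64,
) -> Option<(u64, u64)> {
    // Count the leading entries that are identical in both histories.
    let common = a.iter().zip(b.iter()).take_while(|(x, y)| x == y).count();
    if common == 0 {
        return None;
    }
    let last = common - 1;

    // A term ends where the next term starts, but never beyond the
    // WAL that is actually present locally.
    let term_end = |h: &[TermSwitch], end: u64| -> u64 {
        h.get(last + 1).map_or(end, |next| next.lsn.min(end))
    };

    Some((a[last].term, term_end(a, a_end).min(term_end(b, b_end))))
}
```

In this sketch, everything on the obsolete side past the returned LSN would be truncated before being overwritten with entries from the newer proposer.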

106 Commits
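
Commit feae7f39c1 describes the human-readable 'zenith.signal' file: a "PREV LSN: %X/%X" line, or the special values 'none' and 'invalid'. A rough Rust sketch of a reader for that format follows. The type and function names (`PrevLsn`, `parse_lsn`, `parse_signal`) are hypothetical, and the sketch assumes the special values appear in place of the LSN on the same "PREV LSN:" line, which the commit message does not spell out.

```rust
/// What the signal file says about the previous record LSN.
/// (Hypothetical type for illustration.)
#[derive(Debug, PartialEq, Eq)]
pub enum PrevLsn {
    /// A concrete LSN written as "%X/%X".
    Known(u64),
    /// 'none': it is OK to start with an invalid LSN.
    None,
    /// 'invalid': read-only node, generating WAL is forbidden.
    Invalid,
}

/// Parse PostgreSQL's "%X/%X" LSN text form into a 64-bit value:
/// the high 32 bits come before the slash, the low 32 bits after it.
fn parse_lsn(s: &str) -> Option<u64> {
    let (hi, lo) = s.split_once('/')?;
    let hi = u32::from_str_radix(hi, 16).ok()?;
    let lo = u32::from_str_radix(lo, 16).ok()?;
    Some((u64::from(hi) << 32) | u64::from(lo))
}

/// Find the "PREV LSN:" line and interpret its value.
/// Returns `None` if the line is missing or malformed.
pub fn parse_signal(contents: &str) -> Option<PrevLsn> {
    for line in contents.lines() {
        if let Some(value) = line.trim().strip_prefix("PREV LSN:") {
            return match value.trim() {
                "none" => Some(PrevLsn::None),
                "invalid" => Some(PrevLsn::Invalid),
                other => parse_lsn(other).map(PrevLsn::Known),
            };
        }
    }
    None
}
```

Modelling the three cases as an enum keeps the read-only case ('invalid') distinct from a missing or unparsable file, which the caller would likely want to treat as an error.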