rust/neon - neon - Gitea: Git with a cup of tea

rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-03 19:42:55 +00:00

Author	SHA1	Message	Date
Alexey Kondratov	b1df043794	Do cargo audit in CI	2021-08-16 19:06:43 +03:00
Heikki Linnakangas	7ee8de3725	Add metrics to WAL redo. Track the time spent on replaying WAL records by the special Postgres process, the time spent waiting for acces to the Postgres process (since there is only one per tenant), and the number of records replayed.	2021-08-16 15:49:17 +03:00
Heikki Linnakangas	047a05efb2	Minor formatting and comment fixes.	2021-08-16 15:48:59 +03:00
Dmitry Rodionov	0c4ab80eac	try to be more intelligent in WalAcceptor.start, added a bunch of typing sugar to wal acceptor fixtures	2021-08-16 14:27:44 +03:00
Heikki Linnakangas	2450f82de5	Introduce a new "layered" repository implementation. This replaces the RocksDB based implementation with an approach using "snapshot files" on disk, and in-memory btreemaps to hold the recent changes. This make the repository implementation a configuration option. You can choose 'layered' or 'rocksdb' with "zenith init --repository-format=<format>" The unit tests have been refactored to exercise both implementations. 'layered' is now the default. Push/pull is not implemented. The 'test_history_inmemory' test has been commented out accordingly. It's not clear how we will implement that functionality; probably by copying the snapshot files directly.	2021-08-16 10:06:48 +03:00
Max Sharnoff	5eb1738e8b	Rework walkeeper protocol to use libpq (#366 ) Most of the work here was done on the postgres side. There's more information in the commit message there. (see: `04cfa326a5`) On the WAL acceptor side, we're now expecting 'START_WAL_PUSH' to initialize the WAL keeper protocol. Everything else is mostly the same, with the only real difference being that protocol messages are now discrete CopyData messages sent over the postgres protocol. For the sake of documentation, the full set of these messages is: <- recv: START_WAL_PUSH query <- recv: server info from postgres (type `ServerInfo`) -> send: walkeeper info (type `SafeKeeperInfo`) <- recv: vote info (type `RequestVote`) if node id mismatch: -> send: self node id (type `NodeId`); exit -> send: confirm vote (with node id) (type `NodeId`) loop: <- recv: info and maybe WAL block (type `SafeKeeperRequest` + bytes) (break loop if done) -> send: confirm receipt (type `SafeKeeperResponse`)	2021-08-13 11:25:16 -07:00
Heikki Linnakangas	6e22a8f709	Refactor WAL redo to not use a separate thread. My main motivation is to make it easier to attribute time spent in WAL redo to the request that needed the WAL redo. With this patch, the WAL redo is performed by the requester thread, so it shows up in stack traces and in 'perf' report as part of the requester's call stack. This is also slightly simpler (less lines of code) and should be a bit faster too.	2021-08-13 17:23:36 +03:00
Heikki Linnakangas	f8de71eab0	Update vendor/postgres to fix race condition leading to CRC errors. Fixes https://github.com/zenithdb/zenith/issues/413	2021-08-13 14:02:26 +03:00
Heikki Linnakangas	8517d9696d	Move gc_iteration() function to Repository trait. The upcoming layered storage implementation handles GC as a repository-wide operation because it needs to pay attention to the branch points of all timelines.	2021-08-12 23:46:01 +03:00
Heikki Linnakangas	97f9021c88	Fix JWT token encoding issue in test. On my laptop, the server was receiving the token as a string with extra b'...' escaping, e.g as "b'eyJ0....0ifQA'" instead of just "eyJ0....0ifQA". That was causing the test to fail. I'm using Python 3.9, while the CI is using Python 3.8. I suspect that's why. My version of pyjwt might be different too. See also https://github.com/jpadilla/pyjwt/issues/391.	2021-08-12 20:46:14 +03:00
Heikki Linnakangas	0a92b31496	If a pg_regress test fails in CI, save regression.diffs	2021-08-12 18:39:23 +03:00
anastasia	6c3726913f	Introduce check for physical relishes. They represent files and use RelationSizeEntry to track existing and dropped files. They can be both blocky and non-blocky. get_relish_size() and get_rel_exists() functions work with physical relishes, not only with blocky ones.	2021-08-12 14:42:21 +03:00
anastasia	1bfade8adc	Issue #330 . Use put_unlink for twophase relishes. Follow PostgreSQL logic: remove Twophase files when prepared transaction is committed/aborted. Always store Twophase segments as materialized page images (no wal records).	2021-08-12 14:42:21 +03:00
anastasia	4eebe22fbb	cargo fmt	2021-08-12 14:42:21 +03:00
Heikki Linnakangas	20d5e757ca	Remove now-unused get_next_tag function. The only caller was removed by commit `c99a211b01`.	2021-08-11 22:16:38 +03:00
Heikki Linnakangas	70cb399d59	Add convenience function to create a RowDescriptor message for an int8 col. Makes the code to construct a result set a bit more terse and readable.	2021-08-11 20:17:33 +03:00
Dmitry Rodionov	ce5333656f	Introduce authentication v0.1. Current state with authentication. Page server validates JWT token passed as a password during connection phase and later when performing an action such as create branch tenant parameter of an operation is validated to match one submitted in token. To allow access from console there is dedicated scope: PageServerApi, this scope allows access to all tenants. See code for access validation in: PageServerHandler::check_permission. Because we are in progress of refactoring of communication layer involving wal proposer protocol, and safekeeper<->pageserver. Safekeeper now doesn’t check token passed from compute, and uses “hardcoded” token passed via environment variable to communicate with pageserver. Compute postgres now takes token from environment variable and passes it as a password field in pageserver connection. It is not passed through settings because then user will be able to retrieve it using pg_settings or SHOW .. I’ve added basic test in test_auth.py. Probably after we add authentication to remaining network paths we should enable it by default and switch all existing tests to use it.	2021-08-11 20:05:54 +03:00
Arseny Sher	5f0fd093d7	Revert "Walkeeper safe info (#408 )" Temporary revert commit `0ee2e16b17` as it leads to safekeeper state deserialization failure. Let's sort that out and get it back.	2021-08-11 16:26:35 +03:00
Konstantin Knizhnik	0ee2e16b17	Walkeeper safe info (#408 ) * Align prev record CRC on 8-bytes boundary * Upadate safekeeper in-memory status on receiving message from WAL proposer	2021-08-11 09:14:05 +03:00
Konstantin Knizhnik	b607f0fd8e	Align prev record CRC on 8-bytes boundary (#407 )	2021-08-11 08:56:37 +03:00
anastasia	c99a211b01	Fix CLOG truncate handling in case of wraparound.	2021-08-11 05:49:24 +03:00
anastasia	949ac54401	Add test of clog (pg_xact) truncation	2021-08-11 05:49:24 +03:00
anastasia	e406811375	Fixes for handling SLRU relishes: replace get_tx_status() with self.get_tx_is_in_progress() to handle xacts in truncated SLRU segments correctly	2021-08-11 05:49:24 +03:00
anastasia	590ace104a	Fixes for handling SLRU relishes: - don't return ZERO_PAGE from get_page_at_lsn_nowait() for truncated SLRU segments;	2021-08-11 05:49:24 +03:00
anastasia	e475f82ff1	Rename get_rel_size() to get_relish_size(). Don't bail if relish is not found, just return None and let the caller to decide how to handle this	2021-08-11 05:49:24 +03:00
anastasia	a368642790	cargo fmt	2021-08-10 14:26:52 +03:00
anastasia	8c7983797b	Remove unused SLRUTruncate ObjectValue	2021-08-10 14:26:32 +03:00
anastasia	5dd9a66f9e	Move postgres backend messages to trace level	2021-08-10 14:26:28 +03:00
anastasia	cc877f1980	Add unit test for find_end_of_wal(). Based on previous attempt to add same test by @lubennikovaav Now WAL files are generated by initdb command.	2021-08-10 12:30:21 +03:00
anastasia	a5d57ca10b	list_nonrels() returns elements in arbitrary order. Remove incorrect comments that say otherwise.	2021-08-06 15:23:46 +03:00
Konstantin Knizhnik	3ca3394170	[refer #395 ] Check WAL record CRC in waldecoder (#396 )	2021-08-05 16:57:57 +03:00
Heikki Linnakangas	e59e0ae2dc	Clarify the terms "WAL service", "safekeeper", "proposer"	2021-08-05 10:27:56 +03:00
Stas Kelvich	ec07acfb12	fix typo in run_initdb()	2021-08-04 23:57:17 +03:00
Stas Kelvich	fa04096733	cargo fmt pass	2021-08-04 23:51:02 +03:00
Dmitry Ivanov	754892402c	Enable full feature set for hyper in zenith_utils Server functionality requires not only the "server" feature flag, but also either "http1" or "http2" (or both). To make things simpler (and prevent analogous problems), enable all features.	2021-08-04 21:41:17 +03:00
Stas Kelvich	02b9be488b	Disable GC test. Current GC test is flaky and overly strict. Since we are migrating to the layered repo format with different GC implementation let's just silence this test for now.	2021-08-04 18:33:33 +03:00
Arseny Sher	cc3ac2b74c	Allow safekeeper to stream till real end of wal. Otherwise it prematurely terminates, e.g. in test_compute_restart. ref #388	2021-08-04 18:03:43 +03:00
Arseny Sher	1dc2ae6968	Point vendor/postgres to main.	2021-08-04 14:21:01 +03:00
Stas Kelvich	04ae63a5c4	use proper postgres version	2021-08-04 14:15:07 +03:00
Arseny Sher	b77fade7b8	Look up wal directory properly in all find_end_of_wal callers. ref #388	2021-08-04 14:15:07 +03:00
Stas Kelvich	56565c0f58	look up WAL in right directory	2021-08-04 14:15:07 +03:00
Dmitry Ivanov	ed634ec320	Extract message processing function from PostgresBackend's event loop This patch has been extracted from #348, where it became unnecessary after we had decided that we didn't want to measure anything inside PostgresBackend. IMO the change is good enough to make its way into the codebase, even though it brings nothing "new" to the code.	2021-08-04 10:49:02 +03:00
Alexey Kondratov	bcaa59c0b9	Test compute restart with AND without safekeepers	2021-08-04 00:05:19 +03:00
Dmitry Ivanov	cb1b4a12a6	Add some prometheus metrics to pageserver The metrics are served by an http endpoint, which is meant to be spawned in a new thread. In the future the endpoint will provide more APIs, but for the time being, we won't bother with proper routing.	2021-08-03 21:42:24 +03:00
Heikki Linnakangas	9ff122835f	Refactor ObjectTags, intruducing a new concept called "relish" This clarifies - I hope - the abstractions between Repository and ObjectRepository. The ObjectTag struct was a mix of objects that could be accessed directly through the public Timeline interface, and also objects that were created and used internally by the ObjectRepository implementation and not supposed to be accessed directly by the callers. With the RelishTag separaate from ObjectTag, the distinction is more clear: RelishTag is used in the public interface, and ObjectTag is used internally between object_repository.rs and object_store.rs, and it contains the internal metadata object types. One awkward thing with the ObjectTag struct was that the Repository implementation had to distinguish between ObjectTags for relations, and track the size of the relation, while others were used to store "blobs". With the RelishTags, some relishes are considered "non-blocky", and the Repository implementation is expected to track their sizes, while others are stored as blobs. I'm not 100% happy with how RelishTag captures that either: it just knows that some relish kinds are blocky and some non-blocky, and there's an is_block() function to check that. But this does enable size-tracking for SLRUs, allowing us to treat them more like relations. This changes the way SLRUs are stored in the repository. Each SLRU segment, e.g. "pg_clog/0000", "pg_clog/0001", are now handled as a separate relish. This removes the need for the SLRU-specific put_slru_truncate() function in the Timeline trait. SLRU truncation is now handled by caling put_unlink() on the segment. This is more in line with how PostgreSQL stores SLRUs and handles their trunction. The SLRUs are "blocky", so they are accessed one 8k page at a time, and repository tracks their size. I considered an alternative design where we would treat each SLRU segment as non-blocky, and just store the whole file as one blob. Each SLRU segment is up to 256 kB in size, which isn't that large, so that might've worked fine, too. One reason I didn't do that is that it seems better to have the WAL redo routines be as close as possible to the PostgreSQL routines. It doesn't matter much in the repository, though; we have to track the size for relations anyway, so there's not much difference in whether we also do it for SLRUs. While working on this, I noticed that the CLOG and MultiXact redo code did not handle wraparound correctly. We need to fix that, but for now, I just commented them out with a FIXME comment.	2021-08-03 14:01:05 +03:00
Heikki Linnakangas	f0030ae003	Handle SLRU ZERO records directly by storing an all-zeros page image. It's simpler than storing the original WAL record.	2021-08-03 13:59:51 +03:00
Heikki Linnakangas	acc0f41985	Don't try to launch duplicate WAL redo thread if tenant already exists. The codepath for tenant_create command first launched the WAL redo thread, and then called branches::create_repo() which checked if the tenant's directory already exists. That's problematic, because launching the WAL redo thread will run initdb if the directory doesn't already exist. Race condition: If the tenant already exists, it will have a WAL redo thread already running, and the old and new WAL redo thread might try to run initdb at the same time, causing all kinds of weird failures. The test_pageserver_api test was failing 100% repeatably on my laptop because of this. I'm not sure why this doesn't occur on the CI: Jul 31 18:05:48.877 INFO running initdb in "./tenants/5227e4eb90894775ac6b8a8c76f24b2e/wal-redo-datadir", location: pageserver::walredo, pageserver/src/walredo.rs:483 thread 'WAL redo thread' panicked at 'initdb failed: The files belonging to this database system will be owned by user "heikki". This user must also own the server process. The database cluster will be initialized with locale "C". The default database encoding has accordingly been set to "SQL_ASCII". The default text search configuration will be set to "english". Data page checksums are disabled. creating directory ./tenants/0305b1326f3ea33add0929d516da7cb6/wal-redo-datadir ... ok creating subdirectories ... ok selecting dynamic shared memory implementation ... posix selecting default max_connections ... 100 selecting default shared_buffers ... 128MB selecting default time zone ... Europe/Helsinki creating configuration files ... ok running bootstrap script ... stderr: 2021-07-31 15:05:48.875 GMT [282569] LOG: could not open configuration file "/home/heikki/git-sandbox/zenith/test_output/test_tenant_list/repo/./tenants/0305b1326f3ea33add0929d516da7cb6/wal-redo-datadir/postgresql.conf": No such file or directory 2021-07-31 15:05:48.875 GMT [282569] FATAL: configuration file "/home/heikki/git-sandbox/zenith/test_output/test_tenant_list/repo/./tenants/0305b1326f3ea33add0929d516da7cb6/wal-redo-datadir/postgresql.conf" contains errors child process exited with exit code 1 initdb: removing data directory "./tenants/0305b1326f3ea33add0929d516da7cb6/wal-redo-datadir"	2021-07-31 18:13:21 +03:00
Alexey Kondratov	bd7d811921	Add libseccomp-dev as a dep to Dockerfile	2021-07-25 17:46:47 +03:00
anastasia	14b6796915	Send pgdata subdirs with basebackup. Fix for `1e6267a`.	2021-07-25 17:46:47 +03:00
Max Sharnoff	3f4815efa2	Correct `LeSer` doc: "Big Endian" -> "Little Endian" (#362 )	2021-07-23 12:38:37 -07:00

1 2 3 4 5 ...

670 Commits