rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-06 04:52:55 +00:00

Author	SHA1	Message	Date
Arseny Sher	5f0fd093d7	Revert "Walkeeper safe info (#408)". Temporarily revert commit `0ee2e16b17` as it leads to safekeeper state deserialization failure. Let's sort that out and get it back.	2021-08-11 16:26:35 +03:00
Konstantin Knizhnik	0ee2e16b17	Walkeeper safe info (#408) * Align prev record CRC on 8-byte boundary * Update safekeeper in-memory status on receiving message from WAL proposer	2021-08-11 09:14:05 +03:00
Konstantin Knizhnik	b607f0fd8e	Align prev record CRC on 8-byte boundary (#407)	2021-08-11 08:56:37 +03:00
anastasia	c99a211b01	Fix CLOG truncate handling in case of wraparound.	2021-08-11 05:49:24 +03:00
anastasia	949ac54401	Add test of clog (pg_xact) truncation	2021-08-11 05:49:24 +03:00
anastasia	e406811375	Fixes for handling SLRU relishes: replace get_tx_status() with self.get_tx_is_in_progress() to handle xacts in truncated SLRU segments correctly	2021-08-11 05:49:24 +03:00
anastasia	590ace104a	Fixes for handling SLRU relishes: don't return ZERO_PAGE from get_page_at_lsn_nowait() for truncated SLRU segments.	2021-08-11 05:49:24 +03:00
anastasia	e475f82ff1	Rename get_rel_size() to get_relish_size(). Don't bail if relish is not found, just return None and let the caller decide how to handle this.	2021-08-11 05:49:24 +03:00
anastasia	a368642790	cargo fmt	2021-08-10 14:26:52 +03:00
anastasia	8c7983797b	Remove unused SLRUTruncate ObjectValue	2021-08-10 14:26:32 +03:00
anastasia	5dd9a66f9e	Move postgres backend messages to trace level	2021-08-10 14:26:28 +03:00
anastasia	cc877f1980	Add unit test for find_end_of_wal(). Based on a previous attempt to add the same test by @lubennikovaav. Now WAL files are generated by the initdb command.	2021-08-10 12:30:21 +03:00
anastasia	a5d57ca10b	list_nonrels() returns elements in arbitrary order. Remove incorrect comments that say otherwise.	2021-08-06 15:23:46 +03:00
Konstantin Knizhnik	3ca3394170	[refer #395] Check WAL record CRC in waldecoder (#396)	2021-08-05 16:57:57 +03:00
Heikki Linnakangas	e59e0ae2dc	Clarify the terms "WAL service", "safekeeper", "proposer"	2021-08-05 10:27:56 +03:00
Stas Kelvich	ec07acfb12	fix typo in run_initdb()	2021-08-04 23:57:17 +03:00
Stas Kelvich	fa04096733	cargo fmt pass	2021-08-04 23:51:02 +03:00
Dmitry Ivanov	754892402c	Enable full feature set for hyper in zenith_utils. Server functionality requires not only the "server" feature flag, but also either "http1" or "http2" (or both). To make things simpler (and prevent analogous problems), enable all features.	2021-08-04 21:41:17 +03:00
Stas Kelvich	02b9be488b	Disable GC test. Current GC test is flaky and overly strict. Since we are migrating to the layered repo format with different GC implementation let's just silence this test for now.	2021-08-04 18:33:33 +03:00
Arseny Sher	cc3ac2b74c	Allow safekeeper to stream till real end of wal. Otherwise it prematurely terminates, e.g. in test_compute_restart. ref #388	2021-08-04 18:03:43 +03:00
Arseny Sher	1dc2ae6968	Point vendor/postgres to main.	2021-08-04 14:21:01 +03:00
Stas Kelvich	04ae63a5c4	use proper postgres version	2021-08-04 14:15:07 +03:00
Arseny Sher	b77fade7b8	Look up wal directory properly in all find_end_of_wal callers. ref #388	2021-08-04 14:15:07 +03:00
Stas Kelvich	56565c0f58	look up WAL in right directory	2021-08-04 14:15:07 +03:00
Dmitry Ivanov	ed634ec320	Extract message processing function from PostgresBackend's event loop. This patch has been extracted from #348, where it became unnecessary after we had decided that we didn't want to measure anything inside PostgresBackend. IMO the change is good enough to make its way into the codebase, even though it brings nothing "new" to the code.	2021-08-04 10:49:02 +03:00
Alexey Kondratov	bcaa59c0b9	Test compute restart with AND without safekeepers	2021-08-04 00:05:19 +03:00
Dmitry Ivanov	cb1b4a12a6	Add some prometheus metrics to pageserver. The metrics are served by an http endpoint, which is meant to be spawned in a new thread. In the future the endpoint will provide more APIs, but for the time being, we won't bother with proper routing.	2021-08-03 21:42:24 +03:00
Heikki Linnakangas	9ff122835f	Refactor ObjectTags, introducing a new concept called "relish". This clarifies - I hope - the abstractions between Repository and ObjectRepository. The ObjectTag struct was a mix of objects that could be accessed directly through the public Timeline interface, and also objects that were created and used internally by the ObjectRepository implementation and not supposed to be accessed directly by the callers. With the RelishTag separate from ObjectTag, the distinction is clearer: RelishTag is used in the public interface, and ObjectTag is used internally between object_repository.rs and object_store.rs, and it contains the internal metadata object types. One awkward thing with the ObjectTag struct was that the Repository implementation had to distinguish between ObjectTags for relations, and track the size of the relation, while others were used to store "blobs". With the RelishTags, some relishes are considered "blocky", and the Repository implementation is expected to track their sizes, while others are stored as blobs. I'm not 100% happy with how RelishTag captures that either: it just knows that some relish kinds are blocky and some non-blocky, and there's an is_block() function to check that. But this does enable size-tracking for SLRUs, allowing us to treat them more like relations. This changes the way SLRUs are stored in the repository. Each SLRU segment, e.g. "pg_clog/0000", "pg_clog/0001", is now handled as a separate relish. This removes the need for the SLRU-specific put_slru_truncate() function in the Timeline trait. SLRU truncation is now handled by calling put_unlink() on the segment. This is more in line with how PostgreSQL stores SLRUs and handles their truncation. The SLRUs are "blocky", so they are accessed one 8k page at a time, and the repository tracks their size. I considered an alternative design where we would treat each SLRU segment as non-blocky, and just store the whole file as one blob. Each SLRU segment is up to 256 kB in size, which isn't that large, so that might've worked fine, too. One reason I didn't do that is that it seems better to have the WAL redo routines be as close as possible to the PostgreSQL routines. It doesn't matter much in the repository, though; we have to track the size for relations anyway, so there's not much difference in whether we also do it for SLRUs. While working on this, I noticed that the CLOG and MultiXact redo code did not handle wraparound correctly. We need to fix that, but for now, I just commented them out with a FIXME comment.	2021-08-03 14:01:05 +03:00
Heikki Linnakangas	f0030ae003	Handle SLRU ZERO records directly by storing an all-zeros page image. It's simpler than storing the original WAL record.	2021-08-03 13:59:51 +03:00
Heikki Linnakangas	acc0f41985	Don't try to launch duplicate WAL redo thread if tenant already exists. The codepath for tenant_create command first launched the WAL redo thread, and then called branches::create_repo() which checked if the tenant's directory already exists. That's problematic, because launching the WAL redo thread will run initdb if the directory doesn't already exist. Race condition: If the tenant already exists, it will have a WAL redo thread already running, and the old and new WAL redo thread might try to run initdb at the same time, causing all kinds of weird failures. The test_pageserver_api test was failing 100% repeatably on my laptop because of this. I'm not sure why this doesn't occur on the CI: Jul 31 18:05:48.877 INFO running initdb in "./tenants/5227e4eb90894775ac6b8a8c76f24b2e/wal-redo-datadir", location: pageserver::walredo, pageserver/src/walredo.rs:483 thread 'WAL redo thread' panicked at 'initdb failed: The files belonging to this database system will be owned by user "heikki". This user must also own the server process. The database cluster will be initialized with locale "C". The default database encoding has accordingly been set to "SQL_ASCII". The default text search configuration will be set to "english". Data page checksums are disabled. creating directory ./tenants/0305b1326f3ea33add0929d516da7cb6/wal-redo-datadir ... ok creating subdirectories ... ok selecting dynamic shared memory implementation ... posix selecting default max_connections ... 100 selecting default shared_buffers ... 128MB selecting default time zone ... Europe/Helsinki creating configuration files ... ok running bootstrap script ... 
stderr: 2021-07-31 15:05:48.875 GMT [282569] LOG: could not open configuration file "/home/heikki/git-sandbox/zenith/test_output/test_tenant_list/repo/./tenants/0305b1326f3ea33add0929d516da7cb6/wal-redo-datadir/postgresql.conf": No such file or directory 2021-07-31 15:05:48.875 GMT [282569] FATAL: configuration file "/home/heikki/git-sandbox/zenith/test_output/test_tenant_list/repo/./tenants/0305b1326f3ea33add0929d516da7cb6/wal-redo-datadir/postgresql.conf" contains errors child process exited with exit code 1 initdb: removing data directory "./tenants/0305b1326f3ea33add0929d516da7cb6/wal-redo-datadir"	2021-07-31 18:13:21 +03:00
Alexey Kondratov	bd7d811921	Add libseccomp-dev as a dep to Dockerfile	2021-07-25 17:46:47 +03:00
anastasia	14b6796915	Send pgdata subdirs with basebackup. Fix for `1e6267a`.	2021-07-25 17:46:47 +03:00
Max Sharnoff	3f4815efa2	Correct `LeSer` doc: "Big Endian" -> "Little Endian" (#362)	2021-07-23 12:38:37 -07:00
anastasia	ec03848d2f	Fix pageserver.log destination for zenith init. The problem was caused by merge conflict in `767590b`	2021-07-23 16:22:01 +03:00
anastasia	1e6267a35f	Get rid of snapshot directory + related code cleanup and refactoring. - Add new subdir postgres_ffi/samples/ for config file samples. - Don't copy wal to the new branch on zenith init or zenith branch. - Import_timeline_wal on zenith init.	2021-07-23 13:21:45 +03:00
Heikki Linnakangas	47824c5fca	Remove page server interactive mode. It was pretty cool, but no one used it, and it had gotten badly out of date. The main interesting thing with it was to see some basic metrics on the fly, while the page server is running, but the metrics collection had been broken for a long time, too. Best to just remove it.	2021-07-23 12:21:21 +03:00
Dmitry Rodionov	767590bbd5	support tenants. This patch adds support for tenants. This touches mostly the pageserver. The directory layout on disk is changed to contain a new layer of indirection. Now the path to a particular repository has the following structure: <pageserver workdir>/tenants/<tenant id>. The tenant id has the same format as the timeline id. The tenant id is included in pageserver commands when needed. New commands are also available in pageserver: tenant_list, tenant_create. This is also reflected in the CLI. During init, a default tenant is created and its id is saved in the CLI config, so subsequent commands can use it without extra options. The tenant id is also included in the compute postgres configuration, so it can be passed via ServerInfo to the safekeeper and in the connection string to the pageserver. For more info see docs/multitenancy.md.	2021-07-22 20:54:20 +03:00
Stas Kelvich	d210ba5fdb	Update README.md	2021-07-22 20:33:34 +03:00
Dmitry Ivanov	8b656bad5f	Add a missing [cfg(test)]. We don't always need to compile tests.	2021-07-22 16:46:27 +03:00
Dmitry Ivanov	97329d4906	Add a test for EOF in walkeeper's background thread. It would be nice to have a proper Timeline mock api, but this time we'll get by with what we have.	2021-07-22 12:12:55 +03:00
Dmitry Ivanov	6a3b9b1d46	Fix accidental busyloop in walkeeper's background thread. It used to be the case that walkeeper's background thread failed to recognize the end of stream (EOF) signaled by the `Ok(None)` result of `FeMessage::read`.	2021-07-22 12:12:55 +03:00
anastasia	c913404739	Redirect log to pageserver.log during zenith init. Add new module logger.rs that contains shared code to init logging	2021-07-21 18:56:34 +03:00
anastasia	8e42af9b1d	Remove unused 'identify_system' pageserver query	2021-07-21 18:55:41 +03:00
Arseny Sher	fe17188464	Alternative way to truncate behind-the-vcl part of log. Which is important to do before bumping epoch.	2021-07-21 17:27:05 +03:00
Arseny Sher	51b50f5cf5	Fix truncating the wal after VCL.	2021-07-21 17:27:05 +03:00
Arseny Sher	9e3fe2b4d4	Truncate not matching part of log. ref #296	2021-07-21 17:27:05 +03:00
Arseny Sher	eb1618f2ed	TLA+ specification of proposer-acceptor consensus protocol. And .cfg file for running TLC. ref #293	2021-07-21 17:27:05 +03:00
Stas Kelvich	791312824d	set superuser name in python tests too	2021-07-21 17:22:22 +03:00
Stas Kelvich	a17b2a4364	reflect postgres superuser changes in pageserver->compute connstring	2021-07-21 17:22:22 +03:00
sharnoff	c4b2bf7ebd	Use 'zenith_admin' as superuser name in `initdb`	2021-07-21 17:22:22 +03:00
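
Commit 6a3b9b1d46 in the list above describes a busy loop caused by a reader that did not treat `Ok(None)` as end of stream. The following is a minimal sketch of that pattern, not code from the repository: `read_message` and `process_stream` are hypothetical names, and `read_message` only stands in for `FeMessage::read`, whose real signature is not shown in this log.

```rust
use std::io;

/// Hypothetical stand-in for a message reader (an assumption, not the real API):
/// `Ok(Some(_))` carries a message, `Ok(None)` signals end of stream (EOF).
fn read_message(input: &mut impl Iterator<Item = String>) -> io::Result<Option<String>> {
    Ok(input.next())
}

/// Drain the stream and return how many messages were handled.
fn process_stream(input: &mut impl Iterator<Item = String>) -> io::Result<usize> {
    let mut handled = 0;
    loop {
        match read_message(input)? {
            // A real handler would act on the message; here we only count it.
            Some(_msg) => handled += 1,
            // EOF: leave the loop. A loop that ignored `None` and simply
            // called the reader again would spin forever at this point.
            None => break,
        }
    }
    Ok(handled)
}

fn main() -> io::Result<()> {
    let mut messages = vec!["a".to_string(), "b".to_string()].into_iter();
    let handled = process_stream(&mut messages)?;
    println!("handled {} messages", handled);
    Ok(())
}
```

Matching on the `Option` explicitly makes the EOF case impossible to overlook, which is the point of the fix described in the commit message.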

703 Commits