rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2025-12-27 08:09:58 +00:00

Author	SHA1	Message	Date
Eric Seppanen	5b79e033bd	wip: adapt layered_repository to snapfile	2021-08-19 20:23:11 -07:00
Eric Seppanen	93b2e49939	remove snapshot id; replace with timeline id The snapshot id doesn't make sense when two snapshots are squashed. Better to use the timeline id anyway. It's still a little strange to squash two timelines together, but it must be possible to flatten history even if there have been many branch events, so this should be a control plane / snapshot management problem.	2021-08-18 12:11:22 -07:00
Eric Seppanen	c3833ef0f4	add snapshot squashing Add logic to squash snapshot files. Add snaptool (a binary for inspecting and manipulating snapshot files). Use bookfile 0.3, which allows concurrent reads.	2021-08-18 12:11:22 -07:00
Eric Seppanen	acfc5c5d21	add another Page constructor, which copies bytes	2021-08-18 12:11:22 -07:00
Eric Seppanen	0a0d12368e	snapfile: add snapshot metadata Add some basic metadata to the snapshot file, including an id number, predecessor snapshot, and lsn.	2021-08-18 12:11:22 -07:00
Eric Seppanen	8d2b517359	snapfile: split apart code into multiple files versioned.rs: for things that get serialized and must be versioned to avoid breaking backwards compatibility. page.rs: for the Page struct.	2021-08-18 12:11:22 -07:00
Eric Seppanen	26bcd72619	add snapfile The snapfile crate implements a snapshot file format. The format relies heavily on the bookfile crate for the structured file format, and the aversion crate for versioned data structures. The overall structure of the file looks like this: - first 4KB: bookfile header - next 8KB * N: raw page data for N pages - page index map (from page identifier to data offset) - bookfile chapter index When a SnapFile is opened for reading, the page index map is read into memory; any page can be read directly from the file from that point.	2021-08-18 12:11:22 -07:00
Dmitry Rodionov	4bce65ff9a	bump rust version in ci to 1.52.1	2021-08-17 20:31:28 +03:00
Heikki Linnakangas	3319befc30	Revert a bunch of commits that I pushed by accident This reverts commits: `e35a5aa550` `a389c2ed7f` `11ebcb531f` `8d2b61f4d1` `882f549236` `ddb7155bbe` Those were follow-up work on top of PR https://github.com/zenithdb/zenith/pull/430, but they were still very much not ready.	2021-08-17 19:20:27 +03:00
Heikki Linnakangas	ddb7155bbe	WIP Store base images in separate ImageLayers	2021-08-17 18:55:04 +03:00
Heikki Linnakangas	882f549236	WIP: store base images separately	2021-08-17 18:54:53 +03:00
Heikki Linnakangas	8d2b61f4d1	Move code to handle snapshot filenames	2021-08-17 18:54:53 +03:00
Heikki Linnakangas	11ebcb531f	Add Gauge for # of layers	2021-08-17 18:54:53 +03:00
Heikki Linnakangas	a389c2ed7f	WIP: Track oldest open layer	2021-08-17 18:54:53 +03:00
Heikki Linnakangas	e35a5aa550	WIP: track mem usage	2021-08-17 18:54:53 +03:00
Heikki Linnakangas	45f641cabb	Handle last "open" layer specially in LayerMap. There can be only one "open" layer for each segment. That's the last one, implemented by InMemoryLayer. That's the only one where new records can be appended to. Much of the code needed to distinguish between the last open layer and other layers anyway, so make the distinction explicit in LayerMap.	2021-08-17 18:54:51 +03:00
Heikki Linnakangas	48f4a7b886	Refactor get_page_at_lsn() logic to layered_repository.rs There was a lot of duplicated code between the get_page_at_lsn() implementations in InMemoryLayer and SnapshotLayer. Move the code for requesting WAL redo from the Layer trait into LayeredTimeline. The get-function in Layer now just returns the WAL records and base image to the caller, and the caller is responsible for performing the WAL redo on them.	2021-08-17 18:54:48 +03:00
Heikki Linnakangas	91f72fabc9	Work with smaller segments. Split each relish into fixed-sized 10 MB segments. Separate layers are created for each segment. This reduces the write amplification if you have a large relation and update only parts of it; the downside is that you have a lot more files. The 10 MB is just a guess, we should do some modeling and testing in the future to figure out the optimal size. Each segment tracks the size of the segment separately. To figure out the total size of a relish, you need to loop through the segments to find the highest segment that's in use. That's a bit inefficient, but will do for now. We might want to add a cache or something later.	2021-08-17 18:54:41 +03:00
anastasia	cbeb67067c	Issue #367. Change CLI so that we always create node from scratch at 'pg start'. This operation preserves the previously existing config. Add new flag '--config-only' to 'pg create'. If this flag is passed, don't perform basebackup, just fill initial postgresql.conf for the node.	2021-08-17 18:12:31 +03:00
anastasia	921ec390bc	cargo fmt	2021-08-16 19:41:07 +03:00
Heikki Linnakangas	f37cb21305	Update Cargo.lock for addition of 'bincode' Commit `5eb1738e8b` added a dependency to the 'bincode' crate. 'cargo build' adds it to Cargo.lock automatically, so let's remember it.	2021-08-16 19:24:26 +03:00
Heikki Linnakangas	7ee8de3725	Add metrics to WAL redo. Track the time spent on replaying WAL records by the special Postgres process, the time spent waiting for access to the Postgres process (since there is only one per tenant), and the number of records replayed.	2021-08-16 15:49:17 +03:00
Heikki Linnakangas	047a05efb2	Minor formatting and comment fixes.	2021-08-16 15:48:59 +03:00
Dmitry Rodionov	0c4ab80eac	try to be more intelligent in WalAcceptor.start, added a bunch of typing sugar to wal acceptor fixtures	2021-08-16 14:27:44 +03:00
Heikki Linnakangas	2450f82de5	Introduce a new "layered" repository implementation. This replaces the RocksDB based implementation with an approach using "snapshot files" on disk, and in-memory btreemaps to hold the recent changes. This makes the repository implementation a configuration option. You can choose 'layered' or 'rocksdb' with "zenith init --repository-format=<format>" The unit tests have been refactored to exercise both implementations. 'layered' is now the default. Push/pull is not implemented. The 'test_history_inmemory' test has been commented out accordingly. It's not clear how we will implement that functionality; probably by copying the snapshot files directly.	2021-08-16 10:06:48 +03:00
Max Sharnoff	5eb1738e8b	Rework walkeeper protocol to use libpq (#366) Most of the work here was done on the postgres side. There's more information in the commit message there. (see: `04cfa326a5`) On the WAL acceptor side, we're now expecting 'START_WAL_PUSH' to initialize the WAL keeper protocol. Everything else is mostly the same, with the only real difference being that protocol messages are now discrete CopyData messages sent over the postgres protocol. For the sake of documentation, the full set of these messages is: <- recv: START_WAL_PUSH query <- recv: server info from postgres (type `ServerInfo`) -> send: walkeeper info (type `SafeKeeperInfo`) <- recv: vote info (type `RequestVote`) if node id mismatch: -> send: self node id (type `NodeId`); exit -> send: confirm vote (with node id) (type `NodeId`) loop: <- recv: info and maybe WAL block (type `SafeKeeperRequest` + bytes) (break loop if done) -> send: confirm receipt (type `SafeKeeperResponse`)	2021-08-13 11:25:16 -07:00
Heikki Linnakangas	6e22a8f709	Refactor WAL redo to not use a separate thread. My main motivation is to make it easier to attribute time spent in WAL redo to the request that needed the WAL redo. With this patch, the WAL redo is performed by the requester thread, so it shows up in stack traces and in 'perf' report as part of the requester's call stack. This is also slightly simpler (fewer lines of code) and should be a bit faster too.	2021-08-13 17:23:36 +03:00
Heikki Linnakangas	f8de71eab0	Update vendor/postgres to fix race condition leading to CRC errors. Fixes https://github.com/zenithdb/zenith/issues/413	2021-08-13 14:02:26 +03:00
Heikki Linnakangas	8517d9696d	Move gc_iteration() function to Repository trait. The upcoming layered storage implementation handles GC as a repository-wide operation because it needs to pay attention to the branch points of all timelines.	2021-08-12 23:46:01 +03:00
Heikki Linnakangas	97f9021c88	Fix JWT token encoding issue in test. On my laptop, the server was receiving the token as a string with extra b'...' escaping, e.g. as "b'eyJ0....0ifQA'" instead of just "eyJ0....0ifQA". That was causing the test to fail. I'm using Python 3.9, while the CI is using Python 3.8. I suspect that's why. My version of pyjwt might be different too. See also https://github.com/jpadilla/pyjwt/issues/391.	2021-08-12 20:46:14 +03:00
Heikki Linnakangas	0a92b31496	If a pg_regress test fails in CI, save regression.diffs	2021-08-12 18:39:23 +03:00
anastasia	6c3726913f	Introduce check for physical relishes. They represent files and use RelationSizeEntry to track existing and dropped files. They can be both blocky and non-blocky. get_relish_size() and get_rel_exists() functions work with physical relishes, not only with blocky ones.	2021-08-12 14:42:21 +03:00
anastasia	1bfade8adc	Issue #330. Use put_unlink for twophase relishes. Follow PostgreSQL logic: remove Twophase files when prepared transaction is committed/aborted. Always store Twophase segments as materialized page images (no wal records).	2021-08-12 14:42:21 +03:00
anastasia	4eebe22fbb	cargo fmt	2021-08-12 14:42:21 +03:00
Heikki Linnakangas	20d5e757ca	Remove now-unused get_next_tag function. The only caller was removed by commit `c99a211b01`.	2021-08-11 22:16:38 +03:00
Heikki Linnakangas	70cb399d59	Add convenience function to create a RowDescriptor message for an int8 col. Makes the code to construct a result set a bit more terse and readable.	2021-08-11 20:17:33 +03:00
Dmitry Rodionov	ce5333656f	Introduce authentication v0.1. Current state of authentication: the page server validates the JWT token passed as a password during the connection phase; later, when performing an action such as create branch, the tenant parameter of the operation is validated to match the one submitted in the token. To allow access from the console there is a dedicated scope, PageServerApi, which allows access to all tenants. See the access validation code in PageServerHandler::check_permission. Because we are in the middle of refactoring the communication layer involving the wal proposer protocol and safekeeper<->pageserver, the safekeeper currently doesn’t check the token passed from compute, and uses a “hardcoded” token passed via an environment variable to communicate with the pageserver. Compute postgres now takes the token from an environment variable and passes it as the password field in the pageserver connection. It is not passed through settings because then a user would be able to retrieve it using pg_settings or SHOW .. I’ve added a basic test in test_auth.py. Probably after we add authentication to the remaining network paths we should enable it by default and switch all existing tests to use it.	2021-08-11 20:05:54 +03:00
Arseny Sher	5f0fd093d7	Revert "Walkeeper safe info (#408)" Temporary revert of commit `0ee2e16b17` as it leads to safekeeper state deserialization failure. Let's sort that out and get it back.	2021-08-11 16:26:35 +03:00
Konstantin Knizhnik	0ee2e16b17	Walkeeper safe info (#408) * Align prev record CRC on 8-bytes boundary * Update safekeeper in-memory status on receiving message from WAL proposer	2021-08-11 09:14:05 +03:00
Konstantin Knizhnik	b607f0fd8e	Align prev record CRC on 8-bytes boundary (#407)	2021-08-11 08:56:37 +03:00
anastasia	c99a211b01	Fix CLOG truncate handling in case of wraparound.	2021-08-11 05:49:24 +03:00
anastasia	949ac54401	Add test of clog (pg_xact) truncation	2021-08-11 05:49:24 +03:00
anastasia	e406811375	Fixes for handling SLRU relishes: replace get_tx_status() with self.get_tx_is_in_progress() to handle xacts in truncated SLRU segments correctly	2021-08-11 05:49:24 +03:00
anastasia	590ace104a	Fixes for handling SLRU relishes: - don't return ZERO_PAGE from get_page_at_lsn_nowait() for truncated SLRU segments;	2021-08-11 05:49:24 +03:00
anastasia	e475f82ff1	Rename get_rel_size() to get_relish_size(). Don't bail if relish is not found, just return None and let the caller decide how to handle this	2021-08-11 05:49:24 +03:00
anastasia	a368642790	cargo fmt	2021-08-10 14:26:52 +03:00
anastasia	8c7983797b	Remove unused SLRUTruncate ObjectValue	2021-08-10 14:26:32 +03:00
anastasia	5dd9a66f9e	Move postgres backend messages to trace level	2021-08-10 14:26:28 +03:00
anastasia	cc877f1980	Add unit test for find_end_of_wal(). Based on previous attempt to add same test by @lubennikovaav. Now WAL files are generated by initdb command.	2021-08-10 12:30:21 +03:00
anastasia	a5d57ca10b	list_nonrels() returns elements in arbitrary order. Remove incorrect comments that say otherwise.	2021-08-06 15:23:46 +03:00
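
Commit 91f72fabc9 above describes splitting each relish into fixed-size 10 MB segments and finding the total relish size by looping to the highest segment in use. The arithmetic can be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes 8 KB blocks (the page size mentioned in commit 26bcd72619) and reads "10 MB" as 10 * 1024 * 1024 bytes. All constant and function names here are hypothetical and are not taken from the repository.

```rust
// Assumed block size: 8 KB, as in the snapfile description (commit 26bcd72619).
const BLOCK_SIZE: u64 = 8 * 1024;
// Assumed segment size: "10 MB" read as 10 MiB.
const SEGMENT_SIZE: u64 = 10 * 1024 * 1024;
// Number of blocks that fit in one segment under those assumptions (1280).
const BLOCKS_PER_SEGMENT: u64 = SEGMENT_SIZE / BLOCK_SIZE;

/// Hypothetical helper: map a block number within a relish to
/// (segment number, block offset inside that segment).
fn locate_block(blknum: u64) -> (u64, u64) {
    (blknum / BLOCKS_PER_SEGMENT, blknum % BLOCKS_PER_SEGMENT)
}

/// Hypothetical helper: given the number of blocks used in each segment
/// (index = segment number), find the highest segment in use and derive
/// the total relish size in blocks. Returns 0 if no segment is in use.
fn relish_size_in_blocks(segment_sizes: &[u64]) -> u64 {
    match segment_sizes.iter().rposition(|&used| used > 0) {
        Some(idx) => idx as u64 * BLOCKS_PER_SEGMENT + segment_sizes[idx],
        None => 0,
    }
}
```

Under these assumptions, block 1280 is the first block of segment 1, and a relish whose highest used segment is segment 2 with 5 blocks has a size of 2 * 1280 + 5 blocks. The linear scan mirrors the "a bit inefficient, but will do for now" approach the commit message mentions.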

690 Commits