rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-05 20:42:54 +00:00

Author	SHA1	Message	Date
Patrick Insinger	91ff09151d	Remove disk IO from `InMemoryLayer::freeze` Move the creation of Image and Delta layers from `InMemoryLayer::freeze()` to `InMemoryLayer::write_to_disk`.	2021-09-15 16:04:49 -07:00
Patrick Insinger	fea5954b18	Change filling gap println! to trace!	2021-09-15 14:22:04 -07:00
Max Sharnoff	b11b0bb088	bin_ser: reject trailing bytes by default (#587) Changes `LeSer`/`BeSer::des`. Also adds a new `des_prefix` function to keep a way to allow trailing bytes.	2021-09-15 11:48:19 -07:00
Dmitry Rodionov	0ede933719	temporarily disable parallel test runs as it seems to misbehave when there are several concurrent CI runs	2021-09-15 18:59:59 +03:00
Kirill Bulatov	3ab60ce76f	Unify tokio deps and bump cargo resolver version	2021-09-15 16:00:08 +03:00
Dmitry Rodionov	01ef2baef0	show more context for zenith cli run errors	2021-09-15 14:02:15 +03:00
Dmitry Rodionov	6a2e4bfdd9	use parallel test execution in ci	2021-09-15 14:02:15 +03:00
Dmitry Rodionov	9563336d9a	Bring back check for interfering processes, add more comments and descriptive errors	2021-09-15 14:02:15 +03:00
Dmitry Rodionov	4ebe643d0c	Support parallel test running for python tests Support is done via pytest-xdist plugin. To use the feature add -n<concurrency> to pytest invocation e.g. pytest -n8 to run 8 tests in parallel. Changes in code are mostly about ports assigning. Previously port for pageserver was hardcoded without the ability to override through zenith cli and ports for started compute nodes were calculated twice, in zenith cli and in test code. Now zenith cli supports port arguments for pageserver and compute nodes to be passed explicitly. Tests are modified in such a way that each worker gets a non overlapping port range which can be configured and now contains 100 ports. These ports are distributed to test services (pageserver, wal acceptors, compute nodes) so they can work independently.	2021-09-15 14:02:15 +03:00
Dmitry Rodionov	dc897fb864	remove pageserver remotes support since we do not have tests for that and feature itself is delayed (#136)	2021-09-15 13:24:35 +03:00
Max Sharnoff	a2498f3e67	Improve walkeeper replication error messages & context (#585)	2021-09-14 11:59:14 -07:00
Patrick Insinger	d150f3ce8c	Detect writes on frozen InMemoryLayers Data written to frozen layers is lost. It will not appear in on-disk structures or in successor InMemoryLayers. Here we detect this race, and fail. I think this race is rare, but this should make it easier to track down when it happens.	2021-09-14 11:44:48 -07:00
Patrick Insinger	cff4572774	Avoid race in `get_layer_for_write` Implement the changes suggested in a comment, create `get_layer_for_read_locked` so that `get_layer_for_write` doesn't have to drop the LayerMap lock when searching for the predecessor.	2021-09-14 11:24:24 -07:00
Dmitry Rodionov	84008a2560	factor out common logging initialisation routine This contains a lowest common denominator of pageserver and safekeeper log initialisation routines. It uses daemonize flag to decide where to stream log messages. In case daemonize is true log messages are forwarded to file. Otherwise streaming to stdout is used. Usage of stdout for log output is the default in docker side of things, so make it easier to browse our logs via builtin docker commands.	2021-09-14 18:09:14 +03:00
Dmitry Ivanov	6b7f3bc78c	Add inter-repo CI job to CircleCI configuration This job will be responsible for triggering remote CI pipeline in zenithdb/console repository. That way, we'll always know when a PR to zenithdb/zenith breaks the cloud console app.	2021-09-14 16:56:04 +03:00
Arseny Sher	a68c23448a	Skip the bootstrap hole in safekeeper's find_end_of_wal. Otherwise restart of safekeeper before the first segment is filled makes it report 0 as flushed LSN. To this end, tweak find_end_of_wal_segment to allow starting from given LSN, not only from the start of the segment. While here, make it less panicky.	2021-09-13 22:46:04 +03:00
Dmitry Rodionov	9043f45489	removes protobuf dependency (brought by prometheus default features)	2021-09-13 15:57:41 +03:00
Heikki Linnakangas	6afd99c73f	Fix misc typos in comments.	2021-09-13 12:31:04 +03:00
nkotlyarov	18b5165b22	Update README.md typo	2021-09-12 15:35:18 +03:00
Arseny Sher	6dc66eefb6	bump vendor/postgres	2021-09-11 06:10:10 +03:00
Arseny Sher	0aec60938a	Make flush_lsn reported by safekeepers point to record boundary. Otherwise we produce corrupted record holes in WAL during compute node restart in case there was an unfinished record from the old compute, as these reports advance commit_lsn -- reliably persisted part of WAL. ref #549. Mostly by @knizhnik. I adjusted to make sure proposer always starts streaming since record beginning so we don't need special quirks for decoding in safekeeper.	2021-09-11 06:10:10 +03:00
Patrick Insinger	7c62a57e54	initialize tenant_mgr after daemonizing Ran into problems launching the WAL redo process on OS X after 4b73ad. Launching the `initdb` process was met with "bad file descriptor" errors. Using dtrace, I found shortly after calling `posix_spawn` for `initdb`, `kevent` was returning this error. I haven't dug super deep to see if the daemonization itself is the problem, but this commit fixes it for me. My hunch is that some file descriptors used when the Tokio runtime is initialized become invalid in the daemon process.	2021-09-10 13:00:39 +03:00
Heikki Linnakangas	59e7ca585d	Minor fixes	2021-09-10 12:43:11 +03:00
anastasia	3dea06b825	Update layered_repository/README.md	2021-09-10 12:43:11 +03:00
Heikki Linnakangas	ab33614ab1	Forbid adding WAL to the repository after advancing last record LSN. When you advance last record LSN, all changes up to that LSN should be imported into repository. We have been a bit sloppy about that when it comes to the checkpoint information that we also store in the repository. In WAL receiver, for example, we would receive a WAL record, advance last record LSN, and only then update the checkpoint relish at the same LSN. Reorder that so that you advance the last record LSN only after updating the checkpoint relish. It hasn't apparently caused any problems so far, but let's be tidy. Tighten the check for that in get_layer_for_write(), so that it checks for 'lsn > last_record_lsn' rather than 'lsn >= last_record_lsn'.	2021-09-10 10:59:09 +03:00
Heikki Linnakangas	03dff207db	Remove start_lsn arg from `create_empty_repository`. Always use lsn(0) as the initial last_record_lsn. It is updated soon after creating the timeline anyway, after loading the bootstrap data, so it doesn't stay long in that state. I was a bit worried about using a special value like 0, but it's actually nice that you can distinguish it from any real LSN value. The unit tests have been using Lsn(0) as the initial start LSN all along.	2021-09-10 10:24:35 +03:00
Heikki Linnakangas	6a8785379a	Add explicit 'wait_lsn' calls before get_page_at_lsn and such calls. Move the responsibility to wait for the WAL to arrive to the callers, and remove the wait_lsn() calls from the Timeline::get_page_at_lsn() and friends. We were not totally consistent before, list_rels() was missing the wait_lsn() call for example. Closes https://github.com/zenithdb/zenith/issues/521	2021-09-10 09:56:11 +03:00
Heikki Linnakangas	507177b42e	Refactor code to handle incoming page requests.	2021-09-09 18:48:46 +03:00
anastasia	b79754d06e	list_rels() and list_nonrels() refactoring: move shared code to list_relishes() function.	2021-09-09 16:05:32 +03:00
anastasia	674807eee1	Add test for dropped relations. Fix list_rels() and list_nonrels() functions	2021-09-09 16:05:32 +03:00
Konstantin Knizhnik	30c0343727	Use layer start_lsn instead of *entry_lsn as LSN to continue WAL record traversal at next layer (#573) refer #532	2021-09-09 15:15:50 +03:00
Dmitry Rodionov	4fae115dc2	propagate pageserver http error messages to zenith cli	2021-09-08 17:32:59 +03:00
anastasia	3d17255400	Add comment to 'pg stop' changes	2021-09-08 14:12:00 +03:00
anastasia	5488ce8834	Change CLI command 'pg stop' to avoid races in tests. Stop postgres immediately only when destroy option is used. Otherwise, use default shutdown mode (fast).	2021-09-08 14:12:00 +03:00
Max Sharnoff	d7313bb85c	Switch tokio-postgres dependency to git repo The other crates in this repository use zenithdb/rust-postgres as a dependency for the related items, instead of the crates.io versions. Switching to using that for the proxy as well removes an additional three dependencies when we compile. (319 -> 316)	2021-09-07 19:49:03 -07:00
Dmitry Rodionov	4b73ada26e	fix connection error appeared on zenith start by binding sockets before daemonization also use less annoying error reporting by not printing full error messages for connect errors in first several connection retries closes #507	2021-09-07 20:50:27 +03:00
Dmitry Rodionov	b4ecae33e4	add incremental tracking of logical timeline size In order to exclude problems with synchronizing disk and memory logical size is not stored in metadata on disk. It is calculated on timeline "start" by scanning the contents of layered repo and then size is maintained via an atomic variable. This patch also adds new endpoint to pageserver http api: branch detail. It allows retrieval of a particular branch info by its name. Size info is also added to the response of the endpoint and used in tests.	2021-09-07 18:25:15 +03:00
Patrick Insinger	1b9e49eb60	pageserver - update `unload()` comment Update comment to reflect changes made in 5ac4a2 and 98f496	2021-09-07 08:19:42 -07:00
Heikki Linnakangas	7a03e32dd5	Use Rust shorthand range syntax	2021-09-07 18:10:07 +03:00
Heikki Linnakangas	018a606987	Refactor code in LayerMap, for readability - Reorder the structs and functions - Delegate many of the operations in LayerMap to SegEntry. For example, `LayerMap::insert_open` now looks up the right SegEntry struct, and then calls `SegEntry::insert_open` on it. - Use HashMap::entry() function with or_default() to implement the lookups with less code	2021-09-07 18:10:07 +03:00
Heikki Linnakangas	26782851a9	Rename OpenSegEntry to OpenLayerEntry That's more appropriate: it's a struct that holds a Layer, not a segment.	2021-09-07 18:10:07 +03:00
Heikki Linnakangas	04ee1d5977	Add test for managing old open segments in binary heap. I thought this test would trigger the bug fixed in the previous commit, but it did not. More tests are nice in any case.	2021-09-07 18:10:07 +03:00
Heikki Linnakangas	6245702c7c	Comment fixes	2021-09-07 18:10:07 +03:00
Heikki Linnakangas	9098f2159d	Fix comparison routines of OpenSegEntry Commit `66929ad6fb` added a 'generation' number to open segments stored in the layer map, to distinguish old layers from layers that were added to the map during checkpoint processing. But it neglected the OpenSegEntry::cmp() function. It seems that the cmp() function is never used by BinaryHeap, so this didn't cause any user-visible bugs (I tried adding a panic() to the cmp() function and it didn't fire). But it's clearly wrong and we need to fix it, anyway.	2021-09-07 18:10:07 +03:00
Kirill Bulatov	292bdaa6a7	Update documentation to note some Postgres specifics	2021-09-07 17:48:41 +03:00
anastasia	6f0c065743	preserve filediff artifacts in CI	2021-09-07 16:58:21 +03:00
anastasia	94c50e3e90	Fix check_restored_datadir_content(). Call 'basebackup' command directly, instead of relying on CLI	2021-09-07 16:58:21 +03:00
Konstantin Knizhnik	f83108002b	Revert "Bump postgres version" This reverts commit `511873aaed`.	2021-09-07 15:06:43 +03:00
Konstantin Knizhnik	511873aaed	Bump postgres version	2021-09-07 15:05:08 +03:00
anastasia	eb3fd7a8da	print diff for mismatching files in check_restored_datadir_content()	2021-09-06 18:21:23 +03:00

821 Commits