rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-08 14:02:55 +00:00

Author	SHA1	Message	Date
Kirill Bulatov	fb05e4cb0b	Show better error messages on pageserver failures	2021-09-29 01:55:41 +03:00
Egor Suvorov	b0a7234759	pageserver: fix stale default listen addrs * In command line help * In dummy_conf	2021-09-28 20:57:51 +03:00
Egor Suvorov	ddf4b15ebc	pageserver: use const_format crate to generate default listen addrs	2021-09-28 20:57:51 +03:00
Egor Suvorov	3065532f15	pageserver: fix mistype in listen-http arg help	2021-09-28 20:57:51 +03:00
Heikki Linnakangas	014be8b230	Use Iterator, to avoid making one copy of page_versions BTreeMap Reduces the CPU time spent in checkpointing, in the write_to_disk() function.	2021-09-27 19:28:02 +03:00
Heikki Linnakangas	08978458be	Refactor write_to_disk, handling dropped segment as a special case. Similar to what commit `7fb7f67b` did to 'freeze', dealing with the dropped segment separately from the rest of the logic makes the code easier to follow. It is also needed by the next commit that replaces the code to build new BTreeMap with an iterator; we cannot pass one of two kinds of closures as argument, it has to always be the same one. Having separate DeltaLayer::create() calls for the case of dropped segment and the other cases works around that.	2021-09-27 19:23:32 +03:00
Heikki Linnakangas	2252d9faa8	Switch to RwLock in InMemoryLayer Allows more parallelism basically for free.	2021-09-27 19:15:40 +03:00
Arthur Petukhovsky	22e15844ae	Fix clippy errors (#673)	2021-09-27 18:59:30 +03:00
Konstantin Knizhnik	ca9af37478	Do not write WAL at pageserver (#645) * Do not write WAL at pageserver * Remove import_timeline_wal function	2021-09-27 14:15:55 +03:00
Heikki Linnakangas	b71e3a40e2	Add more details to the log, when an error happens in GetPage request.	2021-09-24 21:44:22 +03:00
Heikki Linnakangas	41dfc117e7	Buffer the writes to the WAL redo process pipe. Reduces the CPU time spent in the write() syscalls. I noticed that we were spending a lot of CPU time in libc::write, coming from request_redo(), in the 'bulk_insert' test. According to some quick profiling with 'perf', this reduces the CPU time spent in request_redo() from about 30% to 15%. For some reason, it doesn't reduce the overall runtime of the 'bulk_insert' test much, maybe by one second if you squint (from about 37s to 36s), so there must be some other bottleneck, like I/O. But this is surely still a good idea, just based on the reduced CPU cycles.	2021-09-24 21:12:38 +03:00
sharnoff	a72707b8cb	Redo #655 with fix: Allow `LeSer`/`BeSer` impls missing either `Serialize` or `Deserialize` Commit message copied below: * Allow LeSer/BeSer impls missing Serialize/Deserialize Currently, using `LeSer` or `BeSer` requires that the type implements both `Serialize` and `DeserializeOwned`, even if we're only using the trait for one of those functionalities. Moving the bounds to the methods gives the convenience of the traits without requiring unnecessary derives. * Remove unused #[derive(Serialize/Deserialize)] This should hopefully reduce compile times - if only by a little bit. Some of these were already unused (we weren't using LeSer/BeSer for the types), but most have become unused with the change to LeSer/BeSer.	2021-09-24 10:58:01 -07:00
Max Sharnoff	0f770967b4	Revert "Allow `LeSer`/`BeSer` impls missing either `Serialize` or `Deserialize` (#655)" This reverts commit `bd9f4794d9`.	2021-09-24 10:18:36 -07:00
Max Sharnoff	bd9f4794d9	Allow `LeSer`/`BeSer` impls missing either `Serialize` or `Deserialize` (#655) * Allow LeSer/BeSer impls missing Serialize/Deserialize Currently, using `LeSer` or `BeSer` requires that the type implements both `Serialize` and `DeserializeOwned`, even if we're only using the trait for one of those functionalities. Moving the bounds to the methods gives the convenience of the traits without requiring unnecessary derives. * Remove unused #[derive(Serialize/Deserialize)] This should hopefully reduce compile times - if only by a little bit. Some of these were already unused (we weren't using LeSer/BeSer for the types), but most have become unused with the change to LeSer/BeSer.	2021-09-24 10:06:03 -07:00
Heikki Linnakangas	ff5cbe2694	Support overlapping and nested Layers in the layer map. This introduces a new tree data structure for holding intervals, and queries of the form "which intervals contain the given point?". It then uses that to store the Layers in the layer map, instead of the BTreeMap. While we don't currently create overlapping layers in the page server, that situation might arise in the future if we start to create extra layers for performance purposes, or as part of some multi-stage garbage collection operation that creates new layers in some interval and then removes old ones. The situation might also arise if you have multiple page servers running on the same timeline, freezing layers at different points, and both uploading them to S3. So even though overlapping layers might not happen currently, let's avoid getting confused if it does happen for some reason. Fixes https://github.com/zenithdb/zenith/issues/517.	2021-09-24 14:10:52 +03:00
Heikki Linnakangas	2319e0ec8f	Define a layer's start and end bounds more precisely. After this, a layer's start bound is always defined to be inclusive, and end bound exclusive. For example, if you have a layer in the range 100-200, that layer can be used for GetPage@LSN requests at LSN 100, 199, or anything in between. But for LSN 200, you need to look at the next layer (if one exists). This is one part of a fix for https://github.com/zenithdb/zenith/issues/517. After this, the page server shouldn't create layers for the same segment with the same LSN, which avoids the issue. However, the same thing would still happen, if you managed to create layers with same start LSN again. That could happen e.g. if you had two page servers running, or in some weird crash/restart scenario, or due to bugs or features added later. The next commit makes the layer map more robust, so that it tolerates that situation without deleting wrong files.	2021-09-24 14:10:49 +03:00
Patrick Insinger	7db3a9e7d9	walredo - don't use RefCell on stdin/stdout	2021-09-23 08:42:58 -07:00
Patrick Insinger	c81ee3bd5b	Add some comments to the checkpoint process	2021-09-23 13:19:45 +03:00
anastasia	7fb7f67bb4	Fix relish extension after it was dropped or truncated. - Turn dropped layers into non-writeable in get_layer_for_write(). - Handle non-writeable dropped layers in checkpointer. They don't need freezing, so just remove them from list of open_segs and write out to disk. - Remove code that handles dropped layers in freeze() function. It is not used anymore.	2021-09-23 13:19:45 +03:00
anastasia	86164c8b33	Add unit tests for drop_lsn. test_drop_extend and test_truncate_extend illustrate what happens if we dropped a segment and then created it again within the same layer.	2021-09-23 13:19:45 +03:00
anastasia	a4fc6da57b	Fix gc_internal to treat dropped layers. Some dropped layers serve as tombstones for earlier layers and thus cannot be garbage collected. Add new fields to GcResult for layers that are preserved as tombstones	2021-09-23 12:21:47 +03:00
anastasia	c934e724a8	Enable test_list_rels_drop test	2021-09-23 12:21:47 +03:00
anastasia	e554f9514f	gc refactoring - rename 'compact' argument of GC to 'checkpoint_before_gc'. - gc_iteration_internal() refactoring	2021-09-23 12:21:47 +03:00
Max Sharnoff	90ef661673	Fix rustc & clippy warnings for nightly (2021-09-19) (#629) Fix clippy warnings for nightly (2021-09-19)	2021-09-22 11:24:43 -07:00
Dmitry Rodionov	579b5ee944	exclude labels formatting for every operation in LOGICAL_TIMELINE_SIZE gauge metric	2021-09-22 18:03:48 +03:00
Heikki Linnakangas	a91eeb1c65	Buffer the writes when writing a layer to disk. Significantly reduces the CPU time spent on libc::write.	2021-09-21 16:54:29 +03:00
Patrick Insinger	ea4c3639e3	Include layer metadata in layer summary chapters Include all data stored in layer filenames and the tenant+timeline IDs inside a summary chapter. Use this chapter in the `dump_layerfile` utility.	2021-09-20 07:57:51 -07:00
Heikki Linnakangas	745627c8ca	Remove unused FE/BE ControlFile message. It's a remnant of some old tests in Zenith, but isn't used anymore. It doesn't exist in PostgreSQL.	2021-09-17 20:06:04 +03:00
Heikki Linnakangas	540973eac4	Don't get confused on request of latest page version with very old LSN. If the 'latest' flag in the client request is true, the client wants the latest page version regardless of the LSN in the request. The LSN is just a hint in that case, indicating that the page hasn't been modified since that LSN. The LSN can be very old, so it's possible that the page server has already garbage collected away the layer at that LSN. We tried to fetch the old layer and errored out if that happened. To fix, always fetch the data as of last-record-LSN, if 'latest' is set in the client request. We now only use the LSN to wait if the requested LSN hasn't been received and processed yet. Fixes https://github.com/zenithdb/zenith/issues/567	2021-09-17 18:56:05 +03:00
Heikki Linnakangas	ad5f16f724	Improve the protocol between Postgres and page server. - Use different message formats for different kinds of response messages. - Add an Error message, for passing errors from page server to Postgres. Previously, we would respond to 'exists' request with 'false', and to 'nblocks' request with 0, if an error happened. Fix those to return an error message to the client. GetPage requests had a mechanism to return an error, but it was just a flag with no error message. - Add a flag to requests, to indicate that we actually want the latest page version on the timeline, and the LSN is just a hint that we know that there haven't been any modifications since that LSN. The flag isn't used for anything yet, but I'm planning to use it to fix https://github.com/zenithdb/zenith/issues/567	2021-09-17 16:38:14 +03:00
Kirill Bulatov	1d5abf1253	Initial version of the relish storage	2021-09-17 15:30:22 +03:00
Max Sharnoff	3743344e64	Add `get_timeline_for_tenant()` to `tenant_mgr` (#615) Most of the previous usages of get_repository_for_tenant were followed by immediately getting a timeline in that repository, without keeping it around for longer. The new `get_timeline_for_tenant` function implements that same behavior, but in one line.	2021-09-16 10:38:21 -07:00
Kirill Bulatov	7dda9f2894	Fix clippy lints and enable clippy checking in CI	2021-09-16 15:09:16 +03:00
anastasia	8de41f1d70	Change checkpoint_distance type to u64	2021-09-16 12:33:50 +03:00
anastasia	6984d33b4e	Run GC and checkpointer in separate threads. Add checkpoint_period configuration parameter	2021-09-16 12:33:50 +03:00
anastasia	98d4f9cea5	Add checkpoint_distance config parameter. - Change hardcoded OLDEST_INMEM_DISTANCE value to pageserver config option checkpoint_distance. - Get rid of 'force' flag in checkpoint_internal(). Use checkpoint_distance=0 instead.	2021-09-16 12:33:50 +03:00
Patrick Insinger	25b7d424ab	Prevent frozen InMemoryLayer races Instead of panicking when a race happens, retry the operation after getting a new layer.	2021-09-15 20:50:51 -07:00
Patrick Insinger	a5bd306db9	Ensure InMemoryLayer predecessor updated correctly When the new open InMemoryLayer predecessor is updated, ensure it was pointing to the old frozen layer.	2021-09-15 16:04:49 -07:00
Patrick Insinger	0cbee4a416	Don't hold lock on LayerMap while writing to disk	2021-09-15 16:04:49 -07:00
Patrick Insinger	91ff09151d	Remove disk IO from `InMemoryLayer::freeze` Move the creation of Image and Delta layers from `InMemoryLayer::freeze()` to `InMemoryLayer::write_to_disk`.	2021-09-15 16:04:49 -07:00
Patrick Insinger	fea5954b18	Change filling gap println! to trace!	2021-09-15 14:22:04 -07:00
Max Sharnoff	b11b0bb088	bin_ser: reject trailing bytes by default (#587) Changes `LeSer`/`BeSer::des`. Also adds a new `des_prefix` function to keep a way to allow trailing bytes.	2021-09-15 11:48:19 -07:00
Kirill Bulatov	3ab60ce76f	Unify tokio deps and bump cargo resolver version	2021-09-15 16:00:08 +03:00
Dmitry Rodionov	4ebe643d0c	Support parallel test running for python tests Support is done via pytest-xdist plugin. To use the feature add -n<concurrency> to pytest invocation e.g. pytest -n8 to run 8 tests in parallel. Changes in code are mostly about ports assigning. Previously port for pageserver was hardcoded without the ability to override through zenith cli and ports for started compute nodes were calculated twice, in zenith cli and in test code. Now zenith cli supports port arguments for pageserver and compute nodes to be passed explicitly. Tests are modified in such a way that each worker gets a non overlapping port range which can be configured and now contains 100 ports. These ports are distributed to test services (pageserver, wal acceptors, compute nodes) so they can work independently.	2021-09-15 14:02:15 +03:00
Patrick Insinger	d150f3ce8c	Detect writes on frozen InMemoryLayers Data written to frozen layers is lost. It will not appear in on-disk structures or in successor InMemoryLayers. Here we detect this race, and fail. I think this race is rare, but this should make it easier to track down when it happens.	2021-09-14 11:44:48 -07:00
Patrick Insinger	cff4572774	Avoid race in `get_layer_for_write` Implement the changes suggested in a comment, create `get_layer_for_read_locked` so that `get_layer_for_write` doesn't have to drop the LayerMap lock when searching for the predecessor.	2021-09-14 11:24:24 -07:00
Dmitry Rodionov	84008a2560	factor out common logging initialisation routine This contains a lowest common denominator of pageserver and safekeeper log initialisation routines. It uses daemonize flag to decide where to stream log messages. In case daemonize is true log messages are forwarded to file. Otherwise streaming to stdout is used. Usage of stdout for log output is the default in docker side of things, so make it easier to browse our logs via builtin docker commands.	2021-09-14 18:09:14 +03:00
Heikki Linnakangas	6afd99c73f	Fix misc typos in comments.	2021-09-13 12:31:04 +03:00
Arseny Sher	0aec60938a	Make flush_lsn reported by safekeepers point to record boundary. Otherwise we produce corrupted record holes in WAL during compute node restart in case there was an unfinished record from the old compute, as these reports advance commit_lsn -- reliably persisted part of WAL. ref #549. Mostly by @knizhnik. I adjusted to make sure proposer always starts streaming since record beginning so we don't need special quirks for decoding in safekeeper.	2021-09-11 06:10:10 +03:00
Patrick Insinger	7c62a57e54	initialize tenant_mgr after daemonizing Ran into problems launching the WAL redo process on OS X after 4b73ad. Launching the `initdb` process was met with "bad file descriptor" errors. Using dtrace, I found shortly after calling `posix_spawn` for `initdb`, `kevent` was returning this error. I haven't dug super deep to see if the daemonization itself is the problem, but this commit fixes it for me. My hunch is that some file descriptors used when the Tokio runtime is initialized become invalid in the daemon process.	2021-09-10 13:00:39 +03:00
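
Commit 2319e0ec8f above defines a layer's start bound as inclusive and its end bound as exclusive. The rule can be illustrated with a minimal Rust sketch. The names `Lsn`, `LayerRange`, `covers`, and `find_layer` are hypothetical and chosen only for this illustration; they are not taken from the repository, and the real layer map is more involved.

```rust
/// Hypothetical stand-in for a log sequence number (assumption: a plain u64).
type Lsn = u64;

/// Hypothetical LSN range of a layer: `start` is inclusive, `end` is exclusive.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct LayerRange {
    start: Lsn,
    end: Lsn,
}

impl LayerRange {
    /// True if a GetPage@LSN request at `lsn` can be served from this layer.
    fn covers(&self, lsn: Lsn) -> bool {
        self.start <= lsn && lsn < self.end
    }
}

/// Return the first layer whose half-open range contains `lsn`, if any.
fn find_layer(layers: &[LayerRange], lsn: Lsn) -> Option<LayerRange> {
    layers.iter().copied().find(|layer| layer.covers(lsn))
}
```

With layers covering 100-200 and 200-300, a request at LSN 100 or 199 resolves to the first layer, while a request at LSN 200 resolves to the second, which matches the example in the commit message.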

429 Commits