
rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-06 21:12:55 +00:00

Author	SHA1	Message	Date
Dmitry Ivanov	43ded1c54b	[proxy] Minor cleanup	2021-11-17 16:28:44 +03:00
Heikki Linnakangas	f8702d4625	Fix checking for whether segment exists on a frozen in-memory layer. Ever since we've had frozen in-memory layers, having an 'end_lsn' no longer means that the layer has been dropped. Need to check the 'dropped' flag explicitly. This was reliably causing a failure on the new 'test_parallel_copy' test in https://github.com/zenithdb/zenith/pull/864. I'm not sure why it doesn't happen on main branch, but the bug is pretty straightforward when you see it.	2021-11-15 20:19:15 +02:00
Dmitry Rodionov	44111e3ba3	Prohibit branch creation at lsn that was already garbage collected. This introduces new timeline field latest_gc_cutoff. It is updated before each gc iteration. New check is added to branch_timelines to prevent branch creation with start point less than latest_gc_cutoff. Also this adds a check to get_page_at_lsn which asserts that lsn at which the page is requested was not garbage collected. This check currently is triggered for readonly nodes which are pinned to specific lsn and because they are not tracked in pageserver garbage collection can remove data that still might be referenced. This is a bug and will be fixed separately.	2021-11-15 20:03:16 +03:00
Patrick Insinger	298bc588f9	pageserver - don't try to GC InMemoryLayers	2021-11-15 09:01:45 -08:00
Heikki Linnakangas	4ba521f53f	Add performance test case for parallel COPY TO	2021-11-15 14:49:53 +02:00
Heikki Linnakangas	431d32756b	Add a buffer cache, and use it to store materialized pages. The buffer cache is shared across all tenants, allowing memory to be dynamically allocated where it's needed the most. The cache works on 8 kB pages, and uses the clock algorithm for replacement policy; same as the PostgreSQL buffer cache. One peculiarity is that the materialized page versions can be looked up by an inexact LSN, to find the latest page version with an LSN >= the search key. The code is structured to support caching other kinds of pages in the same cache in the future, but with a different mapping key. Co-authored-by: Patrick Insinger <patrick@zenith.tech>	2021-11-12 11:02:12 -08:00
Heikki Linnakangas	3d172d98a3	Improve layered repo README. Add an informal overview of how it works.	2021-11-12 19:59:31 +02:00
Heikki Linnakangas	849ac791a6	Bandaid fix for "page not found" errors, when a table is loaded. During parallel load of a table, Postgres sometimes requests a page from the page server for which no WAL has been generated yet. That's normal; Postgres expects the page to be full of zeros. There was a special case for that in LayeredTimeline::materialize_page, but the problem remained when you're crossing a segment boundary, so that there's no layer for the segment at all. It would be nice to have a more robust cross-check for this case. That might need help from the Postgres side. But this extends the bandaid fix we had in materialize_page() to the case where we cross a segment boundary. Fixes https://github.com/zenithdb/zenith/issues/841	2021-11-12 18:47:39 +02:00
Alexey Kondratov	de5e6a15ae	Set LD_LIBRARY_PATH in the check_restored_datadir_content() psql call Otherwise we may use outdated system libpq. Also print stdout/stderr if basebackup failed in check_restored_datadir_content()	2021-11-12 16:27:43 +03:00
Alexey Kondratov	0d6bf14ecb	Use vendor/postgres rebased on top of REL_14_1	2021-11-12 16:27:43 +03:00
Heikki Linnakangas	d1e79c4af3	Fix locking issues in VirtualFile machinery. There were two separate locking issues that could lead to a deadlock, both related to holding a lock for longer than necessary: 1. In the loop in `VirtualFile::with_file`, the "handle_guard" was held across iterations of the loop. Because of that, if the handle was changed by a concurrent thread, the loop would try to acquire the handle lock, when it was still holding the lock from previous iteration. To fix, release the lock earlier. There was no need to hold it across iterations, it was just accidental. 2. In the same function, we also held the "slot_guard" longer than necessary. It's only needed in the first part of the loop, where we check if the current handle is valid. If it's not, the slot lock can be immediately released. But it was not, it was kept over the acquisition of the handle lock. I'm not sure if that alone could cause problems, but let's release the lock as soon as possible anyway. Add a test case, based on Konstantin's test program to demonstrate the deadlock.	2021-11-11 20:12:59 +02:00
Kirill Bulatov	abb2ac5246	Better context when erroring	2021-11-11 19:22:05 +02:00
Kirill Bulatov	99dbbe5f18	Allow downloading remote files partially	2021-11-11 18:51:34 +02:00
Arseny Sher	e7ca8ef5a8	Use PG timelineid 1 everywhere. As changing it doesn't have useful meaning in Zenith. ref #824	2021-11-11 13:53:39 +03:00
Patrick Insinger	1ce4976e36	pageserver - track size of VecMaps	2021-11-10 11:09:34 -08:00
Heikki Linnakangas	9300107cdf	Cache Book objects, use virtual files to avoid running out of fds. Currently, whenever a page version is needed from an image or delta layer, we open the file and read and parse the bookfile headers. That's pretty expensive. To reduce the overhead, introduce a cache of open file descriptors, and use that to cache the Book objects so that we don't need to read the metadata on every access.	2021-11-10 17:19:37 +02:00
Arthur Petukhovsky	9aaa02bc9a	Fix high CPU usage in walproposer (#860) * Bump vendor/postgres * Update time limits for test_restarts_under_load	2021-11-10 17:18:07 +03:00
Arseny Sher	5603259c53	In wal_proposer_recovery, don't wait for outgoing WAL to be committed. Otherwise we're deadlocking ourselves. Oversight of `33007cc`.	2021-11-10 01:38:25 +03:00
Arseny Sher	ce15c62f35	Fix 'send WAL up to' debug logging.	2021-11-10 01:38:25 +03:00
Egor Suvorov	eaff0cd568	Check python for the whole repository and improve docs (#813)	2021-11-09 22:23:29 +03:00
Egor Suvorov	587935ebed	Add Safekeeper metrics tests (#746) * zenith_fixtures.py: add SafekeeperHttpClient.get_metrics() * Ensure that `collect_lsn` and `flush_lsn`'s reported values look reasonable in `test_many_timelines`	2021-11-09 22:18:59 +03:00
Dmitry Rodionov	07dddfed28	Use more robust way to persist safekeeper control file. Now the safekeeper control file is updated in the following way: 1. Write data to temp file 2. Fsync the temporary file (if sync option is specified) 3. Rename temporary file to actual control file 4. Fsync containing directory (if sync option is specified) 5. Fsync file after rename (if sync option is specified). Note that action 5 is not mentioned anywhere as required but it is done in postgres this way (see durable_rename). Also, because of the rename machinery, switch to using a dedicated lock file to prevent running several safekeepers concurrently on the same data. Cleanup: fsync control file after rename to match postgres behaviour.	2021-11-09 17:51:46 +03:00
Arseny Sher	229dc7704f	Bump vendor/postgres.	2021-11-08 17:32:13 +03:00
Dmitry Rodionov	067f2ac814	fix perf repo branch name	2021-11-08 13:27:23 +03:00
Dmitry Rodionov	865870a8e5	Follow up staging benchmarking * change zenith-perf-data checkout ref to be main * set cluster id through secrets so there is no code changes required when we wipe out clusters on staging * display full pgbench output on error	2021-11-05 14:07:11 +03:00
Arthur Petukhovsky	d19263aec8	Adjust timeouts for test_restarts_under_load (#830) * Adjust timeouts for test_restarts_under_load * Add test timeout for test_restarts_under_load	2021-11-04 19:58:40 +03:00
Heikki Linnakangas	6d742719a1	Fix infinite loop in looking up predecessor layer Commit `960c7d69a8` changed the LSN returned in the Continue case in InMemoryLayer::get_page_reconstruct_data(), but neglected to make the same change in DeltaLayer. Also add an escape hatch to the loop in materialize_page() to avoid getting stuck in an infinite loop, if a bug like this reoccurs.	2021-11-04 16:07:12 +02:00
Dmitry Rodionov	c75bc9b8b0	Change benchmark plugin layout so pytest loads it properly when running all tests (not necessarily performance ones) resolves #837	2021-11-04 16:33:31 +03:00
Egor Suvorov	33007cc0bb	Safekeeper's `START_REPLICATION` handler: remove `stop_point`, do not handle `start_point == 0` (#777)	2021-11-04 14:50:33 +03:00
Dmitry Rodionov	987833e0b9	Propagate git SHA to zenith binaries Git commit sha is displayed when --version flag is used and is written to logs during service startup. Uses git_version crate when git is available, and GIT_VERSION environment variable otherwise which is the case for docker builds.	2021-11-04 14:22:29 +03:00
Kirill Bulatov	f36acf00de	Reduce "relish" word usages in remote storage	2021-11-04 12:53:42 +02:00
Kirill Bulatov	956fc3dec9	Tidy up and make consistent the remote storage API	2021-11-04 12:53:42 +02:00
Heikki Linnakangas	b38e841f2d	Use poll() in communication with WAL redo process. The tokio futures added some overhead, so switch to plain non-blocking I/O with poll(). In a simple pgbench test on my laptop (select-only queries, scale-factor 1 `pgbench -P1 -T50 -S`), this gives about 10% improvement, from about 4300 TPS to 4800 TPS.	2021-11-04 10:39:04 +02:00
Heikki Linnakangas	3a0111c75e	Refactor functions for constructing WAL redo messages. Instead of building a separate Vec<u8> to hold each message, serialize all the messages to one big Vec<u8>. This eliminates some Vec allocation and memcpy() overhead. The downside is that if there are a lot of records to replay, we have to serialize them all into one big chunk of memory. That shouldn't be a problem in practice. If you need to replay millions of records to reconstruct a page, we should've materialized a new image of that page earlier already.	2021-11-04 10:39:00 +02:00
Heikki Linnakangas	086a02ab92	Add performance test for simple seq scans. Fixes https://github.com/zenithdb/zenith/issues/831	2021-11-04 10:36:45 +02:00
Heikki Linnakangas	7ed39655dc	Bump vendor/postgres	2021-11-04 10:35:50 +02:00
Dmitry Rodionov	c6172dae47	implement performance tests against our staging environment tests are based on self-hosted runner which is physically close to our staging deployment in aws, currently tests consist of various configurations of pgbench runs. Also these changes rework benchmark fixture by removing globals and allowing to collect reports with desired metrics and dump them to json for further analysis. This is also applicable to usual performance tests which use local zenith binaries.	2021-11-04 02:15:46 +03:00
Heikki Linnakangas	4ba783d0af	Remove a couple of unused functions. We might want to have custom serialize/deserialize functions for WALRecords and PageVersions for performance reasons, see github issue 832. But that would probably look a bit different from this, and currently these functions are just dead.	2021-11-03 19:10:23 +02:00
Patrick Insinger	0457fe81a9	pageserver - make `PageVersion` an enum	2021-11-03 09:28:49 -07:00
Heikki Linnakangas	fb524dd973	Put a global limit on memory used by in-memory layers. Adds simple global tracking of memory used by the in-memory layers. It's very approximate, it doesn't take into account allocator, memory fragmentation or many other things, but it's a good first step. After storing a WAL record in the repository, the WAL receiver checks the global memory usage. If it's above a configurable threshold (hard coded at 128 MB at the moment), it evicts a layer. The victim layer is chosen by GClock algorithm, similar to that used in the Postgres buffer cache. This stops the page server from using an unbounded amount of memory. It's pretty crude, the eviction and materializing and writing a layer to disk happens now in the WAL receiver thread. It would be nice to move that to a background thread, and it would be nice to have a smarter policy on when to materialize a new image layer and when to just write out a delta layer, and it would be nice to have more accurate accounting of memory. But this should fix the most pressing OOM issues, and is a step in the right direction. Co-authored-by: Patrick Insinger <patrickinsinger@gmail.com>	2021-11-02 15:49:39 +02:00
Heikki Linnakangas	8c6d2664c0	Support removing arbitrary open layers, not just the oldest one	2021-11-02 15:43:16 +02:00
Patrick Insinger	cdbbd15eb9	pageserver - add InMemoryLayer global map (#817)	2021-11-01 12:20:24 -07:00
anastasia	85f8bf97f5	Name walkeeper threads to make debugging more convenient	2021-11-01 19:09:57 +03:00
anastasia	83ed930bc2	WIP. Launch and shutdown tenant threads together with walreceiver. TODO: now walreceiver only disconnects if safekeeper was shut down. Implement proper walreceiver disconnection.	2021-11-01 18:04:00 +03:00
anastasia	071e30cc53	Expose TENANT_THREADS_COUNT metric to observe number of currently active checkpointer and GC threads	2021-11-01 18:04:00 +03:00
Kirill Bulatov	e6ef27637b	Better API to handle timeline metadata properly	2021-10-29 23:51:40 +03:00
Patrick Insinger	b532470792	Set SO_REUSEADDR for all TCP listeners	2021-10-29 12:45:26 -07:00
Heikki Linnakangas	e0d7ecf91c	Refactor 'zenith' CLI subcommand handling Also fixes 'zenith safekeeper restart -m immediate'. The stop-mode was previously ignored.	2021-10-29 19:01:01 +03:00
Kirill Bulatov	edba2e9744	Use a proper extension for the readme file	2021-10-28 18:55:14 +03:00
Egor Suvorov	7e552b645f	Add disk write/sync metrics to Safekeeper (#745)	2021-10-28 18:38:36 +03:00
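
The five-step control file update described in commit 07dddfed28 above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the safekeeper's actual code: the function name `persist_control_file`, the `.partial` temporary extension, and the `sync` parameter are all hypothetical, and fsyncing a directory by opening it read-only is assumed to work as it does on Linux and other Unix systems.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: replace the file at `path` with `data` via a
/// temporary file and rename, following the order listed in the commit.
fn persist_control_file(path: &Path, data: &[u8], sync: bool) -> io::Result<()> {
    // A bare file name has an empty parent; fall back to the current directory.
    let dir = path
        .parent()
        .filter(|p| !p.as_os_str().is_empty())
        .unwrap_or(Path::new("."));
    // Assumed naming scheme for the temporary file.
    let tmp = path.with_extension("partial");

    // 1. Write data to the temporary file.
    let mut file = File::create(&tmp)?;
    file.write_all(data)?;

    // 2. Fsync the temporary file (if requested).
    if sync {
        file.sync_all()?;
    }
    drop(file);

    // 3. Rename the temporary file to the actual control file.
    fs::rename(&tmp, path)?;

    if sync {
        // 4. Fsync the containing directory so the rename itself is durable.
        File::open(dir)?.sync_all()?;
        // 5. Fsync the file once more after the rename.
        File::open(path)?.sync_all()?;
    }
    Ok(())
}
```

A caller might invoke it as `persist_control_file(Path::new("safekeeper.control"), &bytes, true)`, where the file name is again only an example. Because the rename replaces the old file in one step on POSIX filesystems, a reader should see either the old contents or the new ones, not a partially written file.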


1255 Commits