rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-05-21 15:10:44 +00:00

Author	SHA1	Message	Date
Rory de Zoete	6dbf202e0d	Update crane copy target (#2704 ) Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box>	2022-10-27 16:00:40 +02:00
Arseny Sher	b42bf9265a	Enable etcd compaction in neon_local.	2022-10-27 10:47:08 +03:00
Stas Kelvich	1f08ba5790	Avoid debian-testing packages in compute Dockerfiles plv8 can only be built with a fairly new gold linker version. We used to install it via binutils packages from testing, but it also updates libc and that causes troubles in the resulting image as different extensions were built against different libc versions. We could either use libc from debian-testing everywhere or restrain from using testing packages and install necessary programs manually. This patch uses the latter approach: gold for plv8 and cmake for h3 are installed manually. In a passing declare h3_postgis as a safe extension (previous omission).	2022-10-27 09:44:16 +03:00
bojanserafimov	0c54eb65fb	Move pagestream api to libs/pageserver_api (#2698 )	2022-10-26 17:32:31 -04:00
mikecaat	259a5f356e	Add a docker-compose example file (#1943 ) (#2666 ) Co-authored-by: Masahiro Ikeda <masahiro.ikeda.us@hco.ntt.co.jp>	2022-10-26 13:59:25 +03:00
Sergey Melnikov	a3cb8c11e0	Do not release to new staging proxies on release (#2685 )	2022-10-25 23:51:23 +00:00
bojanserafimov	9fb2287f87	Add draw_timeline binary (#2688 )	2022-10-25 11:25:22 -04:00
Alexander Bayandin	834ffe1bac	Add data format backward compatibility tests (#2626 )	2022-10-25 16:41:50 +02:00
Stas Kelvich	df18b041c0	Use apt version pinning instead of repo priorities Higher `bullseye` priority doesn't work for packages installed via `bullseye-updates`, e.g.: ``` libc-bin: Installed: 2.31-13+deb11u5 Candidate: 2.35-3 Version table: 2.35-3 500 500 http://ftp.debian.org/debian testing/main amd64 Packages *** 2.31-13+deb11u5 500 500 http://deb.debian.org/debian bullseye-updates/main amd64 Packages 100 /var/lib/dpkg/status 2.31-13+deb11u4 990 990 http://deb.debian.org/debian bullseye/main amd64 Packages ``` Try version pinning instead	2022-10-25 14:29:11 +03:00
Anastasia Lubennikova	39897105b2	Check postgres version and ensure that public schema exists before running GRANT query on it	2022-10-25 09:55:24 +03:00
Stas Kelvich	2f399f08b2	Hotfix to disable grant create on public schema `GRANT CREATE ON SCHEMA public` fails if there is no schema `public`. Disable it in release for now and make a better fix later (it is needed for v15 support).	2022-10-25 09:55:24 +03:00
Arseny Sher	9f49605041	Fix division by zero panic in determine_offloader.	2022-10-22 18:25:12 +03:00
Konstantin Knizhnik	7b6431cbd7	Disable wal_log_hints by default (#2598 ) * Disable wal_log_hints by default * Remove obsolete comment about wal_log_hints	2022-10-22 14:59:18 +03:00
Lassi Pölönen	321aeac3d4	Json logging capability (#2624 ) * Support configuring the log format as json or plain. Separately test json and plain logger. They would be competing on the same global subscriber otherwise. * Implement log_format for pageserver config * Implement configurable log format for safekeeper.	2022-10-21 17:30:20 +00:00
Andrés	71ef7b6663	Remove cached_property package (#2673 ) Co-authored-by: andres <andres.rodriguez@outlook.es>	2022-10-21 20:02:31 +03:00
Kirill Bulatov	5928cb33c5	Introduce timeline state (#2651 ) Similar to https://github.com/neondatabase/neon/pull/2395, introduces a state field in Timeline, that's possible to subscribe to. Adjusts * walreceiver to not to have any connections if timeline is not Active * remote storage sync to not to schedule uploads if timeline is Broken * not to create timelines if a tenant/timeline is broken * automatically switches timelines' states based on tenant state Does not adjust timeline's gc, checkpointing and layer flush behaviour much, since it's not safe to cancel these processes abruptly and there's task_mgr::shutdown_tasks that does similar thing.	2022-10-21 15:51:48 +00:00
Sergey Melnikov	6ff2c61ae0	Refactor safekeeper s3 config and change it for new account (#2672 )	2022-10-21 13:44:08 +00:00
Arseny Sher	7480a0338a	Determine safekeeper for offloading WAL without etcd election API. This API is rather pointless, as sane choice anyway requires knowledge of peers status and leaders lifetime in any case can intersect, which is fine for us -- so manual elections are straightforward. Here, we deterministically choose among the reasonably caught up safekeepers, shifting by timeline id to spread the load. A step towards custom broker https://github.com/neondatabase/neon/issues/2394	2022-10-21 15:33:27 +03:00
Sergey Melnikov	2709878b8b	Deploy scram proxies into new account (#2643 )	2022-10-21 14:21:22 +03:00
Kirill Bulatov	39e4bdb99e	Actualize tenant and timeline API modifiers (#2661 ) * Actualize tenant and timeline API modifiers * Use anyhow::Result explicitly	2022-10-21 10:58:43 +00:00
Anastasia Lubennikova	52e75fead9	Use anyhow::Result explicitly	2022-10-21 12:47:06 +03:00
Anastasia Lubennikova	a347d2b6ac	#2616 handle 'Unsupported pg_version' error properly	2022-10-21 12:47:06 +03:00
Heikki Linnakangas	fc4ea3553e	test_gc_cutoff.py fixes (#2655 ) * Fix bogus early exit from GC. Commit `91411c415a` added this failpoint, but the early exit was not intentional. * Cleanup test_gc_cutoff.py test. - Remove the 'scale' parameter, this isn't a benchmark - Tweak pgbench and pageserver options to create garbage faster than the GC can collect it. The test used to take just under 5 minutes, which was uncomfortably close to the default 5 minute test timeout, and annoyingly even without the hard limit. These changes bring it down to about 1-2 minutes. - Improve comments, fix typos - Rename the failpoint. The old name, 'gc-before-save-metadata' implied that the failpoint was before the metadata update, but it was in fact much later in the function. - Move the call to persist the metadata outside the lock, to avoid holding it for too long. To verify that this test still covers the original bug, https://github.com/neondatabase/neon/issues/2539, I commented out the metadata file update like this: ``` diff --git a/pageserver/src/tenant/timeline.rs b/pageserver/src/tenant/timeline.rs index 1e857a9a..f8a9f34a 100644 --- a/pageserver/src/tenant/timeline.rs +++ b/pageserver/src/tenant/timeline.rs @@ -1962,7 +1962,7 @@ impl Timeline { } // Persist the new GC cutoff value in the metadata file, before // we actually remove anything. - self.update_metadata_file(self.disk_consistent_lsn.load(), HashMap::new())?; + //self.update_metadata_file(self.disk_consistent_lsn.load(), HashMap::new())?; info!("GC starting"); ``` It doesn't fail every time with that, but it did fail after about 5 runs.	2022-10-21 02:39:55 +03:00
Dmitry Rodionov	cca1ace651	make launch_wal_receiver infallible	2022-10-21 00:40:12 +03:00
Sergey Melnikov	30984c163c	Fix race between pushing image to ECR and copying to dockerhub (#2662 )	2022-10-20 23:01:01 +03:00
Konstantin Knizhnik	7404777efc	Pin pages with speculative insert tuples to prevent their reconstruction because spec_token is not wal logged (#2657 ) * Pin pages with speculative insert tuples to prevent their reconstruction because spec_token is not wal logged refer ##2587 * Bump postgres versions	2022-10-20 20:06:05 +03:00
Heikki Linnakangas	eb1bdcc6cf	If an FSM or VM page cannot be reconstructed, fill it with zeros. If we cannot reconstruct an FSM or VM page, while creating image layers, fill it with zeros instead. That should always be safe, for the FSM and VM, in the sense that you won't lose actual user data. It will get cleaned up by VACUUM later. We had a bug with FSM/VM truncation, where we truncated the FSM and VM at WAL replay to a smaller size than PostgreSQL originally did. We thought that was harmless, as the FSM and VM are not critical for correctness and can be zeroed out or truncated without affecting user data. However, it led to a situation where PostgreSQL created incremental WAL records for pages that we had already truncated away in the pageserver, and when we tried to replay those WAL records, that failed. That led to a permanent error in image layer creation, and prevented it from ever finishing. See https://github.com/neondatabase/neon/issues/2601. With this patch, those pages will be filled with zeros in the image layer, which allows the image layer creation to finish.	2022-10-20 17:27:01 +03:00
Arthur Petukhovsky	f5ab9f761b	Remove flaky checks in test_delete_force (#2567 )	2022-10-20 17:14:32 +04:00
Kirill Bulatov	306a47c4fa	Use uninit mark files during timeline init for atomic creation (#2489 ) Part of https://github.com/neondatabase/neon/pull/2239 Regular, from scratch, timeline creation involves initdb to be run in a separate directory, data from this directory to be imported into pageserver and, finally, timeline-related background tasks to start. This PR ensures we don't leave behind any directories that are not marked as temporary and that pageserver removes such directories on restart, allowing timeline creation to be retried with the same IDs, if needed. It would be good to later rewrite the logic to use a temporary directory, similar what tenant creation does. Yet currently it's harder than this change, so not done.	2022-10-20 14:19:17 +03:00
Kirill Bulatov	84c5f681b0	Fix test feature detection (#2659 ) Follow-up of #2636 and #2654 , fixing the test detection feature. Pageserver currently outputs features as ``` /target/debug/pageserver --version Neon page server git:7734929a8202c8cc41596a861ffbe0b51b5f3cb9 failpoints: true, features: ["testing", "profiling"] ```	2022-10-20 13:44:03 +03:00
Kirill Bulatov	50297bef9f	RFC about Tenant / Timeline guard objects (#2660 ) Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2022-10-20 12:49:54 +03:00
Andrés	9211923bef	Pageserver Python tests should not fail if the server is built with no testing feature (#2636 ) Co-authored-by: andres <andres.rodriguez@outlook.es>	2022-10-20 10:46:57 +03:00
bojanserafimov	7734929a82	Remove stale todos (#2630 )	2022-10-19 22:59:22 +00:00
Heikki Linnakangas	bc5ec43056	Fix flaky physical-size tests in test_timeline_size.py. These two tests, test_timeline_physical_size_post_compaction and test_timeline_physical_size_post_gc, assumed that after you have waited for the WAL from a bulk insertion to arrive, and you run a cycle of checkpoint and compaction, no new layer files are created. Because if a new layer file is created while we are calculating the incremental and non-incremental physical sizes, they might differ. However, the tests used a very small checkpoint_distance, so even a small amount of WAL generated in PostgreSQL could cause a new layer file to be created. Autovacuum can kick in at any time, and do that. That caused occasional failures in the test. I was able to reproduce it reliably by adding a long delay between the incremental and non-incremental size calculations: ``` --- a/pageserver/src/http/routes.rs +++ b/pageserver/src/http/routes.rs @@ -129,6 +129,9 @@ async fn build_timeline_info( } }; let current_physical_size = Some(timeline.get_physical_size()); + if include_non_incremental_physical_size { + std::thread::sleep(std::time::Duration::from_millis(60000)); + } let info = TimelineInfo { tenant_id: timeline.tenant_id, ``` To fix, disable autovacuum for the table. Autovacuum could still kick in for other tables, e.g. catalog tables, but that seems less likely to generate enough WAL to cause a new layer file to be flushed. If this continues to be a problem in the future, we could simply retry the physical size call a few times, if there's a mismatch. A mismatch could happen every once in a while, but it's very unlikely to happen more than once or twice in a row. Fixes https://github.com/neondatabase/neon/issues/2212	2022-10-19 23:50:21 +03:00
MMeent	b237feedab	Add more redo metrics: (#2645 ) - Measure size of redo WAL (new histogram), with bounds between 24B-32kB - Add 2 more buckets at the upper end of the redo time histogram We often (>0.1% of several hours each day) take more than 250ms to do the redo round-trip to the postgres process. We need to measure these redo times more precisely.	2022-10-19 22:47:11 +02:00
Alexey Kondratov	4d1e48f3b9	[compute_ctl] Use postgres::config to properly escape database names (#2652 ) We've got at least one user in production that cannot create a database with a trailing space in the name. This happens because we use `url` crate for manipulating the DATABASE_URL, but it follows a standard that doesn't fit really well with Postgres. For example, it trims all trailing spaces from the path: > Remove any leading and trailing C0 control or space from input. > See: https://url.spec.whatwg.org/#url-parsing But we used `set_path()` to set database name and it's totally valid to have trailing spaces in the database name in Postgres. Thus, use `postgres::config::Config` to modify database name in the connection details.	2022-10-19 19:20:06 +02:00
Anastasia Lubennikova	7576b18b14	[compute_tools] fix GRANT CREATE ON SCHEMA public - run the grant query in each database	2022-10-19 18:37:52 +03:00
Konstantin Knizhnik	6b49b370fc	Fix build after applying PR #2558	2022-10-19 13:55:30 +03:00
Konstantin Knizhnik	91411c415a	Persists latest_gc_cutoff_lsn before performing GC (#2558 ) * Persists latest_gc_cutoff_lsn before performing GC * Perform some refactoring and code deduplication refer #2539 * Add test for persisting GC cutoff * Fix python test style warnings * Bump postgres version * Reduce number of iterations in test_gc_cutoff test * Bump postgres version * Undo bumping postgres version	2022-10-19 12:32:03 +03:00
Kirill Bulatov	c67cf34040	Update GH Action version (#2646 )	2022-10-19 11:16:36 +03:00
bojanserafimov	8fbe437768	Improve pageserver IO metrics (#2629 )	2022-10-18 11:53:28 -04:00
Heikki Linnakangas	989d78aac8	Buffer the TCP incoming stream on libpq connections. Reduces the number of syscalls needed to read the commands from the compute. Here's a snippet of strace output from the pageserver, when performing a sequential scan on a table, with prefetch: 3084934 recvfrom(47, "d", 1, 0, NULL, NULL) = 1 3084934 recvfrom(47, "\0\0\0\37", 4, 0, NULL, NULL) = 4 3084934 recvfrom(47, "\2\1\0\0\0\0\362\302\360\0\0\0\6\177\0\0002\276\0\0@\f\0\0\0\0\3", 27, 0, NULL, NULL) = 27 3084934 pread64(28, "\0\0\0\1\0\0\0\0\0\0\0\253 "..., 8192, 25190400) = 8192 3084934 write(45, "B\0\0\0\25\0\0\0\6\177\0\0002\276\0\0@\f\0\0\0\3A\0\0\32\355\0\0\0\0\1"..., 7010) = 7010 3084934 poll([{fd=46, events=POLLIN}, {fd=48, events=POLLIN}], 2, 60000) = 1 ([{fd=46, revents=POLLIN}]) 3084934 read(46, "\0\0\0\0p\311q\1\0\0\4\0\f\1\200\1\0 \4 \0\0\0\0\200\237\362\0\0\237\362\0"..., 8192) = 8192 3084934 sendto(47, "d\0\0 \5f\0\0\0\0p\311q\1\0\0\4\0\f\1\200\1\0 \4 \0\0\0\0\200\237"..., 8198, MSG_NOSIGNAL, NULL, 0) = 8198 3084934 recvfrom(47, "d", 1, 0, NULL, NULL) = 1 3084934 recvfrom(47, "\0\0\0\37", 4, 0, NULL, NULL) = 4 3084934 recvfrom(47, "\2\1\0\0\0\0\362\302\360\0\0\0\6\177\0\0002\276\0\0@\f\0\0\0\0\4", 27, 0, NULL, NULL) = 27 3084934 pread64(28, " \0=\0L\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0;;\0\0\0\4\4\0"..., 8192, 25198592) = 8192 3084934 write(45, "B\0\0\0\25\0\0\0\6\177\0\0002\276\0\0@\f\0\0\0\4A\0\0\32\355\0\0\0\0\1"..., 7010) = 7010 3084934 poll([{fd=46, events=POLLIN}, {fd=48, events=POLLIN}], 2, 60000) = 1 ([{fd=46, revents=POLLIN}]) 3084934 read(46, "\0\0\0\0\260\344q\1\0\0\4\0\f\1\200\1\0 \4 \0\0\0\0\200\237\362\0\0\237\362\0"..., 8192) = 8192 3084934 sendto(47, "d\0\0 \5f\0\0\0\0\260\344q\1\0\0\4\0\f\1\200\1\0 \4 \0\0\0\0\200\237"..., 8198, MSG_NOSIGNAL, NULL, 0) = 8198 3084934 recvfrom(47, "d", 1, 0, NULL, NULL) = 1 3084934 recvfrom(47, "\0\0\0\37", 4, 0, NULL, NULL) = 4 3084934 recvfrom(47, "\2\1\0\0\0\0\362\302\360\0\0\0\6\177\0\0002\276\0\0@\f\0\0\0\0\5", 27, 0, 
NULL, NULL) = 27 3084934 write(45, "B\0\0\0\25\0\0\0\6\177\0\0002\276\0\0@\f\0\0\0\5A\0\0\32\355\0\0\0\0\1"..., 7010) = 7010 3084934 poll([{fd=46, events=POLLIN}, {fd=48, events=POLLIN}], 2, 60000) = 1 ([{fd=46, revents=POLLIN}]) 3084934 read(46, "\0\0\0\0\330\377q\1\0\0\4\0\f\1\200\1\0 \4 \0\0\0\0\200\237\362\0\0\237\362\0"..., 8192) = 8192 3084934 sendto(47, "d\0\0 \5f\0\0\0\0\330\377q\1\0\0\4\0\f\1\200\1\0 \4 \0\0\0\0\200\237"..., 8198, MSG_NOSIGNAL, NULL, 0) = 8198 This shows the interaction for three get_page_at_lsn requests. For each request, the pageserver performs three recvfrom syscalls to read the incoming request from the socket. After this patch, those recvfrom calls are gone: 3086123 read(47, "\0\0\0\0\360\222q\1\0\0\4\0\f\1\200\1\0 \4 \0\0\0\0\200\237\362\0\0\237\362\0"..., 8192) = 8192 3086123 sendto(45, "d\0\0 \5f\0\0\0\0\360\222q\1\0\0\4\0\f\1\200\1\0 \4 \0\0\0\0\200\237"..., 8198, MSG_NOSIGNAL, NULL, 0) = 8198 3086123 pread64(29, " "..., 8192, 25182208) = 8192 3086123 write(46, "B\0\0\0\25\0\0\0\6\177\0\0002\276\0\0@\f\0\0\0\2A\0\0\32\355\0\0\0\0\1"..., 7010) = 7010 3086123 poll([{fd=47, events=POLLIN}, {fd=49, events=POLLIN}], 2, 60000) = 1 ([{fd=47, revents=POLLIN}]) 3086123 read(47, "\0\0\0\0000\256q\1\0\0\4\0\f\1\200\1\0 \4 \0\0\0\0\200\237\362\0\0\237\362\0"..., 8192) = 8192 3086123 sendto(45, "d\0\0 \5f\0\0\0\0000\256q\1\0\0\4\0\f\1\200\1\0 \4 \0\0\0\0\200\237"..., 8198, MSG_NOSIGNAL, NULL, 0) = 8198 3086123 pread64(29, "\0\0\0\1\0\0\0\0\0\0\0\253 "..., 8192, 25190400) = 8192 3086123 write(46, "B\0\0\0\25\0\0\0\6\177\0\0002\276\0\0@\f\0\0\0\3A\0\0\32\355\0\0\0\0\1"..., 7010) = 7010 3086123 poll([{fd=47, events=POLLIN}, {fd=49, events=POLLIN}], 2, 60000) = 1 ([{fd=47, revents=POLLIN}]) 3086123 read(47, "\0\0\0\0p\311q\1\0\0\4\0\f\1\200\1\0 \4 \0\0\0\0\200\237\362\0\0\237\362\0"..., 8192) = 8192 3086123 sendto(45, "d\0\0 \5f\0\0\0\0p\311q\1\0\0\4\0\f\1\200\1\0 \4 \0\0\0\0\200\237"..., 8198, MSG_NOSIGNAL, NULL, 0) = 8198 3086123 pread64(29, " 
\0=\0L\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0;;\0\0\0\4\4\0"..., 8192, 25198592) = 8192 3086123 write(46, "B\0\0\0\25\0\0\0\6\177\0\0002\276\0\0@\f\0\0\0\4A\0\0\32\355\0\0\0\0\1"..., 7010) = 7010 3086123 poll([{fd=47, events=POLLIN}, {fd=49, events=POLLIN}], 2, 60000) = 1 ([{fd=47, revents=POLLIN}]) In this test, the compute sends a batch of prefetch requests, and they are read from the socket in one syscall. That syscall was not captured by the strace snippet above, but there are much fewer of them than before.	2022-10-18 18:46:07 +03:00
Stas Kelvich	7ca72578f9	Enable plv8 again Now with quickfix for https://github.com/plv8/plv8/issues/503	2022-10-18 18:34:27 +03:00
Heikki Linnakangas	41550ec8bf	Remove unnecessary indirections of libpqwalproposer functions In the Postgres backend, we cannot link directly with libpq (check the pgsql-hackers archive for all kinds of fun that ensued when we tried to do that). Therefore, the libpq functions are used through the thin wrapper functions in libpqwalreceiver.so, and libpqwalreceiver.so is loaded dynamically. To hide the dynamic loading and make the calls look like regular functions, we use macros to hide the function pointers. We had inherited the same indirections in libpqwalproposer, but it's not needed since the neon extension is already a shared library that's loaded dynamically. There's no problem calling the functions directly there. Remove the indirections.	2022-10-18 18:25:30 +03:00
Sergey Melnikov	0cd2d91b9d	Fix deploy-new job by installing sivel.toiletwater (#2641 )	2022-10-18 14:44:19 +00:00
Sergey Melnikov	546e9bdbec	Deploy storage into new account and migrate to management API v2 (#2619 ) Deploy storage into new account Migrate safekeeper and pageserver initialisation to management api v2	2022-10-18 15:52:15 +03:00
Heikki Linnakangas	59bc7e67e0	Use an optimized version of amplify_num. Speeds up layer_map::search somewhat. I also opened a PR in the upstream rust-amplify repository with these changes, see https://github.com/rust-amplify/rust-amplify/pull/148. We can switch back to upstream version when that's merged.	2022-10-18 15:00:10 +03:00
Heikki Linnakangas	2418e72649	Speed up layer_map::search, by remembering the "envelope" for each layer. Lookups in the R-tree call the "envelope" function for every comparison, and our envelope function isn't very cheap, so that overhead adds up. Create the envelope once, when the layer is inserted into the tree, and store it along with the layer. That uses some more memory per layer, but that's not very significant. Speeds up the search operation 2x	2022-10-18 15:00:10 +03:00
Heikki Linnakangas	80746b1c7a	Add micro-benchmark for layer map search function The test data was extracted from our pgbench benchmark project on the captest environment, the one we use for the 'neon-captest-reuse' test.	2022-10-18 15:00:10 +03:00
Dmitry Rodionov	129f7c82b7	remove redundant expect_tenant_to_download_timeline	2022-10-18 11:21:48 +03:00
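
Commit 989d78aac8 above reports three `recvfrom` calls per request before buffering (1-byte tag, 4-byte length, 27-byte payload) and a single large `read` afterwards. The effect can be illustrated with the standard library alone. This is a minimal sketch, not the pageserver's code: `CountingReader`, `read_message`, `sample_stream` and `count_reads` are hypothetical names, and an in-memory byte slice stands in for the socket, with each `read` call on it standing in for one syscall.

```rust
use std::io::{self, BufReader, Read};

/// Wraps a reader and counts how many times `read` is called on it.
/// Each call stands in for one `recvfrom`/`read` syscall on a socket.
struct CountingReader<R> {
    inner: R,
    calls: usize,
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.calls += 1;
        self.inner.read(buf)
    }
}

/// Reads one framed message: a 1-byte tag, a 4-byte big-endian length
/// that counts itself, then the payload.
fn read_message<R: Read>(r: &mut R) -> io::Result<(u8, Vec<u8>)> {
    let mut tag = [0u8; 1];
    r.read_exact(&mut tag)?;
    let mut len = [0u8; 4];
    r.read_exact(&mut len)?;
    let body_len = (u32::from_be_bytes(len) as usize).saturating_sub(4);
    let mut body = vec![0u8; body_len];
    r.read_exact(&mut body)?;
    Ok((tag[0], body))
}

/// Builds `n` messages with tag b'd' and a 27-byte payload,
/// the same sizes that appear in the strace output.
fn sample_stream(n: usize) -> Vec<u8> {
    let mut out = Vec::new();
    for i in 0..n {
        out.push(b'd');
        out.extend_from_slice(&31u32.to_be_bytes()); // 4 + 27
        out.extend_from_slice(&[i as u8; 27]);
    }
    out
}

/// Returns how many underlying `read` calls were needed to parse `n` messages.
fn count_reads(n: usize, buffered: bool) -> io::Result<usize> {
    let data = sample_stream(n);
    let mut counting = CountingReader { inner: data.as_slice(), calls: 0 };
    if buffered {
        // One 8 KiB buffer in front of the "socket".
        let mut r = BufReader::with_capacity(8192, &mut counting);
        for _ in 0..n {
            read_message(&mut r)?;
        }
    } else {
        for _ in 0..n {
            read_message(&mut counting)?;
        }
    }
    Ok(counting.calls)
}
```

Calling `count_reads(3, false)` performs three reads per message, while `count_reads(3, true)` fills the buffer once and serves all three messages from memory.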

2253 Commits
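
Commit 7480a0338a in the list above describes choosing the WAL offloader deterministically among the reasonably caught-up safekeepers, shifting by timeline id to spread the load. One way such a rule could look is sketched below. Everything here is an assumption made for illustration: the `PeerInfo` struct, its fields, the `max_lag` threshold and the `choose_offloader` signature are hypothetical and are not taken from the neon source.

```rust
/// Hypothetical view of a safekeeper peer; field names are illustrative.
#[derive(Debug, Clone, PartialEq)]
struct PeerInfo {
    id: u64,
    commit_lsn: u64,
}

/// Picks one peer id deterministically:
/// 1. keep peers whose commit_lsn is within `max_lag` of the most advanced peer,
/// 2. sort the survivors by id so every node sees the same order,
/// 3. index into that list with the timeline id to spread timelines across peers.
/// Returns None when there are no peers, so the modulo never divides by zero.
fn choose_offloader(peers: &[PeerInfo], timeline_id: u128, max_lag: u64) -> Option<u64> {
    // `max()` yields None for an empty slice, and `?` returns early.
    let max_lsn = peers.iter().map(|p| p.commit_lsn).max()?;

    let mut candidates: Vec<u64> = peers
        .iter()
        .filter(|p| max_lsn - p.commit_lsn <= max_lag)
        .map(|p| p.id)
        .collect();
    candidates.sort_unstable();

    // Non-empty here: the peer holding `max_lsn` always passes the filter.
    let idx = (timeline_id % candidates.len() as u128) as usize;
    Some(candidates[idx])
}
```

Because every node applies the same filter and ordering to the same peer data, they agree on the result without an election; nodes with slightly different views may briefly disagree, which the commit message describes as acceptable.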