rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-07 21:42:56 +00:00

Author	SHA1	Message	Date
Joonas Koivunen	cf68963b18	Add initial tenant sizing model and a http route to query it (#2714 ) Tenant size information is gathered by using existing parts of `Tenant::gc_iteration` which are now separated as `Tenant::refresh_gc_info`. `Tenant::refresh_gc_info` collects branch points, and invokes `Timeline::update_gc_info`; nothing was supposed to be changed there. The gathered branch points (through Timeline's `GcInfo::retain_lsns`), `GcInfo::horizon_cutoff`, and `GcInfo::pitr_cutoff` are used to build up a Vec of updates fed into the `libs/tenant_size_model` to calculate the history size. The gathered information is now exposed using `GET /v1/tenant/{tenant_id}/size`, which will respond with the actual calculated size. Initially the idea was to have this delivered as tenant background task and exported via metric, but it might be too computationally expensive to run it periodically as we don't yet know if the returned values are any good. Adds one new metric: - pageserver_storage_operations_seconds with label `logical_size` - separating from original `init_logical_size` Adds a pageserver wide configuration variable: - `concurrent_tenant_size_logical_size_queries` with default 1 This leaves a lot of TODO's, tracked on issue #2748.	2022-11-03 12:39:19 +00:00
Arseny Sher	63221e4b42	Fix sk->ps walsender shutdown on sk side on caughtup. This will fix many threads issue, but code around awfully still wants improvement. https://github.com/neondatabase/neon/issues/2722	2022-11-03 16:20:55 +04:00
bojanserafimov	d7eeb73f6f	Impl serialize for pagestream FeMessage (#2741 )	2022-11-02 23:44:07 -04:00
Joonas Koivunen	5112142997	fix: use different port for temporary postgres (#2743 ) `test_tenant_relocation` ends up starting a temporary postgres instance with a fixed port. the change makes the port configurable at scripts/export_import_between_pageservers.py and uses that in test_tenant_relocation.	2022-11-02 18:37:48 +00:00
bojanserafimov	a0a74868a4	Fix clippy (#2742 )	2022-11-02 12:30:09 -04:00
Christian Schwarz	b154992510	timeline_list_handler: avoid spawn_blocking As per https://github.com/neondatabase/neon/issues/2731#issuecomment-1299335813 refs https://github.com/neondatabase/neon/issues/2731	2022-11-02 16:22:58 +01:00
Christian Schwarz	a86a38c96e	README: fix instructions on how to run tests The `make debug` target doesn't exist, and I can't find it in the Git history.	2022-11-02 16:22:58 +01:00
Christian Schwarz	590f894db8	tenant_status: remove unnecessary spawn_blocking The spawn_blocking is pointless in this case: get_tenant is not expected to block for any meaningful amount of time. There are get_tenant calls in most other functions in the file too, and they don't bother with spawn_blocking. Let's remove the spawn_blocking from tenant_status, too, to be consistent. fixes https://github.com/neondatabase/neon/issues/2731	2022-11-02 16:22:58 +01:00
Alexander Bayandin	0a0595b98d	test_backward_compatibility: assign random port to compute (#2738 )	2022-11-02 15:22:38 +00:00
Dmitry Rodionov	e56d11c8e1	fix style if possible (cannot really split long lines in mermaid)	2022-11-02 17:15:49 +02:00
Dmitry Rodionov	ccdc3188ed	update according to discussion and comments	2022-11-02 17:15:49 +02:00
Dmitry Rodionov	67401cbdb8	pageserver s3 coordination	2022-11-02 17:15:49 +02:00
Kirill Bulatov	d42700280f	Remove daemonize from storage components (#2677 ) Move daemonization logic into `control_plane`. Storage binaries now only create a lockfile to avoid concurrent services running in the same directory.	2022-11-02 02:26:37 +02:00
Kirill Bulatov	6df4d5c911	Bump rustc to 1.62.1 (#2728 ) Changelog: https://github.com/rust-lang/rust/blob/master/RELEASES.md#version-1621-2022-07-19	2022-11-02 01:21:33 +02:00
Dmitry Rodionov	32d14403bd	remove wrong is_active filter for timelines in compaction/gc Gc needs to know about all branch points, not only ones for timelines that are active at the moment of gc. If a timeline is inactive then we won't know about its branch point. In this case gc can delete data that is needed by a child timeline. For compaction it is less severe. Delaying compaction can affect performance, so it is still better to run it. There is logic to exit quickly if there is nothing to compact	2022-11-01 18:07:08 +02:00
Dmitry Ivanov	0df3467146	Refactoring: replace `utils::connstring` with `Url`-based APIs	2022-11-01 18:17:36 +03:00
Dmitry Rodionov	c64a121aa8	do not nest wal_connection_manager span inside parent one	2022-11-01 15:08:23 +02:00
Heikki Linnakangas	22cc8760b9	Move walredo process code under pgxn in the main 'neon' repository. - Refactor the way the WalProposerMain function is called when started with --sync-safekeepers. The postgres binary now explicitly loads the 'neon.so' library and calls the WalProposerMain in it. This is simpler than the global function callback "hook" we previously used. - Move the WAL redo process code to a new library, neon_walredo.so, and use the same mechanism as for --sync-safekeepers to call the WalRedoMain function, when launched with --walredo argument. - Also move the seccomp code to neon_walredo.so library. I kept the configure check in the postgres side for now, though.	2022-10-31 01:11:50 +01:00
Arseny Sher	596d622a82	Fix test_prepare_snapshot. It should checkpoint pageserver after waiting for all data arrival, not before.	2022-10-28 22:12:31 +04:00
Sergey Melnikov	7481fb082c	Fix bugs in #2713 (#2716 )	2022-10-28 14:12:49 +00:00
Arseny Sher	1eb9bd052a	Bump vendor/postgres-v15 to fix XLP_FIRST_IS_CONTRECORD issue. ref https://github.com/neondatabase/cloud/issues/2688	2022-10-28 16:45:11 +03:00
Sergey Melnikov	59a3ca4ec6	Deploy proxy to new prod regions (#2713 ) * Refactor proxy deploy * Test new prod deploy * Remove assume role * Add new values * Add all regions	2022-10-28 16:25:28 +03:00
Sergey Melnikov	e86a9105a4	Deploy storage to new prod regions (#2709 )	2022-10-28 10:17:27 +00:00
Stas Kelvich	d3c8749da5	Build compute postgres with openssl support The main reason for that change is that Postgres 15 requires OpenSSL for `pgcrypto` to work. Also not a bad idea to have SSL-enabled Postgres in general.	2022-10-28 10:39:22 +03:00
Alexander Bayandin	128dc8d405	Nightly Benchmarks: fix workflow (#2708 )	2022-10-27 19:26:10 +03:00
Alexander Bayandin	0cbae6e8f3	test_backward_compatibility: friendlier error message (#2707 )	2022-10-27 15:54:49 +00:00
Alexander Stanovoy	78e412b84b	The fix of #2650 . (#2686 ) * Wrappers and drop implementations for image and delta layer writers. * Two regression tests for the image and delta layer files.	2022-10-27 14:02:55 +00:00
Rory de Zoete	6dbf202e0d	Update crane copy target (#2704 ) Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box>	2022-10-27 16:00:40 +02:00
Arseny Sher	b42bf9265a	Enable etcd compaction in neon_local.	2022-10-27 10:47:08 +03:00
Stas Kelvich	1f08ba5790	Avoid debian-testing packages in compute Dockerfiles plv8 can only be built with a fairly new gold linker version. We used to install it via binutils packages from testing, but it also updates libc and that causes trouble in the resulting image as different extensions were built against different libc versions. We could either use libc from debian-testing everywhere or refrain from using testing packages and install necessary programs manually. This patch uses the latter approach: gold for plv8 and cmake for h3 are installed manually. In passing, declare h3_postgis as a safe extension (previous omission).	2022-10-27 09:44:16 +03:00
bojanserafimov	0c54eb65fb	Move pagestream api to libs/pageserver_api (#2698 )	2022-10-26 17:32:31 -04:00
mikecaat	259a5f356e	Add a docker-compose example file (#1943 ) (#2666 ) Co-authored-by: Masahiro Ikeda <masahiro.ikeda.us@hco.ntt.co.jp>	2022-10-26 13:59:25 +03:00
Sergey Melnikov	a3cb8c11e0	Do not release to new staging proxies on release (#2685 )	2022-10-25 23:51:23 +00:00
bojanserafimov	9fb2287f87	Add draw_timeline binary (#2688 )	2022-10-25 11:25:22 -04:00
Alexander Bayandin	834ffe1bac	Add data format backward compatibility tests (#2626 )	2022-10-25 16:41:50 +02:00
Stas Kelvich	df18b041c0	Use apt version pinning instead of repo priorities Higher `bullseye` priority doesn't work for packages installed via `bullseye-updates`, e.g.: ``` libc-bin: Installed: 2.31-13+deb11u5 Candidate: 2.35-3 Version table: 2.35-3 500 500 http://ftp.debian.org/debian testing/main amd64 Packages *** 2.31-13+deb11u5 500 500 http://deb.debian.org/debian bullseye-updates/main amd64 Packages 100 /var/lib/dpkg/status 2.31-13+deb11u4 990 990 http://deb.debian.org/debian bullseye/main amd64 Packages ``` Try version pinning instead	2022-10-25 14:29:11 +03:00
Anastasia Lubennikova	39897105b2	Check postgres version and ensure that public schema exists before running GRANT query on it	2022-10-25 09:55:24 +03:00
Stas Kelvich	2f399f08b2	Hotfix to disable grant create on public schema `GRANT CREATE ON SCHEMA public` fails if there is no schema `public`. Disable it in release for now and make a better fix later (it is needed for v15 support).	2022-10-25 09:55:24 +03:00
Arseny Sher	9f49605041	Fix division by zero panic in determine_offloader.	2022-10-22 18:25:12 +03:00
Konstantin Knizhnik	7b6431cbd7	Disable wal_log_hints by default (#2598 ) * Disable wal_log_hints by default * Remove obsolete comment about wal_log_hints	2022-10-22 14:59:18 +03:00
Lassi Pölönen	321aeac3d4	Json logging capability (#2624 ) * Support configuring the log format as json or plain. Separately test json and plain logger. They would be competing on the same global subscriber otherwise. * Implement log_format for pageserver config * Implement configurable log format for safekeeper.	2022-10-21 17:30:20 +00:00
Andrés	71ef7b6663	Remove cached_property package (#2673 ) Co-authored-by: andres <andres.rodriguez@outlook.es>	2022-10-21 20:02:31 +03:00
Kirill Bulatov	5928cb33c5	Introduce timeline state (#2651 ) Similar to https://github.com/neondatabase/neon/pull/2395, introduces a state field in Timeline, that's possible to subscribe to. Adjusts * walreceiver to not to have any connections if timeline is not Active * remote storage sync to not to schedule uploads if timeline is Broken * not to create timelines if a tenant/timeline is broken * automatically switches timelines' states based on tenant state Does not adjust timeline's gc, checkpointing and layer flush behaviour much, since it's not safe to cancel these processes abruptly and there's task_mgr::shutdown_tasks that does similar thing.	2022-10-21 15:51:48 +00:00
Sergey Melnikov	6ff2c61ae0	Refactor safekeeper s3 config and change it for new account (#2672 )	2022-10-21 13:44:08 +00:00
Arseny Sher	7480a0338a	Determine safekeeper for offloading WAL without etcd election API. This API is rather pointless, as sane choice anyway requires knowledge of peers status and leaders lifetime in any case can intersect, which is fine for us -- so manual elections are straightforward. Here, we deterministically choose among the reasonably caught up safekeepers, shifting by timeline id to spread the load. A step towards custom broker https://github.com/neondatabase/neon/issues/2394	2022-10-21 15:33:27 +03:00
Sergey Melnikov	2709878b8b	Deploy scram proxies into new account (#2643 )	2022-10-21 14:21:22 +03:00
Kirill Bulatov	39e4bdb99e	Actualize tenant and timeline API modifiers (#2661 ) * Actualize tenant and timeline API modifiers * Use anyhow::Result explicitly	2022-10-21 10:58:43 +00:00
Anastasia Lubennikova	52e75fead9	Use anyhow::Result explicitly	2022-10-21 12:47:06 +03:00
Anastasia Lubennikova	a347d2b6ac	#2616 handle 'Unsupported pg_version' error properly	2022-10-21 12:47:06 +03:00
Heikki Linnakangas	fc4ea3553e	test_gc_cutoff.py fixes (#2655 ) * Fix bogus early exit from GC. Commit `91411c415a` added this failpoint, but the early exit was not intentional. * Cleanup test_gc_cutoff.py test. - Remove the 'scale' parameter, this isn't a benchmark - Tweak pgbench and pageserver options to create garbage faster than the GC can collect it. The test used to take just under 5 minutes, which was uncomfortably close to the default 5 minute test timeout, and annoying even without the hard limit. These changes bring it down to about 1-2 minutes. - Improve comments, fix typos - Rename the failpoint. The old name, 'gc-before-save-metadata' implied that the failpoint was before the metadata update, but it was in fact much later in the function. - Move the call to persist the metadata outside the lock, to avoid holding it for too long. To verify that this test still covers the original bug, https://github.com/neondatabase/neon/issues/2539, I commented out updating the metadata file like this: ``` diff --git a/pageserver/src/tenant/timeline.rs b/pageserver/src/tenant/timeline.rs index 1e857a9a..f8a9f34a 100644 --- a/pageserver/src/tenant/timeline.rs +++ b/pageserver/src/tenant/timeline.rs @@ -1962,7 +1962,7 @@ impl Timeline { } // Persist the new GC cutoff value in the metadata file, before // we actually remove anything. - self.update_metadata_file(self.disk_consistent_lsn.load(), HashMap::new())?; + //self.update_metadata_file(self.disk_consistent_lsn.load(), HashMap::new())?; info!("GC starting"); ``` It doesn't fail every time with that, but it did fail after about 5 runs.	2022-10-21 02:39:55 +03:00
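
Commit 7480a0338a describes choosing the WAL-offloading safekeeper deterministically "among the reasonably caught up safekeepers, shifting by timeline id to spread the load", and commit 9f49605041 later fixes a division-by-zero panic in `determine_offloader`. A minimal sketch of that idea follows. It is not the code from the repository: the `Peer` struct, the `choose_offloader` function, the `max_lag` threshold, and the use of plain integers for LSNs and timeline ids are all assumptions made for illustration.

```rust
/// Hypothetical peer summary; the field names are illustrative and are
/// not the safekeeper's real types.
#[derive(Debug, Clone, PartialEq)]
struct Peer {
    id: u64,
    commit_lsn: u64,
}

/// Pick the safekeeper that should offload WAL for `timeline_id`.
///
/// Peers whose commit_lsn is within `max_lag` of the most advanced peer
/// count as "caught up". Every node that sees the same peer list computes
/// the same answer, so no election is needed.
/// Returns None when the peer list is empty, which avoids a modulo by zero.
fn choose_offloader(peers: &[Peer], timeline_id: u128, max_lag: u64) -> Option<u64> {
    // `max()` yields None for an empty list, so `?` returns early.
    let max_lsn = peers.iter().map(|p| p.commit_lsn).max()?;

    // The most advanced peer always passes the filter, so this is non-empty.
    let mut caught_up: Vec<&Peer> = peers
        .iter()
        .filter(|p| max_lsn - p.commit_lsn <= max_lag)
        .collect();

    // Sort by id so the order does not depend on how peers were discovered.
    caught_up.sort_by_key(|p| p.id);

    // Shift by timeline id to spread timelines across the candidates.
    let idx = (timeline_id % caught_up.len() as u128) as usize;
    Some(caught_up[idx].id)
}
```

With peers at LSNs 100, 95 and 10 and a `max_lag` of 10, only the first two are candidates, and consecutive timeline ids alternate between them.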

2280 Commits