rust/neon - neon - Gitea: Git with a cup of tea

rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-08 14:02:55 +00:00

Author	SHA1	Message	Date
Sasha Krassovsky	0ba4cae491	Fix RLS/REPLICATION granting (#6083 ) ## Problem ## Summary of changes ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist	2023-12-08 12:55:44 -08:00
Alexey Kondratov	bc1020f965	compute_ctl: Notify waiters when Postgres failed to start (#6034 ) In case of configuring the empty compute, API handler is waiting on condvar for compute state change. Yet, previously if Postgres failed to start we were just setting compute status to `Failed` without notifying. It causes a timeout on control plane side, although we can return a proper error from compute earlier. With this commit API handler should be properly notified.	2023-12-05 13:38:45 +01:00
Alexey Kondratov	85d08581ed	[compute_ctl] Introduce feature flags in the compute spec (#6016 ) ## Problem In the past we've rolled out all new `compute_ctl` functionality right to all users, which could be risky. I want to have a more fine-grained control over what we enable, in which env and to which users. ## Summary of changes Add an option to pass a list of feature flags to `compute_ctl`. If not passed, it defaults to an empty list. Any unknown flags are ignored. This allows us to release new experimental features safer, as we can then flip the flag for one specific user, only Neon employees, free / pro / etc. users and so on. Or control it per environment. In the current implementation feature flags are passed via compute spec, so they do not allow controlling behavior of `empty` computes. For them, we can either stick with the previous approach, i.e. add separate cli args or introduce a more generic `--features` cli argument.	2023-12-04 19:54:18 +01:00
Anastasia Lubennikova	e3512340c1	Override neon.max_cluster_size for the time of compute_ctl (#5998 ) Temporarily reset neon.max_cluster_size to avoid the possibility of hitting the limit, while we are applying config: creating new extensions, roles, etc...	2023-12-03 15:21:44 +00:00
Alexey Kondratov	c1295bfb3a	[compute_ctl] User correct HTTP code in the /configure errors (#6017 ) It was using `PRECONDITION_FAILED` for errors during `ComputeSpec` to `ParsedSpec` conversion, but this disobeys the OpenAPI spec [1] and correct code should be `BAD_REQUEST` for any spec processing errors. While on it, I also noticed that `compute_ctl` OpenAPI spec has an invalid format and fixed it. [1] `fd81945a60/compute_tools/src/http/openapi_spec.yaml (L119-L120)`	2023-12-01 18:19:55 +01:00
Anastasia Lubennikova	4c29e0594e	Update neon extension relocatable for existing installations (#5943 )	2023-11-27 23:29:24 +00:00
Anastasia Lubennikova	92bc2bb132	Refactor remote extensions feature to request extensions from proxy (#5836 ) instead of direct S3 request. Pros: - simplify code a lot (no need to provide AWS credentials and paths); - reduce latency of downloading extension data as proxy resides near computes; -reduce AWS costs as proxy has cache and 1000 computes asking the same extension will not generate 1000 downloads from S3. - we can use only one S3 bucket to store extensions (and rid of regional buckets which were introduced to reduce latency); Changes: - deprecate remote-ext-config compute_ctl parameter, use http://pg-ext-s3-gateway if any old format remote-ext-cofig is provided; - refactor tests to use mock http server;	2023-11-27 12:10:23 +00:00
Anastasia Lubennikova	87b8ac3ec3	Only create neon extension in postgres database; (#5918 ) Create neon extension in neon schema.	2023-11-26 08:37:01 +00:00
Anastasia Lubennikova	f5dfa6f140	Create extension neon in existing databases too	2023-11-23 18:53:03 +00:00
Anastasia Lubennikova	f8d9bd8d14	Add extension neon to all databases. - Run CREATE EXTENSION neon for template1, so that it was created in all databases. - Run ALTER EXTENSION neon in all databases, to always have the newest version of the extension in computes. - Add test_neon_extension test	2023-11-23 18:53:03 +00:00
Em Sharnoff	cad0dca4b8	compute_ctl: Remove deprecated flag `--file-cache-on-disk` (#5622 ) See neondatabase/cloud#7516 for more.	2023-11-18 12:43:54 +01:00
Joonas Koivunen	462f04d377	Smaller test addition and change (#5858 ) - trivial serialization roundtrip test for `pageserver::repository::Value` - add missing `start_paused = true` to 15s test making it <0s test - completely unrelated future clippy lint avoidance (helps beta channel users)	2023-11-14 18:04:34 +01:00
Christian Schwarz	e9b227a11e	cleanup unused RemoteStorage fields (#5830 ) Found this while working on #5771	2023-11-08 16:54:33 +00:00
John Spray	b989ad1922	extend test_change_pageserver for failure case, rework changing pageserver (#5693 ) Reproducer for https://github.com/neondatabase/neon/issues/5692 The test change in this PR intentionally fails, to demonstrate the issue. --------- Co-authored-by: Sasha Krassovsky <krassovskysasha@gmail.com>	2023-11-08 11:26:56 +00:00
Joonas Koivunen	4be6bc7251	refactor: remove unnecessary unsafe (#5802 ) unsafe impls for `Send` and `Sync` should not be added by default. in the case of `SlotGuard` removing them does not cause any issues, as the compiler automatically derives those. This PR adds requirement to document the unsafety (see [clippy::undocumented_unsafe_blocks]) and opportunistically adds `#![deny(unsafe_code)]` to most places where we don't have unsafe code right now. TRPL on Send and Sync: https://doc.rust-lang.org/book/ch16-04-extensible-concurrency-sync-and-send.html [clippy::undocumented_unsafe_blocks]: https://rust-lang.github.io/rust-clippy/master/#/undocumented_unsafe_blocks	2023-11-07 10:26:25 +00:00
Heikki Linnakangas	b85fc39bdb	Update control plane API path for getting compute spec. (#5357 ) We changed the path in the control plane. The old path is still accepted for compatibility with existing computes, but we'd like to phase it out.	2023-11-06 09:26:09 +02:00
Em Sharnoff	367971a0e9	vm-monitor: Remove support for file cache in tmpfs (#5617 ) ref neondatabase/cloud#7516. We switched everything over to file cache on disk, now time to remove support for having it in tmpfs.	2023-11-02 16:06:16 +00:00
Muhammet Yazici	4f0a8e92ad	fix: Add bearer prefix to Authorization header (#5740 ) ## Problem Some requests with `Authorization` header did not properly set the `Bearer ` prefix. Problem explained here https://github.com/neondatabase/cloud/issues/6390. ## Summary of changes Added `Bearer ` prefix to missing requests.	2023-11-01 09:41:48 +03:00
Em Sharnoff	f7067a38b7	compute_ctl: Assume --vm-monitor-addr arg is always present (#5611 ) It has a default value, so this should be sound. Treating its presence as semantically significant was leading to spurious warnings.	2023-10-31 10:00:23 -07:00
Konstantin Knizhnik	ad99fa5f03	Grant BYPASSRLS and REPLICATION to exited roles (#5657 ) ## Problem Role need to have REPLICATION privilege to be able to used for logical replication. New roles are created with this option. This PR tries to update existed roles. ## Summary of changes Update roles in `handle_roles` method ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2023-10-30 15:29:25 +00:00
Sasha Krassovsky	116c342cad	Support changing pageserver dynamically (#5542 ) ## Problem We currently require full restart of compute if we change the pageserver url ## Summary of changes Makes it so that we don't have to do a full restart, but can just send SIGHUP	2023-10-26 10:56:07 -07:00
Konstantin Knizhnik	b1a1126152	Grant replication permission to newly created users (#5615 ) ## Problem ## Summary of changes ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2023-10-20 21:29:17 +03:00
Konstantin Knizhnik	5c88213eaf	Logical replication (#5271 ) ## Problem See https://github.com/neondatabase/company_projects/issues/111 ## Summary of changes Save logical replication files in WAL at compute and include them in basebackup at pate server. ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Arseny Sher <sher-ars@yandex.ru>	2023-10-18 16:42:22 +03:00
Alexey Kondratov	0ca342260c	[compute_ctl+pgxn] Handle invalid databases after failed drop (#5561 ) ## Problem In `89275f6c1e` we fixed an issue, when we were dropping db in Postgres even though cplane request failed. Yet, it introduced a new problem that we now de-register db in cplane even if we didn't actually drop it in Postgres. ## Summary of changes Here we revert extension change, so we now again may leave db in invalid state after failed drop. Instead, `compute_ctl` is now responsible for cleaning up invalid databases during full configuration. Thus, there are two ways of recovering from failed DROP DATABASE: 1. User can just repeat DROP DATABASE, same as in Vanilla Postgres. 2. If they didn't, then on next full configuration (dbs / roles changes in the API; password reset; or data availability check) invalid db will be cleaned up in the Postgres and re-created by `compute_ctl`. So again it follows pretty much the same semantics as Vanilla Postgres -- you need to drop it again after failed drop. That way, we have a recovery trajectory for both problems. See this commit for info about `invalid` db state: `a4b4cc1d60` According to it: > An invalid database cannot be connected to anymore, but can still be dropped. While on it, this commit also fixes another issue, when `compute_ctl` was trying to connect to databases with `ALLOW CONNECTIONS false`. Now it will just skip them. Fixes #5435	2023-10-16 20:46:45 +02:00
duguorong009	25a37215f3	fix: replace all `std::PathBuf`s with `camino::Utf8PathBuf` (#5352 ) Fixes #4689 by replacing all of `std::Path` , `std::PathBuf` with `camino::Utf8Path`, `camino::Utf8PathBuf` in - pageserver - safekeeper - control_plane - libs/remote_storage Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-10-04 17:52:23 +03:00
Joonas Koivunen	db8ff9d64b	testing: record walredo failures to test reports (#5451 ) We have rare walredo failures with pg16. Let's introduce recording of failing walredo input in `#[cfg(feature = "testing")]`. There is additional logging (the value reconstruction path logging usually shown with not found keys), keeping it for `#[cfg(features = "testing")]`. Cc: #5404.	2023-10-04 11:24:30 +03:00
MMeent	83e7e5dbbd	Feat/postgres 16 (#4761 ) This adds PostgreSQL 16 as a vendored postgresql version, and adapts the code to support this version. The important changes to PostgreSQL 16 compared to the PostgreSQL 15 changeset include the addition of a neon_rmgr instead of altering Postgres's original WAL format. Co-authored-by: Alexander Bayandin <alexander@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2023-09-12 15:11:32 +02:00
Konstantin Knizhnik	f64b338ce3	Ingore DISK_FULL error when performing availability check for client (#5010 ) See #5001 No space is what's expected if we're at size limit. Of course if SK incorrectly returned "no space", the availability check wouldn't fire. But users would notice such a bug quite soon anyways. So ignoring "no space" is the right trade-off. ## Problem ## Summary of changes ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-09-09 21:51:04 +03:00
Rahul Modpur	485a2cfdd3	Fix pg_config version parsing (#5200 ) ## Problem Fix pg_config version parsing ## Summary of changes Use regex to capture major version of postgres #5146	2023-09-07 15:34:22 +02:00
Chengpeng Yan	dfe2e5159a	remove the duplicate entries in postgresql.conf (#5090 )	2023-09-06 13:57:03 -04:00
Nikita Kalyanov	77658a155b	support deploying in IPv6-only environments (#4135 ) A set of changes to enable neon to work in IPv6 environments. The changes are backward-compatible but allow to deploy neon even to IPv6-only environments: - bind to both IPv4 and IPv6 interfaces - allow connections to Postgres from IPv6 interface - parse the address from control plane that could also be IPv6	2023-09-05 12:45:46 +03:00
Alexey Kondratov	f2c21447ce	[compute_ctl] Create check availability data during full configuration (#5084 ) I've moved it to the API handler in the `589cf1ed2` to do not delay compute start. Yet, we now skip full configuration and catalog updates in the most hot path -- waking up suspended compute, and only do it at: - first start - start with applying new configuration - start for availability check so it doesn't really matter anymore. The problem with creating the table and test record in the API handler is that someone can fill up timeline till the logical limit. Then it's suspended and availability check is scheduled, so it fails. If table + test row are created at the very beginning, we reserve a 8 KB page for future checks, which theoretically will last almost forever. For example, my ~1y old branch still has 8 KB sized test table: ```sql cloud_admin@postgres=# select pg_relation_size('health_check'); pg_relation_size ------------------ 8192 (1 row) ``` --------- Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech>	2023-08-30 17:44:28 +02:00
Anastasia Lubennikova	a7c0e4dcd0	Check if custiom extension is enabled. This check was lost in the latest refactoring. If extension is not present in 'public_extensions' or 'custom_extensions' don't download it	2023-08-30 17:47:06 +03:00
Anastasia Lubennikova	e5a397cf96	Form archive_path for remote extensions on the fly	2023-08-30 13:56:51 +03:00
Em Sharnoff	8d2a4aa5f8	vm-monitor: Add flag for when file cache on disk (#5130 ) Part 1 of 2, for moving the file cache onto disk. Because VMs are created by the control plane (and that's where the filesystem for the file cache is defined), we can't rely on any kind of synchronization between releases, so the change needs to be feature-gated (kind of), with the default remaining the same for now. See also: neondatabase/cloud#6593	2023-08-29 12:44:48 -07:00
Felix Prasanna	85d6d9dc85	monitor/compute_ctl: remove references to the informant (#5115 ) Also added some docs to the monitor :) Co-authored-by: Em Sharnoff <sharnoff@neon.tech>	2023-08-29 02:59:27 +03:00
Em Sharnoff	529f8b5016	compute_ctl: Fix switched vm-monitor args (#5117 ) Small switcheroo from #4946.	2023-08-28 14:55:41 +02:00
Felix Prasanna	7b5489a0bb	compute_ctl: start pg in cgroup for vms (#4920 ) Starts `postgres` in cgroup directly from `compute_ctl` instead of from `vm-builder`. This is required because the `vm-monitor` cannot be in the cgroup it is managing. Otherwise, it itself would be frozen when freezing the cgroup. Requires https://github.com/neondatabase/cloud/pull/6331, which adds the `AUTOSCALING` environment variable letting `compute_ctl` know to start `postgres` in the cgroup. Requires https://github.com/neondatabase/autoscaling/pull/468, which prevents `vm-builder` from starting the monitor and putting postgres in a cgroup. This will require a `VM_BUILDER_VERSION` bump.	2023-08-25 15:59:12 -04:00
Felix Prasanna	18537be298	monitor: listen on correct port to accept agent connections (#5100 ) ## Problem The previous arguments have the monitor listen on `localhost`, which the informant can connect to since it's also in the VM, but which the agent cannot. Also, the port is wrong. ## Summary of changes Listen on `0.0.0.0:10301`	2023-08-24 17:32:46 -04:00
Felix Prasanna	3128eeff01	compute_ctl: add vm-monitor (#4946 ) Co-authored-by: Em Sharnoff <sharnoff@neon.tech>	2023-08-24 15:54:37 -04:00
Anastasia Lubennikova	786c7b3708	Refactor remote extensions index download. Don't download ext_index.json from s3, but instead receive it as a part of spec from control plane. This eliminates s3 access for most compute starts, and also allows us to update extensions spec on the fly	2023-08-17 12:48:33 +03:00
Sasha Krassovsky	3a71cf38c1	Grant BypassRLS to new neon_superuser roles (#4935 )	2023-08-10 21:04:45 +02:00
Alek Westover	17aea78aa7	delete already present files from library index (#4955 )	2023-08-10 16:51:16 +03:00
Alek Westover	e157b16c24	if control file already exists ignore the remote version of the extension (#4945 )	2023-08-09 18:56:09 +00:00
bojanserafimov	94ad9204bb	Measure compute-pageserver latency (#4901 ) Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-08-09 13:20:30 -04:00
Anastasia Lubennikova	da128a509a	fix pkglibdir path for remote extensions	2023-08-09 19:13:11 +03:00
Anastasia Lubennikova	4ce7aa9ffe	Fix extensions download error handling (#4941 ) Don't panic if library or extension is not found in remote extension storage or download has failed. Instead, log the error and proceed - if file is not present locally as well, postgres will fail with postgres error. If it is a shared_preload_library, it won't start, because of bad config. Otherwise, it will just fail to run the SQL function/ command that needs the library. Also, don't try to download extensions if remote storage is not configured.	2023-08-09 15:37:51 +03:00
bojanserafimov	4ad0c8f960	compute_ctl: Prewarm before starting http server (#4867 )	2023-08-02 14:19:06 -04:00
Alek Westover	d005c77ea3	Tar Remote Extensions (#4715 ) Add infrastructure to dynamically load postgres extensions and shared libraries from remote extension storage. Before postgres start downloads list of available remote extensions and libraries, and also downloads 'shared_preload_libraries'. After postgres is running, 'compute_ctl' listens for HTTP requests to load files. Postgres has new GUC 'extension_server_port' to specify port on which 'compute_ctl' listens for requests. When PostgreSQL requests a file, 'compute_ctl' downloads it. See more details about feature design and remote extension storage layout in docs/rfcs/024-extension-loading.md --------- Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech> Co-authored-by: Alek Westover <alek.westover@gmail.com>	2023-08-02 12:38:12 +03:00
bojanserafimov	ddbe170454	Prewarm compute nodes (#4828 )	2023-07-31 14:13:32 -04:00

1 2 3 4

160 Commits