rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-05-17 13:10:38 +00:00

Author	SHA1	Message	Date
Japin Li	cdaed4d79c	Fix outdated comment (#8149 ) Commit `97b48c23f` changes the log wait timeout from 1 second to 100 milliseconds but forgets to update the comment.	2024-07-03 13:55:36 -04:00
Em Sharnoff	f86845f64b	compute_ctl: Auto-set dynamic_shared_memory_type (#7348 ) Part of neondatabase/cloud#12047. The basic idea is that for our VMs, we want to enable swap and disable Linux memory overcommit. Alongside these, we should set postgres' dynamic_shared_memory_type to mmap, but we want to avoid setting it to mmap if swap is not enabled. Implementing this in the control plane would be fiddly, but it's relatively straightforward to add to compute_ctl.	2024-04-10 13:13:48 +00:00
Arpad Müller	c0e0fc8151	Update Rust to 1.76.0 (#6683 ) [Release notes](https://github.com/rust-lang/rust/releases/tag/1.75.0).	2024-02-08 19:57:02 +01:00
Anastasia Lubennikova	e6e013b3b7	Fix pgbouncer settings update: - Start pgbouncer in VM from postgres user, to allow connection to pgbouncer admin console. - Remove unused compute_ctl options --pgbouncer-connstr and --pgbouncer-ini-path. - Fix and cleanup code of connection to pgbouncer, add retries because pgbouncer may not be instantly ready when compute_ctl starts.	2024-01-18 11:27:12 +00:00
Arthur Petukhovsky	97b48c23f8	Compact some compute_ctl logs (#6346 ) Print postgres roles in a single line and add some info.	2024-01-12 18:24:22 +00:00
Arthur Petukhovsky	71beabf82d	Join multiline postgres logs in compute_ctl (#5903 ) Postgres can write multiline logs, and they are difficult to handle after they are mixed with other logs. This PR combines multiline logs from postgres into a single line, where previous line breaks are replaced with unicode zero-width spaces. Then postgres logs are written to stderr with `PG:` prefix. It makes it easy to distinguish postgres logs from all other compute logs with a simple grep, e.g. `|= "PG:"`	2024-01-10 15:11:43 +00:00
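The joining step described in #5903 above can be sketched roughly as follows. This is a minimal illustration and not the code from the PR: the helper name `join_multiline` is hypothetical, and it assumes one complete log message is already available as a single string.

```rust
/// Zero-width space (U+200B), used in place of the original line breaks.
const ZWSP: char = '\u{200B}';

/// Hypothetical helper: collapse one multiline Postgres log message into a
/// single line and prefix it with `PG:` so it is easy to filter for.
fn join_multiline(message: &str) -> String {
    let joined: String = message
        .trim_end_matches('\n') // drop the trailing newline(s) of the message
        .chars()
        .map(|c| if c == '\n' { ZWSP } else { c })
        .collect();
    format!("PG:{joined}")
}
```

Because the zero-width space is invisible in most viewers, the joined line still reads naturally, while line-oriented tools see exactly one record per message.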
Anastasia Lubennikova	6e40900569	Manage pgbouncer configuration from compute_ctl: - add pgbouncer_settings section to compute spec; - add pgbouncer-connstr option to compute_ctl. - add pgbouncer-ini-path option to compute_ctl. Default: /etc/pgbouncer/pgbouncer.ini Apply pgbouncer config on compute start and respec to override default spec. Save pgbouncer config updates to pgbouncer.ini to preserve them across pgbouncer restarts.	2023-12-26 15:17:09 +00:00
Sasha Krassovsky	0ba4cae491	Fix RLS/REPLICATION granting (#6083 )	2023-12-08 12:55:44 -08:00
Konstantin Knizhnik	ad99fa5f03	Grant BYPASSRLS and REPLICATION to existing roles (#5657 ) ## Problem A role needs the REPLICATION privilege to be used for logical replication. New roles are created with this option. This PR tries to update existing roles. ## Summary of changes Update roles in the `handle_roles` method. Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2023-10-30 15:29:25 +00:00
Alexey Kondratov	0ca342260c	[compute_ctl+pgxn] Handle invalid databases after failed drop (#5561 ) ## Problem In `89275f6c1e` we fixed an issue where we dropped the db in Postgres even though the cplane request had failed. Yet it introduced a new problem: we now de-register the db in cplane even if we didn't actually drop it in Postgres. ## Summary of changes Here we revert the extension change, so we may again leave the db in an invalid state after a failed drop. Instead, `compute_ctl` is now responsible for cleaning up invalid databases during full configuration. Thus, there are two ways of recovering from a failed DROP DATABASE: 1. The user can just repeat DROP DATABASE, same as in vanilla Postgres. 2. If they didn't, then on the next full configuration (dbs / roles changes in the API; password reset; or data availability check) the invalid db will be cleaned up in Postgres and re-created by `compute_ctl`. So again it follows pretty much the same semantics as vanilla Postgres: you need to drop it again after a failed drop. That way, we have a recovery trajectory for both problems. See this commit for info about the `invalid` db state: `a4b4cc1d60` According to it: > An invalid database cannot be connected to anymore, but can still be dropped. While at it, this commit also fixes another issue, where `compute_ctl` was trying to connect to databases with `ALLOW CONNECTIONS false`. Now it will just skip them. Fixes #5435	2023-10-16 20:46:45 +02:00
arpad-m	982fce1e72	Fix rustdoc warnings and test cargo doc in CI (#4711 ) ## Problem `cargo +nightly doc` is giving a lot of warnings: broken links, naked URLs, etc. ## Summary of changes * update the `proc-macro2` dependency so that it can compile on latest Rust nightly, see https://github.com/dtolnay/proc-macro2/pull/391 and https://github.com/dtolnay/proc-macro2/issues/398 * allow the `private_intra_doc_links` lint, as linking to something that's private is always more useful than just mentioning it without a link: if the link breaks in the future, at least there is a warning due to that. Also, one might enable [`--document-private-items`](https://doc.rust-lang.org/cargo/commands/cargo-doc.html#documentation-options) in the future and make these links work in general. * fix all the remaining warnings given by `cargo +nightly doc` * make it possible to run `cargo doc` on stable Rust by updating `opentelemetry` and associated crates to version 0.19, pulling in a fix that previously broke `cargo doc` on stable: https://github.com/open-telemetry/opentelemetry-rust/pull/904 * Add `cargo doc` to CI to ensure that it won't get broken in the future. Fixes #2557 ## Future work * Potentially, it might make sense, for development purposes, to publish the generated rustdocs somewhere, like for example [how the rust compiler does it](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_driver/index.html). I will file an issue for discussion.	2023-07-15 05:11:25 +03:00
Alexey Kondratov	ed938885ff	[compute_ctl] Fix deletion of template databases (#4661 ) If database was created with `is_template true` Postgres doesn't allow dropping it right away and throws error ``` ERROR: cannot drop a template database ``` so we have to unset `is_template` first. Fixing it, I noticed that our `escape_literal` isn't exactly correct and following the same logic as in `quote_literal_internal`, we need to prepend string with `E`. Otherwise, it's not possible to filter `pg_database` using `escape_literal()` result if name contains `\`, for example. Also use `FORCE` to drop database even if there are active connections. We run this from `cloud_admin`, so it should have enough privileges. NB: there could be other db states, which prevent us from dropping the database. For example, if db is used by any active subscription or logical replication slot. TODO: deal with it once we allow logical replication. Proper fix should involve returning an error code to the control plane, so it could figure out that this is a non-retryable error, return it to the user and mark operation as permanently failed. Related to neondatabase/cloud#4258	2023-07-13 13:18:35 +02:00
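The escaping rule mentioned in #4661 above can be illustrated with a short sketch. It follows the logic the commit attributes to `quote_literal_internal` (double single quotes and backslashes, and prepend `E` when a backslash is present), but the function name `escape_literal_sketch` is hypothetical and this is not necessarily how `compute_ctl` implements `escape_literal`.

```rust
/// Hypothetical sketch of literal escaping: single quotes and backslashes are
/// doubled, and the `E` prefix is added only when the result contains a
/// backslash, so that Postgres parses it with escape-string syntax.
fn escape_literal_sketch(s: &str) -> String {
    let escaped = s.replace('\'', "''").replace('\\', "\\\\");
    if escaped.contains('\\') {
        format!("E'{escaped}'")
    } else {
        format!("'{escaped}'")
    }
}
```

With this rule a database name such as `a\b` becomes `E'a\\b'`, which matches the stored name when used to filter `pg_database`, while a name without backslashes keeps the plain `'...'` form.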
Joonas Koivunen	44e7d5132f	fix: hide token from logs (#4584 ) fixes #4583 and also changes all needlessly arg listing places to use `skip_all`.	2023-06-29 15:53:16 +03:00
Sasha Krassovsky	b1477b4448	Create neon_superuser role, grant it to roles created from control plane (#4425 ) ## Problem Currently, if a user creates a role, it won't by default have any grants applied to it. If the compute restarts, the grants get applied. This gives a very strange UX of being able to drop roles/not have any access to anything at first, and then once something triggers a config application, suddenly grants are applied. This removes these grants.	2023-06-24 01:38:27 +03:00
Heikki Linnakangas	df3bae2ce3	Use `compute_ctl` to manage Postgres in tests. (#3886 ) This adds test coverage for 'compute_ctl', as it is now used by all the python tests. There are a few differences in how 'compute_ctl' is called in the tests, compared to the real web console: - In the tests, the postgresql.conf file is included as one large string in the spec file, and it is written out as it is to the data directory. I added a new field for that to the spec file. The real web console, however, sets all the necessary settings in the 'settings' field, and 'compute_ctl' creates the postgresql.conf from those settings. - In the tests, the information needed to connect to the storage, i.e. tenant_id, timeline_id, connection strings to pageserver and safekeepers, are now passed as new fields in the spec file. The real web console includes them as the GUCs in the 'settings' field. (Both of these are different from what the test control plane used to do: It used to write the GUCs directly in the postgresql.conf file). The plan is to change the control plane to use the new method, and remove the old method, but for now, support both. Some tests that were sensitive to the amount of WAL generated needed small changes, to accommodate that compute_ctl runs the background health monitor which makes a few small updates. Also some tests shut down the pageserver, and now that the background health check can run some queries while the pageserver is down, that can produce a few extra errors in the logs, which needed to be allowlisted. Other changes: - remove obsolete comments about PostgresNode; - create standby.signal file for Static compute node; - log output of `compute_ctl` and `postgres` is merged into `endpoints/compute.log`. --------- Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech>	2023-06-06 14:59:36 +01:00
Sasha Krassovsky	6052ecee07	Add connector extension to send Role/Database updates to console (#3891 ) ## Describe your changes ## Issue ticket number and link ## Checklist before requesting a review - [x] I have performed a self-review of my code. - [x] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.	2023-05-25 12:36:57 +03:00
MMeent	e6ec2400fc	Enable hot standby PostgreSQL replicas. Notes: - This still needs UI support from the Console - I've not tuned any GUCs for PostgreSQL to make this work better - Safekeeper has gotten a tweak in which WAL is sent and how: It now sends zero-ed WAL data from the start of the timeline's first segment up to the first byte of the timeline to be compatible with normal PostgreSQL WAL streaming. - This includes the commits of #3714 Fixes one part of https://github.com/neondatabase/neon/issues/769 Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech>	2023-04-27 15:26:44 +02:00
Heikki Linnakangas	f0b2e076d9	Move compute_ctl structs used in HTTP API and spec file to separate crate. This is in preparation of using compute_ctl to launch postgres nodes in the neon_local control plane. And seems like a good idea to separate the public interfaces anyway. One non-mechanical change here is that the 'metrics' field is moved under the Mutex, instead of using atomics. We were not using atomics for performance but for convenience here, and it seems more clear to not use atomics in the model for the HTTP response type.	2023-04-09 21:52:28 +03:00
Alexey Kondratov	e42982fb1e	[compute_ctl] Empty computes and /configure API (#3963 ) This commit adds an option to start compute without a spec and then pass it a valid spec via the `POST /configure` API endpoint. This is the main prerequisite for maintaining a pool of compute nodes in the control-plane. For example: 1. Start compute with ```shell cargo run --bin compute_ctl -- -i no-compute \ -p http://localhost:9095 \ -D compute_pgdata \ -C "postgresql://cloud_admin@127.0.0.1:5434/postgres" \ -b ./pg_install/v15/bin/postgres ``` 2. Configure it with ```shell curl -d "{\"spec\": $(cat ./compute-spec.json)}" http://localhost:3080/configure ``` Internally, it's implemented using a `Condvar` + `Mutex`. The compute spec is moved under the Mutex, as it can now be updated in the http handler. Also `RwLock` was replaced with `Mutex` because the latter works well with `Condvar`. First part of neondatabase/cloud#4433	2023-04-06 21:21:58 +02:00
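The `Condvar` + `Mutex` pattern named in #3963 above might look roughly like the sketch below. All names here (`SharedSpec`, `configure`, `wait_for_spec`, `demo`) are hypothetical, the spec is reduced to a plain `String`, and a spawned thread stands in for the HTTP handler; the real `compute_ctl` types are not shown in this log.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical shared state: the spec stays `None` until it is configured.
struct SharedSpec {
    spec: Mutex<Option<String>>,
    spec_changed: Condvar,
}

impl SharedSpec {
    fn new() -> Self {
        SharedSpec { spec: Mutex::new(None), spec_changed: Condvar::new() }
    }

    /// What a `/configure`-style handler would do: store the spec, wake waiters.
    fn configure(&self, spec: String) {
        let mut guard = self.spec.lock().unwrap();
        *guard = Some(spec);
        self.spec_changed.notify_all();
    }

    /// Block the caller until a spec has been provided, then return a copy.
    fn wait_for_spec(&self) -> String {
        let guard = self.spec.lock().unwrap();
        // `wait_while` releases the lock while sleeping and re-checks the
        // condition on every wakeup, which also covers spurious wakeups.
        let guard = self.spec_changed.wait_while(guard, |s| s.is_none()).unwrap();
        guard.clone().expect("wait_while returns only once the spec is set")
    }
}

/// Usage sketch: one thread plays the handler while the caller waits.
fn demo() -> String {
    let shared = Arc::new(SharedSpec::new());
    let handler = Arc::clone(&shared);
    let t = thread::spawn(move || handler.configure(String::from("{\"cluster\": {}}")));
    let spec = shared.wait_for_spec();
    t.join().unwrap();
    spec
}
```

A `Condvar` can only wait on a `MutexGuard`, which is consistent with the commit's remark that `Mutex` was chosen over `RwLock`.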
Heikki Linnakangas	5a123b56e5	Remove obsolete hack to rename neon-specific GUCs. I checked the console database, we don't have any of these left in production.	2023-03-28 17:57:22 +03:00
Heikki Linnakangas	d1537a49fa	Fix escaping in postgresql.conf that we generate at compute startup If there are any config options that contain single quotes or backslashes, they need to be escaped	2023-03-10 14:59:21 +02:00
Heikki Linnakangas	856d01ff68	Add newline at end of postgresql.conf	2023-03-10 14:59:21 +02:00
Alexey Kondratov	e43c413a3f	[compute_tools] Add /insights endpoint to compute_ctl (#3704 ) This commit adds a basic HTTP API endpoint that allows scraping the `pg_stat_statements` data and getting a list of slow queries. New insights like cache hit rate and so on could be added later. Extension `pg_stat_statements` is checked / created only if compute tries to load the corresponding shared library. The latter is configured by control-plane and currently covered with feature flag. Co-authored by Eduard Dyckman (bird.duskpoet@gmail.com)	2023-03-09 14:21:10 +01:00
Alexey Kondratov	20b1e26e74	[compute_ctl] Make role deletion spec processing idempotent (#3380 ) Previously, we were trying to re-assign owned objects of an already deleted role. This was causing a crash loop when compute was restarted with a spec that includes a delta operation for role deletion. To avoid such cases, check that the role is still present before calling `reassign_owned_objects`. Resolves neondatabase/cloud#3553	2023-01-20 15:37:24 +01:00
Heikki Linnakangas	e5cc2f92c4	Switch to 'tracing' for logging, restructure code to make use of spans. Refactors Compute::prepare_and_run. It's split into subroutines differently, to make it easier to attach tracing spans to the different stages. The high-level logic for waiting for Postgres to exit is moved to the caller. Replace 'env_logger' with 'tracing', and add `#instrument` directives to different stages of the startup process. This is a fairly mechanical change, except for the changes in 'spec.rs'. 'spec.rs' contained some complicated formatting, where parts of log messages were printed directly to stdout with `print`s. That was a bit messed up because the log normally goes to stderr, but those lines were printed to stdout. In our docker images, stderr and stdout both go to the same place so you wouldn't notice, but I don't think it was intentional. This changes the log format to the default 'tracing_subscriber::format' format. It's different from the Postgres log format, however, and because both compute_tools and Postgres print to the same log, it's now a mix of two different formats. I'm not sure how the Grafana log parsing pipeline can handle that. If it's a problem, we can build a custom formatter to change the compute_tools log format to be the same as Postgres's, like it was before this commit, or we can change the Postgres log format to match tracing_formatter's, or we can start printing compute_tool's log output to a different destination than Postgres.	2023-01-18 19:42:47 +02:00
Vadim Kharitonov	9b71215906	Simplify some functions in compute_tools and fix typo errors in func name	2022-12-22 15:05:43 +01:00
andres	1cf257bc4a	feedback	2022-11-08 20:15:54 +04:00
Alexey Kondratov	4d1e48f3b9	[compute_ctl] Use postgres::config to properly escape database names (#2652 ) We've got at least one user in production that cannot create a database with a trailing space in the name. This happens because we use `url` crate for manipulating the DATABASE_URL, but it follows a standard that doesn't fit really well with Postgres. For example, it trims all trailing spaces from the path: > Remove any leading and trailing C0 control or space from input. > See: https://url.spec.whatwg.org/#url-parsing But we used `set_path()` to set database name and it's totally valid to have trailing spaces in the database name in Postgres. Thus, use `postgres::config::Config` to modify database name in the connection details.	2022-10-19 19:20:06 +02:00
Joonas Koivunen	e8b195acb7	fix: apply notify workaround on m1 mac docker (#2564 ) workaround as discussed in the notify repository.	2022-10-06 11:13:40 +03:00
Heikki Linnakangas	9b9bbad462	Use 'notify' crate to wait for PostgreSQL startup. Compute node startup time is very important. After launching PostgreSQL, use 'notify' to be notified immediately when it has updated the PID file, instead of polling. The polling loop had 100 ms interval so this shaves up to 100 ms from the startup time.	2022-10-04 13:00:15 +03:00
Heikki Linnakangas	537b2c1ae6	Remove unnecessary check for open PostgreSQL TCP port. The loop checked if the TCP port is open for connections, by trying to connect to it. That seems unnecessary. By the time the postmaster.pid file says that it's ready, the port should be open. Remove that check.	2022-10-04 12:09:13 +03:00
MMeent	f99ccb5041	Extract WalProposer into the neon extension (#2217 ) Including, but not limited to: * Fixes to neon management code to support walproposer-as-an-extension * Fix issue in expected output of pg settings serialization. * Show the logs of a failed --sync-safekeepers process in CI * Add compat layer for renamed GUCs in postgres.conf * Update vendor/postgres to the latest origin/main	2022-08-18 17:12:28 +02:00
Alexey Kondratov	747d009bb4	Fix panic while waiting for Postgres readiness in the compute_ctl (#2021 ) We were reading Postgres pid file and looking for the 'ready' status, but it could be empty or we could not read it. So add all the checks.	2022-07-07 11:56:58 +02:00
Kirill Bulatov	6abdb12724	Fix 1.62 Clippy errors	2022-07-04 23:46:37 +03:00
Alexey Kondratov	772c2fb4ff	Report startup metrics and failure reason from compute_ctl (#1581 ) + neondatabase/cloud#1103 This adds a couple of control endpoints to simplify compute state discovery for control-plane. For example, now we may figure out that Postgres wasn't able to start or basebackup failed within seconds instead of just blindly polling the compute readiness for a minute or two. Also we now expose startup metrics (time of each step: basebackup, sync safekeepers, config, total). Console grabs them after each successful start and reports them as a histogram to prometheus and grafana. OpenAPI spec is added and up to date, but is not currently used in the console yet.	2022-05-18 13:03:29 +04:00
Stas Kelvich	389bd1faeb	Support for SCRAM-SHA-256 in compute tools	2022-04-18 22:19:01 +03:00
Kirill Bulatov	949f8b4633	Fix 1.59 rustc clippy warnings	2022-03-02 21:35:34 +02:00
Dmitry Ivanov	d3542c34f1	Refactoring: use anyhow::Context's methods where possible	2022-01-19 16:33:48 +03:00
Alexey Kondratov	f64074c609	Move compute_tools from console repo (zenithdb/console#383 ) Currently it's included with minimal changes and lives aside of the main workspace. Later we may re-use and combine common parts with zenith control_plane. This change is mostly needed to unify cloud deployment pipeline: 1.1. build compute-tools image 1.2. build compute-node image based on the freshly built compute-tools 2. build zenith image So we can roll new compute image and new storage required by it to operate properly. Also it becomes easier to test console against some specific version of compute-node/-tools.	2021-12-28 20:17:29 +03:00

39 Commits