rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-15 01:12:56 +00:00

Author	SHA1	Message	Date
Arpad Müller	552249607d	apply clippy fixes for 1.88.0 beta (#12331 ) The 1.88.0 stable release is near (this Thursday). We'd like to fix most warnings beforehand so that the compiler upgrade doesn't require approval from too many teams. This is therefore a preparation PR (like similar PRs before it). There are a lot of changes for this release, mostly because the `uninlined_format_args` lint has been added to the `style` lint group. One can read more about the lint [here](https://rust-lang.github.io/rust-clippy/master/#/uninlined_format_args). The PR is the result of `cargo +beta clippy --fix` and `cargo fmt`. One remaining warning is left for the proxy team. --------- Co-authored-by: Conrad Ludgate <conrad@neon.tech>	2025-06-24 10:12:42 +00:00
Mikhail	5c356c63eb	endpoint_storage compute_ctl integration (#11550 ) Add `/lfc/(prewarm\|offload)` routes to `compute_ctl` which interact with endpoint storage. Add `prewarm_lfc_on_startup` spec option which, if enabled, downloads LFC prewarm data on compute startup. Resolves: https://github.com/neondatabase/cloud/issues/26343	2025-05-06 22:02:12 +00:00
Arpad Müller	a22be5af72	Migrate the last crates to edition 2024 (#10998 ) Migrates the remaining crates to edition 2024. We like to stay on the latest edition if possible. There are no functional changes; however, some code changes had to be made to accommodate the edition's breaking changes. Like the previous migration PRs, this is comprised of three commits: * the first does the edition update and makes `cargo check`/`cargo clippy` pass. we had to update bindgen to make its output [satisfy the requirements of edition 2024](https://doc.rust-lang.org/edition-guide/rust-2024/unsafe-extern.html) * the second commit does a `cargo fmt` for the new style edition. * the third commit reorders imports as a one-off change. As before, it is entirely optional. Part of #10918	2025-02-27 09:40:40 +00:00
Heikki Linnakangas	2bbb4d3e1c	Remove misc unused code (#9014 )	2024-09-16 18:45:19 +00:00
Arpad Müller	cbcd4058ed	Fix 1.82 clippy lint too_long_first_doc_paragraph (#8941 ) Addresses the 1.82 beta clippy lint `too_long_first_doc_paragraph` by adding newlines to the first sentence if it is short enough, and making a short first sentence if there is the need.	2024-09-06 14:33:52 +02:00
Christian Schwarz	e26ef640c1	pageserver: remove `trace_read_requests` (#8338 ) `trace_read_requests` is a per `Tenant`-object option. But the `handle_pagerequests` loop doesn't know which `Tenant` object (i.e., which shard) the request is for. The remaining use of the `Tenant` object is to check `tenant.cancel`. That check is incorrect [if the pageserver hosts multiple shards](https://github.com/neondatabase/neon/issues/7427#issuecomment-2220577518). I'll fix that in a future PR where I completely eliminate the holding of `Tenant/Timeline` objects across requests. See [my code RFC](https://github.com/neondatabase/neon/pull/8286) for the high level idea. Note that we can always bring the tracing functionality if we need it. But since it's actually about logging the `page_service` wire bytes, it should be a `page_service`-level config option, not per-Tenant. And for enabling tracing on a single connection, we can implement a `set pageserver_trace_connection;` option.	2024-07-11 15:17:07 +02:00
John Spray	b6ec11ad78	control_plane: generalize attachment_service to handle sharding (#6251 ) ## Problem To test sharding, we need something to control it. We could write python code for doing this from the test runner, but this wouldn't be usable with neon_local run directly, and when we want to write tests with large number of shards/tenants, Rust is a better fit efficiently handling all the required state. This service enables automated tests to easily get a system with sharding/HA without the test itself having to set this all up by hand: existing tests can be run against sharded tenants just by setting a shard count when creating the tenant. ## Summary of changes Attachment service was previously a map of TenantId->TenantState, where the principal state stored for each tenant was the generation and the last attached pageserver. This enabled it to serve the re-attach and validate requests that the pageserver requires. In this PR, the scope of the service is extended substantially to do overall management of tenants in the pageserver, including tenant/timeline creation, live migration, evacuation of offline pageservers etc. This is done using synchronous code to make declarative changes to the tenant's intended state (`TenantState.policy` and `TenantState.intent`), which are then translated into calls into the pageserver by the `Reconciler`. Top level summary of modules within `control_plane/attachment_service/src`: - `tenant_state`: structure that represents one tenant shard. - `service`: implements the main high level such as tenant/timeline creation, marking a node offline, etc. - `scheduler`: for operations that need to pick a pageserver for a tenant, construct a scheduler and call into it. - `compute_hook`: receive notifications when a tenant shard is attached somewhere new. Once we have locations for all the shards in a tenant, emit an update to postgres configuration via the neon_local `LocalEnv`. - `http`: HTTP stubs. 
These mostly map to methods on `Service`, but are separated for readability and so that it'll be easier to adapt if/when we switch to another RPC layer. - `node`: structure that describes a pageserver node. The most important attribute of a node is its availability: marking a node offline causes tenant shards to reschedule away from it. This PR is a precursor to implementing the full sharding service for prod (#6342). What's the difference between this and a production-ready controller for pageservers? - JSON file persistence to be replaced with a database - Limited observability. - No concurrency limits. Marking a pageserver offline will try and migrate every tenant to a new pageserver concurrently, even if there are thousands. - Very simple scheduler that only knows to pick the pageserver with fewest tenants, and place secondary locations on a different pageserver than attached locations: it does not try to place shards for the same tenant on different pageservers. This matters little in tests, because picking the least-used pageserver usually results in round-robin placement. - Scheduler state is rebuilt exhaustively for each operation that requires a scheduler. - Relies on neon_local mechanisms for updating postgres: in production this would be something that flows through the real control plane. --------- Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2024-01-17 18:01:08 +00:00
Joonas Koivunen	4be6bc7251	refactor: remove unnecessary unsafe (#5802 ) unsafe impls for `Send` and `Sync` should not be added by default. in the case of `SlotGuard` removing them does not cause any issues, as the compiler automatically derives those. This PR adds requirement to document the unsafety (see [clippy::undocumented_unsafe_blocks]) and opportunistically adds `#![deny(unsafe_code)]` to most places where we don't have unsafe code right now. TRPL on Send and Sync: https://doc.rust-lang.org/book/ch16-04-extensible-concurrency-sync-and-send.html [clippy::undocumented_unsafe_blocks]: https://rust-lang.github.io/rust-clippy/master/#/undocumented_unsafe_blocks	2023-11-07 10:26:25 +00:00
duguorong009	b3d3a2587d	feat: improve the serde impl for several types(`Lsn`, `TenantId`, `TimelineId` ...) (#5335 ) Improve the serde impl for several types (`Lsn`, `TenantId`, `TimelineId`) by making them sensitive to `Serializer::is_human_readable` (true for json, false for bincode). Fixes #3511 by: - Implement the custom serde for `Lsn` - Implement the custom serde for `Id` - Add the helper module `serde_as_u64` in `libs/utils/src/lsn.rs` - Remove the unnecessary attr `#[serde_as(as = "DisplayFromStr")]` in all possible structs Additionally some safekeeper types gained serde tests. --------- Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-11-06 11:40:03 +02:00
duguorong009	25a37215f3	fix: replace all `std::PathBuf`s with `camino::Utf8PathBuf` (#5352 ) Fixes #4689 by replacing all of `std::Path` , `std::PathBuf` with `camino::Utf8Path`, `camino::Utf8PathBuf` in - pageserver - safekeeper - control_plane - libs/remote_storage Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-10-04 17:52:23 +03:00
Dmitry Rodionov	6d023484ed	Use mark file to allow for deletion operations to continue through restarts (#4552 ) ## Problem Currently we delete local files first, so if pageserver restarts after local files deletion then remote deletion is not continued. This can be solved with inversion of these steps. But even if these steps are inverted when index_part.json is deleted there is no way to distinguish between "this timeline is good, we just didn't upload it to remote" and "this timeline is deleted we should continue with removal of local state". So to solve it we use another mark file. After index part is deleted presence of this mark file identifies that it was a deletion intention. Alternative approach that was discussed was to delete all except metadata first, and then delete metadata and index part. In this case we still do not support local only configs making them rather unsafe (deletion in them is already unsafe, but this direction solidifies this direction instead of fixing it). Another downside is that if we crash after local metadata gets removed we may leave dangling index part on the remote which in theory shouldn't be a big deal because the file is small. It is not a big change to choose another approach at this point. ## Summary of changes Timeline deletion sequence: 1. Set deleted_at in remote index part. 2. Create local mark file. 3. Delete local files except metadata (it is simpler this way, to be able to reuse timeline initialization code that expects metadata) 4. Delete remote layers 5. Delete index part 6. Delete meta, timeline directory. 7. Delete mark file. This works for local only configuration without remote storage. Sequence is resumable from any point. 
resolves #4453 resolves https://github.com/neondatabase/neon/pull/4552 (the issue was created with async cancellation in mind, but we can still have issues with retries if metadata is deleted among the first by remove_dir_all (which doesn't have any ordering guarantees)) --------- Co-authored-by: Joonas Koivunen <joonas@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech>	2023-07-25 16:25:27 +03:00
Arthur Petukhovsky	b03143dfc8	Use serde_as DisplayFromStr everywhere (#4103 ) We used `display_serialize` previously, but it works only for Serialize. `DisplayFromStr` does the same, but also works for Deserialize.	2023-04-28 13:55:07 +03:00
Dmitry Rodionov	4d8c765485	remove redundant dyn (#3878 ) remove redundant dyn	2023-03-27 12:04:48 +03:00
Arseny Sher	2d42f84389	Add storage_broker binary. Which ought to replace etcd. This patch only adds the binary and adjusts Dockerfile to include it; subsequent ones will add deploy of helm chart and the actual replacement. It is a simple and fast pub-sub message bus. In this patch only safekeeper message is supported, but others can be easily added. Compilation now requires protoc to be installed. Installing protobuf-compiler package is fine for Debian/Ubuntu. ref https://github.com/neondatabase/neon/pull/2733 https://github.com/neondatabase/neon/issues/2394	2022-11-23 22:05:59 +04:00
bojanserafimov	7fd88fab59	Trace read requests (#2762 )	2022-11-10 16:43:04 -05:00
Arseny Sher	7480a0338a	Determine safekeeper for offloading WAL without etcd election API. This API is rather pointless, as sane choice anyway requires knowledge of peers status and leaders lifetime in any case can intersect, which is fine for us -- so manual elections are straightforward. Here, we deterministically choose among the reasonably caught up safekeepers, shifting by timeline id to spread the load. A step towards custom broker https://github.com/neondatabase/neon/issues/2394	2022-10-21 15:33:27 +03:00
Kirill Bulatov	b8eb908a3d	Rename old project name references	2022-09-14 08:14:05 +03:00
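
The resumable deletion order described in commit 6d023484ed can be illustrated with a small local-only sketch. This is not the pageserver's code: the function names, the `.deleted_mark` suffix, and the choice to place the mark file next to the timeline directory are all assumptions made for illustration, and the remote steps (setting `deleted_at`, deleting remote layers and the index part) are omitted.

```rust
use std::ffi::OsStr;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical mark-file location: a sibling of the timeline directory,
/// so that it outlives the directory it refers to.
fn mark_path(timeline_dir: &Path) -> PathBuf {
    let mut name = timeline_dir.as_os_str().to_owned();
    name.push(".deleted_mark");
    PathBuf::from(name)
}

/// Removes a file and treats "already gone" as success, so that a retry
/// passes over steps that finished before an interruption.
fn remove_if_exists(path: &Path) -> io::Result<()> {
    match fs::remove_file(path) {
        Err(e) if e.kind() != io::ErrorKind::NotFound => Err(e),
        _ => Ok(()),
    }
}

/// Local-only deletion that can be called again after a crash.
fn delete_timeline(timeline_dir: &Path) -> io::Result<()> {
    let mark = mark_path(timeline_dir);
    // Record the deletion intent before removing anything.
    fs::write(&mark, b"")?;
    if timeline_dir.exists() {
        // Remove everything except the metadata file first.
        for entry in fs::read_dir(timeline_dir)? {
            let path = entry?.path();
            if path.file_name() == Some(OsStr::new("metadata")) {
                continue;
            }
            if path.is_dir() {
                fs::remove_dir_all(&path)?;
            } else {
                remove_if_exists(&path)?;
            }
        }
        // Then the metadata file and the now-empty directory.
        remove_if_exists(&timeline_dir.join("metadata"))?;
        fs::remove_dir(timeline_dir)?;
    }
    // The mark file goes last: while it exists, the deletion is unfinished.
    remove_if_exists(&mark)
}

/// Startup check: a present mark file means a deletion was in progress.
/// Returns true if a pending deletion was found and completed.
fn resume_pending_deletion(timeline_dir: &Path) -> io::Result<bool> {
    if mark_path(timeline_dir).exists() {
        delete_timeline(timeline_dir)?;
        return Ok(true);
    }
    Ok(false)
}
```

Because every step tolerates already-removed files, calling `resume_pending_deletion` on startup finishes an interrupted deletion no matter where it stopped, and leaves timelines without a mark file untouched.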

17 Commits
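
The `uninlined_format_args` lint named in commit 552249607d flags format strings that pass a plain variable as a positional argument when it could be written inside the braces. A minimal before/after sketch follows; the function names and the summary format are made up for illustration and do not come from the repository.

```rust
/// Positional arguments: the style that `uninlined_format_args` flags.
fn summary_positional(sha: &str, pr: u32) -> String {
    format!("{} (#{})", sha, pr)
}

/// The same output with the variables inlined into the format string,
/// which is the form the lint suggests.
fn summary_inlined(sha: &str, pr: u32) -> String {
    format!("{sha} (#{pr})")
}
```

Both functions return the same string. Only plain identifiers can be captured this way; an expression such as a method call or field access still has to be passed as a separate argument.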