rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-05-24 00:20:37 +00:00

Author	SHA1	Message	Date
Christian Schwarz	d22e23f66d	Merge commit '108f7ec54' into problame/standby-horizon-leases	2025-08-06 17:55:56 +02:00
Christian Schwarz	54480167dc	Merge commit '9c0efba91' into problame/standby-horizon-leases	2025-08-06 17:55:48 +02:00
Christian Schwarz	30e7c4b75d	Merge commit '187170be4' into problame/standby-horizon-leases	2025-08-06 17:55:39 +02:00
Christian Schwarz	35c916c062	Merge commit '5c934efb2' into problame/standby-horizon-leases	2025-08-06 17:50:33 +02:00
Christian Schwarz	02e1aeef66	Merge commit 'a456e818a' into problame/standby-horizon-leases	2025-08-06 17:49:56 +02:00
Christian Schwarz	e2c88c1929	Merge commit '296c9190b' into problame/standby-horizon-leases	2025-08-06 17:49:50 +02:00
Christian Schwarz	553a120075	Merge commit '15f633922' into problame/standby-horizon-leases	2025-08-06 17:49:41 +02:00
Christian Schwarz	e2facbde4e	Merge commit 'cec0543b5' into problame/standby-horizon-leases	2025-08-06 17:47:10 +02:00
Christian Schwarz	b8c8168378	Merge commit 'be5bbaeca' into problame/standby-horizon-leases	2025-08-06 17:46:44 +02:00
Christian Schwarz	28a2cd05d5	Merge commit '5ec82105c' into problame/standby-horizon-leases	2025-08-06 17:46:37 +02:00
Christian Schwarz	1635390a96	fix all clippy complaints in this branch	2025-08-06 17:39:17 +02:00
Christian Schwarz	47146fe1d6	Merge commit '7049003cf' into problame/standby-horizon-leases	2025-08-06 17:17:11 +02:00
Christian Schwarz	2ee0f4271c	fix(page_service): lsn lease API puts tenant_shard_id in tenant_id tracing field The LSN lease api actually accepts a tenant_shard_id, not a tenant_id. But we put the Display of the tenant_shard_id into the tenant_id field. This PR fixes it. Refs - fixes https://databricks.atlassian.net/browse/LKB-2930	2025-08-05 22:48:27 +02:00
Christian Schwarz	8a9f1dd5e7	use tokio::time::Instant internally, chrono::DateTime<Utc> externally; communicate expiration through rfc3339 format; chrono::DateTime has good Debug fmt so this also serves observability; finish implementing release valve mechanism	2025-08-05 22:47:53 +02:00
Christian Schwarz	9f01840c18	use standby_horizon leases feature in the test, demonstrating that it passes now	2025-08-05 22:47:28 +02:00
Christian Schwarz	44466cebdb	WIP better observability for return values (SystemTime Debug is useless)	2025-08-05 22:46:54 +02:00
Christian Schwarz	b865e85de3	previous commit broke the tests because of the cfg business, see this commit's TODO	2025-08-05 22:46:24 +02:00
Christian Schwarz	73336962a8	finalize 3-stepped feature-gating (legacy,all,leases) + more tests + observability + fixes	2025-08-05 19:24:06 +02:00
Christian Schwarz	3365c8c648	enforce standby_horizon leases are always above applied_gc_cutoff (check against cutoff on upsert + block gc for lease length to allow renewals after attach)	2025-07-26 16:38:44 +02:00
Christian Schwarz	bc09df8823	add todo about init deadline	2025-07-26 16:23:59 +02:00
Christian Schwarz	e1eb98c0e9	add basic test & fix embarrassing bug in cull (needs comment out todo!())	2025-07-26 16:23:59 +02:00
Christian Schwarz	1e61ac6af2	cargo fmt (unrelated to prev commit)	2025-07-26 16:23:59 +02:00
Christian Schwarz	a948054db3	naming orthodoxy: always refer to leases as LSN leases	2025-07-26 16:23:59 +02:00
Christian Schwarz	2ee24900ca	have claude generate plumbing for standby_horizon_lease_length	2025-07-25 13:16:20 +02:00
Christian Schwarz	23d1029afd	explain why there's no need to check standby_horizon lease deadline for getpage requests	2025-07-25 09:30:27 +00:00
Folke Behrens	108f7ec544	Bump opentelemetry crates to 0.30 (#12680) This rebuilds #11552 on top of the current Cargo.lock. --------- Co-authored-by: Conrad Ludgate <conradludgate@gmail.com>	2025-07-22 16:05:35 +00:00
Alex Chi Z.	88391ce069	feat(pageserver): create image layers at L0-L1 boundary by default (#12669) ## Problem Post LKB-198 rollout. We added a new strategy to generate image layers at the L0-L1 boundary instead of the latest LSN to ensure too many L0 layers do not trigger image layer creation. ## Summary of changes We already rolled it out to all users so we can remove the feature flag now. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-07-22 14:29:26 +00:00
Vlad Lazar	d91d018afa	storcon: handle pageserver disk loss (#12667) NB: effectively a no-op in the neon env since the handling is config gated in storcon ## Problem When a pageserver suffers from a local disk/node failure and restarts, the storage controller will receive a re-attach call and return all the tenants the pageserver is supposed to attach, but the pageserver will not act on any tenants that it doesn't know about locally. As a result, the pageserver will not rehydrate any tenants from remote storage if it restarted following a local disk loss, while the storage controller still thinks that the pageserver has all the tenants attached. This leaves the system in a bad state, and the symptom is that PG's pageserver connections will fail with "tenant not found" errors. ## Summary of changes Made a slight change to the storage controller's `re_attach` API: * The pageserver will set an additional bit `empty_local_disk` in the reattach request, indicating whether it has started with an empty disk or does not know about any tenants. * Upon receiving the reattach request, if this `empty_local_disk` bit is set, the storage controller will go ahead and clear all observed locations referencing the pageserver. The reconciler will then discover the discrepancy between the intended state and observed state of the tenant and take care of the situation. To facilitate rollouts this extra behavior in the `re_attach` API is guarded by the `handle_ps_local_disk_loss` command line flag of the storage controller. --------- Co-authored-by: William Huang <william.huang@databricks.com>	2025-07-22 11:04:03 +00:00
Folke Behrens	9c0efba91e	Bump rand crate to 0.9 (#12674)	2025-07-22 09:31:39 +00:00
Vlad Lazar	30e1213141	pageserver: check env var for ip address before node registration (#12666) Include the ip address (optionally read from an env var) in the pageserver's registration request. Note that the ip address is ignored by the storage controller at the moment, which makes it a no-op in the neon env.	2025-07-21 15:32:28 +00:00
Erik Grinaker	5a48365fb9	pageserver/client_grpc: don't set stripe size for unsharded tenants (#12639) ## Problem We've had bugs where the compute would use the stale default stripe size from an unsharded tenant after the tenant split with a new stripe size. ## Summary of changes Never specify a stripe size for unsharded tenants, to guard against misuse. Only specify it once tenants are sharded and the stripe size can't change. Also opportunistically changes `GetPageSplitter` to return `anyhow::Result`, since we'll be using this in other code paths as well (specifically during server-side shard splits).	2025-07-21 12:28:39 +00:00
Erik Grinaker	194b9ffc41	pageserver: remove gRPC `CheckRelExists` (#12616) ## Problem Postgres will often immediately follow a relation existence check with a relation size query. This incurs two roundtrips, and may prevent effective caching. See [Slack thread](https://databricks.slack.com/archives/C091SDX74SC/p1751951732136139). Touches #11728. ## Summary of changes For the gRPC API: * Add an `allow_missing` parameter to `GetRelSize`, which returns `missing=true` instead of a `NotFound` error. * Remove `CheckRelExists`. There are no changes to libpq behavior.	2025-07-21 11:43:26 +00:00
Erik Grinaker	e181b996c3	utils: move `ShardStripeSize` into `shard` module (#12640) ## Problem `ShardStripeSize` will be used in the compute spec and internally in the communicator. It shouldn't require pulling in all of `pageserver_api`. ## Summary of changes Move `ShardStripeSize` into `utils::shard`, along with other basic shard types. Also remove the `Default` implementation, to discourage clients from falling back to a default (it's generally a footgun). The type is still re-exported from `pageserver_api::shard`, along with all the other shard types.	2025-07-21 10:56:20 +00:00
Erik Grinaker	1406bdc6a8	pageserver: improve gRPC cancellation (#12635) ## Problem The gRPC page service does not properly react to shutdown cancellation. In particular, Tonic considers an open GetPage stream to be an in-flight request, so it will wait for it to complete before shutting down. Touches [LKB-191](https://databricks.atlassian.net/browse/LKB-191). ## Summary of changes Properly react to the server's cancellation token and take out gate guards in gRPC request handlers. Also document cancellation handling. In particular, that Tonic will drop futures when clients go away (e.g. on timeout or shutdown), so the read path must be cancellation-safe. It is believed to be (modulo possible logging noise), but this will be verified later.	2025-07-21 10:52:18 +00:00
Christian Schwarz	b47d3900b9	observability and debugging facilities	2025-07-20 23:13:02 +00:00
Christian Schwarz	f4b38d5975	expand comment on why we normalize_lsn	2025-07-20 22:38:07 +00:00
Christian Schwarz	2a89f72389	rudimentary leases impl, lacks initial lease deadline stuff	2025-07-20 22:17:03 +00:00
Christian Schwarz	2b5bb850f2	WIP	2025-07-20 20:46:20 +00:00
Christian Schwarz	4dc3acf3ed	WIP: standby_horizon leases	2025-07-20 19:21:47 +00:00
HaoyuHuang	8f627ea0ab	A few more SC changes (#12649) ## Problem ## Summary of changes	2025-07-17 23:17:01 +00:00
Arpad Müller	6a353c33e3	print more timestamps in find_lsn_for_timestamp (#12641) Observability of `find_lsn_for_timestamp` is lacking, as well as how and when we update gc space and time cutoffs. Log them.	2025-07-17 22:13:21 +00:00
HaoyuHuang	b7fc5a2fe0	A few SC changes (#12615) ## Summary of changes A bunch of no-op changes. --------- Co-authored-by: Vlad Lazar <vlad@neon.tech>	2025-07-17 13:14:36 +00:00
Alex Chi Z.	f2828bbe19	fix(pageserver): skip gc-compaction for metadata key ranges (#12618) ## Problem part of https://github.com/neondatabase/neon/issues/11318; it is not entirely safe to run gc-compaction over the metadata key range due to tombstones and implications of image layers (missing key in image layer == key not exist). The auto gc-compaction trigger already skips metadata key ranges (see `schedule_auto_compaction` call in `trigger_auto_compaction`). In this patch we enforce it directly in gc_compact_inner so that compactions triggered via HTTP API will also be subject to this restriction. ## Summary of changes Ensure gc-compaction only runs on rel key ranges. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-07-16 21:52:18 +00:00
Aleksandr Sarantsev	1178f6fe7c	pageserver: Downgrade log level of 'No broker updates' (#12627) ## Problem The warning message was seen during deployment, but it's actually OK. ## Summary of changes - Treat `"No broker updates received for a while ..."` as an info message. Co-authored-by: Aleksandr Sarantsev <aleksandr.sarantsev@databricks.com>	2025-07-16 15:02:01 +00:00
Arpad Müller	5c934efb29	Don't depend on the postgres_ffi just for one type (#12610) We don't want to depend on postgres_ffi in an API crate. If there is no such dependency, we can compile stuff like `storcon_cli` without needing a full working postgres build. Fixes regression of #12548 (before we could compile it).	2025-07-15 17:28:08 +00:00
Heikki Linnakangas	5c9c3b3317	Misc cosmetic cleanups (#12598) - Remove a few obsolete "allowed error messages" from tests. The pageserver doesn't emit those messages anymore. - Remove misplaced and outdated docstring comment from `test_tenants.py`. A docstring is supposed to be the first thing in a function, but we had added some code before it. And it was outdated, as we haven't supported running without safekeepers for a long time. - Fix misc typos in comments - Remove obsolete comment about backwards compatibility with safekeepers without `TIMELINE_STATUS` API. All safekeepers have it by now.	2025-07-15 14:36:28 +00:00
Vlad Lazar	f8d3f86f58	pageserver: include records in get page debug handler (#12578) Include records and image in the debug get page handler. This endpoint does not update the metrics and does not support tracing. Note that this now returns individual bytes which need to be encoded properly for debugging. Co-authored-by: Haoyu Huang <haoyu.huang@databricks.com>	2025-07-14 16:37:28 +00:00
HaoyuHuang	f67a8a173e	A few SK changes (#12577) # TLDR This PR is a no-op. ## Problem When a SK loses a disk, it must recover all WALs from the very beginning. This may take days/weeks to catch up to the latest WALs for all timelines it owns. ## Summary of changes When SK starts up, if it finds that it has 0 timelines, - it will ask SC for the timeline it owns. - Then, pulls the timeline from its peer safekeepers to restore the WAL redundancy right away. After pulling timeline is complete, it will become active and accept new WALs. The current impl is a prototype. We can optimize the impl further, e.g., parallel pull timelines. --------- Co-authored-by: Haoyu Huang <haoyu.huang@databricks.com>	2025-07-14 16:37:04 +00:00
Erik Grinaker	eb830fa547	pageserver/client_grpc: use unbounded pools (#12585) ## Problem The communicator gRPC client currently uses bounded client/stream pools. This can artificially constrain clients, especially after we remove pipelining in #12584. [Benchmarks](https://github.com/neondatabase/neon/pull/12583) show that the cost of an idle server-side GetPage worker task is about 26 KB (2.5 GB for 100,000), so we can afford to scale out. In the worst case, we'll degenerate to the current libpq state with one stream per backend, but without the TCP connection overhead. In the common case we expect significantly lower stream counts due to stream sharing, driven e.g. by idle backends, LFC hits, read coalescing, sharding (backends typically only talk to one shard at a time), etc. Currently, Pageservers rarely serve more than 4000 backend connections, so we have at least 2 orders of magnitude of headroom. Touches #11735. Requires #12584. ## Summary of changes Remove the pool limits, and restructure the pools. We still keep a separate bulk pool for GetPage batches of >4 pages (>32 KB), with fewer streams per connection. This reduces TCP-level congestion and head-of-line blocking for non-bulk requests, and concentrates larger window sizes on a smaller set of streams/connections, presumably reducing memory usage. Apart from this, bulk requests don't have any latency penalty compared to other requests.	2025-07-14 13:22:38 +00:00
Erik Grinaker	a203f9829a	pageserver: add timeline_id span when freezing layers (#12572) ## Problem We don't log the timeline ID when rolling ephemeral layers during housekeeping. Resolves [LKB-179](https://databricks.atlassian.net/browse/LKB-179) ## Summary of changes Add a span with timeline ID when calling `maybe_freeze_ephemeral_layer` from the housekeeping loop. We don't instrument the function itself, since future callers may not have a span including the tenant_id already, but we don't want to duplicate the tenant_id for these spans.	2025-07-14 12:30:28 +00:00

3112 Commits