Commit Graph

8492 Commits

Author SHA1 Message Date
Christian Schwarz
d888835555 Merge commit 'ca8852165' into problame/standby-horizon-leases 2025-08-06 18:01:04 +02:00
Christian Schwarz
668e82d713 Merge commit '07c3cfd2a' into problame/standby-horizon-leases 2025-08-06 18:00:59 +02:00
Christian Schwarz
e0d2d293b1 Merge commit '7cd006621' into problame/standby-horizon-leases 2025-08-06 18:00:38 +02:00
Christian Schwarz
8799e87ae3 Merge commit '62d844e65' into problame/standby-horizon-leases 2025-08-06 18:00:09 +02:00
Christian Schwarz
737f5825bb Merge commit 'b623fbae0' into problame/standby-horizon-leases 2025-08-06 17:59:58 +02:00
Christian Schwarz
b95106cd79 Merge commit '5c57e8a11' into problame/standby-horizon-leases 2025-08-06 17:59:21 +02:00
Christian Schwarz
be1c1df6aa Merge commit '84a2556c9' into problame/standby-horizon-leases 2025-08-06 17:58:54 +02:00
Christian Schwarz
7d28fb118b Merge commit 'f85935446' into problame/standby-horizon-leases 2025-08-06 17:58:36 +02:00
Christian Schwarz
daf2b5a806 Merge commit 'b00a0096b' into problame/standby-horizon-leases 2025-08-06 17:56:37 +02:00
Christian Schwarz
e52d0ef311 Merge commit '5b0972151' into problame/standby-horizon-leases 2025-08-06 17:56:07 +02:00
Christian Schwarz
d22e23f66d Merge commit '108f7ec54' into problame/standby-horizon-leases 2025-08-06 17:55:56 +02:00
Christian Schwarz
54480167dc Merge commit '9c0efba91' into problame/standby-horizon-leases 2025-08-06 17:55:48 +02:00
Christian Schwarz
30e7c4b75d Merge commit '187170be4' into problame/standby-horizon-leases 2025-08-06 17:55:39 +02:00
Christian Schwarz
d380111428 Merge commit '87915df2f' into problame/standby-horizon-leases 2025-08-06 17:55:06 +02:00
Christian Schwarz
78a8ac7be9 ruff format 2025-08-06 17:54:36 +02:00
Christian Schwarz
279865c68a Merge commit 'dd7fff655' into problame/standby-horizon-leases 2025-08-06 17:54:17 +02:00
Christian Schwarz
1ace4bcf23 Merge commit '809633903' into problame/standby-horizon-leases 2025-08-06 17:50:43 +02:00
Christian Schwarz
35c916c062 Merge commit '5c934efb2' into problame/standby-horizon-leases 2025-08-06 17:50:33 +02:00
Christian Schwarz
02e1aeef66 Merge commit 'a456e818a' into problame/standby-horizon-leases 2025-08-06 17:49:56 +02:00
Christian Schwarz
e2c88c1929 Merge commit '296c9190b' into problame/standby-horizon-leases 2025-08-06 17:49:50 +02:00
Christian Schwarz
553a120075 Merge commit '15f633922' into problame/standby-horizon-leases 2025-08-06 17:49:41 +02:00
Christian Schwarz
cfe345d3e6 Merge commit 'c34d36d8a' into problame/standby-horizon-leases 2025-08-06 17:47:29 +02:00
Christian Schwarz
e2facbde4e Merge commit 'cec0543b5' into problame/standby-horizon-leases 2025-08-06 17:47:10 +02:00
Christian Schwarz
b8c8168378 Merge commit 'be5bbaeca' into problame/standby-horizon-leases 2025-08-06 17:46:44 +02:00
Christian Schwarz
28a2cd05d5 Merge commit '5ec82105c' into problame/standby-horizon-leases 2025-08-06 17:46:37 +02:00
Christian Schwarz
1635390a96 fix all clippy complaints in this branch 2025-08-06 17:39:17 +02:00
Christian Schwarz
1877b70a35 Merge commit 'e7d18bc18' into problame/standby-horizon-leases 2025-08-06 17:19:37 +02:00
Christian Schwarz
fb7a027211 Merge commit '4ee0da0a2' into problame/standby-horizon-leases 2025-08-06 17:17:45 +02:00
Christian Schwarz
47146fe1d6 Merge commit '7049003cf' into problame/standby-horizon-leases 2025-08-06 17:17:11 +02:00
Christian Schwarz
577eee16f9 https://github.com/neondatabase/neon/pull/12676#discussion_r2220512343; concern about backward compat of TimelineInfo 2025-08-05 23:07:26 +02:00
Christian Schwarz
2ee0f4271c fix(page_service): lsn lease API puts tenant_shard_id in tenant_id tracing field
The LSN lease API actually accepts a `tenant_shard_id`, not a `tenant_id`,
but we were putting the `Display` of the `tenant_shard_id` into the
`tenant_id` tracing field. This PR fixes that.

Refs
- fixes https://databricks.atlassian.net/browse/LKB-2930
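
For illustration, a minimal sketch of the kind of fix described here, with stand-in types and field names (the real `TenantShardId` and span setup differ):

```
use tracing::info_span;

// Stand-in for the pageserver's TenantShardId; fields are assumptions.
struct TenantShardId {
    tenant_id: String,
    shard_slug: String,
}

fn enter_lease_span(id: &TenantShardId) {
    // Record tenant and shard under accurately named fields instead of
    // formatting the whole shard id into `tenant_id`.
    let span = info_span!(
        "lsn_lease",
        tenant_id = %id.tenant_id,
        shard_id = %id.shard_slug,
    );
    let _guard = span.enter();
    // ... handle the lease request inside the span ...
}
```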
2025-08-05 22:48:27 +02:00
Christian Schwarz
8a9f1dd5e7 use tokio::time::Instant internally, chrono::DateTime<Utc> externally; communicate expiration through RFC 3339 format; chrono::DateTime has a good Debug fmt, so this also serves observability; finish implementing the release-valve mechanism 2025-08-05 22:47:53 +02:00
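
A hedged sketch of that internal/external split; the types and names below are illustrative, not this branch's actual code:

```
use chrono::{DateTime, Utc};
use tokio::time::Instant;

// Track the deadline on a monotonic clock internally, report it
// externally as wall-clock time serialized in RFC 3339.
struct Lease {
    valid_until: Instant,
}

impl Lease {
    fn expiration_rfc3339(&self, now: Instant, now_utc: DateTime<Utc>) -> String {
        // Map the monotonic deadline onto wall-clock time for reporting;
        // chrono's DateTime also has a readable Debug format.
        let remaining = self.valid_until.saturating_duration_since(now);
        (now_utc + chrono::Duration::from_std(remaining).unwrap()).to_rfc3339()
    }
}
```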
Christian Schwarz
9f01840c18 use standby_horizon leases feature in the test, demonstrating that it passes now 2025-08-05 22:47:28 +02:00
Christian Schwarz
44466cebdb WIP better observability for return values (SystemTime Debug is useless) 2025-08-05 22:46:54 +02:00
Christian Schwarz
b865e85de3 previous commit broke the tests because of the cfg business, see this commit's TODO 2025-08-05 22:46:24 +02:00
Christian Schwarz
73336962a8 finalize 3-stepped feature-gating (legacy,all,leases) + more tests + observability + fixes 2025-08-05 19:24:06 +02:00
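
For illustration, a hedged guess at what a 3-stepped gate could look like (the variant names come from this message; the type itself is invented):

```
// Which standby-horizon protection mechanism is honored; a sketch only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StandbyHorizonMode {
    Legacy, // legacy standby_horizon feedback only
    All,    // both the legacy mechanism and leases
    Leases, // leases only
}
```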
Christian Schwarz
fc7267a760 feature-gate compute side code 2025-08-05 19:22:58 +02:00
HaoyuHuang
ca88521653 Set neon_superuser privilege under lakebase mode (#12775)
2025-07-29 21:30:34 +00:00
Suhas Thalanki
07c3cfd2a0 [BRC-2905] Feed back PS-detected data corruption signals to SK and PG walproposer (#895) (#12748)

Data corruption is typically detected on the pageserver side when it
replays WAL records. However, since the PS doesn't synchronously replay
WAL records as they are ingested through safekeepers, we need some extra
plumbing to feed information about pageserver-detected corruption during
compaction (and/or WAL redo in general) back to the SK and PG for proper
action.

We don't yet know what actions PG/SK should take upon receiving the
signal, but we should have the detection and feedback in place.

Add an extra `corruption_detected` field to the `PageserverFeedback`
message that is sent from PS -> SK -> PG. It's a boolean value that is
set to true when PS detects a "critical error" that signals data
corruption, and it's sent in all `PageserverFeedback` messages. Upon
receiving this signal, the safekeeper raises a
`safekeeper_ps_corruption_detected` gauge metric (value set to 1). The
safekeeper then forwards this signal to PG where a
`ps_corruption_detected` gauge metric (value also set to 1) is raised in
the `neon_perf_counters` view.

Added an integration test,
`test_compaction.py::test_ps_corruption_detection_feedback`, which
confirms that the safekeeper and PG can receive the data corruption
signal in the `PageserverFeedback` message during a simulated data
corruption.
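
As a hedged sketch of the safekeeper-side plumbing (the metric name is from this message; the prometheus usage and surrounding types are assumptions):

```
use once_cell::sync::Lazy;
use prometheus::{register_int_gauge, IntGauge};

// Latched gauge raised once any PageserverFeedback reports corruption.
static PS_CORRUPTION_DETECTED: Lazy<IntGauge> = Lazy::new(|| {
    register_int_gauge!(
        "safekeeper_ps_corruption_detected",
        "1 once a pageserver has reported corruption via PageserverFeedback"
    )
    .unwrap()
});

// Stand-in for the feedback message; only the new field is shown.
struct PageserverFeedback {
    corruption_detected: bool,
}

fn on_pageserver_feedback(fb: &PageserverFeedback) {
    if fb.corruption_detected {
        // Raise the gauge; the signal is also forwarded onward to PG.
        PS_CORRUPTION_DETECTED.set(1);
    }
}
```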

---------

Co-authored-by: William Huang <william.huang@databricks.com>
2025-07-29 20:40:07 +00:00
Erik Grinaker
7cd0066212 page_api: add SplitError for GetPageSplitter (#12709)
Add a `SplitError` for `GetPageSplitter`, with an `Into<tonic::Status>`
implementation. This avoids a bunch of boilerplate to convert
`GetPageSplitter` errors into `tonic::Status`.

Requires #12702.
Touches [LKB-191](https://databricks.atlassian.net/browse/LKB-191).
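
A hedged sketch of the shape of such a conversion; the variant names are invented, only the `From`-based `Into<tonic::Status>` pattern is the point:

```
use tonic::Status;

// Illustrative error type; the real SplitError's variants differ.
#[derive(Debug)]
enum SplitError {
    UnknownShard(String),
    EmptyBatch,
}

impl From<SplitError> for Status {
    fn from(err: SplitError) -> Status {
        match err {
            SplitError::UnknownShard(key) => {
                Status::not_found(format!("no shard for key {key}"))
            }
            SplitError::EmptyBatch => Status::invalid_argument("empty GetPage batch"),
        }
    }
}
```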
2025-07-29 18:26:20 +00:00
Suhas Thalanki
bf3a1529bf Report metrics on data/index corruption (#12729)
## Problem

We don't have visibility into data/index corruption.

## Summary of changes
Add data/index corruption metrics.

PG calls `elog` at ERROR level with a corruption errcode to emit these errors.

PG Changes: https://github.com/neondatabase/postgres/pull/698
2025-07-29 18:08:24 +00:00
Erik Grinaker
65d1be6e90 pageserver: route gRPC requests to child shards (#12702)
## Problem

During shard splits, each parent shard is split and removed
incrementally. Only when all parent shards have split is the split
committed and the compute notified. This can take several minutes for
large tenants. In the meanwhile, the compute will be sending requests to
the (now-removed) parent shards.

This was (mostly) not a problem for the libpq protocol, because it does
shard routing on the server side. The compute just sends requests to
some Pageserver, and the server will figure out which local shard should
serve it.

It is a problem for the gRPC protocol, where the client explicitly says
which shard it's talking to.

Touches [LKB-191](https://databricks.atlassian.net/browse/LKB-191).
Requires #12772.

## Summary of changes

* Add server-side routing of gRPC requests to any local child shards if
the parent does not exist.
* Add server-side splitting of GetPage batch requests straddling
multiple child shards.
* Move the `GetPageSplitter` into `pageserver_page_api`.

I really don't like this approach, but it avoids making changes to the
split protocol. I could be convinced we should change the split protocol
instead, e.g. to keep the parent shard alive until the split commits and
the compute has been notified, but we can also do that as a later change
without blocking the communicator on it.
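
To make the fallback concrete, a minimal sketch with invented types; note the real pageserver maps keys to shards by striping, not the simple modulo used here:

```
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct ShardIndex {
    number: u8,
    count: u8,
}

struct LocalShards {
    // The String value is a stand-in for a real shard handle.
    shards: HashMap<ShardIndex, String>,
}

impl LocalShards {
    fn resolve(&self, requested: ShardIndex, key_hash: u32) -> Option<&String> {
        // Fast path: the shard the client addressed still exists locally.
        if let Some(shard) = self.shards.get(&requested) {
            return Some(shard);
        }
        // Fallback: mid-split the parent is gone, so route to whichever
        // local child shard (at a higher shard count) now owns this key.
        self.shards.iter().find_map(|(idx, shard)| {
            let owns_key = key_hash % u32::from(idx.count) == u32::from(idx.number);
            (idx.count > requested.count && owns_key).then_some(shard)
        })
    }
}
```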
2025-07-29 16:28:57 +00:00
Suhas Thalanki
16eb8dda3d some compute ctl changes from hadron (#12760)
2025-07-29 16:01:56 +00:00
Heikki Linnakangas
bb32f1b3d0 Move 'criterion' to a dev-dependency (#12762)
It is only used in micro-benchmarks.
2025-07-29 15:35:00 +00:00
a-masterov
5585c32cee Disable autovacuum while running pg_repack test (#12755)
## Problem
Sometimes the `pg_repack` regression test fails due to an extra line
in the output; the most probable cause is autovacuum.
https://databricks.atlassian.net/browse/LKB-2637
## Summary of changes
Autovacuum is disabled during the test.

Co-authored-by: Alexey Masterov <alexey.masterov@databricks.com>
2025-07-29 15:34:02 +00:00
Krzysztof Szafrański
0ffdc98e20 [proxy] Classify "database not found" errors as user errors (#12603)
## Problem

If a user provides a wrong database name in the connection string, it
should be logged as a user error, not a postgres error.

I found 4 different places where we log such errors:
1. `proxy/src/stream.rs:193`, e.g.:
```
{"timestamp":"2025-07-15T11:33:35.660026Z","level":"INFO","message":"forwarding error to user","fields":{"kind":"postgres","msg":"database \"[redacted]\" does not exist"},"spans":{"connect_request#9":{"protocol":"tcp","session_id":"ce1f2c90-dfb5-44f7-b9e9-8b8535e8b9b8","conn_info":"[redacted]","ep":"[redacted]","role":"[redacted]"}},"thread_id":22,"task_id":"370407867","target":"proxy::stream","src":"proxy/src/stream.rs:193","extract":{"ep":"[redacted]","session_id":"ce1f2c90-dfb5-44f7-b9e9-8b8535e8b9b8"}}
```
2. `proxy/src/pglb/mod.rs:137`, e.g.:
```
{"timestamp":"2025-07-15T11:37:44.340497Z","level":"WARN","message":"per-client task finished with an error: Couldn't connect to compute node: db error: FATAL: database \"[redacted]\" does not exist","spans":{"connect_request#8":{"protocol":"tcp","session_id":"763baaac-d039-4f4d-9446-c149e32660eb","conn_info":"[redacted]","ep":"[redacted]","role":"[redacted]"}},"thread_id":14,"task_id":"866658139","target":"proxy::pglb","src":"proxy/src/pglb/mod.rs:137","extract":{"ep":"[redacted]","session_id":"763baaac-d039-4f4d-9446-c149e32660eb"}}
```
3. `proxy/src/serverless/mod.rs:451`, e.g. (note that the error is
repeated 4 times — retries?):
```
{"timestamp":"2025-07-15T11:37:54.515891Z","level":"WARN","message":"error in websocket connection: Couldn't connect to compute node: db error: FATAL: database \"[redacted]\" does not exist: Couldn't connect to compute node: db error: FATAL: database \"[redacted]\" does not exist: db error: FATAL: database \"[redacted]\" does not exist: FATAL: database \"[redacted]\" does not exist","spans":{"http_conn#8":{"conn_id":"ec7780db-a145-4f0e-90df-0ba35f41b828"},"connect_request#9":{"protocol":"ws","session_id":"1eaaeeec-b671-4153-b1f4-247839e4b1c7","conn_info":"[redacted]","ep":"[redacted]","role":"[redacted]"}},"thread_id":10,"task_id":"366331699","target":"proxy::serverless","src":"proxy/src/serverless/mod.rs:451","extract":{"conn_id":"ec7780db-a145-4f0e-90df-0ba35f41b828","ep":"[redacted]","session_id":"1eaaeeec-b671-4153-b1f4-247839e4b1c7"}}
```
4. `proxy/src/serverless/sql_over_http.rs:219`, e.g.
```
{"timestamp":"2025-07-15T10:32:34.866603Z","level":"INFO","message":"forwarding error to user","fields":{"kind":"postgres","error":"could not connect to postgres in compute","msg":"database \"[redacted]\" does not exist"},"spans":{"http_conn#19":{"conn_id":"7da08203-5dab-45e8-809f-503c9019ec6b"},"connect_request#5":{"protocol":"http","session_id":"68387f1c-cbc8-45b3-a7db-8bb1c55ca809","conn_info":"[redacted]","ep":"[redacted]","role":"[redacted]"}},"thread_id":17,"task_id":"16432250","target":"proxy::serverless::sql_over_http","src":"proxy/src/serverless/sql_over_http.rs:219","extract":{"conn_id":"7da08203-5dab-45e8-809f-503c9019ec6b","ep":"[redacted]","session_id":"68387f1c-cbc8-45b3-a7db-8bb1c55ca809"}}
```

This PR directly addresses 1 and 4. I _think_ it _should_ also help with
2 and 3, although in those places we don't seem to log `kind`, so I'm
not quite sure. I'm also confused why in 3 the error is repeated
multiple times.
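
A hedged sketch of the classification itself, using a hypothetical helper keyed on SQLSTATE (3D000, `invalid_catalog_name`, is Postgres's code for "database does not exist"):

```
#[derive(Debug, PartialEq)]
enum ErrorKind {
    User,
    Postgres,
}

// Hypothetical helper: a wrong database name comes from the user's
// connection string, so its SQLSTATE is classified as a user error.
fn classify_db_error(sqlstate: &str) -> ErrorKind {
    match sqlstate {
        "3D000" => ErrorKind::User, // invalid_catalog_name
        _ => ErrorKind::Postgres,
    }
}
```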

## Summary of changes

Resolves https://github.com/neondatabase/neon/issues/9440
2025-07-29 15:25:22 +00:00
HaoyuHuang
62d844e657 Add changes in spec apply (#12759)
## Problem
All changes are no-op. 

2025-07-29 15:22:04 +00:00
Alex Chi Z.
1bb434ab74 fix(test): test_readonly_node_gc compute needs time to acquire lease (#12747)
## Problem

Part of LKB-2368. Compute fails to obtain an LSN lease in this test case.
There are many assumptions about how compute obtains leases, and in this
particular test case the LSN lease length is only 8s, which is shorter
than the time it takes for the pageserver to restart and the compute to
reconnect under a force stop, so it sometimes causes issues.

## Summary of changes

Add more sleeps around the test case to ensure it's at least stable. We
need to find a more reliable way to test this in the future.

---------

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-07-29 14:23:42 +00:00
Alex Chi Z.
dbde37c53a fix(safekeeper): retry if opening a segment fails (#12757)
## Problem

Fix LKB-2632.

The safekeeper WAL read path does not seem to retry at all. This would
cause client read errors on the customer side.

## Summary of changes

- Retry on `safekeeper::wal_backup::read_object` (a sketch of the retry
shape follows below).
- Note that this only retries S3 HTTP connection errors. Subsequent
reads could still fail; making the retry mechanism work across the whole
path needs more refactoring.
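
A minimal sketch of such a retry loop, with a hypothetical `read_object` stub and only connection-level errors treated as retryable:

```
use std::io;
use std::time::Duration;

// Stub standing in for safekeeper::wal_backup::read_object.
async fn read_object(_path: &str) -> io::Result<Vec<u8>> {
    Ok(Vec::new())
}

fn is_connection_error(e: &io::Error) -> bool {
    matches!(
        e.kind(),
        io::ErrorKind::ConnectionReset | io::ErrorKind::TimedOut
    )
}

async fn read_object_with_retries(path: &str) -> io::Result<Vec<u8>> {
    let mut backoff = Duration::from_millis(100);
    for attempt in 0.. {
        match read_object(path).await {
            Ok(buf) => return Ok(buf),
            // Retry transient connection errors with exponential backoff,
            // up to a fixed attempt budget; other errors surface at once.
            Err(e) if is_connection_error(&e) && attempt < 5 => {
                tokio::time::sleep(backoff).await;
                backoff *= 2;
            }
            Err(e) => return Err(e),
        }
    }
    unreachable!("loop exits via return")
}
```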

---------

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-07-29 14:20:43 +00:00
Heikki Linnakangas
5e3cb2ab07 Refactor LFC stats functions (#12696)
Split the functions into two parts: an internal function in file_cache.c
which returns an array of structs representing the result set, and
another function in neon.c with the glue code to expose it as a SQL
function. This is in preparation for the new communicator, which needs
to implement the same SQL functions, but getting the information from a
different place.

In the glue code, use the more modern Postgres way of building a result
set using a tuplestore.
2025-07-29 13:12:44 +00:00