Heikki Linnakangas
e2bad5d9e9
Add debugging HTTP endpoint for dumping the cache tree
2025-05-12 22:54:03 +03:00
Heikki Linnakangas
5623e4665b
bunch of fixes
2025-05-12 18:40:54 +03:00
Heikki Linnakangas
8abb4dab6d
implement shrinking nodes
2025-05-12 03:57:10 +03:00
Heikki Linnakangas
731667ac37
better metrics of the art tree
2025-05-12 02:08:51 +03:00
Heikki Linnakangas
6a1374d106
Pack tree node structs more tightly, avoiding alignment padding
2025-05-12 01:01:58 +03:00
Heikki Linnakangas
f7c908f2f0
more metrics
2025-05-12 01:01:50 +03:00
Heikki Linnakangas
86671e3a0b
Add a bunch of metric counters
2025-05-11 20:11:13 +03:00
Heikki Linnakangas
319cd74f73
Fix eviction
2025-05-11 19:34:50 +03:00
Heikki Linnakangas
0efefbf77c
Add a few metrics, fix page eviction
2025-05-10 03:13:28 +03:00
Heikki Linnakangas
e6a4171fa1
fix concurrency issues with the LFC
...
- Add another locking hash table to track which cached pages are currently being
modified, by smgrwrite() or smgrread() or by prefetch.
- Use single-value Leaf pages in the art tree. That seems simpler after all,
and it eliminates some corner cases where a Value needed to be cloned, which
made it tricky to use atomics or other interior mutability on the Values.
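
A minimal sketch of the "locking hash table" idea from the first bullet, using only the Rust standard library. The names `InProgressTable`, `PageGuard`, `lock_page`, `try_lock_page` and the `PageKey` layout are hypothetical and chosen for illustration; the actual data structure in this commit is not shown in the log and may look quite different (for example, it may live in shared memory).

```rust
use std::collections::HashSet;
use std::sync::{Condvar, Mutex};

/// Hypothetical page identifier: (relation number, block number).
type PageKey = (u32, u32);

/// Tracks which cached pages are currently being modified.
#[derive(Default)]
struct InProgressTable {
    busy: Mutex<HashSet<PageKey>>,
    released: Condvar,
}

/// Held while a page is being modified; dropping it releases the page.
struct PageGuard<'a> {
    table: &'a InProgressTable,
    key: PageKey,
}

impl InProgressTable {
    /// Blocks until nobody else is modifying `key`, then marks it busy.
    fn lock_page(&self, key: PageKey) -> PageGuard<'_> {
        let mut busy = self.busy.lock().unwrap();
        while busy.contains(&key) {
            busy = self.released.wait(busy).unwrap();
        }
        busy.insert(key);
        PageGuard { table: self, key }
    }

    /// Non-blocking variant: returns None if the page is already busy.
    fn try_lock_page(&self, key: PageKey) -> Option<PageGuard<'_>> {
        let mut busy = self.busy.lock().unwrap();
        if busy.insert(key) {
            Some(PageGuard { table: self, key })
        } else {
            None
        }
    }
}

impl Drop for PageGuard<'_> {
    fn drop(&mut self) {
        // Remove the entry first, then wake any waiters in lock_page().
        self.table.busy.lock().unwrap().remove(&self.key);
        self.table.released.notify_all();
    }
}

fn main() {
    let table = InProgressTable::default();
    let guard = table.lock_page((1, 42));
    // A second writer on the same page is refused while the guard is alive.
    assert!(table.try_lock_page((1, 42)).is_none());
    // A different page is unaffected.
    assert!(table.try_lock_page((1, 43)).is_some());
    drop(guard);
    assert!(table.try_lock_page((1, 42)).is_some());
}
```

In this sketch a write, read or prefetch would hold a `PageGuard` for the duration of the modification, so two backends never modify the same cached page at once.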
2025-05-10 02:36:48 +03:00
Heikki Linnakangas
0c25ea9e31
reduce LOG noise
2025-05-09 18:27:36 +03:00
Heikki Linnakangas
6692321026
Remove dependency on io_uring, use plain std::fs ops instead
...
io_uring is a great idea in the long term, but for now, let's make it
easier to develop locally on macOS, where io_uring is not available.
2025-05-06 17:46:21 +03:00
Heikki Linnakangas
791df28755
Linked list fix and add unit test
2025-05-06 16:46:54 +03:00
Heikki Linnakangas
d20da994f4
git add missing file
2025-05-06 15:36:48 +03:00
Heikki Linnakangas
6dbbdaae73
run 'cargo fmt'
2025-05-06 15:35:56 +03:00
Heikki Linnakangas
977bc09d2a
Bunch of fixes, smarter iterator, metrics exporter
2025-05-06 15:28:50 +03:00
Heikki Linnakangas
44269fcd5e
Implement simple eviction and free block tracking
2025-05-06 15:28:15 +03:00
Heikki Linnakangas
44cc648dc8
Implement iterator over keys
...
the implementation is not very optimized, but probably good enough for an MVP
2025-05-06 15:27:38 +03:00
Heikki Linnakangas
884e028a4a
implement deletion in art tree
2025-05-06 15:27:38 +03:00
Heikki Linnakangas
42df3e5453
debugging stats
2025-05-06 15:27:38 +03:00
Heikki Linnakangas
fc743e284f
more work on allocators
2025-05-06 15:27:38 +03:00
Heikki Linnakangas
d02f9a2139
Collect garbage, handle OOMs
2025-05-06 15:27:38 +03:00
Heikki Linnakangas
083118e98e
Implement epoch system
2025-05-06 15:27:38 +03:00
Heikki Linnakangas
54cd2272f1
more memory allocation stuff
2025-05-06 15:27:38 +03:00
Heikki Linnakangas
e40193e3c8
simple block-based allocator
2025-05-06 15:27:38 +03:00
Heikki Linnakangas
ce9f7bacc1
Fix communicator client for recent changes in protocol and client code
2025-05-06 15:26:51 +03:00
Heikki Linnakangas
b7891f8fe8
Include 'neon-shard-id' header in client requests
2025-05-06 15:23:30 +03:00
Elizabeth Murray
5f2adaa9ad
Remove some additional debug info messages.
2025-05-02 10:50:53 -07:00
Elizabeth Murray
3e5e396c8d
Remove some debug info messages.
2025-05-02 10:24:18 -07:00
Elizabeth Murray
9d781c6fda
Add a connection pool module to the grpc client.
2025-05-02 10:22:33 -07:00
Erik Grinaker
cf5d038472
service documentation
2025-05-02 15:20:12 +02:00
Erik Grinaker
d785100c02
page_api: add GetPageRequest::class
2025-05-02 10:48:32 +02:00
Erik Grinaker
2c0d930e3d
page_api: add GetPageResponse::status
2025-04-30 16:48:45 +02:00
Erik Grinaker
66171a117b
page_api: add GetPageRequestBatch
2025-04-30 15:31:11 +02:00
Erik Grinaker
df2806e7a0
page_api: add GetPageRequest::id
2025-04-30 15:00:16 +02:00
Erik Grinaker
07631692db
page_api: protobuf comments
2025-04-30 12:36:11 +02:00
Erik Grinaker
4c77397943
Add neon-shard-id header
2025-04-30 11:18:06 +02:00
Erik Grinaker
7bb58be546
Use authorization header instead of neon-auth-token
2025-04-30 10:38:44 +02:00
Erik Grinaker
b5373de208
page_api: add get_slru_segment()
2025-04-29 17:59:27 +02:00
Erik Grinaker
b86c610f42
page_api: tweaks
2025-04-29 17:23:51 +02:00
Erik Grinaker
0f520d79ab
pageserver: rename data_api to page_api
2025-04-29 15:58:52 +02:00
Heikki Linnakangas
93eb7bb6b8
include lots of changes that went missing by accident
2025-04-29 15:32:27 +03:00
Heikki Linnakangas
e58d0fece1
New communicator, with "integrated" cache accessible from all processes
2025-04-29 11:52:44 +03:00
Alex Chi Z.
11f6044338
fix(pageserver): report synthetic size = 1 if all tls offloaded (2) (#11731)
...
## Problem
https://github.com/neondatabase/neon/pull/11648 did this for resident
size instead of synthetic size.
## Summary of changes
Report synthetic_size == 1 if all timelines are offloaded.
Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-04-28 13:45:45 +00:00
Konstantin Knizhnik
692c0f3fb8
Prepare to prewarm support (#11740)
...
## Problem
See
(original prewarm implementation)
https://github.com/neondatabase/neon/pull/9197
(functions for storing/restoring LFC state)
https://github.com/neondatabase/neon/pull/9587
(store prefetch results in LFC)
https://github.com/neondatabase/neon/pull/10442
## Summary of changes
Preparation for prewarm implementation.
---------
Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
2025-04-28 13:24:18 +00:00
Alexander Bayandin
2b1d2a55d6
CI: fix typo oicd -> oidc (#11747)
...
## Problem
It's OIDC (OpenID Connect), not OICD
## Summary of changes
- Rename actions input `aws-oicd-role-arn` -> `aws-oidc-role-arn`
2025-04-28 12:44:28 +00:00
Konstantin Knizhnik
60b9fb1baf
Ignore unlogged LSNs in set last written LSN (#11743)
...
## Problem
See https://github.com/neondatabase/neon/issues/11718
and https://neondb.slack.com/archives/C033RQ5SPDH/p1745122797538509
GiST and other indexes that perform an "unlogged build" use so-called fake
LSNs: not a real LSN, but something like 0/1. When stored in the LwLSN
cache, they cause incorrect lookups at the pageserver (PS).
## Summary of changes
Do not store fake LSNs in LwLSN hash.
Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
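
The filtering described in this commit could be sketched roughly as follows. This is an illustration only: `LastWrittenLsnCache`, `format_lsn` and the `FIRST_NORMAL_LSN` threshold are hypothetical names and values, and how the actual patch recognizes a fake LSN is not shown in this log.

```rust
use std::collections::HashMap;

/// An LSN is a 64-bit position; Postgres prints it as "hi/lo" in hex.
type Lsn = u64;

/// ASSUMPTION for illustration: any LSN below this threshold is "fake".
/// The real patch may use a different test.
const FIRST_NORMAL_LSN: Lsn = 1000;

/// Formats an LSN in the usual "hi/lo" hexadecimal notation.
fn format_lsn(lsn: Lsn) -> String {
    format!("{:X}/{:X}", lsn >> 32, lsn & 0xFFFF_FFFF)
}

/// Hypothetical last-written-LSN cache keyed by block number.
#[derive(Default)]
struct LastWrittenLsnCache {
    map: HashMap<u32, Lsn>,
}

impl LastWrittenLsnCache {
    /// Records `lsn` for `block` unless it is a fake LSN.
    /// Returns true if the LSN was accepted.
    fn set(&mut self, block: u32, lsn: Lsn) -> bool {
        if lsn < FIRST_NORMAL_LSN {
            return false; // fake LSN from an unlogged build: ignore it
        }
        let entry = self.map.entry(block).or_insert(lsn);
        if lsn > *entry {
            *entry = lsn; // keep only the newest LSN per block
        }
        true
    }

    fn get(&self, block: u32) -> Option<Lsn> {
        self.map.get(&block).copied()
    }
}

fn main() {
    let mut cache = LastWrittenLsnCache::default();
    let fake: Lsn = 1;
    let real: Lsn = 0x0000_0001_6B37_4D48;
    println!("fake {} stored: {}", format_lsn(fake), cache.set(7, fake));
    println!("real {} stored: {}", format_lsn(real), cache.set(7, real));
    println!("cached for block 7: {:?}", cache.get(7).map(format_lsn));
}
```

With a filter like this, a later lookup never sees a value such as 0/1 and so never requests a page version at a bogus LSN.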
2025-04-28 12:16:29 +00:00
Erik Grinaker
606f14034e
pageserver: improve pageserver_smgr_query_seconds buckets (#11680)
...
## Problem
The `pageserver_smgr_query_seconds` buckets are too coarse, using powers
of 10: 1 µs, 10 µs, 100 µs, 1 ms, 10 ms, 100 ms, 1 s, 10 s, 100 s. This
is one of our most crucial latency metrics, and needs better resolution.
Touches #11594.
## Summary of changes
This patch uses buckets with better resolution around 1 ms (the typical
latency):
* 0.6 ms
* 1 ms
* 3 ms
* 6 ms
* 10 ms
* 30 ms
* 100 ms
* 1 s
* 3 s
These will be the same as the compute's `compute_getpage_wait_seconds`,
to make them comparable across the compute and Pageserver:
https://github.com/neondatabase/flux-fleet/pull/579. We sacrifice
buckets above 3 s, since these can already be considered "too slow".
This does not change the previously used `CRITICAL_OP_BUCKETS`, which is
also used for other operations on different timescales (e.g. LSN waits).
We should consider replacing this with more appropriate buckets for
specific operations, since it covers a large span with low resolution.
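
To see why the finer buckets help, here is a small self-contained sketch that compares which bucket a given latency lands in under the old and new boundaries listed above. The constant and function names (`COARSE_BUCKETS`, `FINE_BUCKETS`, `bucket_upper_bound`) are hypothetical; the real metric is registered through the pageserver's metrics code, which is not reproduced here.

```rust
/// Old bucket upper bounds in seconds (powers of 10, 1 µs to 100 s).
const COARSE_BUCKETS: [f64; 9] = [
    0.000_001, 0.000_01, 0.000_1, 0.001, 0.01, 0.1, 1.0, 10.0, 100.0,
];

/// New bucket upper bounds in seconds (0.6 ms to 3 s).
const FINE_BUCKETS: [f64; 9] = [
    0.0006, 0.001, 0.003, 0.006, 0.010, 0.030, 0.100, 1.0, 3.0,
];

/// Returns the upper bound of the first bucket that contains `secs`,
/// or None if the sample only fits in the implicit +Inf bucket.
fn bucket_upper_bound(buckets: &[f64], secs: f64) -> Option<f64> {
    buckets.iter().copied().find(|&bound| secs <= bound)
}

fn main() {
    // Latencies around the typical 1 ms range, in seconds.
    for latency in [0.0008, 0.002, 0.005, 5.0] {
        println!(
            "{latency} s: coarse bucket {:?}, fine bucket {:?}",
            bucket_upper_bound(&COARSE_BUCKETS, latency),
            bucket_upper_bound(&FINE_BUCKETS, latency),
        );
    }
}
```

Samples of 2 ms and 5 ms both fall into the same 10 ms bucket with the old boundaries, but into separate 3 ms and 6 ms buckets with the new ones. A 5 s sample, on the other hand, now only fits the +Inf bucket, which is the sacrifice the commit message mentions.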
2025-04-28 11:52:44 +00:00
Conrad Ludgate
32393b4393
pg-sni-router: support compute TLS on different port (#11732)
...
## Problem
pg-sni-router isn't aware of compute TLS
## Summary of changes
If connections come in on port 4433, we require TLS on the connection
from pg-sni-router to compute.
2025-04-28 11:29:44 +00:00
Alexander Bayandin
1a29f5672a
CI(check-macos-build): trigger workflow automatically for PRs (#11706)
...
## Problem
- if-conditions for the `check-macos-build` workflow don't trigger it on
PRs with relevant changes (in Rust code or Postgres submodules).
- Jobs in the workflow depend on the presence of a cache, which is not
guaranteed.
## Summary of changes
- Fix if-conditions
- Use artifacts on top of cache whenever the workflow depends on it —
the cache might not be available
2025-04-28 09:03:10 +00:00