rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-05-24 16:40:38 +00:00

Author	SHA1	Message	Date
Heikki Linnakangas	69dbad700c	Merge remote-tracking branch 'origin/main' into HEAD	2025-07-12 16:43:57 +03:00
Matthias van de Meent	4566b12a22	NEON: Finish Zenith->Neon rename (#12566) Even though we're now part of Databricks, let's at least make this part consistent. ## Summary of changes - PG14: https://github.com/neondatabase/postgres/pull/669 - PG15: https://github.com/neondatabase/postgres/pull/670 - PG16: https://github.com/neondatabase/postgres/pull/671 - PG17: https://github.com/neondatabase/postgres/pull/672 --------- Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2025-07-11 18:56:39 +00:00
Heikki Linnakangas	3300207523	Update working set size estimate without lock (#12570) Update the WSS estimate before acquiring the lock, so that we don't need to hold the lock for so long. That seems safe to me, see added comment. I was planning to do this with the new rust-based communicator implementation anyway, but it might help a little with the current C implementation too. And more importantly, having this as a separate PR gives us a chance to review this aspect independently.	2025-07-11 16:05:22 +00:00
Heikki Linnakangas	a8db7ebffb	Minor refactor of the SQL functions to get working set size estimate (#12550) Split the functions into two: one internal function to calculate the estimate, and another (two functions) to expose it as SQL functions. This is in preparation for adding the new communicator implementation. With that, the SQL functions will dispatch the call to the old or new implementation depending on which is being used.	2025-07-11 14:17:44 +00:00
Erik Grinaker	1637fbce25	Merge fix	2025-07-11 10:50:19 +02:00
Erik Grinaker	8cd5370c00	Merge branch 'main' into communicator-rewrite	2025-07-11 10:39:26 +02:00
Alex Chi Z.	b91f821e8b	fix(libpagestore): update the default stripe size (#12557) ## Problem Part of LKB-379 The pageserver connstrings are updated in the postmaster and then there's a hook to propagate them to the shared memory of all backends. However, the shard stripe size is not propagated. This would cause problems during shard splits: * the compute has active reads/writes * shard split happens and the cplane applies the new config (pageserver connstring + stripe size) * pageserver connstring will be updated immediately once the postmaster receives the SIGHUP, and it will be copied over to the shared memory of all other backends. * stripe size is a normal GUC and we don't have special handling around that, so if any active backend has ongoing txns the value won't be applied. * now it's possible for backends to issue requests based on the wrong stripe size; what's worse, if a request gets cached in the prefetch buffer, it will get stuck forever. ## Summary of changes Update the default stripe size so that it aligns with the current default in storcon. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-07-10 21:49:52 +00:00
Tristan Partin	1b7339b53e	PG: add max_wal_rate (#12470) ## Problem One PG tenant may write too fast and overwhelm the PS. The other tenants sharing the same PSs will get very little bandwidth. We had one experiment with two tenants sharing the same PSs. One tenant ran a large ingestion that delivered hundreds of MB/s while the other only got < 10 MB/s. ## Summary of changes Rate limit how fast PG can generate WALs. The default is -1. We may scale the default value with the CPU count. Need to run some experiments to verify. ## How is this tested? CI. PGBench. No limit first. Then set to 1 MB/s and you can see the tps drop. Then reverted the change and tps increased again. pgbench -i -s 10 -p 55432 -h 127.0.0.1 -U cloud_admin -d postgres pgbench postgres -c 10 -j 10 -T 6000000 -P 1 -b tpcb-like -h 127.0.0.1 -U cloud_admin -p 55432 progress: 33.0 s, 986.0 tps, lat 10.142 ms stddev 3.856 progress: 34.0 s, 973.0 tps, lat 10.299 ms stddev 3.857 progress: 35.0 s, 1004.0 tps, lat 9.939 ms stddev 3.604 progress: 36.0 s, 984.0 tps, lat 10.183 ms stddev 3.713 progress: 37.0 s, 998.0 tps, lat 10.004 ms stddev 3.668 progress: 38.0 s, 648.9 tps, lat 12.947 ms stddev 24.970 progress: 39.0 s, 0.0 tps, lat 0.000 ms stddev 0.000 progress: 40.0 s, 0.0 tps, lat 0.000 ms stddev 0.000 progress: 41.0 s, 0.0 tps, lat 0.000 ms stddev 0.000 progress: 42.0 s, 0.0 tps, lat 0.000 ms stddev 0.000 progress: 43.0 s, 0.0 tps, lat 0.000 ms stddev 0.000 progress: 44.0 s, 0.0 tps, lat 0.000 ms stddev 0.000 progress: 45.0 s, 0.0 tps, lat 0.000 ms stddev 0.000 progress: 46.0 s, 0.0 tps, lat 0.000 ms stddev 0.000 progress: 47.0 s, 0.0 tps, lat 0.000 ms stddev 0.000 progress: 48.0 s, 0.0 tps, lat 0.000 ms stddev 0.000 progress: 49.0 s, 347.3 tps, lat 321.560 ms stddev 1805.633 progress: 50.0 s, 346.8 tps, lat 9.898 ms stddev 3.809 progress: 51.0 s, 0.0 tps, lat 0.000 ms stddev 0.000 progress: 52.0 s, 0.0 tps, lat 0.000 ms stddev 0.000 progress: 53.0 s, 0.0 tps, lat 0.000 ms stddev 0.000 progress: 54.0 s, 0.0 tps, lat 0.000 ms stddev 0.000 progress: 55.0 s, 0.0 tps, lat 0.000 ms stddev 0.000 progress: 56.0 s, 0.0 tps, lat 0.000 ms stddev 0.000 progress: 57.0 s, 0.0 tps, lat 0.000 ms stddev 0.000 progress: 58.0 s, 0.0 tps, lat 0.000 ms stddev 0.000 progress: 59.0 s, 0.0 tps, lat 0.000 ms stddev 0.000 progress: 60.0 s, 0.0 tps, lat 0.000 ms stddev 0.000 progress: 61.0 s, 0.0 tps, lat 0.000 ms stddev 0.000 progress: 62.0 s, 0.0 tps, lat 0.000 ms stddev 0.000 progress: 63.0 s, 494.5 tps, lat 276.504 ms stddev 1853.689 progress: 64.0 s, 488.0 tps, lat 20.530 ms stddev 71.981 progress: 65.0 s, 407.8 tps, lat 9.502 ms stddev 3.329 progress: 66.0 s, 0.0 tps, lat 0.000 ms stddev 0.000 progress: 67.0 s, 0.0 tps, lat 0.000 ms stddev 0.000 progress: 68.0 s, 504.5 tps, lat 71.627 ms stddev 397.733 progress: 69.0 s, 371.0 tps, lat 24.898 ms stddev 29.007 progress: 70.0 s, 541.0 tps, lat 19.684 ms stddev 24.094 progress: 71.0 s, 342.0 tps, lat 29.542 ms stddev 54.935 Co-authored-by: Haoyu Huang <haoyu.huang@databricks.com>	2025-07-10 20:34:11 +00:00
Heikki Linnakangas	bceafc6c32	Update LFC cache hit/miss counters Fixes EXPLAIN (FILECACHE) option	2025-07-10 16:36:53 +03:00
Heikki Linnakangas	dcf8e0565f	Improve communicator README	2025-07-10 15:19:20 +03:00
Heikki Linnakangas	c14cf15b52	Tidy up the memory ordering instructions on request slot code I believe the explicit memory fence instructions are unnecessary. Performing a store with Release ordering makes all the previous non-atomic writes visible too. Per rust docs for Ordering::Release (https://doc.rust-lang.org/std/sync/atomic/enum.Ordering.html#variant.Release): > When coupled with a store, all previous operations become ordered > before any load of this value with Acquire (or stronger) > ordering. In particular, all previous writes become visible to all > threads that perform an Acquire (or stronger) load of this value. > > ... > > Corresponds to memory_order_release in C++20. The "all previous writes" means non-atomic writes too. It's not very clear from that text, but the C++20 docs that it links to are more explicit about it: > All memory writes (including non-atomic and relaxed atomic) that > happened-before the atomic store from the point of view of thread A, > become visible side-effects in thread B. That is, once the atomic > load is completed, thread B is guaranteed to see everything thread A > wrote to memory. In addition to removing the fence instructions, fix the comments on each atomic Acquire operation to point to the correct Release counterpart. We had such comments but they had gone out-of-date as code has moved.	2025-07-10 15:19:20 +03:00
Heikki Linnakangas	5da06d4129	Make start_neon_io_request() wake up the communicator process All the callers did that previously. So rather than document that the caller needs to do it, just do it in start_neon_io_request() straight away. (We might want to revisit this if we get codepaths where the C code submits multiple IO requests as a batch. In that case, it would be more efficient to fill all the request slots first and only send one notification to the pipe for all of them.)	2025-07-10 15:19:20 +03:00
Heikki Linnakangas	f30c59bec9	Improve comments on request slots	2025-07-10 15:19:20 +03:00
Heikki Linnakangas	47c099a0fb	Rename NeonIOHandle to NeonIORequestSlot All the code talks about "request slots", better to make the struct name reflect that. The "Handle" term was borrowed from Postgres v18 AIO implementation, from the similar handles or slots used to submit IO requests from backends to worker processes. But even though the idea is similar, it's a completely separate implementation and there's nothing else shared between them than the very high level design.	2025-07-10 14:52:16 +03:00
Heikki Linnakangas	b67e8f2edc	Move some code, just for more natural logical ordering	2025-07-10 14:49:29 +03:00
Heikki Linnakangas	b5b1db29bb	Implement shard map live-update	2025-07-10 12:25:15 +03:00
Dimitri Fontaine	1a45b2ec90	Review security model for executing Event Trigger code. (#12463) When a function is owned by a superuser (bootstrap user or otherwise), we consider it safe to run it. Only a superuser could have installed it, typically from CREATE EXTENSION script: we trust the code to execute. ## Problem This is intended to solve running pg_graphql Event Triggers graphql_watch_ddl and graphql_watch_drop which are executing the secdef function graphql.increment_schema_version(). ## Summary of changes Allow executing Event Trigger function owned by a superuser and with SECURITY DEFINER properties. The Event Trigger code runs with superuser privileges, and we consider that it's fine. --------- Co-authored-by: Tristan Partin <tristan.partin@databricks.com>	2025-07-10 08:06:33 +00:00
Heikki Linnakangas	ed4652b65b	Update the relsize cache rather than forget it at end of index build This greatly reduces the cases where we make a request to the pageserver with a very recent LSN. Those cases are slow because the pageserver needs to wait for the WAL to arrive. This speeds up the Postgres pg_regress and isolation tests greatly.	2025-07-09 17:21:06 +03:00
Heikki Linnakangas	60d87966b8	minor comment improvement	2025-07-09 16:39:40 +03:00
Heikki Linnakangas	8db138ef64	Plumb through the stripe size to the communicator	2025-07-09 16:18:26 +03:00
Heikki Linnakangas	1ee24602d5	Implement working set size estimation	2025-07-09 16:18:26 +03:00
Heikki Linnakangas	732bd26e70	cargo fmt	2025-07-09 16:18:26 +03:00
Konstantin Knizhnik	4ee0da0a20	Check prefetch response before assignment to slot (#12371) ## Problem See [Slack Channel](https://databricks.enterprise.slack.com/archives/C091LHU6NNB) Dropping a connection without resetting the prefetch state can cause a request/response mismatch. And the lack of a response correctness check in communicator_prefetch_lookupv can cause data corruption. ## Summary of changes 1. Validate response before assignment to prefetch slot. 2. Consume prefetch requests before sending any other requests. --------- Co-authored-by: Kosntantin Knizhnik <konstantin.knizhnik@databricks.com> Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-07-09 12:49:21 +00:00
Heikki Linnakangas	d63f1d259a	avoid assertion failure about calling palloc() in critical section	2025-07-08 21:33:25 +03:00
Heikki Linnakangas	4053092408	Fix LSN tracking on "unlogged index builds" Fixes the test_gin_redo.py test failure, and probably some others	2025-07-08 17:22:24 +03:00
Heikki Linnakangas	ccf88e9375	Improve debug logging by printing IO request details	2025-07-08 17:16:09 +03:00
Heikki Linnakangas	a79fd3bda7	Move logic for picking request slot to the C code With this refactoring, the Rust code deals with one giant array of requests, and doesn't know that it's sliced up per backend process. The C code is now responsible for slicing it up. This also adds code to complete old IOs at backend start that were started and left behind by a previous session. That was a little more straightforward to do with the refactoring, which is why I tackled it now.	2025-07-07 12:59:08 +03:00
Heikki Linnakangas	e1b58d5d69	Don't segfault if one of the unimplemented functions is called We'll need to implement these, but let's stop the crashing for now	2025-07-07 11:33:44 +03:00
Erik Grinaker	9ae004f3bc	Rename ShardMap to ShardSpec	2025-07-06 19:13:59 +02:00
Erik Grinaker	4b06b547c1	pageserver/client_grpc: add shard map updates	2025-07-06 13:27:17 +02:00
Heikki Linnakangas	74e0d85a04	fix: Don't lose track of in-progress request if query is cancelled	2025-07-06 13:04:03 +03:00
Heikki Linnakangas	e14bb4be39	Merge remote-tracking branch 'origin/main' into communicator-rewrite	2025-07-05 16:59:51 +03:00
Heikki Linnakangas	f3a6c0d8ff	cargo fmt	2025-07-05 16:26:24 +03:00
Heikki Linnakangas	d6ec1f1a1c	Skip legacy LFC initialization when communicator is used It clashes with the initialization of the LFC file	2025-07-05 16:26:24 +03:00
Heikki Linnakangas	b568189f7b	Build dummy libcommunicator into the 'neon' extension (#12266) This doesn't do anything interesting yet, but demonstrates linking Rust code to the neon Postgres extension, so that we can review and test drive just the build process changes independently.	2025-07-04 23:27:28 +00:00
Heikki Linnakangas	4c916552e8	Reduce logging noise These are very useful while debugging, but also very noisy; let's dial it down a little.	2025-07-04 23:11:36 +03:00
Heikki Linnakangas	00affada26	Add request ID to all communicator log lines as context information	2025-07-04 20:34:26 +03:00
Heikki Linnakangas	90d3c09c24	Minor cleanup Tidy up and add some comments. Rename a few things for clarity.	2025-07-04 20:32:59 +03:00
Heikki Linnakangas	6c398aeae7	Fix dependency in Makefile	2025-07-04 20:24:21 +03:00
Heikki Linnakangas	1856bbbb9f	Minor cleanup and commenting	2025-07-04 18:28:34 +03:00
Heikki Linnakangas	bd46dd60a0	Add a temporary timeout to handling an IO request in the communicator It's nicer to timeout in the communicator and return an error to the backend, than PANIC the backend.	2025-07-04 16:08:22 +03:00
Heikki Linnakangas	5f2d476a58	Add request ID to io-in-progress locking table, to ease debugging I also added INFO messages for when a backend blocks on the io-in-progress lock. It's probably too noisy for production, but useful now to get a picture of how much it happens.	2025-07-04 15:55:57 +03:00
Heikki Linnakangas	3231cb6138	Await the io-in-progress locking futures Otherwise they don't do anything. Oops.	2025-07-04 15:55:57 +03:00
Heikki Linnakangas	e558e0da5c	Assign request_id earlier, in the originating backend Makes it more useful for stitching together logs etc. for a specific request.	2025-07-04 15:55:55 +03:00
Heikki Linnakangas	70bf2e088d	Request multiple block numbers in a single GetPageV request That's how it was always intended to be used	2025-07-04 15:49:04 +03:00
Konstantin Knizhnik	436a117c15	Do not allocate anything in subtransaction memory context (#12176) ## Problem See https://github.com/neondatabase/neon/issues/12173 ## Summary of changes Allocate table in TopTransactionMemoryContext --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-07-04 10:24:39 +00:00
Heikki Linnakangas	da3f9ee72d	cargo fmt	2025-07-04 12:39:41 +03:00
David Freifeld	794bb7a9e8	Merge branch 'quantumish/comm-lfc-integration' into communicator-rewrite	2025-07-03 10:52:29 -07:00
Konstantin Knizhnik	495112ca50	Add GUC for dynamically enabling compare-local mode (#12424) ## Problem DEBUG_LOCAL_COMPARE mode makes it possible to detect data corruption. But it requires rebuilding the neon extension (and so requires a special image) and significantly slows down execution because it always fetches pages from the pageserver. ## Summary of changes Introduce new GUC `neon.debug_compare_local`, accepting the following values: "none", "prefetch", "lfc", "all" (it is disabled by default). In modes less than "all", the neon SMGR will not fetch a page from the PS if it is found in local caches. Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-07-03 17:37:05 +00:00
Heikki Linnakangas	96a817fa2b	Fix the case that storage auth token is _not_ used I broke that in previous commit while fixing the case of using a token.	2025-07-03 18:39:06 +03:00
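
Commit c14cf15b52 in the log above argues that a Release store paired with an Acquire load already publishes earlier non-atomic writes, so no separate fence is needed. The following is a minimal sketch of that pairing, not the communicator's actual code: the `Slot` type, its fields, and `publish_and_read` are hypothetical names invented for illustration.

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical request slot: a plain (non-atomic) payload guarded by an atomic flag.
struct Slot {
    payload: UnsafeCell<u64>,
    ready: AtomicBool,
}

// SAFETY: `payload` is written only before the Release store of `ready`
// and read only after an Acquire load has observed `ready == true`.
unsafe impl Sync for Slot {}

/// Writes `value` from one thread and reads it back from another,
/// using only a Release store and an Acquire load (no explicit fence).
fn publish_and_read(value: u64) -> u64 {
    let slot = Arc::new(Slot {
        payload: UnsafeCell::new(0),
        ready: AtomicBool::new(false),
    });

    let writer = {
        let slot = Arc::clone(&slot);
        thread::spawn(move || {
            // Plain, non-atomic write of the payload.
            unsafe { *slot.payload.get() = value };
            // Release store: the write above is ordered before any
            // Acquire load that observes `true`.
            slot.ready.store(true, Ordering::Release);
        })
    };

    // Acquire load: pairs with the Release store in the writer thread.
    while !slot.ready.load(Ordering::Acquire) {
        std::hint::spin_loop();
    }
    // The payload write is guaranteed to be visible here.
    let seen = unsafe { *slot.payload.get() };
    writer.join().unwrap();
    seen
}

fn main() {
    println!("{}", publish_and_read(42));
}
```

The spin loop is only there to keep the sketch short; a real reader would block or be notified instead of spinning.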

490 Commits
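
Commit 3231cb6138 in the log above notes that the io-in-progress locking futures did nothing until they were awaited. That follows from Rust futures being lazy: the body of an `async fn` runs only when the future is polled. The sketch below illustrates the general behavior with the standard library alone. `take_lock`, `locks_taken`, and the tiny `run_ready` poller are hypothetical stand-ins, not code from the repository.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough for futures that never suspend.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal poller for futures that are expected to complete without suspending.
fn run_ready<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Hypothetical stand-in for acquiring an io-in-progress lock.
async fn take_lock(counter: &AtomicUsize) {
    counter.fetch_add(1, Ordering::Relaxed);
}

/// Returns how many times the lock body actually ran.
fn locks_taken(await_it: bool) -> usize {
    let counter = AtomicUsize::new(0);
    run_ready(async {
        if await_it {
            // Awaiting polls the future, so the body runs.
            take_lock(&counter).await;
        } else {
            // The future is created and dropped without being polled:
            // the body never runs.
            drop(take_lock(&counter));
        }
    });
    counter.load(Ordering::Relaxed)
}

fn main() {
    println!("without await: {}", locks_taken(false));
    println!("with await: {}", locks_taken(true));
}
```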