rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-07-07 14:10:43 +00:00

Author	SHA1	Message	Date
Heikki Linnakangas	4c916552e8	Reduce logging noise These are very useful while debugging, but also very noisy; let's dial it down a little.	2025-07-04 23:11:36 +03:00
Heikki Linnakangas	00affada26	Add request ID to all communicator log lines as context information	2025-07-04 20:34:26 +03:00
Heikki Linnakangas	90d3c09c24	Minor cleanup Tidy up and add some comments. Rename a few things for clarity.	2025-07-04 20:32:59 +03:00
Heikki Linnakangas	6c398aeae7	Fix dependency in Makefile	2025-07-04 20:24:21 +03:00
Heikki Linnakangas	1856bbbb9f	Minor cleanup and commenting	2025-07-04 18:28:34 +03:00
Heikki Linnakangas	bd46dd60a0	Add a temporary timeout to handling an IO request in the communicator It's nicer to time out in the communicator and return an error to the backend than to PANIC the backend.	2025-07-04 16:08:22 +03:00
Heikki Linnakangas	5f2d476a58	Add request ID to io-in-progress locking table, to ease debugging I also added INFO messages for when a backend blocks on the io-in-progress lock. It's probably too noisy for production, but useful now to get a picture of how much it happens.	2025-07-04 15:55:57 +03:00
Heikki Linnakangas	3231cb6138	Await the io-in-progress locking futures Otherwise they don't do anything. Oops.	2025-07-04 15:55:57 +03:00
Heikki Linnakangas	e558e0da5c	Assign request_id earlier, in the originating backend Makes it more useful for stitching together logs etc. for a specific request.	2025-07-04 15:55:55 +03:00
Heikki Linnakangas	70bf2e088d	Request multiple block numbers in a single GetPageV request That's how it was always intended to be used	2025-07-04 15:49:04 +03:00
Heikki Linnakangas	da3f9ee72d	cargo fmt	2025-07-04 12:39:41 +03:00
David Freifeld	794bb7a9e8	Merge branch 'quantumish/comm-lfc-integration' into communicator-rewrite	2025-07-03 10:52:29 -07:00
Heikki Linnakangas	96a817fa2b	Fix the case that storage auth token is _not_ used I broke that in previous commit while fixing the case of using a token.	2025-07-03 18:39:06 +03:00
Heikki Linnakangas	e7b057f2e8	Fix passing storage JWT token to the communicator process Makes the 'test_compute_auth_to_pageserver' test pass	2025-07-03 18:14:22 +03:00
Heikki Linnakangas	956c2f4378	cargo fmt	2025-07-03 16:16:42 +03:00
Heikki Linnakangas	3293e4685e	Fix cases where pageserver gets stuck waiting for LSN The compute might make a request with an LSN that it hasn't even flushed yet.	2025-07-03 16:14:45 +03:00
Erik Grinaker	14214eb853	Add client shard routing	2025-07-03 14:42:35 +02:00
Erik Grinaker	de97b73d6e	Lint fixes	2025-07-03 10:38:14 +02:00
Heikki Linnakangas	d8556616c9	Fix running Postgres in "vanilla mode", without neon storage Some tests do that	2025-07-03 00:32:40 +03:00
Heikki Linnakangas	d8296e60e6	Fix caching of newly extended pages This fixes read errors e.g. in test_compute_catalog.py test (and probably many others).	2025-07-02 23:21:42 +03:00
David Freifeld	86fb7b966a	Update `integrated_cache.rs` to use new hashmap API	2025-07-02 12:18:37 -07:00
Heikki Linnakangas	2cc28c75be	Fix "ERROR: could not read size of rel ..." in many regression tests. We were incorrectly skipping the call to communicator_new_rel_create(), which resulted in an error during index build, when the btree build code tried to check the size of the newly-created relation.	2025-07-02 14:10:11 +03:00
Erik Grinaker	bf01145ae4	Remove some old code	2025-07-02 11:46:54 +02:00
Erik Grinaker	8ab8fc11a3	Use new `PageserverClient`	2025-07-02 11:27:56 +02:00
Heikki Linnakangas	2fefece77d	temporary hack to make regression tests fail faster	2025-07-02 01:42:39 +03:00
Heikki Linnakangas	471191e64e	Fix updating relsize cache during WAL replay This makes some of the test_runner/regress/test_hot_standby.py tests pass. (Others are still failing.)	2025-07-01 21:22:04 +03:00
Heikki Linnakangas	175c2e11e3	Add assertions that the legacy relsize cache is not used with new communicator And fix a few cases where it was being called	2025-07-01 16:44:25 +03:00
Heikki Linnakangas	efdb07e7b6	Implement function to check if page is in local cache This is needed for read replicas. There's one more TODO that needs to be implemented before read replicas work though, in neon_extend_rel_size()	2025-07-01 16:22:51 +03:00
Heikki Linnakangas	b0970b415c	Don't call legacy lfc function when new communicator is used	2025-07-01 15:47:26 +03:00
Erik Grinaker	c3cb1ab98d	Merge branch 'main' into communicator-rewrite	2025-06-30 21:07:01 +02:00
Erik Grinaker	a5b0fc560c	Fix/allow remaining clippy lints	2025-06-30 12:36:20 +02:00
Erik Grinaker	67b04f8ab3	Fix a bunch of linter warnings	2025-06-30 11:10:02 +02:00
Heikki Linnakangas	97a8f4ef85	Handle unexpected EOF while doing an LFC read more gracefully There's a bug somewhere because this happens in python regression tests. We need to hunt that down, but in any case, let's not get stuck in an infinite loop if it happens.	2025-06-30 00:59:53 +03:00
Heikki Linnakangas	39f31957e3	Handle pageserver response with different number of pages gracefully Some tests are hitting this case, where pageserver returns 0 page images in the response to a GetPage request. I suspect it's because the code doesn't handle sharding correctly? In any case, let's not panic on it, but return an IO error to the originating backend.	2025-06-29 23:44:28 +03:00
Heikki Linnakangas	f3ba201800	Run `cargo fmt`	2025-06-29 21:21:07 +03:00
Heikki Linnakangas	a352d290eb	Plumb through both libpq and grpc connection strings to the compute Add a new 'pageserver_connection_info' field in the compute spec. It replaces the old 'pageserver_connstring' field with a more complicated struct that includes both libpq and grpc URLs, for each shard (or only one of the URLs, depending on the configuration). It also includes a flag suggesting which one to use; compute_ctl now uses it to decide which protocol to use for the basebackup. This is compatible with everything that's in production, because the control plane never used the 'pageserver_connstring' field. That was added a long time ago with the idea that it would replace the code that digs the 'neon.pageserver_connstring' GUC from the list of Postgres settings, but we never got around to doing that in the control plane. Hence, it was only used with neon_local. But the plan now is to pass the 'pageserver_connection_info' from the control plane, and once that's fully deployed everywhere, the code to parse 'neon.pageserver_connstring' in compute_ctl can be removed. The 'grpc' flag on an endpoint in endpoint config is now more of a suggestion. Compute_ctl gets both URLs, so it can choose to use libpq or grpc as it wishes. It currently always obeys the 'prefer_grpc' flag that's part of the connection info though. Postgres however uses grpc iff the new rust-based communicator is enabled. TODO/plan for the control plane: - Start to pass `pageserver_connection_info` in the spec file. - Also keep the current `neon.pageserver_connstring` setting for now, for backwards compatibility with old computes. After that, the `pageserver_connection_info.prefer_grpc` flag in the spec file can be used to control whether compute_ctl uses grpc or libpq. The actual compute's grpc usage will be controlled by the `neon.enable_new_communicator` GUC. It can be set separately from 'prefer_grpc'. Later: - Once all old computes are gone, remove the code to pass `neon.pageserver_connstring`	2025-06-29 18:16:49 +03:00
Heikki Linnakangas	8c122a1c98	Don't call into the old LFC when using the new communicator This fixes errors like `index "pg_class_relname_nsp_index" contains unexpected zero page at block 2` when running the python tests. smgrzeroextend() still called into the old LFC's lfc_write() function, even when using the new communicator, which zeroed some arbitrary pages in the LFC file, overwriting pages managed by the new LFC implementation in `integrated_cache.rs`	2025-06-29 17:40:46 +03:00
David Freifeld	78b6da270b	Sketchily integrate hashmap rewrite with `integrated_cache`	2025-06-26 16:45:48 -07:00
David Freifeld	4713715c59	Merge branch 'communicator-rewrite' of github.com:neondatabase/neon into communicator-rewrite	2025-06-26 10:26:41 -07:00
David Freifeld	1e74b52f7e	Merge branch 'quantumish/lfc-resizable-map' into communicator-rewrite	2025-06-26 10:26:22 -07:00
Erik Grinaker	e3ecdfbecc	pgxn/neon: actually use UNAME_S	2025-06-26 12:38:44 +02:00
Erik Grinaker	d08e553835	pgxn/neon: fix `callback_get_request_lsn_unsafe` return type	2025-06-26 12:33:59 +02:00
Erik Grinaker	7fffb5b4df	pgxn/neon: fix macOS build	2025-06-26 12:33:39 +02:00
Konstantin Knizhnik	be23eae3b6	Mark pages as available in LFC only after generation check (#12350) ## Problem If LFC generation is changed then `lfc_readv_select` will return -1 but pages are still marked as available in bitmap. ## Summary of changes Update bitmap after generation check. Co-authored-by: Konstantin Knizhnik <konstantin.knizhnik@databricks.com>	2025-06-26 07:06:27 +00:00
Heikki Linnakangas	e90be06d46	silence a few compiler warnings about unnecessary 'mut's and 'use's	2025-06-23 18:16:54 +03:00
Heikki Linnakangas	356ba67607	Merge remote-tracking branch 'origin/main' into HEAD I also included build script changes from https://github.com/neondatabase/neon/pull/12266, which is not yet merged but will be soon.	2025-06-23 17:46:30 +03:00
Heikki Linnakangas	3d822dbbde	Refactor Makefile rules for building the extensions under pgxn/ (#12305)	2025-06-22 19:43:14 +00:00
Heikki Linnakangas	1847f4de54	Add missing #include. Got a warning on macos without this	2025-06-18 17:26:20 +03:00
Mikhail	e95f2f9a67	compute_ctl: return LSN in /terminate (#12240) - Add optional `?mode=fast|immediate` to `/terminate`, `fast` is default. Immediate avoids waiting 30 seconds before returning from `terminate`. - Add `TerminateMode` to `ComputeStatus::TerminationPending` - Use `/terminate?mode=immediate` in `neon_local` instead of `pg_ctl stop` for `test_replica_promotes`. - Change `test_replica_promotes` to check returned LSN - Annotate `finish_sync_safekeepers` as `noreturn`. https://github.com/neondatabase/cloud/issues/29807	2025-06-18 12:25:19 +00:00
Mikhail	7d4f662fbf	upgrade default neon version to 1.6 (#12185) Changes for 1.6 were merged and deployed two months ago https://github.com/neondatabase/neon/blob/main/pgxn/neon/neon--1.6--1.5.sql. In order to deploy https://github.com/neondatabase/neon/pull/12183, we need 1.6 to be default, otherwise we can't use prewarm API on read-only replica (`ALTER EXTENSION` won't work) and we need it for promotion	2025-06-17 17:46:35 +00:00

453 Commits