rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-07-04 12:40:37 +00:00

Author	SHA1	Message	Date
Alexey Masterov	bfb7bf92f2	fix linters' warnings	2024-09-05 11:07:51 +02:00
Alexey Masterov	f8c9966aff	modify the patch	2024-09-05 10:10:54 +02:00
Alexey Masterov	2e1725c570	modify the patch	2024-09-05 09:56:48 +02:00
Alexey Masterov	9414976c4c	uncomment the extension creation	2024-09-04 17:36:48 +02:00
Alexey Masterov	777c01938d	fix	2024-09-04 15:42:19 +02:00
Alexey Masterov	302a2203a1	change path	2024-09-04 15:27:36 +02:00
Alexey Masterov	bc1697ab28	change path	2024-09-04 15:18:22 +02:00
Alexey Masterov	61f3ac3fbf	change path	2024-09-04 14:58:41 +02:00
Alexey Masterov	f7f0be8727	Temporarily disable the extension.	2024-09-04 14:55:02 +02:00
Alexey Masterov	c34323eb80	Fix the test selection	2024-09-04 13:48:19 +02:00
Alexey Masterov	4104b1cbd4	Add CONNSTR	2024-09-04 13:29:08 +02:00
Alexey Masterov	d143822f64	update patches	2024-09-04 12:36:08 +02:00
Alexey Masterov	6ff6843dbb	add submodules	2024-09-04 11:23:35 +02:00
Alexey Masterov	c14d53b923	debug	2024-09-04 11:20:32 +02:00
Alexey Masterov	c7dde2e784	fix an obvious error	2024-09-04 11:05:05 +02:00
Alexey Masterov	173aef925c	directory change	2024-09-04 11:03:06 +02:00
Alexey Masterov	b2af44f027	debug	2024-09-04 11:00:03 +02:00
Alexey Masterov	a07fda3a86	debug	2024-09-04 10:57:22 +02:00
Alexey Masterov	6b5d33de7d	debug	2024-09-04 10:55:36 +02:00
Alexey Masterov	16450111c9	Fix a syntax error	2024-09-04 10:53:52 +02:00
Alexey Masterov	e8775dda76	Add patch	2024-09-04 10:42:45 +02:00
Alexey Masterov	8959cb1219	change on:	2024-09-03 19:16:23 +02:00
Alexey Masterov	ecf20bb6fa	Add the workflow file	2024-09-03 17:21:33 +02:00
Alexey Masterov	5a4a2ae4cd	Fix the trailing space	2024-09-02 10:52:22 +02:00
Alexey Masterov	d4f656daa2	Change the python file	2024-09-02 09:07:11 +02:00
Alexey Masterov	e2921e352c	Change the patch file	2024-09-02 09:06:19 +02:00
Alexey Masterov	8fb8ec57ea	Add python script, rename patch file	2024-08-30 16:39:07 +02:00
Alexey Masterov	0c6b34b5a0	New patch	2024-08-30 13:22:50 +02:00
Alexey Masterov	b3d90a7d7d	Merge branch 'main' into amasterov/regress-arm	2024-08-29 09:19:34 +02:00
Alexey Masterov	9b0e277514	New patch	2024-08-28 18:14:52 +02:00
Heikki Linnakangas	c5ef779801	tests: Remove unnecessary entries from list of allowed errors (#8199 ) The "manual_gc" context was removed in commit `be0c73f8e7`. The code that generated the other error was removed in commit `9a6c0be823`.	2024-08-27 17:47:05 +01:00
Heikki Linnakangas	2d10306f7a	Remove support for pageserver <-> compute protocol version 1 (#8774 ) Protocol version 2 has been the default for a while now, and we no longer have any computes running in production that used protocol version 1. This completes the migration by removing support for v1 in both the pageserver and the compute. See issue #6211.	2024-08-27 18:36:33 +03:00
Alexey Kondratov	9b9f90c562	fix(walproposer): Do not restart on safekeepers reordering (#8840 ) ## Problem Currently, we compare `neon.safekeepers` values as is, so we unnecessarily restart walproposer even if safekeepers set didn't change. This leads to errors like: ```log FATAL: [WP] restarting walproposer to change safekeeper list from safekeeper-8.us-east-2.aws.neon.tech:6401,safekeeper-11.us-east-2.aws.neon.tech:6401,safekeeper-10.us-east-2.aws.neon.tech:6401 to safekeeper-11.us-east-2.aws.neon.tech:6401,safekeeper-8.us-east-2.aws.neon.tech:6401,safekeeper-10.us-east-2.aws.neon.tech:6401 ``` ## Summary of changes Split the GUC into the list of individual safekeepers and properly compare. We could've done that somewhere on the upper level, e.g., control plane, but I think it's still better when the actual config consumer is smarter and doesn't rely on upper levels.	2024-08-27 15:49:47 +02:00
Folke Behrens	52cb33770b	proxy: Rename backend types and variants as prep for refactor (#8845 ) * AuthBackend enum to AuthBackendType * BackendType enum to Backend * Link variants to Web * Adjust messages, comments, etc.	2024-08-27 14:12:42 +02:00
Conrad Ludgate	12850dd5e9	proxy: remove dead code (#8847 ) By marking everything possible as pub(crate), we find a few dead code candidates.	2024-08-27 12:00:35 +01:00
a-masterov	5d527133a3	Fix the pg_hintplan flakiness (#8834 ) ## Problem The pg_hintplan test seems to be flaky: it usually passes but sometimes fails. ## Summary of changes The regression test is changed to filter out the Neon service queries. The expected file is changed as well.	2024-08-27 12:39:42 +02:00
Arseny Sher	09362b6363	safekeeper: reorder routes and their handlers. Routes and their handlers were in a bit different order in 1) routes list 2) their implementation 3) python client 4) openapi spec, making addition of new ones intimidating. Make it the same everywhere, roughly lexicographically but preserving some of existing logic. No functional changes.	2024-08-27 07:37:55 +03:00
Alexey Kondratov	7820c572e7	fix(sql-exporter): Remove tenant_id from compute_logical_snapshot_files It appeared to be that it's already auto-added to all metrics [1] [1]: `3a907c317c/apps/base/ext-vmagent/vmagent.yaml (L43)`	2024-08-27 00:51:23 +02:00
Alexey Kondratov	bf03713fa1	fix(sql-exporter): Fix typo in gauge In `f4b3c317f` there was a typo and I missed that on review	2024-08-27 00:51:23 +02:00
Alex Chi Z.	0f65684263	feat(pageserver): use split layer writer in gc-compaction (#8608 ) Part of #8002, the final big PR in the batch. ## Summary of changes This pull request uses the new split layer writer in the gc-compaction. * It changes how layers are split. Previously, we split layers based on the original split point, but this creates too many layers (test_gc_feedback has one key per layer). * Therefore, we first verify if the layer map can be processed by the current algorithm (See https://github.com/neondatabase/neon/pull/8191, it's basically the same check) * On that, we proceed with the compaction. This way, it creates a large enough layer close to the target layer size. * Added a new set of functions `with_discard` in the split layer writer. This helps us skip layers if we are going to produce the same persistent key. * The delta writer will keep the updates of the same key in a single file. This might create a super large layer, but we can optimize it later. * The split layer writer is used in the gc-compaction algorithm, and it will split layers based on size. * Fix the image layer summary block, which encoded the wrong key range. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com> Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-08-26 14:19:47 -04:00
Christian Schwarz	97241776aa	pageserver: startup: ensure local disk state is durable (#8835 ) refs https://github.com/neondatabase/neon/issues/6989 Problem ------- After unclean shutdown, we get restarted, start reading the local filesystem, and make decisions based on those reads. However, some of the data might not yet have been fsynced when the unclean shutdown completed. Durability matters even though Pageservers are conceptually just a cache of state in S3. For example: - the cloud control plane is no control loop => pageserver responses to tenant attachment, etc., need to be durable. - the storage controller does not rely on this (as much?) - we don't have layer file checksumming, so, downloaded+renamed but not fsynced layer files are technically not to be trusted - https://github.com/neondatabase/neon/issues/2683 Solution -------- `syncfs` the tenants directory during startup, before we start reading from it. This is a bit overkill because we do remove some temp files (InMemoryLayer!) later during startup. Further, these temp files are particularly likely to be dirty in the kernel page cache. However, we don't want to refactor that cleanup code right now, and the dirty data on pageservers is generally not that high. Last, with [direct IO](https://github.com/neondatabase/neon/issues/8130) we're going to have near-zero kernel page cache anyway quite soon.	2024-08-26 18:07:55 +02:00
Arpad Müller	2dd53e7ae0	Timeline archival test (#8824 ) This PR: * Implements the rule that archived timelines require all of their children to be archived as well, as specified in the RFC. There is no fancy locking mechanism though, so the precondition can still be broken. As a TODO for later, we still allow unarchiving timelines with archived parents. * Adds an `is_archived` flag to `TimelineInfo` * Adds timeline_archival_config to `PageserverHttpClient` * Adds a new `test_timeline_archive` test, loosely based on `test_timeline_delete` Part of #8088	2024-08-26 17:30:19 +02:00
Folke Behrens	d6eede515a	proxy: clippy lints: handle some low hanging fruit (#8829 ) Should be mostly uncontroversial ones.	2024-08-26 15:16:54 +02:00
Alexey Kondratov	d48229f50f	feat(compute): Introduce new compute_subscriptions_count metric (#8796 ) ## Problem We need some metric to sneak peek into how many people use inbound logical replication (Neon is a subscriber). ## Summary of changes This commit adds a new metric `compute_subscriptions_count`, which is number of subscriptions grouped by enabled/disabled state. Resolves: neondatabase/cloud#16146	2024-08-26 14:34:18 +02:00
Jakub Kołodziejczak	cdfdcd3e5d	chore: improve markdown formatting (#8825 ) fixes: ![Screenshot_2024-08-25_16-25-30](https://github.com/user-attachments/assets/c993309b-6c2d-4938-9fd0-ce0953fc63ff) fixes: ![Screenshot_2024-08-25_16-26-29](https://github.com/user-attachments/assets/cf497f4a-d9e3-45a6-a1a5-7e215d96d022)	2024-08-25 16:33:45 +01:00
Conrad Ludgate	06795c6b9a	proxy: new local-proxy application (#8736 ) Add binary for local-proxy that uses the local auth backend. Runs only the http serverless driver support and offers config reload based on a config file and SIGHUP	2024-08-23 22:32:10 +01:00
Conrad Ludgate	701cb61b57	proxy: local auth backend (#8806 ) Adds a Local authentication backend. Updates http to extract JWT bearer tokens and passes them to the local backend to validate.	2024-08-23 18:48:06 +00:00
John Spray	0aa1450936	storage controller: enable timeline CRUD operations to run concurrently with reconciliation & make them safer (#8783 ) ## Problem - If a reconciler was waiting to be able to notify computes about a change, but the control plane was waiting for the controller to finish a timeline creation/deletion, the overall system can deadlock. - If a tenant shard was migrated concurrently with a timeline creation/deletion, there was a risk that the timeline operation could be applied to a non-latest-generation location, and thereby not really be persistent. This has never happened in practice, but would eventually happen at scale. Closes: #8743 ## Summary of changes - Introduce `Service::tenant_remote_mutation` helper, which looks up shards & generations and passes them into an inner function that may do remote I/O to pageservers. Before returning success, this helper checks that generations haven't incremented, to guarantee that changes are persistent. - Convert tenant_timeline_create, tenant_timeline_delete, and tenant_timeline_detach_ancestor to use this helper. - These functions no longer block on ensure_attached unless the tenant was never attached at all, so they should make progress even if we can't complete compute notifications. This increases the database load from timeline/create operations, but only with cheap read transactions.	2024-08-23 18:56:05 +01:00
John Spray	b65a95f12e	controller: use PageserverUtilization for scheduling (#8711 ) ## Problem Previously, the controller only used the shard counts for scheduling. This works well when hosting only many-sharded tenants, but works much less well when hosting single-sharded tenants that have a greater deviation in size-per-shard. Closes: https://github.com/neondatabase/neon/issues/7798 ## Summary of changes - Instead of UtilizationScore, carry the full PageserverUtilization through into the Scheduler. - Use the PageserverUtilization::score() instead of shard count when ordering nodes in scheduling. Q: Why did test_sharding_split_smoke need updating in this PR? A: There's an interesting side effect during shard splits: because we do not decrement the shard count in the utilization when we de-schedule the shards from before the split, the controller will now prefer to pick _different_ nodes for shards compared with which ones held secondaries before the split. We could use our knowledge of splitting to fix up the utilizations more actively in this situation, but I'm leaning toward leaving the code simpler, as in practical systems the impact of one shard on the utilization of a node should be fairly low (single digit %).	2024-08-23 18:32:56 +01:00
Conrad Ludgate	c1cb7a0fa0	proxy: flesh out JWT verification code (#8805 ) This change adds in the necessary verification steps for the JWT payload, and adds per-role querying of JWKs as needed for #8736	2024-08-23 18:01:02 +01:00
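
Commit 9b9f90c562 in the list above describes replacing a raw string comparison of `neon.safekeepers` with a comparison of the individual safekeeper entries, so that a reordered list no longer triggers a walproposer restart. The sketch below illustrates that idea only. It is written in Python for brevity, it is not the code from the commit, and the function names `parse_safekeepers` and `safekeepers_changed` are hypothetical. It also assumes the value is a plain comma-separated list of `host:port` entries, as in the log excerpt quoted in the commit message.

```python
def parse_safekeepers(value: str) -> frozenset:
    """Split a comma-separated safekeeper list into a set of entries.

    Hypothetical helper: surrounding whitespace and empty entries are
    dropped, and using a set means ordering (and duplicates) are ignored.
    """
    return frozenset(part.strip() for part in value.split(",") if part.strip())


def safekeepers_changed(old: str, new: str) -> bool:
    """Return True only if the set of safekeepers differs, not just the order."""
    return parse_safekeepers(old) != parse_safekeepers(new)


if __name__ == "__main__":
    old = (
        "safekeeper-8.us-east-2.aws.neon.tech:6401,"
        "safekeeper-11.us-east-2.aws.neon.tech:6401,"
        "safekeeper-10.us-east-2.aws.neon.tech:6401"
    )
    new = (
        "safekeeper-11.us-east-2.aws.neon.tech:6401,"
        "safekeeper-8.us-east-2.aws.neon.tech:6401,"
        "safekeeper-10.us-east-2.aws.neon.tech:6401"
    )
    # Same members in a different order: no restart would be needed.
    print(safekeepers_changed(old, new))
```

A set-based comparison is the simplest way to make the check order-insensitive. If duplicate entries were meaningful, comparing sorted lists would be the alternative.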

5997 Commits