rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-15 17:32:56 +00:00

Author	SHA1	Message	Date
Konstantin Knizhnik	8de5b63ac4	Remove assert for non-zero number of shards from AssignPageserverConnstring	2023-12-20 09:16:54 +02:00
Konstantin Knizhnik	338bf7a446	Bump postgres version	2023-12-19 22:17:07 +02:00
Konstantin Knizhnik	767ce2b187	Bump Postgres version	2023-12-19 16:21:15 +02:00
Konstantin Knizhnik	80bf8d4761	Allow empty connection string	2023-12-19 14:45:25 +02:00
Konstantin Knizhnik	c2b396905f	Merge with main	2023-12-19 10:20:56 +02:00
Konstantin Knizhnik	b43ae6e26c	Address review comments	2023-12-19 09:22:42 +02:00
Konstantin Knizhnik	151870b666	Update pgxn/neon/libpagestore.c Co-authored-by: Sasha Krassovsky <sasha@neon.tech>	2023-12-19 09:22:42 +02:00
Konstantin Knizhnik	4cdfff75f2	Update pgxn/neon/libpagestore.c Co-authored-by: Sasha Krassovsky <sasha@neon.tech>	2023-12-19 09:22:42 +02:00
Konstantin Knizhnik	02abe3c82e	Fix problem with stats collector at pg14	2023-12-19 09:22:42 +02:00
Konstantin Knizhnik	5c8b3d997f	Add [NEON_SMGR] to all messages produced by Neon extension	2023-12-19 09:22:42 +02:00
Konstantin Knizhnik	b465185738	Add [NEON_SMGR] to all messages produced by Neon extension	2023-12-19 09:22:42 +02:00
John Spray	4e47d3a984	pgxn: amend key hashing	2023-12-19 09:22:42 +02:00
Konstantin Knizhnik	3853646c75	[see #6052] make connection logging shard-aware	2023-12-19 09:22:42 +02:00
Konstantin Knizhnik	86c76d4b59	Fix shard map reload synchronization	2023-12-19 09:22:42 +02:00
Konstantin Knizhnik	864d5f84e9	Fix shard map reload mechanism	2023-12-19 09:22:42 +02:00
Konstantin Knizhnik	38f64ea8ef	Load shard map only at postmaster	2023-12-19 09:22:42 +02:00
Konstantin Knizhnik	2855074f55	Fix comments	2023-12-19 09:22:42 +02:00
Konstantin Knizhnik	16b0348c1f	Do not drop PS connections on config reload if connection strings are not changed	2023-12-19 09:22:42 +02:00
Konstantin Knizhnik	93c75c97e1	Fix shard hash calculation	2023-12-19 09:22:42 +02:00
Konstantin Knizhnik	d3221243f8	Minor refactoring	2023-12-19 09:22:42 +02:00
Konstantin Knizhnik	3cbd2df0b3	Add neon.stripe_size	2023-12-19 09:22:42 +02:00
Konstantin Knizhnik	8ddfa0515e	Undo occasional changes in control_plane_connector.c	2023-12-19 09:22:42 +02:00
Konstantin Knizhnik	3019f8fe8f	Merge with main	2023-12-19 09:22:42 +02:00
Konstantin Knizhnik	e720b5f9d4	Load shardmap from postgresql.conf	2023-12-19 09:22:42 +02:00
Konstantin Knizhnik	039fa60446	Take into account stripe size when calculating shard hash number	2023-12-19 09:22:42 +02:00
Konstantin Knizhnik	7ca20dd12b	Add support for PS sharding in compute	2023-12-19 09:22:42 +02:00
Anna Khanova	6e6e40dd7f	Invalidate credentials on auth failure (#6171) ## Problem If the user resets their password, the cache could receive this information only after `ttl` minutes. ## Summary of changes Invalidate password on auth failure.	2023-12-18 23:24:22 +01:00
Heikki Linnakangas	6939fc3db6	Remove declarations of non-existent global variables and functions FileCacheMonitorMain was removed in commit `b497d0094e`.	2023-12-18 21:05:31 +02:00
Heikki Linnakangas	c4c48cfd63	Clean up #includes - No need to include c.h, port.h or pg_config.h, they are included in postgres.h - No need to include postgres.h in header files. Instead, the assumption in PostgreSQL is that all .c files include postgres.h. - Reorder includes to alphabetical order, and system headers before pgsql headers - Remove bunch of other unnecessary includes that got copy-pasted from one source file to another	2023-12-18 21:05:29 +02:00
Heikki Linnakangas	82215d20b0	Mark some variables 'static' Move initialization of neon_redo_read_buffer_filter. This allows marking it 'static', too.	2023-12-18 21:05:24 +02:00
Sasha Krassovsky	62737f3776	Grant BYPASSRLS and REPLICATION explicitly to neon_superuser roles	2023-12-18 10:54:14 -08:00
Christian Schwarz	1f9a7d1cd0	add a Rust client for Pageserver page_service (#6128 ) Part of getpage@lsn benchmark epic: https://github.com/neondatabase/neon/issues/5771 Stacked atop https://github.com/neondatabase/neon/pull/6145	2023-12-18 18:17:19 +00:00
John Spray	4ea4812ab2	tests: update python dependencies (#6164 ) ## Problem Existing dependencies didn't work on Fedora 39 (python 3.12) ## Summary of changes - Update pyyaml 6.0 -> 6.0.1 - Update yarl 1.8.2->1.9.4 - Update the `dnf install` line in README to include dependencies of python packages (unrelated to upgrades, just noticed absences while doing fresh pysync run)	2023-12-18 15:47:09 +00:00
Anna Khanova	00d90ce76a	Added cache for get role secret (#6165 ) ## Problem Currently if we are getting many consecutive connections to the same user/ep we will send a lot of traffic to the console. ## Summary of changes Cache with ttl=4min proxy_get_role_secret response. Note: this is the temporary hack, notifier listener is WIP.	2023-12-18 16:04:47 +01:00
John Khvatov	33cb9a68f7	pageserver: Reduce tracing overhead in timeline::get (#6115 ) ## Problem Compaction process (specifically the image layer reconstructions part) is lagging behind wal ingest (at speed ~10-15MB/s) for medium-sized tenants (30-50GB). CPU profile shows that significant amount of time (see flamegraph) is being spent in `tracing::span::Span::new`. mainline (commit: `0ba4cae491`): ![reconstruct-mainline-0ba4cae491c2](https://github.com/neondatabase/neon/assets/289788/ebfd262e-5c97-4858-80c7-664a1dbcc59d) ## Summary of changes By lowering the tracing level in get_value_reconstruct_data and get_or_maybe_download from info to debug, we can reduce the overhead of span creation in prod environments. On my system, this sped up the image reconstruction process by 60% (from 14500 to 23160 page reconstruction per sec) pr: ![reconstruct-opt-2](https://github.com/neondatabase/neon/assets/289788/563a159b-8f2f-4300-b0a1-6cd66e7df769) `create_image_layers()` (it's 1 CPU bound here) mainline vs pr: ![image](https://github.com/neondatabase/neon/assets/289788/a981e3cb-6df9-4882-8a94-95e99c35aa83)	2023-12-18 13:33:23 +00:00
Conrad Ludgate	17bde7eda5	proxy refactor large files (#6153 ) ## Problem The `src/proxy.rs` file is far too large ## Summary of changes Creates 3 new files: ``` src/metrics.rs src/proxy/retry.rs src/proxy/connect_compute.rs ```	2023-12-18 10:59:49 +00:00
John Spray	dbdb1d21f2	pageserver: on-demand activation cleanups (#6157) ## Problem #6112 added some logs and metrics: clean these up a bit: - Avoid counting startup completions for tenants launched after startup - exclude no-op cases from timing histograms - remove a rogue log message	2023-12-18 10:29:19 +00:00
Arseny Sher	e1935f42a1	Don't generate core dump when walproposer intentionally panics. Walproposer sometimes intentionally PANICs when its term is defeated as the basebackup is likely spoiled by that time. We don't want core dumped in this case.	2023-12-18 11:03:34 +04:00
Alexander Bayandin	9bdc25f0af	Revert "CI: build build-tools image" (#6156 ) It turns out the issue with skipped jobs is not so trivial (because Github checks jobs transitively), a possible workaround with `if: always() && contains(fromJSON('["success", "skipped"]'), needs.build-buildtools-image.result)` will tangle the workflow really bad. We'll need to come up with a better solution. To unblock the main I'm going to revert https://github.com/neondatabase/neon/pull/6082.	2023-12-16 12:32:00 +00:00
Christian Schwarz	47873470db	pageserver: add method to dump keyspace in mgmt api client (#6145 ) Part of getpage@lsn benchmark epic: https://github.com/neondatabase/neon/issues/5771	2023-12-16 10:52:48 +00:00
Abhijeet Patil	8619e6295a	CI: build build-tools image (#6082 ) ## Currently our build docker file is located in the build repo it makes sense to have it as a part of our neon repo ## Summary of changes We had the docker file that we use to build our binary and other tools resided in the build repo It made sense to bring the docker file to its repo where it has been used So that the contributors can also view it and amend if required It will reduce the maintenance. Docker file changes and code changes can be accommodated in same PR Also, building the image and pushing it to ECR is abstracted in a reusable workflow. Ideal is to use that for any other jobs too ## Checklist before requesting a review - [x] Moved the docker file used to build the binary from the build repo to the neon repo - [x] adding gh workflow to build and push the image - [x] adding gh workflow to tag the pushed image - [x] update readMe file --------- Co-authored-by: Abhijeet Patil <abhijeet@neon.tech> Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2023-12-16 10:33:52 +00:00
Conrad Ludgate	83811491da	update zerocopy (#6148) ## Problem https://github.com/neondatabase/neon/security/dependabot/48 ``` $ cargo tree -i zerocopy zerocopy v0.7.3 └── ahash v0.8.5 └── hashbrown v0.13.2 ``` ahash doesn't use the affected APIs so we are not vulnerable but best to update to silence the alert anyway ## Summary of changes ``` $ cargo update -p zerocopy --precise 0.7.31 Updating crates.io index Updating syn v2.0.28 -> v2.0.32 Updating zerocopy v0.7.3 -> v0.7.31 Updating zerocopy-derive v0.7.3 -> v0.7.31 ```	2023-12-16 09:06:00 +00:00
John Spray	d066dad84b	pageserver: prioritize activation of tenants with client requests (#6112 ) ## Problem During startup, a client request might have to wait a long time while the system is busy initializing all the attached tenants, even though most of the attached tenants probably don't have any client requests to service, and could wait a bit. ## Summary of changes - Add a semaphore to limit how many Tenant::spawn()s may concurrently do I/O to attach their tenant (i.e. read indices from remote storage, scan local layer files, etc). - Add Tenant::activate_now, a hook for kicking a tenant in its spawn() method to skip waiting for the warmup semaphore - For tenants that attached via warmup semaphore units, wait for logical size calculation to complete before dropping the warmup units - Set Tenant::activate_now in `get_active_tenant_with_timeout` (the page service's path for getting a reference to a tenant). - Wait for tenant activation in HTTP handlers for timeline creation and deletion: like page service requests, these require an active tenant and should prioritize activation if called.	2023-12-15 20:37:47 +00:00
John Spray	56f7d55ba7	pageserver: basic cancel/timeout for remote storage operations (#6097 ) ## Problem Various places in remote storage were not subject to a timeout (thereby stuck TCP connections could hold things up), and did not respect a cancellation token (so things like timeline deletion or tenant detach would have to wait arbitrarily long). ## Summary of changes - Add download_cancellable and upload_cancellable helpers, and use them in all the places we wait for remote storage operations (with the exception of initdb downloads, where it would not have been safe). - Add a cancellation token arg to `download_retry`. - Use cancellation token args in various places that were missing one per #5066 Closes: #5066 Why is this only "basic" handling? - Doesn't express difference between shutdown and errors in return types, to avoid refactoring all the places that use an anyhow::Error (these should all eventually return a more structured error type) - Implements timeouts on top of remote storage, rather than within it: this means that operations hitting their timeout will lose their semaphore permit and thereby go to the back of the queue for their retry. - Doing a nicer job is tracked in https://github.com/neondatabase/neon/issues/6096	2023-12-15 17:43:02 +00:00
Christian Schwarz	1a9854bfb7	add a Rust client for Pageserver management API (#6127 ) Part of getpage@lsn benchmark epic: https://github.com/neondatabase/neon/issues/5771 This PR moves the control plane's spread-all-over-the-place client for the pageserver management API into a separate module within the pageserver crate. I need that client to be async in my benchmarking work, so, this PR switches to the async version of `reqwest`. That is also the right direction generally IMO. The switch to async in turn mandated converting most of the `control_plane/` code to async. Note that some of the client methods should be taking `TenantShardId` instead of `TenantId`, but, none of the callers seem to be sharding-aware. Leaving that for another time: https://github.com/neondatabase/neon/issues/6154	2023-12-15 18:33:45 +01:00
John Spray	de1a9c6e3b	s3_scrubber: basic support for sharding (#6119 ) This doesn't make the scrubber smart enough to understand that many shards are part of the same tenants, but it makes it understand paths well enough to scrub the individual shards without thinking they're malformed. This is a prerequisite to being able to run tests with sharding enabled. Related: #5929	2023-12-15 15:48:55 +00:00
Arseny Sher	e62569a878	A few comments on rust walproposer build.	2023-12-15 19:31:51 +04:00
John Spray	bd1cb1b217	tests: update allow list for `negative_env` (#6144 ) Tests attaching the tenant immediately after the fixture detaches it could result in LSN updates failing validation e.g. https://neon-github-public-dev.s3.amazonaws.com/reports/pr-6142/7211196140/index.html#suites/7745dadbd815ab87f5798aa881796f47/32b12ccc0b01b122	2023-12-15 15:08:28 +00:00
Conrad Ludgate	98629841e0	improve proxy code cov (#6141 ) ## Summary of changes saw some low-hanging codecov improvements. even if code coverage is somewhat of a pointless game, might as well add tests where we can and delete code if it's unused	2023-12-15 12:11:50 +00:00
Arpad Müller	215cdd18c4	Make initdb upload retries cancellable and seek to beginning (#6147 ) * initdb uploads had no cancellation token, which means that when we were stuck in upload retries, we wouldn't be able to delete the timeline. in general, the combination of retrying forever and not having cancellation tokens is quite dangerous. * initdb uploads wouldn't rewind the file. this wasn't discovered in the purposefully unreliable test-s3 in pytest because those fail on the first byte always, not somewhere during the connection. we'd be getting errors from the AWS sdk that the file was at an unexpected end. slack thread: https://neondb.slack.com/archives/C033RQ5SPDH/p1702632247784079	2023-12-15 12:11:25 +00:00
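
Commit 215cdd18c4 above describes two fixes to initdb upload retries: checking for cancellation, and rewinding the file before each attempt. The following is a minimal sketch of that pattern using only the Rust standard library, not the code from the commit. The names `upload_with_retries`, `UploadOutcome` and `demo` are hypothetical, an `AtomicBool` stands in for a real cancellation token, and the retry count is bounded purely for illustration.

```rust
use std::io::{self, Cursor, Read, Seek, SeekFrom};
use std::sync::atomic::{AtomicBool, Ordering};

/// Result of a retried upload (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub enum UploadOutcome {
    Done { attempts: u32 },
    Cancelled,
    GaveUp,
}

/// Call `upload` up to `max_attempts` times.
/// Before every attempt the cancel flag is checked and the source is
/// rewound to offset 0, so a retry never starts from a half-consumed stream.
pub fn upload_with_retries<S, F>(
    src: &mut S,
    cancel: &AtomicBool,
    max_attempts: u32,
    mut upload: F,
) -> io::Result<UploadOutcome>
where
    S: Read + Seek,
    F: FnMut(&mut S) -> io::Result<()>,
{
    for attempt in 1..=max_attempts {
        // Assumption: a plain flag stands in for a cancellation token.
        if cancel.load(Ordering::SeqCst) {
            return Ok(UploadOutcome::Cancelled);
        }
        // Rewind: the previous attempt may have read part of the source.
        src.seek(SeekFrom::Start(0))?;
        if upload(src).is_ok() {
            return Ok(UploadOutcome::Done { attempts: attempt });
        }
    }
    Ok(UploadOutcome::GaveUp)
}

/// Simulated upload whose first attempt fails after reading four bytes.
pub fn demo() -> (UploadOutcome, Vec<u8>) {
    let mut src = Cursor::new(b"initdb tarball".to_vec());
    let cancel = AtomicBool::new(false);
    let mut calls = 0;
    let mut received = Vec::new();
    let outcome = upload_with_retries(&mut src, &cancel, 3, |s| {
        calls += 1;
        received.clear();
        if calls == 1 {
            // Consume part of the source, then fail mid-transfer.
            let mut partial = [0u8; 4];
            s.read_exact(&mut partial)?;
            return Err(io::Error::new(io::ErrorKind::ConnectionReset, "simulated"));
        }
        s.read_to_end(&mut received)?;
        Ok(())
    })
    .unwrap();
    (outcome, received)
}
```

Without the `seek` call, the second attempt in `demo` would start at byte 4 and send a truncated body, which is the kind of "unexpected end" failure the commit message mentions.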

4260 Commits