rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-16 09:52:54 +00:00

Author	SHA1	Message	Date
Christian Schwarz	1daeba6d87	another attempt to reduce allocations; don't know if it helps, certainly didn't eliminate all of them	2024-01-30 14:10:20 +00:00
Christian Schwarz	f3e1ae6740	try (and fail) to implement borrowed deserialize of Value (neon-_e02wX9z-py3.9) admin@ip-172-31-13-23:[~/neon-main]: cargo lcheck --features testing Checking pageserver v0.1.0 (/home/admin/neon-main/pageserver) Building [=======================> ] 716/721: pageserver error: implementation of `Deserialize` is not general enough --> pageserver/src/tenant/storage_layer/inmemory_layer.rs:179:29 \| 179 \| let value = ValueDe::des(&reconstruct_state.scratch)?; \| ^^^^^^^^^^^^ implementation of `Deserialize` is not general enough \| = note: `ValueDe<'_>` must implement `Deserialize<'0>`, for any lifetime `'0`... = note: ...but `ValueDe<'_>` actually implements `Deserialize<'1>`, for some specific lifetime `'1` error: implementation of `Deserialize` is not general enough --> pageserver/src/tenant/storage_layer/delta_layer.rs:792:23 \| 792 \| let val = ValueDe::des(&reconstruct_state.scratch).with_context(\|\| { \| ^^^^^^^^^^^^ implementation of `Deserialize` is not general enough \| = note: `ValueDe<'_>` must implement `Deserialize<'0>`, for any lifetime `'0`... = note: ...but `ValueDe<'_>` actually implements `Deserialize<'1>`, for some specific lifetime `'1`	2024-01-30 10:43:24 +00:00
Christian Schwarz	de8076d97d	use smallvec & pooling to avoid allocations on reconstruction path	2024-01-30 09:37:53 +00:00
Christian Schwarz	a28cdf1c28	wal-redo: consume reconstruct state as references (needed for next patch, useful independently) We didn't take advantage of having the owned types inside walredo.rs, might as well just pass them in as reference so we can re-use their allocation in the next commit.	2024-01-30 09:11:16 +00:00
Christian Schwarz	3cd4f8aa59	possibly found the place where we do all those allocations, will check tomorrow	2024-01-29 20:25:02 +00:00
Christian Schwarz	c98215674c	avoid Vec::new() in walredo code path; still no dramatic improvement over before_scratch.svg	2024-01-29 20:10:03 +00:00
Christian Schwarz	0e3561f6d1	WIP: try to eliminate the raw_vec::finish_grow and bytes::promotable_even-drop This one doesn't make a big difference.	2024-01-29 19:52:05 +00:00
Christian Schwarz	28a4247c97	rip out slot pinning, has about 5% speedup	2024-01-29 19:37:35 +00:00
Christian Schwarz	70bc01494c	Revert "broken impl of a permit pool to shave off its allocations" This reverts commit `a1af2c7150`.	2024-01-29 19:23:09 +00:00
Christian Schwarz	a1af2c7150	broken impl of a permit pool to shave off its allocations	2024-01-29 19:22:55 +00:00
Christian Schwarz	043ed5edea	for posterity: RSS is about 18GB with previous bench at env.pageserver_config_override='page_cache_size=2097152;max_file_descriptors=500000;virtual_file_io_engine="tokio-epoll-uring"'	2024-01-29 18:35:07 +00:00
Christian Schwarz	6753ff089c	results: req_lru_size=2 gives tokio-epoll-uring 16k GetPage/s @ 110k IOPS, std-fs: 9.5k GetPage/s @ 65k IOPS RUST_BACKTRACE=1 ./target/release/pagebench get-page-latest-lsn --mgmt-api-endpoint http://localhost:15011 --page-service-connstring=postgresql://localhost:15010 --keyspace-cache keyspace.cache --limit-to-first-n-targets 1000 --set-io-engine tokio-epoll-uring --set-req-lru-size 2 --runtime 2m Biggest gain with lru_size from 0 to 1, yay. Adding one more gives another 1-2k cgroup mem.high unlimited made sure global page cache is large enough to not have any misses MAKE SURE TO WARM UP, IT TAKES A WHILE, STILL DON'T KNOW WHY WARMUP IS THAT BADLY NEEDED std-fs: 50% cpu, lot of iowait 2024-01-29T18:25:52.923572Z INFO all clients stopped { "total": { "request_count": 1194213, "latency_mean": "68ms 343us", "latency_percentiles": { "p95": "152ms 63us", "p99": "201ms 215us", "p99.9": "260ms 991us", "p99.99": "314ms 623us" } } } tokio-epoll-uring: 100% cpu utilization Disk isn't saturated. We're CPU bound here. { "total": { "request_count": 1927700, "latency_mean": "43ms 11us", "latency_percentiles": { "p95": "83ms 263us", "p99": "101ms 887us", "p99.9": "124ms 991us", "p99.99": "147ms 583us" } } }	2024-01-29 18:33:04 +00:00
Christian Schwarz	49a5e411d6	implement request-scoped LRU cache	2024-01-29 18:22:00 +00:00
Christian Schwarz	21a11822e8	results: tokio-epoll-uring 3.3k GetPage/s @ 240k IOPS, std-fs: 1.2k GetPage/s @ 80k IOPS We have immense read amplification, I think we read the same blk multiple times during one getpage request. Before the switch to O_DIRECT, we'd go to the kernel page cache many times. std-fs has an edge there, it's more efficient than tokio-epoll-uring for workloads that have a high kernel page cache hit rate. With O_DIRECT, we now go to the disk for each read, making the inefficiency apparent. tokio-epoll-uring is much better there, as we can see it can drive up to 240k IOPS, which is 2 GiB/s of random 8k reads, which afaik is the max that the EC2 NVMe allows. CPU isn't near 100%. So, we're IO bound. Idea to try out to reduce the read amplification: request-local page cache.	2024-01-29 16:21:22 +00:00
Christian Schwarz	aca2d7bdea	use O_DIRECT for VirtualFile reads	2024-01-29 16:21:14 +00:00
Christian Schwarz	db44395ee2	rip out materialized page cache	2024-01-29 14:45:16 +00:00
Christian Schwarz	03874009ec	add back page cache but not for DeltaLayerValue and ImageLayerValue	2024-01-29 14:44:55 +00:00
Christian Schwarz	0033b4c985	results: both tokio-epoll-uring and std-fs achieve about 4k GetPage/sec @ 60k IOPS	2024-01-29 13:53:13 +00:00
Christian Schwarz	a608667301	rip out page cache	2024-01-29 13:53:07 +00:00
Christian Schwarz	b9b7670a3a	hack: use a single runtime in pageserver doesn't seem to make a meaningful perf difference under get-page-latest-lsn load	2024-01-29 12:23:45 +00:00
Christian Schwarz	62a3d87098	results under higher memory pressure show that tokio-epoll-uring pays off setup: sudo mkdir /sys/fs/cgroup/benchmark admin@ip-172-31-13-23:[~/neon-main]: sudo mkdir /sys/fs/cgroup/benchmark admin@ip-172-31-13-23:[~/neon-main]: sudo chown admin:admin /sys/fs/cgroup/benchmark admin@ip-172-31-13-23:[~/neon-main]: sudo chown admin:admin /sys/fs/cgroup/benchmark/cgroup.procs admin@ip-172-31-13-23:[~/neon-main]: echo THE_PID_OF_THE_SHELL_WHERE_WE_LAUNCH_PAGESERVER > /sys/fs/cgroup/benchmark/cgroup.procs from another shell, that's not in the cgroup, run pagebench admin@ip-172-31-13-23:[~/neon-main]: RUST_BACKTRACE=1 ./target/release/pagebench get-page-latest-lsn --mgmt-api-endpoint http://localhost:15011 --page-service-connstring=postgresql://localhost:15010 --keyspace-cache keyspace.cache --per-target-rate-limit 2000 --limit-to-first-n-targets 500 --set-io-engine YOUR_IO_ENGINE --runtime 10s tokio-epoll-uring: { "total": { "request_count": 63780, "latency_mean": "77ms 993us", "latency_percentiles": { "p95": "120ms 703us", "p99": "143ms 743us", "p99.9": "171ms 775us", "p99.99": "195ms 583us" } } } Does ca 85-90k IOPS to the NVMe. std-fs { "total": { "request_count": 49303, "latency_mean": "100ms 669us", "latency_percentiles": { "p95": "214ms 399us", "p99": "268ms 799us", "p99.9": "335ms 359us", "p99.99": "399ms 615us" } } } Does ca 70k IOPS to the NVMe. with higher memory pre	2024-01-29 10:50:52 +00:00
Christian Schwarz	8d6ce71b29	hacky: ability to set io_engine via mgmt_api => pagebench	2024-01-29 10:50:20 +00:00
Christian Schwarz	d23ea718ee	2min 3 tenants, 2000 req/s each; that is 0 IOPS workload (all in PS/Kernel page cache) Very comparable. tokio-epoll-uring { "total": { "request_count": 719999, "latency_mean": "375us", "latency_percentiles": { "p95": "576us", "p99": "649us", "p99.9": "823us", "p99.99": "1ms 636us" } } } std-fs { "total": { "request_count": 719997, "latency_mean": "341us", "latency_percentiles": { "p95": "543us", "p99": "618us", "p99.9": "748us", "p99.99": "1ms 358us" } } }	2024-01-27 12:59:15 +00:00
Christian Schwarz	73a7ca38b3	same config, but, rate limit of 2/sec per tenant => bursty due to ticker behavior RUST_BACKTRACE=1 ./target/release/pagebench get-page-latest-lsn --mgmt-api-endpoint http://localhost:15011 --page-service-connstring=postgresql://localhost:15010 --keyspace-cache keyspace.cache --per-target-rate-limit 2 --runtime 2m std-fs { "total": { "request_count": 240001, "latency_mean": "73ms 562us", "latency_percentiles": { "p95": "101ms 311us", "p99": "106ms 431us", "p99.9": "115ms 455us", "p99.99": "129ms 407us" } } } tokio-epoll-uring { "total": { "request_count": 240000, "latency_mean": "84ms 517us", "latency_percentiles": { "p95": "116ms 671us", "p99": "125ms 759us", "p99.9": "138ms 239us", "p99.99": "148ms 223us" } } }	2024-01-27 12:51:11 +00:00
Christian Schwarz	7eb1d4cfa6	manual 2min test run including warmup RUST_BACKTRACE=1 ./target/release/pagebench get-page-latest-lsn --mgmt-api-endpoint http://localhost:15011 --page-service-connstring=postgresql://localhost:15010 --keyspace-cache keyspace.cache --runtime 2m 2min std-fs { "total": { "request_count": 1213184, "latency_mean": "67ms 793us", "latency_percentiles": { "p95": "153ms 471us", "p99": "197ms 247us", "p99.9": "246ms 399us", "p99.99": "288ms 255us" } } } 2min tokio-epoll-uring { "total": { "request_count": 825637, "latency_mean": "108ms 702us", "latency_percentiles": { "p95": "136ms 959us", "p99": "191ms 615us", "p99.9": "9s 977ms 855us", "p99.99": "16s 334ms 847us" } } }	2024-01-27 12:48:32 +00:00
Christian Schwarz	6ebd683327	TODO/workaround: walredo quiescing broken with compaction_period=0	2024-01-27 12:48:27 +00:00
Christian Schwarz	b1ecdfe099	WIP: async walredo	2024-01-27 12:47:29 +00:00
Christian Schwarz	82a74d0e77	pagebench: fix percentiles reporting	2024-01-27 12:46:36 +00:00
Christian Schwarz	49b43c75e2	run test_pageserver_max_throughput_getpage_at_latest_lsn with 1k tenants, compare std-fs with tokio-epoll-uring	2024-01-26 16:49:12 +00:00
Vlad Lazar	5b34d5f561	pageserver: add vectored get latency histogram (#6461 ) This patch introduces a new set of grafana metrics for a histogram: pageserver_get_vectored_seconds_bucket{task_kind="Compaction\|PageRequestHandler"}. While it has a `task_kind` label, only compaction and SLRU fetches are tracked. This reduces the increase in cardinality to 24. The metric should allow us to isolate performance regressions while the vectorized get is being implemented. Once the implementation is complete, it'll also allow us to quantify the improvements.	2024-01-26 13:40:03 +00:00
Alexander Bayandin	26c55b0255	Compute: fix rdkit extension build (#6488 ) ## Problem `rdkit` extension build started to fail because of the changed checksum of the Comic Neue font: ``` Downloading https://fonts.google.com/download?family=Comic%20Neue... CMake Error at Code/cmake/Modules/RDKitUtils.cmake:257 (MESSAGE): The md5 checksum for /rdkit-src/Code/GraphMol/MolDraw2D/Comic_Neue.zip is incorrect; expected: 850b0df852f1cda4970887b540f8f333, found: b7fd0df73ad4637504432d72a0accb8f ``` https://github.com/neondatabase/neon/actions/runs/7666530536/job/20895534826 Ref https://neondb.slack.com/archives/C059ZC138NR/p1706265392422469 ## Summary of changes - Disable comic fonts for `rdkit` extension	2024-01-26 12:39:20 +00:00
Vadim Kharitonov	12e9b2a909	Update plv8 (#6465 )	2024-01-26 09:56:11 +00:00
Christian Schwarz	918b03b3b0	integrate tokio-epoll-uring as alternative VirtualFile IO engine (#5824 )	2024-01-26 09:25:07 +01:00
Alexander Bayandin	d36623ad74	CI: cancel old e2e-tests on new commits (#6463 ) ## Problem The triggered `e2e-tests` job is not cancelled along with other jobs in a PR if the PR gets new commits. We can improve the situation by setting `concurrency_group` for the remote workflow (https://github.com/neondatabase/cloud/pull/9622 adds a `concurrency_group` input to the remote workflow). Ref https://neondb.slack.com/archives/C059ZC138NR/p1706087124297569 Cloud's part added in https://github.com/neondatabase/cloud/pull/9622 ## Summary of changes - Set `concurrency_group` parameter when triggering `e2e-tests` - At the beginning of a CI pipeline, trigger Cloud's `cancel-previous-in-concurrency-group.yml` workflow which cancels previously triggered e2e-tests	2024-01-25 19:25:29 +00:00
Christian Schwarz	689ad72e92	fix(neon_local): leaks child process if it fails to start & pass checks (#6474 ) refs https://github.com/neondatabase/neon/issues/6473 Before this PR, if process_started() didn't return Ok(true) until we ran out of retries, we'd return an error but leave the process running. Try it by adding a 20s sleep to the pageserver `main()`, e.g., right before we claim the pidfile. Without this PR, output looks like so: ``` (.venv) cs@devvm-mbp:[~/src/neon-work-2]: ./target/debug/neon_local start Starting neon broker at 127.0.0.1:50051. storage_broker started, pid: 2710939 . attachment_service started, pid: 2710949 Starting pageserver node 1 at '127.0.0.1:64000' in ".neon/pageserver_1"..... pageserver has not started yet, continuing to wait..... pageserver 1 start failed: pageserver did not start in 10 seconds No process is holding the pidfile. The process must have already exited. Leave in place to avoid race conditions: ".neon/pageserver_1/pageserver.pid" No process is holding the pidfile. The process must have already exited. Leave in place to avoid race conditions: ".neon/safekeepers/sk1/safekeeper.pid" Stopping storage_broker with pid 2710939 immediately....... storage_broker has not stopped yet, continuing to wait..... neon broker stop failed: storage_broker with pid 2710939 did not stop in 10 seconds Stopping attachment_service with pid 2710949 immediately....... attachment_service has not stopped yet, continuing to wait..... 
attachment service stop failed: attachment_service with pid 2710949 did not stop in 10 seconds ``` and we leak the pageserver process ``` (.venv) cs@devvm-mbp:[~/src/neon-work-2]: ps aux \| grep pageserver cs 2710959 0.0 0.2 2377960 47616 pts/4 Sl 14:36 0:00 /home/cs/src/neon-work-2/target/debug/pageserver -D .neon/pageserver_1 -c id=1 -c pg_distrib_dir='/home/cs/src/neon-work-2/pg_install' -c http_auth_type='Trust' -c pg_auth_type='Trust' -c listen_http_addr='127.0.0.1:9898' -c listen_pg_addr='127.0.0.1:64000' -c broker_endpoint='http://127.0.0.1:50051/' -c control_plane_api='http://127.0.0.1:1234/' -c remote_storage={local_path='../local_fs_remote_storage/pageserver'} ``` After this PR, there is no leaked process.	2024-01-25 19:20:02 +01:00
Christian Schwarz	fd4cce9417	test_pageserver_max_throughput_getpage_at_latest_lsn: remove n_tenants=100 combination (#6477 ) Need to fix the neon_local timeouts first (https://github.com/neondatabase/neon/issues/6473) and also not run them on every merge, but only nightly: https://github.com/neondatabase/neon/issues/6476	2024-01-25 18:17:53 +00:00
Arpad Müller	d52b81340f	S3 based recovery (#6155 ) Adds a new `time_travel_recover` function to the `RemoteStorage` trait that allows time travel like functionality for S3 buckets, regardless of their content (it is not even pageserver related). It takes a different approach from [this post](https://aws.amazon.com/blogs/storage/point-in-time-restore-for-amazon-s3-buckets/) that is more complicated. It takes as input a prefix, a target timestamp, and a limit timestamp: * executes [`ListObjectVersions`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectVersions.html) * obtains the latest version that comes before the target timestamp * copies that latest version to the same prefix * if there are versions newer than the limit timestamp, it doesn't do anything for the file The limit timestamp is meant to be some timestamp before the start of the recovery operation and after any changes that one wants to revert. For example, it might be the time point after a tenant was detached from all involved pageservers. The limiting mechanism ensures that the operation is idempotent and can be retried without causing additional writes/copies. The approach fulfills all the requirements laid out in 8233, and is a recoverable operation. Nothing is deleted permanently, only new entries added to the version log. I also enable [nextest retries](https://nexte.st/book/retries.html) to help with some general S3 flakiness (on top of low level retries). Part of https://github.com/neondatabase/cloud/issues/8233	2024-01-25 18:23:18 +01:00
Joonas Koivunen	8dee9908f8	fix(compaction_task): wrong log levels (#6442 ) Filter what we log on compaction task. Per discussion in last triage call, fixing these by introducing and inspecting the root cause within anyhow::Error instead of rolling out proper conversions. Fixes: #6365 Fixes: #6367	2024-01-25 18:45:17 +02:00
Konstantin Knizhnik	19ed230708	Add support for PS sharding in compute (#6205 ) refer #5508 replaces #5837 ## Problem This PR implements sharding support at compute side. Relations are split into stripes and `get_page` requests are redirected to the particular shard where the stripe is located. All other requests (i.e. get relation or database size) are always sent to shard 0. ## Summary of changes Support of sharding at compute side includes three things: 1. Make it possible to specify, and change at runtime, connections to more than one page server 2. Send `get_page` request to the particular shard (determined by hash of page key) 3. Support multiple servers in prefetch ring requests ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: John Spray <john@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2024-01-25 15:53:31 +02:00
Joonas Koivunen	463b6a26b5	test: show relative order eviction with "fast growing tenant" (#6377 ) Refactor out test_disk_usage_eviction tenant creation and add a custom case with 4 tenants, 3 made with pgbench scale=1 and 1 made with pgbench scale=4. Because the tenants are created in order of scales [1, 1, 1, 4] this is simple enough to demonstrate the problem with using absolute access times, because on a disk usage based eviction run we will disproportionally target the first scale=1 tenant(s), and the later larger tenant does not lose anything. This test is not enough to show the difference between `relative_equal` and `relative_spare` (the fudge factor); much larger scale will be needed for "the large tenant", but that will make debug mode tests slower. Cc: #5304	2024-01-25 15:38:28 +02:00
John Spray	c9b1657e4c	pageserver: fixes for creation operations overlapping with shutdown/startup (#6436 ) ## Problem For #6423, creating a reproducer turned out to be very easy, as an extension to test_ondemand_activation. However, before I had diagnosed the issue, I was starting with a more brute force approach of running creation API calls in the background while restarting a pageserver, and that shows up a bunch of other interesting issues. In this PR: - Add the reproducer for #6423 by extending `test_ondemand_activation` (confirmed that this test fails if I revert the fix from https://github.com/neondatabase/neon/pull/6430) - In timeline creation, return 503 responses when we get an error and the tenant's cancellation token is set: this covers the cases where we get an anyhow::Error from something during timeline creation as a result of shutdown. - While waiting for tenants to become active during creation, don't .map_err() the result to a 500: instead let the `From` impl map the result to something appropriate (this includes mapping shutdown to 503) - During tenant creation, we were calling `Tenant::load_local` because no Preload object is provided. This is usually harmless because the tenant dir is empty, but if there are some half-created timelines in there, bad things can happen. Propagate the SpawnMode into Tenant::attach, so that it can properly skip _any_ attempt to load timelines if creating. - When we call upsert_location, there's a SpawnMode that tells us whether to load from remote storage or not. But if the operation is a retry and we already have the tenant, it is not correct to skip loading from remote storage: there might be a timeline there. This isn't strictly a correctness issue as long as the caller behaves correctly (does not assume that any timelines are persistent until the creation is acked), but it's a more defensive position. - If we shut down while the task in Tenant::attach is running, it can end up spawning rogue tasks. 
Fix this by holding a GateGuard through here, and in upsert_location shutting down a tenant after calling tenant_spawn if we can't insert it into tenants_map. This fixes the expected behavior that after shutdown_all_tenants returns, no tenant tasks are running. - Add `test_create_churn_during_restart`, which runs tenant & timeline creations across pageserver restarts. - Update a couple of tests that covered cancellation, to reflect the cleaner errors we now return.	2024-01-25 12:35:52 +00:00
Arpad Müller	b92be77e19	Make RemoteStorage not use async_trait (#6464 ) Makes the `RemoteStorage` trait not be based on `async_trait` any more. To avoid recursion in async (not supported by Rust), we made `GenericRemoteStorage` generic on the "Unreliable" variant. That allows us to have the unreliable wrapper never contain/call itself. related earlier work: #6305	2024-01-24 21:27:54 +01:00
Arthur Petukhovsky	8cb8c8d7b5	Allow remove_wal.rs to run on inactive timelines (#6462 ) Temporarily enable it on staging to help with https://github.com/neondatabase/neon/issues/6403 Can also be deployed to prod if it works well on staging.	2024-01-24 16:48:56 +00:00
Conrad Ludgate	210700d0d9	proxy: add newtype wrappers for string based IDs (#6445 ) ## Problem too many string based IDs. easy to mix up ID types. ## Summary of changes Add a bunch of `SmolStr` wrappers that provide convenience methods but are type safe	2024-01-24 16:38:10 +00:00
Joonas Koivunen	a0a3ba85e7	fix(page_service): walredo logging problem (#6460 ) Fixes: #6459 by formatting full causes of an error to log, while keeping the top level string for end-user. Changes user visible error detail from: ``` -DETAIL: page server returned error: Read error: Failed to reconstruct a page image: +DETAIL: page server returned error: Read error ``` However on pageserver logs: ``` -ERROR page_service_conn_main{...}: error reading relation or page version: Read error: Failed to reconstruct a page image: +ERROR page_service_conn_main{...}: error reading relation or page version: Read error: reconstruct a page image: launch walredo process: spawn process: Permission denied (os error 13) ```	2024-01-24 15:47:17 +00:00
Arpad Müller	d820aa1d08	Disable initdb cancellation (#6451 ) ## Problem The initdb cancellation added in #5921 is not sufficient to reliably abort the entire initdb process. Initdb also spawns children. The tests added by #6310 (#6385) and #6436 now do initdb cancellations on a more regular basis. In #6385, I attempted to issue `killpg` (after giving it a new process group ID) to kill not just the initdb but all its spawned subprocesses, but this didn't work. Initdb doesn't take that long in the end either, so we just wait until it concludes. ## Summary of changes * revert initdb cancellation support added in #5921 * still return `Err(Cancelled)` upon cancellation, but this is just to not have to remove the cancellation infrastructure * fixes to the `test_tenant_delete_races_timeline_creation` test to make it reliably pass Fixes #6385	2024-01-24 13:06:05 +01:00
Christian Schwarz	996abc9563	pagebench-based GetPage@LSN performance test (#6214 )	2024-01-24 12:51:53 +01:00
John Spray	a72af29d12	control_plane/attachment_service: implement PlacementPolicy::Detached (#6458 ) ## Problem The API for detaching things wasn't implemented yet, but one could hit this case indirectly from tests when using attach-hook, and find tenants unexpectedly attached again because their policy remained Single. ## Summary of changes Add PlacementPolicy::Detached, and: - add the behavior for it in schedule() - in tenant_migrate, refuse if the policy is detached - automatically set this policy in attach-hook if the caller has specified pageserver=null.	2024-01-24 12:49:30 +01:00
Sasha Krassovsky	4f51824820	Fix creating publications for all tables	2024-01-23 22:41:00 -08:00
Christian Schwarz	743f6dfb9b	fix(attachment_service): corrupted attachments.json when parallel requests (#6450 ) The pagebench integration PR (#6214) issues attachment requests in parallel. We observed corrupted attachments.json from time to time, especially in the test cases with high tenant counts. The atomic overwrite added in #6444 exposed the root cause cleanly: the `.commit()` calls of two request handlers could interleave or be reordered. See also: https://github.com/neondatabase/neon/pull/6444#issuecomment-1906392259 This PR makes changes to the `persistence` module to fix the above race: - mpsc queue for PendingWrites - one writer task performs the writes in mpsc queue order - request handlers that need to do writes do it using the new `mutating_transaction` function. `mutating_transaction`, while holding the lock, does the modifications, serializes the post-modification state, and pushes that as a `PendingWrite` into the mpsc queue. It then releases the lock and `await`s the completion of the write. The writer task executes the `PendingWrites` in queue order. Once the write has been executed, it wakes the writing tokio task.	2024-01-23 19:14:32 +00:00
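
The single-writer scheme described in commit 743f6dfb9b above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: it uses std threads and `std::sync::mpsc` instead of tokio tasks, and the `Persistence` and `Inner` types, the `BTreeMap` state, and the in-memory `disk` log are all hypothetical stand-ins. Only the names `PendingWrite` and `mutating_transaction` come from the commit message.

```rust
use std::collections::BTreeMap;
use std::sync::mpsc::{self, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// A serialized post-modification snapshot plus a channel to signal completion.
struct PendingWrite {
    bytes: String,
    done: Sender<()>,
}

/// Hypothetical state: the in-memory map and the queue live behind ONE lock,
/// so the order of enqueued snapshots always matches the order of modifications.
struct Inner {
    state: BTreeMap<String, u32>,
    queue: Sender<PendingWrite>,
}

pub struct Persistence {
    inner: Mutex<Inner>,
}

/// The single writer: performs writes strictly in queue order.
/// The `disk` vector stands in for overwriting a file; its last entry is the "file content".
fn spawn_writer(rx: Receiver<PendingWrite>, disk: Arc<Mutex<Vec<String>>>) -> JoinHandle<()> {
    thread::spawn(move || {
        for write in rx {
            disk.lock().unwrap().push(write.bytes);
            // Wake the waiting request handler; ignore it if the handler went away.
            let _ = write.done.send(());
        }
    })
}

impl Persistence {
    pub fn new(disk: Arc<Mutex<Vec<String>>>) -> (Arc<Self>, JoinHandle<()>) {
        let (tx, rx) = mpsc::channel();
        let writer = spawn_writer(rx, disk);
        let inner = Inner { state: BTreeMap::new(), queue: tx };
        (Arc::new(Self { inner: Mutex::new(inner) }), writer)
    }

    /// Modify the state, serialize it, and enqueue the write while holding the lock;
    /// then release the lock and block until the writer has executed the write.
    pub fn mutating_transaction<F>(&self, f: F)
    where
        F: FnOnce(&mut BTreeMap<String, u32>),
    {
        let done_rx = {
            let mut inner = self.inner.lock().unwrap();
            f(&mut inner.state);
            let bytes = format!("{:?}", inner.state);
            let (done_tx, done_rx) = mpsc::channel();
            inner
                .queue
                .send(PendingWrite { bytes, done: done_tx })
                .expect("writer thread should be alive");
            done_rx
        }; // lock released here, before waiting
        done_rx.recv().expect("writer should signal completion");
    }
}
```

Because the snapshot is serialized and enqueued under the same lock that guards the modification, two concurrent handlers cannot have their writes interleaved or reordered, which is the race the commit message describes. Waiting for completion outside the lock keeps other handlers from being blocked on I/O.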

4477 Commits