Christian Schwarz
82d20b52b3
make noise when dropping an IoConcurrency with unfinished requests
2025-01-20 19:12:00 +01:00
Christian Schwarz
3b1328423e
basebackup: fetch all SLRUs of one basebackup using the same IoConcurrency
2025-01-20 16:58:14 +01:00
Christian Schwarz
2eb235e923
doc string explaining why we're deadlock free right now and why it's so brittle
2025-01-17 18:33:34 +01:00
Christian Schwarz
40ab9c2c5e
we can avoid adding the Arc<Mutex<>> around EphemeralLayer if we instead extend the lifetime of the InMemoryLayer for the spawned IO future; plus it's semantically more similar to what we now do for Delta and Image layers
2025-01-17 18:16:17 +01:00
Christian Schwarz
c43400389f
delta & image layer spawned IOs: keep layer resident until IO is done
2025-01-17 18:00:13 +01:00
Christian Schwarz
65932512c1
run tests with futures-unordered
2025-01-16 20:03:01 +01:00
Christian Schwarz
1866f261e0
make mypy pass
2025-01-16 20:01:42 +01:00
Christian Schwarz
7c662b771a
Merge branch 'problame/hung-shutdown/fix' into vlad/read-path-concurrent-io
2025-01-16 19:22:38 +01:00
Christian Schwarz
8f40bd4eb3
there is no Error Fe message -,-
2025-01-16 19:21:44 +01:00
Christian Schwarz
d2f8342080
Merge branch 'problame/hung-shutdown/fix' into vlad/read-path-concurrent-io
2025-01-16 18:16:36 +01:00
Christian Schwarz
92e4dd7ffa
script: template NEON_REPO_DIR
2025-01-16 18:14:34 +01:00
Christian Schwarz
0c3ab9c494
move test message tag to 99 and represent Fe message tag as enum, like we do for Be message
2025-01-16 18:07:56 +01:00
Christian Schwarz
c19a16792a
address nit ; https://github.com/neondatabase/neon/pull/10386#discussion_r1918782034
2025-01-16 17:54:14 +01:00
Christian Schwarz
cf75eb7d86
Revert "hacky experiment: what if we had more walredo procs => doesn't move the needle on throughput"
...
This reverts commit 9fffe6e60d.
2025-01-16 16:46:49 +01:00
Christian Schwarz
6ededa17e2
Revert "experiment: buffered socket with 128k buffer size; not super needle-moving"
...
This reverts commit 7e13e5fc4a.
2025-01-16 16:42:10 +01:00
Christian Schwarz
7e13e5fc4a
experiment: buffered socket with 128k buffer size; not super needle-moving
2025-01-16 16:42:01 +01:00
Christian Schwarz
45358bcb65
in the deep_layers_with_delta script, make the stack height an argument
2025-01-16 16:41:15 +01:00
Christian Schwarz
9fffe6e60d
hacky experiment: what if we had more walredo procs => doesn't move the needle on throughput
2025-01-16 13:58:23 +01:00
Christian Schwarz
2ff0a4ae82
extract the l0stack generator into a reusable python module
2025-01-16 13:24:34 +01:00
Christian Schwarz
66c0df8109
doc comment on BatchedFeMessage explaining WeakHandle; https://github.com/neondatabase/neon/pull/10386#discussion_r1916968951
2025-01-15 21:50:00 +01:00
Christian Schwarz
9fe77c527f
inline get_impl; https://github.com/neondatabase/neon/pull/10386#discussion_r1916939623
2025-01-15 21:47:39 +01:00
Christian Schwarz
7fb4595c7e
fix: WeakHandle was holding on to the Timeline allocation
...
This made test_timeline_deletion_with_files_stuck_in_upload_queue fail
because the RemoteTimelineClient was being kept alive.
The fix is to stop keeping the timeline alive from WeakHandle.
2025-01-15 21:46:37 +01:00
Christian Schwarz
350dc251df
test case demonstrates the issue: we hold Timeline object alive
...
--- STDERR: pageserver tenant::timeline::handle::tests::test_weak_handles ---
thread 'tenant::timeline::handle::tests::test_weak_handles' panicked at pageserver/src/tenant/timeline/handle.rs:1131:9:
assertion `left == right` failed
left: 3
right: 2
2025-01-15 21:46:30 +01:00
Christian Schwarz
5b77a6d3ce
address clippy
2025-01-15 19:38:21 +01:00
Christian Schwarz
8c5005ff59
rename IoConcurrency::{todo=>serial} and remove deprecation warning
2025-01-15 19:38:05 +01:00
Christian Schwarz
f8218ac5fc
Revert "investigation: add log_if_slow => shows that the io_futures are slow"
...
This reverts commit e81fa7137e.
2025-01-15 19:34:37 +01:00
Christian Schwarz
40470c66cd
remove opportunistic poll; removing it seems slightly beneficial for perf
...
esp before I remembered to configure pipelining, the unpipelined
configuration achieved ~10% higher tput.
In any case, it makes sense to not do the opportunistic polling because
it registers the wrong waker.
2025-01-15 19:34:05 +01:00
Christian Schwarz
9b9479881a
extend script with instructions to configure batching
2025-01-15 19:30:15 +01:00
Christian Schwarz
af11b201bd
now the issue is no longer reproducible, maybe it was the barriers?
2025-01-15 19:10:45 +01:00
Christian Schwarz
8fafff37c5
remove the whole barriers business
2025-01-15 19:00:00 +01:00
Christian Schwarz
e81fa7137e
investigation: add log_if_slow => shows that the io_futures are slow
2025-01-15 18:56:07 +01:00
Christian Schwarz
e60738f029
it's reproducible before the merge, so, continuing to investigate and fix here
2025-01-15 18:43:01 +01:00
Christian Schwarz
f75b07a160
I find that if I ever go beyond queue-depth=4, something in the pageserver locks up.
2025-01-15 18:31:40 +01:00
Christian Schwarz
a5524fcf4d
add comment to the script on using queue-depthed pagebench
2025-01-15 18:31:29 +01:00
Christian Schwarz
351da2349e
Merge branch 'problame/hung-shutdown/fix' into vlad/read-path-concurrent-io
2025-01-15 17:09:02 +01:00
Christian Schwarz
c545d227b9
review doc comment
2025-01-15 16:24:39 +01:00
Christian Schwarz
a4fc6a92c9
fix cargo doc
2025-01-15 16:10:04 +01:00
Christian Schwarz
2205736262
doc comment & one fixup
2025-01-15 14:27:08 +01:00
Christian Schwarz
5f9ddbae2f
Merge branch 'problame/hung-shutdown/demo-hypothesis' into problame/hung-shutdown/fix
2025-01-15 00:25:11 +01:00
Christian Schwarz
173f18832c
fixup
2025-01-15 00:24:59 +01:00
Christian Schwarz
23bd5833e1
Merge branch 'problame/hung-shutdown/demo-hypothesis' into problame/hung-shutdown/fix
2025-01-15 00:21:54 +01:00
Christian Schwarz
dedd524d7e
refinements
2025-01-15 00:21:28 +01:00
Christian Schwarz
0340f00228
post-merge fix the handling of the new pagestream Test message, so that the regression test now passes
...
BUILD_TYPE=debug DEFAULT_PG_VERSION=16 poetry run pytest ./test_runner/regress/test_page_service_batching_regressions.py --timeout=0 --pdb
2025-01-14 23:56:35 +01:00
Christian Schwarz
366ff9ffcc
Merge branch 'problame/hung-shutdown/demo-hypothesis' into problame/hung-shutdown/fix
2025-01-14 23:51:53 +01:00
Christian Schwarz
a8f9b564be
fix cd pageserver && cargo clippy --features testing build
2025-01-14 23:50:22 +01:00
Christian Schwarz
5450e54dab
bump ci
2025-01-14 22:47:16 +01:00
Christian Schwarz
53b05c4ba0
cleanups to make CI pass (well, fail because the bug isn't fixed yet)
2025-01-14 22:45:09 +01:00
Christian Schwarz
1f7d173235
Merge remote-tracking branch 'origin/main' into problame/hung-shutdown/demo-hypothesis
2025-01-14 22:33:20 +01:00
Christian Schwarz
8454e19a0f
address warnings and such
2025-01-14 22:28:08 +01:00
Christian Schwarz
45e08d0aa5
it repros
2025-01-14 22:16:27 +01:00