rust/neon - neon - Gitea: Git with a cup of tea

rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-08 22:12:56 +00:00

Author	SHA1	Message	Date
Heikki Linnakangas	48535798ba	Merge remote-tracking branch 'origin/main' into communicator-rewrite	2025-07-23 00:00:10 +03:00
Folke Behrens	9c0efba91e	Bump rand crate to 0.9 (#12674 )	2025-07-22 09:31:39 +00:00
Erik Grinaker	367d96e25b	Merge branch 'main' into communicator-rewrite	2025-07-14 18:47:23 +02:00
Erik Grinaker	30b877074c	pagebench: add CPU profiling support (#12478 ) ## Problem The new communicator gRPC client has significantly worse Pagebench performance than a basic gRPC client. We need to find out why. ## Summary of changes Add a `pagebench --profile` flag which takes a client CPU profile of the benchmark and writes a flamegraph to `profile.svg`.	2025-07-14 11:44:53 +00:00
Erik Grinaker	fecb707b19	pagebench: add `idle-streams` (#12583 ) ## Problem For the communicator scheduling policy, we need to understand the server-side cost of idle gRPC streams. Touches #11735. ## Summary of changes Add an `idle-streams` benchmark to `pagebench` which opens a large number of idle gRPC GetPage streams.	2025-07-14 09:41:58 +00:00
Erik Grinaker	8cd5370c00	Merge branch 'main' into communicator-rewrite	2025-07-11 10:39:26 +02:00
Erik Grinaker	8aa9540a05	pageserver/page_api: include block number and rel in gRPC `GetPageResponse` (#12542 ) ## Problem With gRPC `GetPageRequest` batches, we'll have non-trivial fragmentation/reassembly logic in several places of the stack (concurrent reads, shard splits, LFC hits, etc). If we included the block numbers with the pages in `GetPageResponse` we could have better verification and observability that the final responses are correct. Touches #11735. Requires #12480. ## Summary of changes Add a `Page` struct with`block_number` for `GetPageResponse`, along with the `RelTag` for completeness, and verify them in the rich gRPC client.	2025-07-10 22:35:14 +00:00
Erik Grinaker	44ea17b7b2	pageserver/page_api: add attempt to GetPage request ID (#12536 ) ## Problem `GetPageRequest::request_id` is supposed to be a unique ID for a request. It's not, because we may retry the request using the same ID. This causes assertion failures and confusion. Touches #11735. Requires #12480. ## Summary of changes Extend the request ID with a retry attempt, and handle it in the gRPC client and server.	2025-07-10 20:39:42 +00:00
Erik Grinaker	dcdfe80bf0	pagebench: add support for rich gRPC client (#12477 ) ## Problem We need to benchmark the rich gRPC client `client_grpc::PageserverClient` against the basic, no-frills `page_api::Client` to determine how much overhead it adds. Touches #11735. Requires #12476. ## Summary of changes Add a `pagebench --rich-client` parameter to use `client_grpc::PageserverClient`. Also adds a compression parameter to the client.	2025-07-10 17:30:09 +00:00
Erik Grinaker	81e7218c27	pageserver: tighten up gRPC `page_api::Client` (#12396 ) This patch tightens up `page_api::Client`. It's mostly superficial changes, but also adds a new constructor that takes an existing gRPC channel, for use with the communicator connection pool.	2025-07-08 18:15:13 +00:00
Erik Grinaker	bf01145ae4	Remove some old code	2025-07-02 11:46:54 +02:00
Erik Grinaker	958c2577f5	pageserver: tighten up `page_api::Client`	2025-07-01 17:54:41 +02:00
Erik Grinaker	c3cb1ab98d	Merge branch 'main' into communicator-rewrite	2025-06-30 21:07:01 +02:00
Erik Grinaker	67b04f8ab3	Fix a bunch of linter warnings	2025-06-30 11:10:02 +02:00
Heikki Linnakangas	f3ba201800	Run `cargo fmt`	2025-06-29 21:21:07 +03:00
Erik Grinaker	e50b914a8e	compute_tools: support gRPC base backups in `compute_ctl` (#12244 ) ## Problem `compute_ctl` should support gRPC base backups. Requires #12111. Requires #12243. Touches #11926. ## Summary of changes Support `grpc://` connstrings for `compute_ctl` base backups.	2025-06-27 16:39:00 +00:00
Erik Grinaker	f755979102	pageserver: payload compression for gRPC base backups (#12346 ) ## Problem gRPC base backups use gRPC compression. However, this has two problems: * Base backup caching will cache compressed base backups (making gRPC compression pointless). * Tonic does not support varying the compression level, and zstd default level is 10% slower than gzip fastest level. Touches https://github.com/neondatabase/neon/issues/11728. Touches https://github.com/neondatabase/cloud/issues/29353. ## Summary of changes This patch adds a gRPC parameter `BaseBackupRequest::compression` specifying the compression algorithm. It also moves compression into `send_basebackup_tarball` to reduce code duplication. A follow-up PR will integrate the base backup cache with gRPC.	2025-06-25 18:16:23 +00:00
Arpad Müller	552249607d	apply clippy fixes for 1.88.0 beta (#12331 ) The 1.88.0 stable release is near (this Thursday). We'd like to fix most warnings beforehand so that the compiler upgrade doesn't require approval from too many teams. This is therefore a preparation PR (like similar PRs before it). There is a lot of changes for this release, mostly because the `uninlined_format_args` lint has been added to the `style` lint group. One can read more about the lint [here](https://rust-lang.github.io/rust-clippy/master/#/uninlined_format_args). The PR is the result of `cargo +beta clippy --fix` and `cargo fmt`. One remaining warning is left for the proxy team. --------- Co-authored-by: Conrad Ludgate <conrad@neon.tech>	2025-06-24 10:12:42 +00:00
Heikki Linnakangas	e90be06d46	silence a few compiler warnings about unnecessary 'mut's and 'use's	2025-06-23 18:16:54 +03:00
Heikki Linnakangas	356ba67607	Merge remote-tracking branch 'origin/main' into HEAD I also included build script changes from https://github.com/neondatabase/neon/pull/12266, which is not yet merged but will be soon.	2025-06-23 17:46:30 +03:00
Erik Grinaker	15d079cd41	pagebench: improve `getpage-latest-lsn` gRPC support (#12293 ) This improves `pagebench getpage-latest-lsn` gRPC support by: * Using `page_api::Client`. * Removing `--protocol`, and using the `page-server-connstring` scheme instead. * Adding `--compression` to enable zstd compression.	2025-06-20 08:31:40 +00:00
Erik Grinaker	dc1625cd8e	pagebench: add `basebackup` gRPC support (#12250 ) ## Problem Pagebench does not support gRPC for `basebackup` benchmarks. Requires #12243. Touches #11728. ## Summary of changes Add gRPC support via gRPC connstrings, e.g. `pagebench basebackup --page-service-connstring grpc://localhost:51051`. Also change `--gzip-probability` to `--no-compression`, since this must be specified per-client for gRPC.	2025-06-19 15:40:57 +00:00
Heikki Linnakangas	5a045e7d52	Move pagestream_api to separate module (#12272 ) For general readability.	2025-06-18 12:03:14 +00:00
Erik Grinaker	d0b3629412	Tweak base backups	2025-06-13 13:47:26 -07:00
Elizabeth Murray	68f18ccacf	Request Tracker Prototype Does not include splitting requests across shards.	2025-06-05 13:32:18 -07:00
Erik Grinaker	2fb6164bf8	Misc build fixes	2025-06-05 17:22:11 +02:00
Erik Grinaker	37c58522a2	Merge branch 'main' into communicator-rewrite	2025-06-05 15:08:05 +02:00
Erik Grinaker	6a43f23eca	pagebench: add batch support (#12133 ) ## Problem The new gRPC page service protocol supports client-side batches. The current libpq protocol only does best-effort server-side batching. To compare these approaches, Pagebench should support submitting contiguous page batches, similar to how Postgres will submit them (e.g. with prefetches or vectored reads). ## Summary of changes Add a `--batch-size` parameter specifying the size of contiguous page batches. One batch counts as 1 RPS and 1 queue depth. For the libpq protocol, a batch is submitted as individual requests and we rely on the server to batch them for us. This will give a realistic comparison of how these would be processed in the wild (e.g. when Postgres sends 100 prefetch requests). This patch also adds some basic validation of responses.	2025-06-05 11:52:52 +00:00
Erik Grinaker	745b750f33	Merge branch 'main' into communicator-rewrite	2025-06-03 13:29:45 +02:00
Erik Grinaker	a21c1174ed	pagebench: add gRPC support for `get-page-latest-lsn` (#12077 ) ## Problem We need gRPC support in Pagebench to benchmark the new gRPC Pageserver implementation. Touches #11728. ## Summary of changes Adds a `Client` trait to make the client transport swappable, and a gRPC client via a `--protocol grpc` parameter. This must also specify the connstring with the gRPC port: ``` pagebench get-page-latest-lsn --protocol grpc --page-service-connstring grpc://localhost:51051 ``` The client is implemented using the raw Tonic-generated gRPC client, to minimize client overhead.	2025-06-02 14:50:49 +00:00
Heikki Linnakangas	f06bb2bbd8	Implement growing the hash table. Fix unit tests.	2025-05-29 15:54:55 +03:00
Elizabeth Murray	7c9bd542a6	Fix compile warnings, minor cleanup.	2025-05-26 06:30:48 -07:00
Elizabeth Murray	af9379ccf6	Use a sempahore to gate access to connections. Add metrics for testing.	2025-05-26 05:28:50 -07:00
Heikki Linnakangas	bb28109ffa	Merge remote-tracking branch 'origin/main' into communicator-rewrite-with-integrated-cache There were conflicts because of the differences in the page_api protocol that was merged to main vs what was on the branch. I adapted the code for the protocol in main.	2025-05-26 11:52:32 +03:00
Elizabeth Murray	0dddb1e373	Add back whitespace that was removed.	2025-05-19 06:34:52 -07:00
Elizabeth Murray	3acb263e62	Add first iteration of simulating a flakey network with a custom TCP.	2025-05-19 06:33:30 -07:00
Christian Schwarz	a7ce323949	benchmarking: extend `test_page_service_batching.py` to cover concurrent IO + batching under random reads (#10466 ) This PR commits the benchmarks I ran to qualify concurrent IO before we released it. Changes: - Add `l0stack` fixture; a reusable abstraction for creating a stack of L0 deltas each of which has 1 Value::Delta per page. - Such a stack of L0 deltas is a good and understandable demo for concurrent IO because to reconstruct any page, $layer_stack_height` Values need to be read. Before concurrent IO, the reads were sequential. With concurrent IO, they are executed concurrently. - So, switch `test_latency` to use the l0stack. - Teach `pagebench`, which is used by `test_latency`, to limit itself to the blocks of the relation created by the l0stack abstraction. - Additional parametrization of `test_latency` over dimensions `ps_io_concurrency,l0_stack_height,queue_depth` - Use better names for the tests to reflect what they do, leave interpretation of the (now quite high-dimensional) results to the reader - `test_{throughput => postgres_seqscan}` - `test_{latency => random_reads}` - Cut down on permutations to those we use in production. Runtime is about 2min. Refs - concurrent IO epic https://github.com/neondatabase/neon/issues/9378 - batching task: fixes https://github.com/neondatabase/neon/issues/9837 --------- Co-authored-by: Peter Bendel <peterbendel@neon.tech>	2025-05-15 17:48:13 +00:00
Elizabeth Murray	be8ed81532	Connection pool: update error accounting, sweep idle connections, add config options.	2025-05-14 07:31:52 -07:00
Erik Grinaker	d785100c02	page_api: add `GetPageRequest::class`	2025-05-02 10:48:32 +02:00
Erik Grinaker	2c0d930e3d	page_api: add `GetPageResponse::status`	2025-04-30 16:48:45 +02:00
Erik Grinaker	66171a117b	page_api: add `GetPageRequestBatch`	2025-04-30 15:31:11 +02:00
Erik Grinaker	df2806e7a0	page_api: add `GetPageRequest::id`	2025-04-30 15:00:16 +02:00
Erik Grinaker	4c77397943	Add `neon-shard-id` header	2025-04-30 11:18:06 +02:00
Erik Grinaker	0f520d79ab	pageserver: rename `data_api` to `page_api`	2025-04-29 15:58:52 +02:00
Heikki Linnakangas	93eb7bb6b8	include lots of changes that went missing by accident	2025-04-29 15:32:27 +03:00
Heikki Linnakangas	e58d0fece1	New communicator, with "integrated" cache accessible from all processes	2025-04-29 11:52:44 +03:00
Heikki Linnakangas	4d0c1e8b78	refactor: Extract some code in pagebench getpage command to function (#11563 ) This makes it easier to add a different client implementation alongside the current one. I started working on a new gRPC-based protocol to replace the libpq protocol, which will introduce a new function like `client_libpq`, but for the new protocol. It's a little more readable with less indentation anyway.	2025-04-19 08:38:03 +00:00
Dmitrii Kovalkov	0f367cb665	storcon: reuse reqwest http client (#11327 ) ## Problem - Part of https://github.com/neondatabase/neon/issues/11113 - Building a new `reqwest::Client` for every request is expensive because it parses CA certs under the hood. It's noticeable in storcon's flamegraph. ## Summary of changes - Reuse one `reqwest::Client` for all API calls to avoid parsing CA certificates every time.	2025-03-21 11:48:22 +00:00
Dmitrii Kovalkov	63b22d3fb1	pageserver: https for management API (#11025 ) ## Problem Storage controller uses unencrypted HTTP requests for pageserver management API. Closes: https://github.com/neondatabase/cloud/issues/24283 ## Summary of changes - Implement `http_utils::server::Server` with TLS support. - Replace `hyper0::server::Server` with `http_utils::server::Server` in pageserver. - Add HTTPS handler for pageserver management API. - Generate local SSL certificates in neon local.	2025-03-10 15:07:59 +00:00
Arpad Müller	a22be5af72	Migrate the last crates to edition 2024 (#10998 ) Migrates the remaining crates to edition 2024. We like to stay on the latest edition if possible. There is no functional changes, however some code changes had to be done to accommodate the edition's breaking changes. Like the previous migration PRs, this is comprised of three commits: * the first does the edition update and makes `cargo check`/`cargo clippy` pass. we had to update bindgen to make its output [satisfy the requirements of edition 2024](https://doc.rust-lang.org/edition-guide/rust-2024/unsafe-extern.html) * the second commit does a `cargo fmt` for the new style edition. * the third commit reorders imports as a one-off change. As before, it is entirely optional. Part of #10918	2025-02-27 09:40:40 +00:00

1 2

80 Commits