rust/neon - neon - Gitea: Git with a cup of tea

rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2025-12-25 23:29:59 +00:00

Author	SHA1	Message	Date
Conrad Ludgate	ce6bbca8d7	make use of esc string opt	2025-07-18 09:59:38 +01:00
Folke Behrens	64d0008389	proxy: Shorten the initial TTL of cancel keys (#12647 ) ## Problem A high rate of short-lived connections means that there a lot of cancel keys in Redis with TTL=10min that could be avoided by having a much shorter initial TTL. ## Summary of changes * Introduce an initial TTL of 1min used with the SET command. * Fix: don't delay repushing cancel data when expired. * Prepare for exponentially increasing TTLs. ## Alternatives A best-effort UNLINK command on connection termination would clean up cancel keys right away. This needs a bigger refactor due to how batching is handled.	2025-07-17 21:52:20 +00:00
Krzysztof Szafrański	e2982ed3ec	[proxy] Cache node info only for TTL, even if Redis is available (#12626 ) This PR simplifies our node info cache. Now we'll store entries for at most the TTL duration, even if Redis notifications are available. This will allow us to cache intermittent errors later (e.g. due to rate limits) with more predictable behavior. Related to https://github.com/neondatabase/cloud/issues/19353	2025-07-16 16:23:05 +00:00
Conrad Ludgate	c71aea0223	proxy: for json logging, only use callsite IDs if span name is duplicated (#12625 ) ## Problem We run multiple proxies, we get logs like ``` ... spans={"http_conn#22":{"conn_id": ... ... spans={"http_conn#24":{"conn_id": ... ``` these are the same span, and the difference is confusing. ## Summary of changes Introduce a counter per span name, rather than a global counter. If the counter is 0, no change to the span name is made. To follow up: see which span names are duplicated within the codebase in different callsites	2025-07-16 13:29:18 +00:00
Conrad Ludgate	87915df2fa	proxy: replace serde_json with our new json ser crate in the logging impl (#12602 ) This doesn't solve any particular problem, but it does simplify some of the code that was forced to round-trip through verbose Serialize impls.	2025-07-16 13:27:00 +00:00
Heikki Linnakangas	5c9c3b3317	Misc cosmetic cleanups (#12598 ) - Remove a few obsolete "allowed error messages" from tests. The pageserver doesn't emit those messages anymore. - Remove misplaced and outdated docstring comment from `test_tenants.py`. A docstring is supposed to be the first thing in a function, but we had added some code before it. And it was outdated, as we haven't supported running without safekeepers for a long time. - Fix misc typos in comments - Remove obsolete comment about backwards compatibility with safekeepers without `TIMELINE_STATUS` API. All safekeepers have it by now.	2025-07-15 14:36:28 +00:00
Krzysztof Szafrański	ff526a1051	[proxy] Recognize more cplane errors, use retry_delay_ms as TTL (#12543 ) ## Problem Not all cplane errors are properly recognized and cached/retried. ## Summary of changes Add more cplane error reasons. Also, use retry_delay_ms as cache TTL if present. Related to https://github.com/neondatabase/cloud/issues/19353	2025-07-15 07:42:48 +00:00
Folke Behrens	296c9190b2	proxy: Use EXPIRE command to refresh cancel entries (#12580 ) ## Problem When refreshing cancellation data we resend the entire value again just to reset the TTL, which causes unnecessary load in proxy, on network and possibly on redis side. ## Summary of changes * Switch from using SET with full value to using EXPIRE to reset TTL. * Add a tiny delay between retries to prevent busy loop. * Shorten CancelKeyOp variants: drop redundant suffix. * Retry SET when EXPIRE failed.	2025-07-13 22:49:23 +00:00
Folke Behrens	a5fe67f361	proxy: cancel maintain_cancel_key task immediately (#12586 ) ## Problem When a connection terminates its maintain_cancel_key task keeps running until the CANCEL_KEY_REFRESH sleep finishes and then it triggers another cancel key TTL refresh before exiting. ## Summary of changes * Check for cancellation while sleeping and interrupt sleep. * If cancelled, break the loop, don't send a refresh cmd.	2025-07-13 17:27:39 +00:00
Conrad Ludgate	9bba31bf68	proxy: encode json as we parse rows (#11992 ) Serialize query row responses directly into JSON. Some of this code should be using the `json::value_as_object/list` macros, but I've avoided it for now to minimize the size of the diff.	2025-07-11 19:39:08 +00:00
Folke Behrens	380d167b7c	proxy: For cancellation data replace HSET+EXPIRE/HGET with SET..EX/GET (#12553 ) ## Problem To store cancellation data we send two commands to redis because the redis server version doesn't support HSET with EX. Also, HSET is not really needed. ## Summary of changes * Replace the HSET + EXPIRE command pair with one SET .. EX command. * Replace HGET with GET. * Leave a workaround for old keys set with HSET. * Replace some anyhow errors with specific errors to surface the WRONGTYPE error from redis.	2025-07-11 19:35:42 +00:00
Conrad Ludgate	f4245403b3	[proxy] allow testing query cancellation locally (#12568 ) ## Problem Canceelation requires redis, redis required control-plane. ## Summary of changes Make redis for cancellation not require control plane. Add instructions for setting up redis locally.	2025-07-11 15:13:36 +00:00
Vlad Lazar	fe0ddb7169	libs: make remote storage failure injection probabilistic (#12526 ) Change the unreliable storage wrapper to fail by probability when there are more failure attempts left. Co-authored-by: Yecheng Yang <carlton.yang@databricks.com>	2025-07-09 17:41:34 +00:00
Folke Behrens	5ea0bb2d4f	proxy: Drop unused metrics (#12521 ) * proxy_control_plane_token_acquire_seconds * proxy_allowed_ips_cache_misses * proxy_vpc_endpoint_id_cache_stats * proxy_access_blocker_flags_cache_stats * proxy_requests_auth_rate_limits_total * proxy_endpoints_auth_rate_limits * proxy_invalid_endpoints_total	2025-07-09 09:58:46 +00:00
Folke Behrens	e65d5f7369	proxy: Remove the endpoint filter cache (#12488 ) ## Problem The endpoint filter cache is still unused because it's not yet reliable enough to be used. It only consumes a lot of memory. ## Summary of changes Remove the code. Needs a new design. neondatabase/cloud#30634	2025-07-07 17:46:33 +00:00
Conrad Ludgate	03e604e432	Nightly lints and small tweaks (#12456 ) Let chains available in 1.88 :D new clippy lints coming up in future releases.	2025-07-03 14:47:12 +00:00
Ruslan Talpa	95e1011cd6	subzero pre-integration refactor (#12416 ) ## Problem integrating subzero requires a bit of refactoring. To make the integration PR a bit more manageable, the refactoring is done in this separate PR. ## Summary of changes * move common types/functions used in sql_over_http to errors.rs and http_util.rs * add the "Local" auth backend to proxy (similar to local_proxy), useful in local testing * change the Connect and Send type for the http client to allow for custom body when making post requests to local_proxy from the proxy --------- Co-authored-by: Ruslan Talpa <ruslan.talpa@databricks.com>	2025-07-03 11:04:08 +00:00
Conrad Ludgate	1bc1eae5e8	fix redis credentials check (#12455 ) ## Problem `keep_connection` does not exit, so it was never setting `credentials_refreshed`. ## Summary of changes Set `credentials_refreshed` to true when we first establish a connection, and after we re-authenticate the connection.	2025-07-03 09:51:35 +00:00
Folke Behrens	3415b90e88	proxy/logging: Add "ep" and "query_id" to list of extracted fields (#12437 ) Extract two more interesting fields from spans: ep (endpoint) and query_id. Useful for reliable filtering in logging.	2025-07-03 08:09:10 +00:00
Conrad Ludgate	e01c8f238c	[proxy] update noisy error logging (#12438 ) Health checks for pg-sni-router open a TCP connection and immediately close it again. This is noisy. We will filter out any EOF errors on the first message. "acquired permit" debug log is incorrect since it logs when we timedout as well. This fixes the debug log.	2025-07-03 07:46:48 +00:00
Conrad Ludgate	45607cbe0c	[local_proxy]: ignore TLS for endpoint (#12316 ) ## Problem When local proxy is configured with TLS, the certificate does not match the endpoint string. This currently returns an error. ## Summary of changes I don't think this code is necessary anymore, taking the prefix from the hostname is good enough (and is equivalent to what `endpoint_sni` was doing) and we ignore checking the domain suffix.	2025-07-03 07:35:57 +00:00
Conrad Ludgate	d6beb3ffbb	[proxy] rewrite pg-text to json routines (#12413 ) We would like to move towards an arena system for JSON encoding the responses. This change pushes an "out" parameter into the pg-test to json routines to make swapping in an arena system easier in the future. (see #11992) This additionally removes the redundant `column: &[Type]` argument, as well as rewriting the pg_array parser. --- I rewrote the pg_array parser since while making these changes I found it hard to reason about. I went back to the specification and rewrote it from scratch. There's 4 separate routines: 1. pg_array_parse - checks for any prelude (multidimensional array ranges) 2. pg_array_parse_inner - only deals with the arrays themselves 3. pg_array_parse_item - parses a single item from the array, this might be quoted, unquoted, or another nested array. 4. pg_array_parse_quoted - parses a quoted string, following the relevant string escaping rules.	2025-07-02 12:46:11 +00:00
Ivan Efremov	0f879a2e8f	[proxy]: Fix redis IRSA expiration failure errors (#12430 ) Relates to the [#30688](https://github.com/neondatabase/cloud/issues/30688)	2025-07-02 08:55:44 +00:00
Conrad Ludgate	4932963bac	[proxy]: dont log user errors from postgres (#12412 ) ## Problem #8843 User initiated sql queries are being classified as "postgres" errors, whereas they're really user errors. ## Summary of changes Classify user-initiated postgres errors as user errors if they are related to a sql query that we ran on their behalf. Do not log those errors.	2025-07-01 13:03:34 +00:00
Folke Behrens	0ee15002fc	proxy: Move client connection accept and handshake to pglb (#12380 ) * This must be a no-op. * Move proxy::task_main to pglb::task_main. * Move client accept, TLS and handshake to pglb. * Keep auth and wake in proxy.	2025-06-27 15:20:23 +00:00
Conrad Ludgate	abc1efd5a6	[proxy] fix connect_to_compute retry handling (#12351 ) # Problem In #12335 I moved the `authenticate` method outside of the `connect_to_compute` loop. This triggered [e2e tests to become flaky](https://github.com/neondatabase/cloud/pull/30533). This highlighted an edge case we forgot to consider with that change. When we connect to compute, the compute IP might be cached. This cache hit might however be stale. Because we can't validate the IP is associated with a specific compute-id☨, we will succeed the connect_to_compute operation and fail when it comes to password authentication☨☨. Before the change, we were invalidating the cache and triggering wake_compute if the authentication failed. Additionally, I noticed some faulty logic I introduced 1 year ago https://github.com/neondatabase/neon/pull/8141/files#diff-5491e3afe62d8c5c77178149c665603b29d88d3ec2e47fc1b3bb119a0a970afaL145-R147 ☨ We can when we roll out TLS, as the certificate common name includes the compute-id. ☨☨ Technically password authentication could pass for the wrong compute, but I think this would only happen in the very very rare event that the IP got reused and the compute's endpoint happened to be a branch/replica. # Solution 1. Fix the broken logic 2. Simplify cache invalidation (I don't know why it was so convoluted) 3. Add a loop around connect_to_compute + authenticate to re-introduce the wake_compute invalidation we accidentally removed. I went with this approach to try and avoid interfering with https://github.com/neondatabase/neon/compare/main...cloneable/proxy-pglb-connect-compute-split. The changes made in commit 3 will move into `handle_client_request` I suspect,	2025-06-27 10:36:27 +00:00
Conrad Ludgate	fd1e8ec257	[proxy] review and cleanup CLI args (#12167 ) I was looking at how we could expose our proxy config as toml again, and as I was writing out the schema format, I noticed some cruft in our CLI args that no longer seem to be in use. The redis change is the most complex, but I am pretty sure it's sound. Since https://github.com/neondatabase/cloud/pull/15613 cplane longer publishes to the global redis instance.	2025-06-26 11:25:41 +00:00
Conrad Ludgate	517a3d0d86	[proxy]: BatchQueue::call is not cancel safe - make it directly cancellation aware (#12345 ) ## Problem https://github.com/neondatabase/cloud/issues/30539 If the current leader cancels the `call` function, then it has removed the jobs from the queue, but will never finish sending the responses. Because of this, it is not cancellation safe. ## Summary of changes Document these functions as not cancellation safe. Move cancellation of the queued jobs into the queue itself. ## Alternatives considered 1. We could spawn the task that runs the batch, since that won't get cancelled. * This requires `fn call(self: Arc<Self>)` or `fn call(&'static self)`. 2. We could add another scopeguard and return the requests back to the queue. * This requires that requests are always retry safe, and also requires requests to be `Clone`.	2025-06-25 14:19:20 +00:00
Conrad Ludgate	27ca1e21be	[console_redirect_proxy]: fix channel binding (#12238 ) ## Problem While working more on TLS to compute, I realised that Console Redirect -> pg-sni-router -> compute would break if channel binding was set to prefer. This is because the channel binding data would differ between Console Redirect -> pg-sni-router vs pg-sni-router -> compute. I also noticed that I actually disabled channel binding in #12145, since `connect_raw` would think that the connection didn't support TLS. ## Summary of changes Make sure we specify the channel binding. Make sure that `connect_raw` can see if we have TLS support.	2025-06-25 13:41:30 +00:00
Dmitry Savelev	158d84ea30	Switch the billing metrics storage format to ndjson. (#12338 ) ## Problem The billing team wants to change the billing events pipeline and use a common events format in S3 buckets across different event producers. ## Summary of changes Change the events storage format for billing events from JSON to NDJSON. Resolves: https://github.com/neondatabase/cloud/issues/29994	2025-06-24 15:36:36 +00:00
Conrad Ludgate	4dd9ca7b04	[proxy]: authenticate to compute after connect_to_compute (#12335 ) ## Problem PGLB will do the connect_to_compute logic, neonkeeper will do the session establishment logic. We should split it. ## Summary of changes Moves postgres authentication to compute to a separate routine that happens after connect_to_compute.	2025-06-24 14:15:36 +00:00
Arpad Müller	552249607d	apply clippy fixes for 1.88.0 beta (#12331 ) The 1.88.0 stable release is near (this Thursday). We'd like to fix most warnings beforehand so that the compiler upgrade doesn't require approval from too many teams. This is therefore a preparation PR (like similar PRs before it). There is a lot of changes for this release, mostly because the `uninlined_format_args` lint has been added to the `style` lint group. One can read more about the lint [here](https://rust-lang.github.io/rust-clippy/master/#/uninlined_format_args). The PR is the result of `cargo +beta clippy --fix` and `cargo fmt`. One remaining warning is left for the proxy team. --------- Co-authored-by: Conrad Ludgate <conrad@neon.tech>	2025-06-24 10:12:42 +00:00
Conrad Ludgate	a298d2c29b	[proxy] replace the batch cancellation queue, shorten the TTL for cancel keys (#11943 ) See #11942 Idea: * if connections are short lived, they can get enqueued and then also remove themselves later if they never made it to redis. This reduces the load on the queue. * short lived connections (<10m, most?) will only issue 1 command, we remove the delete command and rely on ttl. * we can enqueue as many commands as we want, as we can always cancel the enqueue, thanks to the ~~intrusive linked lists~~ `BTreeMap`.	2025-06-20 11:48:01 +00:00
Ivan Efremov	43acabd4c2	[proxy]: Improve backoff strategy for redis reconnection (#12218 ) Sometimes during a failed redis connection attempt at the init stage proxy pod can continuously restart. This, in turn, can aggravate the problem if redis is overloaded. Solves the #11114	2025-06-12 19:46:02 +00:00
Folke Behrens	1dce65308d	Update base64 to 0.22 (#12215 ) ## Problem Base64 0.13 is outdated. ## Summary of changes Update base64 to 0.22. Affects mostly proxy and proxy libs. Also upgrade serde_with to remove another dep on base64 0.13 from dep tree.	2025-06-12 16:12:47 +00:00
Conrad Ludgate	67b94c5992	[proxy] per endpoint configuration for rate limits (#12148 ) https://github.com/neondatabase/cloud/issues/28333 Adds a new `rate_limit` response type to EndpointAccessControl, uses it for rate limiting, and adds a generic invalidation for the cache.	2025-06-10 14:26:08 +00:00
Folke Behrens	e38193c530	proxy: Move connect_to_compute back to proxy (#12181 ) It's mostly responsible for waking, retrying, and caching. A new, thin wrapper around compute_once will be PGLB's entry point	2025-06-10 11:23:03 +00:00
Conrad Ludgate	58327ef74d	[proxy] fix sql-over-http password setting (#12177 ) ## Problem Looks like our sql-over-http tests get to rely on "trust" authentication, so the path that made sure the authkeys data was set was never being hit. ## Summary of changes Slight refactor to WakeComputeBackends, as well as making sure auth keys are propagated. Fix tests to ensure passwords are tested.	2025-06-10 08:46:29 +00:00
Conrad Ludgate	6dd84041a1	refactor and simplify the invalidation notification structure (#12154 ) The current cache invalidation messages are far too specific. They should be more generic since it only ends up triggering a `GetEndpointAccessControl` message anyway. Mappings: * `/allowed_ips_updated`, `/block_public_or_vpc_access_updated`, and `/allowed_vpc_endpoints_updated_for_projects` -> `/project_settings_update`. * `/allowed_vpc_endpoints_updated_for_org` -> `/account_settings_update`. * `/password_updated` -> `/role_setting_update`. I've also introduced `/endpoint_settings_update`. All message types support singular or multiple entries, which allows us to simplify things both on our side and on cplane side. I'm opening a PR to cplane to apply the above mappings, but for now using the old phrases to allow both to roll out independently. This change is inspired by my need to add yet another cached entry to `GetEndpointAccessControl` for https://github.com/neondatabase/cloud/issues/28333	2025-06-06 12:49:29 +00:00
Conrad Ludgate	4d99b6ff4d	[proxy] separate compute connect from compute authentication (#12145 ) ## Problem PGLB/Neonkeeper needs to separate the concerns of connecting to compute, and authenticating to compute. Additionally, the code within `connect_to_compute` is rather messy, spending effort on recovering the authentication info after wake_compute. ## Summary of changes Split `ConnCfg` into `ConnectInfo` and `AuthInfo`. `wake_compute` only returns `ConnectInfo` and `AuthInfo` is determined separately from the `handshake`/`authenticate` process. Additionally, `ConnectInfo::connect_raw` is in-charge or establishing the TLS connection, and the `postgres_client::Config::connect_raw` is configured to use `NoTls` which will force it to skip the TLS negotiation. This should just work.	2025-06-06 10:29:55 +00:00
Folke Behrens	1577665c20	proxy: Move PGLB-related modules into pglb root module. (#12144 ) Split the modules responsible for passing data and connecting to compute from auth and waking for PGLB. This PR just moves files. The waking is going to get removed from pglb after this.	2025-06-05 11:00:23 +00:00
Conrad Ludgate	c8a96cf722	update proxy protocol parsing to not a rw wrapper (#12035 ) ## Problem I believe in all environments we now specify either required/rejected for proxy-protocol V2 as required. We no longer rely on the supported flow. This means we no longer need to keep around read bytes incase they're not in a header. While I designed ChainRW to be fast (the hot path with an empty buffer is very easy to branch predict), it's still unnecessary. ## Summary of changes * Remove the ChainRW wrapper * Refactor how we read the proxy-protocol header using read_exact. Slightly worse perf but it's hardly significant. * Don't try and parse the header if it's rejected.	2025-06-05 07:12:00 +00:00
Conrad Ludgate	781bf4945d	proxy: optimise future layout allocations (#12104 ) A smaller version of #12066 that is somewhat easier to review. Now that I've been using https://crates.io/crates/top-type-sizes I've found a lot more of the low hanging fruit that can be tweaks to reduce the memory usage. Some context for the optimisations: Rust's stack allocation in futures is quite naive. Stack variables, even if moved, often still end up taking space in the future. Rearranging the order in which variables are defined, and properly scoping them can go a long way. `async fn` and `async move {}` have a consequence that they always duplicate the "upvars" (aka captures). All captures are permanently allocated in the future, even if moved. We can be mindful when writing futures to only capture as little as possible. TlsStream is massive. Needs boxing so it doesn't contribute to the above issue. ## Measurements from `top-type-sizes`: ### Before ``` 10328 {async block@proxy::proxy::task_main::{closure#0}::{closure#0}} align=8 6120 {async fn body of proxy::proxy::handle_client<proxy::protocol2::ChainRW<tokio::net::TcpStream>>()} align=8 ``` ### After ``` 4040 {async block@proxy::proxy::task_main::{closure#0}::{closure#0}} 4704 {async fn body of proxy::proxy::handle_client<proxy::protocol2::ChainRW<tokio::net::TcpStream>>()} align=8 ```	2025-06-02 16:13:30 +00:00
Conrad Ludgate	589bfdfd02	proxy: Changes to rate limits and GetEndpointAccessControl caches. (#12048 ) Precursor to https://github.com/neondatabase/cloud/issues/28333. We want per-endpoint configuration for rate limits, which will be distributed via the `GetEndpointAccessControl` API. This lays some of the ground work. 1. Allow the endpoint rate limiter to accept a custom leaky bucket config on check. 2. Remove the unused auth rate limiter, as I don't want to think about how it fits into this. 3. Refactor the caching of `GetEndpointAccessControl`, as it adds friction for adding new cached data to the API. That third one was rather large. I couldn't find any way to split it up. The core idea is that there's now only 2 cache APIs. `get_endpoint_access_controls` and `get_role_access_controls`. I'm pretty sure the behaviour is unchanged, except I did a drive by change to fix #8989 because it felt harmless. The change in question is that when a password validation fails, we eagerly expire the role cache if the role was cached for 5 minutes. This is to allow for edge cases where a user tries to connect with a reset password, but the cache never expires the entry due to some redis related quirk (lag, or misconfiguration, or cplane error)	2025-06-02 08:38:35 +00:00
Conrad Ludgate	87179e26b3	completely rewrite pq_proto (#12085 ) libs/pqproto is designed for safekeeper/pageserver with maximum throughput. proxy only needs it for handshakes/authentication where throughput is not a concern but memory efficiency is. For this reason, we switch to using read_exact and only allocating as much memory as we need to. All reads return a `&'a [u8]` instead of a `Bytes` because accidental sharing of bytes can cause fragmentation. Returning the reference enforces all callers only hold onto the bytes they absolutely need. For example, before this change, `pqproto` was allocating 8KiB for the initial read `BytesMut`, and proxy was holding the `Bytes` in the `StartupMessageParams` for the entire connection through to passthrough.	2025-06-01 18:41:45 +00:00
Conrad Ludgate	6768a71c86	proxy(tokio-postgres): refactor typeinfo query to occur earlier (#11993 ) ## Problem For #11992 I realised we need to get the type info before executing the query. This is important to know how to decode rows with custom types, eg the following query: ```sql CREATE TYPE foo AS ENUM ('foo','bar','baz'); SELECT ARRAY['foo'::foo, 'bar'::foo, 'baz'::foo] AS data; ``` Getting that to work was harder that it seems. The original tokio-postgres setup has a split between `Client` and `Connection`, where messages are passed between. Because multiple clients were supported, each client message included a dedicated response channel. Each request would be terminated by the `ReadyForQuery` message. The flow I opted to use for parsing types early would not trigger a `ReadyForQuery`. The flow is as follows: ``` PARSE "" // parse the user provided query DESCRIBE "" // describe the query, returning param/result type oids FLUSH // force postgres to flush the responses early // wait for descriptions // check if we know the types, if we don't then // setup the typeinfo query and execute it against each OID: PARSE typeinfo // prepare our typeinfo query DESCRIBE typeinfo FLUSH // force postgres to flush the responses early // wait for typeinfo statement // for each OID we don't know: BIND typeinfo EXECUTE FLUSH // wait for type info, might reveal more OIDs to inspect // close the typeinfo query, we cache the OID->type map and this is kinder to pgbouncer. CLOSE typeinfo // finally once we know all the OIDs: BIND "" // bind the user provided query - already parsed - to the user provided params EXECUTE // run the user provided query SYNC // commit the transaction ``` ## Summary of changes Please review commit by commit. The main challenge was allowing one query to issue multiple sub-queries. To do this I first made sure that the client could fully own the connection, which required removing any shared client state. I then had to replace the way responses are sent to the client, by using only a single permanent channel. This required some additional effort to track which query is being processed. Lastly I had to modify the query/typeinfo functions to not issue `sync` commands, so it would fit into the desired flow above. To note: the flow above does force an extra roundtrip into each query. I don't know yet if this has a measurable latency overhead.	2025-05-23 19:41:12 +00:00
Peter Bendel	f9fdbc9618	remove auth_endpoint password from log and command line for local proxy mode (#11991 ) ## Problem When testing local proxy the auth-endpoint password shows up in command line and log ```bash RUST_LOG=proxy LOGFMT=text cargo run --release --package proxy --bin proxy --features testing -- \ --auth-backend postgres \ --auth-endpoint 'postgresql://postgres:secret_password@127.0.0.1:5432/postgres' \ --tls-cert server.crt \ --tls-key server.key \ --wss 0.0.0.0:4444 ``` ## Summary of changes - Allow to set env variable PGPASSWORD - fall back to use PGPASSWORD env variable when auth-endpoint does not contain password - remove auth-endpoint password from logs in `--features testing` mode Example ```bash export PGPASSWORD=secret_password RUST_LOG=proxy LOGFMT=text cargo run --package proxy --bin proxy --features testing -- \ --auth-backend postgres \ --auth-endpoint 'postgresql://postgres@127.0.0.1:5432/postgres' \ --tls-cert server.crt \ --tls-key server.key \ --wss 0.0.0.0:4444 ```	2025-05-21 20:26:05 +00:00
Konstantin Merenkov	5db20af8a7	Keep the conn info cache on max_client_conn from pgbouncer (#11986 ) ## Problem Hitting max_client_conn from pgbouncer would lead to invalidation of the conn info cache. Customers would hit the limit on wake_compute. ## Summary of changes `should_retry_wake_compute` detects this specific error from pgbouncer as non-retriable, meaning we won't try to wake up the compute again.	2025-05-21 15:27:30 +00:00
Conrad Ludgate	f3c9d0adf4	proxy(logging): significant changes to json logging internals for performance. (#11974 ) #11962 Please review each commit separately. Each commit is rather small in goal. The overall goal of this PR is to keep the behaviour identical, but shave away small inefficiencies here and there.	2025-05-20 17:57:59 +00:00
Konstantin Merenkov	568779fa8a	proxy/scram: avoid memory copy to improve performance (#11980 ) Touches #11941 ## Problem Performance of our PBKDF2 was worse than reference. ## Summary of changes Avoided memory copy when HMACing in a tight loop.	2025-05-20 15:23:54 +00:00

1 2 3 4 5 ...

670 Commits