
rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-05-16 04:30:38 +00:00

Author	SHA1	Message	Date
Folke Behrens	92d5e0e87a	proxy: clear lib.rs of code items (#9479 ) We keep lib.rs for crate configs, lint configs and re-exports for the binaries.	2024-10-23 08:21:28 +02:00
Conrad Ludgate	5cbdec9c79	[local_proxy]: install pg_session_jwt extension on demand (#9370 ) Follow up on #9344. We want to install the extension automatically. We didn't want to couple the extension into compute_ctl so instead local_proxy is the one to issue requests specific to the extension. depends on #9344 and #9395	2024-10-18 14:41:21 +01:00
Conrad Ludgate	d762ad0883	update rustls (#9396 ) The forever ongoing effort of juggling multiple versions of rustls :3 now with new crypto library aws-lc. Because of dependencies, it is currently impossible to not have both ring and aws-lc in the dep tree, therefore our only options are not updating rustls or having both crypto backends enabled... According to benchmarks run by the rustls maintainer, aws-lc is faster than ring in some cases too <https://jbp.io/graviola/>, so it's not without its upsides.	2024-10-17 20:45:37 +01:00
Folke Behrens	f14e45f0ce	proxy: format imports with nightly rustfmt (#9414 ) ```shell cargo +nightly fmt -p proxy -- -l --config imports_granularity=Module,group_imports=StdExternalCrate,reorder_imports=true ``` These rust-analyzer settings for VSCode should help retain this style: ```json "rust-analyzer.imports.group.enable": true, "rust-analyzer.imports.prefix": "crate", "rust-analyzer.imports.merge.glob": false, "rust-analyzer.imports.granularity.group": "module", "rust-analyzer.imports.granularity.enforce": true, ```	2024-10-16 15:01:56 +02:00
Conrad Ludgate	cb9ab7463c	proxy: split out the console-redirect backend flow (#9270 ) removes the ConsoleRedirect backend from the main auth::Backends enum, copy-paste the existing crate::proxy::task_main structure to use the ConsoleRedirectBackend exclusively. This makes the logic a bit simpler at the cost of some fairly trivial code duplication.	2024-10-14 12:25:55 +02:00
Conrad Ludgate	ab5bbb445b	proxy: refactor auth backends (#9271 ) preliminary for #9270 The auth::Backend didn't need to be in the mega ProxyConfig object, so I split it off and passed it manually in the few places it was necessary. I've also refined some of the uses of config I saw while doing this small refactor. I've also followed the trend and made the console redirect backend its own struct, same as LocalBackend and ControlPlaneBackend.	2024-10-11 20:14:52 +01:00
Conrad Ludgate	8cd7b5bf54	proxy: rename console -> control_plane, rename web -> console_redirect (#9266 ) rename console -> control_plane rename web -> console_redirect I think these names are a little more representative.	2024-10-07 14:09:54 +01:00
Conrad Ludgate	6c05f89f7d	proxy: add local-proxy to compute image (#8823 ) 1. Adds local-proxy to compute image and vm spec 2. Updates local-proxy config processing, writing PID to a file eagerly 3. Updates compute-ctl to understand local proxy compute spec and to send SIGHUP to local-proxy over that pid. closes https://github.com/neondatabase/cloud/issues/16867	2024-10-04 14:52:01 +00:00
Folke Behrens	1e90e792d6	proxy: Add timeout to webauth confirmation wait (#9227 ) ```shell $ cargo run -p proxy --bin proxy -- --auth-backend=web --webauth-confirmation-timeout=5s ``` ``` $ psql -h localhost -p 4432 NOTICE: Welcome to Neon! Authenticate by visiting within 5s: http://localhost:3000/psql_session/e946900c8a9bc6e9 psql: error: connection to server at "localhost" (::1), port 4432 failed: Connection refused Is the server running on that host and accepting TCP/IP connections? connection to server at "localhost" (127.0.0.1), port 4432 failed: ERROR: Disconnected due to inactivity after 5s. ```	2024-10-02 12:10:56 +02:00
Conrad Ludgate	94a5ca2817	proxy: auth broker (#8855 ) Opens http2 connection to local-proxy and forwards requests over with all headers and body closes https://github.com/neondatabase/cloud/issues/16039	2024-09-30 20:43:45 +01:00
Conrad Ludgate	a2e2362ee9	add proxy-protocol header disable option (#9203 ) resolves https://github.com/neondatabase/cloud/issues/18026	2024-09-30 18:11:50 +00:00
Conrad Ludgate	ec07a1ecc9	proxy: make local-proxy config by signal with PID, refine JWKS apis with role caching (#9164 )	2024-09-26 19:01:48 +01:00
Folke Behrens	794bd4b866	proxy: mock cplane usable without allowed-ips table (#9046 )	2024-09-18 17:14:53 +02:00
Folke Behrens	c5cd8577ff	proxy: make sql-over-http max request/response sizes configurable (#9029 )	2024-09-18 13:58:51 +02:00
Stefan Radig	fcab61bdcd	Prototype implementation for private access poc (#8976 ) ## Problem For the Private Access POC we want users to be able to disable access from the public proxy. To limit the number of changes this can be done by configuring an IP allowlist [ "255.255.255.255" ]. For the Private Access proxy a new commandline flag allows disabling the IP allowlist completely. See https://www.notion.so/neondatabase/Neon-Private-Access-POC-Proposal-8f707754e1ab4190ad5709da7832f020?d=887495c15e884aa4973f973a8a0a582a#7ac6ec249b524a74adbeddc4b84b8f5f for details about the POC. ## Summary of changes - Adding the commandline flag is_private_access_proxy=true will disable the IP allowlist	2024-09-12 15:55:12 +01:00
Folke Behrens	52cb33770b	proxy: Rename backend types and variants as prep for refactor (#8845 ) * AuthBackend enum to AuthBackendType * BackendType enum to Backend * Link variants to Web * Adjust messages, comments, etc.	2024-08-27 14:12:42 +02:00
Conrad Ludgate	06795c6b9a	proxy: new local-proxy application (#8736 ) Add binary for local-proxy that uses the local auth backend. Runs only the http serverless driver support and offers config reload based on a config file and SIGHUP	2024-08-23 22:32:10 +01:00
Conrad Ludgate	0170611a97	proxy: small changes (#8752 ) ## Problem #8736 is getting too big. splitting off some simple changes here ## Summary of changes Local proxy won't always be using tls, so make it optional. Local proxy won't be using ws for now, so make it optional. Remove a dead config var.	2024-08-20 14:16:27 +01:00
Conrad Ludgate	ad0988f278	proxy: random changes (#8602 ) ## Problem 1. Hard to correlate startup parameters with the endpoint that provided them. 2. Some configurations are not needed in the `ProxyConfig` struct. ## Summary of changes Because of some borrow checker fun, I needed to switch to an interior-mutability implementation of our `RequestMonitoring` context system. Using https://docs.rs/try-lock/latest/try_lock/ as a cheap lock for such a use-case (needed to be thread safe). Removed the lock of each startup message, instead just logging only the startup params in a successful handshake. Also removed some values from `ProxyConfig` and kept them as arguments. (needed for local-proxy config)	2024-08-07 14:37:03 +01:00
Conrad Ludgate	6ca41d3438	proxy: switch to leaky bucket (#8470 ) ## Problem The current bucket based rate limiter is not very intuitive and has some bad failure cases. ## Summary of changes Switches from fixed interval buckets to leaky bucket impl. A single bucket per endpoint, drains over time. Drains by checking the time since the last check, and draining tokens en-masse. Garbage collection works similar to before, it drains a shard (1/64th of the set) every 2048 checks, and it only removes buckets that are empty. To be compatible with the existing config, I've faffed to make it take the min and the max rps of each as the sustained rps and the max bucket size which should be roughly equivalent.	2024-07-24 12:28:37 +01:00
Anton Chaporgin	7996bce6d6	[proxy/redis] impr: use redis_auth_type to switch between auth types (#8428 ) ## Problem On Azure we need to use username-password authentication in proxy for regional redis client. ## Summary of changes This adds `redis_auth_type` to the config with default value of "irsa". Not specifying it will enforce the `regional_redis_client` to be configured with IRSA redis (as it's done now). If "plain" is specified, then the regional client is configured with `redis_notifications`, consuming username:password auth from URI. We plan to do that for the Azure cloud. Configuring `regional_redis_client` is required now, there is no opt-out from configuring it. https://github.com/neondatabase/cloud/issues/14462	2024-07-22 11:02:22 +03:00
Conrad Ludgate	fe13fccdc2	proxy: pg17 fixes (#8321 ) ## Problem #7809 - we do not support sslnegotiation=direct #7810 - we do not support negotiating down the protocol extensions. ## Summary of changes 1. Same as postgres, check the first startup packet byte for tls header `0x16`, and check the ALPN. 2. Tell clients using protocol >3.0 to downgrade	2024-07-10 09:10:29 +01:00
Christian Schwarz	7dcdbaa25e	remote_storage config: move handling of empty inline table `{}` to callers (#8193 ) Before this PR, `RemoteStorageConfig::from_toml` would support deserializing an empty `{}` TOML inline table to a `None`, otherwise try `Some()`. We can instead let * in proxy: let clap derive handle the Option * in PS & SK: assume that if the field is specified, it must be a valid RemoteStorageConfig (This PR started with a much simpler goal of factoring out the `deserialize_item` function because I need that in another PR).	2024-07-02 12:53:08 +02:00
Conrad Ludgate	d7e349d33c	proxy: report blame for passthrough disconnect io errors (#8170 ) ## Problem Hard to debug the disconnection reason currently. ## Summary of changes Keep track of error-direction, and therefore error source (client vs compute) during passthrough.	2024-06-26 15:11:26 +00:00
Conrad Ludgate	c8cebecabf	proxy: reintroduce dynamic limiter for compute lock (#7737 ) ## Problem Computes that are healthy can manage many connection attempts at a time. Unhealthy computes cannot. We initially handled this with a fixed concurrency limit, but it seems this inhibits pgbench. ## Summary of changes Support AIMD for connect_to_compute lock to allow varying the concurrency limit based on compute health	2024-05-29 11:17:05 +01:00
Conrad Ludgate	43f9a16e46	proxy: fix websocket buffering (#7878 ) ## Problem Seems the websocket buffering was broken for large query responses only ## Summary of changes Move buffering until after the underlying stream is ready. Tested locally confirms this fixes the bug. Also fixes the pg-sni-router missing metrics bug	2024-05-24 17:56:12 +01:00
Conrad Ludgate	9cfe08e3d9	proxy password threadpool (#7806 ) ## Problem Despite making password hashing async, it can still take time away from the network code. ## Summary of changes Introduce a custom threadpool, inspired by rayon. Features: ### Fairness Each task is tagged with its endpoint ID. The more times we have seen the endpoint, the more likely we are to skip the task if it comes up in the queue. This is using a min-count-sketch estimator for the number of times we have seen the endpoint, resetting it every 1000+ steps. Since tasks are immediately rescheduled if they do not complete, the worker could get stuck in an "always work available loop". To combat this, we check the global queue every 61 steps to ensure all tasks quickly get a worker assigned to them. ### Balanced Using crossbeam_deque, like rayon does, we have workstealing out of the box. I've tested it a fair amount and it seems to balance the workload accordingly	2024-05-22 17:05:43 +00:00
Anna Khanova	be1a88e574	Proxy added per ep rate limiter (#7636 ) ## Problem There is no global per-ep rate limiter in proxy. ## Summary of changes * Return global per-ep rate limiter back. * Rename weak compute rate limiter (the cli flags were not used anywhere, so it's safe to rename).	2024-05-10 12:17:00 +02:00
Conrad Ludgate	0c99e5ec6d	proxy: cull http connections (#7632 ) ## Problem Some HTTP client connections can stay open for quite a long time. ## Summary of changes When there are too many HTTP client connections, pick a random connection and gracefully cancel it.	2024-05-07 18:15:06 +01:00
Conrad Ludgate	9b65946566	proxy: add connect compute concurrency lock (#7607 ) ## Problem Too many connect_compute attempts can overwhelm postgres, getting the connections stuck. ## Summary of changes Limit number of connection attempts that can happen at a given time.	2024-05-03 15:45:24 +00:00
Anna Khanova	1684bbf162	proxy: Create disconnect events (#7535 ) ## Problem It's not possible to get the duration of the session from proxy events. ## Summary of changes * Added a separate events folder in s3, to record disconnect events. * Disconnect events are exactly the same as normal events, but also have `disconnect_timestamp` field not empty. * @oruen suggested to fill it with the same information as the original events to avoid potentially heavy joins.	2024-04-29 15:22:13 +02:00
Anna Khanova	5357f40183	proxy: Workaround switch to the regional redis (#7513 ) ## Problem Start switching from the global redis to the regional one ## Summary of changes * Publish cancellations to the regional redis * Listen for notifications from both global and regional	2024-04-25 15:26:18 +00:00
Anna Khanova	b1d47f3911	proxy: Fix cancellations (#7510 ) ## Problem Cancellations were published to the channel, that was never read. ## Summary of changes Fallback to global redis publishing.	2024-04-25 11:38:51 +00:00
Anna Khanova	5dda371c2b	Fix a bug with retries (#7494 ) ## Problem ## Summary of changes By default, it's 5s retry.	2024-04-24 14:13:18 +01:00
Anna Khanova	6a5650d40c	proxy: Make retries configurable and record it. (#7438 ) ## Problem Currently we cannot configure retries, also, we don't really have visibility of what's going on there. ## Summary of changes * Added cli params * Improved logging * Decrease the number of retries: it feels like most retries don't help. Once there is better error handling, we can increase it back.	2024-04-22 11:37:22 +00:00
Conrad Ludgate	a54ea8fb1c	proxy: move endpoint rate limiter (#7413 ) ## Problem ## Summary of changes Rate limit for wake_compute calls	2024-04-18 06:00:33 +01:00
Anna Khanova	fd49005cb3	proxy: Improve logging (#7405 ) ## Problem It's unclear from logs what's going on with the regional redis. ## Summary of changes Make logs better.	2024-04-17 11:33:31 +00:00
Anna Khanova	13b9135d4e	proxy: Cleanup unused rate limiter (#7400 ) ## Problem There is unused dead code. ## Summary of changes Let's remove it. In case we need it in the future, we can always bring it back. Also removed cli arguments. They shouldn't be used by anyone but us.	2024-04-17 11:11:49 +02:00
Conrad Ludgate	e5c50bb12b	proxy: rate limit authentication by masked IPv6. (#7316 ) ## Problem Many users have access to ipv6 subnets (eg a /64). That gives them 2^64 addresses to play with ## Summary of changes Truncate the address to /64 to reduce the attack surface. Todo: ~~Will NAT64 be an issue here? AFAIU they put the IPv4 address at the end of the IPv6 address. By truncating we will lose all that detail.~~ It's the same problem as a host sharing IPv6 addresses between clients. I don't think it's up to us to solve. If a customer is getting DDoSed, then they likely need to arrange a dedicated IP with us.	2024-04-16 14:16:34 +00:00
Anna Khanova	110282ee7e	proxy: Exclude private ip errors from recorded metrics (#7389 ) ## Problem Right now we record errors from internal VPC. ## Summary of changes * Exclude it from the metrics. * Simplify pg-sni-router	2024-04-15 20:21:50 +02:00
Anna Khanova	40f15c3123	Read cplane events from regional redis (#7352 ) ## Problem Actually read redis events. ## Summary of changes This is a revert of https://github.com/neondatabase/neon/pull/7350 + fixes. * Fixed events parsing * Added timeout after connection failure * Separated regional and global redis clients.	2024-04-11 18:24:34 +00:00
Conrad Ludgate	5299f917d6	proxy: replace prometheus with measured (#6717 ) ## Problem My benchmarks show that prometheus is not very good. https://github.com/conradludgate/measured We're already using it in storage_controller and it seems to be working well. ## Summary of changes Replace prometheus with my new measured crate in proxy only. Apologies for the large diff. I tried to keep it as minimal as I could. The label types add a bit of boiler plate (but reduce the chance we mistype the labels), and some of our custom metrics like CounterPair and HLL needed to be rewritten.	2024-04-11 16:26:01 +00:00
Anna Khanova	0bb04ebe19	Revert "Proxy read ids from redis (#7205 )" (#7350 ) This reverts commit `dbac2d2c47`. ## Problem Proxy pods fail to install in k8s clusters, cplane release blocking. ## Summary of changes Revert	2024-04-10 10:12:55 +00:00
Anna Khanova	dbac2d2c47	Proxy read ids from redis (#7205 ) ## Problem Proxy doesn't know about existing endpoints. ## Summary of changes * Added caching of all available endpoints. * On the high load, use it before going to cplane. * Report metrics for the outcome. * For rate limiter and credentials caching don't distinguish between `-pooled` and not TODOs: * Make metrics more meaningful * Consider integrating it with the endpoint rate limiter * Test it together with cplane in preview	2024-04-10 02:40:14 +02:00
Conrad Ludgate	55da8eff4f	proxy: report metrics based on cold start info (#7324 ) ## Problem Would be nice to have a bit more info on cold start metrics. ## Summary of changes * Change connect compute latency to include `cold_start_info`. * Update `ColdStartInfo` to include HttpPoolHit and WarmCached. * Several changes to make more use of interned strings	2024-04-05 16:14:50 +01:00
Anna Khanova	582cec53c5	proxy: upload consumption events to S3 (#7213 ) ## Problem If vector is unavailable, we are missing consumption events. https://github.com/neondatabase/cloud/issues/9826 ## Summary of changes Added integration with the consumption bucket.	2024-04-02 21:46:23 +02:00
Conrad Ludgate	12512f3173	add authentication rate limiting (#6865 ) ## Problem https://github.com/neondatabase/cloud/issues/9642 ## Summary of changes 1. Make `EndpointRateLimiter` generic, renamed as `BucketRateLimiter` 2. Add support for claiming multiple tokens at once 3. Add `AuthRateLimiter` alias. 4. Check `(Endpoint, IP)` pair during authentication, weighted by how many hashes proxy would be doing. TODO: handle ipv6 subnets. will do this in a separate PR.	2024-03-26 19:31:19 +00:00
Anna Khanova	6770ddba2e	proxy: connect redis with AWS IAM (#7189 ) ## Problem Support of IAM Roles for Service Accounts for authentication. ## Summary of changes * Obtain aws 15m-long credentials * Retrieve redis password from credentials * Update every 1h to keep connection for more than 12h * For now allow to have different endpoints for pubsub/stream redis. TODOs: * PubSub doesn't support credentials refresh, consider using stream instead. * We need an AWS role for proxy to be able to connect to both: S3 and elasticache. Credentials obtaining and connection refresh was tested on xenon preview. https://github.com/neondatabase/cloud/issues/10365	2024-03-22 09:38:04 +01:00
Conrad Ludgate	02358b21a4	update rustls (#7048 ) ## Summary of changes Update rustls from 0.21 to 0.22. reqwest/tonic/aws-smithy still use rustls 0.21. no upgrade route available yet.	2024-03-07 18:23:19 +00:00
Arpad Müller	82853cc1d1	Fix warnings and compile errors on nightly (#6886 ) Nightly has added a bunch of compiler and linter warnings. There are also two dependencies that fail compilation on latest nightly due to using the old `stdsimd` feature name. This PR fixes them.	2024-03-01 17:14:19 +01:00
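
Commit e5c50bb12b (#7316) in the log above describes truncating IPv6 client addresses to their /64 prefix before rate limiting authentication, so that one subnet counts as one client. The following is a minimal sketch of that masking step only, not the proxy's actual code: the function name `rate_limit_key` is hypothetical, and leaving IPv4 addresses unchanged is an assumption.

```rust
use std::net::{IpAddr, Ipv6Addr};

/// Hypothetical helper (not taken from the neon source): map a client
/// address to the key a rate limiter would count against.
fn rate_limit_key(ip: IpAddr) -> IpAddr {
    match ip {
        // Assumption: IPv4 addresses are used as-is.
        IpAddr::V4(_) => ip,
        IpAddr::V6(v6) => {
            // Keep the upper 64 bits (the /64 prefix) and zero the
            // lower 64 bits (the interface identifier).
            let masked = u128::from(v6) & (u128::MAX << 64);
            IpAddr::V6(Ipv6Addr::from(masked))
        }
    }
}
```

With this sketch, two addresses inside the same /64 (for example `2001:db8:1:2::1` and `2001:db8:1:2:aaaa:bbbb:cccc:dddd`) produce the same key, while an address in a neighbouring /64 produces a different one.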


94 Commits
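
Commit 6ca41d3438 (#8470) describes a leaky bucket rate limiter: one bucket per endpoint, drained lazily by looking at the time elapsed since the last check, with a sustained rate and a maximum bucket size. A rough sketch of that idea follows. The type `LeakyBucket`, its fields, and its methods are hypothetical names chosen for illustration and are not the proxy's implementation; sharding and garbage collection are left out.

```rust
use std::time::{Duration, Instant};

/// Hypothetical single-endpoint bucket (illustrative names only).
struct LeakyBucket {
    level: f64,    // tokens currently held in the bucket
    last: Instant, // time of the previous drain
    rps: f64,      // sustained drain rate, tokens per second
    max: f64,      // maximum bucket size (burst allowance)
}

impl LeakyBucket {
    fn new(rps: f64, max: f64, now: Instant) -> Self {
        LeakyBucket { level: 0.0, last: now, rps, max }
    }

    /// Drain en masse: remove every token that leaked out since `last`.
    fn drain(&mut self, now: Instant) {
        let elapsed = now.saturating_duration_since(self.last).as_secs_f64();
        self.level = (self.level - elapsed * self.rps).max(0.0);
        self.last = now;
    }

    /// Try to add `n` tokens; returns false if the request should be rejected.
    fn check(&mut self, now: Instant, n: f64) -> bool {
        self.drain(now);
        if self.level + n > self.max {
            false
        } else {
            self.level += n;
            true
        }
    }

    /// An empty bucket is the kind a garbage collector could safely drop.
    fn is_empty(&self) -> bool {
        self.level <= 0.0
    }
}
```

In this sketch a bucket created with `rps = 1.0` and `max = 2.0` accepts two single-token requests at the same instant, rejects a third, and accepts one more after a second has passed. Passing `now` explicitly instead of calling `Instant::now()` inside the methods is a design choice that keeps the sketch deterministic to test.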