
rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2025-12-25 23:29:59 +00:00

Author	SHA1	Message	Date
Ruslan Talpa	6549708b44	change subzero dep sha	2025-07-04 13:39:49 +03:00
Ruslan Talpa	9f46ca5eb1	Merge branch 'main' into ruslan/subzero-integration	2025-07-04 13:03:55 +03:00
Ruslan Talpa	54da030a2d	place the entire rest_broker code under a feature flag	2025-07-04 12:46:48 +03:00
Ruslan Talpa	afa4e48071	put subzero dependency under a feature flag	2025-07-04 11:27:36 +03:00
Conrad Ludgate	03e604e432	Nightly lints and small tweaks (#12456) Let chains available in 1.88 :D new clippy lints coming up in future releases.	2025-07-03 14:47:12 +00:00
Ruslan Talpa	b54872a4dc	fix error after merging latest master	2025-07-03 17:24:13 +03:00
Ruslan Talpa	486829f875	Merge branch 'main' into ruslan/subzero-integration	2025-07-03 17:10:43 +03:00
Ruslan Talpa	95e1011cd6	subzero pre-integration refactor (#12416) ## Problem integrating subzero requires a bit of refactoring. To make the integration PR a bit more manageable, the refactoring is done in this separate PR. ## Summary of changes * move common types/functions used in sql_over_http to errors.rs and http_util.rs * add the "Local" auth backend to proxy (similar to local_proxy), useful in local testing * change the Connect and Send type for the http client to allow for custom body when making post requests to local_proxy from the proxy --------- Co-authored-by: Ruslan Talpa <ruslan.talpa@databricks.com>	2025-07-03 11:04:08 +00:00
Conrad Ludgate	1bc1eae5e8	fix redis credentials check (#12455) ## Problem `keep_connection` does not exit, so it was never setting `credentials_refreshed`. ## Summary of changes Set `credentials_refreshed` to true when we first establish a connection, and after we re-authenticate the connection.	2025-07-03 09:51:35 +00:00
Folke Behrens	3415b90e88	proxy/logging: Add "ep" and "query_id" to list of extracted fields (#12437) Extract two more interesting fields from spans: ep (endpoint) and query_id. Useful for reliable filtering in logging.	2025-07-03 08:09:10 +00:00
Conrad Ludgate	e01c8f238c	[proxy] update noisy error logging (#12438) Health checks for pg-sni-router open a TCP connection and immediately close it again. This is noisy. We will filter out any EOF errors on the first message. "acquired permit" debug log is incorrect since it logs when we timed out as well. This fixes the debug log.	2025-07-03 07:46:48 +00:00
Conrad Ludgate	45607cbe0c	[local_proxy]: ignore TLS for endpoint (#12316) ## Problem When local proxy is configured with TLS, the certificate does not match the endpoint string. This currently returns an error. ## Summary of changes I don't think this code is necessary anymore, taking the prefix from the hostname is good enough (and is equivalent to what `endpoint_sni` was doing) and we ignore checking the domain suffix.	2025-07-03 07:35:57 +00:00
Conrad Ludgate	d6beb3ffbb	[proxy] rewrite pg-text to json routines (#12413) We would like to move towards an arena system for JSON encoding the responses. This change pushes an "out" parameter into the pg-text to json routines to make swapping in an arena system easier in the future. (see #11992) This additionally removes the redundant `column: &[Type]` argument, as well as rewriting the pg_array parser. --- I rewrote the pg_array parser since while making these changes I found it hard to reason about. I went back to the specification and rewrote it from scratch. There are 4 separate routines: 1. pg_array_parse - checks for any prelude (multidimensional array ranges) 2. pg_array_parse_inner - only deals with the arrays themselves 3. pg_array_parse_item - parses a single item from the array, this might be quoted, unquoted, or another nested array. 4. pg_array_parse_quoted - parses a quoted string, following the relevant string escaping rules.	2025-07-02 12:46:11 +00:00
Ivan Efremov	0f879a2e8f	[proxy]: Fix redis IRSA expiration failure errors (#12430) Relates to the [#30688](https://github.com/neondatabase/cloud/issues/30688)	2025-07-02 08:55:44 +00:00
Conrad Ludgate	4932963bac	[proxy]: dont log user errors from postgres (#12412) ## Problem #8843 User initiated sql queries are being classified as "postgres" errors, whereas they're really user errors. ## Summary of changes Classify user-initiated postgres errors as user errors if they are related to a sql query that we ran on their behalf. Do not log those errors.	2025-07-01 13:03:34 +00:00
Ruslan Talpa	4775aa3e01	Merge branch 'main' into ruslan/subzero-integration	2025-07-01 13:46:57 +03:00
Ruslan Talpa	1785f856b6	Move the local auth backend under the "testing" feature	2025-06-30 16:52:45 +03:00
Ruslan Talpa	69b22b05da	add in readme the way to run auth/rest broker locally	2025-06-30 16:17:35 +03:00
Ruslan Talpa	bf0007fa96	add note about local config read code	2025-06-30 16:12:04 +03:00
Ruslan Talpa	a9bbe7b00b	remove unused imports	2025-06-30 16:02:30 +03:00
Ruslan Talpa	7e3f64b309	implement local auth backend for proxy and remove control plane hacks	2025-06-30 16:00:43 +03:00
Ruslan Talpa	9480d17de7	fix bug in pickcurrent_chema	2025-06-30 12:53:47 +03:00
Ruslan Talpa	424004ec95	apply cargo fmt	2025-06-30 12:32:47 +03:00
Ruslan Talpa	88d1a78260	cleanup the rest path code	2025-06-30 12:30:33 +03:00
Folke Behrens	0ee15002fc	proxy: Move client connection accept and handshake to pglb (#12380) * This must be a no-op. * Move proxy::task_main to pglb::task_main. * Move client accept, TLS and handshake to pglb. * Keep auth and wake in proxy.	2025-06-27 15:20:23 +00:00
Ruslan Talpa	8e544c7f99	import introspection queries instead of loading from files	2025-06-27 14:40:02 +03:00
Ruslan Talpa	4f49fc5b79	move common error types and http related functions to error.rs and http_util.rs	2025-06-27 13:37:29 +03:00
Conrad Ludgate	abc1efd5a6	[proxy] fix connect_to_compute retry handling (#12351) # Problem In #12335 I moved the `authenticate` method outside of the `connect_to_compute` loop. This triggered [e2e tests to become flaky](https://github.com/neondatabase/cloud/pull/30533). This highlighted an edge case we forgot to consider with that change. When we connect to compute, the compute IP might be cached. This cache hit might however be stale. Because we can't validate the IP is associated with a specific compute-id☨, we will succeed the connect_to_compute operation and fail when it comes to password authentication☨☨. Before the change, we were invalidating the cache and triggering wake_compute if the authentication failed. Additionally, I noticed some faulty logic I introduced 1 year ago https://github.com/neondatabase/neon/pull/8141/files#diff-5491e3afe62d8c5c77178149c665603b29d88d3ec2e47fc1b3bb119a0a970afaL145-R147 ☨ We can when we roll out TLS, as the certificate common name includes the compute-id. ☨☨ Technically password authentication could pass for the wrong compute, but I think this would only happen in the very very rare event that the IP got reused and the compute's endpoint happened to be a branch/replica. # Solution 1. Fix the broken logic 2. Simplify cache invalidation (I don't know why it was so convoluted) 3. Add a loop around connect_to_compute + authenticate to re-introduce the wake_compute invalidation we accidentally removed. I went with this approach to try and avoid interfering with https://github.com/neondatabase/neon/compare/main...cloneable/proxy-pglb-connect-compute-split. The changes made in commit 3 will move into `handle_client_request` I suspect.	2025-06-27 10:36:27 +00:00
Ruslan Talpa	5461039c3f	implement remote config fetch from the db and cache introspected schema	2025-06-27 10:20:26 +03:00
Ruslan Talpa	d6c36d103e	subzero integration WIP6 beginning work on introspection of config and schema shape from the database	2025-06-26 14:42:31 +03:00
Conrad Ludgate	fd1e8ec257	[proxy] review and cleanup CLI args (#12167) I was looking at how we could expose our proxy config as toml again, and as I was writing out the schema format, I noticed some cruft in our CLI args that no longer seem to be in use. The redis change is the most complex, but I am pretty sure it's sound. Since https://github.com/neondatabase/cloud/pull/15613 cplane no longer publishes to the global redis instance.	2025-06-26 11:25:41 +00:00
Ruslan Talpa	8072fae2fe	Merge branch 'main' into ruslan/subzero-integration	2025-06-26 10:19:16 +03:00
Ruslan Talpa	3869d680f9	use a global parsed/cached schema	2025-06-26 10:14:28 +03:00
Conrad Ludgate	517a3d0d86	[proxy]: BatchQueue::call is not cancel safe - make it directly cancellation aware (#12345) ## Problem https://github.com/neondatabase/cloud/issues/30539 If the current leader cancels the `call` function, then it has removed the jobs from the queue, but will never finish sending the responses. Because of this, it is not cancellation safe. ## Summary of changes Document these functions as not cancellation safe. Move cancellation of the queued jobs into the queue itself. ## Alternatives considered 1. We could spawn the task that runs the batch, since that won't get cancelled. * This requires `fn call(self: Arc<Self>)` or `fn call(&'static self)`. 2. We could add another scopeguard and return the requests back to the queue. * This requires that requests are always retry safe, and also requires requests to be `Clone`.	2025-06-25 14:19:20 +00:00
Ruslan Talpa	d3fa228d92	move subzero local test files to a "dot" folder	2025-06-25 16:43:50 +03:00
Conrad Ludgate	27ca1e21be	[console_redirect_proxy]: fix channel binding (#12238) ## Problem While working more on TLS to compute, I realised that Console Redirect -> pg-sni-router -> compute would break if channel binding was set to prefer. This is because the channel binding data would differ between Console Redirect -> pg-sni-router vs pg-sni-router -> compute. I also noticed that I actually disabled channel binding in #12145, since `connect_raw` would think that the connection didn't support TLS. ## Summary of changes Make sure we specify the channel binding. Make sure that `connect_raw` can see if we have TLS support.	2025-06-25 13:41:30 +00:00
Ruslan Talpa	be6a259b85	subzero integration WIP5 cleanup and postprocess the response and set the correct headers/status and handle errors	2025-06-25 14:33:45 +03:00
Ruslan Talpa	af3ca24a5e	remove unused enum values	2025-06-25 14:32:46 +03:00
Dmitry Savelev	158d84ea30	Switch the billing metrics storage format to ndjson. (#12338) ## Problem The billing team wants to change the billing events pipeline and use a common events format in S3 buckets across different event producers. ## Summary of changes Change the events storage format for billing events from JSON to NDJSON. Resolves: https://github.com/neondatabase/cloud/issues/29994	2025-06-24 15:36:36 +00:00
Conrad Ludgate	4dd9ca7b04	[proxy]: authenticate to compute after connect_to_compute (#12335) ## Problem PGLB will do the connect_to_compute logic, neonkeeper will do the session establishment logic. We should split it. ## Summary of changes Moves postgres authentication to compute to a separate routine that happens after connect_to_compute.	2025-06-24 14:15:36 +00:00
Ruslan Talpa	8b44f5b479	subzero integration WIP5 extract the response body from the local proxy response	2025-06-24 17:03:42 +03:00
Ruslan Talpa	d1445cf3eb	subzero integration WIP4 queries generated by subzero reach database and execute successfully	2025-06-24 15:33:51 +03:00
Arpad Müller	552249607d	apply clippy fixes for 1.88.0 beta (#12331) The 1.88.0 stable release is near (this Thursday). We'd like to fix most warnings beforehand so that the compiler upgrade doesn't require approval from too many teams. This is therefore a preparation PR (like similar PRs before it). There are a lot of changes for this release, mostly because the `uninlined_format_args` lint has been added to the `style` lint group. One can read more about the lint [here](https://rust-lang.github.io/rust-clippy/master/#/uninlined_format_args). The PR is the result of `cargo +beta clippy --fix` and `cargo fmt`. One remaining warning is left for the proxy team. --------- Co-authored-by: Conrad Ludgate <conrad@neon.tech>	2025-06-24 10:12:42 +00:00
Ruslan Talpa	67d3026fc4	subzero integration WIP3 * query makes it to the database	2025-06-23 11:48:55 +03:00
Ruslan Talpa	09e62e9b98	subzero integration WIP2	2025-06-23 10:11:06 +03:00
Ruslan Talpa	e121da4bfc	subzero integration WIP1	2025-06-20 15:10:45 +03:00
Conrad Ludgate	a298d2c29b	[proxy] replace the batch cancellation queue, shorten the TTL for cancel keys (#11943) See #11942 Idea: * if connections are short lived, they can get enqueued and then also remove themselves later if they never made it to redis. This reduces the load on the queue. * short lived connections (<10m, most?) will only issue 1 command, we remove the delete command and rely on ttl. * we can enqueue as many commands as we want, as we can always cancel the enqueue, thanks to the ~~intrusive linked lists~~ `BTreeMap`.	2025-06-20 11:48:01 +00:00
Ruslan Talpa	b39f04ab99	add missing parts to make disable_pg_session_jwt flag work	2025-06-20 10:23:42 +03:00
Ruslan Talpa	6bd15908fb	make pg_session_jwt installation optional with a cli flag	2025-06-20 10:17:32 +03:00
Ruslan Talpa	3e36d516c2	vanilla pg docker image setup	2025-06-20 09:37:39 +03:00


688 Commits
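
Commit d6beb3ffbb describes the rewritten pg_array parser as four routines. The following is a minimal sketch of that four-routine split, not the code from the repository: only the routine names come from the commit message, while the `Item` enum, the `Chars` alias, the signatures, and the error strings are assumptions made for illustration. It returns an owned tree rather than writing into an "out" parameter, and it does not skip whitespace around items.

```rust
use std::iter::Peekable;

/// Hypothetical output type: one parsed element of a Postgres array literal.
#[derive(Debug, PartialEq)]
enum Item {
    Null,
    Text(String),
    Array(Vec<Item>),
}

/// Hypothetical alias for the character cursor shared by the routines.
type Chars<'a> = Peekable<std::str::Chars<'a>>;

/// 1. Skips an optional dimension prelude such as `[1:3]=`, then parses the array.
fn pg_array_parse(input: &str) -> Result<Item, String> {
    let body = if input.starts_with('[') {
        let eq = input.find('=').ok_or("unterminated dimension prelude")?;
        &input[eq + 1..]
    } else {
        input
    };
    let mut chars = body.chars().peekable();
    let items = pg_array_parse_inner(&mut chars)?;
    if chars.next().is_some() {
        return Err("trailing characters after array".to_string());
    }
    Ok(Item::Array(items))
}

/// 2. Parses `{item,item,...}` and returns the items.
fn pg_array_parse_inner(chars: &mut Chars<'_>) -> Result<Vec<Item>, String> {
    if chars.next() != Some('{') {
        return Err("expected '{'".to_string());
    }
    let mut items = Vec::new();
    if chars.peek().copied() == Some('}') {
        chars.next(); // empty array
        return Ok(items);
    }
    loop {
        items.push(pg_array_parse_item(chars)?);
        match chars.next() {
            Some(',') => continue,
            Some('}') => return Ok(items),
            other => return Err(format!("expected ',' or '}}', found {other:?}")),
        }
    }
}

/// 3. Parses one item: a nested array, a quoted string, or an unquoted token.
fn pg_array_parse_item(chars: &mut Chars<'_>) -> Result<Item, String> {
    match chars.peek().copied() {
        Some('{') => Ok(Item::Array(pg_array_parse_inner(chars)?)),
        Some('"') => Ok(Item::Text(pg_array_parse_quoted(chars)?)),
        _ => {
            let mut raw = String::new();
            while let Some(&c) = chars.peek() {
                if c == ',' || c == '}' {
                    break;
                }
                raw.push(c);
                chars.next();
            }
            if raw.is_empty() {
                Err("empty item".to_string())
            } else if raw.eq_ignore_ascii_case("NULL") {
                Ok(Item::Null) // an unquoted NULL denotes SQL NULL
            } else {
                Ok(Item::Text(raw))
            }
        }
    }
}

/// 4. Parses a double-quoted string; a backslash escapes the next character.
fn pg_array_parse_quoted(chars: &mut Chars<'_>) -> Result<String, String> {
    chars.next(); // consume the opening quote
    let mut out = String::new();
    loop {
        match chars.next() {
            Some('"') => return Ok(out),
            Some('\\') => match chars.next() {
                Some(c) => out.push(c),
                None => return Err("dangling escape".to_string()),
            },
            Some(c) => out.push(c),
            None => return Err("unterminated quoted string".to_string()),
        }
    }
}
```

Under these assumptions, `pg_array_parse` is the only entry point a caller needs; for example, `pg_array_parse("{1,NULL,{2,3}}")` yields a nested `Item::Array`. Splitting the prelude check from the recursive array routine keeps the recursion free of dimension handling, which appears to be the motivation given in the commit message.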