Commit Graph

35 Commits

Author SHA1 Message Date
Dmitry Ivanov
d90cd36bcc [proxy] Improve tracing spans here and there. 2023-02-17 15:32:14 +03:00
Dmitry Ivanov
a4d5c8085b Move hacks to a dedicated module. 2023-02-16 22:10:56 +03:00
Dmitry Ivanov
edffe0dd9d Extract password hack & cleartext hack 2023-02-16 22:10:56 +03:00
Heikki Linnakangas
d9c518b2cc Refactor use_cleartext_password_flow.
It's not a property of the credentials that we receive from the
client, so remove it from ClientCredentials. Instead, pass it as an
argument directly to 'authenticate' function, where it's actually
used. All the rest of the changes is just plumbing to pass it through
the call stack to 'authenticate'
2023-02-16 22:10:56 +03:00
Dmitry Ivanov
3569c1bacd [proxy] Fix: don't cache user & dbname in node info cache
Upstream proxy erroneously stores user & dbname in compute node info
cache entries, thus causing "funny" connection problems if such an entry
is reused while connecting to e.g. a different DB on the same compute node.

This PR fixes the problem but doesn't eliminate the root cause just yet.
I'll revisit this code and make it more type-safe in the upcoming PR.
2023-02-14 17:54:01 +03:00
Dmitry Ivanov
96e78394f5 [proxy] Fix project (aka endpoint) init in the password hack handler (#3529)
The project/endpoint should be set in the original (non-as_ref'd) creds,
because we call `wake_compute` not only in `try_password_hack` but also
later in the connection retry logic.

This PR also removes the obsolete `as_ref` method and makes the code
simpler because we no longer need this complication after a recent
refactoring.

Further action points: finally introduce typestate in creds (planned).
2023-02-02 22:56:15 +02:00
Dmitry Ivanov
ea0278cf27 [proxy] Implement compute node info cache (#3331)
This patch adds a timed LRU cache implementation and a compute node info cache on top of that.
Cache entries might expire on their own (default ttl=5mins) or become invalid due to real-world events,
e.g. compute node scale-to-zero event, so we add a connection retry loop with a wake-up call.

Solved problems:
- [x] Find a decent LRU implementation.
- [x] Implement timed LRU on top of that.
- [x] Cache results of `proxy_wake_compute` API call.
- [x] Don't invalidate newer cache entries for the same key.
- [x] Add cmdline configuration knobs (requires some refactoring).
- [x] Add failed connection estab metric.
- [x] Refactor auth backends to make things simpler (retries, cache
placement, etc).
- [x] Address review comments (add code comments + cleanup).
- [x] Retry `/proxy_wake_compute` if we couldn't connect to a compute
(e.g. stalled cache entry).
- [x] Add high-level description for `TimedLru`.

TODOs (will be addressed later):
- [ ] Add cache metrics (hit, spurious hit, miss).
- [ ] Synchronize http requests across concurrent per-client tasks
(https://github.com/neondatabase/neon/pull/3331#issuecomment-1399216069).
- [ ] Cache results of `proxy_get_role_secret` API call.
2023-02-01 17:11:41 +03:00
Arthur Petukhovsky
debd134b15 Implement wss support in proxy (#3247)
This is a hacky implementation of WebSocket server, embedded into our
postgres proxy. The server is used to allow https://github.com/neondatabase/serverless 
to connect to our postgres from browser and serverless javascript functions.

How it will work (general schema):
- browser opens a websocket connection to
`wss://ep-abc-xyz-123.xx-central-1.aws.neon.tech/`
- proxy accepts this connection and terminates TLS (https)
- inside encrypted tunnel (HTTPS), browser initiates plain
(non-encrypted) postgres connection
- proxy performs auth as in usual plain pg connection and forwards
connection to the compute

Related issue: #3225
2023-01-06 18:34:18 +03:00
Dmitry Ivanov
c700c7db2e [proxy] Add more labels to the pricing metrics 2022-12-29 22:25:52 +03:00
Dmitry Ivanov
83baf49487 [proxy] Forward compute connection params to client
This fixes all kinds of problems related to missing params,
like broken timestamps (due to `integer_datetimes`).

This solution is not ideal, but it will help. Meanwhile,
I'm going to dedicate some time to improving connection machinery.

Note that this **does not** fix problems with passing certain parameters
in a reverse direction, i.e. **from client to compute**. This is a
separate matter and will be dealt with in an upcoming PR.
2022-12-16 21:37:50 +03:00
Dmitry Ivanov
607c0facfc [proxy] Propagate more console API errors to the user
This patch aims to fix some of the inconsistencies in error reporting,
for example "Internal error" or "Console request failed" instead of
"password authentication failed for user '<NAME>'".
2022-12-13 16:16:31 +03:00
Dmitry Ivanov
05db6458df [proxy] Fix project (endpoint) -related error messages 2022-11-23 23:03:29 +03:00
Dmitry Ivanov
9470bc9fe0 [proxy] Implement per-tenant traffic metrics 2022-11-22 18:50:57 +03:00
Dmitry Ivanov
c38f38dab7 Move pq_proto to its own crate 2022-11-03 22:56:04 +03:00
Dmitry Ivanov
6ace79345d [proxy] Add more context to console requests logging (#2583) 2022-10-12 21:00:44 +03:00
Dmitry Ivanov
e516c376d6 [proxy] Improve logging (#2554)
* [proxy] Use `tracing::*` instead of `println!` for logging

* Fix a minor misnomer

* Log more stuff
2022-10-07 14:34:57 +03:00
Dmitry Ivanov
1dffba9de6 Write more tests for the proxy... (#1918)
And change a few more things in the process.
2022-09-23 18:30:44 +03:00
Dmitry Ivanov
e9a103c09f [proxy] Pass extra parameters to the console (#2467)
With this change we now pass additional params
to the console's auth methods.
2022-09-21 21:42:47 +03:00
Dmitry Ivanov
87bf7be537 [proxy] Drop support for legacy cloud API (#2448)
Apparently, it no longer exists in the cloud.
2022-09-14 21:27:47 +03:00
Dmitry Ivanov
96a50e99cf Forward various connection params to compute nodes. (#2336)
Previously, proxy didn't forward auxiliary `options` parameter
and other ones to the client's compute node, e.g.

```
$ psql "user=john host=localhost dbname=postgres options='-cgeqo=off'"
postgres=# show geqo;
┌──────┐
│ geqo │
├──────┤
│ on   │
└──────┘
(1 row)
```

With this patch we now forward `options`, `application_name` and `replication`.

Further reading: https://www.postgresql.org/docs/current/libpq-connect.html

Fixes #1287.
2022-08-30 17:36:21 +03:00
Kirill Bulatov
648e8bbefe Fix 1.63 clippy lints (#2282) 2022-08-16 18:49:22 +03:00
Dmitry Ivanov
ad08c273d3 [proxy] Rework wire format of the password hack and some errors (#2236)
The new format has a few benefits: it's shorter, simpler and
human-readable as well. We don't use base64 anymore, since
url encoding got us covered.

We also show a better error in case we couldn't parse the
payload; the users should know it's all about passing the
correct project name.
2022-08-12 17:38:43 +03:00
Ankur Srivastava
84d1bc06a9 refactor: replace lazy-static with once-cell (#2195)
- Replacing all the occurrences of lazy-static with `once-cell::sync::Lazy`
- fixes #1147

Signed-off-by: Ankur Srivastava <best.ankur@gmail.com>
2022-08-05 19:34:04 +02:00
Dmitry Ivanov
5f4ccae5c5 [proxy] Add the password hack authentication flow (#2095)
[proxy] Add the `password hack` authentication flow

This lets us authenticate users which can use neither
SNI (due to old libpq) nor connection string `options`
(due to restrictions in other client libraries).

Note: `PasswordHack` will accept passwords which are not
encoded in base64 via the "password" field. The assumption
is that most user passwords will be valid utf-8 strings,
and the rest may still be passed via "password_".
2022-07-25 17:23:10 +03:00
KlimentSerafimov
392cd8b1fc Refactored extracting project_name in console.rs. (#1982) 2022-06-24 05:57:33 -04:00
Bojan Serafimov
93e050afe3 Don't require project name for link auth 2022-06-23 15:38:05 +03:00
KlimentSerafimov
d059e588a6 Added invariant check for project name. (#1921)
Summary: Added invariant checking for project name. Refactored ClientCredentials and TlsConfig.

* Added formatting invariant check for project name:
**\forall c \in project_name . c \in [alnum] U {'-'}. 
** sni_data == <project_name>.<common_name>
* Added exhaustive tests for get_project_name.
* Refactored TlsConfig to contain common_name : Option<String>.
* Refactored ClientCredentials construction to construct project_name directly.
* Merged ProjectNameError into ClientCredsParseError.
* Tweaked proxy tests to accommodate refactored ClientCredentials construction semantics. 
* [Pytests] Added project option argument to test_proxy_select_1.
* Removed project param from Api since now it's contained in creds.
* Refactored &Option<String> -> Option<&str>.

Co-authored-by: Dmitrii Ivanov <dima@neon.tech>.
2022-06-22 09:34:24 -04:00
KlimentSerafimov
fecad1ca34 Resolving issue #1745. Added cluster option for SNI data (#1813)
* Added project option in case SNI data is missing. Resolving issue #1745.

* Added invariant checking for project name: if both sni_data and project_name are available then they should match.
2022-06-06 08:14:41 -04:00
Dmitry Ivanov
5d813f9738 [proxy] Refactoring
This patch attempts to fix some of the technical debt
we had to introduce in previous patches.
2022-05-27 21:50:43 +03:00
Stas Kelvich
9a396e1feb Support SNI-based routing in proxy 2022-05-02 18:32:18 +03:00
Stas Kelvich
0323bb5870 [proxy] Refactor cplane API and add new console SCRAM auth API
Now proxy binary accepts `--auth-backend` CLI option, which determines
auth scheme and cluster routing method. Following backends are currently
implemented:

* legacy
    old method, when username ends with `@zenith` it uses md5 auth dbname as
    the cluster name; otherwise, it sends a login link and waits for the console
    to call back
* console
    new SCRAM-based console API; uses SNI info to select the destination
    cluster
* postgres
    uses postgres to select auth secrets of existing roles. Useful for local
    testing
* link
    sends login link for all usernames
2022-05-02 18:32:18 +03:00
Dmitry Ivanov
af0195b604 [proxy] Introduce cloud::Api for communication with Neon Cloud
* `cloud::legacy` talks to Cloud API V1.
* `cloud::api` defines Cloud API v2.
* `cloud::local` mocks the Cloud API V2 using a local postgres instance.
* It's possible to choose between API versions using the `--api-version` flag.
2022-05-02 18:32:18 +03:00
Dmitry Rodionov
695b5f9d88 Remove obsolete failpoint in proxy
When failpoint feature is disabled it throws away passed code so code
inside is not guaranteed to compile when feature is disabled. In this
particular case code is obsolete so removing it.
2022-04-27 14:34:33 +03:00
Kirill Bulatov
81cad6277a Move and library crates into a dedicated directory and rename them 2022-04-21 13:30:33 +03:00
Dmitry Ivanov
4af87f3d60 [proxy] Add SCRAM auth mechanism implementation (#1050)
* [proxy] Add SCRAM auth

* [proxy] Implement some tests for SCRAM

* Refactoring + test fixes

* Hide SCRAM mechanism behind `#[cfg(test)]`

Currently we only use it in tests, so we hide all relevant
module behind `#[cfg(test)]` to prevent "unused item" warnings.
2022-04-13 03:00:32 +03:00