proxy(tokio-postgres): refactor typeinfo query to occur earlier (#11993)

mirror of https://github.com/neondatabase/neon.git synced 2025-12-23 06:09:59 +00:00

## Problem

For #11992 I realised we need to get the type info before executing the
query. Without it we cannot know how to decode rows with custom types,
e.g. the result of the following query:

```sql
CREATE TYPE foo AS ENUM ('foo','bar','baz');
SELECT ARRAY['foo'::foo, 'bar'::foo, 'baz'::foo] AS data;
```

Getting that to work was harder than it seems. The original
tokio-postgres setup is split between `Client` and `Connection`,
with messages passed between the two. Because multiple clients were
supported, each client message included a dedicated response channel.
Each request would be terminated by the `ReadyForQuery` message.

The flow I opted to use for parsing types early would not trigger a
`ReadyForQuery`. The flow is as follows:

```
PARSE "" // parse the user provided query
DESCRIBE "" // describe the query, returning param/result type oids
FLUSH // force postgres to flush the responses early

// wait for descriptions

// check if we know the types, if we don't then
// setup the typeinfo query and execute it against each OID:

PARSE typeinfo // prepare our typeinfo query
DESCRIBE typeinfo
FLUSH // force postgres to flush the responses early

// wait for typeinfo statement

// for each OID we don't know:
BIND typeinfo
EXECUTE
FLUSH

// wait for type info, might reveal more OIDs to inspect

// close the typeinfo query, we cache the OID->type map and this is kinder to pgbouncer.
CLOSE typeinfo

// finally once we know all the OIDs:
BIND "" // bind the user provided query - already parsed - to the user provided params
EXECUTE // run the user provided query
SYNC // commit the transaction
```
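The "might reveal more OIDs to inspect" step is effectively a worklist over a cached OID-to-type map. Below is a minimal sketch of that idea, not the code in this PR: `Kind`, `resolve_types`, and the OIDs are hypothetical, and the `lookup` closure stands in for one `BIND typeinfo` / `EXECUTE` / `FLUSH` round trip.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified type description; a real typeinfo row carries more fields.
#[derive(Debug, Clone, PartialEq)]
enum Kind {
    Simple,
    Enum,
    Array { element: u32 },
}

/// Resolve every OID in `wanted`, consulting `cache` first and calling `lookup`
/// (a stand-in for one typeinfo round trip) on each miss.
/// Returns the number of lookups that were issued.
fn resolve_types(
    wanted: &[u32],
    cache: &mut HashMap<u32, Kind>,
    mut lookup: impl FnMut(u32) -> Kind,
) -> usize {
    let mut pending: Vec<u32> = wanted.to_vec();
    let mut lookups = 0;
    while let Some(oid) = pending.pop() {
        if cache.contains_key(&oid) {
            continue; // already known, no round trip needed
        }
        let kind = lookup(oid);
        lookups += 1;
        // An array type reveals its element OID, which may itself be unknown.
        if let Kind::Array { element } = &kind {
            pending.push(*element);
        }
        cache.insert(oid, kind);
    }
    lookups
}

fn main() {
    // Made-up OIDs: 16400 is an array whose elements are the enum 16399.
    let catalog = |oid: u32| match oid {
        16400 => Kind::Array { element: 16399 },
        16399 => Kind::Enum,
        _ => Kind::Simple,
    };
    let mut cache = HashMap::new();
    let first = resolve_types(&[16400], &mut cache, catalog);
    let second = resolve_types(&[16400], &mut cache, catalog);
    println!("first call: {} lookups, second call: {} lookups", first, second);
}
```

Because the map persists across queries, only the first query that touches a custom type pays for the extra lookups.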

## Summary of changes

Please review commit by commit. The main challenge was allowing one
query to issue multiple sub-queries. To do this I first made sure that
the client could fully own the connection, which required removing any
shared client state. I then had to replace the way responses are sent to
the client, by using only a single permanent channel. This required some
additional effort to track which query is being processed. Lastly I had
to modify the query/typeinfo functions to not issue `sync` commands, so
it would fit into the desired flow above.
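To illustrate why a single permanent channel needs extra tracking, here is a rough sketch using `std::sync::mpsc` rather than the async channels the proxy presumably uses. The `Backend` and `Until` enums and `read_step` are hypothetical names, and the message set is heavily simplified (for instance, `ParameterDescription` is omitted). The point is that a step ended by `FLUSH` never yields `ReadyForQuery`, so the reader must know which message terminates the step it is waiting on.

```rust
use std::sync::mpsc;

/// Simplified backend messages; the real protocol has many more.
#[derive(Debug, Clone, PartialEq)]
enum Backend {
    ParseComplete,
    RowDescription(Vec<u32>), // result column type OIDs
    BindComplete,
    DataRow(String),
    CommandComplete,
    ReadyForQuery,
}

/// Which message ends the step currently being processed.
#[derive(Clone, Copy)]
enum Until {
    RowDescription, // PARSE / DESCRIBE / FLUSH
    ReadyForQuery,  // BIND / EXECUTE / SYNC
}

/// Drain the one shared channel until the terminator for this step arrives.
/// Later messages stay queued for the next step.
fn read_step(rx: &mpsc::Receiver<Backend>, until: Until) -> Vec<Backend> {
    let mut seen = Vec::new();
    for msg in rx.iter() {
        let done = matches!(
            (&msg, until),
            (Backend::RowDescription(_), Until::RowDescription)
                | (Backend::ReadyForQuery, Until::ReadyForQuery)
        );
        seen.push(msg);
        if done {
            break;
        }
    }
    seen
}

fn main() {
    let (tx, rx) = mpsc::channel();
    // Stand-in for the connection task forwarding everything it reads.
    for msg in [
        Backend::ParseComplete,
        Backend::RowDescription(vec![16400]),
        Backend::BindComplete,
        Backend::DataRow("{foo,bar,baz}".to_string()),
        Backend::CommandComplete,
        Backend::ReadyForQuery,
    ] {
        tx.send(msg).unwrap();
    }
    let described = read_step(&rx, Until::RowDescription);
    let executed = read_step(&rx, Until::ReadyForQuery);
    println!("describe step: {:?}", described);
    println!("execute step: {:?}", executed);
}
```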

Note that the flow above adds an extra round trip to each query. I
don't know yet whether this has a measurable latency overhead.

Conrad Ludgate, 2025-05-23 20:41:12 +01:00, committed by GitHub

parent 87fc0a0374

commit 6768a71c86

15 changed files with 500 additions and 745 deletions

									
libs/proxy/postgres-protocol2/src/message/frontend.rs

    @@ -25,6 +25,7 @@ where
        Ok(())
    }

    #[derive(Debug)]
    pub enum BindError {
        Conversion(Box<dyn Error + marker::Sync + Send>),
        Serialization(io::Error),
    @@ -288,6 +289,12 @@ pub fn sync(buf: &mut BytesMut) {
        write_body(buf, |_| Ok::<(), io::Error>(())).unwrap();
    }

    #[inline]
    pub fn flush(buf: &mut BytesMut) {
        buf.put_u8(b'H');
        write_body(buf, |_| Ok::<(), io::Error>(())).unwrap();
    }

    #[inline]
    pub fn terminate(buf: &mut BytesMut) {
        buf.put_u8(b'X');
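For reference, the bytes produced by `flush` can be reproduced with the standard library alone. The sketch below assumes a `Vec<u8>` instead of `BytesMut`, and its `write_body` is a simplified stand-in for the crate's helper (no error handling), so treat it as an illustration of the wire format rather than the actual implementation.

```rust
/// Append a message body preceded by its big-endian i32 length.
/// In the Postgres wire format the length counts its own four bytes.
fn write_body(buf: &mut Vec<u8>, body: impl FnOnce(&mut Vec<u8>)) {
    let start = buf.len();
    buf.extend_from_slice(&[0; 4]); // length placeholder
    body(buf);
    let len = (buf.len() - start) as i32;
    buf[start..start + 4].copy_from_slice(&len.to_be_bytes());
}

/// Flush: tag 'H' with an empty body. Asks the server to send pending
/// responses; it does not produce ReadyForQuery.
fn flush(buf: &mut Vec<u8>) {
    buf.push(b'H');
    write_body(buf, |_| {});
}

/// Sync: tag 'S' with an empty body. The server answers with ReadyForQuery.
fn sync(buf: &mut Vec<u8>) {
    buf.push(b'S');
    write_body(buf, |_| {});
}

fn main() {
    let mut buf = Vec::new();
    flush(&mut buf);
    sync(&mut buf);
    // Each message is five bytes: one tag byte plus a length of 4.
    println!("{:?}", buf);
}
```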
