fix(pageserver): reject non-i128 key on the write path (#8648)

It's better to reject invalid keys on the write path than to store them
and panic the pageserver later.
https://github.com/neondatabase/neon/issues/8636

## Summary of changes

If a key cannot be represented using i128, we don't allow writing that
key into the pageserver.

There are two versions of the validity check: the normal one, which
simply rejects non-i128 keys, and the stronger one, which rejects all
keys that we don't support.
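
As a rough sketch of how the two checks relate: the `SketchKey` type, its
fields, the 16-bit tablespace rule, the allow-list of kinds, and the names
`is_i128_representable` and `is_valid_key_on_write_path_strong` are all
illustrative assumptions, not the actual pageserver definitions. Only
`is_valid_key_on_write_path` is a name taken from this patch.

```rust
/// Hypothetical, simplified key used only for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SketchKey {
    pub kind: u8,
    pub tablespace: u32,
}

impl SketchKey {
    /// Assumed rule: the compact form only has room for a 16-bit tablespace.
    pub fn is_i128_representable(&self) -> bool {
        self.tablespace <= 0xFFFF
    }

    /// Normal check: reject only keys that have no compact (i128) form.
    pub fn is_valid_key_on_write_path(&self) -> bool {
        self.is_i128_representable()
    }

    /// Stronger check: also require a known key kind (assumed allow-list).
    pub fn is_valid_key_on_write_path_strong(&self) -> bool {
        self.is_i128_representable() && matches!(self.kind, 0 | 1)
    }
}

/// Fail the whole batch on the first key the normal check rejects.
pub fn check_batch(keys: &[SketchKey]) -> Result<(), String> {
    for key in keys {
        if !key.is_valid_key_on_write_path() {
            return Err(format!(
                "the request contains data not supported by pageserver: {:?}",
                key
            ));
        }
    }
    Ok(())
}
```

The stronger check is a strict subset of the normal one in this sketch, so
switching between them only changes how many keys get rejected, not how the
rejection is reported.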

The current behavior when a key gets rejected is that the safekeeper
keeps retrying to stream that key to the pageserver. And once such a key
gets written, no new computes can be started. Therefore, there could be
a large number of pageserver warnings if a key cannot be ingested. To
validate this behavior yourself, (1) use the stronger version of the
validity check and (2) run the following SQL.

```
set neon.regress_test_mode = true;
CREATE TABLESPACE regress_tblspace LOCATION '/Users/skyzh/Work/neon-test/tablespace';
CREATE SCHEMA testschema;
CREATE TABLE testschema.foo (i int) TABLESPACE regress_tblspace;
insert into testschema.foo values (1), (2), (3);
```

For now, I'd like to merge the patch rejecting only non-i128 keys.
It's still unknown whether the stronger version covers all the cases
that basebackup doesn't support. Furthermore, rejecting a key produces
a large number of warnings due to safekeeper retries. Therefore, I'd
like to reject only the minimum set of keys that we don't support
(the non-i128 ones) for now. (Well, erroring out is better than
panicking in `to_compact_key`.)

The next step is to fix the safekeeper behavior (i.e., on such key
rejections, stop streaming WAL), so that we can properly stop writing.
An alternative solution is to simply drop these keys on the write path.
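
To make the difference between the two options concrete, here is a minimal
sketch of rejecting versus dropping while a batch is built. The
`InvalidKeyPolicy` enum, `is_valid`, and `build_batch` are hypothetical
names, and the plain `u64` keys with a 16-bit limit only stand in for the
real key type and check.

```rust
/// Hypothetical policy for keys that fail the write-path check.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InvalidKeyPolicy {
    /// Fail the whole batch (what this patch does).
    Reject,
    /// Silently skip the key (the alternative mentioned above).
    Drop,
}

/// Stand-in validity rule, assumed for illustration only.
pub fn is_valid(key: u64) -> bool {
    key <= 0xFFFF
}

/// Build a write batch, handling invalid keys according to `policy`.
pub fn build_batch(
    pending: Vec<(u64, u32)>,
    policy: InvalidKeyPolicy,
) -> Result<Vec<(u64, u32)>, String> {
    pending
        .into_iter()
        .filter_map(|(key, value)| {
            if is_valid(key) {
                Some(Ok((key, value)))
            } else {
                match policy {
                    InvalidKeyPolicy::Reject => {
                        Some(Err(format!("unsupported key: {}", key)))
                    }
                    InvalidKeyPolicy::Drop => None,
                }
            }
        })
        // Collecting into `Result` stops at the first `Err`.
        .collect()
}
```

With `Reject`, the caller sees an error and nothing is written, which is why
the sender's retry behavior matters. With `Drop`, the batch always succeeds
but the data for the invalid key is lost.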

---------

Signed-off-by: Alex Chi Z <chi@neon.tech>
Alex Chi Z.
2024-08-29 22:07:05 +08:00
committed by GitHub
parent 18bfc43fa7
commit 653a6532a2
4 changed files with 49 additions and 7 deletions

@@ -12,7 +12,7 @@ use crate::keyspace::{KeySpace, KeySpaceAccum};
use crate::span::debug_assert_current_span_has_tenant_and_timeline_id_no_shard_id;
use crate::walrecord::NeonWalRecord;
use crate::{aux_file, repository::*};
-use anyhow::{ensure, Context};
+use anyhow::{bail, ensure, Context};
use bytes::{Buf, Bytes, BytesMut};
use enum_map::Enum;
use pageserver_api::key::{
@@ -1791,6 +1791,11 @@ impl<'a> DatadirModification<'a> {
// Flush relation and SLRU data blocks, keep metadata.
let mut retained_pending_updates = HashMap::<_, Vec<_>>::new();
for (key, values) in self.pending_updates.drain() {
+if !key.is_valid_key_on_write_path() {
+bail!(
+"the request contains data not supported by pageserver at TimelineWriter::put: {}", key
+);
+}
let mut write_batch = Vec::new();
for (lsn, value_ser_size, value) in values {
if key.is_rel_block_key() || key.is_slru_block_key() {
@@ -1843,10 +1848,13 @@ impl<'a> DatadirModification<'a> {
.drain()
.flat_map(|(key, values)| {
values.into_iter().map(move |(lsn, val_ser_size, value)| {
-(key.to_compact(), lsn, val_ser_size, value)
+if !key.is_valid_key_on_write_path() {
+bail!("the request contains data not supported by pageserver at TimelineWriter::put: {}", key);
+}
+Ok((key.to_compact(), lsn, val_ser_size, value))
})
})
-.collect::<Vec<_>>();
+.collect::<anyhow::Result<Vec<_>>>()?;
writer.put_batch(batch, ctx).await?;
}