mirror of
https://github.com/neondatabase/neon.git
synced 2026-05-18 21:50:37 +00:00
## Problem

Password hashing for sql-over-http takes up a lot of CPU. Perhaps we can get away with temporarily caching some steps so we need fewer rounds, which will save some CPU time.

## Summary of changes

The output of pbkdf2 is the XOR of the outputs of each iteration round, e.g. `U1 ^ U2 ^ ... ^ U15 ^ U16 ^ U17 ^ ... ^ Un`. We cache the suffix of the expression, `U16 ^ U17 ^ ... ^ Un`. To compute the result from the cached suffix, we only need to compute the prefix `U1 ^ U2 ^ ... ^ U15`.

The suffix by itself is useless, which prevents its use in brute-force attacks should this cached memory leak. We are also caching the full 4096-round hash in memory, which can be used for brute-force attacks, and this suffix could be used to speed such an attack up. My hope/expectation is that since these will be in different allocations, any such memory exploitation becomes much harder.

Since the full hash cache might be invalidated while the suffix is cached, I'm storing the timestamp of the computation as a way to identify the match.

I also added `zeroize()` to clear the sensitive state from the stack/heap.

For the most security-conscious customers, we hope to roll out OIDC soon, so they can disable passwords entirely.

---

The numbers for the threadpool were pretty random, but according to our busiest region for sql-over-http, we only see about 150 unique endpoints every minute. So storing ~100 of the most common endpoints for that minute should cover the vast majority of requests. 1 minute was chosen so we don't keep data in memory for too long.
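As a rough illustration of the prefix/suffix split described above, here is a minimal sketch. It is not the proxy's implementation: `toy_prf` is a hypothetical, non-cryptographic stand-in for HMAC-SHA-256, blocks are shrunk to a single `u64`, and the names `round_outputs`, `derive_and_split`, and `derive_with_suffix` are invented for this example. `PREFIX_ROUNDS = 15` follows the numbers in this description.

```rust
// Assumed for illustration: 4096 total rounds, of which the first 15 are recomputed.
const TOTAL_ROUNDS: usize = 4096;
const PREFIX_ROUNDS: usize = 15;

/// Toy stand-in for the PRF (HMAC-SHA-256 in real PBKDF2).
/// NOT cryptographically secure; it only gives the rounds something to chain on.
fn toy_prf(key: u64, input: u64) -> u64 {
    let mut x = key ^ input.wrapping_mul(0x9E37_79B9_7F4A_7C15);
    x ^= x >> 30;
    x = x.wrapping_mul(0xBF58_476D_1CE4_E5B9);
    x ^= x >> 27;
    x = x.wrapping_mul(0x94D0_49BB_1331_11EB);
    x ^ (x >> 31)
}

/// Returns [U1, U2, ..., U_rounds], where each round feeds the previous output
/// back into the PRF keyed by the password.
fn round_outputs(password: u64, salt: u64, rounds: usize) -> Vec<u64> {
    let mut out = Vec::with_capacity(rounds);
    let mut u = salt;
    for _ in 0..rounds {
        u = toy_prf(password, u);
        out.push(u);
    }
    out
}

/// XOR of all blocks in the slice (0 for an empty slice).
fn xor_all(blocks: &[u64]) -> u64 {
    blocks.iter().fold(0, |acc, b| acc ^ b)
}

/// Slow path: run every round, returning the full hash and the suffix to cache.
fn derive_and_split(password: u64, salt: u64) -> (u64, u64) {
    let all = round_outputs(password, salt, TOTAL_ROUNDS);
    (xor_all(&all), xor_all(&all[PREFIX_ROUNDS..]))
}

/// Fast path: recompute only the first PREFIX_ROUNDS rounds from the password
/// attempt and combine them with the cached suffix.
fn derive_with_suffix(password: u64, salt: u64, suffix: u64) -> u64 {
    xor_all(&round_outputs(password, salt, PREFIX_ROUNDS)) ^ suffix
}
```

With the correct password, `derive_with_suffix` reproduces the full hash after 15 rounds instead of 4096. With a wrong password the prefix differs, so the combined value does not match and verification against the stored key fails as usual.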
85 lines
2.8 KiB
Rust
use tokio::time::Instant;
use zeroize::Zeroize as _;

use super::pbkdf2;
use crate::cache::Cached;
use crate::cache::common::{Cache, count_cache_insert, count_cache_outcome, eviction_listener};
use crate::intern::{EndpointIdInt, RoleNameInt};
use crate::metrics::{CacheKind, Metrics};

pub(crate) struct Pbkdf2Cache(moka::sync::Cache<(EndpointIdInt, RoleNameInt), Pbkdf2CacheEntry>);
pub(crate) type CachedPbkdf2<'a> = Cached<&'a Pbkdf2Cache>;

impl Cache for Pbkdf2Cache {
    type Key = (EndpointIdInt, RoleNameInt);
    type Value = Pbkdf2CacheEntry;

    fn invalidate(&self, info: &(EndpointIdInt, RoleNameInt)) {
        self.0.invalidate(info);
    }
}

/// To speed up password hashing for more active customers, we store the tail results of the
/// PBKDF2 algorithm. If the output of PBKDF2 is U1 ^ U2 ^ ⋯ ^ Uc, then we store
/// suffix = U17 ^ U18 ^ ⋯ ^ Uc. We only need to calculate U1 ^ U2 ^ ⋯ ^ U15 ^ U16
/// to determine the final result.
///
/// The suffix alone isn't enough to crack the password. The stored_key is still required.
/// While both are cached in memory, they live in different locations, which makes this much
/// harder to exploit, even if any such memory exploit exists in proxy.
#[derive(Clone)]
pub struct Pbkdf2CacheEntry {
    /// corresponds to [`super::ServerSecret::cached_at`]
    pub(super) cached_from: Instant,
    pub(super) suffix: pbkdf2::Block,
}

impl Drop for Pbkdf2CacheEntry {
    fn drop(&mut self) {
        self.suffix.zeroize();
    }
}

impl Pbkdf2Cache {
    pub fn new() -> Self {
        const SIZE: u64 = 100;
        const TTL: std::time::Duration = std::time::Duration::from_secs(60);

        let builder = moka::sync::Cache::builder()
            .name("pbkdf2")
            .max_capacity(SIZE)
            // We use time_to_live so we don't refresh the lifetime for an invalid password attempt.
            .time_to_live(TTL);

        Metrics::get()
            .cache
            .capacity
            .set(CacheKind::Pbkdf2, SIZE as i64);

        let builder =
            builder.eviction_listener(|_k, _v, cause| eviction_listener(CacheKind::Pbkdf2, cause));

        Self(builder.build())
    }

    pub fn insert(&self, endpoint: EndpointIdInt, role: RoleNameInt, value: Pbkdf2CacheEntry) {
        count_cache_insert(CacheKind::Pbkdf2);
        self.0.insert((endpoint, role), value);
    }

    fn get(&self, endpoint: EndpointIdInt, role: RoleNameInt) -> Option<Pbkdf2CacheEntry> {
        count_cache_outcome(CacheKind::Pbkdf2, self.0.get(&(endpoint, role)))
    }

    pub fn get_entry(
        &self,
        endpoint: EndpointIdInt,
        role: RoleNameInt,
    ) -> Option<CachedPbkdf2<'_>> {
        self.get(endpoint, role).map(|value| Cached {
            token: Some((self, (endpoint, role))),
            value,
        })
    }
}