feat(refresh): batch_size is a per-refresh knob (refresh_column), not a function-only option

batch_size, num_workers, and max_workers are invocation concerns: they describe how to
schedule THIS refresh. Expose batch_size on refresh_column through every layer: Python
(sync and async) -> pyo3 -> Rust client -> REST RefreshColumnRequest.batch_size, which
the handler already forwards into the backfill. num_workers and max_workers were already
set per invocation; batch_size was the gap. The function may still carry a default, but
a batch_size passed to refresh_column takes precedence (this extends the
batch_size_override model). Both crates pass cargo check.
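
A minimal sketch of the precedence rule described above, assuming both the per-refresh
value and the function's default are optional. effective_batch_size is a hypothetical
helper used only for illustration; it is not a function in either crate.

```rust
/// Hypothetical helper (not part of this change): choose the batch size for one refresh.
///
/// `refresh_override` stands for the batch_size passed to refresh_column;
/// `function_default` stands for a default carried by the function, if any.
/// Returns None when neither is set, leaving the choice to the backfill.
fn effective_batch_size(
    refresh_override: Option<u32>,
    function_default: Option<u32>,
) -> Option<u32> {
    // The per-refresh value wins; fall back to the function default otherwise.
    refresh_override.or(function_default)
}
```

With this rule, a refresh that passes batch_size uses that value regardless of the
function default, and a refresh that omits it behaves as before.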

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Wyatt Alt
2026-06-14 06:30:08 -07:00
committed by Jack Ye
parent fbe6a5a3fd
commit d4f4fef3ba
4 changed files with 24 additions and 5 deletions

@@ -1074,18 +1074,19 @@ impl Table {
         })
     }
 
-    #[pyo3(signature = (columns, where_clause=None, num_workers=None, max_workers=None))]
+    #[pyo3(signature = (columns, where_clause=None, num_workers=None, max_workers=None, batch_size=None))]
     pub fn refresh_column(
         self_: PyRef<'_, Self>,
         columns: Vec<String>,
         where_clause: Option<String>,
         num_workers: Option<u32>,
         max_workers: Option<u32>,
+        batch_size: Option<u32>,
     ) -> PyResult<Bound<'_, PyAny>> {
         let inner = self_.inner_ref()?.clone();
         future_into_py(self_.py(), async move {
             inner
-                .refresh_column(&columns, where_clause, num_workers, max_workers)
+                .refresh_column(&columns, where_clause, num_workers, max_workers, batch_size)
                 .await
                 .infer_error()
         })