feat: provide timeout parameter for merge_insert (#2378)

Provides the ability to set a timeout for merge insert. The default
underlying timeout is however long the first attempt takes, or if there
are multiple attempts, 30 seconds. This has two use cases:

1. Make the timeout shorter, when you want to fail if it takes too long.
2. Allow taking more time to do retries.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Added support for specifying a timeout when performing merge insert
operations in Python, Node.js, and Rust APIs.
- Introduced a new option to control the maximum allowed execution time
for merge inserts, including retry timeout handling.

- **Documentation**
- Updated and added documentation to describe the new timeout option and
its usage in APIs.

- **Tests**
- Added and updated tests to verify correct timeout behavior during
merge insert operations.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
This commit is contained in:
Will Jones
2025-05-08 13:07:05 -07:00
committed by GitHub
parent 75c257ebb6
commit 272e4103b2
16 changed files with 179 additions and 33 deletions

View File

@@ -937,7 +937,7 @@ def test_merge_insert(mem_db: DBConnection):
table.merge_insert("a")
.when_matched_update_all()
.when_not_matched_insert_all()
.execute(new_data)
.execute(new_data, timeout=timedelta(seconds=10))
)
assert merge_insert_res.version == 2
assert merge_insert_res.num_inserted_rows == 1
@@ -1013,6 +1013,12 @@ def test_merge_insert(mem_db: DBConnection):
expected = pa.table({"a": [2, 4], "b": ["x", "z"]})
assert table.to_arrow().sort_by("a") == expected
# timeout
with pytest.raises(Exception, match="merge insert timed out"):
table.merge_insert("a").when_matched_update_all().execute(
new_data, timeout=timedelta(0)
)
# We vary the data format because there are slight differences in how
# subschemas are handled in different formats