Based on this comment: https://github.com/lancedb/lancedb/issues/2228#issuecomment-2730463075 and https://github.com/lancedb/lance/pull/2357

Here is my attempt at implementing bindings for returning merge stats from a `merge_insert.execute` call for lancedb.

Note: I have almost no idea what I am doing in Rust but tried to follow existing code patterns and pay attention to compiler hints.

- The change in the nodejs binding appeared to be necessary to get compilation to work; presumably this could actually work properly by returning some kind of NAPI JS object of the stats data?
- I am unsure of what to do with the remote/table.rs changes, which were necessary for compilation to work. I assume this is related to LanceDB cloud, but I am unsure of the best way to handle that at this point.

Proof of function:

```python
import pandas as pd
import lancedb

db = lancedb.connect("/tmp/test.db")

test_data = pd.DataFrame(
    {
        "title": ["Hello", "Test Document", "Example", "Data Sample", "Last One"],
        "id": [1, 2, 3, 4, 5],
        "content": [
            "World",
            "This is a test",
            "Another example",
            "More test data",
            "Final entry",
        ],
    }
)

table = db.create_table("documents", data=test_data, exist_ok=True, mode="overwrite")

update_data = pd.DataFrame(
    {
        "title": [
            "Hello, World",
            "Test Document, it's good",
            "Example",
            "Data Sample",
            "Last One",
            "New One",
        ],
        "id": [1, 2, 3, 4, 5, 6],
        "content": [
            "World",
            "This is a test",
            "Another example",
            "More test data",
            "Final entry",
            "New content",
        ],
    }
)

stats = (
    table.merge_insert(on="id")
    .when_matched_update_all()
    .when_not_matched_insert_all()
    .execute(update_data)
)

print(stats)
```

returns

```
{'num_inserted_rows': 1, 'num_updated_rows': 5, 'num_deleted_rows': 0}
```

## Summary by CodeRabbit

- **New Features**
  - Merge-insert operations now return detailed statistics, including counts of inserted, updated, and deleted rows.
- **Bug Fixes**
  - Tests updated to validate returned merge-insert statistics for accuracy.
- **Documentation**
  - Method documentation improved to reflect new return values and clarify merge operation results.
  - Added documentation for the new `MergeStats` interface detailing operation statistics.

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
LanceDB Documentation
LanceDB docs are deployed to https://lancedb.github.io/lancedb/.
The docs are built and deployed automatically by GitHub Actions
whenever a commit is pushed to the main branch, so the docs may show
unreleased features.
Building the docs
Setup
- Install LanceDB Python. See the setup instructions in the Python contributing guide. Run `make develop` to install the Python package.
- Install documentation dependencies. From the LanceDB repo root:
pip install -r docs/requirements.txt
Preview the docs
cd docs
mkdocs serve
If you only want to generate the HTML files, run this from the repo root:
PYTHONPATH=. mkdocs build -f docs/mkdocs.yml
If successful, you should see a docs/site directory that you can verify locally.
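As a quick sanity check after a build, a small script along these lines could confirm that the output directory exists and contains a top-level index page. This is only a sketch: the helper name `site_looks_built` is hypothetical, and checking for index.html assumes the build uses MkDocs' default output layout.

```python
from pathlib import Path


def site_looks_built(site_dir: Path) -> bool:
    """Return True if site_dir exists and holds a top-level index.html."""
    return site_dir.is_dir() and (site_dir / "index.html").is_file()


if __name__ == "__main__":
    # Run from the repo root after `mkdocs build -f docs/mkdocs.yml`.
    site = Path("docs/site")
    status = "ok" if site_looks_built(site) else "missing or incomplete"
    print(f"{site}: {status}")
```

Opening docs/site/index.html in a browser is still the more thorough check, since this only confirms that the build produced output.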
Adding examples
To make sure examples are correct, we put examples in test files so they can be run as part of our test suites.
The tests are located at:
- Python: python/python/tests/docs
- TypeScript: nodejs/examples/
Checking Python examples
cd python
pytest -vv python/tests/docs
Checking TypeScript examples
The @lancedb/lancedb package must be built before running the tests:
pushd nodejs
npm ci
npm run build
popd
Then you can run the examples by going to the nodejs/examples directory and
running the tests like a normal npm package:
pushd nodejs/examples
npm ci
npm test
popd
API documentation
Python
The Python API documentation is organized based on the file docs/src/python/python.md.
We manually add entries there so we can control the organization of the reference page.
However, this means any new types must be manually added to the file. No additional
steps are needed to generate the API documentation.
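Since new types have to be added by hand, a small check like the following could help spot missing entries. It is a sketch under assumptions: it assumes the reference page lists each item on a line of the form `::: dotted.name` (the mkdocstrings directive syntax), the helper names are hypothetical, and `lancedb.table.SomeNewType` is a made-up identifier standing in for a newly added type.

```python
import re


def documented_names(markdown_text: str) -> set[str]:
    """Collect identifiers referenced by lines of the form `::: dotted.name`."""
    pattern = re.compile(r"^:::[ \t]+([\w.]+)[ \t]*$", re.MULTILINE)
    return set(pattern.findall(markdown_text))


def missing_entries(markdown_text: str, expected: list[str]) -> list[str]:
    """Return the expected identifiers that have no entry, in input order."""
    present = documented_names(markdown_text)
    return [name for name in expected if name not in present]


# Hypothetical excerpt standing in for docs/src/python/python.md.
sample = """\
## Connections

::: lancedb.connect

## Tables

::: lancedb.table.Table
"""

# SomeNewType is a made-up name used only to show a missing entry.
print(missing_entries(sample, ["lancedb.table.Table", "lancedb.table.SomeNewType"]))
# prints ['lancedb.table.SomeNewType']
```

In practice the text would be read from docs/src/python/python.md, and the expected list would be whatever new public types a change introduces.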
TypeScript
The TypeScript API documentation is generated from the TypeScript source code using TypeDoc.
When new APIs are added, you must manually re-run the docs command below to update the API documentation. The newly generated files should be checked into the repository.
pushd nodejs
npm run docs
popd