Alex Pilon f315f9665a feat: implement bindings to return merge stats (#2367)
Based on this comment:
https://github.com/lancedb/lancedb/issues/2228#issuecomment-2730463075
and https://github.com/lancedb/lance/pull/2357

Here is my attempt at implementing bindings for returning merge stats
from a `merge_insert.execute` call for lancedb.

Note: I have almost no idea what I am doing in Rust but tried to follow
existing code patterns and pay attention to compiler hints.
- The change in the nodejs binding appeared to be necessary to get
compilation to work; presumably this could actually work properly by
returning some kind of NAPI JS object of the stats data?
- I am unsure what to do with the remote/table.rs changes, which were
necessary for compilation to work; I assume this is related to LanceDB
Cloud, but I am unsure of the best way to handle that at this point.

Proof of function:

```python
import pandas as pd
import lancedb


db = lancedb.connect("/tmp/test.db")

test_data = pd.DataFrame(
    {
        "title": ["Hello", "Test Document", "Example", "Data Sample", "Last One"],
        "id": [1, 2, 3, 4, 5],
        "content": [
            "World",
            "This is a test",
            "Another example",
            "More test data",
            "Final entry",
        ],
    }
)

table = db.create_table("documents", data=test_data, exist_ok=True, mode="overwrite")

update_data = pd.DataFrame(
    {
        "title": [
            "Hello, World",
            "Test Document, it's good",
            "Example",
            "Data Sample",
            "Last One",
            "New One",
        ],
        "id": [1, 2, 3, 4, 5, 6],
        "content": [
            "World",
            "This is a test",
            "Another example",
            "More test data",
            "Final entry",
            "New content",
        ],
    }
)

stats = (
    table.merge_insert(on="id")
    .when_matched_update_all()
    .when_not_matched_insert_all()
    .execute(update_data)
)

print(stats)
```

returns

```
{'num_inserted_rows': 1, 'num_updated_rows': 5, 'num_deleted_rows': 0}
```
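A caller could then act on these counts. The snippet below is only a sketch: it assumes the returned stats behave like the plain dict printed above, and `summarize_merge_stats` is a hypothetical helper, not part of lancedb.

```python
def summarize_merge_stats(stats: dict) -> str:
    """Turn merge stats into a one-line, human-readable summary.

    Assumes `stats` is a mapping with the three keys shown in the
    output above.
    """
    inserted = stats["num_inserted_rows"]
    updated = stats["num_updated_rows"]
    deleted = stats["num_deleted_rows"]
    total = inserted + updated + deleted
    return (
        f"{total} rows touched: "
        f"{inserted} inserted, {updated} updated, {deleted} deleted"
    )


# The dict literal is copied from the output above.
example_stats = {"num_inserted_rows": 1, "num_updated_rows": 5, "num_deleted_rows": 0}
print(summarize_merge_stats(example_stats))
```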

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
  - Merge-insert operations now return detailed statistics, including
    counts of inserted, updated, and deleted rows.
- **Bug Fixes**
  - Tests updated to validate returned merge-insert statistics for
    accuracy.
- **Documentation**
  - Method documentation improved to reflect new return values and clarify
    merge operation results.
  - Added documentation for the new `MergeStats` interface detailing
    operation statistics.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2025-05-01 10:00:20 -07:00

LanceDB Documentation

LanceDB docs are deployed to https://lancedb.github.io/lancedb/.

The docs are built and deployed automatically by GitHub Actions whenever a commit is pushed to the main branch, so the deployed docs may describe unreleased features.

Building the docs

Setup

  1. Install LanceDB Python. See the setup instructions in the Python contributing guide, then run make develop to install the Python package.
  2. Install the documentation dependencies. From the LanceDB repo root: pip install -r docs/requirements.txt

Preview the docs

cd docs
mkdocs serve

If you only want to generate the HTML files, run the following from the repo root:

PYTHONPATH=. mkdocs build -f docs/mkdocs.yml

If successful, you should see a docs/site directory that you can verify locally.
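One way to verify the build output without opening a browser is a small check script. This is a sketch, not part of the repo: check_site is a hypothetical helper, and it assumes the build wrote an index.html at the top of docs/site, which is what MkDocs normally does.

```python
from pathlib import Path


def check_site(site_dir: Path) -> list[str]:
    """Return a list of problems found in a built MkDocs site directory."""
    problems = []
    if not site_dir.is_dir():
        problems.append(f"{site_dir} does not exist")
        return problems
    # Assumption: a successful build places index.html at the site root.
    if not (site_dir / "index.html").is_file():
        problems.append(f"{site_dir / 'index.html'} is missing")
    return problems


if __name__ == "__main__":
    # Run from the repo root, after the build command above.
    issues = check_site(Path("docs/site"))
    print("\n".join(issues) if issues else "docs/site looks OK")
```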

Adding examples

To make sure examples are correct, we put examples in test files so they can be run as part of our test suites.

The tests are located at:

  • Python: python/python/tests/docs
  • TypeScript: nodejs/examples/

Checking Python examples

cd python
pytest -vv python/tests/docs

Checking TypeScript examples

The @lancedb/lancedb package must be built before running the tests:

pushd nodejs
npm ci
npm run build
popd

Then you can run the examples by going to the nodejs/examples directory and running the tests like a normal npm package:

pushd nodejs/examples
npm ci
npm test
popd

API documentation

Python

The Python API documentation is organized based on the file docs/src/python/python.md. We manually add entries there so we can control the organization of the reference page. However, this means any new types must be manually added to the file. No additional steps are needed to generate the API documentation.

TypeScript

The TypeScript API documentation is generated from the TypeScript source code using typedoc.

When new APIs are added, you must manually re-run the typedoc command to update the API documentation. The new files should be checked into the repository.

pushd nodejs
npm run docs
popd