Files
lancedb/python/ASYNC_MIGRATION.md
Weston Pace 8033a44d68 feat: add support for add to async python API (#1037)
In order to add support for `add` we needed to migrate the rust `Table`
trait to a `Table` struct and `TableInternal` trait (similar to the way
the connection is designed).

While doing this we also cleaned up some inconsistencies between the
SDKs:

* Python and Node are garbage collected languages and it can be
difficult to trigger something to be freed. The convention for these
languages is to have some kind of close method. I added a close method
to both the table and connection which will drop the underlying rust
object.
* We made significant improvements to table creation in
cc5f2136a6
for the `node` SDK. I copied these changes to the `nodejs` SDK.
* The nodejs tables were using fs to create tmp directories and these
were not getting cleaned up. This is mostly harmless but annoying and so
I changed it up a bit to ensure we cleanup tmp directories.
* ~~countRows in the node SDK was returning `bigint`. I changed it to
return `number`~~ (this actually happened in a previous PR)
* Tables and connections now implement `std::fmt::Display` which is
hooked into python's `__repr__`. Node has no concept of a regular "to
string" function and so I added a `display` method.
* Python method signatures are changing so that optional parameters are
always `Optional[foo] = None` instead of something like `foo = False`.
This is because we want those defaults to be in rust whenever possible
(though we still need to mention the default in documentation).
* I changed the python `AsyncConnection/AsyncTable` classes from
abstract classes with a single implementation to just classes because we
no longer have the remote implementation in python.

Note: this does NOT add the `add` function to the remote table. This PR
was already large enough, and the remote implementation is unique
enough, that I am going to do all the remote stuff at a later date (we
should have the structure in place and correct so there shouldn't be any
refactor concerns)

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2024-04-05 16:31:36 -07:00

39 lines
1.7 KiB
Markdown

# Migration from Sync to Async API
A new asynchronous API has been added to LanceDb. This API is built
on top of the rust lancedb crate (instead of being built on top of
pylance). This will help keep the various language bindings in sync.
There are some slight changes between the synchronous and the asynchronous
APIs. This document will help you migrate. These changes relate mostly
to the Connection and Table classes.
## Almost all functions are async
The most important change is that almost all functions are now async.
This means the functions now return `asyncio` coroutines. You will
need to use `await` to call these functions.
## Connection
* The connection now has a `close` method. You can call this when
you are done with the connection to eagerly free resources. Currently
this is limited to freeing/closing the HTTP connection for remote
connections. In the future we may add caching or other resources to
native connections so this is probably a good practice even if you aren't using remote connections.
In addition, the connection can be used as a context manager which may
be a more convenient way to ensure the connection is closed.
It is not mandatory to call the `close` method. If you don't call it
the connection will be closed when the object is garbage collected.
## Table
* The table now has a `close` method, similar to the connection. This
can be used to eagerly free the cache used by a Table object. Similar
to the connection, it can be used as a context manager and it is not
mandatory to call the `close` method.
* Previously `Table.schema` was a property. Now it is an async method.
* The method `Table.__len__` was removed and `len(table)` will no longer
work. Use `Table.count_rows` instead.