212 Commits

Author SHA1 Message Date
Lance Release
69492586f0 [python] Bump version: 0.5.5 → 0.5.6 2024-04-05 16:30:40 -07:00
Will Jones
efa846b6e5 chore: upgrade lance to 0.9.16 (#975) 2024-04-05 16:30:36 -07:00
Lance Release
fded15c9fe [python] Bump version: 0.5.4 → 0.5.5 2024-04-05 16:30:36 -07:00
Lance Release
a7e60a4c3f [python] Bump version: 0.5.3 → 0.5.4 2024-04-05 16:29:58 -07:00
Weston Pace
e12bdc78bb chore: bump lance version to 0.9.15 (#949) 2024-04-05 16:29:58 -07:00
Weston Pace
41ccb48160 feat: add support for filter during merge insert when matched (#948)
Closes #940
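
As a rough illustration of what a filter on the matched branch means, here is a minimal pure-Python sketch. The function `merge_insert_sketch` and its `when_matched` parameter are hypothetical names used for illustration only; they are not the lance or lancedb API.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def merge_insert_sketch(
    target: List[Row],
    source: List[Row],
    key: str,
    when_matched: Callable[[Row, Row], bool] = lambda old, new: True,
) -> List[Row]:
    """Upsert `source` rows into `target` rows, joined on `key`.

    Hypothetical sketch: a matched row is replaced only when
    `when_matched(old, new)` is true; unmatched source rows are inserted.
    """
    # Index the existing rows by key; dicts keep insertion order.
    merged = {row[key]: row for row in target}
    for new in source:
        old = merged.get(new[key])
        # Insert when there is no match, update when the filter passes.
        if old is None or when_matched(old, new):
            merged[new[key]] = new
    return list(merged.values())
```

With a filter such as `lambda old, new: new["v"] > old["v"]`, a matched row is overwritten only when the incoming value is larger; other matched rows are left as they were.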
2024-04-05 16:29:58 -07:00
Will Jones
39cc2fd62b feat(python): add read_consistency_interval argument (#828)
This PR refactors how we handle read consistency, that is, whether the
`LanceTable` class always picks up modifications to the table made by
other instances or processes. Users have three options they can set at
the connection level:

1. (Default) `read_consistency_interval=None` means it will not check at
all. Users can call `table.checkout_latest()` to manually check for
updates.
2. `read_consistency_interval=timedelta(0)` means **always** check for
updates, giving strong read consistency.
3. `read_consistency_interval=timedelta(seconds=20)` means check for
updates every 20 seconds. This is eventual consistency, a compromise
between the two options above.
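
The three options above can be read as a single refresh rule. The sketch below is illustrative only: `should_refresh` is a hypothetical helper, not part of lancedb, and it models only the decision of when a table handle would check for a newer version.

```python
from datetime import timedelta
from typing import Optional


def should_refresh(
    interval: Optional[timedelta],
    since_last_check: timedelta,
) -> bool:
    """Decide whether a table handle should look for a newer version.

    Hypothetical helper mirroring the three options; it is not the
    library's implementation.
    """
    if interval is None:
        # Option 1: never check automatically; the caller refreshes manually.
        return False
    # Option 2 (timedelta(0)) always passes this comparison; option 3
    # passes only once the interval has elapsed since the last check.
    return since_last_check >= interval
```

Under this reading, strong consistency is just the limiting case of eventual consistency with a zero interval.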

There is now an explicit difference between a `LanceTable` that tracks
the current version and one that is fixed at a historical version. We
now enforce that users cannot write if they have checked out an old
version. They are instructed to call `checkout_latest()` before calling
the write methods.

Since `conn.open_table()` doesn't have a parameter for version, users
will only get fixed references if they call `table.checkout()`.

The difference between these two can be seen in the repr: Tables that
are fixed at a particular version will have a `version` displayed in the
repr. Otherwise, the version will not be shown.

```python
>>> table
LanceTable(connection=..., name="my_table")
>>> table.checkout(1)
>>> table
LanceTable(connection=..., name="my_table", version=1)
```

I decided to not create different classes for these states, because I
think we already have enough complexity with the Cloud vs OSS table
references.

Based on #812
2024-04-05 16:29:57 -07:00
Lance Release
c101e9deed [python] Bump version: 0.5.2 → 0.5.3 2024-04-05 16:29:13 -07:00
Lance Release
c8f92c2987 [python] Bump version: 0.5.1 → 0.5.2 2024-04-05 16:29:05 -07:00
Weston Pace
9d115bd507 chore: bump pylance version to latest in pyproject.toml (#918) 2024-04-05 16:29:05 -07:00
Weston Pace
4eb819072a feat: upgrade to lance 0.9.11 and expose merge_insert (#906)
This adds the Python bindings requested in #870. The JavaScript/Rust
bindings will be added in a future PR.
2024-04-05 16:29:05 -07:00
Raghav Dixit
472344fcb3 feat(python): Embedding fn support for gte-mlx/gte-large (#873)
Added tests and an example in the docstring. A separate PR in the
recipe repo will add a RAG example.

---------

Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>
2024-04-05 16:28:56 -07:00
Ayush Chaurasia
545a03d7f9 feat(python): Aws Bedrock embeddings integration (#822)
Supports amazon titan, cohere english & cohere multi-lingual base
models.
2024-04-05 16:28:56 -07:00
Lei Xu
f2e29eb004 chore: upgrade lance, pylance and datafusion (#879) 2024-04-05 16:28:56 -07:00
Bert
d1f9722bfb Bump lance 0.9.9 (#851) 2024-04-05 16:27:51 -07:00
Lance Release
7b8188bcd5 [python] Bump version: 0.5.0 → 0.5.1 2024-04-05 16:27:51 -07:00
Bert
4243eaee93 bump lance to 0.9.7 (#826) 2024-04-05 16:27:14 -07:00
Lance Release
33ab68c790 [python] Bump version: 0.4.4 → 0.5.0 2024-04-05 16:26:36 -07:00
Chang She
17dcb70076 feat(python): basic polars integration (#811)
We should now be able to directly ingest polars dataframes and return
results as polars dataframes

![image](https://github.com/lancedb/lancedb/assets/759245/828b1260-c791-45f1-a047-aa649575e798)
2024-04-05 16:26:19 -07:00
Lance Release
1387dc6e48 [python] Bump version: 0.4.3 → 0.4.4 2024-04-05 16:25:02 -07:00
Will Jones
63e273606e upgrade lance (#809) 2024-04-05 16:25:02 -07:00
Lei Xu
45b006d68c chore: remove black as dependency (#808)
We use `ruff` in CI and dev workflow now.
2024-04-05 16:25:02 -07:00
Sebastian Law
4aa7f58a07 use requests instead of aiohttp for underlying http client (#803)
instead of starting and stopping the current thread's event loop on
every http call, just make an http call.
2024-04-05 16:25:01 -07:00
Lei Xu
4c8690549a chore: bump lance to 0.9.5 (#790) 2024-04-05 16:25:01 -07:00
Lance Release
918a2a4405 [python] Bump version: 0.4.2 → 0.4.3 2024-04-05 16:24:47 -07:00
Lei Xu
56db257ea9 chore: bump pylance to 0.9.2 (#754) 2024-04-05 16:24:47 -07:00
Lance Release
7778031b26 [python] Bump version: 0.4.1 → 0.4.2 2024-04-05 16:24:47 -07:00
Chang She
c97ae6b787 chore(python): update embedding API to use openai 1.6.1 (#751)
API has changed significantly, namely `openai.Embedding.create` no
longer exists.
https://github.com/openai/openai-python/discussions/742

Update the OpenAI embedding function and put a minimum on the openai sdk
version.
2024-04-05 16:24:47 -07:00
Chang She
7bac1131fb feat: add timezone handling for datetime in pydantic (#578)
If you add timezone information in the Field annotation for a datetime
then that will now be passed to the pyarrow data type.

I'm not sure how pyarrow enforces timezones. Right now, it silently
coerces to the timezone given in the column regardless of whether the
input had the matching timezone or not. This is probably not the right
behavior, though we could instead require the user to do the validation
in the pydantic model rather than at the pyarrow conversion layer.
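
A minimal sketch of the kind of check a model-level validator could run before conversion, using only the standard library. `require_timezone` is a hypothetical helper, not part of lancedb, pydantic, or pyarrow, and it assumes the column timezone is a fixed-offset `datetime.timezone`.

```python
from datetime import datetime, timedelta, timezone


def require_timezone(value: datetime, column_tz: timezone) -> datetime:
    """Reject datetimes whose UTC offset differs from the column's.

    Hypothetical helper: it raises instead of silently coercing.
    """
    if value.tzinfo is None:
        raise ValueError("naive datetime; expected an explicit timezone")
    # Fixed-offset timezones report their offset without needing a datetime.
    if value.utcoffset() != column_tz.utcoffset(None):
        raise ValueError(
            f"offset {value.utcoffset()} does not match column timezone {column_tz}"
        )
    return value
```

Raising here makes a mismatch visible to the caller, which is the alternative to coercion that the paragraph above considers.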
2024-04-05 16:24:47 -07:00
Chang She
a0afa84786 feat(python): add post filtering for full text search (#739)
Closes #721 

FTS returns results as a pyarrow table. Pyarrow tables have a
`filter` method, but it does not take SQL filter strings (only pyarrow
compute expressions). Instead, we do one of two things to support
`tbl.search("keywords").where("foo=5").limit(10).to_arrow()`:

Default path: If duckdb is available then use duckdb to execute the sql
filter string on the pyarrow table.
Backup path: Otherwise, write the pyarrow table to a lance dataset and
then do `to_table(filter=<filter>)`

Neither is ideal. 
Default path has two issues:
1. requires installing an extra library (duckdb)
2. duckdb mangles some fields (like fixed size list => list)

Backup path incurs a latency penalty (~20ms on ssd) to write the
resultset to disk.
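
To make the idea of post filtering concrete, here is a small standard-library sketch that applies a SQL filter string to in-memory results. It uses `sqlite3` purely as a stand-in so the example runs without extra dependencies; it is not the duckdb or lance code path described above, and `post_filter` is a hypothetical name.

```python
import sqlite3
from typing import Dict, List


def post_filter(
    rows: List[Dict[str, object]], where: str, limit: int
) -> List[Dict[str, object]]:
    """Apply a SQL filter string to search results held in memory.

    Hypothetical sketch: sqlite3 stands in for the SQL engine.
    """
    if not rows:
        return []
    columns = list(rows[0].keys())
    conn = sqlite3.connect(":memory:")
    try:
        conn.row_factory = sqlite3.Row
        col_defs = ", ".join(f'"{c}"' for c in columns)
        conn.execute(f"CREATE TABLE results ({col_defs})")
        placeholders = ", ".join("?" for _ in columns)
        conn.executemany(
            f"INSERT INTO results VALUES ({placeholders})",
            [tuple(r[c] for c in columns) for r in rows],
        )
        # The filter string is interpolated as-is, like a user-supplied
        # `where` clause; do not do this with untrusted input.
        cursor = conn.execute(
            f"SELECT * FROM results WHERE {where} LIMIT ?", (limit,)
        )
        return [dict(r) for r in cursor.fetchall()]
    finally:
        conn.close()
```

The filter runs after the search has already produced its rows, so a restrictive filter can return fewer rows than the requested limit.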

In the short term, once #676 is addressed, we can write the dataset to
"memory://" instead of disk, which makes the post filter evaluate much
more quickly (ETA next week).

In the longer term, we'd like to be able to evaluate the filter string
on the pyarrow Table directly. One possibility is to use Substrait to
generate pyarrow compute expressions from the SQL string. Or, if there's
enough progress on pyarrow, it could support Substrait expressions
directly (no ETA).

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2024-04-05 16:24:47 -07:00
Lance Release
d1f24ba1dd [python] Bump version: 0.4.0 → 0.4.1 2024-04-05 16:24:47 -07:00
Will Jones
48a12e780c upgrade lance to v0.9.1 (#727)
This brings in some important bugfixes related to take and aarch64
Linux. See changes at:
https://github.com/lancedb/lance/releases/tag/v0.9.1
2024-04-05 16:24:30 -07:00
Lance Release
acbcbe6496 [python] Bump version: 0.3.6 → 0.4.0 2024-04-05 16:24:30 -07:00
Lei Xu
1d79e9168e chore: bump lance version to 0.9 (#715) 2024-04-05 16:24:30 -07:00
Lance Release
811e604077 [python] Bump version: 0.3.5 → 0.3.6 2024-04-05 16:24:30 -07:00
Lance Release
fc32f98c34 [python] Bump version: 0.3.4 → 0.3.5 2024-04-05 16:24:30 -07:00
Will Jones
9356c3b86a feat(python): add update query support for Python (#654)
Closes #69

Will not pass until https://github.com/lancedb/lance/pull/1585 is
released
2024-04-05 16:24:29 -07:00
Lance Release
8f82e4897c [python] Bump version: 0.3.3 → 0.3.4 2024-04-05 16:23:49 -07:00
Will Jones
6d76fe80b8 chore: upgrade lance to v0.8.17 (#656)
Readying for the next Lance release.
2024-04-05 16:23:49 -07:00
Ayush Chaurasia
c0a49a9a5b Multi-task instructor model with quantization support & weak_lru cache for embedding function models (#612)
resolves #608
2024-04-05 16:23:49 -07:00
Lei Xu
86efd36689 chore: improve create_table API consistency between local and remote SDK (#627) 2024-04-05 16:23:47 -07:00
Lance Release
c1b037f0a5 [python] Bump version: 0.3.2 → 0.3.3 2024-04-05 16:23:14 -07:00
Lei Xu
3855bdf986 chore: bump lance to 8.10 (#622) 2024-04-05 16:23:14 -07:00
Will Jones
166b281d66 increment pylance (#618) 2024-04-05 16:22:59 -07:00
Lance Release
a3c955070e [python] Bump version: 0.3.1 → 0.3.2 2024-04-05 16:22:59 -07:00
Bert
edeecd3d9f update lance to 0.8.7 (#598) 2024-04-05 16:22:59 -07:00
Will Jones
e37a0566e0 Revert "[python] Bump version: 0.3.2 → 0.3.3"
This reverts commit c30faf6083.
2024-04-05 16:22:59 -07:00
Will Jones
48999ffc27 [python] Bump version: 0.3.2 → 0.3.3 2024-04-05 16:22:59 -07:00
Chang She
31334b05df chore: bump lance version in python/rust lancedb (#584)
To include latest v0.8.6

Co-authored-by: Chang She <chang@lancedb.com>
2024-04-05 16:22:59 -07:00
Ayush Chaurasia
507f6087c2 [Python]Embeddings API refactor (#580)
Sets things up for this -> https://github.com/lancedb/lancedb/issues/579
- Just separates out the registry/ingestion code from the function
implementation code
- adds a `get_registry` util
- package name "open-clip" -> "open-clip-torch"
2024-04-05 16:22:59 -07:00