Chang She
9eca8e7cd1
tests pass; still need catalog
2023-12-21 20:13:08 -08:00
Chang She
587fe6ffc1
almost
2023-12-21 19:45:10 -08:00
Chang She
89c8e5839b
initial changes to enable an in-memory dataset
2023-12-21 08:52:11 -08:00
Chang She
7bbb2872de
bug(python): fix path handling in windows (#724)
...
Use pathlib for local paths so that it applies the correct path
separator on Windows.
Closes #703
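A minimal sketch of the separator behaviour this fix relies on, using only the standard library. `PureWindowsPath` is used so the snippet runs on any OS; the directory and table names are made up for illustration and are not taken from the LanceDB code.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical path components, for illustration only.
parts = ("data", "my_table.lance")

# Joining through pathlib picks the separator for the target platform
# instead of hard-coding "/" into a string.
windows_path = PureWindowsPath("C:/db").joinpath(*parts)
posix_path = PurePosixPath("/db").joinpath(*parts)

windows_str = str(windows_path)  # backslash-separated
posix_str = str(posix_path)      # forward-slash-separated
```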
---------
Co-authored-by: Will Jones <willjones127@gmail.com>
2023-12-20 15:41:36 -08:00
Will Jones
f9dd7a5d8a
fix: prevent duplicate data in FTS index (#728)
...
This forces the user to replace the whole FTS directory when re-creating
the index, preventing duplicate data from being created. Previously, the
whole dataset was re-added to the existing index, duplicating existing
rows in the index.
This (in combination with lancedb/lance#1707) caused #726, since the
duplicate data emitted duplicate indices for `take()` and an upstream
issue caused those queries to fail.
This solution isn't ideal, since it makes the FTS index temporarily
unavailable while the index is built. In the future, we should have
multiple FTS index directories, which would allow atomic commits of new
indexes (as well as multiple indexes for different columns).
Fixes #498.
Fixes #726.
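The replace-on-recreate behaviour described in this commit can be sketched roughly as follows. `prepare_index_dir` and its `replace` flag are hypothetical names chosen for illustration; this is not the LanceDB implementation, only the general idea of refusing to build into an existing index directory unless it is wiped first.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_index_dir(index_dir: Path, replace: bool = False) -> Path:
    """Hypothetical helper: ensure an index build starts from an empty directory."""
    if index_dir.exists():
        if not replace:
            # Building into an existing index would re-add every row.
            raise ValueError(
                f"Index already exists at {index_dir}; pass replace=True to rebuild it"
            )
        # Drop the old index entirely so no stale rows survive the rebuild.
        shutil.rmtree(index_dir)
    index_dir.mkdir(parents=True)
    return index_dir


with tempfile.TemporaryDirectory() as root:
    index_dir = Path(root) / "fts_index"
    prepare_index_dir(index_dir)
    (index_dir / "segment").write_text("old rows")
    prepare_index_dir(index_dir, replace=True)
    leftover = list(index_dir.iterdir())  # empty after the rebuild
```

As the commit message notes, the index is unavailable between the delete and the end of the rebuild, which is the trade-off of this approach.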
---------
Co-authored-by: Chang She <759245+changhiskhan@users.noreply.github.com>
2023-12-20 13:07:07 -08:00
Will Jones
1d4943688d
upgrade lance to v0.9.1 (#727)
...
This brings in some important bugfixes related to take and aarch64
Linux. See changes at:
https://github.com/lancedb/lance/releases/tag/v0.9.1
2023-12-20 13:06:54 -08:00
Chang She
7856a94d2c
feat(python): support nested reference for fts (#723)
...
https://github.com/lancedb/lance/issues/1739
Support nested field reference in full text search
---------
Co-authored-by: Will Jones <willjones127@gmail.com>
2023-12-20 12:28:53 -08:00
Chang She
371d2f979e
feat(python): add option to flatten output in to_pandas (#722)
...
Closes https://github.com/lancedb/lance/issues/1738
We add a `flatten` parameter to the signature of `to_pandas`. By default
this is None and does nothing.
If set to True or -1, then LanceDB will flatten structs before
converting to a pandas dataframe. All nested structs are also flattened.
If set to any positive integer, then LanceDB will flatten structs up to
the specified level of nesting.
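A rough sketch of the `flatten` semantics described above, applied to a plain nested dict rather than a LanceDB table. `flatten_record` is a hypothetical helper, the `parent.child` naming is an assumption, and rejecting other non-positive integers is also an assumption, since the description only covers `None`, `True`, `-1`, and positive integers.

```python
def _flatten_once(record):
    """Expand every dict-valued field by exactly one level."""
    out = {}
    for key, value in record.items():
        if isinstance(value, dict):
            for child_key, child_value in value.items():
                out[f"{key}.{child_key}"] = child_value
        else:
            out[key] = value
    return out


def flatten_record(record, flatten=None):
    """Hypothetical helper mimicking the described `flatten` options."""
    if flatten is None or flatten is False:
        return dict(record)  # default: do nothing
    if flatten is True or flatten == -1:
        remaining = None  # flatten all nested structs
    elif isinstance(flatten, int) and flatten > 0:
        remaining = flatten  # flatten up to this many levels
    else:
        raise ValueError("flatten must be None, True, -1, or a positive integer")

    result = dict(record)
    while any(isinstance(v, dict) for v in result.values()):
        if remaining is not None:
            if remaining == 0:
                break
            remaining -= 1
        result = _flatten_once(result)
    return result


row = {"id": 1, "meta": {"lang": "en", "src": {"url": "x"}}}
fully_flat = flatten_record(row, True)
one_level = flatten_record(row, 1)
```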
---------
Co-authored-by: Weston Pace <weston.pace@gmail.com>
2023-12-20 12:23:07 -08:00
Lance Release
018314a5c1
[python] Bump version: 0.3.6 → 0.4.0
2023-12-18 17:27:26 +00:00
Lei Xu
409eb30ea5
chore: bump lance version to 0.9 (#715)
2023-12-17 22:11:42 -05:00
Lance Release
a0608044a1
[python] Bump version: 0.3.5 → 0.3.6
2023-12-15 18:20:55 +00:00
Bert
57207eff4a
implement update for remote clients (#706)
2023-12-15 09:06:40 -05:00
Rob Meng
2d78bff120
feat: pass vector column name to remote backend (#710)
...
Pass the vector column name to the remote backend as well.
`vector_column` is already part of `Query`; this just declares it as part
of `remote.VectorQuery` as well.
2023-12-15 00:19:08 -05:00
Chang She
bd0034a157
feat: support nested pydantic schema (#707)
2023-12-14 18:20:45 -08:00
Lance Release
600bfd7237
[python] Bump version: 0.3.4 → 0.3.5
2023-12-14 19:31:22 +00:00
Will Jones
d087e7891d
feat(python): add update query support for Python (#654)
...
Closes #69
Will not pass until https://github.com/lancedb/lance/pull/1585 is
released
2023-12-14 11:28:32 -08:00
Ayush Chaurasia
693091db29
chore(python): Reduce posthog event count (#661)
...
- Register open_table as event
- Because we're currently dropping the 'search' event, changed its name to
'search_table' and introduced throttling
- Throttled events are counted once per time batch, so the user is
registered but the event count doesn't go up by much
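The throttling idea in the last bullet could look roughly like the sketch below. `ThrottledEventLog`, `batch_seconds`, and the injectable `clock` are hypothetical names; this is not the LanceDB telemetry code. Here a named event is recorded at most once per window, measured from the last time that event was recorded.

```python
import time


class ThrottledEventLog:
    """Hypothetical sketch: record each named event at most once per time window."""

    def __init__(self, batch_seconds=60.0, clock=time.monotonic):
        self.batch_seconds = batch_seconds
        self.clock = clock          # injectable so the behaviour can be tested
        self._last_sent = {}        # event name -> time it was last recorded
        self.sent = []              # events that were actually recorded

    def register(self, name):
        """Return True if the event was recorded, False if it was throttled."""
        now = self.clock()
        last = self._last_sent.get(name)
        if last is not None and now - last < self.batch_seconds:
            return False
        self._last_sent[name] = now
        self.sent.append(name)
        return True


fake_now = [0.0]
log = ThrottledEventLog(batch_seconds=60.0, clock=lambda: fake_now[0])
first = log.register("search_table")
fake_now[0] = 10.0
second = log.register("search_table")   # same window, throttled
other = log.register("open_table")      # different event, recorded
fake_now[0] = 61.0
third = log.register("search_table")    # new window, recorded again
```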
2023-12-08 11:00:51 -08:00
QianZhu
f6bbe199dc
Qian/minor fix doc (#695)
2023-12-08 09:58:53 -08:00
QianZhu
aca785ff98
saas python sdk doc (#692)
...
<img width="256" alt="Screenshot 2023-12-07 at 11 55 41 AM"
src="https://github.com/lancedb/lancedb/assets/1305083/259bf234-9b3b-4c5d-af45-c7f3fada2cc7">
2023-12-07 14:47:56 -08:00
Bert
6eb662de9b
fix: python remote correct open_table error message (#659)
2023-11-24 19:28:33 -05:00
Lance Release
38321fa226
[python] Bump version: 0.3.3 → 0.3.4
2023-11-19 00:24:01 +00:00
Will Jones
a57aa4b142
chore: upgrade lance to v0.8.17 (#656)
...
Readying for the next Lance release.
2023-11-18 15:57:23 -08:00
Rok Mihevc
d8e3e54226
feat(python): expose index cache size (#655)
...
This is to enable https://github.com/lancedb/lancedb/issues/641.
Should be merged after https://github.com/lancedb/lance/pull/1587 is
released.
2023-11-18 14:17:40 -08:00
Aidan
1cf8a3e4e0
SaaS create_index API (#649)
2023-11-15 19:12:52 -05:00
Rok Mihevc
1c872ce501
feat: add RemoteTable.version in Python (#644)
...
Please note: this is not tested as we don't have a server here and
testing against a mock object wouldn't be that interesting.
2023-11-13 21:43:48 +01:00
Ayush Chaurasia
ae0d2f2599
fix: Pydantic 1.x compat for weak_lru caching in embeddings API (#643)
...
Colab ships pydantic 1.x by default, and pydantic 1.x BaseModel objects
don't support weakref creation by default, which we rely on to cache
embedding models
(https://github.com/lancedb/lancedb/blob/main/python/lancedb/embeddings/utils.py#L206).
`__weakref__` needs to be added to `__slots__`.
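The general Python behaviour behind this fix can be shown with plain classes. The two classes below are hypothetical stand-ins, not pydantic models: an object whose class defines `__slots__` without `__weakref__` cannot be weakly referenced, and adding that slot makes it possible.

```python
import weakref


class WithoutWeakref:
    # No __dict__ and no __weakref__ slot: weak references are not supported.
    __slots__ = ("name",)

    def __init__(self, name):
        self.name = name


class WithWeakref:
    # Adding "__weakref__" to the slots re-enables weak references.
    __slots__ = ("name", "__weakref__")

    def __init__(self, name):
        self.name = name


def can_weakref(obj):
    """Return True if a weak reference to obj can be created."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True


unsupported = can_weakref(WithoutWeakref("model-a"))
supported = can_weakref(WithWeakref("model-a"))
```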
2023-11-10 15:02:38 +05:30
Ayush Chaurasia
1e8678f11a
Multi-task instructor model with quantization support & weak_lru cache for embedding function models (#612)
...
resolves #608
2023-11-09 12:34:18 +05:30
QianZhu
662968559d
fix saas open_table and table_names issues (#640)
...
- added a check for whether a table exists in SaaS open_table
- removed the "prefilter not supported" warning in SaaS search
- fixed issues for SaaS table_names
2023-11-07 17:34:38 -08:00
Lei Xu
554e068917
chore: improve create_table API consistency between local and remote SDK (#627)
2023-11-03 13:15:11 -07:00
Ayush Chaurasia
1589499f89
Exponential backoff retry support for handling rate limited embedding functions (#614)
...
Users ingesting data through rate-limited APIs no longer need to manually
make the process sleep to counter rate limits
resolves #579
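A minimal sketch of retrying with exponential backoff, to illustrate the idea in this commit. `retry_with_backoff` and all of its parameters are hypothetical and are not the LanceDB API; the `sleep` argument is injectable only so the delays can be inspected.

```python
import time


def retry_with_backoff(func, max_retries=5, base_delay=1.0, max_delay=30.0,
                       sleep=time.sleep, retry_on=(Exception,)):
    """Call func(), doubling the wait after each failure up to max_delay."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay)


# Usage: a fake rate-limited call that succeeds on the third attempt.
calls = {"n": 0}
delays = []


def flaky_embed():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("rate limited")
    return "ok"


result = retry_with_backoff(flaky_embed, base_delay=1.0, sleep=delays.append)
```

A production version would usually add random jitter to each delay so that many clients do not retry in lockstep.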
2023-11-02 19:20:10 +05:30
Lance Release
ef20b2a138
[python] Bump version: 0.3.2 → 0.3.3
2023-11-01 21:15:55 +00:00
Lei Xu
2e0f251bfd
chore: bump lance to 0.8.10 (#622)
2023-11-01 14:14:38 -07:00
Ayush Chaurasia
2cb91e818d
Disable posthog on docs & reduce sentry trace factor (#607)
...
- posthog charges per event and docs events are registered very
frequently. We can keep tracking them on GA
- Reduced sentry trace factor
2023-11-02 01:13:16 +05:30
Bert
24111d543a
fix!: sort table names (#619)
...
https://github.com/lancedb/lance/issues/1385
2023-11-01 10:50:09 -04:00
QianZhu
7eec2b8f9a
Qian/query option doc (#615)
...
- API documentation improvement for queries (table.search)
- a small bug fix for the remote API on create_table
2023-10-31 19:50:05 -07:00
Will Jones
b2b70ea399
increment pylance (#618)
2023-10-31 18:07:03 -07:00
Bert
e50a3c1783
added api docs for prefilter flag (#617)
...
Added the prefilter flag argument to `LanceQueryBuilder.where`.
This should make it display here:
https://lancedb.github.io/lancedb/python/python/#lancedb.query.LanceQueryBuilder.select
And also in intellisense like this:
<img width="848" alt="image"
src="https://github.com/lancedb/lancedb/assets/5846846/e0c53f4f-96bc-411b-9159-680a6c4d0070">
Also adds some improved documentation about the `where` argument to this
method.
---------
Co-authored-by: Weston Pace <weston.pace@gmail.com>
2023-10-31 16:39:32 -04:00
Weston Pace
b517134309
feat: allow prefiltering with index (#610)
...
Support for prefiltering with an index was added in lance version 0.8.7.
We can remove the lancedb check that prevents this. Closes #261
2023-10-31 13:11:03 -07:00
Lance Release
f3cf986777
[python] Bump version: 0.3.1 → 0.3.2
2023-10-24 19:06:38 +00:00
Bert
c73fcc8898
update lance to 0.8.7 (#598)
2023-10-24 14:49:36 -04:00
Chang She
cd9debc3b7
fix(python): fix multiple embedding functions bug (#597)
...
Closes #594
The embedding functions are pydantic models, so multiple instances with
the same parameters compare equal under `==`. If you have multiple
embedding columns, it's therefore possible for the embeddings to get
overwritten. We now compare with `is` instead of `==` to avoid this problem.
testing: modified unit test to include this case
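The difference between equality and identity that this fix turns on can be illustrated with a dataclass standing in for a pydantic model. `FakeEmbeddingFunction` and the two lookup helpers are hypothetical and are not the LanceDB code.

```python
from dataclasses import dataclass


@dataclass
class FakeEmbeddingFunction:
    # Like a pydantic model, a dataclass compares by field values under ==.
    model_name: str


def find_by_equality(functions, target):
    """Indices of all functions that compare equal to target."""
    return [i for i, fn in enumerate(functions) if fn == target]


def find_by_identity(functions, target):
    """Indices of the functions that are the very same object as target."""
    return [i for i, fn in enumerate(functions) if fn is target]


# Two embedding columns configured with identical parameters.
text_fn = FakeEmbeddingFunction("demo-model")
title_fn = FakeEmbeddingFunction("demo-model")
functions = [text_fn, title_fn]

by_equality = find_by_equality(functions, title_fn)  # matches both columns
by_identity = find_by_identity(functions, title_fn)  # matches only title_fn
```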
2023-10-24 13:05:05 -04:00
Will Jones
14e8e48de2
Revert "[python] Bump version: 0.3.2 → 0.3.3"
...
This reverts commit c30faf6083.
2023-10-20 17:52:49 -07:00
Will Jones
c30faf6083
[python] Bump version: 0.3.2 → 0.3.3
2023-10-20 17:30:00 -07:00
Chang She
0ed39b6146
chore: bump lance version in python/rust lancedb (#584)
...
To include latest v0.8.6
Co-authored-by: Chang She <chang@lancedb.com>
2023-10-19 13:05:12 -07:00
Ayush Chaurasia
0293bbe142
[Python] Embeddings API refactor (#580)
...
Sets things up for this -> https://github.com/lancedb/lancedb/issues/579
- Just separates out the registry/ingestion code from the function
implementation code
- adds a `get_registry` util
- package name "open-clip" -> "open-clip-torch"
2023-10-17 22:32:19 -07:00
QianZhu
d46bc5dd6e
list table pagination draft (#574)
2023-10-16 21:09:20 -07:00
Prashanth Rao
86efb11572
Add pyarrow date and timestamp type conversion from pydantic (#576)
2023-10-16 19:42:24 -07:00
Lei Xu
fe64fc4671
feat(python,js): deletion operation on remote tables (#568)
2023-10-14 15:47:19 -07:00
Rok Mihevc
6d66404506
docs: switch python examples to be row based (#554)
2023-10-14 14:07:43 -07:00
Lei Xu
eff94ecea8
chore: bump lance to 0.8.5 (#561)
...
Bump lance to 0.8.5
2023-10-14 12:38:43 -07:00