1147 Commits

Author SHA1 Message Date
Ayush Chaurasia
2f13fa225f Chore (python): Better retry loop logging when embedding api fails (#1267)
https://github.com/lancedb/lancedb/issues/1266#event-12703166915

This happens because the OpenAI API errors out with None values. The
previous log level didn't actually print the message on screen. Changed
the log level to warning, which better suits this case.

Also, the retry loop can be disabled by setting `max_retries=0` (I'm not
sure if we should also set this as the default behaviour, as hitting the
API rate limit is quite common when ingesting a large corpus)

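```python
from lancedb.embeddings import get_registry

# Disable the retry loop entirely by allowing zero retries
func = get_registry().get("openai").create(max_retries=0)
```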
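
For illustration only, the behaviour described above (retry on failure, log each failure at warning level, and skip retries entirely when `max_retries=0`) could be sketched roughly as follows. This is not the LanceDB implementation; `retry_with_backoff`, its parameters, and the logger name are hypothetical.

```python
import logging
import time

logger = logging.getLogger("retry_sketch")  # hypothetical logger name


def retry_with_backoff(func, max_retries=3, initial_delay=1.0, sleep=time.sleep):
    """Call func(); on failure, log a warning and retry up to max_retries times.

    Hypothetical helper for illustration. With max_retries=0 the first
    failure is re-raised immediately, so the retry loop is disabled.
    """
    delay = initial_delay
    attempt = 0
    while True:
        try:
            return func()
        except Exception as exc:
            if attempt >= max_retries:
                raise
            attempt += 1
            # Warning level so the message is visible with default logging config
            logger.warning(
                "Attempt %d/%d failed: %s; retrying in %.1fs",
                attempt, max_retries, exc, delay,
            )
            sleep(delay)
            delay *= 2  # exponential backoff between attempts
```

The `sleep` parameter is injectable only so the sketch can be exercised without real delays.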
2024-05-06 11:49:11 +05:30
Nehil Jain
e933de003d fix: Docs for embed_func fixed in youtube transcript search notebook (#1269)
Fixes issue https://github.com/lancedb/lancedb/issues/1268
2024-05-06 11:48:25 +05:30
Ikko Eltociear Ashimine
05fd387425 docs: update README.md (#1270)
retrevial -> retrieval
2024-05-06 11:46:48 +05:30
Will Jones
82a1da554c fix(python): return ValueError if passed unknown args to connect() (#1265)
It's confusing to users that keyword arguments from the async API like
`storage_options` are accepted by `connect()`, but don't do anything. We
should error if unknown arguments are passed instead.
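
A minimal sketch of the idea, assuming a simplified stand-in for `connect()` (the signature and return value below are hypothetical, not LanceDB's actual ones): collect unrecognised keyword arguments and raise `ValueError` instead of silently ignoring them.

```python
def connect(uri, *, api_key=None, region="us-east-1", **kwargs):
    """Hypothetical stand-in for connect(); not LanceDB's real signature."""
    if kwargs:
        # Fail loudly rather than accepting arguments that would do nothing
        unknown = ", ".join(sorted(kwargs))
        raise ValueError(f"Unknown keyword argument(s): {unknown}")
    return {"uri": uri, "api_key": api_key, "region": region}
```

With this shape, a call such as `connect("data/db", storage_options={})` fails immediately with a message naming `storage_options`.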
2024-05-03 17:00:08 -07:00
Rohit Rastogi
a7c0d80b9e Implement convertors to and from Polars DataFrames in Rust SDK using convertors based on C FFI #1099 (#1260)
https://github.com/lancedb/lancedb/issues/1099

Took the same general approach from:
https://github.com/lancedb/lancedb/pull/1235. Instead of using
high-level convertors implemented in polars-arrow (with the arrow-rs
feature flag, which adds a dependency on arrow-rs), I used convertors
based on the C FFI to avoid dependency conflicts.

---------

Co-authored-by: Rohit Rastogi <rohitrastogi@Rohits-MacBook-Pro.local>
Co-authored-by: Weston Pace <weston.pace@gmail.com>
2024-05-03 16:15:14 -07:00
Cory Grinstead
71323a064a chore(nodejs): update docs on "table.ts" (#1255)
closes https://github.com/lancedb/lancedb/issues/1007
2024-05-01 23:00:22 -05:00
asmith26
df48454b70 Update embedding_functions.md (#1250)
`clip.ndims` seems to be a function (I installed with `pip install
open_clip_torch`).
2024-05-01 09:33:42 -07:00
Lance Release
6603414885 Updating package-lock.json 2024-04-30 20:57:12 +00:00
Lance Release
c256f6c502 Updating package-lock.json 2024-04-30 19:58:49 +00:00
Lance Release
cc03f90379 Updating package-lock.json 2024-04-30 19:21:48 +00:00
Lance Release
975da09b02 Bump version: 0.4.17 → 0.4.18 v0.4.18 2024-04-30 19:21:37 +00:00
Cory Grinstead
c32e17b497 chore(nodejs): remove "optionalDependencies" (#1252)
closes #1248 

the binding specific `optionalDependencies` are added automatically as
part of the `prepublishOnly` hook, and they are not supposed to be
committed to `package.json`.



--- 

npm lifecycle scripts: 
https://docs.npmjs.com/cli/v7/using-npm/scripts#life-cycle-scripts
2024-04-30 10:51:10 -05:00
Ryan Green
0528abdf97 fix: fix path on remote create_table and check for error response (#1244) 2024-04-28 11:33:05 -02:30
Lance Release
1090c311e8 [python] Bump version: 0.6.10 → 0.6.11 python-v0.6.11 2024-04-27 03:54:58 +00:00
Weston Pace
e767cbb374 chore: update to Lance version 0.10.16 and Arrow version 51 (#1247) 2024-04-26 16:26:57 -07:00
Weston Pace
3d7c48feca feat: allow the index_cache_size to be configured when opening a table (#1245)
This was already configurable in the rust API but it wasn't actually
being passed down to the underlying dataset. I added this option to both
the async python API and the new nodejs API.

I also added this option to the synchronous python API.

I did not add the option to vectordb.
2024-04-26 13:42:02 -07:00
Bert
08d62550bb fix: passing data to createTable as option (#1242)
Fixes issue where we would throw `Either data or schema needs to
defined` when passing `data` to `createTable` as a property of the first
argument (an object).

```ts
await db.createTable({
  name: 'table1',
  data,
  schema
})
```
2024-04-26 15:26:08 -04:00
Lei Xu
b272408b05 chore: fix main branch test failure (#1240) 2024-04-24 13:49:37 -07:00
Weston Pace
46ffa87cd4 chore: disable the remote feature by default (#1239)
The rust implementation of the remote client is not yet ready. This is
understandably confusing for users since it is enabled by default. This
PR disables it by default. We can re-enable it when we are ready (even
then it is not clear this is something that should be a default
feature).

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2024-04-24 09:28:24 -07:00
QianZhu
cd9fc37b95 add rename_table fn and more data for index_stats to return (#1234)
1. added rename_table fn to enable dashboard to rename a table
2. added index_type and distance_type (for vector index) to index_stats
so that more detailed data can be shown on the table page.
2024-04-23 16:42:26 -07:00
Lance Release
431f94e564 [python] Bump version: 0.6.9 → 0.6.10 python-v0.6.10 2024-04-22 17:42:24 +00:00
Alex Kohler
c1a7d65473 chore: fix get_registry call in baai embeddings example (#1230) 2024-04-20 07:25:16 +05:30
Rob Meng
1e5ccb1614 chore: upgrade lance to 0.10.15 (#1229) 2024-04-19 10:31:39 -04:00
Bert
2e7ab373dc fix: update lance to 0.10.13 (#1226) 2024-04-17 09:29:10 -04:00
Weston Pace
c7fbc4aaee docs: fix minor typo (#1220) 2024-04-14 03:32:57 +05:30
Lance Release
7e023c1ef2 [python] Bump version: 0.6.8 → 0.6.9 python-v0.6.9 2024-04-12 22:09:12 +00:00
Weston Pace
1d0dd9a8b8 feat: bump lance version from 0.10.10 to 0.10.12 (#1219) 2024-04-12 15:08:39 -07:00
Weston Pace
deb947ddbd doc: fix typo, broken links (#1218) 2024-04-11 14:58:51 -07:00
Ayush Chaurasia
b039765d50 docs : Embedding functions quickstart and minor fixes (#1217) 2024-04-11 17:30:45 +05:30
Prashanth Rao
d155e82723 [docs] Fix broken links and clarify language in integrations docs (#1209)
This PR does the following:

- Fixes broken/outdated URLs
- Adds clarity to the way DuckDB/LanceDB integration works via Arrow
2024-04-11 15:32:08 +05:30
Ayush Chaurasia
5d8c91256c fix(python): Update to latest cohere reranking api (#1212)
Fixes https://github.com/lancedb/lancedb/issues/1196
Cohere introduced a breaking change in their reranker API starting with
version 5.0.0. More context in the discussion here:
https://github.com/cohere-ai/cohere-python/issues/446
2024-04-11 15:20:29 +05:30
Ayush Chaurasia
44c03ebef3 docs : Update Reranking docs (#1213) 2024-04-11 15:20:00 +05:30
Will Jones
8ea06fe7f3 ci: fix failures in release scripts (#1215)
* Python release has been running when we create a Node release.
https://github.com/lancedb/lancedb/actions/runs/8635662585
* Rust is missing new enough compilers to check the kernels feature
https://github.com/lancedb/lancedb/actions/runs/8635662578
2024-04-10 13:09:39 -07:00
Lance Release
cf06b653d4 [python] Bump version: 0.6.7 → 0.6.8 python-v0.6.8 2024-04-10 17:51:45 +00:00
Lance Release
09cfab6d00 Updating package-lock.json 2024-04-10 17:40:03 +00:00
Lance Release
e4945abb1a Bump version: 0.4.16 → 0.4.17 v0.4.17 2024-04-10 17:39:52 +00:00
Raghav Dixit
a6aa67baed python: Bug fixes / tests (#1210)
closes #1194 #1172 #1124 #1208 


@wjones127 : `if query_type != "fts":` is needed because both fts and
vector search create `LanceQueryBuilder` which has `vector_column_name`
as a required attribute.
2024-04-10 10:17:14 -07:00
Will Jones
1d23af213b feat: expose storage options in LanceDB (#1204)
Exposes `storage_options` in LanceDB. This is provided for Python async,
Node `lancedb`, and Node `vectordb` (and Rust of course). Python
synchronous is omitted because it's not compatible with the PyArrow
filesystems we use there currently. In the future, we will move the sync
API to wrap the async one, and then it will get support for
`storage_options`.

1. Fixes #1168
2. Closes #1165
3. Closes #1082
4. Closes #439
5. Closes #897
6. Closes #642
7. Closes #281
8. Closes #114
9. Closes #990
10. Deprecating `awsCredentials` and `awsRegion`. Users are encouraged
to use `storageOptions` instead.
2024-04-10 10:12:04 -07:00
Bert
25dea4e859 BREAKING CHANGE: Check if remote table exists when opening (with caching) (#1214)
- Make open table behaviour consistent:
  - Remote tables will check if the table exists by calling /describe and
    throwing an error if the call doesn't succeed
  - This is similar to the behaviour for local tables, where we will raise
    an exception when opening the table if the local dataset doesn't exist
- The table names are cached in the client with a TTL
- Also fixes a small bug where if the remote error response was
  deserialized from JSON as an object, we'd print it resulting in the
  unhelpful error message: `Error: Server Error, status: 404, message: Not
  Found: [object Object]`
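
As a rough sketch of the existence check with a TTL cache described above (not the actual client code; `TableNameCache`, `open_table`, and the `describe` callback are hypothetical names, and the 300-second TTL is an arbitrary assumption):

```python
import time


class TableNameCache:
    """Hypothetical cache of known table names, each expiring after a TTL."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._expires = {}  # table name -> expiry timestamp

    def add(self, name):
        self._expires[name] = self._clock() + self._ttl

    def __contains__(self, name):
        expiry = self._expires.get(name)
        if expiry is None:
            return False
        if self._clock() >= expiry:
            # Entry is stale; drop it so the next open re-checks the server
            del self._expires[name]
            return False
        return True


def open_table(name, cache, describe):
    """Check existence via describe() unless the name is still cached.

    describe stands in for the remote /describe call and is expected to
    raise if the table does not exist.
    """
    if name not in cache:
        describe(name)
        cache.add(name)
    return name
```

A failed `describe` call propagates its error and leaves the cache untouched, so a missing table is never cached as present.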
2024-04-10 11:54:47 -04:00
Weston Pace
8a1227030a chore: restore requests which was lost during rebase (#1205) 2024-04-08 11:56:43 +05:30
Weston Pace
9fee384d2c chore(node): restore package-lock.json lost during rebase 2024-04-05 16:36:29 -07:00
Ayush Chaurasia
b2952acca7 chore(python): remove redundant files (#1203) 2024-04-05 16:35:10 -07:00
Pranav Maddi
2b132a0bef Fix markdown formatting (#1188) 2024-04-05 16:35:10 -07:00
Will Jones
ba56208a34 ci: fix job (#1193) 2024-04-05 16:35:10 -07:00
Ayush Chaurasia
2d2042d59e chore(python): Remove settings manager and telemetry. (#1198)
This PR is intended to remove the settings manager. Because telemetry
and the CLI depend on the settings manager, those need to go too.
2024-04-05 16:35:09 -07:00
Raghav Dixit
1c41a00d87 Embeddings: HF model hub support added via transformers (#1154) 2024-04-05 16:34:56 -07:00
Lance Release
ac63d4066b Updating package-lock.json 2024-04-05 16:34:53 -07:00
Lance Release
be2074b90d [python] Bump version: 0.6.6 → 0.6.7 2024-04-05 16:34:53 -07:00
Lance Release
6c452f29e9 Bump version: 0.4.15 → 0.4.16 2024-04-05 16:34:50 -07:00
Will Jones
8a7ded23b2 chore: upgrade to lance-0.10.9 (#1192) 2024-04-05 16:34:50 -07:00