lancedb

mirror of https://github.com/lancedb/lancedb.git synced 2026-07-03 19:10:41 +00:00

Author	SHA1	Message	Date
qzhu	b7c816c919	add index_stats to python api	2024-03-12 16:28:15 -07:00
qzhu	34dd548bc8	init commit for test	2024-03-11 13:28:24 -07:00
Ivan Leo	553dae1607	Update default_embedding_functions.md (#1073 ) Added a small bit of documentation for the `dim` feature which is provided by the new `text-embedding-3` model series that allows users to shorten an embedding. Happy to discuss a bit on the phrasing but I struggled quite a bit with getting it to work so wanted to help others who might want to use the newer model too	2024-03-11 21:30:07 +05:30
Weston Pace	9c7e00eec3	Remove remote integration workflow (#1076 )	2024-03-07 12:00:04 -08:00
Will Jones	a7d66032aa	fix: Allow converting from NativeTable to Table (#1069 )	2024-03-07 08:33:46 -08:00
Lance Release	7fb8a732a5	Updating package-lock.json	2024-03-07 01:05:09 +00:00
Lance Release	f393ac3b0d	Updating package-lock.json	2024-03-06 23:26:48 +00:00
Lance Release	ca83354780	Bump version: 0.4.11 → 0.4.12 v0.4.12	2024-03-06 23:26:38 +00:00
Lance Release	272cbcad7a	[python] Bump version: 0.6.1 → 0.6.2 python-v0.6.2	2024-03-06 16:28:50 +00:00
Will Jones	722fe1836c	fix: make checkout_latest force a reload (#1064 ) #1002 accidentally changed `checkout_latest` to do nothing if the table was already in latest mode. This PR makes sure it forces a reload of the table (if there is a newer version).	2024-03-05 11:51:47 -08:00
Lei Xu	d1983602c2	chore: bump lance to 0.10.2 (#1061 )	2024-03-05 10:16:07 -08:00
Weston Pace	9148cd6d47	feat: page_token / limit to native table_names function. Use async table_names function from sync table_names function (#1059 ) The synchronous table_names function in python lancedb relies on arrow's filesystem which behaves slightly differently than object_store. As a result, the function would not work properly in GCS. However, the async table_names function uses object_store directly and thus is accurate. In most cases we can fallback to using the async table_names function and so this PR does so. The one case we cannot is if the user is already in an async context (we can't start a new async event loop). Soon, we can just redirect those users to use the async API instead of the sync API and so that case will eventually go away. For now, we fallback to the old behavior.	2024-03-05 08:38:18 -08:00
Will Jones	47dbb988bf	feat: more accessible errors (#1025 ) The fact that we convert errors to strings makes them really hard to work with. For example, in SaaS we want to know whether the underlying `lance::Error` was the `InvalidInput` variant, so we can return a 400 instead of a 500.	2024-03-05 07:57:11 -08:00
Chang She	6821536d44	doc(python): document the method in fts (#982 ) Co-authored-by: prrao87 <prrao87@gmail.com> Co-authored-by: Prashanth Rao <35005448+prrao87@users.noreply.github.com>	2024-03-04 16:42:24 -08:00
Ayush Chaurasia	d6f0663671	fix(python): Few fts patches (#1039 ) 1. filtering with fts mutated the schema, which caused schema mismatch problems with hybrid search as it combines fts and vector search tables. 2. fts with filter failed with `with_row_id`. This was because row_id was calculated before filtering which caused size mismatch on attaching it after. 3. The fix for 1 meant that now row_id is attached before filtering but passing a filter to `to_lance` on a dataset that already contains `_rowid` raises a panic from lance. So temporarily, in case where fts is used with a filter AND `with_row_id`, we just force the user to use the duckdb pathway. --------- Co-authored-by: Chang She <759245+changhiskhan@users.noreply.github.com>	2024-03-04 16:41:59 -08:00
Weston Pace	ea33b68c6c	fix: sanitize foreign schemas (#1058 ) Arrow-js uses brittle `instanceof` checks throughout the code base. These fail unless the library instance that produced the object matches exactly the same instance the vectordb is using. At a minimum, this means that a user using arrow version 15 (or any version that doesn't match exactly the version that vectordb is using) will get strange errors when they try and use vectordb. However, there are even cases where the versions can be perfectly identical, and the instanceof check still fails. One such example is when using `vite` (e.g. https://github.com/vitejs/vite/issues/3910) This PR solves the problem in a rather brute force, but workable, fashion. If we encounter a schema that does not pass the `instanceof` check then we will attempt to sanitize that schema by traversing the object and, if it has all the correct properties, constructing an appropriate `Schema` instance via deep cloning.	2024-03-04 13:06:36 -08:00
Weston Pace	1453bf4e7a	feat: reconfigure typescript linter / formatter for nodejs (#1042 ) The eslint rules specify some formatting requirements that are rather strict and conflict with vscode's default formatter. I was unable to get auto-formatting to setup correctly. Also, eslint has quite recently [given up on formatting](https://eslint.org/blog/2023/10/deprecating-formatting-rules/) and recommends using a 3rd party formatter. This PR adds prettier as the formatter. It restores the eslint rules to their defaults. This does mean we now have the "no explicit any" check back on. I know that rule is pedantic but it did help me catch a few corner cases in type testing that weren't covered in the current code. Leaving in draft as this is dependent on other PRs.	2024-03-04 10:49:08 -08:00
Weston Pace	abaf315baf	feat: add support for add to async python API (#1037 ) In order to add support for `add` we needed to migrate the rust `Table` trait to a `Table` struct and `TableInternal` trait (similar to the way the connection is designed). While doing this we also cleaned up some inconsistencies between the SDKs: * Python and Node are garbage collected languages and it can be difficult to trigger something to be freed. The convention for these languages is to have some kind of close method. I added a close method to both the table and connection which will drop the underlying rust object. * We made significant improvements to table creation in `cc5f2136a6` for the `node` SDK. I copied these changes to the `nodejs` SDK. * The nodejs tables were using fs to create tmp directories and these were not getting cleaned up. This is mostly harmless but annoying and so I changed it up a bit to ensure we cleanup tmp directories. * ~~countRows in the node SDK was returning `bigint`. I changed it to return `number`~~ (this actually happened in a previous PR) * Tables and connections now implement `std::fmt::Display` which is hooked into python's `__repr__`. Node has no concept of a regular "to string" function and so I added a `display` method. * Python method signatures are changing so that optional parameters are always `Optional[foo] = None` instead of something like `foo = False`. This is because we want those defaults to be in rust whenever possible (though we still need to mention the default in documentation). * I changed the python `AsyncConnection/AsyncTable` classes from abstract classes with a single implementation to just classes because we no longer have the remote implementation in python. Note: this does NOT add the `add` function to the remote table. This PR was already large enough, and the remote implementation is unique enough, that I am going to do all the remote stuff at a later date (we should have the structure in place and correct so there shouldn't be any refactor concerns) --------- Co-authored-by: Will Jones <willjones127@gmail.com>	2024-03-04 09:27:41 -08:00
Chang She	14b9277ac1	chore(rust): update rust version (#810 )	2024-03-03 18:51:58 -08:00
Chang She	d621826b79	feat(python): allow user to override api url (#1054 )	2024-03-03 18:29:47 -08:00
Chang She	08c0803ae1	chore(python): use pypi tantivy to speed up CI (#987 )	2024-03-03 16:57:55 -08:00
Chang She	62632cb90b	doc: fix docs deployment GHA (#1055 )	2024-03-03 16:04:45 -08:00
Prashanth Rao	14566df213	[docs]: Fix issues with Rust code snippets in "quick start" (#1047 ) The renaming of `vectordb` to `lancedb` broke the [quick start docs](https://lancedb.github.io/lancedb/basic/#__tabbed_5_3) (it's pointing to a non-existent directory). This PR fixes the code snippets and the paths in the docs page. Additionally, more fixes related to indexing docs below 👇🏽.	2024-03-03 15:59:57 -08:00
Louis Guitton	acfdf1b9cb	Fix default_embedding_functions.md (#1043 ) typo and broken table	2024-03-03 15:22:53 -08:00
Chang She	f95402af7c	doc: fix langchain link (#1053 )	2024-03-03 15:20:48 -08:00
Chang She	d14c9b6d9e	feat(python): add model_names() method to openai embedding function (#1049 ) small QoL improvement	2024-03-03 12:33:00 -08:00
QianZhu	c1af53b787	Add create scalar index to sdk (#1033 )	2024-02-29 13:32:01 -08:00
Weston Pace	2a02d1394b	feat: port create_table to the async python API and the remote rust API (#1031 ) I've also started `ASYNC_MIGRATION.MD` to keep track of the breaking changes from sync to async python.	2024-02-29 13:29:29 -08:00
Lance Release	085066d2a8	[python] Bump version: 0.6.0 → 0.6.1 python-v0.6.1	2024-02-29 19:48:16 +00:00
Rob Meng	adf1a38f4d	fix: fix columns type for pydantic 2.x (#1045 )	2024-02-29 14:47:56 -05:00
Weston Pace	294c33a42e	feat: Initial remote table implementation for rust (#1024 ) This will eventually replace the remote table implementations in python and node.	2024-02-29 10:55:49 -08:00
Lance Release	245786fed7	[python] Bump version: 0.5.7 → 0.6.0 python-v0.6.0	2024-02-29 16:03:01 +00:00
BubbleCal	edd9a043f8	chore: enable test for dropping table (#1038 ) Signed-off-by: BubbleCal <bubble-cal@outlook.com>	2024-02-29 15:00:24 +08:00
natcharacter	38c09fc294	A simple base usage that install the dependencies necessary to use FT… (#1036 ) A simple base usage that install the dependencies necessary to use FTS and Hybrid search --------- Co-authored-by: Nat Roth <natroth@Nats-MacBook-Pro.local> Co-authored-by: Chang She <759245+changhiskhan@users.noreply.github.com>	2024-02-28 09:38:05 -08:00
Rob Meng	ebaa2dede5	chore: upgrade to lance 0.10.1 (#1034 ) upgrade to lance 0.10.1 and update doc string to reflect dynamic projection options	2024-02-28 11:06:46 -05:00
BubbleCal	ba7618a026	chore(rust): report the TableNotFound error while dropping non-exist table (#1022 ) this will work after upgrading lance with https://github.com/lancedb/lance/pull/1995 merged see #884 for details Signed-off-by: BubbleCal <bubble-cal@outlook.com>	2024-02-28 04:46:39 -08:00
Weston Pace	a6bcbd007b	feat: add a basic async python client starting point (#1014 ) This changes `lancedb` from a "pure python" setuptools project to a maturin project and adds a rust lancedb dependency. The async python client is extremely minimal (only `connect` and `Connection.table_names` are supported). The purpose of this PR is to get the infrastructure in place for building out the rest of the async client. Although this is not technically a breaking change (no APIs are changing) it is still a considerable change in the way the wheels are built because they now include the native shared library.	2024-02-27 04:52:02 -08:00
Will Jones	5af74b5aca	feat: `{add\|alter\|drop}_columns` APIs (#1015 ) Initial work for #959. This exposes the basic functionality for each in all of the APIs. Will add user guide documentation in a later PR.	2024-02-26 11:04:53 -08:00
Weston Pace	8a52619bc0	refactor: change arrow from a direct dependency to a peer dependency (#984 ) BREAKING CHANGE: users will now need to npm install `apache-arrow` and `@apache-arrow/ts` themselves.	2024-02-23 14:08:39 -08:00
Lance Release	314d4c93e5	Updating package-lock.json	2024-02-23 05:11:22 +00:00
Lance Release	c5471ee694	Updating package-lock.json	2024-02-23 03:57:39 +00:00
Lance Release	4605359d3b	Bump version: 0.4.10 → 0.4.11 v0.4.11	2024-02-23 03:57:28 +00:00
Weston Pace	f1596122e6	refactor: rename the rust crate from vectordb to lancedb (#1012 ) This also renames the new experimental node package to lancedb. The classic node package remains named vectordb. The goal here is to avoid introducing piecemeal breaking changes to the vectordb crate. Instead, once the new API is stabilized, we will officially release the lancedb crate and deprecate the vectordb crate. The same pattern will eventually happen with the npm package vectordb.	2024-02-22 19:56:39 -08:00
Will Jones	3aa0c40168	feat(node): add `read_consistency_interval` to Node and Rust (#1002 ) This PR adds the same consistency semantics as was added in #828. It does not add the same lazy-loading of tables, since that breaks some existing tests. This closes #998. --------- Co-authored-by: Weston Pace <weston.pace@gmail.com>	2024-02-22 15:04:30 -08:00
Lance Release	677b7c1fcc	[python] Bump version: 0.5.6 → 0.5.7 python-v0.5.7	2024-02-22 20:07:12 +00:00
Lei Xu	8303a7197b	chore: bump pylance to 0.9.18 (#1011 )	2024-02-22 11:47:36 -08:00
Raghav Dixit	5fa9bfc4a8	python(feat): Imagebind embedding fn support (#1003 ) Added imagebind fn support; steps to install are mentioned in the docstring. pytest slow checks were run locally. --------- Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>	2024-02-22 11:47:08 +05:30
Ayush Chaurasia	bf2e9d0088	Docs: add meta tags (#1006 )	2024-02-21 23:22:47 +05:30
Weston Pace	f04590ddad	refactor: rust vectordb API stabilization of the Connection trait (#993 ) This is the start of a more comprehensive refactor and stabilization of the Rust API. The `Connection` trait is cleaned up to not require `lance` and to match the `Connection` trait in other APIs. In addition, the concrete implementation `Database` is hidden. BREAKING CHANGE: The struct `crate::connection::Database` is now gone. Several examples opened a connection using `Database::connect` or `Database::connect_with_params`. Users should now use `vectordb::connect`. BREAKING CHANGE: The `connect`, `create_table`, and `open_table` methods now all return a builder object. This means that a call like `conn.open_table(..., opt1, opt2)` will now become `conn.open_table(...).opt1(opt1).opt2(opt2).execute()` In addition, the structure of options has changed slightly. However, no options capability has been removed. --------- Co-authored-by: Will Jones <willjones127@gmail.com>	2024-02-20 18:35:52 -08:00
Lance Release	62c5117def	[python] Bump version: 0.5.5 → 0.5.6 python-v0.5.6	2024-02-20 20:45:02 +00:00
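
The sync-to-async fallback described in the `table_names` commit (#1059) in the log above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper names `_table_names_async` and `_table_names_legacy`, the `calls` list, and the returned table names are all hypothetical stand-ins. Only the control flow (use the async path unless an event loop is already running) comes from the commit message.

```python
import asyncio
from typing import List

# Records which code path served each call, purely for illustration.
calls: List[str] = []


async def _table_names_async() -> List[str]:
    """Hypothetical stand-in for the async, object_store-backed listing."""
    calls.append("async")
    return ["images", "docs"]


def _table_names_legacy() -> List[str]:
    """Hypothetical stand-in for the older Arrow-filesystem listing."""
    calls.append("legacy")
    return ["images", "docs"]


def table_names() -> List[str]:
    """Synchronous entry point that prefers the async implementation."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No event loop is running in this thread, so it is safe to start one.
        return asyncio.run(_table_names_async())
    # Already inside an event loop: asyncio.run() would raise here,
    # so fall back to the old synchronous behavior.
    return _table_names_legacy()
```

Called from ordinary synchronous code, `table_names()` takes the async path. Called from inside a coroutine, it takes the legacy path, which matches the one case the commit message says cannot be redirected yet.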


853 Commits