lancedb

mirror of https://github.com/lancedb/lancedb.git synced 2026-07-06 20:40:41 +00:00

Author	SHA1	Message	Date
natcharacter	e29e4cc36d	A simple base usage that installs the dependencies necessary to use FTS and Hybrid search (#1036) --------- Co-authored-by: Nat Roth <natroth@Nats-MacBook-Pro.local> Co-authored-by: Chang She <759245+changhiskhan@users.noreply.github.com>	2024-04-05 16:31:36 -07:00
Rob Meng	f3de3d990d	chore: upgrade to lance 0.10.1 (#1034 ) upgrade to lance 0.10.1 and update doc string to reflect dynamic projection options	2024-04-05 16:31:36 -07:00
BubbleCal	0a8e258247	chore(rust): report the TableNotFound error when dropping a non-existent table (#1022) This will work once lance is upgraded with https://github.com/lancedb/lance/pull/1995 merged; see #884 for details. Signed-off-by: BubbleCal <bubble-cal@outlook.com>	2024-04-05 16:31:36 -07:00
Weston Pace	2cec2a8937	feat: add a basic async python client starting point (#1014 ) This changes `lancedb` from a "pure python" setuptools project to a maturin project and adds a rust lancedb dependency. The async python client is extremely minimal (only `connect` and `Connection.table_names` are supported). The purpose of this PR is to get the infrastructure in place for building out the rest of the async client. Although this is not technically a breaking change (no APIs are changing) it is still a considerable change in the way the wheels are built because they now include the native shared library.	2024-04-05 16:31:34 -07:00
Will Jones	464a36ad38	feat: `{add\|alter\|drop}_columns` APIs (#1015 ) Initial work for #959. This exposes the basic functionality for each in all of the APIs. Will add user guide documentation in a later PR.	2024-04-05 16:30:47 -07:00
Weston Pace	ad1e81a1d1	refactor: change arrow from a direct dependency to a peer dependency (#984 ) BREAKING CHANGE: users will now need to npm install `apache-arrow` and `@apache-arrow/ts` themselves.	2024-04-05 16:30:47 -07:00
Lance Release	562d1af1ed	Bump version: 0.4.10 → 0.4.11	2024-04-05 16:30:40 -07:00
Weston Pace	2163502b31	refactor: rename the rust crate from vectordb to lancedb (#1012 ) This also renames the new experimental node package to lancedb. The classic node package remains named vectordb. The goal here is to avoid introducing piecemeal breaking changes to the vectordb crate. Instead, once the new API is stabilized, we will officially release the lancedb crate and deprecate the vectordb crate. The same pattern will eventually happen with the npm package vectordb.	2024-04-05 16:30:40 -07:00
Will Jones	c5b0934bfb	feat(node): add `read_consistency_interval` to Node and Rust (#1002) This PR adds the same consistency semantics that were added in #828. It does not add the same lazy-loading of tables, since that breaks some existing tests. This closes #998. --------- Co-authored-by: Weston Pace <weston.pace@gmail.com>	2024-04-05 16:30:40 -07:00
Lance Release	ef54bd5ba2	[python] Bump version: 0.5.6 → 0.5.7	2024-04-05 16:30:40 -07:00
Lei Xu	80e4d14c02	chore: bump pylance to 0.9.18 (#1011 )	2024-04-05 16:30:40 -07:00
Raghav Dixit	fdabf31984	python(feat): Imagebind embedding fn support (#1003) Added ImageBind embedding function support; installation steps are given in the docstring. The slow pytest checks were run locally. --------- Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>	2024-04-05 16:30:40 -07:00
Ayush Chaurasia	538d0320f7	Docs: add meta tags (#1006 )	2024-04-05 16:30:40 -07:00
Weston Pace	cbc0c439ef	refactor: rust vectordb API stabilization of the Connection trait (#993) This is the start of a more comprehensive refactor and stabilization of the Rust API. The `Connection` trait is cleaned up to not require `lance` and to match the `Connection` trait in other APIs. In addition, the concrete implementation `Database` is hidden. BREAKING CHANGE: The struct `crate::connection::Database` is now gone. Several examples opened a connection using `Database::connect` or `Database::connect_with_params`. Users should now use `vectordb::connect`. BREAKING CHANGE: The `connect`, `create_table`, and `open_table` methods now all return a builder object. This means that a call like `conn.open_table(..., opt1, opt2)` will now become `conn.open_table(...).opt1(opt1).opt2(opt2).execute()`. In addition, the structure of options has changed slightly. However, no options capability has been removed. --------- Co-authored-by: Will Jones <willjones127@gmail.com>	2024-04-05 16:30:40 -07:00
Lance Release	69492586f0	[python] Bump version: 0.5.5 → 0.5.6	2024-04-05 16:30:40 -07:00
Bert	f5627dac14	lance 0.9.18 (#1000 )	2024-04-05 16:30:40 -07:00
Johannes Kolbe	32bfb68ac3	apply fixes for notebook (#989 )	2024-04-05 16:30:40 -07:00
Ayush Chaurasia	bc871169f0	docs: Add meta tag for image preview (#988) I think this should work. Need to deploy it to be sure as it can be tested locally. Can be tested here. Two things about this solution: * All pages have the same meta tag, i.e., the lancedb banner * If needed, we can automatically use the first image of each page and generate meta tags using the ultralytics mkdocs plugin that we built for this purpose - https://github.com/ultralytics/mkdocs	2024-04-05 16:30:40 -07:00
Chang She	3fc835e124	doc: update navigation links for embedding functions (#986 )	2024-04-05 16:30:40 -07:00
Chang She	484a121866	doc: improve embedding functions documentation (#983 ) Got some user feedback that the `implicit` / `explicit` distinction is confusing. Instead I was thinking we would just deprecate the `with_embeddings` API and then organize working with embeddings into 3 buckets: 1. manually generate embeddings 2. use a provided embedding function 3. define your own custom embedding function	2024-04-05 16:30:40 -07:00
Chang She	bc850e6add	feat(python): add optional threadpool for batch requests (#981 ) Currently if a batch request is given to the remote API, each query is sent sequentially. We should allow the user to specify a threadpool.	2024-04-05 16:30:40 -07:00
Will Jones	26eec4bef4	fix: use static C runtime on Windows (#979 ) We depend on C static runtime, but not all Windows machines have that. So might be worth statically linking it. https://github.com/reorproject/reor/issues/36#issuecomment-1948876463	2024-04-05 16:30:40 -07:00
Will Jones	f84a4855ca	docs: show DuckDB with dataset, not table (#974) Using datasets is the preferred way, since it allows filter and projection pushdown as well as aggregating larger-than-memory tables.	2024-04-05 16:30:40 -07:00
Ayush Chaurasia	aecafa6479	docs: Minimal reranking evaluation benchmarks (#977 )	2024-04-05 16:30:40 -07:00
Will Jones	efa846b6e5	chore: upgrade lance to 0.9.16 (#975 )	2024-04-05 16:30:36 -07:00
Will Jones	cf3dbcf684	ci: fix Node ARM release build (#971) When we turned on fat LTO builds, we made the release build job much more compute and memory intensive. The ARM runners have particularly low memory per core, which makes them susceptible to OOM errors. To avoid issues, I have enabled memory swap on ARM and bumped the size of the runner.	2024-04-05 16:30:36 -07:00
Will Jones	c425d3759d	ci: reduce number of build jobs on aarch64 to avoid OOM (#970 )	2024-04-05 16:30:36 -07:00
Lance Release	fded15c9fe	[python] Bump version: 0.5.4 → 0.5.5	2024-04-05 16:30:36 -07:00
Lance Release	e888cb5b48	Bump version: 0.4.9 → 0.4.10	2024-04-05 16:30:30 -07:00
Weston Pace	9241f47f0e	feat: make it easier to create empty tables (#942 ) This PR also reworks the table creation utilities significantly so that they are more consistent, built on top of each other, and thoroughly documented.	2024-04-05 16:30:30 -07:00
Prashanth Rao	b014c24e66	[docs]: Fix typos and clarity in hybrid search docs (#966) - Fixed typos and added some clarity to the hybrid search docs - Changed "Airbnb" case to be as per the [official company name](https://en.wikipedia.org/wiki/Airbnb) (the "bnb" shouldn't be capitalized), and the text in the document now aligns with this - Fixed headers in nav bar	2024-04-05 16:30:30 -07:00
Will Jones	68115f1369	fix: wrap in BigInt to avoid upstream bug (#962 ) Closes #960	2024-04-05 16:30:30 -07:00
Ayush Chaurasia	f78fe721db	docs: Add setup cell for colab example (#965 )	2024-04-05 16:30:30 -07:00
Ayush Chaurasia	510e8378bc	feat(python): hybrid search updates, examples, & latency benchmarks (#964 ) - Rename safe_import -> attempt_import_or_raise (closes https://github.com/lancedb/lancedb/pull/923) - Update docs - Add Notebook example (@changhiskhan you can use it for the talk. Comes with "open in colab" button) - Latency benchmark & results comparison, sanity check on real-world data - Updates the default openai model to gpt-4	2024-04-05 16:30:30 -07:00
Will Jones	1045af6c09	chore: fix clippy lints (#963 )	2024-04-05 16:30:30 -07:00
QianZhu	7afcfca10d	Qian/make vector col optional (#950 ) remote SDK tests were completed through lancedb_integtest	2024-04-05 16:30:29 -07:00
Will Jones	88205aba64	fix(node): statically link lzma (#961 ) Fixes #956 Same changes as https://github.com/lancedb/lance/pull/1934	2024-04-05 16:30:10 -07:00
Weston Pace	da47938a43	chore: use a bigger runner for NPM publish jobs on aarch64 to avoid OOM (#955 )	2024-04-05 16:30:06 -07:00
Lance Release	03e705c14c	Bump version: 0.4.8 → 0.4.9	2024-04-05 16:29:58 -07:00
Lance Release	a7e60a4c3f	[python] Bump version: 0.5.3 → 0.5.4	2024-04-05 16:29:58 -07:00
Weston Pace	e12bdc78bb	chore: bump lance version to 0.9.15 (#949 )	2024-04-05 16:29:58 -07:00
Weston Pace	41ccb48160	feat: add support for filter during merge insert when matched (#948 ) Closes #940	2024-04-05 16:29:58 -07:00
QianZhu	069ad267bd	added error msg to SaaS APIs (#852 ) 1. improved error msg for SaaS create_table and create_index --------- Co-authored-by: Chang She <759245+changhiskhan@users.noreply.github.com>	2024-04-05 16:29:58 -07:00
Weston Pace	138fc3f66b	feat: add a filterable count_rows to all the lancedb APIs (#913 ) A `count_rows` method that takes a filter was recently added to `LanceTable`. This PR adds it everywhere else except `RemoteTable` (that will come soon).	2024-04-05 16:29:58 -07:00
Nitish Sharma	2c3f982f4f	Minor updates to FAQ (#935 ) Based on discussion over discord, adding minor updates to the FAQ section about benchmarks, practical data size and concurrency in LanceDB	2024-04-05 16:29:58 -07:00
Ayush Chaurasia	d07817a562	feat(python): Reranker DX improvements (#904) - Most users might not know how to use the `QueryBuilder` object. Instead we should just pass the string query. - Add new rerankers: Colbert, openai	2024-04-05 16:29:58 -07:00
Will Jones	39cc2fd62b	feat(python): add `read_consistency_interval` argument (#828) This PR refactors how we handle read consistency, that is, whether the `LanceTable` class always picks up modifications to the table made by other instances or processes. Users have three options they can set at the connection level: 1. (Default) `read_consistency_interval=None` means it will not check at all. Users can call `table.checkout_latest()` to manually check for updates. 2. `read_consistency_interval=timedelta(0)` means always check for updates, giving strong read consistency. 3. `read_consistency_interval=timedelta(seconds=20)` means check for updates every 20 seconds. This is eventual consistency, a compromise between the two options above. There is now an explicit difference between a `LanceTable` that tracks the current version and one that is fixed at a historical version. We now enforce that users cannot write if they have checked out an old version. They are instructed to call `checkout_latest()` before calling the write methods. Since `conn.open_table()` doesn't have a parameter for version, users will only get fixed references if they call `table.checkout()`. The difference between these two can be seen in the repr: Tables that are fixed at a particular version will have a `version` displayed in the repr. Otherwise, the version will not be shown. ```python >>> table LanceTable(connection=..., name="my_table") >>> table.checkout(1) >>> table LanceTable(connection=..., name="my_table", version=1) ``` I decided to not create different classes for these states, because I think we already have enough complexity with the Cloud vs OSS table references. Based on #812	2024-04-05 16:29:57 -07:00
Ayush Chaurasia	0f00cd0097	feat(python): add support for new openai embedding functions (#912) @PrashantDixit0 --------- Co-authored-by: Chang She <759245+changhiskhan@users.noreply.github.com>	2024-04-05 16:29:13 -07:00
Lei Xu	84edf56995	chore: add global cargo config to enable minimal cpu target (#925 ) * Closes #895 * Fix cargo clippy	2024-04-05 16:29:13 -07:00
QianZhu	b2efd0da53	fix hybrid search example (#922 )	2024-04-05 16:29:13 -07:00


804 Commits