lancedb

mirror of https://github.com/lancedb/lancedb.git synced 2025-12-25 14:29:56 +00:00

Author	SHA1	Message	Date
Lance Release	7a3ef68306	Bump version: 0.9.0-beta.3 → 0.9.0-beta.4	2024-12-20 16:02:53 +00:00
Ryan Green	43952e01d7	bump version	2024-12-20 09:44:46 -06:00
Ryan Green	495c335831	Fix fast_search	2024-12-20 09:43:39 -06:00
Ryan Green	77707db543	Backport fast_search and empty query builder for remote table	2024-12-20 09:21:05 -06:00
Ryan Green	d6d7ad3b06	bump version	2024-12-18 10:21:04 -06:00
Ryan Green	e58d64c286	Remove unsupported Retry params	2024-12-18 10:08:38 -06:00
Ryan Green	76cbd18c46	bump version	2024-12-18 09:38:36 -06:00
Ryan Green	4abb38ac70	bump version	2024-12-18 09:37:58 -06:00
Ryan Green	8193183304	override urllib3 version	2024-12-18 08:59:24 -06:00
Lance Release	e3b7ee47b9	Bump version: 0.9.0 → 0.9.0-final.1	2024-12-13 01:16:24 +00:00
Lu Qiu	97c9c906e4	Fix version test	2024-12-12 17:10:07 -08:00
Lu Qiu	358f86b9c6	fix	2024-12-12 16:44:24 -08:00
Lu Qiu	5489e215a3	Support storage options and folder prefix	2024-12-12 16:17:34 -08:00
Lance Release	bc0814767b	Bump version: 0.9.0-beta.0 → 0.9.0	2024-06-25 00:25:27 +00:00
Lance Release	8960a8e535	Bump version: 0.8.2 → 0.9.0-beta.0	2024-06-25 00:25:27 +00:00
Weston Pace	a8568ddc72	feat: upgrade to lance 0.13.0 (#1404 )	2024-06-24 17:22:57 -07:00
josca42	0fe844034d	feat: enable stemming (#1356 ) Added the ability to specify tokenizer_name when creating a full text search index using tantivy. This enables the use of language-specific stemming. Also updated the [guide on full text search](https://lancedb.github.io/lancedb/fts/) with a short section on choosing a tokenizer. Fixes #1315	2024-06-20 14:23:55 -07:00
Weston Pace	ea86dad4b7	feat: upgrade lance to 0.12.2-beta.2 (#1381 )	2024-06-14 05:43:26 -07:00
harsha-mangena	a45656b8b6	docs: remove code-block:: python from docs (#1366 ) - refer #1264 - fixed minor documentation issue	2024-06-11 13:13:02 -07:00
Ayush Chaurasia	76fc16c7a1	docs: add retriever guide, address minor onboarding feedbacks & enhancement (#1326 ) - Tried to address some onboarding feedbacks listed in https://github.com/lancedb/lancedb/issues/1224 - Improve visibility of pydantic integration and embedding API. (Based on onboarding feedback - Many ways of ingesting data, defining schema but not sure what to use in a specific use-case) - Add a guide that takes users through testing and improving retriever performance using built-in utilities like hybrid-search and reranking - Add some benchmarks for the above - Add missing cohere docs --------- Co-authored-by: Weston Pace <weston.pace@gmail.com>	2024-06-08 06:25:31 +05:30
Lance Release	11fcdb1194	Bump version: 0.8.2-beta.0 → 0.8.2	2024-06-05 13:47:16 +00:00
Lance Release	95a5a0d713	Bump version: 0.8.1 → 0.8.2-beta.0	2024-06-05 13:47:16 +00:00
Weston Pace	c3043a54c6	feat: bump lance dependency to 0.12.1 (#1357 )	2024-06-05 06:07:11 -07:00
Weston Pace	d5586c9c32	feat: make it possible to opt in to using the v2 format (#1352 ) This also exposed the max_batch_length configuration option in python/node (it was needed to verify if we are actually in v2 mode or not)	2024-06-04 21:52:14 -07:00
Ayush Chaurasia	16eff254ea	feat: add support for new cohere models in cohere and bedrock embedding functions (#1335 ) Fixes #1329 Will update docs on https://github.com/lancedb/lancedb/pull/1326	2024-05-30 10:20:03 +05:30
Lance Release	291ed41c3e	Bump version: 0.8.1-beta.0 → 0.8.1	2024-05-30 01:00:21 +00:00
Lance Release	fdda7b1a76	Bump version: 0.8.0 → 0.8.1-beta.0	2024-05-30 01:00:21 +00:00
Weston Pace	eb2cbedf19	feat: upgrade lance to 0.11.1 (#1338 )	2024-05-29 16:28:09 -07:00
zhongpu	3bb7c546d7	fix: the bug of async connection context manager (#1333 ) - add `return` for `__enter__` The buggy code didn't return the object, therefore it will always return None within a context manager: ```python with await lancedb.connect_async("./.lancedb") as db: # db is always None ``` (BTW, why not to design an async context manager?) - add a unit test for Async connection context manager - update return type of `AsyncConnection.open_table` to `AsyncTable` Although type annotation doesn't affect the functionality, it is helpful for IDEs.	2024-05-29 09:33:32 -07:00
Philip Meier	1ad1c0820d	chore: replace semver dependency with packaging (#1311 ) Fixes #1296 per title. See https://github.com/lancedb/lancedb/pull/1298#discussion_r1603931457 Cc @wjones127 --------- Co-authored-by: Will Jones <willjones127@gmail.com>	2024-05-28 10:05:16 -07:00
Lance Release	43f920182a	Bump version: 0.8.0-beta.0 → 0.8.0	2024-05-23 17:32:36 +00:00
Lance Release	718963d1fb	Bump version: 0.7.0 → 0.8.0-beta.0	2024-05-23 17:32:36 +00:00
Lance Release	41b77f5e25	Bump version: 0.7.0-beta.0 → 0.7.0	2024-05-23 16:30:16 +00:00
Lance Release	eb8b3b8c54	Bump version: 0.6.13 → 0.7.0-beta.0	2024-05-23 16:30:16 +00:00
Rob Meng	2e197ef387	feat: upgrade lance to 0.11.0 (#1317 ) upgrade lance and make fixes for the upgrade	2024-05-21 18:53:19 -04:00
Weston Pace	4f512af024	feat: add the optimize function to nodejs and async python (#1257 ) The optimize function is pretty crucial for getting good performance when building a large scale dataset but it was only exposed in rust (many sync python users are probably doing this via to_lance today) This PR adds the optimize function to nodejs and to python. I left the function marked experimental because I think there will likely be changes to optimization (e.g. if we add features like "optimize on write"). I also only exposed the `cleanup_older_than` configuration parameter since this one is very commonly used and the rest have sensible defaults and we don't really know why we would recommend different values for these defaults anyways.	2024-05-20 07:09:31 -07:00
Will Jones	5349e8b1db	ci: make preview releases (#1302 ) This PR changes the release process. Some parts are more complex, and other parts I've simplified. ## Simplifications * Combined `Create Release Commit` and `Create Python Release Commit` into a single workflow. By default, it does a release of all packages, but you can still choose to make just a Python or just Node/Rust release through the arguments. This will make it rarer that we create a Node release but forget about Python or vice-versa. * Releases are automatically generated once a tag is pushed. This eliminates the manual step of creating the release. * Release notes are automatically generated and changes are categorized based on the PR labels. * Removed the use of `LANCEDB_RELEASE_TOKEN` in favor of just using `GITHUB_TOKEN` where it wasn't necessary. In the one place it is necessary, I left a comment as to why it is. * Reused the version in `python/Cargo.toml` so we don't have two different versions in Python LanceDB. ## New changes * We now can create `preview` / `beta` releases. By default `Create Release Commit` will create a preview release, but you can select a "stable" release type and it will create a full stable release. * For Python, pre-releases go to fury.io instead of PyPI * `bump2version` was deprecated, so upgraded to `bump-my-version`. This also seems to better support semantic versioning with pre-releases. * `ci` changes will now be shown in the changelog, allowing changes like this to be visible to users. `chore` is still hidden. ## Versioning NOTE: unlike how it is in lance repo right now, the version in main is the last one released, including beta versions. --------- Co-authored-by: Lance Release <lance-dev@lancedb.com> Co-authored-by: Weston Pace <weston.pace@gmail.com>	2024-05-17 11:24:38 -07:00
asmith26	3850d5fb35	Add ollama embeddings function (#1263 ) Following the docs [here](https://lancedb.github.io/lancedb/python/python/#lancedb.embeddings.openai.OpenAIEmbeddings) I've been trying to use ollama embedding via the OpenAI API interface, but unfortunately I couldn't get it to work (possibly related to https://github.com/ollama/ollama/issues/2416) Given the popularity of ollama I thought it could be helpful to have a dedicated Ollama Embedding function in lancedb. Very much welcome any thought on this or my code etc. Thanks!	2024-05-13 13:09:19 +05:30
Lance Release	b37c58342e	[python] Bump version: 0.6.12 → 0.6.13	2024-05-10 16:15:13 +00:00
QianZhu	4746281b21	fix rename_table api and cache pop (#1283 )	2024-05-08 13:41:18 -07:00
Aman Kishore	7b3b6bdccd	Remove semver strict dependency (#1253 )	2024-05-08 11:16:15 -07:00
Lance Release	fe89a373a2	[python] Bump version: 0.6.11 → 0.6.12	2024-05-07 19:27:17 +00:00
Will Jones	12dbca5248	ci: better test for test_syntax (#1278 ) The syntax error was fixed in tantivy 0.22.0, so I changed the test case to something more wrong.	2024-05-07 11:52:39 -07:00
Will Jones	75ede86fab	fix: clearer error that FTS is not supported on object stores (#1273 ) Closes #1272	2024-05-07 10:15:53 -07:00
Ayush Chaurasia	2f13fa225f	Chore (python): Better retry loop logging when embedding api fails (#1267 ) https://github.com/lancedb/lancedb/issues/1266#event-12703166915 This happens because the OpenAI API errors out with None values. The previous log level did not print the message on screen, so the log level was changed to warning, which better suits this case. The retry loop can also be disabled by setting `max_retries=0` (I'm not sure if we should also make this the default behaviour, as hitting the API rate limit is quite common when ingesting a large corpus) ``` func = get_registry().get("openai").create(max_retries=0) ```	2024-05-06 11:49:11 +05:30
Will Jones	82a1da554c	fix(python): return ValueError if passed unknown args to `connect()` (#1265 ) It's confusing to users that keyword arguments from the async API like `storage_options` are accepted by `connect()`, but don't do anything. We should error if unknown arguments are passed instead.	2024-05-03 17:00:08 -07:00
Rohit Rastogi	a7c0d80b9e	Implement convertors to and from Polars DataFrames in Rust SDK using convertors based on C FFI #1099 (#1260 ) https://github.com/lancedb/lancedb/issues/1099 Took the same general approach from: https://github.com/lancedb/lancedb/pull/1235. Instead of using high-level convertors implemented in polars-arrow (with the arrow-rs feature flag, which adds a dependency on arrow-rs), I used convertors based on the C FFI to avoid dependency conflicts. --------- Co-authored-by: Rohit Rastogi <rohitrastogi@Rohits-MacBook-Pro.local> Co-authored-by: Weston Pace <weston.pace@gmail.com>	2024-05-03 16:15:14 -07:00
Lance Release	1090c311e8	[python] Bump version: 0.6.10 → 0.6.11	2024-04-27 03:54:58 +00:00
Weston Pace	e767cbb374	chore: update to Lance version 0.10.16 and Arrow version 51 (#1247 )	2024-04-26 16:26:57 -07:00
Weston Pace	3d7c48feca	feat: allow the index_cache_size to be configured when opening a table (#1245 ) This was already configurable in the rust API but it wasn't actually being passed down to the underlying dataset. I added this option to both the async python API and the new nodejs API. I also added this option to the synchronous python API. I did not add the option to vectordb.	2024-04-26 13:42:02 -07:00
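
Note on commit 3bb7c546d7 (#1333): the message describes a context manager whose `__enter__` did not return the object, so the name bound by `with ... as db` was always None. The following is a minimal sketch of that class of bug in plain Python. It does not use lancedb; `BuggyConnection` and `FixedConnection` are hypothetical names invented for illustration and are not the library's classes.

```python
# Hypothetical stand-ins for a connection object; not lancedb code.

class BuggyConnection:
    """__enter__ has no return statement, so it implicitly returns None."""

    def __enter__(self):
        self.opened = True  # state is set, but nothing is returned

    def __exit__(self, exc_type, exc, tb):
        self.opened = False
        return False  # do not suppress exceptions


class FixedConnection:
    """Same as above, but __enter__ returns the object itself."""

    def __enter__(self):
        self.opened = True
        return self  # this value is what `as db` binds to

    def __exit__(self, exc_type, exc, tb):
        self.opened = False
        return False


with BuggyConnection() as db:
    buggy_result = db  # bound to the return value of __enter__

with FixedConnection() as db:
    fixed_result = db

print(buggy_result)
print(type(fixed_result).__name__)
```

The `with` statement binds the return value of `__enter__`, not the object the statement was called on, which is why adding `return` fixes the behaviour described in the commit.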


390 Commits