Commit Graph

78 Commits

Author SHA1 Message Date
Rob Meng
cbb56e25ab port remote connection client into lancedb (#194)
* `to_df()` is now async; added `to_df_blocking` for convenience
* add remote lancedb client to public lancedb
* make lancedb connection class understand url scheme
`lancedb+<connection_type>://<host>:<port>`.
2023-06-15 18:57:52 -04:00
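
The URL scheme named in the commit above can be illustrated with a short standard-library sketch. This is not the parsing code from the commit: the helper name `parse_lancedb_url` and the `http` connection type in the usage note are assumptions made for illustration only.

```python
from urllib.parse import urlparse


def parse_lancedb_url(url: str) -> dict:
    """Split a `lancedb+<connection_type>://<host>:<port>` URL into its parts.

    Hypothetical helper: the name and return shape are illustrative,
    not taken from the lancedb code base.
    """
    parsed = urlparse(url)
    # The scheme looks like "lancedb+<connection_type>"; split on the first "+".
    prefix, sep, connection_type = parsed.scheme.partition("+")
    if prefix != "lancedb" or not sep or not connection_type:
        raise ValueError(f"not a lancedb+<connection_type> URL: {url!r}")
    return {
        "connection_type": connection_type,
        "host": parsed.hostname,
        "port": parsed.port,
    }
```

For example, `parse_lancedb_url("lancedb+http://localhost:10024")` yields the connection type `http`, the host `localhost`, and the port `10024`, while a plain local path such as `/tmp/db` raises `ValueError`, so a caller could fall back to a local connection.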
Utkarsh Gautam
6b5c046c3b [Python] Updated to_df implementation in Contextualizer class (#174)
Changes include:
- Contexts smaller than the `window` param are now included as well
- Added an optional `threshold` parameter to `to_df` in `Contextualizer`
This should close #165
- If maintainers are satisfied with the implementation, I will add more
examples and test cases and update the documentation as well.

---------

Co-authored-by: Nithin PS <47279496+Nithinps021@users.noreply.github.com>
Co-authored-by: Will Jones <willjones127@gmail.com>
2023-06-14 09:22:32 -07:00
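
One way to read the two changes listed above is as a rolling window over a text column that also emits the shorter windows at the tail. The sketch below is pure Python and is not the `Contextualizer` implementation; the function name `rolling_contexts`, the `stride` argument, and the reading of `threshold` as a minimum context size are all assumptions.

```python
def rolling_contexts(texts, window, stride=1, threshold=1):
    """Join consecutive texts into overlapping contexts.

    Hypothetical stand-in for the behaviour described in the commit:
    windows shorter than `window` (at the end of the input) are kept
    as long as they contain at least `threshold` items.
    """
    contexts = []
    for start in range(0, len(texts), stride):
        chunk = texts[start:start + window]
        # Assumed meaning of `threshold`: drop contexts that are too small.
        if len(chunk) >= threshold:
            contexts.append(" ".join(chunk))
    return contexts
```

With `window=2` and the default `threshold`, the input `["a", "b", "c"]` produces a final context holding only `"c"`; raising `threshold` to 2 drops it again.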
Tevin Wang
9b83ce3d2a add black to python CI (#178)
Closes #48
2023-06-12 11:22:34 -07:00
Will Jones
fed33a51d5 wip: make the python API reference a bit nicer (#162)
Adds:

* Make `mkdocstrings` aware we are using numpy-style docstrings
* Fixes broken link on `index.md` to Python API docs (and added link to
node ones)
* Added examples to various classes.
* Added doctest to verify examples work.
2023-06-08 16:07:06 -07:00
Chang She
50cdb16b45 Better handle empty results from tantivy (#155)
Closes #154

---------

Co-authored-by: Chang She <chang@lancedb.com>
2023-06-05 18:18:14 -07:00
gsilvestrin
f765a453cf Use fsspec to implement table_names with cloud storage support (#117)
Co-authored-by: Will Jones <willjones127@gmail.com>
2023-06-01 16:56:26 -07:00
Lei Xu
9965b4564d [Python] Support drop table (#123)
Closes #86
2023-06-01 15:58:45 -07:00
Chang She
04d97347d7 move tantivy-py installation to be separate from wheel (#97)
PyPI does not allow uploading packages that have a direct reference.

For now we'll just ask the user to install tantivy separately.

---------

Co-authored-by: Chang She <chang@lancedb.com>
2023-05-25 17:57:26 -06:00
Chang She
f485378ea4 Basic full text search capabilities (#62)
This is v1 of integrating full text search index into LanceDB.

# API
The query API is roughly the same as before, except that if the input is
text instead of a vector, we assume it is a full-text search (FTS).

## Example
If `table` is a LanceDB LanceTable, then:

Build index: `table.create_fts_index("text")`

Query: `df = table.search("puppy").limit(10).select(["text"]).to_df()`

# Implementation
Here we use the tantivy-py package to build the index. We use the row
ids as the full-text-search index's doc ids, then do a Take operation
to fetch the matching rows.

# Limitations

1. Incremental row appends are not supported yet. New data won't show up
in search
2. Local filesystem only
3. Requires building tantivy explicitly

---------

Co-authored-by: Chang She <chang@lancedb.com>
2023-05-24 22:25:31 -06:00
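
The Implementation section of the commit above describes an index whose doc ids are row ids, followed by a Take to fetch the rows. The toy sketch below shows that shape with a plain Python inverted index standing in for tantivy. It does not use the tantivy-py or LanceDB APIs, and the names `build_fts_index` and `search` are hypothetical.

```python
from collections import defaultdict


def build_fts_index(rows, column):
    """Map each lowercased term to the set of row ids that contain it.

    Toy replacement for the tantivy index: the doc id is the row id.
    """
    index = defaultdict(set)
    for row_id, row in enumerate(rows):
        for term in row[column].lower().split():
            index[term].add(row_id)
    return index


def search(rows, index, query, limit=10):
    """Return up to `limit` rows that contain every term in `query`."""
    terms = query.lower().split()
    if not terms:
        return []
    # Row ids matching all query terms.
    matches = set.intersection(*(index.get(term, set()) for term in terms))
    row_ids = sorted(matches)[:limit]
    # The "Take" step: fetch rows by the ids the index returned.
    return [rows[row_id] for row_id in row_ids]
```

Because the index is built once from the rows present at build time, rows appended afterwards are not searchable until the index is rebuilt, which mirrors the first limitation listed in the commit.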
Chang She
59014a01e0 bump version for v0.1.2 2023-05-05 11:27:09 -07:00
Chang She
b6739f3f66 windows paths 2023-05-04 11:41:05 -07:00
Chang She
3a2df0ce45 Add method to get the URI scheme to support cloud storage 2023-05-04 09:47:03 -07:00
Chang She
a8db7f56d2 tolerance 2023-04-25 20:08:18 -07:00
Chang She
89e6232aeb Make distance metric configurable during search 2023-04-24 22:40:40 -07:00
Chang She
159b175316 Merge pull request #34 from lancedb/changhiskhan/overwrite-table
Add mode to overwrite table if already exists
2023-04-19 21:11:56 -07:00
Chang She
99310e099e expose methods to work with versioning in tables 2023-04-19 16:48:06 -07:00
Chang She
d7c5793803 Add mode to overwrite table if already exists 2023-04-19 16:22:11 -07:00
Lei Xu
ec197b1855 Merge pull request #31 from lancedb/lei/doc
[Doc] Pandas, PyArrow, DuckDB integration
2023-04-19 14:55:42 -07:00
Lei Xu
c38d80cab2 remove print 2023-04-19 14:26:07 -07:00
Lei Xu
b3fdabdf45 use python and arrow 2023-04-19 14:15:18 -07:00
Chang She
f0ea1d898b invalidate cached dataset after create_index and add 2023-04-18 16:51:26 -07:00
gsilvestrin
6865d66d37 renaming test case 2023-04-14 16:32:31 -07:00
gsilvestrin
aeecd809cc bugfix for LanceTable.add to convert python lists into arrow fixed size lists
- Fixed `add` unit test to create the correct expected result
- Added a unit test for LanceTable.add
- Need to discuss if len(LanceTable) is handled correctly
2023-04-14 14:13:01 -07:00
Chang She
eba533da4f fix 3.11 2023-03-24 19:45:46 -07:00
Chang She
5d7832c8a5 update for release 2023-03-24 18:16:29 -07:00
Chang She
5ef5141812 black 2023-03-22 18:29:07 -07:00
Chang She
690141d357 add unit tests 2023-03-21 22:29:19 -07:00
Chang She
b10301f5d6 initial python impl 2023-03-18 10:43:26 -07:00