* Easy to type
* Handle `String`, `&str`, `[String]` and `[&str]` well without manual
conversion
* Rename functions so that their names are verbs
* Improve the Rust docstrings
* Promote `query()` and `search()` to the public `Table` trait
Allow passing API key as env var:
```shell
export LANCEDB_API_KEY=sh_123...
```
With this set, the `apiKey` argument can be omitted from `connect`:
```js
const db = await vectordb.connect({
  uri: "db://test-proj-01-ae8343",
  region: "us-east-1",
});
```
```py
import lancedb

db = lancedb.connect(
    uri="db://test-proj-01-ae8343",
    region="us-east-1",
)
```
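For illustration, the fallback could be resolved roughly as follows. This is a minimal sketch, not the actual client code: the helper name `resolve_api_key`, its `env` parameter, and the choice to raise `ValueError` when no key is found are all assumptions made for this example.

```python
import os


def resolve_api_key(api_key=None, env=None):
    """Return the explicit key if given, else fall back to LANCEDB_API_KEY.

    Hypothetical helper for illustration only; the real client may differ.
    """
    # Allow a custom mapping to be injected so the logic is easy to test.
    env = os.environ if env is None else env
    key = api_key or env.get("LANCEDB_API_KEY")
    if not key:
        # Assumption: a missing key is treated as an error.
        raise ValueError(
            "No API key found: pass one explicitly or set LANCEDB_API_KEY"
        )
    return key
```

An explicitly passed key takes precedence over the environment variable in this sketch, so existing callers that pass the key keep working unchanged.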
@eddyxu added instructions for linting here:
7af213801a/python/README.md (L45-L50)
However, we had a lot of failures and weren't checking this in CI. This
PR fixes all lints and adds a check to CI to keep us in compliance with
the lints.
This PR makes the following aesthetic and content updates to the docs.
- [x] Fix max width issue on mobile: Content should now render more
cleanly and be more readable on smaller devices
- [x] Improve image quality of flowchart in data management page
- [x] Fix syntax highlighting in text at the bottom of the IVF-PQ
concepts page
- [x] Add example of Polars LazyFrames to docs (Integrations)
- [x] Add example of adding data to tables using Polars (guides)
This PR makes incremental changes to the documentation.
* Closes #697
* Closes #698
## Chores
- [x] Add dark mode
- [x] Fix headers in navbar
- [x] Add `extra.css` to customize navbar styles
- [x] Customize fonts for prose/code blocks, navbar and admonitions
- [x] Inspect all admonition boxes (remove redundant dropdowns) and
improve clarity and readability
- [x] Ensure that all images in the docs have white background (not
transparent) to be viewable in dark mode
- [x] Improve code formatting in code blocks to make them consistent
with autoformatters (eslint/ruff)
- [x] Add bolder weight to h1 headers
- [x] Add diagram showing the difference between embedded (OSS) and
serverless (Cloud)
- [x] Fix [Creating an empty
table](https://lancedb.github.io/lancedb/guides/tables/#creating-empty-table)
section: right now, the subheaders are not clickable.
- [x] In critical data ingestion methods like `table.add` (among
others), the type signature often does not match the actual code
- [x] Proof-read each documentation section and rewrite as necessary to
provide more context, use cases, and explanations so it reads less like
reference documentation. This is especially important for CRUD and
search sections since those are so central to the user experience.
## Restructure/new content
- [x] The section for [Adding
data](https://lancedb.github.io/lancedb/guides/tables/#adding-to-a-table)
only shows examples for pandas and iterables. We should include pydantic
models, arrow tables, etc.
- [x] Add conceptual tutorial for IVF-PQ index
- [x] Clearly separate vector search, FTS and filtering sections so that
these are easier to find
- [x] Add docs on refine factor to explain its importance for recall.
Closes #716
- [x] Add an FAQ page showing answers to commonly asked questions about
LanceDB. Closes #746
- [x] Add simple polars example to the integrations section. Closes #756
and closes #153
- [ ] Add basic docs for the Rust API (more detailed API docs can come
later). Closes #781
- [x] Add a section on the various storage options on local vs. cloud
(S3, EBS, EFS, local disk, etc.) and the tradeoffs involved. Closes #782
- [x] Revamp filtering docs: add pre-filtering examples and redo headers
and update content for SQL filters. Closes #783 and closes #784.
- [x] Add docs for data management: compaction, cleaning up old versions
and incremental indexing. Closes #785
- [ ] Add a benchmark section that also discusses some best practices.
Closes #787
---------
Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>
Co-authored-by: Will Jones <willjones127@gmail.com>
This mimics `CREATE TABLE IF NOT EXISTS` behavior.
We add an `exist_ok` parameter: `db.create_table(..., exist_ok=True)`.
By default it is set to `False`, so trying to create
a table with a name that already exists raises an exception.
If set to `True`, the call only opens the table if it
already exists. If you pass in a schema, it is
checked against the existing table to make sure
you get what you want. If you pass in data, it is
NOT added to the existing table.
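The semantics described above can be modeled with a small in-memory stand-in. This is only a sketch of the behavior, not LanceDB's implementation: `ToyDB`, its dict-based tables, and the use of `ValueError` for both failure cases are assumptions made for this example.

```python
class ToyDB:
    """Hypothetical in-memory model of the `exist_ok` semantics."""

    def __init__(self):
        # Maps table name -> {"schema": ..., "rows": [...]}
        self._tables = {}

    def create_table(self, name, data=None, schema=None, exist_ok=False):
        if name in self._tables:
            if not exist_ok:
                # Default behavior: creating a duplicate is an error.
                raise ValueError(f"Table {name!r} already exists")
            table = self._tables[name]
            # A provided schema is checked against the existing table.
            if schema is not None and schema != table["schema"]:
                raise ValueError("Schema does not match the existing table")
            # Any provided data is deliberately NOT added.
            return table
        table = {"schema": schema, "rows": list(data or [])}
        self._tables[name] = table
        return table


db = ToyDB()
first = db.create_table("items", data=[{"id": 1}], schema={"id": "int"})
# Opens the existing table; the new row is ignored.
again = db.create_table("items", data=[{"id": 2}], exist_ok=True)
```

After the second call, `again` refers to the same table as `first` and still holds only the original row.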
This pull request adds a check for the presence of the `OPENAI_API_KEY`
environment variable and removes an unused parameter from the
`retry_with_exponential_backoff` function.
Named it Gemini-text for now. Not sure how complicated it will be to
support both text and multimodal embeddings under the same class
"gemini", but it's not something to worry about for now, I guess.
Addresses #797
Problem: tantivy does not expose an option to explicitly
Proposed solution here:
1. Add a `.phrase_query()` option
2. Under the hood, LanceDB takes care of wrapping the input in quotes
and replacing nested double quotes with single quotes
I've also filed an upstream issue. If they support phrase queries
natively, we can get rid of our manual custom processing here.
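The quoting step from the proposed solution could look roughly like this. It is a minimal sketch of the string handling only; the function name `to_phrase_query` is hypothetical and not part of the LanceDB or tantivy API.

```python
def to_phrase_query(query: str) -> str:
    """Wrap `query` in double quotes so it is treated as a single phrase.

    Nested double quotes are replaced with single quotes first so they
    do not terminate the phrase early.
    """
    sanitized = query.replace('"', "'")
    return f'"{sanitized}"'


print(to_phrase_query('the "old man" and the sea'))
# prints: "the 'old man' and the sea"
```

Replacing the inner quotes rather than escaping them keeps the sketch independent of whatever escaping rules the underlying query parser applies.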