- Creates testing files `md_testing.py` and `md_testing.js` for testing Python and Node.js code in markdown files in the documentation. This listens for HTML tags as well: `<!--[language] code code code...-->` will create a set-up file to create some mock tables or to fulfill some assumptions in the documentation.
- Creates a GitHub Actions workflow that triggers on every push/PR to `docs/**`
- Modifies documentation so tests run (mostly indentation, some small syntax errors and some missing imports)

A list of excluded files that we need to take a closer look at later on:

```javascript
const excludedFiles = [
  "../src/fts.md",
  "../src/embedding.md",
  "../src/examples/serverless_lancedb_with_s3_and_lambda.md",
  "../src/examples/serverless_qa_bot_with_modal_and_langchain.md",
  "../src/examples/youtube_transcript_bot_with_nodejs.md",
];
```

Many of them can't be done because we need the OpenAI API key :(. `fts.md` has some issues with the library, I believe this is still experimental?

Closes #170

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
Basic LanceDB Functionality
We'll cover the basics of using LanceDB on your local machine in this section.
??? info "LanceDB runs embedded on your backend application, so there is no need to run a separate server."
<img src="../assets/lancedb_embedded_explanation.png" width="650px" />
Installation
=== "Python"
```shell
pip install lancedb
```
=== "Javascript"
```shell
npm install vectordb
```
How to connect to a database
=== "Python"
```python
import lancedb

uri = "data/sample-lancedb"
db = lancedb.connect(uri)
```
LanceDB will create the directory if it doesn't exist (including parent directories).
If you need a reminder of the uri, use the `db.uri` property.
=== "Javascript"
```javascript
const lancedb = require("vectordb");
const uri = "data/sample-lancedb";
const db = await lancedb.connect(uri);
```
LanceDB will create the directory if it doesn't exist (including parent directories).
If you need a reminder of the uri, use the `db.uri` property.
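The directory handling described above behaves much like `mkdir -p`. The following is a minimal standard-library sketch of that behavior, not LanceDB's actual implementation; `ensure_db_dir` is a hypothetical helper introduced only for illustration, and a temporary directory stands in for your project root.

```python
import tempfile
from pathlib import Path


def ensure_db_dir(uri):
    """Create the directory at `uri`, including any missing parents.

    Hypothetical helper: it only mimics the documented behavior.
    """
    path = Path(uri)
    # parents=True creates intermediate directories,
    # exist_ok=True makes repeated calls harmless.
    path.mkdir(parents=True, exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as tmp:
    db_dir = ensure_db_dir(Path(tmp) / "data" / "sample-lancedb")
    created = db_dir.is_dir()
    # Calling it again on an existing directory does not raise.
    ensure_db_dir(db_dir)

print(created)
```

Connecting twice to the same uri is therefore safe: the second call finds the directory already in place.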
How to create a table
=== "Python"
```python
tbl = db.create_table("my_table",
                      data=[{"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
                            {"vector": [5.9, 26.5], "item": "bar", "price": 20.0}])
```
If the table already exists, LanceDB will raise an error by default.
If you want to overwrite the table, you can pass in `mode="overwrite"`
to the `create_table` method.
You can also pass in a pandas DataFrame directly:
```python
import pandas as pd
df = pd.DataFrame([{"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
{"vector": [5.9, 26.5], "item": "bar", "price": 20.0}])
tbl = db.create_table("table_from_df", data=df)
```
=== "Javascript"
```javascript
// The rows are passed as the second positional argument.
const tb = await db.createTable("my_table",
                                [{"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
                                 {"vector": [5.9, 26.5], "item": "bar", "price": 20.0}]);
```
!!! warning
If the table already exists, LanceDB will raise an error by default.
If you want to overwrite the table, you can pass in `mode="overwrite"`
to the `createTable` function.
??? info "Under the hood, LanceDB is converting the input data into an Apache Arrow table and persisting it to disk in Lance format."
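Apache Arrow is a columnar format, so the conversion mentioned above amounts to pivoting row-oriented records into one array per column. The sketch below shows that pivot in plain Python. It is only an illustration of the idea and does not use Arrow or LanceDB; `rows_to_columns` is a hypothetical helper, and it assumes every row has the same keys.

```python
# Row-oriented input, as passed to create_table earlier in this guide.
rows = [
    {"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
    {"vector": [5.9, 26.5], "item": "bar", "price": 20.0},
]


def rows_to_columns(rows):
    """Pivot a list of row dicts into a dict mapping column name to values."""
    # Assumption: all rows share the keys of the first row.
    columns = {name: [] for name in rows[0]}
    for row in rows:
        for name in columns:
            columns[name].append(row[name])
    return columns


columns = rows_to_columns(rows)
print(columns["item"])
print(columns["price"])
```

Storing each column contiguously is what lets a columnar engine read only the fields a query touches.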
How to open an existing table
Once created, you can open a table using the following code:
=== "Python"
```python
tbl = db.open_table("my_table")
```
If you forget the name of your table, you can always get a listing of all table names:
```python
print(db.table_names())
```
=== "Javascript"
```javascript
const tbl = await db.openTable("my_table");
```
If you forget the name of your table, you can always get a listing of all table names:
```javascript
console.log(await db.tableNames());
```
How to add data to a table
After a table has been created, you can always add more data to it:
=== "Python"
```python
df = pd.DataFrame([{"vector": [1.3, 1.4], "item": "fizz", "price": 100.0},
                   {"vector": [9.5, 56.2], "item": "buzz", "price": 200.0}])
tbl.add(df)
```
=== "Javascript"
```javascript
await tbl.add([{vector: [1.3, 1.4], item: "fizz", price: 100.0},
               {vector: [9.5, 56.2], item: "buzz", price: 200.0}]);
```
How to search for (approximate) nearest neighbors
Once you've embedded the query, you can find its nearest neighbors using the following code:
=== "Python"
```python
tbl.search([100, 100]).limit(2).to_df()
```
This returns a pandas DataFrame with the results.
=== "Javascript"
```javascript
const query = await tbl.search([100, 100]).limit(2).execute();
```
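To make it concrete what a nearest-neighbor query computes, here is a brute-force sketch in plain Python over the four sample rows used in this guide. It does not call LanceDB: `brute_force_search` is a hypothetical helper, it assumes Euclidean (L2) distance, and it is exact rather than approximate, so treat it as an illustration of the idea only.

```python
import math

# The four sample rows inserted earlier in this guide.
rows = [
    {"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
    {"vector": [5.9, 26.5], "item": "bar", "price": 20.0},
    {"vector": [1.3, 1.4], "item": "fizz", "price": 100.0},
    {"vector": [9.5, 56.2], "item": "buzz", "price": 200.0},
]


def brute_force_search(rows, query, limit):
    """Return the `limit` rows closest to `query` by Euclidean distance."""
    # Score every row, then keep the smallest distances.
    scored = [(math.dist(row["vector"], query), row) for row in rows]
    scored.sort(key=lambda pair: pair[0])
    return [row for _, row in scored[:limit]]


nearest = brute_force_search(rows, [100, 100], limit=2)
print([row["item"] for row in nearest])  # ['buzz', 'bar']
```

Scanning every row like this costs time linear in the table size, which is why larger tables benefit from an index.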
What's next
This section covered the very basics of the LanceDB API. LanceDB supports many additional features, including indices that speed up search and further search options. These are covered in the next section of the documentation.