lancedb/docs/src/basic.md
Tevin Wang b731a6aed9 Add docs code testing & documentation syntax changes (#196)
- Creates testing files `md_testing.py` and `md_testing.js` for testing
python and nodejs code in markdown files in the documentation
This listens for HTML tags as well: `<!--[language] code code
code...-->` will create a set-up file to create some mock tables or to
fulfill some assumptions in the documentation.
- Creates a github action workflow that triggers every push/pr to
`docs/**`
- Modifies documentation so tests run (mostly indentation, some small
syntax errors and some missing imports)

A list of excluded files that we need to take a closer look at later on:
```javascript
const excludedFiles = [
  "../src/fts.md",
  "../src/embedding.md",
  "../src/examples/serverless_lancedb_with_s3_and_lambda.md",
  "../src/examples/serverless_qa_bot_with_modal_and_langchain.md",
  "../src/examples/youtube_transcript_bot_with_nodejs.md",
];
```
Many of them can't be done because we need the OpenAI API key :(.
`fts.md` has some issues with the library, I believe this is still
experimental?

Closes #170

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2023-06-28 11:07:26 -07:00


Basic LanceDB Functionality

We'll cover the basics of using LanceDB on your local machine in this section.

??? info "LanceDB runs embedded on your backend application, so there is no need to run a separate server."

  <img src="../assets/lancedb_embedded_explanation.png" width="650px" />

Installation

=== "Python"
  ```shell
  pip install lancedb
  ```

=== "Javascript"
  ```shell
  npm install vectordb
  ```

How to connect to a database

=== "Python"
  ```python
  import lancedb
  uri = "data/sample-lancedb"
  db = lancedb.connect(uri)
  ```

  LanceDB will create the directory if it doesn't exist (including parent directories).

  If you need a reminder of the uri, use the `db.uri` property.

=== "Javascript"
  ```javascript
  const lancedb = require("vectordb");

  const uri = "data/sample-lancedb";
  const db = await lancedb.connect(uri);
  ```
  
  LanceDB will create the directory if it doesn't exist (including parent directories).

  If you need a reminder of the uri, you can call `db.uri()`.

How to create a table

=== "Python"
  ```python
  tbl = db.create_table(
      "my_table",
      data=[
          {"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
          {"vector": [5.9, 26.5], "item": "bar", "price": 20.0},
      ],
  )
  ```

  If the table already exists, LanceDB will raise an error by default.
  If you want to overwrite the table, you can pass in `mode="overwrite"`
  to the `create_table` method.

  You can also pass in a pandas DataFrame directly:
  ```python
  import pandas as pd
  df = pd.DataFrame([{"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
                    {"vector": [5.9, 26.5], "item": "bar", "price": 20.0}])
  tbl = db.create_table("table_from_df", data=df)
  ```

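=== "Javascript"
  ```javascript
  // Pass the rows as the second positional argument; JavaScript has no
  // keyword arguments, so `data=[...]` would assign to an undeclared variable.
  const tb = await db.createTable("my_table", [
    {"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
    {"vector": [5.9, 26.5], "item": "bar", "price": 20.0},
  ]);
  ```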

!!! warning

  If the table already exists, LanceDB will raise an error by default.
  If you want to overwrite the table, you can pass in `mode="overwrite"`
  to the `createTable` function.

??? info "Under the hood, LanceDB is converting the input data into an Apache Arrow table and persisting it to disk in Lance format."
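
Arrow tables store data by column rather than by row. As a rough illustration of that reshaping, the sketch below pivots the row dictionaries used in this guide into per-column lists using only the Python standard library. It is not Arrow or LanceDB code, and `rows_to_columns` is a hypothetical helper introduced here for illustration.

```python
# Row-oriented input, in the same shape passed to create_table in this guide.
rows = [
    {"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
    {"vector": [5.9, 26.5], "item": "bar", "price": 20.0},
]


def rows_to_columns(rows):
    """Hypothetical helper: pivot a list of row dicts into a dict of columns."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            # Append each field to the list that holds its column.
            columns.setdefault(name, []).append(value)
    return columns


columns = rows_to_columns(rows)
print(columns["item"])
print(columns["price"])
```

Each key in `columns` corresponds to one field of the input rows, which is roughly the layout a columnar format works with.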

How to open an existing table

Once created, you can open a table using the following code:

=== "Python"
  ```python
  tbl = db.open_table("my_table")
  ```

  If you forget the name of your table, you can always get a listing of all table names:

  ```python
  print(db.table_names())
  ```

=== "Javascript"
  ```javascript
  const tbl = await db.openTable("my_table");
  ```

  If you forget the name of your table, you can always get a listing of all table names:

  ```javascript
  console.log(await db.tableNames());
  ```

How to add data to a table

After a table has been created, you can always add more data to it:

=== "Python"
  ```python
  import pandas as pd

  df = pd.DataFrame([{"vector": [1.3, 1.4], "item": "fizz", "price": 100.0},
                     {"vector": [9.5, 56.2], "item": "buzz", "price": 200.0}])
  tbl.add(df)
  ```

=== "Javascript"
  ```javascript
  await tbl.add([{vector: [1.3, 1.4], item: "fizz", price: 100.0},
                 {vector: [9.5, 56.2], item: "buzz", price: 200.0}])
  ```

How to search for (approximate) nearest neighbors

Once you've embedded the query, you can find its nearest neighbors using the following code:

=== "Python"
  ```python
  tbl.search([100, 100]).limit(2).to_df()
  ```

  This returns a pandas DataFrame with the results.

=== "Javascript"
  ```javascript
  const query = await tbl.search([100, 100]).limit(2).execute();
  ```
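
To build intuition for what a nearest-neighbor query returns, here is a brute-force sketch in plain Python over the four rows used in this guide. It assumes Euclidean (L2) distance and an exact scan; the metric LanceDB actually applies and any index-based approximation may differ. `brute_force_search` is a hypothetical helper, not part of the LanceDB API.

```python
import math

# Rows copied from the create-table and add-data examples in this guide.
rows = [
    {"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
    {"vector": [5.9, 26.5], "item": "bar", "price": 20.0},
    {"vector": [1.3, 1.4], "item": "fizz", "price": 100.0},
    {"vector": [9.5, 56.2], "item": "buzz", "price": 200.0},
]


def brute_force_search(rows, query, limit):
    """Hypothetical helper: rank rows by Euclidean distance to `query`."""
    scored = [
        {**row, "distance": math.dist(row["vector"], query)}
        for row in rows
    ]
    # Smallest distance first, then keep only the top `limit` rows.
    scored.sort(key=lambda r: r["distance"])
    return scored[:limit]


results = brute_force_search(rows, [100, 100], limit=2)
for r in results:
    print(r["item"], round(r["distance"], 2))
```

Under this assumption, the two rows closest to `[100, 100]` are `buzz` followed by `bar`. An exact scan like this one compares the query against every row, which is the cost an index is meant to avoid on larger tables.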

What's next

This section covered the very basics of the LanceDB API. LanceDB also supports indices that speed up search, as well as additional search options. Both are covered in the next section of the documentation.