Revisiting the Inverted Indices for Billion-Scale Approximate Nearest Neighbors
This ANN extension for Postgres is based on the ivf-hnsw implementation of HNSW, the code for the current state-of-the-art billion-scale nearest neighbor search system presented in the paper:
Revisiting the Inverted Indices for Billion-Scale Approximate Nearest Neighbors,
Dmitry Baranchuk, Artem Babenko, Yury Malkov
Postgres extension
The HNSW index is held in memory (built on demand), and its maximal size is limited
by the maxelements index parameter. Another required parameter is the number of dimensions, dims (if it is not specified in the column type).
The optional parameter ef specifies the number of neighbors considered during index construction and search (it corresponds to the efConstruction and efSearch parameters
described in the article).
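To make the role of ef more concrete, here is a minimal Python sketch of the bounded best-first search that HNSW-style algorithms run on one graph layer. It is not the extension's code: the function name `search_layer`, the adjacency-dict graph, and the use of Euclidean distance are all assumptions made for illustration. The point is only that ef caps the size of the result list kept during the search, so a larger ef explores more neighbors at a higher cost.

```python
import heapq
import math


def search_layer(query, entry, ef, graph, vectors):
    """Best-first search on a single graph layer (illustrative sketch).

    graph maps a node id to a list of neighbor ids, vectors maps a node id
    to its coordinates. Both structures are hypothetical, not the
    extension's internal layout. Returns up to ef node ids, nearest first.
    """
    d0 = math.dist(query, vectors[entry])
    visited = {entry}
    candidates = [(d0, entry)]   # min-heap: closest unexpanded node first
    results = [(-d0, entry)]     # max-heap (negated): worst kept result on top

    while candidates:
        d, node = heapq.heappop(candidates)
        # Stop once the closest candidate is worse than the worst kept result.
        if d > -results[0][0]:
            break
        for neighbor in graph[node]:
            if neighbor in visited:
                continue
            visited.add(neighbor)
            dn = math.dist(query, vectors[neighbor])
            if len(results) < ef or dn < -results[0][0]:
                heapq.heappush(candidates, (dn, neighbor))
                heapq.heappush(results, (-dn, neighbor))
                if len(results) > ef:
                    heapq.heappop(results)   # drop the current worst

    return [node for _, node in sorted((-d, n) for d, n in results)]


# Toy data: ten points on a line, each linked to its immediate neighbors.
vectors = {i: (float(i),) for i in range(10)}
graph = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)}

narrow = search_layer((7.2,), entry=0, ef=1, graph=graph, vectors=vectors)
wide = search_layer((7.2,), entry=0, ef=3, graph=graph, vectors=vectors)
```

With ef=1 the search degenerates to a greedy walk that keeps a single best node; with ef=3 it keeps the three closest nodes it has seen.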
Example of usage:
create extension hnsw;
create table embeddings(id integer primary key, payload real[]);
create index on embeddings using hnsw(payload) with (maxelements=1000000, dims=100, m=32);
select id from embeddings order by payload <-> array[1.0, 2.0,...] limit 100;
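
For comparison, the exact result that such a query approximates can be computed by brute force. The Python sketch below assumes that the `<->` operator measures Euclidean (L2) distance, which the text above does not state; the function name `exact_knn` and the sample rows are hypothetical. A baseline like this can be useful for checking the recall of the index on a small sample.

```python
import math


def exact_knn(rows, query, k):
    """Return the ids of the k rows closest to query, nearest first.

    rows is a list of (id, payload) pairs, mirroring the embeddings table.
    Euclidean distance is an assumption about what `<->` computes.
    """
    ranked = sorted(rows, key=lambda row: math.dist(row[1], query))
    return [row_id for row_id, _ in ranked[:k]]


# Made-up sample data standing in for the embeddings table.
rows = [
    (1, [0.0, 0.0]),
    (2, [1.0, 1.0]),
    (3, [5.0, 5.0]),
    (4, [1.0, 2.0]),
]
nearest = exact_knn(rows, [1.0, 2.0], k=2)
```

This scans every row, so it is only practical for small tables; the index exists to avoid that full scan at the price of approximate results.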