Neon, a serverless PostgreSQL provider, and pgvector, an extension for PostgreSQL that enables efficient storage and searching of vector embeddings, together offer a powerful, scalable solution for semantic search.

Here’s how it looks in action. Imagine you have a collection of product descriptions, and you want to find products semantically similar to a given query.

First, you’d set up a Neon PostgreSQL database and enable the pgvector extension.

-- Connect to your Neon database, then enable pgvector (run once per database)
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name TEXT,
    description TEXT,
    embedding vector(1536) -- Adjust dimension based on your embedding model
);
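To confirm the extension is active before loading data, a quick catalog check can help. This is a minimal sketch against PostgreSQL’s standard pg_extension catalog; the version reported depends on what your Neon project provides (HNSW indexes require pgvector 0.5.0 or later).

```sql
-- Returns one row if pgvector is enabled in the current database,
-- and no rows if CREATE EXTENSION has not been run yet.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'vector';
```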

Next, you’d generate embeddings for your product descriptions using a machine learning model (like OpenAI’s text-embedding-ada-002, which produces 1536-dimensional vectors). You’d then insert these into your products table.

-- Example of inserting a product with its embedding
-- (Embedding generation would happen in your application code)
INSERT INTO products (name, description, embedding) VALUES
('Wireless Mouse', 'An ergonomic wireless mouse with long battery life.', '[0.123, 0.456, ..., 0.789]'), -- Placeholder for actual embedding
('Mechanical Keyboard', 'A clicky mechanical keyboard with RGB backlighting.', '[0.987, 0.654, ..., 0.321]'); -- Placeholder for actual embedding

Now, to perform a semantic search, you’d generate an embedding for your search query and then query the database for the nearest neighbors. pgvector supports various distance metrics like cosine, Euclidean (l2), and dot product. Cosine similarity is common for text embeddings.

-- Assume 'query_embedding' is the vector for "comfortable computer mouse"
SELECT
    id,
    name,
    description,
    embedding <=> '[0.150, 0.400, ..., 0.800]' AS distance -- Placeholder for actual query embedding
FROM products
ORDER BY distance
LIMIT 5;

The <=> operator calculates cosine distance, which pgvector defines as 1 - cosine similarity. (The <-> operator, by contrast, computes Euclidean (L2) distance.) Ordering by this distance in ascending order and applying LIMIT 5 returns the five most semantically similar products.
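To make the difference between pgvector’s distance operators concrete, here is a small sketch that applies all three to the same pair of vectors. The 3-dimensional vectors are purely illustrative (real embeddings would have 1536 dimensions here), and the query assumes the vector extension is already enabled.

```sql
-- The two vectors point in the same direction but have different lengths.
SELECT
    '[1, 2, 3]'::vector <-> '[2, 4, 6]'::vector AS l2_distance,            -- Euclidean (L2) distance
    '[1, 2, 3]'::vector <=> '[2, 4, 6]'::vector AS cosine_distance,        -- 1 - cosine similarity
    '[1, 2, 3]'::vector <#> '[2, 4, 6]'::vector AS negative_inner_product; -- inner product multiplied by -1
```

Because the vectors are parallel, the cosine distance should come out at (or extremely close to) zero, while the L2 distance is clearly non-zero. Cosine distance ignores vector length, which is one reason it is a popular choice for text embeddings. Note that <#> returns the negative inner product, so that ascending order still puts the best matches first.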

The core problem pgvector solves is efficient nearest neighbor search in high-dimensional spaces. Without an index, PostgreSQL performs an exact search: it computes the distance between your query vector and every vector in the table, so query time grows linearly with the number of rows and becomes too slow for large datasets. pgvector implements approximate indexing techniques, most notably Hierarchical Navigable Small Worlds (HNSW), which drastically reduce the number of comparisons needed in exchange for a small loss of recall. Instead of scanning all vectors, HNSW builds a multi-layer graph where vectors are nodes and edges connect "close" vectors. Searching involves traversing this graph, quickly narrowing down the potential nearest neighbors.

Neon’s serverless architecture lets your database’s compute scale with your workload, handling fluctuating query volumes and data sizes without manual intervention, which makes it a good fit for applications with unpredictable traffic.
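As a sketch of what this looks like in practice, the statement below creates an HNSW index on the products table from earlier. The index name is hypothetical, and the m and ef_construction values shown are pgvector’s documented defaults, written out only to make the tuning options visible.

```sql
-- vector_cosine_ops must match the operator used in ORDER BY (<=>).
-- Use vector_l2_ops for <-> or vector_ip_ops for <#> instead.
CREATE INDEX products_embedding_hnsw_idx      -- hypothetical index name
    ON products
    USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);      -- pgvector defaults, shown explicitly
```

The operator class matters: an index built with vector_cosine_ops can only accelerate queries that order by <=>. On very small tables the planner may still choose a sequential scan, which is expected.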

Most people understand that HNSW is an index. What they often miss is how much its performance depends on memory. Like any PostgreSQL index, an HNSW index is stored on disk and read through the buffer cache, so searches are fast only while the index pages they touch are cached in RAM. This is crucial for performance because disk I/O is orders of magnitude slower than memory access. If your index is too large to fit in memory, performance will degrade significantly as pages are repeatedly read from storage. Index builds are affected too: they are much faster when the graph fits within maintenance_work_mem. Two parameters are set when the index is created: m, the maximum number of connections per node, and ef_construction, the size of the candidate list used while building the graph. The hnsw.ef_search setting, by contrast, is applied at query time rather than at index creation. It controls the size of the candidate list during a search and is the main tuning knob for search speed versus accuracy.
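The following sketch shows how these knobs might be used in a session. It assumes an HNSW index named products_embedding_hnsw_idx (a hypothetical name) and a row with id = 1 already exist. The value 100 is only an example; pgvector’s default for hnsw.ef_search is 40.

```sql
-- Compare the index size with the memory available to your compute.
-- The index name is hypothetical; substitute the name of your own index.
SELECT pg_size_pretty(pg_relation_size('products_embedding_hnsw_idx')) AS index_size;

-- Raise the candidate list size for the current session only.
-- Higher values generally improve recall at the cost of slower queries.
SET hnsw.ef_search = 100;

-- Use the embedding of an existing row (id = 1, assumed to exist) as the query vector.
SELECT id, name
FROM products
WHERE id <> 1
ORDER BY embedding <=> (SELECT embedding FROM products WHERE id = 1)
LIMIT 5;

-- Return to the default for the rest of the session.
RESET hnsw.ef_search;
```

Using SET LOCAL inside a transaction instead of SET limits the change to a single transaction, which can be a safer choice when connections are pooled.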

The next step in scaling semantic search often involves exploring different indexing strategies or distributed vector databases.
