Adding full-text search to your app is surprisingly easy if you’re already using a relational database.
Let’s see it in action. Imagine we have a simple products table:
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(255) NOT NULL,
description TEXT,
price DECIMAL(10, 2)
);
INSERT INTO products (name, description, price) VALUES
('Ergonomic Office Chair', 'A comfortable chair designed for long working hours, with adjustable lumbar support.', 299.99),
('Mechanical Keyboard', 'Clicky, tactile keyboard with RGB backlighting. Perfect for gaming and typing.', 129.50),
('Wireless Mouse', 'High-precision optical mouse with a long-lasting battery. Connects via USB dongle.', 45.00),
('External Hard Drive', '1TB portable SSD for fast data transfer and storage. USB-C compatible.', 150.00),
('Webcam 1080p', 'Full HD webcam with built-in microphone. Ideal for video conferencing.', 75.99);
Now, we want to search for products based on keywords in their name or description.
First, we need to enable a full-text search extension. For PostgreSQL, this is pg_trgm.
CREATE EXTENSION pg_trgm;
This extension provides functions and operators for determining the similarity of text based on trigrams (sequences of three characters).
Next, we create a GIN (Generalized Inverted Index) index on the columns we want to search. GIN indexes are particularly effective for full-text search because they can quickly locate documents containing specific terms.
CREATE INDEX products_search_idx ON products USING GIN (to_tsvector('english', name || ' ' || description));
Here’s what’s happening:
to_tsvector('english', ...): This function converts the concatenatednameanddescriptioninto atsvectordata type.tsvectoris a special PostgreSQL type that stores a document’s words in a way that’s optimized for searching. The'english'argument specifies the language, which influences how words are stemmed (e.g., "working" becomes "work").name || ' ' || description: We concatenate thenameanddescriptionfields with a space in between. This ensures that words from both fields are included in the search.USING GIN: This tells PostgreSQL to use a GIN index for this column.
With the index in place, we can perform searches using the @@ operator, which checks for a match between a tsvector and a tsquery.
SELECT id, name, description, price
FROM products
WHERE to_tsvector('english', name || ' ' || description) @@ to_tsquery('english', 'work & comfort');
Let’s break down the WHERE clause:
to_tsquery('english', 'work & comfort'): This function converts our search string into atsquerydata type.tsqueryrepresents the search query, and it can include operators like&(AND),|(OR), and!(NOT). In this case, we’re looking for products that contain both "work" and "comfort" (or their stemmed variations) in their name or description.@@: This is the "matches" operator. It efficiently checks if thetsvectoron the left contains the terms specified in thetsqueryon the right.
This query will return the "Ergonomic Office Chair" because its description contains "working" and "comfortable."
The real power comes from how to_tsvector handles language. It performs stemming, which means it reduces words to their root form. So, searching for "work" will match "working," "worked," and "works." Similarly, "comfort" will match "comfortable." This significantly improves the recall of your search results.
The surprise is that this powerful, language-aware search functionality is built directly into your existing relational database, often with minimal overhead. You don’t need to spin up a separate Elasticsearch or Solr cluster for basic to moderately complex full-text search needs.
The next step is to explore advanced query syntax, like phrase searching and fuzzy matching, to refine your search capabilities even further.