PostgreSQL GIN Indexes: JSONB, Arrays & Full-Text Search
A practical guide for PostgreSQL and TimescaleDB

PostgreSQL GIN Indexes: JSONB, Arrays & Full-Text Search
Here's a pattern I see constantly: a team stores flexible data in a JSONB column, writes queries that filter on specific keys, and everything is fast in development. Six months later, with millions of rows in production, those JSONB queries take 3 seconds. They start thinking about moving to Elasticsearch or MongoDB. The actual fix is a single CREATE INDEX statement they've never seen before.
GIN indexes are PostgreSQL's answer to searching inside composite data types. If you work with JSONB, arrays, or full-text search, they're not optional.
What GIN Does Differently
A B-tree index stores one value per row and supports equality, range, and ordering. A GIN (Generalized Inverted Index) decomposes a column value into its constituent parts -- JSON keys, array elements, text lexemes -- and creates an index entry for each part. Think of it like a book's index: instead of one entry per page, you get entries for every term that appears on each page.
This is the only way to index operations like:
WHERE metadata @> '{"status": "active"}'(JSONB containment)WHERE tags @> ARRAY['postgresql'](array containment)WHERE search_vector @@ to_tsquery('performance')(full-text search)WHERE name LIKE '%smith%'(trigram matching)
Without a GIN index, every one of these is a sequential scan.
JSONB Indexes: Choose Your Operator Class
This is where people stumble. There are two GIN operator classes for JSONB:
Default (jsonb_ops) -- supports @>, ?, ?|, ?&:
CREATE INDEX CONCURRENTLY idx_products_attributes_gin
ON products USING gin (attributes);
-- All of these use the index:
SELECT * FROM products WHERE attributes @> '{"color": "blue"}';
SELECT * FROM products WHERE attributes ? 'warranty';
SELECT * FROM products WHERE attributes ?| array['color', 'size'];
jsonb_path_ops -- supports only @> but is 2-3x smaller and faster:
CREATE INDEX CONCURRENTLY idx_products_attributes_path
ON products USING gin (attributes jsonb_path_ops);
-- Only containment queries use this index
SELECT * FROM products WHERE attributes @> '{"color": "blue"}';
The trap: you create a jsonb_path_ops index, then write a query using ? (key existence). The index is silently ignored. The query does a seq scan. You spend an hour debugging.
Rule: if you only use @>, choose jsonb_path_ops. If you need ? or ?|, use the default.
Array Indexes
Straightforward -- create the GIN index and all array operators are covered:
CREATE INDEX CONCURRENTLY idx_articles_tags_gin
ON articles USING gin (tags);
-- Contains all specified elements (@>)
SELECT * FROM articles WHERE tags @> ARRAY['postgresql', 'performance'];
-- Contains any specified element (&&)
SELECT * FROM articles WHERE tags && ARRAY['postgresql', 'mysql'];
-- Is contained by (<@)
SELECT * FROM articles WHERE tags <@ ARRAY['postgresql', 'performance', 'indexing'];
Full-Text Search
Create a generated tsvector column with weighted sections, then index it:
-- PG12+ generated column
ALTER TABLE articles ADD COLUMN search_vector tsvector
GENERATED ALWAYS AS (
setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
setweight(to_tsvector('english', coalesce(body, '')), 'B')
) STORED;
CREATE INDEX CONCURRENTLY idx_articles_search_gin
ON articles USING gin (search_vector);
-- Search with ranking
SELECT title, ts_rank(search_vector, query) AS rank
FROM articles, to_tsquery('english', 'postgresql & performance') query
WHERE search_vector @@ query
ORDER BY rank DESC;
Title matches rank higher than body matches thanks to the weights.
Trigram Indexes: The LIKE '%pattern%' Solution
This is one of the most useful tricks in PostgreSQL. B-tree indexes need a fixed prefix, so LIKE '%smith%' always does a seq scan. The pg_trgm extension fixes this:
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX CONCURRENTLY idx_customers_name_trgm
ON customers USING gin (customer_name gin_trgm_ops);
-- Now these use the index:
SELECT * FROM customers WHERE customer_name LIKE '%smith%';
SELECT * FROM customers WHERE customer_name ILIKE '%john%';
-- Similarity/fuzzy matching too:
SELECT * FROM customers
WHERE customer_name % 'Jon Smith'
ORDER BY similarity(customer_name, 'Jon Smith') DESC;
Finding Tables That Need GIN Indexes
-- Tables with JSONB/array/tsvector columns and high seq scan counts
SELECT
t.relname AS table_name,
a.attname AS column_name,
pg_catalog.format_type(a.atttypid, a.atttypmod) AS data_type,
t.seq_scan,
pg_size_pretty(pg_relation_size(t.relid)) AS table_size
FROM pg_stat_user_tables t
JOIN pg_attribute a ON a.attrelid = t.relid
WHERE a.atttypid IN ('jsonb'::regtype, 'json'::regtype)
OR pg_catalog.format_type(a.atttypid, a.atttypmod) LIKE '%[]'
OR a.atttypid = 'tsvector'::regtype
ORDER BY t.seq_tup_read DESC;
Confirm with EXPLAIN:
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM products
WHERE attributes @> '{"color": "blue", "size": "large"}';
-- "Seq Scan" with "Filter" = needs GIN index
-- "Bitmap Index Scan" on GIN index = already optimized
Tuning fastupdate
GIN's pending list batches insertions for efficiency, but can cause unpredictable query latency during flushes:
-- Disable for consistent query latency
ALTER INDEX idx_products_attributes_gin SET (fastupdate = off);
-- Or tune the pending list size
ALTER INDEX idx_products_attributes_gin SET (gin_pending_list_limit = 256);
Tradeoff: disabling fastupdate means slower inserts but consistently fast queries.
GIN Index Size Considerations
GIN indexes are larger than B-tree indexes and grow with the number of distinct indexed values. A JSONB column with highly varied key structures produces a larger GIN index. Monitor sizes over time:
- Consider
jsonb_path_opsfor containment-only queries (2-3x smaller) - An unused GIN index slows every INSERT and UPDATE with zero query benefit
- Periodically check that your GIN indexes are actually being used
The Checklist
When adding a column to your schema:
| Column Type | Query Pattern | Index Needed |
| JSONB | @>, ? | GIN (default or jsonb_path_ops) |
| Array | @>, && | GIN |
| tsvector | @@ | GIN |
| Text (wildcard) | LIKE '%x%' | GIN with gin_trgm_ops |
Add the GIN index in the same migration that creates the column. Don't wait for production complaints.
