How to create a multi-column full text search index in Postgres
For example, to cover pg_search queries
Just a quick post here about something that doesn't seem to be covered much elsewhere.
If you have a query like this (this one generated by pg_search):
SELECT COUNT(*)
FROM "things"
INNER JOIN
(SELECT "things"."id" AS pg_search_id,
(ts_rank((
to_tsvector('simple', coalesce("things"."column1"::text, '')) ||
to_tsvector('simple', coalesce("things"."column2"::text, '')) ||
to_tsvector('simple', coalesce("things"."column3"::text, ''))
), (to_tsquery('simple', ''' ' || 'the search query' || ' ''' || ':*')),
0
)
) AS rank
FROM "things"
WHERE ((
to_tsvector('simple', coalesce("things"."column1"::text, '')) ||
to_tsvector('simple', coalesce("things"."column2"::text, '')) ||
to_tsvector('simple', coalesce("things"."column3"::text, ''))
) @@ (to_tsquery('simple', ''' ' || 'X' || ' ''' || ':*'))
)) AS pg_search_7dfb4cf67742cb0660305e ON "things"."id" = pg_search_7dfb4cf67742cb0660305e.pg_search_id
WHERE "things"."deleted_at" IS NULL
AND "things"."owner_id" = 1;
You can cover it with an index like this:
CREATE EXTENSION IF NOT EXISTS btree_gin;
CREATE INDEX CONCURRENTLY things_search ON things USING gin
(
owner_id,
(
to_tsvector('simple', coalesce("things"."column1"::text, '')) ||
to_tsvector('simple', coalesce("things"."column2"::text, '')) ||
to_tsvector('simple', coalesce("things"."column3"::text, ''))
)
)
WHERE deleted_at IS NULL;
One problem I ran into is that one of my columns is date. When casting it with ::text
, postgres doesn't like this and tells me "functions in index expression must be marked IMMUTABLE". More discussion on Function Volatility Categories is below. The fix for a date column is:
to_tsvector('simple', coalesce(EXTRACT(YEAR FROM column3) || '-' || EXTRACT(MONTH FROM column3)|| '-' || EXTRACT(DAY FROM column3), ''))
Function Volatility Categories
From what I understood from my quick research, this matters for columns that have localizations like timezones and charsets. I don't know how this would affect date. Can there be dates in other year systems like Julian, Persian, etc.? I don't know.
You can read more about Function Volatility Categories here.
You can view the function volatility category (provolatile) of an operator (in this case ||
) like so:
SELECT o.oprname, o.oprleft::regtype, o.oprright::regtype, o.oprcode, p.provolatile
FROM pg_operator o
JOIN pg_proc p ON p.oid = o.oprcode
WHERE oprname = '||';
You can view the function volatility category of a function (in this case to_char
) like so:
select proname,provolatile,proargtypes from pg_proc where proname='to_char';