How to create a new column in Postgres with existing rows backfilled with a different value from the default
The best approach for creating a new column that has a default has changed since Postgres 11. In the case where we want the existing rows to have a different value from future created rows (e.g. a boolean which signifies if a user should see the onboarding tour), the best approach has also improved significantly in a way that perhaps isn't immediately intuitive.
In the before times, in the long-long-ago (before Postgres 11)
Before Postgres 11, we had to create new columns with defaults in a multi-step process to avoid locking the database.
-- not specifying default, because before pg 11 this would lock and rewrite the entire table
ALTER TABLE users
ADD COLUMN has_seen_tour BOOLEAN;
ALTER TABLE users
ALTER COLUMN has_seen_tour SET DEFAULT FALSE,
ALTER COLUMN has_seen_tour SET NOT NULL;
-- to facilitate backfill below
CREATE INDEX CONCURRENTLY my_index ON users (has_seen_tour);
User.where(has_seen_tour: nil).in_batches do |batch|
sleep 0.01
batch.update_all(has_seen_tour: FALSE)
end
DROP INDEX CONCURRENTLY my_index;
>= Postgres 11
Now, this process is much more simple!
ALTER TABLE users
ADD COLUMN has_seen_tour BOOLEAN NOT NULL DEFAULT FALSE;
That's it!
However...
Populating existing rows with a different value from the default
Let's say you want the existing rows to be initialized with a different value from future
new rows. Your first instinct might be to set the default for what you want new rows to have,
and then run a loop backfilling the old rows.
ALTER TABLE users
ADD COLUMN has_seen_tour BOOLEAN NOT NULL DEFAULT FALSE;
Foo.in_batches do |batch|
sleep 0.1
batch.update_all(has_seen_tour: true)
end
However, thanks to Postgres 11's magic, there's a much easier way, without needing to do any table scanning or iterating through rows in a data migration:
-- initialize existing rows with FALSE
ALTER TABLE users
ADD COLUMN has_seen_tour BOOLEAN NOT NULL DEFAULT TRUE;
-- all new rows going forward should start with FALSE
ALTER TABLE users
ALTER COLUMN has_seen_tour SET DEFAULT FALSE;
That's it!