Rails N+1 query optimization to get records counts in PSQL
Performance of the Rails application becomes slower if the application has a lot of N+1 queries. In this article, we will go through a performance optimization that removes N+1 queries to fetch recourd counts based on certain conditions.
Let’s consider a schema for Address
model / table as gives below.
# id
# user_id
# address_1
# address_2
# address_3
# created_at
# updated_at
class Address < ApplicationRecord
end
Let’s say, we want to get
- The number of records that do not have
address_1
value - The number of records that do not have
address_2
value - The number of records that do not have
address_3
value
This can be written with Ruby on Rails ORM as given below.
column_names = %i(address_1 address_2 address_3)
column_names.inject({}) do |column_counts, column_name|
count = Address.where(column_name => [nil, '']).select(column_name).count
column_counts[column_name] = count
column_counts
end
If we run this piece of code,
we get following output
on a sample database with some records in addresses
table.
{
:address_1 => 23,
:address_2 => 51,
:address_3 => 1233
}
We’ve obtained the required information from the database using Rails ORM.
Let’s look at the queries being fired to get this data.
count = Address.where(column_name => [nil, '']).select(column_name).count
And the line is being called in a loop. When inspected, we can see following queries.
SELECT COUNT("addresses"."address_1") FROM "addresses" WHERE "addresses"."deleted_at" IS NULL AND ("addresses"."address_1" = $1 OR "addresses"."address_1" IS NULL) [["address_1", ""]]
SELECT COUNT("addresses"."address_2") FROM "addresses" WHERE "addresses"."deleted_at" IS NULL AND ("addresses"."address_2" = $1 OR "addresses"."address_2" IS NULL) [["address_2", ""]]
SELECT COUNT("addresses"."address_3") FROM "addresses" WHERE "addresses"."deleted_at" IS NULL AND ("addresses"."address_3" = $1 OR "addresses"."address_3" IS NULL) [["address_3", ""]]
As we can see it has fired 3 queries. Now, if this piece of / method gets called huge number of times in an application, this can become a bottleneck quickly.
General rule of thumb can be considered as:
If any line of code which fires a query is written in a loop, there may be a scope of performance optmization to remove those N+1 queries.
Remove N+1 querying using a SQL
Upon taking a close look at the SQL queries being fired, we can get the required data in a single SQL query using PostgreSQL’s Conditional Expresions
column_names = %i(address_1 address_2 address_3)
sql = <<-SQL
SELECT #{
column_names.map do |column_name|
"COALESCE(SUM(CASE WHEN #{column_name} IS NULL THEN 1 ELSE 0 END), 0) AS #{column_name}"
end.join(', ')
}
FROM addresses
SQL
ActiveRecord::Base.connection.execute(sql).first
Here we are getting the count of records
for each column in a select clause
based on condition in case .. when
expressions.
The record is counted if the value is NULL
.
We have used
COALESCE
in order to ensure default value 0
is returned if there are no records in addresses
table.
It gives the output as given below.
:address_1 => 23,
:address_2 => 51,
:address_3 => 1233
This is exactly the result that we wanted. But, using PostgreSQL’s features, we have obtained this result in a single query.
Subscribe to Ruby in Rails
Get the latest posts delivered right to your inbox
