SQL's NULL values: comparing, sorting, converting and joining with real values
SQL's NULL
value is a major point of confusion. The premise of NULL
is to represent an unknown or missing value, though it's only a placeholder and not really a value by itself. A single NULL
value can show up anywhere and torpedo your calculations. This perculiarity shows up when you try to compare it with anything else, including itself.
For example, look at these statements:
Statement | Result | Comment |
---|---|---|
SELECT NULL = NULL; |
NULL | NULL isn't comparable to NULL |
SELECT NULL > 0; |
NULL | NULL can't be compared |
SELECT NULL < 0; |
NULL | NULL can't be compared |
SELECT NULL = 0; |
NULL | NULL can't be equated |
SELECT NULL / 0; |
NULL | Are you expecting a Division by Zero error? |
SELECT NULL OR FALSE; |
NULL | NULL can't be used in boolean logic |
SELECT NULL OR TRUE; |
TRUE | Surprising and database dependent! |
Where are you finding NULL
values?
Sure, your database schema has clean data with the proper NOT NULL
constraints and any individual row does not have NULL
values. Though your original data does not have NULLs, there are two cases where your query can generate rows with NULL values:
Outer joins, when you want to fill gaps in your data. Unlike an inner join that only returns data when the join condition matches at least one row from either table, an outer join returns all rows from the tables referenced in the join as long as those rows match any of the join conditions.
Certain aggregate functions such as a
SUM
– for example, the sum of an empty list isNULL
.
NULL
values are a fact of life, and we should learn to deal with them.
1. Comparing NULL values
Say you have a table with a NULLable column type, and you want to find rows with a NULL
value in that column. Since you can't use a equality operator in the WHERE
clause (remember, NULL
values can't be equated or compared), the right way to compare NULL
values is to use the IS
and IS NOT
operators.
select nullable_column
from data_table
where nullable_column is null;
or
select nullable_column
from data_table
where nullable_column is not null;
1b. Additionally, Use IS DISTINCT FROM
to treat NULL
as a known value
PostgreSQL has two comparison statements IS DISTINCT FROM
and IS NOT DISTINCT FROM
that specially treats NULL
values as if it were a known value. The truth table with this expression for two nullable columns a
and b
will look something like this:
a | b | is distinct from |
---|---|---|
not NULL | not NULL | use the regular not equal '<>' operator |
not NULL | NULL | return TRUE because they are different |
NULL | not NULL | return TRUE because they are different |
NULL | NULL | return FALSE because they are the same |
2. Sorting NULL values
When trying to sort a column with the ORDER BY
clause on a nullable column, you'll find that NULL
values come last. If you tried a descending sort via the DESC
qualifier, NULL
values will come first.
Fortunately, you have more control over this. There are two ways to tell your database where you want the NULL
values to appear.
First, you can specify NULLS FIRST
or NULLS LAST
in your ORDER BY
clause to specify where you want the NULL
values. Consider a table users
where the gender
column is nullable.
select first_name, age, gender
from users
order by gender desc nulls last;
Second, you can convert the NULL values into a real value during query time or only when sorting. We'll make use of the COALESCE
function for this:
-- treat NULL in gender column as -1
select first_name, age, coalesce(gender, -1)
from users
order by 3;
-- treat NULL in gender column as -1 only during sorting
select first_name, age, gender
from users
order by coalesce(gender, -1);
3. Use COALESCE
to change NULL
values into real values
The COALESCE
function takes a number of arguments and returns the first non-NULL argument. If all of the arguments are NULL
, then the COALESCE
returns NULL
. For example,
select coalesce('love', 'sql');
-- returns love
select coalesce(null, 'love', 'sql');
-- returns love
3b. Additionally, COALESCE
is a shortcut for CASE
The COALESCE
function is a syntactic shortcut for the CASE
statement. For example, the code COALESCE(expr1, expr2, ..., exprN)
is rewritten by the query parser as the following CASE
statements:
case
when (expr1 is not null) then expr1
when (expr2 is not null) then expr2
...
else exprn
end
4. Joining tables on Nullable Columns
Say you have two tables: table1
and table2
with a column that can have NULL
values. If you were to join these two tables on that value, you have to be aware of the fact that NULL
is not equal to NULL
, and therefore, it would appear as though you are missing rows from the resulting joined table.
No spam, ever! Unsubscribe any time. See past emails here.