Querying JSON (JSONB) data types in PostgreSQL
One of the unusual features of the PostgreSQL database is the ability to store and process JSON documents. In the past, data analysts and engineers had to revert to a specialized document store like MongoDB for JSON processing. But with the powerful JSON features built into PostgreSQL, the need for an external document store is no longer necessary.
Document stores are enticing because it enables you to "store data now, figure out schema later." You were always able to store arbitrary data structures as plain text in databases like PostgreSQL and MySQL. But processing and speed were a problem because the database had no internal knowledge of the structure of the document. On every query, the database had to load and parse the entire text blob. Moreover, querying deep into the JSON document required the use of gnarly regular expressions.
PostgreSQL has two native data types to store JSON documents: JSON
and JSONB
. The key difference between them is that JSON
stores data in a raw format and JSONB
stores data in a custom binary format. Our focus here is going to be on the JSONB
data type because it allows the contents to be indexed and queried with ease.
1. Difference between JSON and JSONB
The JSON data type is basically a blob that stores JSON data in raw format, preserving even insignificant things such as whitespace, the order of keys in objects, or even duplicate keys in objects. It offers limited querying capabilities, and it's slow because it needs to load and parse the entire JSON blob each time.
JSONB on the other hand stores JSON data in a custom format that is optimized for querying and will not reparse the JSON blob each time.
If you know before hand that you will not be performing JSON querying operations, then use the JSON
data type. For all other cases, use JSONB
.
The following example demonstrates the difference:
select '{"user_id":1, "paying":true}'::json, '{"user_id":1, "paying":true}'::jsonb;
json | jsonb
--------------------------------+--------------------------------
{"user_id":1, "paying":true} | {"paying": true, "user_id": 1}
(1 row)
(the whitespace and the order of the keys are preserved in the JSOB column.)
2. Basics of PostgreSQL's JSONB data type
We'll first look at some basic operations for inserting and updating JSONB columns.
Creating a JSONB column
create table sales (
id serial not null primary key,
info jsonb not null
);
Inserting a JSON document
insert into sales values (1, '{name: "Alice", paying: true, tags: ["admin"]}');
Updating a JSON document
Updating by inserting a whole document:
update sales set info = '{name: "Bob", paying: false, tags: []}';
Updating by adding a key:
Use the ||
operator to concatenate existing data with new data. The operator will either update or insert the key to the existing document.
update sales set info = info || '{"country": "Canada"}';
Update by removing a key:
Use the -
operator to remove a key from the document.
update sales set info = info - 'country';
3. Querying the JSON document
PostgreSQL has two native operators ->
and ->>
to query JSON documents. The first operator ->
returns a JSON object, while the operator ->>
returns text. These operators work on both JSON
as well as JSONB
columns.
Because the ->
operator returns an object, you can chain it to inspect deep into a JSON document. For example,
select '{"name": "Alice", "agent": {"bot": true} }'::jsonb -> 'agent' -> 'bot';
-- returns true
There are additional operators available for JSONB
columns. Of these, @>
, ?
, ||
and -
are the most interesting. We've already seen the ||
and -
operators in use in the prior sections.
3a. JSONB Containment with @>
The containment operator @>
tests whether one document contains another.
select '{"name": "Alice", "agent": {"bot": true} }'::jsonb @> '{"agent": {"bot": false}}';
-- returns false
select '{"name": "Alice", "agent": {"bot": true} }'::jsonb @> '{"agent": {"bot": true}}';
-- return true
3b. JSONB Existence with ?
You can use the ->>
operator to return text and PostgreSQL's standard filtering operations on strings like =
, <>
, is null
etc. For example,
select '{"name": "Alice", "agent": {"bot": true} }'::jsonb -> 'agent' ->> 'bot' is not null;
Alternatively, you can use the JSONB
existence operator ?
to check if a string exists as a top-level key.
select '{"name": "Alice", "agent": {"bot": true} }'::jsonb -> 'agent' ? 'bot';
4. JSONB functions
There are four (out of many) interesting functions that operate on JSONB
we want to highlight. These functions have the json_
variant that works on JSON
columns.
4a. jsonb_each
Expands the top-level JSON document into a set of key-value pairs.
select jsonb_each( '{"name": "Alice", "agent": {"bot": true} }'::jsonb );
jsonb_each | |
---|---|
1 | (name,"Alice") |
2 | (agent,"{"bot": true}") |
4b. jsonb_object_keys
Returns the keys of the top-level JSON document.
select jsonb_object_keys( '{"name": "Alice", "agent": {"bot": true} }'::jsonb );
jsonb_object_keys | |
---|---|
1 | name |
2 | agent |
4c. jsonb_extract_path
Returns a JSON object that is traversed by a "path".
select jsonb_extract_path( '{"name": "Alice", "agent": {"bot": true} }'::jsonb, 'agent', 'bot');
jsonb_extract_path | |
---|---|
1 | true |
4d. jsonb_pretty
By default, PostgreSQL returns a compact representation which works for machine consumption. If you want your JSON documents pretty printed for human consumption, use this function:
select jsonb_pretty( '{"name": "Alice", "agent": {"bot": true} }'::jsonb );
-- returns the following
{
"name": "Alice",
"agent": {
"bot": true
}
}
No spam, ever! Unsubscribe any time. See past emails here.