Handling JSON Data in Bash using jq

OSSPH

11 Sep 2022 • 4 min read

Somewhere along your DevOps journey, you may need to handle JSON data in bash as part of your automation, whether for periodically checking a JSON API or for interpreting JSON data.

Before you bust out your grepping chops to parse whatever you need from the JSON output, you may want to check out jq.

As mentioned on its website (https://stedolan.github.io/jq/), jq is like sed for JSON data: you can query it, filter it according to your needs, and even generate a new JSON document or dataset based on the output.

Installing jq

Installation is simply a matter of either downloading the binary from the website (https://stedolan.github.io/jq/) or, if you are using Linux, installing it through your package manager (yum, apt-get, or dnf).
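As a rough sketch of the package-manager route, the helper below only prints the command you would likely run on the current machine; it does not install anything. The function name suggest_jq_install is made up for illustration, and it assumes the package is simply called jq, which is the case on the mainstream distributions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a likely install command for jq.
# It only prints a suggestion and never installs anything itself.
suggest_jq_install() {
  if command -v apt-get >/dev/null 2>&1; then
    echo "sudo apt-get install -y jq"
  elif command -v dnf >/dev/null 2>&1; then
    echo "sudo dnf install -y jq"
  elif command -v yum >/dev/null 2>&1; then
    echo "sudo yum install -y jq"
  else
    echo "download the jq binary from https://stedolan.github.io/jq/"
  fi
}

# Report the installed version, or suggest how to get jq.
if command -v jq >/dev/null 2>&1; then
  echo "jq is already installed: $(jq --version)"
else
  echo "jq not found, try: $(suggest_jq_install)"
fi
```

Running jq --version afterwards is a quick way to confirm that the binary is on your PATH.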

How does it work

Usage:

jq [options] <filter> [file...]
jq [options] --args <filter> [strings...]
jq [options] --jsonargs <filter> [JSON_TEXTS...]

A JSON document or dataset is taken as input and passed to the specified filter. If the filter contains pipes, the result of each stage is passed on to the next one, and the final result is printed to the console.

The filter determines which parts of the JSON input are printed to the console (or written to a file if you redirect with > or >> in your shell).
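To make the input and output side concrete, here is a minimal sketch that feeds jq through standard input and redirects the results to a file. The sample JSON and the file name status.txt are invented for this example.

```shell
#!/usr/bin/env bash
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)

# jq reads standard input when no file is given; > creates the file.
echo '{"status": "ok", "code": 200}' | jq '.status' > "$workdir/status.txt"

# >> appends a second result to the same file.
echo '{"status": "ok", "code": 200}' | jq '.code' >> "$workdir/status.txt"

# Show what ended up in the file.
cat "$workdir/status.txt"
```

The file ends up with one line per result: the quoted string "ok" followed by the number 200.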

The filter follows any of the following syntaxes:
<path>
<path> | field [, field2, ...]
<path> | query or function or field [| query or function or field, ...]

A filter containing only a path outputs just the value(s) found at that path in the JSON data.

Adding a pipe in the filter indicates a multi-stage operation, where the output from the previous stage is passed on to the next stage for further filtering or transformation.
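The multi-stage idea can be sketched with a small inline document. The JSON below and the variable names are assumptions made up for this example; length is one of jq's builtin functions.

```shell
#!/usr/bin/env bash
# Sample input invented for this sketch.
json='{"user": {"name": "ada", "langs": ["bash", "jq"]}}'

# Stage 1 selects the object under .user, stage 2 picks its langs array,
# stage 3 counts the elements with the builtin length.
lang_count=$(echo "$json" | jq '.user | .langs | length')
echo "$lang_count"

# A comma inside a stage emits several results from the same input.
echo "$json" | jq '.user | .name, .langs[0]'
```

The first command prints 2; the second prints "ada" and "bash" on separate lines, because the comma binds tighter than the pipe.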

Basic usage

Consider the following JSON document named doc.json:

{
  "userId": 1,
  "id": 1,
  "title": "delectus aut autem",
  "completed": false,  
  "nested": {
    "pages": 333,
    "format": "paperback"
  }
}

For basic usage, you can use the following syntax:
jq [options] '<filter>' [file...]

To filter your JSON input for specific fields, you can use any of the following formats:
.<fieldName>
.<parentName>.<childName>
.<field1>, .<field2>

In the provided JSON example, you can query for the format field by running the following:

user@user1:~ $ jq '.nested.format' doc.json
"paperback"
user@user1:~ $
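The other formats work the same way. The sketch below recreates doc.json in a temporary directory so it can run on its own, then shows the comma form and the -r (raw output) option, which is usually what you want when a value goes into a shell variable. The variable names are chosen for illustration.

```shell
#!/usr/bin/env bash
# Recreate doc.json in a temporary directory so the snippet runs on its own.
workdir=$(mktemp -d)
cat > "$workdir/doc.json" <<'EOF'
{
  "userId": 1,
  "id": 1,
  "title": "delectus aut autem",
  "completed": false,
  "nested": { "pages": 333, "format": "paperback" }
}
EOF

# Two fields at once: each result is printed on its own line.
jq '.id, .title' "$workdir/doc.json"

# -r (raw output) drops the surrounding quotes from string results.
format=$(jq -r '.nested.format' "$workdir/doc.json")
echo "format is $format"
```

Without -r, the variable would hold the value including its double quotes, which tends to cause subtle bugs in later string comparisons.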

Using functions

jq provides several builtin functions that you can use in the filter section of the command. For example, you can get the title from doc.json in upper case like this:

user@user1:~ $ jq '.title | ascii_upcase' doc.json
"DELECTUS AUT AUTEM"
user@user1:~ $
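A couple of other builtins are sketched below against the same document: length and keys. The snippet recreates doc.json in a temporary directory so that it is self-contained, and the variable names are only illustrative.

```shell
#!/usr/bin/env bash
# Recreate doc.json in a temporary directory so the snippet runs on its own.
workdir=$(mktemp -d)
cat > "$workdir/doc.json" <<'EOF'
{
  "userId": 1,
  "id": 1,
  "title": "delectus aut autem",
  "completed": false,
  "nested": { "pages": 333, "format": "paperback" }
}
EOF

# length on a string returns its number of characters.
title_length=$(jq '.title | length' "$workdir/doc.json")
echo "$title_length"

# keys returns the sorted field names of an object; -c prints compact JSON.
nested_keys=$(jq -c '.nested | keys' "$workdir/doc.json")
echo "$nested_keys"
```

This prints 18 and then ["format","pages"].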

Querying a JSON dataset

When filtering arrays in a JSON document or dataset, you need to add brackets [] to the path to indicate where the array(s) are.

Consider the following JSON dataset named data.json:

[
  {
    "_id": "631b2b18636523411d5c3e58",
    "name": "Larson Woods",
    "email": "larsonwoods@entality.com",
    "phone": "+1 (804) 475-2504"
  },
  {
    "_id": "631b2b18f7e40a6b6fb0f162",
    "name": "Travis Palmer",
    "email": "travispalmer@entality.com",
    "phone": "+1 (879) 588-3222"
  },
  {
    "_id": "631b2b18b5e59d85d838ff0a",
    "name": "Mccoy Faulkner",
    "email": "mccoyfaulkner@entality.com",
    "phone": "+1 (869) 475-3006"
  }
]

And the JSON dataset named data2.json:

[
  {
    "_id": "631b2f7e2bcd3b839ebe610d",
    "name": "Adriana Humphrey",
    "friends": [
      {
        "id": 0,
        "name": "Charles Rivers"
      }
    ]
  },
  {
    "_id": "631b2f7e9ba81d773d7a001e",
    "name": "Herring Mann",
    "friends": [
      {
        "id": 0,
        "name": "Kathleen Sanford"
      },
      {
        "id": 1,
        "name": "Willie Koch"
      }
    ]
  }
]

To fetch all the names in the dataset data.json:

user@user1:~ $ jq '.[] | .name' data.json
"Larson Woods"
"Travis Palmer"
"Mccoy Faulkner"
user@user1:~ $

Since the array is in the root of the document, the path used is .[].
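In a bash script you will often want to act on each result. One possible pattern, sketched below, is to read jq's raw output line by line. The snippet uses a trimmed copy of data.json (only _id and name) written to a temporary directory, and the loop body is just a placeholder.

```shell
#!/usr/bin/env bash
# Trimmed copy of data.json so the snippet runs on its own.
workdir=$(mktemp -d)
cat > "$workdir/data.json" <<'EOF'
[
  {"_id": "631b2b18636523411d5c3e58", "name": "Larson Woods"},
  {"_id": "631b2b18f7e40a6b6fb0f162", "name": "Travis Palmer"},
  {"_id": "631b2b18b5e59d85d838ff0a", "name": "Mccoy Faulkner"}
]
EOF

# .[].name is shorthand for .[] | .name; -r prints one unquoted name per line.
names=$(jq -r '.[].name' "$workdir/data.json")

# Read the names line by line; the here-document keeps the loop in the
# current shell, so count is still set after the loop ends.
count=0
while IFS= read -r name; do
  count=$((count + 1))
  echo "user $count: $name"
done <<EOF
$names
EOF
```

Reading with IFS= and read -r keeps names that contain spaces or backslashes intact.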

To query all the friends in data2.json:

user@user1:~ $ jq '.[].friends[] | .name' data2.json
"Charles Rivers"
"Kathleen Sanford"
"Willie Koch"
user@user1:~ $

Each document in the main array contains a field friends that is also an array, so the path used is .[].friends[].
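Nested arrays can also be summarized per document instead of being flattened. The sketch below, run against a copy of data2.json in a temporary directory, uses jq's string interpolation "\( ... )" together with length to print how many friends each person has. The variable name summary is arbitrary.

```shell
#!/usr/bin/env bash
# Copy of data2.json so the snippet runs on its own.
workdir=$(mktemp -d)
cat > "$workdir/data2.json" <<'EOF'
[
  {
    "_id": "631b2f7e2bcd3b839ebe610d",
    "name": "Adriana Humphrey",
    "friends": [ {"id": 0, "name": "Charles Rivers"} ]
  },
  {
    "_id": "631b2f7e9ba81d773d7a001e",
    "name": "Herring Mann",
    "friends": [
      {"id": 0, "name": "Kathleen Sanford"},
      {"id": 1, "name": "Willie Koch"}
    ]
  }
]
EOF

# For each top-level document, build "name: number of friends".
summary=$(jq -r '.[] | "\(.name): \(.friends | length)"' "$workdir/data2.json")
echo "$summary"
```

This prints "Adriana Humphrey: 1" and "Herring Mann: 2" on separate lines, without the quotes because of -r.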

To query for any document from the dataset matching a specified condition, you can add your query condition in the filter string, as seen in the following example:

user@user1:~ $ jq '.[] | select(."_id" == "631b2b18b5e59d85d838ff0a")' data.json
{
  "_id": "631b2b18b5e59d85d838ff0a",
  "name": "Mccoy Faulkner",
  "email": "mccoyfaulkner@entality.com",
  "phone": "+1 (869) 475-3006"
}
user@user1:~ $

As seen in the command, each element of the array is passed on to the next stage of the filter, where select keeps only the documents whose _id field has the value "631b2b18b5e59d85d838ff0a". You can further whittle the output down to only the name of the matching document by running the following:

user@user1:~ $ jq '.[] | select(."_id" == "631b2b18b5e59d85d838ff0a") | .name' data.json
"Mccoy Faulkner"
user@user1:~ $
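In a script, the value to match usually lives in a shell variable. Rather than splicing it into the filter string, one option is jq's --arg option, which exposes it inside the filter as a jq variable. The sketch below uses a trimmed copy of data.json and an _id taken from that dataset; wanted_id is an illustrative name. It also shows -e, which makes jq's exit status reflect the result so it can drive an if statement.

```shell
#!/usr/bin/env bash
# Trimmed copy of data.json so the snippet runs on its own.
workdir=$(mktemp -d)
cat > "$workdir/data.json" <<'EOF'
[
  {"_id": "631b2b18636523411d5c3e58", "name": "Larson Woods"},
  {"_id": "631b2b18f7e40a6b6fb0f162", "name": "Travis Palmer"},
  {"_id": "631b2b18b5e59d85d838ff0a", "name": "Mccoy Faulkner"}
]
EOF

wanted_id="631b2b18b5e59d85d838ff0a"

# --arg id "$wanted_id" makes the shell value available as $id in the filter.
name=$(jq -r --arg id "$wanted_id" \
  '.[] | select(._id == $id) | .name' "$workdir/data.json")
echo "$name"

# -e sets the exit status from the last output: 0 for true, 1 for false/null.
if jq -e --arg id "$wanted_id" 'any(.[]; ._id == $id)' \
    "$workdir/data.json" >/dev/null; then
  echo "document $wanted_id exists"
fi
```

Passing the value with --arg avoids quoting problems when the shell variable contains quotes or other special characters.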

Resources

https://stedolan.github.io/jq/manual/

About the author

Geo Dela Paz is a technical writer at OSSPH and a site reliability engineer at IBSS Manila. Feel free to connect with Geo on GitHub and LinkedIn.