Converting JSON Data with jq
Introduction
When working with large JSON files, it can be difficult to find and process the information you need. Copying values by hand and totaling them is time consuming and error prone. Tools such as sed, awk, and grep are useful on Linux systems, but they are line-oriented text tools and are poorly suited to structured data such as JSON.
jq is a powerful command-line tool for processing JSON data. It is used to filter, transform, and analyze data in shell scripts, AI workflows, and DevOps processes. For example:
- Pulling specific information from the JSON response of an API,
- Selecting certain values in Kubernetes outputs,
- Cleaning and converting JSON in data engineering processes.
Modern Usage Areas
- AI integration: jq simplifies the data pre-processing step by filtering and transforming JSON data for machine learning models.
- Performance: jq is written in C and is fast, so it can handle large JSON inputs efficiently. With the --stream option, it can work through large data sets without loading the whole document into memory.
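As a rough illustration of what streaming looks like, the sketch below runs a tiny inline document through jq with --stream. The sample data and variable names are assumptions made up for this example. Instead of one parsed document, jq emits a series of [path, value] events, and select(length == 2) keeps only the events that carry a leaf value.

```shell
# Tiny inline document standing in for a large file (sample data is an assumption).
json='[{"name":"Rocky","clams":5}]'

# --stream emits [path, leaf] events as the input is parsed.
# Closing events contain only a path (length 1) and are dropped by select.
events=$(printf '%s' "$json" | jq -c --stream 'select(length == 2)')
printf '%s\n' "$events"
```

Each printed line pairs the path to a value (array index, then key) with the value itself, which is what lets a filter react to one value at a time.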
In this guide, you'll learn how to convert a sample JSON file with jq, restructure data with filters, and integrate jq into modern AI and DevOps workflows.
Highlights
In this guide you will learn:
- Basic jq Operations: JSON data filtering, map usage and conversion techniques
- Performance Improvement: Processing very large (5GB+) JSON files with stream and memory friendly methods
- AI Integration: Use of jq in machine learning data preprocessing and API outputs
- Modern Workflows: jq applications in Kubernetes, CI/CD and microservices environments
- Advanced Techniques: Nested JSON data processing, conditional filters and error handling
Prerequisites
To complete this tutorial, you need:
- jq: Tool used to read and transform JSON data. It is available in the package repositories of most Linux distributions.
Installation for Ubuntu:
sudo apt install jq
Step 1 — Running the First jq Command
In this step, you will create the sample JSON file and test it with the jq command. jq can receive data both from a file and through a pipe; we will use a file here.
First, create and open a file named seaCreatures.json (using nano as an example):
nano seaCreatures.json
Copy the following JSON data into the file:
[
{ "name": "Rocky", "type": "seal", "clams": 5 },
{ "name": "Coral", "type": "seal", "clams": 3 },
{ "name": "Finny", "type": "whale", "clams": 2 },
{ "name": "Pearl", "type": "seahorse", "clams": 2 }
]
We will work with this data throughout the tutorial. By the end, you will be able to answer the following questions with one-line jq commands:
- What are the names of the sea creatures, as a list?
- How many clams do all the creatures have in total?
- And the most interesting part: how many of those clams belong only to seals?
In short, jq will give you the answer you want in a few seconds.
Save and close the file. ✅
Now we have an input file, but that is not enough on its own. When working with jq, you also need to define a filter, which tells jq how to transform the data.
The simplest filter is the dot (.). It is called the identity operator, and what it does is very simple: it takes the JSON data as it is and outputs it without changing anything. It is a kind of "pass without touching" filter.
You can use the identity operator (.) first to test whether the installation works correctly.
If the command below reports an error, there is most likely an invalid JSON structure in the seaCreatures.json file. Check the file.
Now run this command:
jq '.' seaCreatures.json
When using jq with files, you always specify the filter first and then the input file.
Since filters may contain spaces or characters that have special meaning to the shell, it is good practice to enclose the filter in single quotes ('). That way the shell passes it to jq unchanged, as a single argument.
Don't worry: running jq doesn't change your file, it just shows the output.
The output of the command will look like this:
Output
[
{
"name": "Rocky",
"type": "seal",
"clams": 5
},
{
"name": "Coral",
"type": "seal",
"clams": 3
},
{
"name": "Finny",
"type": "whale",
"clams": 2
},
{
"name": "Pearl",
"type": "seahorse",
"clams": 2
}
]
By default, jq pretty-prints its output.
In other words, it shows the data indented, line by line, and in color if possible. This makes it easier to read, especially when examining JSON output from other tools.
For example, when you fetch JSON from an API with curl, you can make the output more readable by piping it to jq '.'.
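The piping idea can be sketched without network access. In the example below, printf stands in for a tool such as curl, and the inline JSON and variable names are assumptions for illustration only. It also shows the -c option, which does the reverse of pretty-printing.

```shell
# printf stands in for an API call such as curl (no network access is assumed).
compact='[{"name":"Rocky","type":"seal","clams":5}]'

# With no file argument, jq reads JSON from standard input and pretty-prints it.
pretty=$(printf '%s' "$compact" | jq '.')
printf '%s\n' "$pretty"

# -c prints each result on a single compact line instead.
roundtrip=$(printf '%s' "$pretty" | jq -c '.')
printf '%s\n' "$roundtrip"
```

The data is identical in both forms; only the whitespace changes.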
Now jq works properly. ✅
Now that your input file is ready, in the next steps we will transform the data with different filters and obtain the following three pieces of information:
- creatures → list of creature names
- totalClams → total number of clams
- totalSealClams → clams owned by seals
In the next step, we will first extract the creatures value.
Step 2 — Fetching the creatures value
In this step, we will extract the names of all sea creatures.
The result will become the creatures field: a list containing the name of each creature.
At the end of the step, you will see the following list:
Output
[
"Rocky",
"Coral",
"Finny",
"Pearl"
]
To get this list, we need to extract the names of the creatures and collect them into an array.
To do this, we will narrow the filter: from all the values, we will select only the name fields.
Since we are working on an array, we need to tell jq to look at the elements one by one. The array value iterator, .[], does this.
Now run this command:
jq '.[]' seaCreatures.json
Now each element of the array is output separately, as its own JSON object:
Output
{
"name": "Rocky",
"type": "seal",
"clams": 5
}
{
"name": "Coral",
"type": "seal",
"clams": 3
}
{
"name": "Finny",
"type": "whale",
"clams": 2
}
{
"name": "Pearl",
"type": "seahorse",
"clams": 2
}
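To make the iterator's behavior concrete, the following sketch compares .[] with an index such as .[0]. The shortened inline data and the variable names are assumptions for illustration.

```shell
# Shortened sample data (an assumption; not the tutorial file).
json='[{"name":"Rocky"},{"name":"Coral"}]'

# .[] emits every element as its own result (two results here).
all=$(printf '%s' "$json" | jq -c '.[]')

# .[0] emits only the element at index 0.
first=$(printf '%s' "$json" | jq -c '.[0]')

printf '%s\n' "$all" "$first"
```

The iterator is the right choice when the next filter should run once per element, which is what the following commands rely on.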
Instead of printing each element as it is, we want only the name field.
For this we use the pipe operator (|), which sends each output of one filter to the next filter.
If you have used find | xargs on the command line, the same logic applies here.
You can access the name property of a JSON object by writing .name.
Run the following command, which uses the pipe:
jq '.[] | .name' seaCreatures.json
Now only the name values remain; all other properties have disappeared from the output:
Output
"Rocky"
"Coral"
"Finny"
"Pearl"
jq produces valid JSON output by default. That's why string values always appear within double quotes (").
If you don't want the quotes, add the -r option for raw output:
jq -r '.[] | .name' seaCreatures.json
The quotation marks have disappeared:
Output
Rocky
Coral
Finny
Pearl
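Raw output is mostly useful when the values go on to other shell commands. The sketch below, with assumed inline data and a made-up greeting, reads each name in a while loop. It also uses .[].name, which is shorthand for .[] | .name.

```shell
# Shortened sample data (an assumption; not the tutorial file).
json='[{"name":"Rocky"},{"name":"Coral"}]'

# -r strips the JSON quotes, so each name arrives as a plain line of text.
greetings=$(printf '%s' "$json" | jq -r '.[].name' | while IFS= read -r name; do
  printf 'Hello, %s\n' "$name"
done)
printf '%s\n' "$greetings"
```

Without -r, each name would reach the loop still wrapped in double quotes.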
Step 3 — Calculating the totalClams Value (with map and add)
In this step, we will find out how many clams the creatures have in total.
There is no need for manual calculation; it is much faster and less error prone with jq. The result will be 12.
1. Extracting the clams values
First, let's list each creature's clams value:
jq '.[] | .clams' seaCreatures.json
Output
5
3
2
2
2. Collecting the values into an array
To add the values, we first need to put them into an array:
jq '[.[] | .clams]' seaCreatures.json
Output
[
5,
3,
2,
2
]
Before applying the add filter, we can use the map function to make the command more readable.
map applies the given filter to each element of the input array and collects the results into a new array.
For example:
echo '[{"name":"Rocky"},{"name":"Coral"}]' | jq 'map(.name)'
Rewrite the filter to use the map function directly and run the command:
jq 'map(.clams)' seaCreatures.json
You will get the same output you got before:
Output
[
5,
3,
2,
2
]
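The equivalence between the two spellings can be checked directly. This sketch uses shortened inline data and assumed variable names, and runs both forms on the same input.

```shell
# Shortened sample data (an assumption; not the tutorial file).
json='[{"clams":5},{"clams":3}]'

# map(f) is a more readable way of writing [.[] | f].
with_map=$(printf '%s' "$json" | jq -c 'map(.clams)')
with_iterator=$(printf '%s' "$json" | jq -c '[.[] | .clams]')

printf '%s\n%s\n' "$with_map" "$with_iterator"
```

Both commands print the same array, so the choice between them is a matter of readability.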
Now that we have an array, we can sum it by piping it to the add filter:
jq 'map(.clams) | add' seaCreatures.json
The output of the command will be the sum of the array:
Output
12
With this filter, we calculated the total number of clams (12). We will use it later to create the totalClams value.
So far we have answered two of the three questions. We need to write just one more filter, and then we will be able to produce the final output.
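It can help to know how add behaves on other inputs. The sketch below uses -n, which tells jq to run the filter without reading any input, together with a few made-up arrays; the variable names are assumptions.

```shell
# -n runs the filter without reading input, so the arrays are written inline.
sum=$(jq -n '[5, 3, 2, 2] | add')            # numbers are summed
joined=$(jq -n -r '["sea", "horse"] | add')  # strings are concatenated
nothing=$(jq -n '[] | add')                  # an empty array yields null

printf '%s\n' "$sum" "$joined" "$nothing"
```

The null result for an empty array is worth remembering when a filter might select no elements at all.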
Step 4 — Calculating the totalSealClams Value (with select and add)
We now know the total number of clams. Next, we will find out how many of those clams the seals have.
To do this, we will select only those elements in the array that meet a certain condition (type equal to "seal") and add their clams values.
The result will be 8. This value will later be used in the totalSealClams field.
For this we use the select function: select(condition).
- If the condition is true, the value is passed through.
- If it is false, the value is dropped.
You can combine select with map to apply it to every element in an array.
Values that do not meet the condition are eliminated, leaving only the items you want.
In our case, we will keep only the elements whose type field is "seal".
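The condition does not have to be a string comparison. As a small sketch with assumed inline data, the filter below keeps only the creatures with more than two clams and then extracts their names.

```shell
# Shortened sample data (an assumption; not the tutorial file).
json='[{"name":"Rocky","clams":5},{"name":"Finny","clams":2}]'

# select passes an element through only when the condition is true;
# the pipe inside map then extracts the name of each surviving element.
names=$(printf '%s' "$json" | jq -c 'map(select(.clams > 2) | .name)')
printf '%s\n' "$names"
```

Finny fails the condition, so select emits nothing for that element and it never reaches .name.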
The filter we will use is as follows:
jq 'map(select(.type == "seal"))' seaCreatures.json
This filter matches the two seals, Rocky and Coral.
It finds no match for Finny (whale) or Pearl (seahorse), so they are dropped.
Output
[
{
"name": "Rocky",
"type": "seal",
"clams": 5
},
{
"name": "Coral",
"type": "seal",
"clams": 3
}
]
This output contains each seal's clams value as well as its other fields.
To get just the clam counts, add the field name after select, inside the map function:
jq 'map(select(.type == "seal").clams)' seaCreatures.json
The map function applies the given filter to each element in the array.
That's why select runs four times: once for each creature.
It produces output for the two seals that meet the condition and discards the others.
The end result is an array containing only the seals' clams values:
Output
[
5,
3
]
Now let's sum these array values by piping them to the add filter:
jq 'map(select(.type == "seal").clams) | add' seaCreatures.json
The output of this command is the total number of clams owned by creatures of type seal only:
Output
8
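If the creature type should come from a shell variable rather than being written into the filter, jq's --arg option can pass it in. The following is a sketch under assumptions: the inline data, the creature_type variable, and the $t name are all made up for illustration.

```shell
# Shortened sample data (an assumption; not the tutorial file).
json='[{"type":"seal","clams":5},{"type":"seal","clams":3},{"type":"whale","clams":2}]'
creature_type="seal"

# --arg t VALUE makes VALUE available inside the filter as the string $t.
total=$(printf '%s' "$json" | jq --arg t "$creature_type" 'map(select(.type == $t).clams) | add')
printf '%s\n' "$total"   # prints 8 for this inline data
```

Passing the value with --arg avoids building the filter by string concatenation, which breaks easily when the value contains quotes.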
Congratulations 🎉
Here, using map and select, we:
- Iterated over the array,
- Selected only the items that meet the condition (type "seal"),
- Extracted their clams values, and
- Added up the result.
With this method, we calculated the totalSealClams value.
In the next step, we will add this value to the final output. 🚀
Step 5 — Transforming the Data into a New Structure
In the previous steps, we processed the data piece by piece by writing filters.
Now we will combine these filters and create a single JSON output.
In this output, we will see the answers to the following questions:
- What are the names of the sea creatures?
- How many clams do the creatures have in total?
- How many of these clams belong to seals?
As a reminder:
- We used map(.name) to find the list of names.
- We used map(.clams) | add to find the total number of clams.
- We used map(select(.type == "seal").clams) | add to calculate the seals' clams.
Now we will combine these three filters into a single jq command.
We will create a new JSON object and display the information we want in a single structure.
As a reminder, our starting JSON file looked like this:
[
{ "name": "Rocky", "type": "seal", "clams": 5 },
{ "name": "Coral", "type": "seal", "clams": 3 },
{ "name": "Finny", "type": "whale", "clams": 2 },
{ "name": "Pearl", "type": "seahorse", "clams": 2 }
]
When we combine the filters, the transformed JSON output will look like this:
Final Output
{
"creatures": [
"Rocky",
"Coral",
"Finny",
"Pearl"
],
"totalClams": 12,
"totalSealClams": 8
}
Here is a skeleton showing the overall shape of the jq command (with placeholder values for now):
jq '{ creatures: [], totalClams: 0, totalSealClams: 0 }' seaCreatures.json
With this filter, we create a JSON object with three fields:
- creatures → list of creature names
- totalClams → the sum of clams owned by all creatures
- totalSealClams → the sum of clams owned by seals alone
Output
{
"creatures": [],
"totalClams": 0,
"totalSealClams": 0
}
This JSON structure has the shape of the final output, but its values are not correct yet.
They were not taken from the seaCreatures.json file; they were written by hand.
Now let's replace these constant values with the filters we created in the previous steps:
jq '{ creatures: map(.name), totalClams: map(.clams) | add, totalSealClams: map(select(.type == "seal").clams) | add }' seaCreatures.json
The filter above makes jq do the following:
- creatures → list of the name values of all creatures
- totalClams → sum of the clams values of all creatures
- totalSealClams → sum of the clams values of only the creatures with type = "seal"
When you run the command you get this output:
Output
{
"creatures": [
"Rocky",
"Coral",
"Finny",
"Pearl"
],
"totalClams": 12,
"totalSealClams": 8
}
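Longer filters are easier to read when spread over several lines with each piped value in parentheses. The sketch below also shows one way to handle a total that matches nothing: the alternative operator // supplies a fallback when add returns null. The inline data and the totalWhaleClams field are hypothetical and exist only for this example.

```shell
# Shortened sample data with no whales in it (an assumption for illustration).
json='[{"name":"Rocky","type":"seal","clams":5},{"name":"Coral","type":"seal","clams":3}]'

# No element has type "whale", so map(...) yields [] and add yields null.
# "add // 0" replaces that null with 0.
summary=$(printf '%s' "$json" | jq -c '{
  creatures: map(.name),
  totalClams: (map(.clams) | add),
  totalWhaleClams: (map(select(.type == "whale").clams) | add // 0)
}')
printf '%s\n' "$summary"
```

Whether a missing total should be reported as 0 or left as null depends on what consumes the output, so treat the fallback as a design choice rather than a rule.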
Conclusion
jq lets you easily perform transformations on JSON data that are difficult to do with text tools such as sed.
In this tutorial, we learned how to:
- Filter data with select
- Transform array elements with map
- Sum numbers with add
- Create new JSON structures by combining these transformations
In short, jq is a powerful tool that makes working with JSON data much easier. 🚀

