pkoppstein edited this page May 7, 2023 · 402 revisions

Frequently Asked Questions

FAQ and Wiki Authors

𝑸: Who can edit the wiki, including this FAQ?

A: Anyone with a GitHub account.

𝑸: Who wrote this FAQ?

A: Various contributors; see the page's history. Any user is welcome to add a question, even without an answer.

𝑸: Who wrote the rest of this wiki?

A: Mostly the authors/maintainers.

𝑸: Can I add to the jq Cookbook?

A: Absolutely, please do!

Installation

𝑸: How can I install jq with assertion-checking turned off? How can I improve jq's performance?

A: One way is to make jq from source after adding the -DNDEBUG flag to the DEFS variable in the Makefile. Be sure to run make clean before running make with this flag in effect. Note also that enhancements made to jq since the release of version 1.6 significantly improve performance in many cases.

𝑸: On Windows, how can I install a more recent version of jq than is available using choco or scoop, without having to build from source?

A: See the Installation page regarding homebrew and Appveyor.

𝑸: What versions of jq are available using 0install ("Zero Install")? On what platforms?

A: See https://apps.0install.net/utils/jq.xml

𝑸: I have just upgraded jq while retaining a previous version, which was working properly before the upgrade. Now, however, when running the previous version, I get the error message:

dyld: Library not loaded: /usr/local/opt/oniguruma/lib/libonig.4.dylib

What can I do?

A: ln -s /usr/local/opt/oniguruma/lib/libonig.5.dylib libonig.4.dylib

𝑸: What are the pre-requisites for compiling and installing jq from GitHub?

A: Building a jq executable from source requires a C development environment, preferably with a recent version of bison (3.0 or newer). If you cannot get a sufficiently recent bison, you can pass the --disable-maintainer-mode option to ./configure. See http://stedolan.github.io/jq/download for details. For regexp support, you will also need to have Oniguruma installed.

𝑸: How do I install oniguruma?

A: On a Mac, we recommend brew install oniguruma. For Linux, use your package manager to find oniguruma-dev or oniguruma-devel. (See also the next FAQ regarding the location of the libonig library.) All else failing, download oniguruma from https://github.com/kkos/oniguruma/archive/master.zip or https://web.archive.org/web/http://www.geocities.jp/kosako3/oniguruma/ and consult the INSTALL file. If you have a recipe to share, please add it to the Installation page on this wiki.

𝑸: When I run make, I get error messages such as "undefined reference to OnigSyntaxPerl_NG".

A: The Oniguruma library may have been installed in a directory that the standard jq installation process does not know about. For example, jq may be expecting the libonig library to be /usr/local/lib64/libonig.so.2 but it might instead be located in /usr/local/lib/. If this is the case, a simple workaround is to create symbolic links for all the libonig* files and then run ./configure again.

𝑸: Are there any complete recipes for installing jq from source?

A: See the main README for jq, and the Installation wiki page.

Troubleshooting

For installation-related questions, see the Installation section above.

𝑸: I have installed jq but am unable to run it. Why?

A: Try running jq --version giving the full path. In a Mac or Linux or similar environment, check that the binary file is executable.

𝑸: My JSON is valid but I am getting an error message beginning "jq: error:". What is wrong?

A: First check whether there is a file named .jq in your HOME directory (~/.jq). If there is, then see if the problem can be resolved by renaming it. After renaming it, you can check that it contains valid jq by running jq -n -f RENAMED_FILE.

𝑸: My file contains valid JSON, so why does jq give the error message: "parse error: Invalid numeric literal at EOF at line 1 ..."

A: The encoding of the file might not be valid for JSON text, which is defined as a sequence of Unicode code points. jq currently requires the text be encoded as UTF-8 (and therefore allows ASCII). Note that the default encoding used in PowerShell when redirecting terminal output to a file is UTF-16. If you need to convert from one encoding to another, consider using iconv, or if you are using Windows, try pasting your JSON into Notepad and saving the file as a UTF-8 file.

If your shell has the file command, then you can check the encoding of the file by running file MYFILE.json

Caveats

𝑸: Is . really a JSON pretty-printer?

A: Yes, but with the following caveats:

  • The regular jq parser in effect ignores all but the last occurrence of duplicate keys within an object.

  • All versions of jq through 1.6 convert JSON numbers to IEEE 754 64-bit values, so loss of precision and other changes can result. A commit dated Oct 21, 2019, ensures that the "external" format of numbers expressed without using exponential notation is generally preserved except for superfluous leading 0s (e.g. 000 is treated as 0). For example:

jqMaster -M -n '123456789123456789123456789123456789123456789123456789.0000|[.,tostring]'
123456789123456789123456789123456789123456789123456789.0000
"123456789123456789123456789123456789123456789123456789.0000"

𝑸: Can jq be used as a JSON validator?

A: Strictly speaking, no. Although jq is fairly strict about what it accepts as JSON, there is currently no "strict" mode, and jq will quietly accept some not-strictly-valid JSON texts, e.g. 00 is mapped to 0. See also the subsection on numbers below. However, jq can be very helpful in pinpointing discrepancies from JSON.
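
One illustration, as a minimal sketch with made-up input: an object with duplicate keys, which stricter tools may warn about, passes through jq silently, with only the last occurrence kept (the caveat noted under the pretty-printer question).

```shell
# Hypothetical sample input with a duplicated key "a".
dup=$(printf '%s' '{"a":1,"a":2}' | jq -c .)
printf '%s\n' "$dup"    # prints {"a":2}
```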

𝑸: Are there restrictions on variable names? Why does the use of $end result in a “syntax error”?

A: Reserved words such as end cannot be used as $-variable names. Sorry😱

For a full listing of keywords and other details, see Keywords.

𝑸: What are some regex-related discrepancies between jq and gojq?

A: jq's gsub basically applies sub iteratively, and jq's regex engine will sometimes match \n with $. These characteristics lead to some discrepancies with other regex engines. Here are two examples:

echo gsub:
for jq in jq gojq
do
    echo $jq : $($jq -n '"aaa" | gsub("^a"; "b")' )
done

echo test:

for jq in jq gojq
do
    echo $jq : $($jq -n '"abc\n" | test("c$")')
done

Output:

gsub:
jq : "bbb"
gojq : "baa"
test:
jq : true
gojq : false

Bugs

𝑸: Why does implode sometimes raise an assertion violation error?

A: implode relies on assertions to detect and report errors. For example:

jq -n '[-2] | implode'
Assertion failed: (codepoint >= 0 && codepoint <= 0x10FFFF), function jvp_utf8_encode, file src/jv_unicode.c, line 101.

𝑸: What can I do about bugs related to time zones?

A: strflocaltime in jq version 1.6 and earlier has silent bugs (see e.g. https://github.com/stedolan/jq/pull/2202). We suggest you check the results, e.g. using gojq, the Go implementation of jq, or use gojq instead for transformations involving "local time".

𝑸: What bugs were introduced in jq 1.6?

A: |= can produce incorrect results when used in conjunction with try ... catch ... or the postfix ? operator; for example:

  • {a:[1,2]} | .a |= try sort incorrectly evaluates to {}.

This bug affects map_values because it is implemented using |=; for example:

  • {a:0} | map_values(tostring?) incorrectly evaluates to {}.

This bug was fixed in November 2019, but if your builtin def of map_values is not working, you can simply redefine it in your program as: def map_values(f): with_entries(.value = (.value|f));

General Questions

𝑸: Where can I get additional help?

A:

𝑸: How can I access the value of a key with hyphens or $ or other special characters in it? Why does .a.["$"] produce a syntax error?

A: The basic form for accessing the value of a key is .["KEYNAME"] where "KEYNAME" is any valid JSON string, but recent versions of jq also allow ."KEYNAME".

Using the basic form might require explicit use of the pipe symbol, as in .["a-b"]|.["x-y"], but this can be abbreviated to .["a-b"]["x-y"] assuming that it is a terminating expression (i.e., the last component of a pipeline).

In fact, if the expression E | .[F] is valid and terminating, then it can be abbreviated to (E)[F], or even E[F] if E is sufficiently simple, e.g. if E is a jq identifier or a dot-separated string of such identifiers. A jq identifier is an alphanumeric string beginning with an alphabetic character, where "alphabetic character" here includes the underscore (_).

Applying these rules, it is apparent that .a | .["$"] can be abbreviated to .a["$"] provided it is a terminating expression. Since there is no rule allowing .a.["$"], it is syntactically invalid, at least in jq 1.6 and before.
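
The rules above can be tried out as follows. This is only a sketch; the sample object and its key names are invented for illustration.

```shell
# Hypothetical sample data: keys containing a hyphen and a dollar sign.
json='{"a-b":{"x-y":1},"a":{"$":2}}'

hyphen=$(printf '%s' "$json" | jq '.["a-b"]["x-y"]')   # abbreviation of .["a-b"] | .["x-y"]
dollar=$(printf '%s' "$json" | jq '.a["$"]')           # abbreviation of .a | .["$"]
printf '%s %s\n' "$hyphen" "$dollar"                   # prints 1 2
```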

𝑸: How can "in-place" editing of a JSON file be accomplished? What is jq's equivalent of sed -i?

A: Currently, jq does not have an option to edit a file "in-place" in the manner of the -i option of sed or ruby. There are several alternatives, but using tee or output redirection (>) to overwrite the input file is not recommended, even if it seems to work. (See e.g. http://askubuntu.com/questions/752174).

Here are two reasonable approaches:

(1) Use an explicit temporary file.

For example:

jq ... input.json > tmp.json && mv tmp.json input.json

A more elaborate variation might use mktemp and might check whether the temporary file is empty or identical to the source file.
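
One possible shape for such a variation is sketched below. The sample file contents and the filter '.count += 1' are placeholders, and the emptiness check is just one of the checks mentioned above.

```shell
#!/bin/sh
# Sketch only: the sample file and the filter '.count += 1' are placeholders.
input=$(mktemp) || exit 1                 # stands in for input.json
printf '%s\n' '{"count":1}' > "$input"

tmp=$(mktemp) || exit 1
# Overwrite the original only if jq succeeded and produced non-empty output.
if jq -c '.count += 1' "$input" > "$tmp" && [ -s "$tmp" ]
then
  mv "$tmp" "$input"
else
  rm -f "$tmp"
  echo "jq failed; $input left unchanged" >&2
fi
cat "$input"
```

Note that mktemp creates files with restrictive permissions, so after the mv the file's mode may differ from the original's.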

(2) Use a command-line utility.

For example:

If concurrency is an issue, you will probably want to use flock or chflags uchg.

𝑸: Given an array, A, and a filter, f, what is an efficient way to find the least integer, $i, such that (A[$i]|f) == $x?

A: With the following stream-oriented filter, you would write: index_by(A[]; f == $x) on the understanding that null is returned if there is no such $i.

# For compatibility with index/1, index_by returns null if f is always falsey.
def index_by(stream; f):
  label $out
  | foreach stream as $s (-1; .+1; if ($s|f) then ., break $out else empty end) // null;

Note that A | map(f) | index($x) might differ from index_by(A[]; f == $x) because f might not always evaluate to a single JSON entity at every item in the array A.

𝑸: Given an array, A, containing an item, X, how can I find the least index of X in A? Why does [[1]] | index([1]) return null rather than 0? Why does [1,2] | index([1,2]) return 0 rather than null?

A: The simplest uniform method for finding the least index of X in an array is to query for [X] rather than X itself, that is: index([X]).

By contrast, the filter index([1,2]) attempts to find [1,2] as a subsequence of contiguous items in the input array. This is for uniformity with the behavior of t | index(s) where s and t are strings.

If X is not an array, then index([X]) may be abbreviated to index(X).

Since index/1 is implemented in a computationally inefficient manner, writing index_by(A[]; . == X) may be preferable to writing A | index([X]) under certain circumstances.
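
The difference between querying for X and for [X] can be seen in this small sketch (the arrays are arbitrary examples):

```shell
wrapped=$(jq -n '[[1]] | index([[1]])')    # X is [1], so query for [X]
bare=$(jq -n '[[1]] | index([1])')         # searches for the item 1 instead
subseq=$(jq -n '[0,1,2] | index([1,2])')   # [1,2] as a contiguous subsequence
printf '%s %s %s\n' "$wrapped" "$bare" "$subseq"   # prints 0 null 1
```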

𝑸: Why does index return the wrong integer index when given a string with non-ASCII characters? Why does "”#a" | index("#a") yield 3 instead of 1?

A: index/1 is byte-oriented when processing strings. See the section "index/1 is byte-oriented but match/1 is codepoint-oriented" on the "How to: Avoid Pitfalls" wiki page.

𝑸: Which date-time functions are sensitive to environment variables?

A: strflocaltime and localtime in jq 1.6rc1 depend on the TZ (time-zone) environment variable, e.g.:

$ TZ=FR jq1.6 -cn 'now|localtime[:5]'
[2018,1,27,6,29]

$ TZ=EST jq1.6 -cn 'now|localtime[:5]'
[2018,1,27,1,29]

TZ=EST jq1.6 -cn 'now|strflocaltime("%Y-%m-%dT%H:%M:%S EST")'
"2018-02-27T01:34:59 EST"

TZ=Asia/Shanghai jq1.6 -nr '1543200371|strflocaltime("%Y-%m-%dT%H:%M:%S %Z")'
2018-11-26T10:46:11 CST

WARNING: strflocaltime in jq version 1.6 and earlier has silent bugs. We suggest you check any results involving strflocaltime, e.g. using gojq, the Go implementation of jq, or use gojq instead for such transformations.

𝑸: How can I "zip" two arrays together? Why doesn't jq have a "zip" function for zipping together two arrays?

A: Use transpose/0, which has more functionality than the typical "zip" function.
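
For instance (a sketch with arbitrary sample arrays), transpose pairs up corresponding items and pads the shorter array with null:

```shell
zipped=$(jq -nc '[[1,2,3],["a","b","c"]] | transpose')
uneven=$(jq -nc '[[1,2],["a"]] | transpose')
printf '%s\n' "$zipped"   # prints [[1,"a"],[2,"b"],[3,"c"]]
printf '%s\n' "$uneven"   # prints [[1,"a"],[2,null]]
```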

𝑸: How can a variable number of arguments be passed to jq? How can a bash array of values be passed in to jq as a single argument?

A: Here is an example showing how embedded spaces in the values can be handled in the context of a bash shell:

$ x=(1 "a b" 2)
$ jq -n --argjson args "$(printf '%s\n' "${x[@]}" | jq -nR '[inputs]')" '$args'
[
  "1",
  "a b",
  "2"
]

This approach is applicable so long as none of the values contains a newline character. To use NUL as the separator, consider:

$ jq -n --argjson args "$(printf '%s\0' "${x[@]}" | jq -Rsc 'split("\u0000")')" '$args'
[
  "1",
  "a b",
  "2"
]

As of February 25, 2017, a variable number of JSON arguments can be passed to jq on the command line using the "--args" and/or "--jsonargs" command-line options. See the manual for details.
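
A brief sketch of both options, assuming jq 1.6 or later, where the positional arguments are available as $ARGS.positional. In bash, the literal arguments shown here could be replaced by "${x[@]}".

```shell
# --args: every positional argument becomes a JSON string.
strings=$(jq -nc '$ARGS.positional' --args 1 "a b" 2)
# --jsonargs: every positional argument is parsed as a JSON text.
values=$(jq -nc '$ARGS.positional' --jsonargs 1 '"a b"' '{"k":2}')
printf '%s\n' "$strings"   # prints ["1","a b","2"]
printf '%s\n' "$values"    # prints [1,"a b",{"k":2}]
```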

𝑸: Is jq's sort stable?

A: As of January 18, 2016 (7835a72), the builtin sort filter is stable; prior to that, stability was platform-dependent. This means that stability is NOT guaranteed in jq 1.5 or earlier.

𝑸: How can a stream of JSON entities be collected together in an array?

A: For streams generated within a jq program, one approach is simply to wrap the generator within square brackets, e.g. [range(0;10)]. Another option is to use reduce, e.g. reduce range(0;10) as $i ([]; . + [$i]).

For an external stream of JSON entities (e.g. in a file or from an invocation of curl), use the -s (--slurp) command-line option if you are using jq 1.4. For example, the following jq command will emit an array consisting of the input entities:

jq -s .

jq 1.5 includes the streaming filter inputs, which would normally be used in conjunction with the -n option, as in these examples, which produce the same result, namely [1,2]:

$ (echo 1; echo 2) | jq -nc '[inputs]'

$ (echo 1; echo 2) | jq -nc 'reduce inputs as $row ([]; . + [$row])'

𝑸: What is the equivalent of XPath's // expression? How can I find the value of a given key, no matter how deeply nested the object is? How can I find the path to a slot?

A: XPath's // expression allows one to select nodes in an XML document, no matter where they are. For example, //book selects all "book" elements.

The corresponding expression in jq is .., which yields a stream; it is typically used with the ? operator and the empty filter as in this example:

$ jq -nc '[{},{"book":10}] | .. | .book? // empty'
10

Similarly, the jq expression for finding all the paths to "book" nodes is path(.. | .book? // empty); for example:

$ jq -nc '[{},{"book":1}] | path(.. | .book? // empty)'
[1,"book"]

To find all the paths to objects which have a key named "book":

$ jq -nc '[{},{"book":1}] | path(.. | select(type == "object" and has("book")))'
[1]

𝑸: How to extract parts of JSON into shell variables?

A: jq has a way to format text in a shell-safe way. For example, this:

$ eval "$(jq -r '@sh "a=\(.a) b=\(.b)"' sample.json)"

sets shell variables $a and $b to the .a and .b values in the input, assuming these values are atomic (i.e., neither arrays nor objects). To avoid using eval, consider using a bash array, e.g.:


$ data=( $(jq -n '"a\tb","c"| @sh' )  )
$ echo "${data[0]}"
"'a\tb'"

See also the next Q.

𝑸: How can a stream of JSON texts produced by jq be converted into a bash array of corresponding values?

A: One option would be to use mapfile (aka readarray), for example:

mapfile -t array < <(jq -c '.[]' input.json)

An alternative that might be indicative of what to do in other shells is to use read -r within a while loop. The following bash script populates an array, x, with JSON texts. The key points are the use of the -c option, and the use of the bash idiom while read -r value; do ... done < <(jq .......):

#!/bin/bash
x=()
while read -r value
do
  x+=("$value")
done < <(jq -c '.[]' input.json)

𝑸: How can environment variables be passed to a jq program? How can a jq program be parameterized?

A: (1) In jq version 1.4, the primary mechanisms for passing in parameters and/or environment variables are the --arg and --argfile command-line options, e.g. at a Mac or Linux prompt:

$ jq -n -r --arg x abc '$x, ("def" as $x | $x), $x' 

will emit:

abc
def
abc

Note that values passed in this manner are always strings. Recent versions of jq also have the --argjson option. See the jq manual for further options and details.

(2) Careful use of quotation marks can also be helpful, e.g.

$ hello="Goodbye"; jq -n '"He said '"$hello"'!"'

(See the Windows section below regarding quotation marks at a Windows command-line prompt.)

(3) In a shell script, cat << EOF and/or cat << 'EOF' can be helpful.
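
As a sketch of the here-document approach (the variable names are illustrative), a quoted 'EOF' keeps the shell from touching the jq program, and the shell value is passed in with --arg:

```shell
hello="Goodbye"
# Quoting 'EOF' stops the shell from expanding $hello inside the jq program;
# the value reaches jq through --arg instead.
program=$(cat << 'EOF'
"He said \($hello)!"
EOF
)
said=$(jq -n -r --arg hello "$hello" "$program")
printf '%s\n' "$said"   # prints He said Goodbye!
```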

(4) In sufficiently recent versions of jq (jq>1.4), exported environment variables can be read as illustrated by this snippet:

$ export hello="Goodbye"; jq -n 'env.hello'
"Goodbye"

𝑸: How can I sort an array of strings by length? How can I sort an array using multiple criteria?

A: The key to both questions is sort_by/1. For example, to sort an array of strings by their lengths, one could simply use sort_by(length).

To sort by multiple criteria, we use the fact that jq's sort sorts arrays lexicographically. This means we can simply provide the set of sorting criteria as an array. For example, suppose a triangle is represented by a triple [a, b, c] where each component is the length of one side, and that we wish to sort the triangles first by perimeter, and then by the length of the maximum side. The filter to use is: sort_by( [add, max] ).

For example:

$ jq -c -n '[ [3,4,5], [3,4,6], [3.5, 3.5, 5]] | sort_by( [add, max] )'
[[3,4,5],[3.5,3.5,5],[3,4,6]]

Note: The jq 1.4 reference manual deprecates sort_by/1 in favor of sort/1, but the deprecation has been retracted.

𝑸: How can I convert JSON-P (JSONP) to JSON using jq?

A: Assuming that the padding takes the form of a function call:

$ jq -R  'capture("\\((?<x>.*)\\)[^)]*$").x | fromjson'

or if your jq does not support regular expressions:

$ jq -R 'explode | .[1+index("("|explode): rindex(")"|explode)] | implode | fromjson'

At a Windows command-line prompt, one could put either of the above jq filters into a file and invoke jq with the -f option, or escape the quotation marks, e.g.:

C:\ jq -R  "match(\"\\((?<x>.*)\\)[^)]*$\").captures[0].string | fromjson"

This command could be used in a pipeline, along the following lines:

curl ..... | jq -R ..... | jq .....

𝑸: Why is there no filter like to_values for accessing all the values of an object?

A: The expression .[] emits a stream of the input object's values; if necessary they can be wrapped into an array by writing [ .[] ]. See also map_values/1 in the manual.

𝑸: How can I recursively eliminate null-valued keys?

A: walk(if type == "object" then with_entries(select(.value != null)) else . end)

For example, using the above filter, {"a": {"b": 1, "c": null}} would be transformed to {"a": {"b": 1}}

Note that walk was only introduced as a builtin after jq 1.5 was released. If your jq does not include walk, simply include its definition before invoking it, or add it to your ~/.jq initialization file:

# Apply f to composite entities recursively, and to atoms
def walk(f):
  def w:
    if type == "array" then map(w)
    elif type == "object"
    then . as $in
    | reduce keys_unsorted[] as $key
        ( {}; . + { ($key):  ($in[$key] | w) } )
    else .
    end | f;
  w;

(This is an optimized version.)

𝑸: How can I use jq as a template engine?

A: See https://github.com/stedolan/jq/wiki/Cookbook#using-jq-as-a-template-engine

𝑸: How can I select a specific set of key-value pairs from a JSON object? How can I use one object as a template for querying another? How can I delete keys from an embedded object?

A: (1) If the goal is to create an object with a set of specific keys known ahead of time, consider this example:

$ jq -c -n '{"a": 1, "b": null, "c":3} | {a,b,d}'
{"a":1,"b":null,"d":null}

If the goal is to create an object as above but omitting fields which are undefined in the target object, then the following filter will do the job:

def query(queryobject):
  with_entries( select( .key as $key | queryobject | has( $key ) ));

Example:

$ jq -c -n '{"a": 1, "b": null, "c":3} | query( {a,b,d} )'
{"a":1,"b":null}

To delete keys based on their values, consider this example:

echo '{"outer": { "a": "delete me", "b": "delete me too", "keep": 1} }' |\
  jq '.outer |= with_entries(select(.value|tostring|test("delete")|not))'
{
  "outer": {
    "keep": 1
  }
}

(2) If the specific set of keys is not known ahead of time, then query as defined immediately above can still be used. If the keys are known as a list of strings, then reduce could be used, e.g. if the target object is $o and the list of keys is $l:

reduce $l[] as $key ({}; . + { ($key): $o[$key] })
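
Here is the same reduce run end to end, as a sketch in which $o and $l are bound to made-up sample values:

```shell
picked=$(jq -nc '
  {"a":1,"b":null,"c":3} as $o      # hypothetical target object
  | ["a","c","d"] as $l             # hypothetical list of keys
  | reduce $l[] as $key ({}; . + { ($key): $o[$key] })')
printf '%s\n' "$picked"   # prints {"a":1,"c":3,"d":null}
```

Unlike query as defined above, this variant includes keys that are absent from the target object, with null values.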

See also the preceding question regarding the recursive removal of key-value pairs.

𝑸: How can I rename the keys of an object programmatically?

A: One way to rename the keys of an object is to use with_entries, e.g.

with_entries( if .key | contains("-") then .key |= sub("-";".") else . end)

To rename keys recursively, see the Q defining translate_keys(f) below.

𝑸: How can I delete an element from an array by index? How can I delete all elements by value?

A: To delete an element from an array at index (offset) 1, consider these examples:

$ jq -cn '[0,10,20] | del(.[1])'
[0,20]

$ jq -cn '[0,10,20] | .[0:1] + .[2:]'
[0,20]

$ jq -cn '[0,10,20] | delpaths([[1]])'
[0,20]

$ jq -cn '[0,10,20] | .[1] = null | map(select(.!=null))'
[0,20]

$ jq -cn '[0,10,20] | [.[0,2]]'
[0,20]

To delete all occurrences of a particular value, use array subtraction as it retains the ordering:

$ jq -cn '[0,10,20,10,30] - [10]'
[0,20,30]

𝑸: How can I merge two JSON objects?

A: The + operator can always be used to merge two objects, but + resolves conflicts simply by ignoring the conflicting values in the left-hand-side operand. The * operator is also available (see the jq Manual for details). To resolve conflicting values, say v1 and v2, by combining the two values into an array, see this gist. The "combine" filter defined there achieves commutativity and associativity by using "unique". See also the next Q&A.
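
The difference between the two operators shows up when both operands have an object at the same key. A small sketch with arbitrary sample objects:

```shell
shallow=$(jq -nc '{"a":1,"b":{"x":1}} + {"b":{"y":2}}')   # right-hand .b replaces left-hand .b
deep=$(jq -nc '{"a":1,"b":{"x":1}} * {"b":{"y":2}}')      # objects are merged recursively
printf '%s\n' "$shallow"   # prints {"a":1,"b":{"y":2}}
printf '%s\n' "$deep"      # prints {"a":1,"b":{"x":1,"y":2}}
```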

𝑸: How can I convert an array of objects into an object of corresponding arrays? How can I meld an array of objects, $a, into a single object with keys, $k, such that .[$k][$i] is $a[$i][$k]?

A:

def meld: . as $in | reduce (add|keys[]) as $k ({}; .[$k] = [$in[] | .[$k]]);

Example:

[{a:1,b:10}, {a:2,c:3}] | meld

produces:

{"a":[1,2],"b":[10,null],"c":[null,3]}

𝑸: How can I create and initialize an array of a specific size? An m by n matrix?

A: To create an array of n+1 nulls, one can write:

[] | .[n] = null

In practice, one is more likely to use range and/or reduce, e.g. to create an array of n 0s:

[range(0;n) | 0]

or:

reduce range(0;n) as $i ([]; . + [0])

Here is a function that produces a representation of an m by n matrix with initial value specified by its input:

def matrix(m;n): . as $init
  | [ range(0; n) | $init ] as $row
  | [ range(0; m) | $row ];
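
A usage sketch for the function above, creating a 2 by 3 matrix of zeros:

```shell
zeros=$(jq -nc '
  def matrix(m;n): . as $init
    | [ range(0; n) | $init ] as $row
    | [ range(0; m) | $row ];
  0 | matrix(2;3)')
printf '%s\n' "$zeros"   # prints [[0,0,0],[0,0,0]]
```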

𝑸: If the condition in an "if-then-else-end" statement is not satisfied, is it possible to emit nothing? Can I omit the "else" clause?

A: if TEST then VALUE else empty end

Or you could just write select(TEST) | VALUE.

In jq 1.6 and earlier, the "else" clause cannot be omitted, but you can write your own "if-then-else" filter. A particularly useful complement to select is when/2 defined as follows:

def when(COND; ACTION): if COND? // null then ACTION else . end;

Thus, for example, 0 | when(empty; 1) emits 0.

As of February 20, 2019, the "master" version of jq allows the "else" clause to be omitted; doing so is equivalent to writing "else ." (that is, the input is passed through unchanged).

𝑸: How does one append an element to an array?

A: If a is an array, then a + [e] will result in a copy of a with e appended to it. This is often seen in expressions such as . + [e] and a += [e].

There are alternatives that may be (very marginally) more efficient. Assuming that a is an array, these expressions will also produce a + [e]:

 a | setpath( [length] ; e )

 a | .[length] = e
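
The three forms can be compared directly. In this sketch, a is the array [1,2] and e is 3:

```shell
plus=$(jq -nc '[1,2] | . + [3]')
by_index=$(jq -nc '[1,2] | .[length] = 3')
via_setpath=$(jq -nc '[1,2] | setpath([length]; 3)')
printf '%s %s %s\n' "$plus" "$by_index" "$via_setpath"   # prints [1,2,3] three times
```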

𝑸: How can I strip off those pesky double-quotation marks?

A: To output top-level strings without quotation marks, consider using the "-r" (--raw-output) option of the jq command. Often this option together with string interpolation, join/1, or @tsv can be used to achieve the desired effect. If you want to remove the quotation marks while retaining the two-character sequence \n for embedded newlines, you might wish to use the -r option in conjunction with gsub("\n"; "\\n").
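
A sketch of both behaviours, using a made-up object and assuming a jq built with regex support (needed for gsub):

```shell
json='{"msg":"line1\nline2"}'    # hypothetical input with an embedded newline

raw=$(printf '%s' "$json" | jq -r '.msg')                          # newline emitted as-is
escaped=$(printf '%s' "$json" | jq -r '.msg | gsub("\n"; "\\n")')  # newline kept as \n
printf '%s\n' "$raw"
printf '%s\n' "$escaped"         # prints line1\nline2 on a single line
```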

Within a jq program, if s is a string that has outer quotation marks (e.g. ""abc"") then using a sufficiently recent version of jq, s[1:-1] will do the job; otherwise try s | .[1:length-1].

To check whether s[1:-1] is supported in your version of jq, try this at the command-line prompt ($ for Mac/Linux, > for Windows):

$ jq -n '"\"abc\""[1:-1]'
> jq -n "\"\\\"abc\\\"\"[1:-1]"

If neither of the above options is applicable, consider using sed.

𝑸: How can I convert a string to uppercase? To lowercase?

A: If your version of jq does not have ascii_downcase and ascii_upcase, then you might want to use their definitions:

# like ruby's downcase - only characters A to Z are affected
def ascii_downcase:
  explode | map( if 65 <= . and . <= 90 then . + 32  else . end) | implode;

# like ruby's upcase - only characters a to z are affected
def ascii_upcase:
  explode | map( if 97 <= . and . <= 122 then . - 32  else . end) | implode;

𝑸: Can I use jq in the shebang line? Can a jq script be turned into an executable command?

A: Yes, if your operating system supports shebang lines and executable scripts.

The trick to using jq in the shebang line is NOT to specify an argument for the -f option.

For jq>1.4, a suitable shebang line for a script that reads from stdin would be:

#!/usr/bin/env jq -Mf

For earlier versions of jq, you may be able to use a shebang line such as:

#!/usr/local/bin/jq -n -M -f

As of December 2015, jq>1.5 also supports creating jq executables using exec as illustrated by the following:

#!/bin/sh
# this next line is ignored by jq, which otherwise does not continue comments \
exec jq -nef "$0" "$@"
# jq code follows
true

(Notice the trailing \ at the end of the second line.)

In both cases (that is, with jq on the shebang line, or using exec jq), arguments can be passed in to the script using the --arg NAME VALUE option. For example, if the script was in a file named "shebang" in the current directory, you could type:

./shebang --arg x 123

Then the value of $x in the jq program would be the string "123".

𝑸: How can I modify some or all of the keys in a JSON entity, no matter how deeply nested? How can I rename keys programmatically?

A: If your jq has walk/1 as a builtin, then it can be used as described below; otherwise, you can simply include its definition (available on this page or, for example, from https://github.com/stedolan/jq/blob/master/src/builtin.jq) in your jq script.

The following filter recursively walks the input JSON entity, changing each encountered key, k, to (k|filter), where "filter" is the specified filter, which may be any jq expression. The usual clobbering rules regarding duplicate object keys apply.

def translate_keys(f):
  walk( if type == "object" then with_entries( .key |= f ) else . end);

Example 1:

{"a": 1, "b": [{"c":2}] } | translate_keys( "@" + . )

yields:

{"@a": 1,"@b": [ {"@c":2} ]}

Example 2:

{"a": 1, "b": [{"c":2}] } | translate_keys( if . == "c" then "C" else . end )

yields:

{"a":1,"b":[{"C":2}]}

𝑸: How can I sort an inner array of an object? How can I sort all the arrays in a JSON entity? How can I modify a deeply nested array?

A: If the path to the inner entity is known, one can use |= as illustrated here:

{"array": [3,1,2] }
| .array |= sort

To sort all arrays in a JSON entity, no matter where they occur, one option is to use walk/1:

walk( if type == "array" then sort else . end )

(If your jq does not have walk/1 as a builtin, see elsewhere on this page for a link to its definition in jq.)

Alternatively, you could use a recursive procedure such as the following:

# Apply f to arrays no matter where they occur
def recursively(f):
  . as $in
  | if type == "object" then reduce keys_unsorted[] as $key
      ( {}; . + { ($key):  ($in[$key] | recursively(f)) } )
  elif type == "array" then map( recursively(f) ) | f
  else .
  end;

For example:

{"a": [3,[30,10,20],2], "b": [3,1,2] } | recursively(sort)

yields:

{"a":[2,3,[10,20,30]],"b":[1,2,3]}

𝑸: Can jq process CSV or TSV files? What about XML? YAML? TOML? HTML? CBOR? msgpack? ...

A: TSV files (that is, files in which each line has tab-separated values) can be read naively using the following incantation:

$ jq -R -s 'split("\n") | map( split("\t") )'

This simply produces an array of arrays, each representing one row. If the first row contains headers, and if it is desired to produce an array of objects with the headers as keys, then see the Cookbook.

CSV files are trickier, firstly because there are several "standards", and secondly because of certain intrinsic potential complexities.

One option is to use the "csv2json" program at https://github.com/fadado/CSV; this is both very fast and quite tolerant. Each CSV record in the input is converted into a JSON array except that by default, empty records in the input produce a JSON null value.

Consider also using cq and/or jc for querying CSV; these packages also support YAML files, as does gojq. For more about YAML, see the Question about YAML elsewhere on this page, and similarly for HTML and TOML.

faq "Format-Agnostic-jQ" supports:

  • BSON
  • Bencode
  • JSON
  • Property Lists
  • TOML
  • XML
  • YAML

Note that the kislyuk version of yq (https://github.com/kislyuk/yq) also includes an executable, xq, for handling XML.

See also the previously mentioned recipe.

To process other formats, please use the appropriate transmogrification tools. Often a Google search using terms such as cbor2json or msgpack2json will yield some nuggets.

𝑸: How can I convert my JSON to CSV? To TSV?

A: There are many possibilities, but here's a link to an answer on stackoverflow.com that deals with the case of an array of arbitrary JSON objects, and which can readily be tailored to specific requirements: https://stackoverflow.com/a/57980447/997358
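
For the simple case where the column names are known in advance, the builtin @csv and @tsv formats are often sufficient. This sketch is not the linked answer's method, and the field names name and qty are invented for illustration:

```shell
json='[{"name":"a,b","qty":1},{"name":"c","qty":2}]'   # hypothetical input

csv=$(printf '%s' "$json" | jq -r '.[] | [.name, .qty] | @csv')
tsv=$(printf '%s' "$json" | jq -r '.[] | [.name, .qty] | @tsv')
printf '%s\n' "$csv"   # each row looks like "a,b",1
printf '%s\n' "$tsv"   # same fields, tab-separated and unquoted
```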

𝑸: How can I view all the paths leading to a particular key, together with its value, no matter how deeply nested the corresponding object is?

A: Suppose we are interested in every occurrence of the "country" key in a JSON entity. The following filter yields a stream of [PATH, VALUE] pairs, where PATH shows the path to an object having "country" as a key, and VALUE is the corresponding value:

path( .. | select(type=="object" and has("country")) ) as $path
| [$path, getpath($path + ["country"])]

Example output:

[["addresses","address_name","address_spec"],"USA"]

𝑸: Is there a way to have jq keep going after it hits an error in the input file? Can jq handle broken JSON?

A: Yes, though in general, preprocessing (e.g. using hjson or any-json) would be preferable. Also, there are more options if you have jq 1.5 or later.

If you do not have jq 1.5 or later, then consider using the -R option to read each line as text. If each JSON entity is on a separate line (as is often the case with log files, for example), then you can use a filter such as fromjson? // "sorry" to ensure that each line will always yield a JSON entity.
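
A sketch of this line-by-line approach, with three made-up input lines of which the second is not JSON:

```shell
lines=$(printf '%s\n' '{"ok":true}' 'not json' '[1,2]' |
  jq -c -R 'fromjson? // "sorry"')
printf '%s\n' "$lines"
```

Because // also replaces false and null, a line consisting of one of those values would yield "sorry" as well.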

If you have jq 1.5 or later, there are two additional techniques.

The first uses the --seq option, documented in the manual and discussed in the blog entry at https://blog.jpalardy.com/posts/handling-broken-json-with-jq/. Here is an illustration:

echo $'1\ntwo\n3' | sed "s/^/$(printf '\36')/" | jq --seq . 2> /dev/null | tr -d '\36'
1
3

The second additional technique that is available with jq 1.5 or later uses the try/catch feature. For example, if each line of a file is either a self-contained JSON text or a non-JSON string, we could pretty-print the JSON and convert the non-JSON strings to JSON strings as follows:

jq -R '. as $line | try fromjson catch $line'
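As a small illustration of the try/catch technique, here is a run on three made-up input lines, of which the second is not JSON. The -c option is added only to keep the output compact; it is not required.

```shell
# Three input lines: valid JSON, plain text, valid JSON.
printf '%s\n' '{"a":1}' 'not json' '[2,3]' |
  jq -cR '. as $line | try fromjson catch $line'
# Output:
# {"a":1}
# "not json"
# [2,3]
```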

Here is an example using the inputs filter, which allows the JSON text to be spread over multiple lines. It could thus be used with a comma-delimited stream of JSON values in which the delimiting commas are not followed by a JSON value on the same line:

# This is a variant of the built-in `recurse/1`:
def iterate(f): def r: f | (., r); r;

iterate(inputs?) | length

Here is another example using iterate/1, showing how you can also get some information about the location of the non-JSON:

iterate(try (inputs | [., "length is \(length)"]) catch "Failed at \(input_line_number)" )

Note that when inputs is used to read a file, jq should normally be invoked with the -n command-line option.

𝑸: How can one efficiently determine whether a stream is empty or not?

A: def isempty(s): 0 == ((label $go | s | 1, break $go) // 0);

This was added as a built-in after the release of jq 1.5. isempty is unlikely to be useful with inputs.
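A quick check of the definition, assuming a jq with label and break (jq 1.5 or later). Where isempty is already a builtin, the local definition simply takes precedence within this program.

```shell
jq -n '
  # Definition quoted from the answer above.
  def isempty(s): 0 == ((label $go | s | 1, break $go) // 0);
  isempty(empty), isempty(1, 2)'
# Output:
# true
# false
```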

𝑸: How can one set the exit code of jq to signal an error in the event that an attempt is made to access an undefined key? Can the return code be set to 1 on an "out-of-bounds" error? Is there a flag to alter the semantics of array access?

A: jq does not have a global flag to alter the behavior of the fundamental operations for accessing the contents of JSON arrays or objects. If you want an "out-of-bounds" or "key-does-not-exist" error condition to be raised when an attempt to access the value at a non-existent index or key is made, then simply define a function with the desired semantics. The following function can be used for both arrays and objects:

# f should be a JSON string (a key name) or integer (an index)
def get(f): if has(f) then .[f] else error("\(type) is not defined at \(f)") end;

A similar function can be defined for "put" operations.

If you do not want to be bothered with the error message that would be sent to STDERR if the error condition is raised, you can redirect it to /dev/null (or NUL or perhaps null in a Windows environment), e.g. along the lines of: jq .... 2> /dev/null
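To see how get/1 affects the exit status, consider the following sketch. The inputs are made up, and the exact non-zero status may depend on the jq version, so the script only distinguishes success from failure.

```shell
# An existing index: the value is printed and jq exits normally.
jq -n '
  def get(f): if has(f) then .[f] else error("\(type) is not defined at \(f)") end;
  [10, 20] | get(1)'
# Output: 20

# A missing key: the error is raised, and jq exits with a non-zero status.
if jq -n '
  def get(f): if has(f) then .[f] else error("\(type) is not defined at \(f)") end;
  {"a": 1} | get("b")' 2> /dev/null
then echo "key found"
else echo "jq reported an error (exit status $?)"
fi
```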

Conversely, if you just want an error message to be written on "stderr" without an error condition being raised, you could use the stderr filter.

𝑸: How can I print the items in an array together with their indices? Given an array, a, of elements a[i], how can I generate the array with elements [i, a[i]]? How can I add a counter to a stream of items?

A: One approach is to use range, e.g.

["a","b"] | range(0; length) as $i | [$i, .[$i]]

Another is to use to_entries, e.g.

["a","b"] | to_entries[] | [.key, .value]

Yet another is to use transpose, e.g.

["a","b"] | [[range(0; length)], .] | transpose[]

If your jq has foreach, then for streams, you can adopt the technique illustrated by this generic filter:

# Given a stream, s, of values, emit a stream of [id, value] pairs,
# where id is a counter starting with the given number
def counts(s; start): foreach s as $value (start-1; .+1; [., $value]);
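For example, numbering a made-up stream of three strings starting from 1 (this assumes a jq with foreach, i.e. jq 1.5 or later):

```shell
jq -nc '
  def counts(s; start): foreach s as $value (start-1; .+1; [., $value]);
  counts("a", "b", "c"; 1)'
# Output:
# [1,"a"]
# [2,"b"]
# [3,"c"]
```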

𝑸: Does jq support JSONL (JSON Lines)? How can I sort a stream of JSON objects by some key?

A: Yes, because jq is stream-oriented.

One way to convert JSONL input to a JSON array is to use the -s ("slurp") option. If your jq has inputs, then that may also be helpful in processing JSONL input.

The key to producing JSON Lines is the -c ("compact") option. To convert a JSON array into a stream of its elements, simply pipe it into .[]. In many cases, one can simplify " _ | .[]" to just "_[]" as in the following example, which shows how to sort a stream of JSON objects by some key:

jq -s -c 'sort_by(.id)[]'
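Here is the same command applied to a small, made-up JSONL stream, so that the round trip from JSON Lines to a sorted array and back to JSON Lines can be seen:

```shell
printf '%s\n' '{"id":3,"name":"c"}' '{"id":1,"name":"a"}' '{"id":2,"name":"b"}' |
  jq -s -c 'sort_by(.id)[]'
# Output:
# {"id":1,"name":"a"}
# {"id":2,"name":"b"}
# {"id":3,"name":"c"}
```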

𝑸: Why is jq not loading my module correctly?

A: If the rules for determining where jq will look for a module specified with the include or import directive are not working in your favor, consider specifying the pathname as part of the directive itself, e.g.

include "foo" {search: "~/jq/library"};

For further details about using modules, see the jq manual and Cookbook.

Numbers

𝑸: Why does jq convert floating-point integers (such as 2.0) to plain integers (2)? Why isn't the precision of numbers preserved?

A: For releases up to and including 1.6, the jq parser converts JSON numbers to IEEE 754 64-bit values, and the original JSON representation is lost. For numbers that are very large or very small in magnitude, the loss of precision will be significant. On Oct 21, 2019, a commit was made that generally ensures that the "external" representation is retained except for superfluous leading 0s.

Note that gojq, the Go implementation of jq, supports unbounded-precision integer arithmetic, and retains the full precision of integers but not floats.

𝑸: Why do 1E1000 and infinite both print as 1.7976931348623157e+308 ? Why does nan print as null?

A: For releases up to and including jq 1.6, the JSON number 1E1000 is represented internally as the IEEE 754 value for infinity, which prints as shown. However, the jq expression infinite == 1.7976931348623157e+308 returns false. To test whether a jq numeric value is equal to infinite, you can use the filter isinfinite. The jq filter nan evaluates to the IEEE value for NaN, which prints as null.
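The points above can be checked directly. This sketch assumes the C implementation of jq with isinfinite and isnan available (jq 1.5 or later); other implementations may print these values differently.

```shell
jq -nc '[infinite, nan]'
# Output: [1.7976931348623157e+308,null]

jq -nc '[(infinite == 1.7976931348623157e+308), (infinite | isinfinite), (nan | isnan)]'
# Output: [false,true,true]
```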

𝑸: What are the largest (smallest) numbers that jq can handle? Does "overflow" cause an error?

A: Currently, jq does not include "bigint" support, but gojq, the Go implementation of jq, does support unbounded-precision integer arithmetic.

Versions up to and including 1.6 of the C implementation of jq convert very large integers to a floating point approximation on being read. For these versions of jq, the largest number that can be reliably used as an integer is 2^53 (9,007,199,254,740,992). The largest floating point value is about 1.79e+308 and the smallest is about 1e-323.

In general, arithmetic operations do not raise errors, except that in jq 1.5, division by 0 does result in an error. In jq 1.4, 1/0 prints out as 1.7976931348623157e+308, and 0 * (1/0) evaluates to null.

A basic "bigint" library using string representations of integers is available at Bigint.jq.

𝑸: What mathematical functions are supported?

A: The answer varies from version to version, and by platform. As of Feb 13, 2017 (revision 1c806b), jq has a builtin function, builtins/0, that produces an array of strings of the form "FUNCTION/ARITY", one for each builtin function.

The following functions should be generally available:

acos, acosh, asin, asinh, atan, atanh, cbrt, cos, cosh, exp2, exp, floor, j0, j1, log10, log2, log, sin, sinh, sqrt, tan, tanh, tgamma, y0, y1 – these are the standard mathematical functions available in C. They are all 0-arity filters.

In addition, the 0-arity filter length is defined so that if its input is numeric, its output will be the absolute value of that input; for example: -1.1 | length yields 1.1

As of June 28, 2015, the following are also provided on most platforms and have their standard "libm" definitions: atan2/2, hypot/2, pow/2, remainder/2.

Examples: atan2(1;1), hypot(3;4), pow(2;3), remainder(5; -2) yield:

0.7853981633974483
5
8
1

nan/0 and infinite/0 are also defined so that nan | isnan and infinite | isinfinite evaluate to true.

Databases

𝑸: What database bindings are available?

A: See SQLite Bindings below.

𝑸: How can I pipe the output from non-JSON data in POSTGRESQL to jq?

A: Use row_to_json in conjunction with the -t flag:

psql ${pg_uri} -c 'SELECT row_to_json(t) FROM (SELECT * FROM two_rows) t;' -t

𝑸: How can I use jq to import JSON into an SQLite database?

A: One way is to use sqlite3, the SQLite CLI. The main idea is to use .mode line in conjunction with the .import command.

For example, suppose input.json contains a JSON array, each top-level item of which is to become a row in a table named input. One could proceed as follows:

sqlite> create table input (
    json JSON
);
sqlite> .mode line
sqlite> .separator "\t"
sqlite> .import "|jq -c .[] input.json" input
sqlite> select * from input limit 1;
 json = {"key_one":1,"key_two":2,"key_three":3}
sqlite> 

For very large JSON files, you might wish to use jq's "streaming parser", i.e. the --stream option. See also the sqlite_jq SQLite extension described below.

Definition of builtins

Note: As of Feb 13, 2017 (revision 1c806b), jq has a builtin function, builtins/0, that produces an array of strings of the form "NAME/ARITY", one for each builtin function.

𝑸: How can I view the definition of a jq builtin function?

A: https://github.com/stedolan/jq/blob/master/src/builtin.jq

𝑸: How can I circumvent the limitation that leaf_paths does not generate paths to null?

A: Assuming your jq's paths includes paths to null, the following definition may be used instead:

def all_leaf_paths:
  paths as $p
  | select( getpath($p) | type | IN("array", "object") | not )
  | $p;
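To compare the two filters on a made-up input, one might run something like the following. It assumes jq 1.6 or later, since it relies on IN/1 and on paths emitting paths to null.

```shell
jq -nc '
  def all_leaf_paths:
    paths as $p
    | select( getpath($p) | type | IN("array", "object") | not )
    | $p;
  {"a": null, "b": [1, {"c": null}]} | [leaf_paths], [all_leaf_paths]'
# Output:
# [["b",0]]
# [["a"],["b",0],["b",1,"c"]]
```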

Note that in jq 1.4 and jq 1.5, the paths filter skips paths to null. To include paths to null when using these versions, you could, for example, use allpaths defined as:

def allpaths:
  def conditional_recurse(f):  def r: ., (select(.!=null) | f | r); r;
  path(conditional_recurse(.[]?)) | select(length > 0);

This definition of allpaths is written so that anyone who wants to define conditional_recurse/1 as a top-level filter can easily do so. For reference:

  • the builtin recurse(f) is defined in terms of: def r: (f | select | r);
  • conditional_recurse(f) is defined in terms of: def r: (select | f | r);

See also #1163

𝑸: Is it possible to redefine jq builtins?

A: Yes, but the redefined builtin will only be effective with respect to invocations that occur after it has been redefined.

For example, if you wanted to redefine paths in accordance with the definition of allpaths in the previous Q, you could simply add the appropriate definition (def paths: ...) before any invocation of paths. With a little care, you can even have your cake (the new definition) and eat it (have access to the old definition) by following the model:

def old_filter: filter;
def filter: ....;

If you want to compare the performance of a builtin with an alternative definition, you can simply redefine it.
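As a concrete instance of the old_filter/filter model, here is a sketch that redefines length while keeping the builtin reachable under another name. The name old_length and the new behavior are purely illustrative.

```shell
jq -n '
  def old_length: length;                 # refers to the builtin
  def length: "length is \(old_length)";  # new definition, effective from here on
  [1, 2, 3] | length'
# Output: "length is 3"
```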

𝑸: Why does map_values(select(...)) produce the wrong result in my program? How can I use map_values/1 to delete keys?

A: In versions of jq available before Jan 30, 2017 (revision bd7b48c), map_values(select(g)) will yield the empty stream if select(g) is empty. This is generally not what is intended.

If you want to use map_values/1 to delete keys without being dependent on having a sufficiently recent version of jq, then the following alternative definition has much to recommend it:

def map_values(f):
  with_entries( [.value|f] as $v | select( $v|length == 1) | .value = $v[0] ) ;

You may wish to include this in your ~/.jq file or standard jq library.
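For instance, with the alternative definition in effect, keys whose values are null can be dropped as follows (the input object is made up):

```shell
jq -nc '
  def map_values(f):
    with_entries( [.value|f] as $v | select( $v|length == 1) | .value = $v[0] ) ;
  {"a": 1, "b": null, "c": 3} | map_values(select(. != null))'
# Output: {"a":1,"c":3}
```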

"or" versus "//"

𝑸: What is the difference between the binary operators "or" and "//" ?

A: Both have short-circuit semantics, but the two operators are otherwise very different. In a nutshell, given two expressions, A and B:

"A // B" either produces the truthy elements of A if there are any,
or else the entire stream B.

"A or B" produces a (possibly empty) stream of boolean values that 
is computed in an entirely different way, namely as the concatenation
of the streams a1 or B, a2 or B ...

(A JSON value is said to be "falsey" if it is null or false, and "truthy" otherwise.)

Here are the details regarding "A or B":

(i) If a and b are expressions each producing a single JSON value, then

"a or b" evaluates to true if a is truthy or if b is truthy, and false otherwise. 

(ii) If A is a (possibly empty) stream then:

A or empty evaluates to a stream with one true for each truthy item of A (falsey items contribute nothing)
empty or A evaluates to empty

(iii) If A and B are expressions producing non-empty streams of values, (a1, ...) and (b1 ...) respectively, then:

'A or B' produces the concatenation of the streams: a1 or B, a2 or B, ...,

where ai or B evaluates to true if ai is truthy, and otherwise to the boolean stream:

false or b1, false or b2, ...

Example 1:

(null,1) or (2,3)

produces (true, true, true) - the first two values come from evaluating null or (2,3), and the third comes from evaluating 1 or (2,3).

Example 2:

(null, 1, null,2) // (10, 20)

produces (1, 2)
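The two examples can be run as follows, with each stream collected into an array for readability. The last line is an additional, made-up case showing that "//" falls back to the entire second stream when the first has no truthy items.

```shell
jq -nc '[(null, 1) or (2, 3)]'              # [true,true,true]
jq -nc '[(null, 1, null, 2) // (10, 20)]'   # [1,2]
jq -nc '[(null, false) // (10, 20)]'        # [10,20]
```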

Related Resources

Tutorials

𝑸: What tutorials are available for jq?

See also the jq Cookbook, the jq Language Description, How-to:-Avoid-Pitfalls, and the examples at Rosettacode.org.

Editor Bindings

𝑸: What bindings are available for Atom?

A: https://atom.io/packages/language-jq

𝑸: What bindings are available for emacs?

A: https://melpa.org/#/jq-mode

𝑸: What bindings are available for vim?

A: https://github.com/vito-c/jq.vim

Language Bindings

𝑸: What language bindings are available for Rust?

A: https://crates.io/crates/jq-rs

𝑸: What language bindings are available for Java?

A:

𝑸: What language bindings are available for Python?

A:

pip install jq # For details, see https://pypi.python.org/pypi/jq
pip install pyjq # For details, see https://pypi.python.org/pypi/pyjq

𝑸: What language bindings are available for R?

A: See https://cran.r-project.org/web/packages/jqr/index.html

𝑸: What language bindings are available for Ruby?

A:

gem install ruby-jq # For details, see https://github.com/winebarrel/ruby-jq

𝑸: What language bindings are available for PHP?

A: A jq extension for PHP is available from https://github.com/kjdev/php-ext-jq

𝑸: What language bindings are available for node.js?

A: https://github.com/sanack/node-jq and https://github.com/port-labs/jq-node-bindings

𝑸: What language bindings are available for browsers?

A: https://github.com/fiatjaf/jq-web - actually, a wrapper around an Emscripten-compiled jq.

𝑸: What language bindings are available for Perl?

A: https://github.com/dxma/perl5-json-jq

SQLite Bindings

𝑸: What bindings are available for SQLite?

A:

  • sqlite_jq is an SQLite extension based on gojq, the Go implementation of jq.

  • Datasette has a jq binding for SQLite.

The sqlite_jq extension adds an SQLite function jq(JSON, JQ) and a table-valued function jq_each(JSON, JQ).

The jq function can be used for exploration of a column of JSON values, as well as to extract data, and in CREATE INDEX statements, e.g.:

CREATE INDEX name_index ON json_data ( jq(json, '.name') COLLATE NOCASE );

The jq_each function can be used to import data from a JSON file. For example, assuming the file airline_routes.json contains a single object, it can be imported as singleton key-value objects as follows:

insert into raw_data select * from jq_each(readfile('airline_routes.json'),
  '.[]|to_entries[] | {(.key): .value}');

For further details, see https://mgdm.net/weblog/using-jq-in-sqlite/

The datasette-jq plugin adds a custom SQL function for filtering and transforming values from JSON columns using jq. To see it in action, visit https://datasette-jq-demo.datasette.io/ and enter a jq query such as:

select jq(info, "[.info.author, (.info|length)]") from packages limit 10;

Projects

𝑸: How can I use jq interactively?

A:

𝑸: Is there an alternative implementation of jq? What are the differences between jq and gojq?

A: A Go implementation of jq, gojq, lives at https://github.com/itchyny/gojq. Although for the most part it follows the jq specification very closely, there are several significant differences.

Most of the differences in functionality are described on the gojq README page, but for ease of reference, here is a very brief summary of some of the most important of these:

  • gojq does not currently support or respect the ordering of keys;
  • gojq supports unbounded-precision integer arithmetic;
  • gojq supports the reading and writing of YAML files, which is helpful for translation as well as for querying YAML documents using jq syntax and functionality.

In addition, whereas jq handles "$-variables" in function signatures (definition headers) as implicit "as" statements (thus potentially causing surprises), gojq Version 0.12.4 handles them as formal arguments. However, since the release of Version 0.12.4, gojq was changed to track jq's behavior - see https://github.com/itchyny/gojq/issues/107.

𝑸: How can I use jq for YAML instead of JSON?

A:

  • yq transcodes YAML on standard input to JSON and pipes it to jq; yq can also translate JSON back to YAML.

  • y2j provides a wrapper called yq that uses Python to convert YAML to JSON, runs a jq filter on the JSON, and converts the result JSON back to YAML.

  • gojq can read and write YAML texts, as well as JSON.

  • any-json simply converts YAML to JSON.

  • remarshal provides yaml2json and json2yaml scripts, amongst others.

𝑸: How can I use jq for TOML instead of JSON?

A:

  • remarshal provides toml2json and json2toml scripts, amongst others.

    • brew install remarshal
  • npm install --global toml2json installs a toml2json script

𝑸: How can I use jq to process HTML?

A: First convert the HTML to JSON, e.g. using pup (https://github.com/ericchiang/pup) or hq (https://github.com/rbwinslow/hq).

𝑸: How can I use jq to process JavaScript objects that are not JSON but are specified in accordance with the ECMAScript standard?

A: It may be possible to convert the objects to JSON using json5 (https://github.com/json5/json5), about which some further details are given below. Consider also using the JSON.stringify() function of your favorite JavaScript interpreter.

𝑸: What other jq-related projects are there?

A:

Windows

𝑸: Why doesn't jq '.' work? Why aren't my jq commands being parsed properly?

A:

  • Writing jq . should be sufficient to invoke jq's . filter on every platform.

  • Consider placing your jq commands in a file and then invoking jq with the -f FILENAME option.

  • To quote a jq command string in a Windows environment, use double-quotation marks, e.g. jq ".". To quote JSON strings within the command string, use \", for example:

      jq -n "\"Hello world!\""
    

Note also that on Windows, echo behaves differently than on other platforms. In short:

  • correct on Windows: echo "Hello" | jq ". + \" world\""
  • correct on other planets: echo '"Hello"' | jq '. + " world"'

Notable Differences between Versions

In the following:

  • "1.4+" refers to a sufficiently recent version of jq since the release of Version 1.4.
  • "all versions" refers to versions 1.3 and later.

𝑸: Is there a NEWS or Changelog file that documents when a particular feature was released?

A: https://github.com/stedolan/jq/blob/master/NEWS

See also https://github.com/stedolan/jq/releases

𝑸: In which versions of jq is the ordering of the keys of an object preserved?

A: In jq 1.3, the keys are sorted, e.g.

jq -n '{b:1, a:2} | to_entries[].key'

produces the stream: "a" "b". In jq 1.4 and later, the ordering is preserved. Note that keys/0 sorts the keys; to avoid this, keys_unsorted was introduced in jq 1.5.
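The difference between keys and keys_unsorted can be seen side by side in the following sketch, which assumes jq 1.5 or later:

```shell
jq -nc '{b:1, a:2} | keys, keys_unsorted, [to_entries[].key]'
# Output:
# ["a","b"]
# ["b","a"]
# ["b","a"]
```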

𝑸: In which version was the abbreviation {$x} for {"x": $x} introduced?

A: Version 1.5

𝑸: What alternatives are there to .["key"] for accessing the value of a key?

A: If "KEY" is an object key beginning with an alphabetic character and composed entirely of alphanumeric characters (it being understood that _ counts here as an alphabetic character), then all versions of jq allow .KEY as an alternative to .["KEY"]. In addition, in jq 1.4+, the form ."KEYNAME" is supported for any valid key name.

𝑸: In which versions of jq are regular expressions supported?

A: 1.4+. See the next section for further details.

𝑸: How can I match a string while ignoring case?

A: For simple matches, consider using ascii_downcase (see above). For more exotic cases, consider:

def equals_ignoring_case($t):
  def quote: gsub("(?<a>[|\\\\{}[\\]^$+*?.])"; "\\\(.a)");
  test( "^\($t|quote)$"; "i");

For regex matches, use the "i" flag with one of the regex filters available in jq 1.4+.

𝑸: How can I access the last element of an array? How can I set the last element of an array?

A: Assuming you are using jq version 1.5 or later, use -1 as the index, as in .[-1] or a[-1]; this applies to assignment expressions as well.

Assuming a is an expression that evaluates to an array, the following techniques can be used with any version of jq:

  • to access the last element: a | .[length-1]
  • to set the last element:
    • a | .[length - 1] = value or
    • a | setpath([length - 1]; value)
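The following sketch exercises both styles on a made-up array. The forms using -1 assume jq 1.5 or later; the forms using length should work with any version.

```shell
jq -nc '[1, 2, 3] | .[-1], (.[-1] = 9), .[length-1], (.[length - 1] = 9)'
# Output:
# 3
# [1,2,9]
# 3
# [1,2,9]
```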

𝑸: How can I "slurp" from a secondary file? Is there a way to "slurp" a file using the --argfile option? What is the --slurpfile option introduced in jq 1.5?

A: jq normally reads data from "stdin" or the file specified on the command line, e.g. jq . PRIMARY.json. In jq 1.4, the --argfile option allows one to read data from one or more secondary files: the contents of the file will be slurped if and only if it contains more than one JSON entity.

In jq 1.5, the --slurpfile option has been added to allow one to read the contents of an entire file of JSON entities as an array, e.g. jq -n --slurpfile a SECONDARY.json '$a | length' will report how many JSON items were read from the file named SECONDARY.json. The --slurpfile option always slurps the contents of the specified file, even if it is empty.
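Here is a self-contained way to try --slurpfile, assuming jq 1.5 or later and a shell that provides mktemp. The temporary file merely stands in for SECONDARY.json, and its contents are made up.

```shell
tmp=$(mktemp)                      # stands in for SECONDARY.json
printf '1\n"two"\n[3]\n' > "$tmp"  # three JSON entities, one per line
jq -n --slurpfile a "$tmp" '$a | length'
# Output: 3
rm -f "$tmp"
```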

𝑸: What backwards-incompatible changes have been made since the release of jq 1.5?

A: The following listing may be incomplete.

  • The idiom .foo?//empty must now be written with a space immediately following the question mark, e.g. .foo? //empty

  • empty on RHS of |=

In jq 1.5 and earlier, expressions such as:

{a:1} | .a |= empty

produced null. This was surprising and not very useful.

As the result of a change introduced on January 30, 2017:

 $ jq -n '{a:1} | .a |= empty'
 {}

Note that the consequences of including empty in the body of a reduce statement might be surprising, e.g.:

 $ jq -n 'reduce 2 as $x (3; empty)'
 null

This behavior might also change, so caution should be exercised when including empty in this manner in the body of a reduce statement.

Support for Regular Expressions

Regex support was added soon after the release of jq 1.4.

𝑸: test("\d") does not work! Why can't I use character classes?

A: The regular expression must be given as a JSON string, which means that backslashes must be escaped, as in this example:

$ jq -n '"Is 1 a digit?" | test("\\d")'
true

𝑸: How can I eliminate all control characters in all strings, wherever they occur?

A:

walk(if type == "string" then gsub("\\p{Cc}"; "") else . end)

This will excise ASCII and Latin-1 control characters from all strings (other than key names), and illustrates that Unicode character categories can be specified using the abbreviated forms (here "Cc" rather than "Control").

The following filter will excise all control characters from all strings, including key names:

walk(if type == "string" then gsub("\\p{Cc}"; "")
     elif type == "object" then with_entries( .key |= gsub("\\p{Cc}"; "") )
     else . end)

If your jq does not have walk/1, simply include its definition (search for 'def walk' on this page) before its invocation.

𝑸: Where is the regex (regular expression) documentation?

A: jq uses the PCRE mode of the Oniguruma regex engine. The "master" version of the user documentation for the Oniguruma library is at RE. A still-useful "snapshot" of this documentation is at Docs-for-Oniguruma-Regular-Expressions-(RE.txt). See the following question about determining the version of the Oniguruma engine that has been included with a particular jq executable.

𝑸: How can I tell which version of Oniguruma has been included in a particular jq executable?

A: On a Mac, using lldb:

$ lldb jq
b main
run
p (char*) onig_version()

Using gdb, the debug commands are the same, but on a Mac, you may wish to run gdb under sudo to avoid "code sign" issues.

𝑸: How are named capture variables used?

A: Here are four examples. It is assumed in all cases that the shell allows single quotation marks to be used for quoting a string. The last example illustrates the use of back-references involving named capture variables.

In the first example, we want to extract the numeric prefix of a "semantic version" specification:

$ echo '{"VERSION": "0.2.1-alpha+abxc23"}' |\
    jq '.VERSION | sub("(?<vers>[0-9]+\\.[0-9]+\\.[0-9]+).*"; .vers)'

The result:

"0.2.1"

To capture the variables as a JSON object, use capture/1:

echo '{"VERSION": "0.2.1-alpha+abxc23"}' |
  jq '.VERSION | capture("(?<vers>[0-9]+\\.[0-9]+\\.[0-9]+).*")'
{
  "vers": "0.2.1" 
}

In the next example, notice the use of the form \(.NAME) in the "to-string":

$ jq -n '"abc" | sub( "(?<head>^.)(?<tail>.*)"; "\(.head)-\(.tail)")'
"a-bc"

Finally, here is an example of a back-reference to a named capture variable:

"aa" | test("(?<x>.)\\k<x>") #=> true

𝑸: Oniguruma is no longer available at http://www.geocities.jp. Where is the Oniguruma repository?

A: https://github.com/kkos/oniguruma

Streaming JSON parser

𝑸: How to handle huge JSON texts?

A: jq 1.5 includes a streaming parser that can be used to avoid having to read JSON texts completely before processing them. A search for a needle in a haystack, for example, does not first have to create an in-memory representation of the entire JSON text, and can therefore go faster.

Here is an example of how to convert a top-level array of JSON objects into a stream of its elements:

$ jq -n '[{foo:"bar"},{foo:"baz"}]' | jq -cn --stream 'fromstream(1|truncate_stream(inputs))'
{"foo":"bar"}
{"foo":"baz"}
$ 

Notice the use of the "-n" option.

More generally:

$ echo '[{"foo":"bar"},99,null,{"foo":"baz"}]' |
  jq -cn --stream 'fromstream( inputs|(.[0] |= .[1:]) | select(. != [[]]) )'
{"foo":"bar"}
99
null
{"foo":"baz"}
$ 

Here is another example. Suppose we want to extract some information from certain JSON objects in a very large JSON document. For the sake of specificity, let's consider the case where the following would be appropriate except for the size of the JSON document:

.. | objects | select(.class=="FINDME"?) | .id

An alternative solution using jq's streaming parser would be as follows:

foreach inputs as $in (null;
  if has("id") and has("class") then null
  else . as $x
  | $in
  | if length != 2 then null
    elif .[0][-1] == "id" then ($x + {id: .[-1]})
    elif .[0][-1] == "class"
         and .[-1] == "FINDME" then  ($x + {class: .[-1]})
    else $x
    end
  end;
  select(has("id") and has("class")) | .id )

Invocation:

jq -n --stream -f program.jq input.json
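To make the behavior concrete, here is a sketch of the same approach run on a tiny, made-up document supplied on standard input. It keeps the state as an object throughout (using {} where the program above uses null); this is intended to be equivalent while not depending on how has treats a null input.

```shell
echo '[{"id":1,"class":"FINDME"},{"id":2,"class":"other"},{"id":3,"class":"FINDME"}]' |
  jq -n --stream '
    foreach inputs as $in ({};
      # Start afresh once a complete match has been emitted.
      if has("id") and has("class") then {}
      else . as $x
      | $in
      # Closing events have length 1 and reset the state.
      | if length != 2 then {}
        elif .[0][-1] == "id" then ($x + {id: .[-1]})
        elif .[0][-1] == "class"
             and .[-1] == "FINDME" then ($x + {class: .[-1]})
        else $x
        end
      end;
      select(has("id") and has("class")) | .id )'
# Output:
# 1
# 3
```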

See also the jq Cookbook, the stackoverflow.com page How to read a 100gb file with jq, the article Handling Large JSON Files with Streaming, and of course the jq manual.

Processing not-quite-valid JSON

𝑸: Can jq process objects with duplicate keys? Can jq help convert objects with duplicate keys to an alternative format so that no information is lost?

A: The JSON syntax formally allows objects with duplicate keys, and jq can accordingly read them, but the regular jq parser effectively ignores all but the last occurrence of each key within any given object.
jq's streaming parser, however, can be used to convert a JSON object with duplicate keys to an alternative format so that none of the values are lost. This is illustrated at https://stackoverflow.com/questions/69968773/how-to-group-by-within-object-in-jq and https://stackoverflow.com/questions/36956590/json-fields-have-the-same-name.
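The difference between the two parsers can be seen with a minimal, made-up object. This sketch assumes jq 1.5 or later for the --stream option; the select simply discards the closing event that the streaming parser emits at the end of the object.

```shell
# Regular parser: only the last occurrence of the key survives.
echo '{"a":1,"a":2}' | jq -c .
# Output: {"a":2}

# Streaming parser: one [PATH, VALUE] event per occurrence.
echo '{"a":1,"a":2}' | jq -c --stream 'select(length == 2)'
# Output:
# [["a"],1]
# [["a"],2]
```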

𝑸: Does jq support the processing of invalid JSON? Can jq be instructed to ignore comments?

A: If you want jq to ignore an error in the input file, see the 𝑸 above that begins "Is there a way to have jq keep going after it hits an error in the input file?"

jq cannot be instructed to ignore JavaScript-style comments, but see the next 𝑸 about using other tools to filter out such comments.

Apart from the possibility of skipping over invalid input, jq generally expects JSON input to be strictly valid, but JSON literals can be specified in a jq program more flexibly. For example:

$ jq -n '{a: 1}'
{
  "a": 1
}

Thus you may be able to use jq -n -f FILENAME to convert nearly-valid JSON to JSON.

𝑸: How can I convert JavaScript objects to JSON? How can I rectify a not-quite-valid-JSON text? How can I read a file which consists of JSON and comments?

A: As noted in the previous Q, jq itself can be used to transform nearly-valid JSON to JSON in many instances. For example, "#" comments can be removed using jq.

Here are brief descriptions of some other command-line tools that can be used to convert "not-quite JSON" to JSON. Some of these can also be used to remove comments.

relaxed-json

For "Plain Old JavaScript objects", consider https://github.com/phadej/relaxed-json

The relaxed-json command, rjson, can be installed by running:

yarn global add relaxed-json

or:

sudo npm install -g relaxed-json

strip-json-comments

 npm install --global strip-json-comments-cli

jsonlint

The jsonlint script provided by the python demjson package (pip install demjson) can be used as a JSON rectifier by invoking it with the -S and -f options. For example:

$ jsonlint -Sf 
/* This is a comment */
// Another comment
{'a': 1}

produces:

{ "a" : 1 }

For further information, see https://pypi.org/project/demjson/

json5

json5 is a command-line tool for converting JSON5 (a superset of JSON that is also a subset of JavaScript) to JSON. In brief:

npm install json5
ln -s ~/node_modules/.bin/json5 ~/bin
json5 -c FILENAME.json5  # generates FILENAME.json

Note that json5 -c FILENAME.SUFFIX will generate FILENAME.SUFFIX.json if "SUFFIX" is not "json5".

Documentation on JSON5 and json5 is at http://json5.org/

any-json

any-json purports to support the transmogrification of the following formats to JSON: cson, csv, hjson, ini, json5, xls, xlsx, xml, yaml.

Example:

$ any-json -format=json
// This line is recognized as a comment even though the input format has been specified as JSON!
/* Line 1
   Line 2
*/
[1,2]

Output:

[
  1,
  2
]

hjson

The hjson website describes a tool, also called hjson, for converting from hjson to JSON. In brief:

npm install hjson -g
hjson -j file.hjson # to convert to JSON

and/or:

pip install hjson
python -m hjson.tool -j file.hjson # convert to JSON