JSONPath extension revisited #306

Closed
danielaparker opened this issue Feb 3, 2021 · 0 comments

The jsoncons jsonpath extension was introduced in early 2015. Since then, there have been some developments:

With version 0.161.0, the jsonpath extension has been rewritten to incorporate lessons learned from these developments. Although this is a significant rewrite, the functions json_query and json_replace remain compatible with earlier versions.

Most users should have no difficulty moving to the new version. However, the following changes to the supported JSONPath syntax should be noted.

  • Previous versions allowed the '$' representing the root of the JSON instance to be omitted from path selectors. This is no longer allowed. In 0.161.0, every path selector must start with either '$', if relative to the root of the JSON instance, or '@', if relative to the current node. E.g. books.0 is not allowed; write $.books.0 instead.
  • Previous versions supported unions of separate JSONPath expressions, e.g. $..[name.first,address.city]. 0.161.0 does too, but requires that the relative paths name.first and address.city start with '@', so the example becomes $..[@.name.first,@.address.city].
  • Previous versions supported unquoted names in square bracket notation. This is no longer allowed. E.g. $[books] is not allowed; write $['books'] or $["books"] instead.
  • Previous versions allowed an empty string to be passed as the path argument to json_query. This is no longer allowed, and a syntax error will be raised.
  • In 0.161.0, unquoted names in dot notation are restricted to the digits 0-9, the letters A-Z and a-z, the underscore character _, and non-ASCII Unicode characters. All other names must be enclosed in single or double quotes. In particular, names with hyphens (-) must be enclosed in single or double quotes.
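
The rule for unquoted names in dot notation can be sketched as a small check. This is an illustration only: can_be_unquoted is a hypothetical helper, not a jsoncons function, and it assumes the name is UTF-8 encoded (so every byte of a non-ASCII character is 0x80 or above) and that an empty name always needs quotes.

```cpp
#include <string>

// Hypothetical helper, not part of jsoncons: returns true if `name` may be
// written without quotes after a dot, under the rule described above.
// Assumption: `name` is UTF-8, so any byte >= 0x80 belongs to a non-ASCII character.
bool can_be_unquoted(const std::string& name)
{
    if (name.empty())
    {
        return false; // assumption: an empty name must always be quoted
    }
    for (unsigned char c : name)
    {
        const bool ascii_ok = (c >= '0' && c <= '9')
                           || (c >= 'A' && c <= 'Z')
                           || (c >= 'a' && c <= 'z')
                           || c == '_';
        const bool non_ascii = c >= 0x80;
        if (!ascii_ok && !non_ascii)
        {
            return false; // e.g. '-', ' ', '.', or any other ASCII punctuation
        }
    }
    return true;
}
```

Under this sketch, can_be_unquoted("first_name") is true, while can_be_unquoted("first-name") is false, so the latter would have to be written as $['first-name'] or $["first-name"].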

Return by value vs access by reference

It is a feature of JSONPath that it selects values in the original JSON document and, unlike JMESPath, does not create JSON elements that are not in the original. Internally, the jsoncons implementation collects pointers to the selected items. Until 0.161.0, json_query was limited to returning an array of values (a copy), while json_replace allowed the user to provide a unary callback that replaces an item in the original JSON with a returned value. With version 0.161.0, as an alternative to a copy, json_query supports a binary callback that is passed two arguments: the location of the item in the original, and a const reference to the item. json_replace similarly supports a binary callback, but with a mutable reference.
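
The difference between the two styles can be shown with a toy model. The sketch below is not jsoncons code: the "document" is just a flat list of strings, and Document, Selected, select_all, query_copy, query_each and replace_each are all hypothetical names chosen for illustration. It only mirrors the idea of collecting pointers to selected items and then either copying them out or handing each one to a binary callback.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy model only. None of these names are jsoncons API.
using Document = std::vector<std::string>;

struct Selected
{
    std::string location; // where the item lives in the original
    std::string* item;    // pointer into the original document
};

// Collect pointers to every item (stands in for evaluating a selector).
std::vector<Selected> select_all(Document& doc)
{
    std::vector<Selected> out;
    for (std::size_t i = 0; i < doc.size(); ++i)
    {
        out.push_back({"$[" + std::to_string(i) + "]", &doc[i]});
    }
    return out;
}

// Return by value: the caller gets copies, detached from the original.
std::vector<std::string> query_copy(Document& doc)
{
    std::vector<std::string> values;
    for (const Selected& s : select_all(doc))
    {
        values.push_back(*s.item);
    }
    return values;
}

// Access by reference: binary callback receives (location, const reference).
void query_each(Document& doc,
                const std::function<void(const std::string&, const std::string&)>& callback)
{
    for (const Selected& s : select_all(doc))
    {
        callback(s.location, *s.item);
    }
}

// Replace: binary callback receives (location, mutable reference).
void replace_each(Document& doc,
                  const std::function<void(const std::string&, std::string&)>& callback)
{
    for (const Selected& s : select_all(doc))
    {
        callback(s.location, *s.item);
    }
}
```

In this model, modifying the result of query_copy leaves the document untouched, whereas a callback passed to replace_each edits the original items in place.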

Parse and evaluate as one operation vs separate

Until 0.161.0, jsoncons did not separate parsing the JSONPath string into tokens from evaluating it against a JSON document. With 0.161.0, jsoncons introduces the function make_expression for creating a compiled form of the JSONPath string that can be used many times.

Duplicates and ordering

Consider the JSON instance

{
    "books":
    [
        {
            "title" : "A Wild Sheep Chase",
            "author" : "Haruki Murakami"
        },
        {
            "title" : "The Night Watch",
            "author" : "Sergei Lukyanenko"
        },
        {
            "title" : "The Comedians",
            "author" : "Graham Greene"
        },
        {
            "title" : "The Night Watch",
            "author" : "Phillips, David Atlee"
        }
    ]
}

with selector

$.books[1,1,3].title

Note that the second book, The Night Watch by Sergei Lukyanenko, is selected twice.

The majority of JSONPath implementations will produce (with duplicate paths allowed):

Path                     Value
$['books'][1]['title']   "The Night Watch"
$['books'][1]['title']   "The Night Watch"
$['books'][3]['title']   "The Night Watch"

A minority will produce (with duplicate paths excluded):

Path                     Value
$['books'][1]['title']   "The Night Watch"
$['books'][3]['title']   "The Night Watch"

In 0.161.0, jsoncons::jsonpath::json_query defaults to allowing duplicates, but has an option for no duplicates. jsoncons::jsonpath::json_replace defaults to no duplicates, as updating the same value multiple times would be inadvisable.

By default, the ordering of results is unspecified, although the user may expect array ordering at least to be preserved. In 0.161.0, jsoncons provides an option for sorting results by paths.
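
To make the two options concrete, here is a minimal sketch of removing duplicate paths and sorting by path, applied to (path, value) pairs like those in the tables above. It uses only the standard library. PathValue, remove_duplicate_paths and sort_by_path are hypothetical names, and this is not how jsoncons implements its options. Comparing paths as plain strings is a simplification: it would order index 10 before index 2.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical result row: (normalized path, value). Not a jsoncons type.
using PathValue = std::pair<std::string, std::string>;

// Keep only the first result for each path, preserving the incoming order.
std::vector<PathValue> remove_duplicate_paths(const std::vector<PathValue>& results)
{
    std::set<std::string> seen;
    std::vector<PathValue> out;
    for (const PathValue& r : results)
    {
        if (seen.insert(r.first).second) // true only the first time a path is seen
        {
            out.push_back(r);
        }
    }
    return out;
}

// Sort results by path. Simplification: paths are compared as plain strings.
std::vector<PathValue> sort_by_path(std::vector<PathValue> results)
{
    std::stable_sort(results.begin(), results.end(),
                     [](const PathValue& a, const PathValue& b) { return a.first < b.first; });
    return results;
}
```

Given the three rows produced by $.books[1,1,3].title with duplicates allowed, remove_duplicate_paths leaves the two rows for $['books'][1]['title'] and $['books'][3]['title'], in that order.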

Unions

In jsoncons, a JSONPath union element can be

  • an index or slice expression
  • a single quoted name
  • a double quoted name
  • an expression, e.g. (@.length-1)
  • a filter
  • a wildcard, i.e. *
  • a path relative to the root of the JSON document (begins with $)
  • a path relative to the current node (begins with @)

To illustrate, the path expression below selects the second, third, and fourth titles from Stefan Goessner's store:

$.store.book[:-2:1,(@.length-2),?(@.author=='J. R. R. Tolkien')].title