Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] Add json_extract function to PPL #3211

Open
acarbonetto opened this issue Dec 18, 2024 · 0 comments
Open

[FEATURE] Add json_extract function to PPL #3211

acarbonetto opened this issue Dec 18, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@acarbonetto
Copy link
Collaborator

acarbonetto commented Dec 18, 2024

Is your feature request related to a problem?

As part of the RFC to add JSON functions, the json_extract function would be useful to extract specific objects and scalars from an existing json_object.

What solution would you like?

### `JSON_EXTRACT`

**Description**

`json_extract(doc, path [, path]...)` Extracts a json value or scalar from a json document based on the path(s) specified.

**Argument type:** STRING, STRING

**Return type:** STRING | NULL | JSON ARRAY

- Returns a JSON array for multiple paths or if the path leads to an array.  
- Return null if path is not valid.

Example:

    os> source=people | eval `json_extract(json('{"a":"b"}'), '$.a')` = json_extract('{"a":"b"}', '$a')
    fetched rows / total rows = 1/1
    +----------------------------------+
    | json_extract('{"a":"b"}', 'a')   |
    +----------------------------------+
    | b                                |
    +----------------------------------+

    os> source=people | eval `json_extract(json('{"a":[{"b":1},{"b":2}]}'), '$.a[1].b')` = json_extract('{"a":[{"b":1},{"b":2}]}', '$.a[1].b')
    fetched rows / total rows = 1/1
    +-----------------------------------------------------------+
    | json_extract('{"a":[{"b":1.0},{"b":2.0}]}', '$.a[1].b')   |
    +-----------------------------------------------------------+
    | 2.0                                                       |
    +-----------------------------------------------------------+

    os> source=people | eval `json_extract('{"a":[{"b":1},{"b":2}]}', '$.a[*].b')` = json_extract(json('{"a":[{"b":1},{"b":2}]}'), '$.a[*].b')
    fetched rows / total rows = 1/1
    +-----------------------------------------------------------+
    | json_extract('{"a":[{"b":1.0},{"b":2.0}]}', '$.a[*].b')   |
    +-----------------------------------------------------------+
    | [1.0,2.0]                                                 |
    +-----------------------------------------------------------+

    os> source=people | eval `invalid_json` = json_extract(json('{"invalid": "json"'))
    fetched rows / total rows = 1/1
    +----------------+
    | invalid_json   |
    +----------------+
    | null           |
    +----------------+

What alternatives have you considered?

N/A

Do you have any additional context?

Related spark PPL PR: opensearch-project/opensearch-spark#780

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants