sql: implement array functions for JSON arrays #81707

Closed
wants to merge 1 commit into from

Conversation

HonoreDB
Contributor

My answer on
https://stackoverflow.com/questions/72347170/delete-element-from-jsonb-array-in-cockaroachdb/72351955#72351955
made me sad, and it's not the first time I've had trouble manipulating a JSONB array.

There are a few different, non-mutually-exclusive ways we could make this easier. We could allow
aggregate(generator) functions in scalar contexts. We could implement more JSON functions as needed.

This PR instead just overloads every function that takes a json[] argument (which is a datum type
we don't allow in columns) to also be able to take a json array argument, and alters the return type
accordingly. It's not particularly performance-optimized but we can replace individual implementations
if needed.

That said, this is an unauthorized spike. What do you think?

Release note (sql change): Built-in array functions array_append, array_prepend, array_cat, array_remove, array_replace, array_position, and array_positions may now be used with jsonb arrays.
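For readers who want the intended semantics spelled out, here is a minimal sketch in Python rather than SQL. It assumes the JSON-array overloads behave like their json[] counterparts applied element by element; the function names (`json_array_remove` and so on) and the `json_equal` helper are hypothetical and are not the PR's actual Go implementation.

```python
import json


def json_equal(a, b):
    """Structural JSON equality that keeps true/false distinct from 1/0."""
    if isinstance(a, bool) or isinstance(b, bool):
        return isinstance(a, bool) and isinstance(b, bool) and a == b
    if isinstance(a, list) and isinstance(b, list):
        return len(a) == len(b) and all(json_equal(x, y) for x, y in zip(a, b))
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(json_equal(a[k], b[k]) for k in a)
    return a == b


def json_array_append(arr, elem):
    # Hypothetical analogue of array_append: add elem at the end.
    return arr + [elem]


def json_array_remove(arr, elem):
    # Hypothetical analogue of array_remove: drop every element equal to elem.
    return [e for e in arr if not json_equal(e, elem)]


def json_array_replace(arr, old, new):
    # Hypothetical analogue of array_replace: swap every match of old for new.
    return [new if json_equal(e, old) else e for e in arr]


def json_array_positions(arr, elem):
    # Hypothetical analogue of array_positions: 1-based, as with SQL arrays.
    return [i for i, e in enumerate(arr, start=1) if json_equal(e, elem)]


tags = json.loads('[{"value": "tag1"}, {"value": "tag2"}, {"value": "tag1"}]')
target = json.loads('{"value": "tag1"}')

print(json.dumps(json_array_remove(tags, target)))  # [{"value": "tag2"}]
print(json_array_positions(tags, target))           # [1, 3]
```

The last two lines mirror the StackOverflow case: removing a non-primitive element by value becomes a single function call.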

@HonoreDB HonoreDB requested a review from a team May 24, 2022 01:02
@cockroach-teamcity
Member

This change is Reviewable

@HonoreDB HonoreDB force-pushed the json_array_overloads branch from 25491be to 709a7dd Compare May 24, 2022 13:35
@jordanlewis
Member

Cool stuff!

The danger here is about compatibility. If Postgres ends up with a different definition of what these functions do for JSON one day, we'll be between a rock and a hard place: incompatible with Postgres, but unable to change our behavior for fear of breaking existing applications.

That said, I'm not sure the risk here is super high. I'm curious to hear from @rafiss.

@HonoreDB HonoreDB force-pushed the json_array_overloads branch from 709a7dd to 8640273 Compare May 24, 2022 14:43
@HonoreDB HonoreDB force-pushed the json_array_overloads branch from 8640273 to 2762b63 Compare May 24, 2022 17:39
@ajwerner
Contributor

I feel like instead of this, we should implement the Postgres 14 JSON operator support. See https://aaronbos.dev/posts/update-json-postgresql for an example. @mgartner and I discussed this as an area for an intern project. The last thing I want is for us to defer building out the good Postgres syntax because we have a just-barely-good-enough workaround for not having it.

@ajwerner
Contributor

Whoops, wrong link: https://aaronbos.dev/posts/postgres-14-json

@rafiss
Collaborator

rafiss commented May 24, 2022

Here is the tracking issue for that: https://cockroachlabs.atlassian.net/browse/CRDB-12464 and #77434

I'd also lean a bit more in favor of doing that.

I do have a bit of a concern about adding a PG-incompatible JSON builtin, but perhaps it's unfounded.

@ajwerner
Contributor

Could this be a UDF when we have them?

@HonoreDB
Contributor Author

Oddly enough, I don't think the existing Postgres JSON operators would work for the problem I linked (removing a non-primitive value from a JSON array). I could be missing a trick though.

If you wanted, it wouldn't be too hard to change this from programmatically overloading existing array methods to programmatically creating new methods with their own namespace (e.g. experimental_json_array_remove).

@ajwerner
Contributor

I disagree that the postgres syntax does not help:

create table t (data jsonb);
insert into t values ('{"a": {"b": [10, 20, 30]}}');
-- jsonb subscripting (Postgres 14+); "- 1" deletes the array element at index 1
update t set data['a']['b'] = data['a']['b'] - 1;
select * from t;
-- result: {"a": {"b": [10, 30]}}

@HonoreDB
Contributor Author

HonoreDB commented May 24, 2022

You can do that in CRDB today with

update t set data = json_set(data, Array['a','b'], json_extract_path(data, 'a','b') - 1);

The JSON operators are just (a lot) prettier. The missing functionality is being able to specify an element you want to remove, rather than an index. In Postgres you can do - "a" to remove all "a" strings from an array, which we don't have (edit: yes we do in some contexts)... but Postgres doesn't seem to support - '{"value": "tag1"}', which is what the person on StackOverflow needed.
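To make the gap concrete, here is a rough Python model of how the Postgres `-` operator treats a jsonb array, based on its documented behavior: an integer operand deletes by index (negative values count from the end), and a text operand deletes matching string elements. The function name `jsonb_array_minus` is hypothetical, and the sketch covers arrays only, not the object-key form of `- text`.

```python
def jsonb_array_minus(arr, operand):
    """Hypothetical model of Postgres `jsonb - integer` and `jsonb - text` on arrays."""
    if not isinstance(arr, list):
        raise TypeError("this sketch only models JSON arrays")
    if isinstance(operand, bool):
        raise TypeError("no such operator in this sketch")
    if isinstance(operand, int):
        # Delete by index; negative indexes count from the end.
        idx = operand if operand >= 0 else len(arr) + operand
        if 0 <= idx < len(arr):
            return arr[:idx] + arr[idx + 1:]
        return list(arr)  # out-of-range index leaves the array unchanged
    if isinstance(operand, str):
        # Delete every string element equal to the operand.
        return [e for e in arr if e != operand]
    # Objects and arrays as the right operand have no element-removal form.
    raise TypeError("no operator for removing an arbitrary JSON element")


print(jsonb_array_minus([10, 20, 30], 1))       # [10, 30]
print(jsonb_array_minus(["a", "b", "a"], "a"))  # ['b']
```

Passing `{"value": "tag1"}` as the operand raises `TypeError` here, which is the case the StackOverflow question ran into.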

@HonoreDB
Contributor Author

And I don't particularly think we should support a json - json operator either; it's ambiguous whether you're trying to do set difference or remove a single element, which is presumably why Postgres doesn't have it.
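A small Python sketch of that ambiguity, with hypothetical function names for the two readings of a json - json operator. It sticks to integers and nested lists so that plain Python `==` is an adequate stand-in for JSON equality.

```python
def minus_as_set_difference(arr, operand):
    # Reading 1: operand is a set of elements; drop each one found in arr.
    return [e for e in arr if e not in operand]


def minus_as_element_removal(arr, operand):
    # Reading 2: operand is itself one element; drop elements equal to it.
    return [e for e in arr if e != operand]


left = [[1, 2], 3, 1]
right = [1, 2]

print(minus_as_set_difference(left, right))   # [[1, 2], 3]
print(minus_as_element_removal(left, right))  # [3, 1]
```

The same operands give different results under each reading, so a single operator would have to pick one and surprise users who expected the other.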

@HonoreDB
Contributor Author

Oh, cool, that looks like a more performant way to smuggle a json_array_elements into an UPDATE than my answer. Still makes me sad that you can't just do a regular function call for something so simple.

@ajwerner
Contributor

ajwerner commented May 24, 2022

Hopefully in 22.2 we can define that logic in a UDF. A lot of the time the answer for why Postgres doesn't support something is that it's not hard to define the function yourself.

@otan otan removed the request for review from a team May 29, 2022 04:17
@HonoreDB HonoreDB closed this Aug 23, 2022
5 participants