Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql/pgwire: support for decoding JSON[] OID by aliasing to JSONB[] OID #94705

Merged
merged 3 commits into from
Jan 19, 2023

Conversation

ZhouXing19
Copy link
Collaborator

@ZhouXing19 ZhouXing19 commented Jan 4, 2023

Previously, with the parameter type oid (199) for JSON[], the pgwire parse message
will fail to be executed with a unknown oid type: 199. We now aliasing it
to the one for JSONB[].

This PR also includes a test revealing #95434.

This commit follows the changes in #88379.

fixes #90839

Release note (sql change): The pgwire protocol implementation can
now accept arguments of the JSON[] type (oid=199). Previously, it could
only accept JSONB[] (oid=3804). Internally, JSON[] and JSONB[] values are
still identical, so this change only affects how the values are received
over the wire protocol.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@ZhouXing19 ZhouXing19 force-pushed the jsonarr_encoding branch 4 times, most recently from 0be3bb9 to 4f69978 Compare January 5, 2023 08:29
@ZhouXing19 ZhouXing19 changed the title [WIP]sql: support decoding JSON[] sql/pgwire: support for decoding JSON[] OID by aliasing to JSONB[] OID Jan 5, 2023
@ZhouXing19 ZhouXing19 force-pushed the jsonarr_encoding branch 2 times, most recently from f3764aa to a208ef6 Compare January 5, 2023 21:01
@ZhouXing19 ZhouXing19 marked this pull request as ready for review January 5, 2023 22:07
@ZhouXing19 ZhouXing19 requested a review from a team January 5, 2023 22:07
@ZhouXing19 ZhouXing19 requested review from a team as code owners January 5, 2023 22:07
@ZhouXing19 ZhouXing19 requested a review from rafiss January 5, 2023 22:07
Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ZhouXing19)


-- commits line 29 at r1:
could you make this a bit more clear for a user to understand? "Fixed the formatting of JSON values inside of a SQL array so they don't have improper quoting."


pkg/sql/pgwire/pgwirebase/encoding.go line 547 at r2 (raw file):

			return &tree.DTSVector{TSVector: ret}, nil
		}
		if typ.Family() == types.ArrayFamily {

hm, what was wrong with this case for json arrays?


pkg/sql/pgwire/pgwirebase/encoding.go line 859 at r2 (raw file):

		default:
			if typ.Family() == types.ArrayFamily {
				return decodeBinaryArray(ctx, evalCtx, typ.ArrayContents(), b, code)

we should make sure this works correctly for oid.T__json. i think we need to make sure typ.ArrayContents() returns an OID of 114 and not 3802, so that the case oid.T_json is used above.


pkg/sql/types/types.go line 655 at r2 (raw file):

	// JSONArray is the type of an array value having JSON-typed elements.
	JSONArray = &T{InternalType: InternalType{
		Family: ArrayFamily, ArrayContents: Jsonb, Oid: oid.T__jsonb, Locale: &emptyLocale}}

i'm wondering if we need to have two types here, like

	// JSONBArray is the type of an array value having JSONB-typed elements.
	JSONBArray = &T{InternalType: InternalType{
		Family: ArrayFamily, ArrayContents: Jsonb, Oid: oid.T__jsonb, Locale: &emptyLocale}}

	// JSONArray is the type of an array value having JSON-typed elements.
	JSONArray = &T{InternalType: InternalType{
		Family: ArrayFamily, ArrayContents: Json, Oid: oid.T__json, Locale: &emptyLocale}}

(all the current usages should use JSONBArray, but the new ones you are adding should use JSONArray)

see my comment on DecodeDatum


pkg/sql/logictest/testdata/logic_test/json line 212 at r1 (raw file):

SELECT array_agg(bar ORDER BY pk) FROM foo
----
{"'{\"a\": \"b\"}'","'[1, 2, 3]'","\"hello\"",1.000,true,false,NULL,"'{\"x\": [1, 2, 3]}'","'{\"x\": {\"y\": \"z\"}}'"}

nice, thanks for fixing this too!


pkg/sql/pgwire/testdata/pgtest/json_array line 1 at r1 (raw file):

send

this should be send crdb_only or else make sure to add a until noncrdb_only below this.


pkg/sql/pgwire/testdata/pgtest/json_array line 48 at r1 (raw file):

Parse {"Query": "SELECT $1::JSON[]", "ParameterOIDs": [199]}
Describe {"ObjectType": "S"}
Bind {"ParameterFormatCodes": [0], "Parameters": [{"text":"{\"{\\\"a\\\":{}}\"}"}]}

we should have some tests where ParameterFormatCodes is [1] so that the binary input format is tested

@ZhouXing19 ZhouXing19 force-pushed the jsonarr_encoding branch 3 times, most recently from c553017 to 5c603ca Compare January 6, 2023 19:32
Copy link
Collaborator Author

@ZhouXing19 ZhouXing19 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rafiss)


-- commits line 29 at r1:

Previously, rafiss (Rafi Shamim) wrote…

could you make this a bit more clear for a user to understand? "Fixed the formatting of JSON values inside of a SQL array so they don't have improper quoting."

Done.


pkg/sql/pgwire/pgwirebase/encoding.go line 547 at r2 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

hm, what was wrong with this case for json arrays?

It's because if the given string is of the wrong form of an array or json element (e.g. ""), this part won't return an error, and the server will still send a BindComplete message to the client, as the actual parsing for an array happens after this.
This is inconsistent with PG -- we should not give BindComplete in this case.

The newly added case oid.T__json, oid.T__jsonbpart ensures the string can be parsed to an array of json. If not, we returns an error right away.


pkg/sql/pgwire/pgwirebase/encoding.go line 859 at r2 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

we should make sure this works correctly for oid.T__json. i think we need to make sure typ.ArrayContents() returns an OID of 114 and not 3802, so that the case oid.T_json is used above.

Yeah I've checked with the debugger that typ.ArrayContents() returns with oid T_json.


pkg/sql/types/types.go line 655 at r2 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

i'm wondering if we need to have two types here, like

	// JSONBArray is the type of an array value having JSONB-typed elements.
	JSONBArray = &T{InternalType: InternalType{
		Family: ArrayFamily, ArrayContents: Jsonb, Oid: oid.T__jsonb, Locale: &emptyLocale}}

	// JSONArray is the type of an array value having JSON-typed elements.
	JSONArray = &T{InternalType: InternalType{
		Family: ArrayFamily, ArrayContents: Json, Oid: oid.T__json, Locale: &emptyLocale}}

(all the current usages should use JSONBArray, but the new ones you are adding should use JSONArray)

see my comment on DecodeDatum

Nice catch, thanks! Done.


pkg/sql/pgwire/testdata/pgtest/json_array line 1 at r1 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

this should be send crdb_only or else make sure to add a until noncrdb_only below this.

Done.


pkg/sql/pgwire/testdata/pgtest/json_array line 48 at r1 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

we should have some tests where ParameterFormatCodes is [1] so that the binary input format is tested

Done, see below

@ZhouXing19 ZhouXing19 requested a review from rafiss January 6, 2023 20:31
Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ZhouXing19)


pkg/sql/pgwire/conn.go line 974 at r4 (raw file):

				continue
			}
			// This special case for json, json[] and jsonb[] is here so we can

nit: there is no special case for jsonb[] here


pkg/sql/types/oid.go line 82 at r4 (raw file):

	// oid.T_json:      Json,
	oid.T_jsonb:        Jsonb,
	oid.T__jsonb:       JSONBArray,

hm, this shouldn't be needed. note that there is no other Array type in this map


pkg/sql/types/types.go line 658 at r4 (raw file):

	// JSONArray is the type of an array value having JSON-typed elements.
	JSONArray = &T{InternalType: InternalType{

maybe we should name this new JSONArray type something like JSONArrayForDecodingOnly, so that people know not to use it in other places. (and explain in the comment)


pkg/sql/pgwire/testdata/pgtest/json_array line 57 at r4 (raw file):

----
{"Type":"BindComplete"}
{"Type":"DataRow","Values":[{"binary":"000000000000000000000eda"}]}

i ran this test against postgres with ./dev test pkg/sql/pgwire -f=TestPGTest/json_array --test-args='-addr localhost:5432 -user rafiss -rewrite' and there was a diff, so there's more adjusting needed. on postgres, this results in 000000000000000000000072, i think because PG outputs json[] while CRDB ouputs a jsonb[]


pkg/sql/pgwire/testdata/pgtest/json_array line 71 at r4 (raw file):

----
{"Type":"BindComplete"}
{"Type":"DataRow","Values":[{"binary":"000000010000000000000eda00000001000000010000000a017b2261223a207b7d7d"}]}

on postgres, this results in 0000000100000000000000720000000100000001000000087b2261223a7b7d7d. this is also because PG outputs json[] while CRDB ouputs a jsonb[]


pkg/sql/pgwire/testdata/pgtest/json_array line 86 at r4 (raw file):

----
{"Type":"BindComplete"}
{"Type":"DataRow","Values":[{"text":"{\"{\\\"a\\\": {}}\"}"}]}

on PG this results in {"Type":"DataRow","Values":[{"text":"{\"{\\\"a\\\":{}}\"}"}]} (no space after the :)


pkg/sql/pgwire/testdata/pgtest/json_array line 125 at r4 (raw file):

# Test with input in binary.
send
Parse {"Query": "SELECT $1::JSON[]", "ParameterOIDs":[199]}

we still should add another test where ParameterFormatCode=1, and ParameterOID=199, and where the data is successfully decoded. let's test these too:

  • 000000000000000000000072 - empty array in json[] format
  • 0000000100000000000000720000000100000001000000087b2261223a7b7d7d - {"{\"a\":{}}"}"}

pkg/sql/pgwire/testdata/pgtest/json_array line 126 at r4 (raw file):

send
Parse {"Query": "SELECT $1::JSON[]", "ParameterOIDs":[199]}
Bind {"ParameterFormatCodes": [1], "Parameters": [{"binary":"000000000000000000000eda"}]}

can you add a comment explaining what text value 000000000000000000000eda represents? my understanding is that it is an empty array in jsonb[] format


pkg/sql/pgwire/testdata/pgtest/json_array line 136 at r4 (raw file):

----
{"Type":"ParseComplete"}
{"Type":"ErrorResponse","Code":"08P01"}

on PG this returns error code 42804 (data type mismatch)

to fix this test, add a until mapError=(42804, 08P01) and a commend explaining that PG returns 42804 (data type mismatch) while CRDB returns 08P01 (protocol violation), but both are acceptable

@ZhouXing19 ZhouXing19 force-pushed the jsonarr_encoding branch 2 times, most recently from 64fa7d9 to 51e8015 Compare January 18, 2023 17:02
Copy link
Collaborator Author

@ZhouXing19 ZhouXing19 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rafiss)


pkg/sql/pgwire/conn.go line 974 at r4 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

nit: there is no special case for jsonb[] here

Done.


pkg/sql/types/oid.go line 82 at r4 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

hm, this shouldn't be needed. note that there is no other Array type in this map

Done.


pkg/sql/types/types.go line 658 at r4 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

maybe we should name this new JSONArray type something like JSONArrayForDecodingOnly, so that people know not to use it in other places. (and explain in the comment)

Done.


pkg/sql/pgwire/testdata/pgtest/json_array line 57 at r4 (raw file):

i think because PG outputs json[] while CRDB ouputs a jsonb[]

Yeah exactly. It's because cockroach parses JSON[] as JSONB[]. This happens to JSON as well. I added tests for both pg and crdb with comments.


pkg/sql/pgwire/testdata/pgtest/json_array line 71 at r4 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

on postgres, this results in 0000000100000000000000720000000100000001000000087b2261223a7b7d7d. this is also because PG outputs json[] while CRDB ouputs a jsonb[]

Split the test for pg and crdb. Also pg's binary record the number of spaces between the key and value in the json text input, but cockroach doesn't. See the tests added in the third commit.


pkg/sql/pgwire/testdata/pgtest/json_array line 86 at r4 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

on PG this results in {"Type":"DataRow","Values":[{"text":"{\"{\\\"a\\\":{}}\"}"}]} (no space after the :)

This turns out to be a bug for json as well. Filed here, also added tests in the 3rd commit.


pkg/sql/pgwire/testdata/pgtest/json_array line 125 at r4 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

we still should add another test where ParameterFormatCode=1, and ParameterOID=199, and where the data is successfully decoded. let's test these too:

  • 000000000000000000000072 - empty array in json[] format
  • 0000000100000000000000720000000100000001000000087b2261223a7b7d7d - {"{\"a\":{}}"}"}

Done. Note that I added the json[] binary for {"{\"a\": {}}"}"} (with a space between the kv) so it differs from the value you had above.


pkg/sql/pgwire/testdata/pgtest/json_array line 126 at r4 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

can you add a comment explaining what text value 000000000000000000000eda represents? my understanding is that it is an empty array in jsonb[] format

Done.


pkg/sql/pgwire/testdata/pgtest/json_array line 136 at r4 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

on PG this returns error code 42804 (data type mismatch)

to fix this test, add a until mapError=(42804, 08P01) and a commend explaining that PG returns 42804 (data type mismatch) while CRDB returns 08P01 (protocol violation), but both are acceptable

Ah TIL. Thanks, done.

Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice tests, this lgtm!

Reviewed 2 of 3 files at r1, 7 of 8 files at r6, 2 of 2 files at r7, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ZhouXing19)

Previously, json in an array is always format with single quotes surrounding it.
To be consistent with postgres, we now format them with bare string mode.

I.e. CRDB

```
> SELECT ARRAY['{"a":{}}']::JSON[];
         array
----------------------
  {"'{\"a\": {}}'"}

```

PG:

```
> SELECT ARRAY['{"a":{}}']::JSON[];
     array
----------------
 {"{\"a\":{}}"}

```

Also added tests for json array in pgwire.

Release note (bug fix): Fixed the formatting of JSON values inside of a SQL
array so they don't have improper quoting.
fixes cockroachdb#90839

Previously, with the parameter type oid (199) for JSON[], the pgwire parse message
will fail to be executed with a `unknown oid type: 199`. We now aliasing it
to the one for `JSONB[]`.

This commit follows the changes in cockroachdb#88379.

Release note (sql change): The pgwire protocol implementation can
now accept arguments of the JSON[] type (oid=199). Previously, it could
only accept JSONB[] (oid=3804). Internally, JSON[] and JSONB[] values are
still identical, so this change only affects how the values are received
over the wire protocol.
We noticed a bug that in crdb the text output for a json doesn't follow input
spacing between key and value. We are adding tests for this and this bug is
filed in cockroachdb#95434.

Release note: None
@ZhouXing19
Copy link
Collaborator Author

TFTR!
bors r+

@craig
Copy link
Contributor

craig bot commented Jan 19, 2023

Build succeeded:

@craig craig bot merged commit 0bc78ed into cockroachdb:master Jan 19, 2023
@blathers-crl
Copy link

blathers-crl bot commented Jan 19, 2023

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from ce51054 to blathers/backport-release-22.2-94705: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 22.2.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql/pgwire: need support for decoding JSON[] OID by aliasing to JSONB[] OID [unknown oid type: 199]
3 participants