Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding: fix UnsafeConvertStringToBytes to work with large input strings #111627

Merged
merged 1 commit into from
Oct 3, 2023
Merged

encoding: fix UnsafeConvertStringToBytes to work with large input strings #111627

merged 1 commit into from
Oct 3, 2023

Conversation

ecwall
Copy link
Contributor

@ecwall ecwall commented Oct 3, 2023

Fixes #111626

The previous impl assumed input string length <= math.MaxInt32. Go 1.20 added unsafe.StringData (https://pkg.go.dev/unsafe#StringData) which properly handles longer strings. This changes the impl to use unsafe.StringData and adds a unit test.

Release note (bug fix): Fixed a panic that could occur if a query uses a string
larger than 2^31-1 bytes.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@ecwall ecwall requested review from j82w, cucaroach and a team October 3, 2023 01:14
@ecwall ecwall added the backport-23.1.x Flags PRs that need to be backported to 23.1 label Oct 3, 2023
@@ -913,18 +913,7 @@ func prettyPrintInvertedIndexKey(b []byte) (string, []byte, error) {
// modified if the input string is expected to be used again - doing so could
// violate Go semantics.
func UnsafeConvertStringToBytes(s string) []byte {
if len(s) == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the go docs are to be believed I think we need to leave in the zero length check. https://pkg.go.dev/unsafe#StringData

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, added this back.

@cucaroach
Copy link
Contributor

Thanks for the quick fix!

@ecwall ecwall requested a review from cucaroach October 3, 2023 11:56
Copy link
Contributor

@cucaroach cucaroach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ecwall and @j82w)


pkg/util/encoding/encoding.go line 999 at r2 (raw file):

	// Next we treat the string data as a maximally sized array which we
	// slice. This usage is safe because the pointer value remains in the string.
	arg := (*[0x7fffffff]byte)(unsafe.Pointer(hdr.Data))[:len(s):len(s)]

We should fix this one too

@ecwall
Copy link
Contributor Author

ecwall commented Oct 3, 2023

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ecwall and @j82w)

pkg/util/encoding/encoding.go line 999 at r2 (raw file):

	// Next we treat the string data as a maximally sized array which we
	// slice. This usage is safe because the pointer value remains in the string.
	arg := (*[0x7fffffff]byte)(unsafe.Pointer(hdr.Data))[:len(s):len(s)]

We should fix this one too

Good catch.

@ecwall ecwall requested a review from cucaroach October 3, 2023 13:44
@j82w
Copy link
Contributor

j82w commented Oct 3, 2023

pkg/util/encoding/encoding_test.go line 2648 at r3 (raw file):

		t.Run(tc.desc, func(t *testing.T) {
			actual := UnsafeConvertStringToBytes(tc.input)
			require.Equal(t, tc.expected, actual)

I think the test might be failing because the t is overloaded, and it's using the outer t from the original test call. I would suggest renaming it to avoid the ambiguity to make sure it use the t from the function.

@ecwall
Copy link
Contributor Author

ecwall commented Oct 3, 2023

pkg/util/encoding/encoding_test.go line 2648 at r3 (raw file):

		t.Run(tc.desc, func(t *testing.T) {
			actual := UnsafeConvertStringToBytes(tc.input)
			require.Equal(t, tc.expected, actual)

I think the test might be failing because the t is overloaded, and it's using the outer t from the original test call. I would suggest renaming it to avoid the ambiguity to make sure it use the t from the function.

The inner t shadows the outer one which is what we want because calling require.Equal(t, ...) with the outer one when inside a subtest will fail the test with a useless error message like

testing.go:1471: test executed panic(nil) or runtime.Goexit: subtest may have called FailNow on a parent test

I think the test needs to be skipped under stress because of the large input.

Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @cucaroach, @ecwall, and @j82w)


-- commits line 11 at r4:
nit: the public facing release notes should not reference an internal function name in our code base. can this be rephrased so something more high level? like "Fixed a panic that could occur if a query uses a string larger than 2^32 bytes."

Copy link
Contributor Author

@ecwall ecwall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @cucaroach, @j82w, and @rafiss)


-- commits line 11 at r4:

Previously, rafiss (Rafi Shamim) wrote…

nit: the public facing release notes should not reference an internal function name in our code base. can this be rephrased so something more high level? like "Fixed a panic that could occur if a query uses a string larger than 2^32 bytes."

The function name appears in the error output because of the panic so I thought it would make sense to reference it here so customers can identify if they had this problem. I don't have a strong opinion though so I will change it (also, it is 2^31-1 bytes).

@ecwall ecwall requested a review from rafiss October 3, 2023 16:59
…ings

Fixes #111626

The previous impl assumed input string length <= math.MaxInt32. Go 1.20 added
unsafe.StringData (https://pkg.go.dev/unsafe#StringData) which properly handles
longer strings. This changes the impl to use unsafe.StringData and adds a unit
test.

Release note (bug fix): Fixed a panic that could occur if a query uses a string
larger than 2^31-1 bytes.
@ecwall
Copy link
Contributor Author

ecwall commented Oct 3, 2023

bors r+

@craig
Copy link
Contributor

craig bot commented Oct 3, 2023

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-23.1.x Flags PRs that need to be backported to 23.1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

encoding: UnsafeConvertStringToBytes panic on large input
5 participants