evalengine: Support built-in MySQL function for CONV function #11566
Signed-off-by: Weijun-H <[email protected]>
This is not a great implementation @Weijun-H. You need to put more thought into this. If you widen your set of integration test cases you'll see that it breaks at the seams very easily.
go/vt/vtgate/evalengine/string.go
Outdated
fromBase := args[1]
toBase := args[2]
Capture the args by reference
go/vt/vtgate/evalengine/string.go
Outdated
fromBase := args[1]
toBase := args[2]

if inarg.isNull() || fromBase.isNull() || toBase.isNull() || fromBase.int64() < 2 || fromBase.int64() > 36 || toBase.int64() < 2 || toBase.int64() > 36 {
You cannot call `int64` over and over; `toBase` and `fromBase` need to be converted to their integer forms before they can be checked.
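A minimal sketch of the "convert once, then check" shape, assuming the two bases have already been turned into plain `int64` values. `validBases` and the two constants are hypothetical names for illustration, not evalengine API, and the range simply mirrors the check in the diff under review:

```go
package main

// Range taken from the condition in the diff under review.
const (
	minBase = 2
	maxBase = 36
)

// validBases expects both bases to have been converted to integers exactly
// once by the caller, and reports whether both lie in [minBase, maxBase].
func validBases(fromBase, toBase int64) bool {
	inRange := func(b int64) bool { return b >= minBase && b <= maxBase }
	return inRange(fromBase) && inRange(toBase)
}
```

With this shape the caller holds two integers and the null and range handling can be written against them instead of calling the accessor six times.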
go/vt/vtgate/evalengine/string.go
Outdated
	return
}

inarg.makeSignedIntegral()
You don't want a signed integral for `inarg`. You want the raw textual bytes so you can process them.
go/vt/vtgate/evalengine/string.go
Outdated
inarg.makeSignedIntegral()

num, _ := strconv.ParseInt(fmt.Sprint(inarg.int64()), int(fromBase.int64()), 64)
MySQL handles overflow with saturation, not 0-truncation.
Try:
+-----------------------------------------------------------------------------------------------------------------------+
| conv(99999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999, 10, 2) |
+-----------------------------------------------------------------------------------------------------------------------+
| 1111111111111111111111111111111111111111111111111111111111111111 |
+-----------------------------------------------------------------------------------------------------------------------+
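One way to get saturation out of the standard library, as a sketch rather than the fix for this PR: `strconv.ParseUint` returns the maximum `uint64` together with `strconv.ErrRange` on overflow, so keeping the value instead of discarding the error yields the 64 ones shown above. `convSaturating` is a hypothetical helper, it only covers unsigned input without a sign or trailing garbage, and it assumes both bases were already validated:

```go
package main

import (
	"errors"
	"strconv"
	"strings"
)

// convSaturating converts digits from fromBase to toBase.
// Assumption: both bases are already known to lie in [2, 36].
func convSaturating(digits string, fromBase, toBase int) (string, error) {
	n, err := strconv.ParseUint(digits, fromBase, 64)
	if err != nil && !errors.Is(err, strconv.ErrRange) {
		return "", err // syntax errors are still reported to the caller
	}
	// On ErrRange, ParseUint has already returned the largest uint64,
	// so n is saturated rather than truncated to 0.
	return strings.ToUpper(strconv.FormatUint(n, toBase)), nil
}
```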
go/vt/vtgate/evalengine/string.go
Outdated
convNum := strconv.FormatUint(uint64(num), int(toBase.int64()))
convNum = strings.ToUpper(convNum)
result.setRaw(sqltypes.VarChar, []byte(convNum), inarg.collation())
result.makeTextual(env.DefaultCollation)
Absolutely no point in calling `makeTextual` after `setRaw`. Particularly if you're going to change the collation. You can set the desired collation directly on the previous call.
_, f1 := args[0].typeof(env)
// typecheck the right-hand argument but ignore its flags
args[1].typeof(env)
Missing typecheck for `args[2]`.
cases := []string{
	"10",
	"10 + '10' + 10",
	"-10",
	"'10'",
}
bases := []string{
	"-1",
	"1",
	"2",
	"4",
	"8",
	"10",
	"16",
	"32",
}
This is a very limited set of test cases. You need to widen this. Particularly important: hex literals.
Signed-off-by: Weijun-H <[email protected]>
Signed-off-by: Weijun-H <[email protected]>
I actually checked `'+10'` and it fails.
I also checked `'10-9+10'`, and it fails too. You aren't supposed to evaluate the string in the conv function. Just use the raw bytes and interpret them in the given base. Your output for this string is `9` (which I don't understand how you got to anyway; `11` I could understand), but MySQL just says 10, because it ignores everything after the - sign.
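To make "use the raw bytes and interpret them in the given base" concrete, here is a sketch that reads an optional sign followed by the longest run of digits valid in the base and ignores the rest. `digitValue` and `parsePrefix` are hypothetical names, overflow is deliberately not handled here, and the behaviour is only checked against the two strings mentioned in this comment:

```go
package main

// digitValue returns the numeric value of an ASCII digit or letter,
// or -1 if c is neither.
func digitValue(c byte) int {
	switch {
	case c >= '0' && c <= '9':
		return int(c - '0')
	case c >= 'a' && c <= 'z':
		return int(c-'a') + 10
	case c >= 'A' && c <= 'Z':
		return int(c-'A') + 10
	}
	return -1
}

// parsePrefix reads one optional sign and then the longest run of digits
// that are valid in base, stopping at the first byte that is not.
// Nothing is evaluated: "10-9+10" in base 10 stops at the '-'.
// Overflow is not handled in this sketch.
func parsePrefix(raw []byte, base int) (value uint64, negative bool) {
	i := 0
	if i < len(raw) && (raw[i] == '+' || raw[i] == '-') {
		negative = raw[i] == '-'
		i++
	}
	for ; i < len(raw); i++ {
		d := digitValue(raw[i])
		if d < 0 || d >= base {
			break
		}
		value = value*uint64(base) + uint64(d)
	}
	return value, negative
}
```

Under these assumptions `parsePrefix([]byte("10-9+10"), 10)` yields 10 and `parsePrefix([]byte("+10"), 10)` also yields 10, with `negative` false in both cases.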
cases := []string{
	"-5.1",
	"-5.9",
	"0xa21 + '1'",
	"-0xa21 + '1'",
	"10",
	"10 + '10' + 10",
	"10 + '10' - 10",
	"-10",
	"'10'",
	"10+'10'+'10a'+X'0a'",
	"10 / 10",
	"X'0FFFFFFFFFFFFFF'",
	"99999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999",
	"-99999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999",
}
These still aren't strong enough cases. What happens if we have a `+` in the beginning? What if we have 2 of those in the beginning? What if it is a string with a + and not a number?
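A sketch of how the extra inputs could be crossed with every pair of bases so that none of the combinations is skipped. `signCases` and `convQueries` are hypothetical names, and the query text only follows the `SELECT CONV(...)` form seen in the test output of this thread; how the real test harness consumes the queries is not shown here:

```go
package main

import "fmt"

// signCases are example inputs for the questions above: a quoted string
// with one leading '+', with two, with a '+' followed by a non-number,
// and a bare numeric expression with a leading '+'.
var signCases = []string{"'+10'", "'++10'", "'+a'", "+10"}

// convQueries builds one query per (input, fromBase, toBase) combination.
func convQueries(inputs, bases []string) []string {
	queries := make([]string, 0, len(inputs)*len(bases)*len(bases))
	for _, in := range inputs {
		for _, from := range bases {
			for _, to := range bases {
				queries = append(queries, fmt.Sprintf("SELECT CONV(%s, %s, %s)", in, from, to))
			}
		}
	}
	return queries
}
```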
go/vt/vtgate/evalengine/string.go
Outdated
if inarg.isNull() || inarg2.isNull() || inarg3.isNull() || fromBase < 2 || fromBase > 36 || toBase < 2 || toBase > 36 {
	result.setNull()
	return
}
I don't think the checks `inarg2.isNull()` or `inarg3.isNull()` are required. The values of fromBase and toBase would be 0 if they were null, so the condition would still be true.
go/vt/vtgate/evalengine/string.go
Outdated
if t == sqltypes.Float64 {
	for i, c := range rawString {
		if c == '-' {
			continue
		}
		if (fromBase <= 9 && c >= '0' && c <= rune('0'+fromBase)) || (fromBase > 9 && ((c >= '0' && c <= '9') || (c >= 'a' && c <= rune('a'+fromBase-9)))) {
			continue
		} else {
			rawString = rawString[:i]
			break
		}
	}
}
Is there any specific reason why you need the float type as a separate case?
go/vt/vtgate/evalengine/string.go
Outdated
}

re, _ := regexp.Compile(`[+-]?[0-9.x]+[a-vA-Vx]*`)
for _, num := range re.FindAllString(rawString, -1) {
Why do you need all substring matches? Don't you only need the first one? In what case would there be more than one match?
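If a regular expression is kept at all, a sketch of the single-match form could look like this. The anchored pattern and the names `numberPrefix` and `firstNumber` are assumptions for illustration, not the pattern from the diff; the pattern is compiled once with `regexp.MustCompile` instead of on every call with the error ignored:

```go
package main

import "regexp"

// numberPrefix is compiled once at package initialisation. MustCompile
// panics on an invalid pattern instead of silently dropping the error.
// Assumption: only an optional sign plus leading alphanumerics matter.
var numberPrefix = regexp.MustCompile(`^[+-]?[0-9a-zA-Z]+`)

// firstNumber returns the leading numeric token of s, or "" if s does
// not start with one. Only the first match is ever needed.
func firstNumber(s string) string {
	return numberPrefix.FindString(s)
}
```

Whether each returned character is a valid digit in the source base still has to be checked afterwards; the pattern only isolates the leading token.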
Signed-off-by: Weijun-H <[email protected]>
Apart from this, there are still cases that aren't checked and, spoiler, some are still failing.
Specifically these cases:
string_fun_test.go:256: different results: NULL; mysql response: VARCHAR("-H") (local collation: binary; mysql collation: utf8mb4_0900_ai_ci)
query: SELECT CONV(-17, 10, -18) (SIMPLIFY=false)
string_fun_test.go:252: different results: NULL; mysql response: VARCHAR("-15") (local collation: binary; mysql collation: utf8mb4_0900_ai_ci)
query: SELECT CONV(-17, 16, -18) (SIMPLIFY=false)
string_fun_test.go:256: different results: NULL; mysql response: VARCHAR("-15") (local collation: binary; mysql collation: utf8mb4_0900_ai_ci)
query: SELECT CONV(-17, 16, -18) (SIMPLIFY=false)
string_fun_test.go:252: different results: NULL; mysql response: VARCHAR("-23") (local collation: binary; mysql collation: utf8mb4_0900_ai_ci)
query: SELECT CONV(-17, 32, -18) (SIMPLIFY=false)
MySQL allows negative from_base and to_base values. If from_base is negative, the number is treated as a signed integer.
	toNum = strings.ToUpper(temp)
}

inarg.makeTextualAndConvert(env.DefaultCollation)
From what I can see, there really is no need to convert inarg to the environment's collation. You can just extract the collation from inarg, change the collation, and then typecast the result directly using that.
This assumes that the goal is to have the collation finally be the default collation, with the coercibility and repertoire inherited from the first argument.
In this case, you can even remove the `inarg.makeUnsignedIntegral` call too, because you only need the raw bytes, so that is also only being done for the collation changes.
However, I don't know what coercibility or repertoire we want. Is there some way to verify from MySQL what it does?
> You can just extract the collation from inarg and change the collation and then type cast the result directly using that.
I have no idea how to do it this way.
In my opinion, the best way to do the conversion is to do it in two parts. You should convert using from_base and handle the behaviour of a negative from_base value. Once that conversion is complete, then do a second conversion into the to_base value and also handle the negative values there.
And then, if your to_base is negative, your output can be signed.
These are all worthwhile test cases to have:
mysql [localhost:8026] {msandbox} (test) > SELECT CONV(10000000000000000000000000,-10,-10);
+------------------------------------------+
| CONV(10000000000000000000000000,-10,-10) |
+------------------------------------------+
| 9223372036854775807 |
+------------------------------------------+
1 row in set, 1 warning (0.00 sec)
mysql [localhost:8026] {msandbox} (test) > SELECT CONV(10000000000000000000000000,-10,10);
+-----------------------------------------+
| CONV(10000000000000000000000000,-10,10) |
+-----------------------------------------+
| 9223372036854775807 |
+-----------------------------------------+
1 row in set, 1 warning (0.00 sec)
mysql [localhost:8026] {msandbox} (test) > SELECT CONV(10000000000000000000000000,10,10);
+----------------------------------------+
| CONV(10000000000000000000000000,10,10) |
+----------------------------------------+
| 18446744073709551615 |
+----------------------------------------+
1 row in set, 1 warning (0.00 sec)
mysql [localhost:8026] {msandbox} (test) > SELECT CONV(10000000000000000000000000,10,-10);
+-----------------------------------------+
| CONV(10000000000000000000000000,10,-10) |
+-----------------------------------------+
| -1 |
+-----------------------------------------+
1 row in set, 1 warning (0.00 sec)
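The two-part conversion can be sketched as below. This is not the PR's code and not MySQL's implementation; the rules are inferred only from the outputs quoted in this thread: a negative from_base clamps to the signed 64-bit range, a positive from_base saturates at the unsigned maximum, and a negative to_base prints the 64-bit pattern as signed. `conv`, `parseMagnitude`, `digitVal` and `abs` are hypothetical names:

```go
package main

import (
	"math"
	"strconv"
	"strings"
)

func abs(n int) int {
	if n < 0 {
		return -n
	}
	return n
}

// digitVal maps an ASCII digit or letter to its value, or -1.
func digitVal(c byte) int {
	switch {
	case c >= '0' && c <= '9':
		return int(c - '0')
	case c >= 'a' && c <= 'z':
		return int(c-'a') + 10
	case c >= 'A' && c <= 'Z':
		return int(c-'A') + 10
	}
	return -1
}

// parseMagnitude reads an optional sign and the longest valid digit prefix,
// reporting whether the magnitude no longer fits in a uint64.
func parseMagnitude(s string, base uint64) (mag uint64, neg, overflow bool) {
	i := 0
	if i < len(s) && (s[i] == '+' || s[i] == '-') {
		neg = s[i] == '-'
		i++
	}
	for ; i < len(s); i++ {
		d := digitVal(s[i])
		if d < 0 || uint64(d) >= base {
			break
		}
		if mag > (math.MaxUint64-uint64(d))/base {
			overflow = true // keep consuming digits, the value is saturated
			continue
		}
		mag = mag*base + uint64(d)
	}
	return mag, neg, overflow
}

// conv returns false where the sketch assumes the result is NULL.
func conv(s string, fromBase, toBase int) (string, bool) {
	fromAbs, toAbs := abs(fromBase), abs(toBase)
	if fromAbs < 2 || fromAbs > 36 || toAbs < 2 || toAbs > 36 {
		return "", false
	}
	// Part 1: decode with from_base into a 64-bit pattern.
	mag, neg, overflow := parseMagnitude(s, uint64(fromAbs))
	var bits uint64
	switch {
	case fromBase < 0 && neg && (overflow || mag > 1<<63):
		bits = 1 << 63 // clamp to the minimum int64
	case fromBase < 0 && !neg && (overflow || mag > math.MaxInt64):
		bits = math.MaxInt64 // clamp to the maximum int64
	case fromBase > 0 && overflow:
		bits = math.MaxUint64 // unsigned input saturates
	case neg:
		bits = -mag // two's complement of the magnitude
	default:
		bits = mag
	}
	// Part 2: encode with to_base; a negative to_base means signed output.
	if toBase < 0 {
		return strings.ToUpper(strconv.FormatInt(int64(bits), toAbs)), true
	}
	return strings.ToUpper(strconv.FormatUint(bits, toAbs)), true
}
```

Under these assumptions, `conv("10000000000000000000000000", 10, -10)` gives `-1` and `conv("-17", 10, -18)` gives `-H`, matching the MySQL responses quoted in this thread.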
Signed-off-by: Weijun-H <[email protected]>
This PR is being marked as stale because it has been open for 30 days with no activity. To rectify, you may do any of the following:
If no action is taken within 7 days, this PR will be closed.
This PR was closed because it has been stale for 7 days with no activity.
Signed-off-by: Weijun-H [email protected]
Description
The following functions will be supported