-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support unary '+' operators for strings #3013
Support unary '+' operators for strings #3013
Conversation
Pull Request Test Coverage Report for Build 11307572895Warning: This coverage report may be inaccurate.This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Details
💛 - Coveralls |
*/ | ||
std::string op = getFullText(ctx->op); | ||
if (op.front() == '+') { | ||
auto rhsctx = ctx->expression(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does rhs stands for in this context ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RHS=right-hand side ; LHS=left-hand side (of an expression/equation)
See https://en.wikipedia.org/wiki/Sides_of_an_equation
* In SQL this shows as redundant operators, for example for concatenation: SELECT 'a' ++ 'b'. Expressions | ||
* such as +++(+++@v)) are also valid syntax according to the T-SQL grammar even though they look unusual. | ||
* Here we remove such unary '+' operators, which are redundant anyway. | ||
* However we do not touch numeric constants (e.g. +123) since the '+', although still redundant, may |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It mentioned we do not touch numeric constants, how about the numeric variables ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All expressions, including numeric ones (except numeric constants), have their unary +
removed. If it is a unary operator, then the +
is always redundant. This applies to variables, bracketed expressions as well as subqueries (see the test cases).
The reason we don't want to touch +123
is that this is a commonly known and meaningful notation by itself, which you learn in school, just like -123
; +@v
in contrast, is not. Arguably, there is a subjective element to this reasoning but the idea is to avoid modifying such familiar-looking numeric constants in case they show up somewhere, like in captured SQL code.
Does this change only work for string constant ? What about string variables ? |
(rhs.front() == '\'') || // single-quoted strings | ||
(rhs.front() == '"') || // both double-quoted strings and double-quoted identifiers | ||
(rhs.front() == '@') || // variables | ||
(rhs.front() == '(') || // bracketed expressions | ||
(rhs.front() == '[') || // bracket-delimited identifiers | ||
(rhs.front() == '_') || // identifiers starting with an underscore | ||
std::isalpha(rhs.front()) // identifiers as well as the N'...' string notation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are we trying to do here? Why only these characters are considered?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These characters match anything except a numeric constant. Technically +(123)
is also a numeric constant in T-SQL grammar terms, but at school we learned to interpret this as (+1)*(123)
so removing the +
is fine here, per the philosophy as outlined in the earlier comment above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So can't we check only if the next char is numeric? Something like this:
if (rhs.front() == '+') {
// do something
}
else if (!std::isdigit(rhs.front())) {
stream.setText(ctx->op->getStartIndex(), " ");
break;
}
break;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There would also have to be a test on the first char being '.' (e.g. +.5).
The advantage of the current coding explains what sort of expression we're removing the '+' from, which I find more helpful. But that's partly a matter of taste.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, my point was to make the conditions a little simpler but anyways if it works this way then I am fine with it as well.
Also see comment above. |
(rhs.front() == '\'') || // single-quoted strings | ||
(rhs.front() == '"') || // both double-quoted strings and double-quoted identifiers | ||
(rhs.front() == '@') || // variables | ||
(rhs.front() == '(') || // bracketed expressions | ||
(rhs.front() == '[') || // bracket-delimited identifiers | ||
(rhs.front() == '_') || // identifiers starting with an underscore | ||
std::isalpha(rhs.front()) // identifiers as well as the N'...' string notation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, my point was to make the conditions a little simpler but anyways if it works this way then I am fine with it as well.
8ad86ff
into
babelfish-for-postgresql:BABEL_4_X_DEV
…esql#3013)" This reverts commit 8ad86ff.
This reverts commit 8ad86ff. [PR 3013](#3013) incorrectly ended up in BABEL_4_4_STABLE. Signed-off-by: Rob Verschoor <[email protected]>
Description
The T-SQL grammar allows an arbitrary number of unary
+
operators to precede an expression,but PG only supports that for numeric expressions. For string expressions, the
+
will be parsed fine by the T-SQL ANTLR parser, but these will raise an error in PG.In SQL those unary
+
string operators may look like they are additional concatenation operators, like:SELECT 'a' ++ 'b'
. Expressions such as+++(+++@v))
are also valid syntax according to the T-SQL grammar even though they look unusual.In this fix we remove such unary
+
operators, which are redundant anyway.However we do not touch numeric constants (e.g.
+123
) since the+
, though still redundant, mayhave been included for code clarity (e.g.
+123
as opposed to-123
).Signed-off-by: Rob Verschoor [email protected]
Issues Resolved
BABEL-1924 Support unary '+' operator for string expressions
Test Scenarios Covered
Use case based - Yes
Boundary conditions - N/A
Arbitrary inputs - N/A
Negative test cases - Yes
Minor version upgrade tests - N/A
Major version upgrade tests - N/A
Performance tests - N/A
Tooling impact - N/A
Client tests - N/A
Check List
By submitting this pull request, I confirm that my contribution is under the terms of the Apache 2.0 and PostgreSQL licenses, and grant any person obtaining a copy of the contribution permission to relicense all or a portion of my contribution to the PostgreSQL License solely to contribute all or a portion of my contribution to the PostgreSQL open source project.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.