Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support unary '+' operators for strings #3013

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
57 changes: 51 additions & 6 deletions contrib/babelfishpg_tsql/src/tsqlIface.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2499,11 +2499,11 @@ class tsqlBuilder : public tsqlCommonMutator
{
// Check for comparison operators directly followed by an '@@' variable, like =@@
handleAtAtVarInPredicate(ctx);
}
}
void exitUnary_op_expr(TSqlParser::Unary_op_exprContext *ctx) override
{
handleBitNotOperator(ctx);
}
}
void exitPlus_minus_bit_expr(TSqlParser::Plus_minus_bit_exprContext *ctx) override
{
handleBitOperators(ctx);
Expand Down Expand Up @@ -3217,10 +3217,7 @@ class tsqlMutator : public TSqlParserBaseListener
// Check for comparison operators directly followed by an '@@' variable, like =@@
handleAtAtVarInPredicate(ctx);
}
void exitUnary_op_expr(TSqlParser::Unary_op_exprContext *ctx) override
{
handleBitNotOperator(ctx);
}

void exitPlus_minus_bit_expr(TSqlParser::Plus_minus_bit_exprContext *ctx) override
{
handleBitOperators(ctx);
Expand All @@ -3229,6 +3226,54 @@ class tsqlMutator : public TSqlParserBaseListener
{
handleModuloOperator(ctx);
}

void enterUnary_op_expr(TSqlParser::Unary_op_exprContext *ctx) override
{
/*
* The T-SQL grammar allows an arbitrary number of unary '+' operators to precede an expression,
* but PG only supports that for numeric expressions. For string expressions, such a '+' will raise an error in PG.
* In SQL this shows as redundant operators, for example for concatenation: SELECT 'a' ++ 'b'. Expressions
* such as +++(+++@v)) are also valid syntax according to the T-SQL grammar even though they look unusual.
* Here we remove such unary '+' operators, which are redundant anyway.
* However we do not touch numeric constants (e.g. +123) since the '+', although still redundant, may
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It mentioned we do not touch numeric constants, how about the numeric variables ?

Copy link
Contributor Author

@robverschoor robverschoor Oct 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All expressions, including numeric ones (except numeric constants), have their unary + removed. If it is a unary operator, then the + is always redundant. This applies to variables, bracketed expressions as well as subqueries (see the test cases).
The reason we don't want to touch +123 is that this is a commonly known and meaningful notation by itself, which you learn in school, just like -123; +@v in contrast, is not. Arguably, there is a subjective element to this reasoning but the idea is to avoid modifying such familiar-looking numeric constants in case they show up somewhere, like in captured SQL code.

* have been included for code clarity (e.g. +123 as opposed to -123).
*/
std::string op = getFullText(ctx->op);
if (op.front() == '+') {
auto rhsctx = ctx->expression();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does rhs stands for in this context ?

Copy link
Contributor Author

@robverschoor robverschoor Oct 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RHS=right-hand side ; LHS=left-hand side (of an expression/equation)
See https://en.wikipedia.org/wiki/Sides_of_an_equation

while (true) {
std::string rhs = getFullText(rhsctx);
if (
(rhs.front() == '\'') || // single-quoted strings
(rhs.front() == '"') || // both double-quoted strings and double-quoted identifiers
(rhs.front() == '@') || // variables
(rhs.front() == '(') || // bracketed expressions
(rhs.front() == '[') || // bracket-delimited identifiers
(rhs.front() == '_') || // identifiers starting with an underscore
std::isalpha(rhs.front()) // identifiers as well as the N'...' string notation
Comment on lines +3247 to +3253
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are we trying to do here? Why only these characters are considered?

Copy link
Contributor Author

@robverschoor robverschoor Oct 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These characters match anything except a numeric constant. Technically +(123) is also a numeric constant in T-SQL grammar terms, but at school we learned to interpret this as (+1)*(123) so removing the + is fine here, per the philosophy as outlined in the earlier comment above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So can't we check only if the next char is numeric? Something like this:

if (rhs.front() == '+')  {
   // do something
}
else if (!std::isdigit(rhs.front())) {
    stream.setText(ctx->op->getStartIndex(), " ");	
    break;
}
break;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There would also have to be a test on the first char being '.' (e.g. +.5).
The advantage of the current coding explains what sort of expression we're removing the '+' from, which I find more helpful. But that's partly a matter of taste.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, my point was to make the conditions a little simpler but anyways if it works this way then I am fine with it as well.

) {
stream.setText(ctx->op->getStartIndex(), " ");
break;
}
if (rhs.front() == '+') {
if (dynamic_cast<TSqlParser::Unary_op_exprContext *>(rhsctx)) {
TSqlParser::Unary_op_exprContext *uctx = static_cast<TSqlParser::Unary_op_exprContext *>(rhsctx);
op = getFullText(uctx->op);
if (op.front() == '+') {
rhsctx = uctx->expression();
continue;
}
}
}
break;
}
}
return;
}
void exitUnary_op_expr(TSqlParser::Unary_op_exprContext *ctx) override
{
handleBitNotOperator(ctx);
}

};

Expand Down
11 changes: 11 additions & 0 deletions test/JDBC/expected/unary_plus_op_string-vu-cleanup.out
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
drop procedure p1_unary_plus_op_string
go

drop function f1_unary_plus_op_string
go

drop view v1_unary_plus_op_string
go

drop table t1_unary_plus_op_string
go
76 changes: 76 additions & 0 deletions test/JDBC/expected/unary_plus_op_string-vu-prepare.out
Original file line number Diff line number Diff line change
@@ -0,0 +1,76 @@
create table t1_unary_plus_op_string(i int, vc varchar(30))
go

create view v1_unary_plus_op_string as
select 'view '+++(+N'value1') +
+ /*comment*/
(
+ /*comment*/
+ -- comment +
(
+
'value2'))++(+(select 'value3')) as col1,
1 + -2 as col2,
1 ++ -2 as col3,
1 + ~2 as col4,
1 +++++ ~2 as col5,
1 ++++ (++++ -2) as col6,
1 ++++ (++++ ~2) as col7
go

create procedure p1_unary_plus_op_string
as
declare @v varchar(20) = ' test'
declare @i int = 1
declare @d datetime='2024-Jan-01 01:02:03'
declare @dc decimal(10,4)=1
select 'proc '+'line1'
select 'proc '++'line2'
select 'proc '++N'line3'
select 'proc '+++++'line4'
select 'proc '+++++"line5"
select 'proc '+
+ /*comment*/
(
+ /*comment*/
+ -- comment +
(
+
"line6"))+++@v
select 'proc ' ++(++(++(select N'line7')))++(+@v)
select 'proc ' ++ case when len('a'+++ 'b')=2 then 'true' else 'false' end
select 'proc line8' where 'x' in (+'x')
select 'proc line9' where 'x' like +'x'
set @v = 'x'
select 'proc line10' where 'x' like (+@v)
select 'proc line11' where 'x' like +(+(+@v))
select 'proc line 12 ' ++ vc from t1_unary_plus_op_string order by 1
select 'proc line 13 ' ++(+vc) from t1_unary_plus_op_string order by 1
select 'proc line 14 ' ++ [vc] from t1_unary_plus_op_string order by 1
select 1 + @i, 2 + (+@i), 3 ++++(++(+++@i))
select +@d, + (+@d), ++++(++(+++@d))
select 1 +@dc, 2 + (+@dc), 3 ++++(++(+++@dc))
select 1 + -2 as expr1
select 1 ++ -2 as expr2
select 1 + ~2 as expr3
select 1 +++++ ~2 as expr4
select 1 ++++ (++++ -2) as expr5
select 1 ++++ (++++ ~2) as expr6
EXECUTE('select ''execimm ''++''line1''')
EXECUTE('select ''execimm ''++N''line2''')
EXECUTE('select ''execimm ''++(+(select N''line3''))')
go

create function f1_unary_plus_op_string (@v varchar(10)) returns varchar(30)
as
begin
declare @s varchar(30)
set @s = 'func '+++(+N'value1') +
+
(
+ /*comment*/
+ -- comment +
(++(select "value2")))++(+@v)
return @s
end
go
Loading
Loading