Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support 'NO_BACKSLASH_ESCAPES' sql_mode #381

Closed
morningman opened this issue Dec 4, 2018 · 3 comments
Closed

Support 'NO_BACKSLASH_ESCAPES' sql_mode #381

morningman opened this issue Dec 4, 2018 · 3 comments

Comments

@morningman
Copy link
Contributor

Description

Doris supports creating external MySQL table, and can query mysql table using SQL. When querying MySQL table, Doris simply assemble the predicates(filter) into a SQL and pass it to the MySQL server.

On the other hand, Doris, as a SQL query engine, is using backslash to escape characters. So the lexical parser will consume the backslash in parsing phase. But when we are querying a MySQL table, This may take the SQL injection risk. For example:

SELECT * FROM mysql_table WHERE k1 = '1\') UNION ALL SELECT 1,2,3#'

The original statement expects to query the result where k1 equals to the string 1') UNION ALL SELECT 1,2,3#, where the ' is escaped by a backslash. But because this is a MySQL table, so we simply pass the value 1') UNION ALL SELECT 1,2,3# to the MySQL server, where the ' is NOT escaped by backslash, as in MySQL server view. So, the final query statement becomes:

SELECT * FROM mysql_table WHERE (k1 = '1') UNION ALL SELECT 1,2,3#')

which is not what we expected. (We add the () to try protecting the integrity of the predicates. And# is treated as comments, so characters after # are ignored. )

Solution

The main reason why this happens is that Doris can not know whether user is querying a Doris table or an external MySQL table when in lexical parsing phase. So Doris can not determine whether to consume the backslash or not.

A simple solution is to escape the backlash itself. For example:

SELECT * FROM mysql_table WHERE k1 = '1\\\\') UNION ALL SELECT 1,2,3#'

But this is inconvenient for user.

So we reference the MySQL variable sql_mode, which has a mode called 'NO_BACKSLASH_ESCAPES', which means Disable the use of the backslash character () as an escape character within strings.

Before a user querying a MySQL table, it can first execute:

SET [GLOBALE] VARIABLE sql_mode=NO_BACKSLASH_ESCAPES;

and then simple query MySQL table:

SELECT * FROM mysql_table WHERE k1 = '1\') UNION ALL SELECT 1,2,3#'

without any further pre-processing of the SQL. And the Doris will pass the final statement to MySQL server as:

SELECT * FROM mysql_table WHERE (k1 = '1\') UNION ALL SELECT 1,2,3#')

which is correct.

@morningman
Copy link
Contributor Author

The 'NO_BACKSLASH_ESCAPES' sql_mode we supported before does not have same meaning as MySQL's sql_mode with same name.

In MySQL, 'NO_BACKSLASH_ESCAPES' means to ignore the backslash at the very beginning of lexical parsing phase. But what we implemented in Doris, is to ignore backslash just in accepted quoted string.

In MySQL, if sql_mode is set to 'NO_BACKSLASH_ESCAPES', the following stmt will not be accepted:

select * from tbl where k1 'ab\'c';

But in Doris, this stmt is accepted with sql_mode set to 'NO_BACKSLASH_ESCAPES'.
Because in Doris, jflex is used as lexical parser, which does not support ignoring backslash when parsing the origin string. So first, jflex will accept the string 'ab\'c' as ab'c. And than it makes a further analysis to this string, determine whether to keep the backslash or not. (ab'c or ab\'c).

@morningman
Copy link
Contributor Author

Our original goal it to use this feature to avoid SQL injection risk. But first, user must know that it is querying a MySQL table before setting this variable. And it also add complexity to the implementation of many other feature, such as Creating View, which we will re-parsing the view stmt in replay process, so we have to save the sql_mode's value in Creating View operation's edit log. And also, we need to pass this sql_mode when we are forwarding the stmt to Master FE.

@morningman
Copy link
Contributor Author

So we decides not implementing this feature right now. User can avoid SQL injection risk by adding more backslash to escape, for now.

And we may redesign this feature later, after thinking it twice.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants