Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Improvement] Use of regex on error messages can be fragile in DorisExceptionConverter.java #3027

Open
justinmclean opened this issue Apr 19, 2024 · 9 comments
Assignees
Labels
good first issue Good for newcomers improvement Improvements on everything

Comments

@justinmclean
Copy link
Member

What would you like to be improved?

DorisExceptionConverter.java uses a DATABASE_ALREADY_EXISTS_PATTERN_STRING regex to check error messages for databases that already exist. There are several issues with this and regex can be fragile expensive or even a possible DOS vector

How should we improve?

If possible improve code to not use regexp for this error condition.

@justinmclean justinmclean added good first issue Good for newcomers improvement Improvements on everything labels Apr 19, 2024
@jerryshao jerryshao added this to the Gravitino June Release milestone Apr 24, 2024
@vip955529
Copy link

Hi, I want to contribute to this project

@justinmclean
Copy link
Member Author

Hi @vip955529, do you want to work on this issue?

@ashwin1596
Copy link
Contributor

Hi, I want to work on this issue.

@justinmclean
Copy link
Member Author

@ashwin1596 Do you still want to work on this?

@ashwin1596
Copy link
Contributor

I am a bit confused about the issue. We are using the error code for "Database already exists" defined as:

@VisibleForTesting static final int CODE_DATABASE_EXISTS = 1007

We only use DATABASE_ALREADY_EXISTS_PATTERN_STRING when the SQL exception has the error code CODE_OTHER = 1105. In this case, we compare the message string to the regex to determine the error code.

Do you have any suggestions?

@justinmclean
Copy link
Member Author

The issue is that the regex used like this can cause issues. Can you find a simpler regex or another way of doing it?

@ashwin1596
Copy link
Contributor

Would it be possible to store the error messages in a configuration file and use those for comparisons, instead of relying on regular expressions?

This approach leverages a configuration file for error messages, enabling a more organized and maintainable way to handle and update error messages dynamically.

@justinmclean
Copy link
Member Author

That sounds a little too heavyweight to me.

@ashwin1596
Copy link
Contributor

In that case, may I suggest using exact string matching for errors within the same file as an alternative to using regex? This approach is less resource-intensive and can provide a more straightforward solution for error handling, ensuring both efficiency and reliability.

The only consideration is that someone will need to maintain and update the error messages in case there are any changes to the error messages thrown by the database.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers improvement Improvements on everything
Projects
None yet
Development

No branches or pull requests

4 participants