[MM-34926] Short-circuit link processing if the regex is non-terminal #166
Codecov Report
@@            Coverage Diff            @@
##           master    #166     +/-   ##
=========================================
+ Coverage   41.40%   41.57%   +0.17%
=========================================
  Files           6        6
  Lines         669      671      +2
=========================================
+ Hits          277      279      +2
  Misses        375      375
  Partials       17       17
@srkgupta Do you have any other regexes that will reproduce this issue that I can add to the unit test?
server/autolink/autolink.go
@@ -108,6 +108,12 @@ func (l Autolink) Replace(message string) string {
    if submatch == nil {
        break
    }

    // The beginning of the submatch is equal to the end of the submatch here. The regex pattern is non-terminal.
Why is this true? Can you explain what is happening here? I don't quite understand.
I could also use an explanation of the `if` below.
Before the loop begins, the post's message is converted to a `[]byte` and stored in the local variable `in`. Another `[]byte` is created to store the output of the replace process.
In the loop, the `regexp` package's `FindSubmatchIndex` function is called with the input array, and the first match found is processed. `FindSubmatchIndex` returns a `[]int` of indexes. Each pair of entries in the array holds the start and end index of one submatch; the first pair covers the match as a whole. The end index is exclusive, so when the start index is 0 the end index equals the length of the match. The loop exits when `FindSubmatchIndex` returns nil, which only happens when no match is found.
For the autolink's algorithm, only the first match matters at any given time, because the algorithm processes the match, moves its cursor forward in the `in` array, and checks for more matches in the next iteration of the loop. Because of this, only the first two elements in `FindSubmatchIndex`'s return value are ever important to the autolink replace algorithm at any given time.
When the autolink algorithm is provided a regex pattern such as `.*`:
The first run through the loop causes the entire input string to match. The first element in `FindSubmatchIndex`'s return value is `0` and the second is the length of the message. Line 119 leaves the `in` array empty, since the whole string was matched and `submatch[1] == len(in)` in this case. Note that no condition is checked at the end of the loop to see whether we should leave it. The only exit condition is "did `FindSubmatchIndex` return nil?" at the beginning of the loop.
When we run through the loop again, the argument given to `FindSubmatchIndex` is an empty array, and it returns the value `[0, 0]`, meaning "the match starts at index zero and ends at index zero": it matched an empty string. Since `submatch` is not nil here, the loop moves on, but everything below it in the loop is a no-op because `in` is empty. The loop continues and does the same calculation over and over, finding the same empty match from the `.*` pattern indefinitely.
The solution I've made here is "if a submatch is found but has no length, we have nothing to process, so we should exit the loop." A better solution may be to check if `len(in) == 0` at the end of the loop.
Thanks for the explanation! I think you are right that checking the length would be better. Maybe more intuitive at the beginning of the loop?
Hi @mickmister
Tested the changes, and the reported issue no longer reproduces. Tested with multiple other regexes and found no other issues. Thanks @mickmister. LGTM
🎉
Summary
When the autolink's pattern contains `.*`, the process hangs indefinitely because it keeps processing an empty, non-nil match. #166 (comment)
Ticket Link
Fixes https://mattermost.atlassian.net/browse/MM-34926