-
-
Notifications
You must be signed in to change notification settings - Fork 234
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: more forgiving base64 transformation #940
Conversation
if corrErr, ok := err.(base64.CorruptInputError); ok { | ||
illegalCharPos := int(corrErr) | ||
// Forgiving call to DecodeString, decoding is performed up to the illegal characther | ||
// If an error occurs, dec will still contain the decoded string up to the error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the error being discussed here about line 19? I think we should comment about line 26. Something like we can be sure we won't have any decode error here since we truncate to the first error index
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to rewrite this comment. The second DecodeString call might still lead to an error. As far as I understood it may happen if the trailing character is converted into a not printable byte.
For example, starting from PHNjcmlwd.D5hbGVydCgxKTwvc2NyaXB0Pg==
The first error tells us that the .
is illegal. DecodeString returns an empty conversion at this point.
The truncation is performed, and we run DecodeString against PHNjcmlwd
.
Also, this time an error pops up about the latest character d
, but now the output is not empty, and is equal to the conversion that would happen to convert PHNjcmlw
. This is why I called this second call a "forgiving" one, the idea was not to check anymore the error, but rather just return the best-effort conversion.
▶ echo "PHNjcmlwd" | base64 --decode
<scrip
▶ echo "PHNjcmlw" | base64 --decode
<scrip
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #940 +/- ##
==========================================
- Coverage 82.65% 82.64% -0.02%
==========================================
Files 162 162
Lines 9028 9040 +12
==========================================
+ Hits 7462 7471 +9
- Misses 1317 1319 +2
- Partials 249 250 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks this comment is much clearer and indeed the truncation issue was something I wondered about.
I looked at the docs and this behavior doesn't seem to be documented
https://pkg.go.dev/encoding/base64#example-Encoding.DecodeString
So it's not so safe. To clarify one point, the main reason to move from bespoke decode logic to stdlib was I thought that is what is expected, based on CRS docs this seemed to be the strict version of the ext
operator. But it seems we've found that's not the case, so we don't need to care so much about stdlib - probably bringing back manual decoding is better if otherwise we have to rely on the current Go implementation details
Here is the custom implementation alternative: #944. I feel less confident about the implementation, but it would not be about relying on the current Go implementation details |
Closing in favor of #944 |
RawStdEncoding
is now used instead ofStdEncoding
, it permits to decode not padded stringsBeing the transformation meant to deobfuscate malicious payloads, returning even partial base64 converted strings should increase the chances of matching malicious data.
All the added test cases are aligned with the modsec v2 behavior.
These changes would still allow to have a
base64DecodeExt
version that could perform a cleanup of all the illegal characters and then try to decode the remaining stringTentatively closes #926, it should also fix
934131-5
and934131-7
CRS 4.0.0-rc2 failing test (#899)