Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: more forgiving base64 transformation #940

Closed
wants to merge 4 commits into from

Conversation

M4tteoP
Copy link
Member

@M4tteoP M4tteoP commented Dec 13, 2023

  • RawStdEncoding is now used instead of StdEncoding, it permits to decode not padded strings
  • A failover mechanism has been added: if the first decoding fails, a second best-effort decoding is performed up to the illegal character that made the previous decoding fail.

Being the transformation meant to deobfuscate malicious payloads, returning even partial base64 converted strings should increase the chances of matching malicious data.

All the added test cases are aligned with the modsec v2 behavior.

These changes would still allow to have a base64DecodeExt version that could perform a cleanup of all the illegal characters and then try to decode the remaining string

Tentatively closes #926, it should also fix 934131-5 and 934131-7 CRS 4.0.0-rc2 failing test (#899)

@M4tteoP M4tteoP requested a review from a team as a code owner December 13, 2023 15:18
if corrErr, ok := err.(base64.CorruptInputError); ok {
illegalCharPos := int(corrErr)
// Forgiving call to DecodeString, decoding is performed up to the illegal characther
// If an error occurs, dec will still contain the decoded string up to the error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the error being discussed here about line 19? I think we should comment about line 26. Something like we can be sure we won't have any decode error here since we truncate to the first error index

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to rewrite this comment. The second DecodeString call might still lead to an error. As far as I understood it may happen if the trailing character is converted into a not printable byte.
For example, starting from PHNjcmlwd.D5hbGVydCgxKTwvc2NyaXB0Pg==
The first error tells us that the . is illegal. DecodeString returns an empty conversion at this point.
The truncation is performed, and we run DecodeString against PHNjcmlwd.
Also, this time an error pops up about the latest character d, but now the output is not empty, and is equal to the conversion that would happen to convert PHNjcmlw. This is why I called this second call a "forgiving" one, the idea was not to check anymore the error, but rather just return the best-effort conversion.

▶ echo "PHNjcmlwd" | base64 --decode
<scrip
▶ echo "PHNjcmlw" | base64 --decode
<scrip

Copy link

codecov bot commented Dec 15, 2023

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (f1cfd13) 82.65% compared to head (fc69b59) 82.64%.

Files Patch % Lines
internal/transformations/base64decode.go 81.25% 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #940      +/-   ##
==========================================
- Coverage   82.65%   82.64%   -0.02%     
==========================================
  Files         162      162              
  Lines        9028     9040      +12     
==========================================
+ Hits         7462     7471       +9     
- Misses       1317     1319       +2     
- Partials      249      250       +1     
Flag Coverage Δ
default 77.76% <81.25%> (-0.01%) ⬇️
examples 26.45% <0.00%> (-0.04%) ⬇️
ftw 47.16% <81.25%> (+0.03%) ⬆️
ftw-multiphase 49.35% <81.25%> (+0.03%) ⬆️
tinygo 75.31% <81.25%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@anuraaga anuraaga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks this comment is much clearer and indeed the truncation issue was something I wondered about.

I looked at the docs and this behavior doesn't seem to be documented

https://pkg.go.dev/encoding/base64#example-Encoding.DecodeString

So it's not so safe. To clarify one point, the main reason to move from bespoke decode logic to stdlib was I thought that is what is expected, based on CRS docs this seemed to be the strict version of the ext operator. But it seems we've found that's not the case, so we don't need to care so much about stdlib - probably bringing back manual decoding is better if otherwise we have to rely on the current Go implementation details

@M4tteoP
Copy link
Member Author

M4tteoP commented Dec 18, 2023

Here is the custom implementation alternative: #944. I feel less confident about the implementation, but it would not be about relying on the current Go implementation details

@M4tteoP
Copy link
Member Author

M4tteoP commented Jan 3, 2024

Closing in favor of #944

@M4tteoP M4tteoP closed this Jan 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

t:base64decode is too strict (padding required, no partial decoding)
2 participants