fix: more forgiving base64 transformation #940

M4tteoP · 2023-12-13T15:18:19Z

RawStdEncoding is now used instead of StdEncoding, which allows decoding unpadded strings.
A failover mechanism has been added: if the first decoding fails, a second best-effort decoding is performed on the input up to the illegal character that made the first decoding fail.
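
A minimal sketch of the two-step flow described above, assuming only the standard `encoding/base64` package. The helper name `decodeForgiving` is hypothetical and this is not the code from the PR. Whether the second call returns partial output alongside its error is an observed behavior of the current Go implementation, not a documented guarantee.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
)

// decodeForgiving is a hypothetical helper illustrating the failover idea.
func decodeForgiving(s string) string {
	dec, err := base64.RawStdEncoding.DecodeString(s)
	if err == nil {
		return string(dec)
	}
	var corrErr base64.CorruptInputError
	if errors.As(err, &corrErr) {
		// Best-effort retry on the prefix before the illegal byte.
		// The error is deliberately ignored: any partial output returned
		// with it depends on undocumented encoding/base64 behavior.
		dec, _ = base64.RawStdEncoding.DecodeString(s[:int(corrErr)])
		return string(dec)
	}
	// Unknown error type: hand back the input untouched.
	return s
}

func main() {
	fmt.Printf("%q\n", decodeForgiving("PHNjcmlwdD4"))
	fmt.Printf("%q\n", decodeForgiving("PHNjcmlwd.D5hbGVydCgxKTwvc2NyaXB0Pg=="))
}
```

Note that with RawStdEncoding a trailing `=` is itself reported as an illegal character, so padded input also goes through the retry path in this sketch.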

Since the transformation is meant to deobfuscate malicious payloads, returning even partially decoded base64 strings should increase the chances of matching malicious data.

All the added test cases are aligned with the modsec v2 behavior.

These changes would still leave room for a base64DecodeExt version that first strips all illegal characters and then tries to decode the remaining string.
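
As a rough illustration of that idea, the sketch below drops every character outside the standard base64 alphabet (including `=` padding) and decodes what is left. The names `stripNonBase64` and `decodeExt` are hypothetical; this is not an existing Coraza API, and it assumes the standard alphabet only.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// stripNonBase64 removes every rune that is not part of the standard
// base64 alphabet. Padding is removed too, so the result suits RawStdEncoding.
func stripNonBase64(s string) string {
	return strings.Map(func(r rune) rune {
		switch {
		case r >= 'A' && r <= 'Z', r >= 'a' && r <= 'z',
			r >= '0' && r <= '9', r == '+', r == '/':
			return r
		}
		return -1 // a negative value tells strings.Map to drop the rune
	}, s)
}

// decodeExt cleans the input and then decodes it. An error is still possible,
// for example when the cleaned string ends with a single dangling character.
func decodeExt(s string) (string, error) {
	dec, err := base64.RawStdEncoding.DecodeString(stripNonBase64(s))
	return string(dec), err
}

func main() {
	out, err := decodeExt("PHNjcmlwd.D5hbGVydCgxKTwvc2NyaXB0Pg==")
	fmt.Printf("%q %v\n", out, err)
}
```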

Tentatively closes #926. It should also fix the failing CRS 4.0.0-rc2 tests 934131-5 and 934131-7 (#899).

anuraaga · 2023-12-13T23:11:07Z

internal/transformations/base64decode.go

+		if corrErr, ok := err.(base64.CorruptInputError); ok {
+			illegalCharPos := int(corrErr)
+		// Forgiving call to DecodeString, decoding is performed up to the illegal character
+			// If an error occurs, dec will still contain the decoded string up to the error


Is the error discussed here the one from line 19? I think we should comment on line 26 instead, something like: we can be sure we won't get any decode error here, since we truncate at the first error index.

M4tteoP · 2023-12-15

I tried to rewrite this comment. The second DecodeString call might still return an error. As far as I understand, this may happen if the trailing character converts into a non-printable byte.
For example, starting from `PHNjcmlwd.D5hbGVydCgxKTwvc2NyaXB0Pg==`:
The first error tells us that the `.` is illegal. DecodeString returns an empty conversion at this point.
The truncation is performed, and we run DecodeString against `PHNjcmlwd`.
This time an error pops up as well, about the last character `d`, but now the output is not empty: it equals the conversion of `PHNjcmlw`. This is why I called this second call a "forgiving" one: the idea was to stop checking the error and just return the best-effort conversion.

▶ echo "PHNjcmlwd" | base64 --decode
<scrip
▶ echo "PHNjcmlw" | base64 --decode
<scrip
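
The walkthrough above can be reproduced with a small Go program. This is a sketch for experimentation only: `inspect` is a hypothetical helper, and the partial output printed next to an error reflects whatever the installed Go version happens to return, which is exactly the undocumented part.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
)

// inspect decodes s with RawStdEncoding and reports whatever bytes were
// returned, plus the offset carried by CorruptInputError (-1 if no such error).
func inspect(s string) (out string, errPos int) {
	dec, err := base64.RawStdEncoding.DecodeString(s)
	errPos = -1
	var corrErr base64.CorruptInputError
	if errors.As(err, &corrErr) {
		errPos = int(corrErr)
	}
	return string(dec), errPos
}

func main() {
	input := "PHNjcmlwd.D5hbGVydCgxKTwvc2NyaXB0Pg=="
	out, pos := inspect(input)
	fmt.Printf("first call:  out=%q errPos=%d\n", out, pos)
	if pos >= 0 {
		// Truncate at the illegal character and decode again.
		out, pos = inspect(input[:pos])
		fmt.Printf("second call: out=%q errPos=%d\n", out, pos)
	}
}
```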

codecov · 2023-12-15T10:03:36Z

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (f1cfd13) 82.65% compared to head (fc69b59) 82.64%.

| Files | Patch % | Lines |
|---|---|---|
| internal/transformations/base64decode.go | 81.25% | 2 Missing and 1 partial ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #940      +/-   ##
==========================================
- Coverage   82.65%   82.64%   -0.02%     
==========================================
  Files         162      162              
  Lines        9028     9040      +12     
==========================================
+ Hits         7462     7471       +9     
- Misses       1317     1319       +2     
- Partials      249      250       +1

| Flag | Coverage Δ | |
|---|---|---|
| default | `77.76% <81.25%> (-0.01%)` | ⬇️ |
| examples | `26.45% <0.00%> (-0.04%)` | ⬇️ |
| ftw | `47.16% <81.25%> (+0.03%)` | ⬆️ |
| ftw-multiphase | `49.35% <81.25%> (+0.03%)` | ⬆️ |
| tinygo | `75.31% <81.25%> (-0.01%)` | ⬇️ |

anuraaga · 2023-12-15

Thanks, this comment is much clearer, and indeed the truncation issue was something I wondered about.

I looked at the docs and this behavior doesn't seem to be documented:

https://pkg.go.dev/encoding/base64#example-Encoding.DecodeString

So it's not so safe to rely on. To clarify one point: the main reason for moving from bespoke decode logic to the stdlib was that I thought that was what is expected; based on the CRS docs, this seemed to be the strict version of the ext operator. It seems we've found that's not the case, so we don't need to care so much about the stdlib. Bringing back manual decoding is probably better if we would otherwise have to rely on current Go implementation details.
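
To make the "manual decoding" option concrete, here is a minimal hand-rolled decoder that stops at the first byte outside the alphabet and returns whatever was decoded so far, without depending on `encoding/base64` internals. It is only a sketch under the assumption of the standard alphabet; `manualDecode` is a hypothetical name and this is not the implementation from #944.

```go
package main

import "fmt"

const alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

// decodeMap maps an input byte to its 6-bit value, or -1 if the byte is
// not part of the alphabet (this includes '=' padding).
var decodeMap = func() [256]int8 {
	var m [256]int8
	for i := range m {
		m[i] = -1
	}
	for i := 0; i < len(alphabet); i++ {
		m[alphabet[i]] = int8(i)
	}
	return m
}()

// manualDecode decodes s until the first byte outside the alphabet and
// returns the bytes produced up to that point.
func manualDecode(s string) string {
	out := make([]byte, 0, len(s)*6/8)
	var acc uint32 // bit accumulator
	bits := 0      // number of pending bits in acc
	for i := 0; i < len(s); i++ {
		v := decodeMap[s[i]]
		if v < 0 {
			break // illegal character: stop and keep the partial result
		}
		acc = acc<<6 | uint32(v)
		bits += 6
		if bits >= 8 {
			bits -= 8
			out = append(out, byte(acc>>uint(bits)))
			acc &= (1 << uint(bits)) - 1 // keep only the leftover bits
		}
	}
	return string(out)
}

func main() {
	fmt.Printf("%q\n", manualDecode("PHNjcmlwd.D5hbGVydCgxKTwvc2NyaXB0Pg=="))
	fmt.Printf("%q\n", manualDecode("PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg=="))
}
```

Leftover bits from a dangling final character are simply discarded, which matches the partial result `<scrip` reported for the truncated example earlier in the thread.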

M4tteoP · 2023-12-18T22:55:49Z

Here is the custom implementation alternative: #944. I feel less confident about the implementation, but it does not rely on current Go implementation details.

M4tteoP · 2024-01-03T15:29:39Z

Closing in favor of #944

more forgiving base64 transformation

cfa39f1

M4tteoP requested a review from a team as a code owner December 13, 2023 15:18

anuraaga reviewed Dec 13, 2023

M4tteoP added 2 commits December 15, 2023 10:59

better comment about second DecodeString call

fefad1e

Merge branch 'main' into base64_flexible

ee3de0c

anuraaga reviewed Dec 15, 2023

M4tteoP mentioned this pull request Dec 18, 2023

fix: more forgiving base64 transformation [custom implementation] #944

Merged

Merge branch 'main' into base64_flexible

fc69b59

M4tteoP closed this Jan 3, 2024
