fix: more forgiving base64 transformation [custom implementation] #944

M4tteoP · 2023-12-18T22:52:18Z

Sibling of #940, providing a custom implementation of base64 decoding. The same tests are executed.

The implementation is a refactored version of #758 that tolerates missing padding and decodes up to the first illegal character.
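As a rough illustration of the behavior described above (missing padding is tolerated, and decoding stops at the first illegal character), here is a minimal sketch. It is not the code from this PR: `forgivingDecode` and `decodeSextet` are hypothetical names, and the alphabet is checked with a switch rather than the lookup table visible in the diff.

```go
package main

import "strings"

// decodeSextet maps a standard base64 character to its 6-bit value.
// ok is false for '=' and for any byte outside the standard alphabet.
// (Hypothetical helper, not part of the PR.)
func decodeSextet(c byte) (v byte, ok bool) {
	switch {
	case c >= 'A' && c <= 'Z':
		return c - 'A', true
	case c >= 'a' && c <= 'z':
		return c - 'a' + 26, true
	case c >= '0' && c <= '9':
		return c - '0' + 52, true
	case c == '+':
		return 62, true
	case c == '/':
		return 63, true
	}
	return 0, false
}

// forgivingDecode decodes src until padding or an illegal character is
// met, then flushes whatever complete bytes the leftover sextets hold.
func forgivingDecode(src string) string {
	var dst strings.Builder
	dst.Grow(len(src)/4*3 + 2)

	var acc uint32 // accumulated sextets of the current group
	n := 0         // number of sextets currently in acc
	for i := 0; i < len(src); i++ {
		v, ok := decodeSextet(src[i])
		if !ok {
			break // padding or illegal character: stop decoding
		}
		acc = acc<<6 | uint32(v)
		n++
		if n == 4 {
			dst.WriteByte(byte(acc >> 16))
			dst.WriteByte(byte(acc >> 8))
			dst.WriteByte(byte(acc))
			acc, n = 0, 0
		}
	}

	// Leftover sextets: 2 hold one full byte, 3 hold two full bytes.
	// A single leftover sextet (6 bits) cannot form a byte and is dropped.
	switch n {
	case 2:
		dst.WriteByte(byte(acc >> 4))
	case 3:
		dst.WriteByte(byte(acc >> 10))
		dst.WriteByte(byte(acc >> 2))
	}
	return dst.String()
}
```

Under these assumptions, `forgivingDecode("VGVzdENhc2U")` (padding missing) yields `TestCase`, and `forgivingDecode("PFRFU1Q-")` stops at the dash and yields `<TEST`.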

We have to decide between:

- Relying (in a hacky way) on the std library transformation (we are definitely more confident about that implementation).
- Relying on and maintaining our custom implementation (faster, and not exposed to unexpected changes in undocumented std library behavior, but we are less confident about the implementation).
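For contrast, the sketch below shows one possible way to lean on `encoding/base64` for the same forgiving behavior. This is an assumption-laden illustration, not the approach taken in #940: instead of inspecting decoder errors, it trims the input to its longest valid prefix first and then calls `base64.RawStdEncoding.DecodeString`. The names `stdForgivingDecode` and `b64Alphabet` are hypothetical. Unlike the std decoder, this sketch also stops at newline characters.

```go
package main

import (
	"encoding/base64"
	"strings"
)

// b64Alphabet is the standard base64 alphabet (hypothetical constant name).
const b64Alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

// stdForgivingDecode trims src to its longest prefix of alphabet
// characters and decodes that prefix with the unpadded std encoding.
func stdForgivingDecode(src string) (string, error) {
	// Cut at the first character outside the alphabet; this also drops '=' padding.
	end := strings.IndexFunc(src, func(r rune) bool {
		return !strings.ContainsRune(b64Alphabet, r)
	})
	if end >= 0 {
		src = src[:end]
	}
	// A single leftover character carries only 6 bits and cannot form a
	// byte; the std decoder rejects such input, so drop it beforehand.
	if len(src)%4 == 1 {
		src = src[:len(src)-1]
	}
	out, err := base64.RawStdEncoding.DecodeString(src)
	if err != nil {
		return "", err
	}
	return string(out), nil
}
```

The trade-off is an extra pass over the input before decoding, which is the kind of cost the benchmarks below are meant to weigh.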

Benchmarks:

PR: custom implementation (this PR)

goos: darwin
goarch: arm64
pkg: github.com/corazawaf/coraza/v3/internal/transformations
BenchmarkB64Decode/VGVzdENhc2U=-10         	27859436	        36.83 ns/op	      16 B/op	       1 allocs/op
BenchmarkB64Decode/VGVzdABDYXNl-10         	30617679	        38.87 ns/op	      16 B/op	       1 allocs/op
BenchmarkB64Decode/VGVzdENhc2U-10          	32812024	        36.56 ns/op	      16 B/op	       1 allocs/op
BenchmarkB64Decode/PA==-10                 	69833588	        16.98 ns/op	       4 B/op	       1 allocs/op
BenchmarkB64Decode/PFRFU1Q+-10             	42157344	        28.29 ns/op	       8 B/op	       1 allocs/op
BenchmarkB64Decode/PHNjcmlwd-10            	36910915	        33.60 ns/op	      16 B/op	       1 allocs/op
BenchmarkB64Decode/PFR_FU1Q+-10            	57075021	        22.50 ns/op	      16 B/op	       1 allocs/op
BenchmarkB64Decode/P.HNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==-10         	58432546	        19.89 ns/op	      48 B/op	       1 allocs/op
BenchmarkB64Decode/PHNjcmlwd.D5hbGVydCgxKTwvc2NyaXB0Pg==-10         	33036349	        35.46 ns/op	      48 B/op	       1 allocs/op
BenchmarkB64Decode/PHNjcmlwdD.5hbGVydCgxKTwvc2NyaXB0Pg==-10         	31625136	        37.47 ns/op	      48 B/op	       1 allocs/op
BenchmarkB64Decode/PFRFU1Q--10                                      	43652700	        27.65 ns/op	       8 B/op	       1 allocs/op
PASS
ok  	github.com/corazawaf/coraza/v3/internal/transformations	13.494s

PR: Std library (#940)

goos: darwin
goarch: arm64
pkg: github.com/corazawaf/coraza/v3/internal/transformations
BenchmarkB64Decode/VGVzdENhc2U=-10         	48350179	        26.08 ns/op	       8 B/op	       1 allocs/op
BenchmarkB64Decode/VGVzdABDYXNl-10         	41106975	        27.79 ns/op	      16 B/op	       1 allocs/op
BenchmarkB64Decode/VGVzdENhc2U-10          	49100253	        24.27 ns/op	       8 B/op	       1 allocs/op
BenchmarkB64Decode/PA==-10                 	57690343	        20.77 ns/op	       1 B/op	       1 allocs/op
BenchmarkB64Decode/PFRFU1Q+-10             	46958845	        24.56 ns/op	       8 B/op	       1 allocs/op
BenchmarkB64Decode/PHNjcmlwd-10            	23393598	        51.72 ns/op	      16 B/op	       2 allocs/op
BenchmarkB64Decode/PFR_FU1Q+-10            	27806850	        43.29 ns/op	       8 B/op	       2 allocs/op
BenchmarkB64Decode/P.HNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==-10         	20983162	        55.50 ns/op	      80 B/op	       2 allocs/op
BenchmarkB64Decode/PHNjcmlwd.D5hbGVydCgxKTwvc2NyaXB0Pg==-10         	15266341	        78.58 ns/op	      88 B/op	       3 allocs/op
BenchmarkB64Decode/PHNjcmlwdD.5hbGVydCgxKTwvc2NyaXB0Pg==-10         	16138278	        74.43 ns/op	      88 B/op	       3 allocs/op
BenchmarkB64Decode/PFRFU1Q--10                                      	24491835	        48.89 ns/op	      16 B/op	       2 allocs/op
PASS
ok  	github.com/corazawaf/coraza/v3/internal/transformations	14.896s

Tentatively closes #926. It should also fix the failing CRS 4.0.0-rc2 tests 934131-5 and 934131-7 (#899).

codecov · 2023-12-18T22:55:08Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (f1cfd13) 82.65% compared to head (8753b80) 82.71%.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #944      +/-   ##
==========================================
+ Coverage   82.65%   82.71%   +0.06%     
==========================================
  Files         162      162              
  Lines        9028     9062      +34     
==========================================
+ Hits         7462     7496      +34     
  Misses       1317     1317              
  Partials      249      249

Flag	Coverage Δ
default	`77.82% <95.00%> (+0.06%)`	⬆️
examples	`26.39% <0.00%> (-0.11%)`	⬇️
ftw	`47.33% <100.00%> (+0.21%)`	⬆️
ftw-multiphase	`49.51% <100.00%> (+0.20%)`	⬆️
tinygo	`75.38% <95.00%> (+0.07%)`	⬆️


anuraaga · 2023-12-19T00:20:03Z

internal/transformations/base64decode.go

-	return stringsutil.WrapUnsafe(dec), true, nil
+
+	// Handle any remaining characters
+	if n == 2 {


Does it still make sense to execute these when we break above on an illegal character?

M4tteoP · 2023-12-19

Returning dst.String() early when we break above on an illegal character (that is, not when padding is reached) makes some tests fail:

--- FAIL: TestBase64Decode (0.00s)
    --- FAIL: TestBase64Decode/decoded_up_to_the_space_(invalid_character) (0.00s)
        /Users/matteopace/Repo/coraza/internal/transformations/base64decode_test.go:83: Expected "<T", but got ""
    --- FAIL: TestBase64Decode/decoded_up_to_the_dot_(invalid_character)#02 (0.00s)
        /Users/matteopace/Repo/coraza/internal/transformations/base64decode_test.go:83: Expected "<script", but got "<scrip"
    --- FAIL: TestBase64Decode/decoded_up_to_the_dash_(invalid_character_for_base64,_only_valid_for_Base64url) (0.00s)
        /Users/matteopace/Repo/coraza/internal/transformations/base64decode_test.go:83: Expected "<TEST", but got "<TE"

The trailing characters still need to be flushed in that case, just as if the end of the string had been reached.

anuraaga · 2023-12-19T00:21:18Z

internal/transformations/base64decode_test.go

@@ -31,7 +100,7 @@ func BenchmarkB64Decode(b *testing.B) {

 func FuzzB64Decode(f *testing.F) {
 	for _, tc := range b64DecodeTests {
-		f.Add(tc)
+		f.Add(tc.input)


Interestingly the fuzz test was still set up for this forgiving version, that's nice

anuraaga · 2023-12-19T00:23:43Z

internal/transformations/base64decode.go

+
+	for ; srcc < slen; srcc++ {
+		// If invalid character or padding reached, we stop decoding
+		if src[srcc] == '=' || src[srcc] == ' ' || src[srcc] > 127 || base64DecMap[src[srcc]] == 127 {


Extract variable for src[srcc]

Also, although it means having two conditionals that can break, I think it's worth extracting a variable for base64DecMap[src[srcc]] rather than doing the lookup twice
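A minimal sketch of what the suggested extraction could look like, assuming a 128-entry table in which 127 marks an invalid character (as the condition in the diff implies). `decMap`, `buildDecMap`, `invalid`, and `decodableLen` are hypothetical stand-ins, not names from the PR; the function only reports how many leading bytes the loop would consume, so the stop condition can be seen in isolation.

```go
package main

// invalid marks bytes outside the standard base64 alphabet,
// mirroring the 127 sentinel used in the diff above.
const invalid = 127

// decMap is a hypothetical stand-in for base64DecMap.
var decMap = buildDecMap()

// buildDecMap fills a 128-entry table with the sextet value of each
// alphabet character and the invalid sentinel everywhere else.
func buildDecMap() [128]byte {
	var m [128]byte
	for i := range m {
		m[i] = invalid
	}
	const alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
	for i := 0; i < len(alphabet); i++ {
		m[alphabet[i]] = byte(i)
	}
	return m
}

// decodableLen returns how many leading bytes of src would be consumed
// before the loop stops on padding or an invalid character.
func decodableLen(src string) int {
	for i := 0; i < len(src); i++ {
		c := src[i] // read the byte once
		if c == '=' || c == ' ' || c > 127 {
			return i
		}
		v := decMap[c] // single table lookup, reused below
		if v == invalid {
			return i
		}
		// A real decoder would accumulate v here instead of discarding it.
	}
	return len(src)
}
```

The range check on `c` has to come before the table lookup, since indexing a 128-entry table with a byte above 127 would panic; that ordering is why the single condition splits into two.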

anuraaga

I think this is better than the stdlib option

M4tteoP · 2023-12-20T10:18:50Z

Thanks @anuraaga for the review and all the guidance. I will wait a bit for feedback from others before merging this one and closing the other.

fzipi

IMHO, this is ready to go.

jcchavezs · 2023-12-27T13:36:27Z

Please add a reference or a link to the Go tests we borrowed and a brief explanation of why we do this instead of using the std library, and I think we are ready to go. Adding a doc.go rather than the README may be more idiomatic.


M4tteoP added 3 commits December 13, 2023 16:02

more forgiving base64 transformation

cfa39f1

better comment about second DecodeString call

fefad1e

custom implementation

718e225

M4tteoP requested a review from a team as a code owner December 18, 2023 22:52

Merge branch 'main' into base64_flexible_custom

f7907e7

M4tteoP mentioned this pull request Dec 18, 2023

fix: more forgiving base64 transformation #940

Closed

comments

d3dac7a

anuraaga reviewed Dec 19, 2023

address review, minor refactor

552c207

anuraaga approved these changes Dec 19, 2023

moves new line characters check

8221a87

Extends tests with golang base64 tests

d600c6d

fzipi approved these changes Dec 27, 2023

M4tteoP added 2 commits January 3, 2024 13:21

updates link with the permalink to the exact borrowed payloads

77ce907

adds note about using custom implementation

8753b80

jcchavezs approved these changes Jan 3, 2024

M4tteoP merged commit b887a58 into corazawaf:main Jan 3, 2024
10 checks passed
