Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: more forgiving base64 transformation #940

Closed
wants to merge 4 commits into from
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 15 additions & 3 deletions internal/transformations/base64decode.go
Original file line number Diff line number Diff line change
Expand Up @@ -5,16 +5,28 @@ package transformations

import (
"encoding/base64"
"strings"

stringsutil "github.com/corazawaf/coraza/v3/internal/strings"
)

// base64decode decodes a Base64-encoded string.
func base64decode(data string) (string, bool, error) {
dec, err := base64.StdEncoding.DecodeString(data)
// RawStdEncoding.DecodeString accepts and requires an unpadded string as input
// https://stackoverflow.com/questions/31971614/base64-encode-decode-without-padding-on-golang-appengine
dataNoPadding := strings.TrimRight(data, "=")
dec, err := base64.RawStdEncoding.DecodeString(dataNoPadding)
if err != nil {
// Forgiving implementation, which ignores invalid characters
return data, false, nil
// If the error is of type CorruptInputError, we can get the position of the illegal character
// and perform a partial decoding up to that point
if corrErr, ok := err.(base64.CorruptInputError); ok {
illegalCharPos := int(corrErr)
// Forgiving call to DecodeString, decoding is performed up to the illegal characther
// If an error occurs, dec will still contain the decoded string up to the error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the error being discussed here about line 19? I think we should comment about line 26. Something like we can be sure we won't have any decode error here since we truncate to the first error index

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to rewrite this comment. The second DecodeString call might still lead to an error. As far as I understood it may happen if the trailing character is converted into a not printable byte.
For example, starting from PHNjcmlwd.D5hbGVydCgxKTwvc2NyaXB0Pg==
The first error tells us that the . is illegal. DecodeString returns an empty conversion at this point.
The truncation is performed, and we run DecodeString against PHNjcmlwd.
Also, this time an error pops up about the latest character d, but now the output is not empty, and is equal to the conversion that would happen to convert PHNjcmlw. This is why I called this second call a "forgiving" one, the idea was not to check anymore the error, but rather just return the best-effort conversion.

▶ echo "PHNjcmlwd" | base64 --decode
<scrip
▶ echo "PHNjcmlw" | base64 --decode
<scrip

dec, _ = base64.RawStdEncoding.DecodeString(dataNoPadding[:illegalCharPos])
} else {
return data, false, nil
}
}
return stringsutil.WrapUnsafe(dec), true, nil
}
78 changes: 71 additions & 7 deletions internal/transformations/base64decode_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -10,17 +10,81 @@ import (
"testing"
)

var b64DecodeTests = []string{
"VGVzdENhc2U=",
"P.HNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==",
"VGVzdABDYXNl",
var b64DecodeTests = []struct {
name string
input string
expected string
}{
{
name: "Valid",
input: "VGVzdENhc2U=",
expected: "TestCase",
},
{
name: "Valid with \u0000",
input: "VGVzdABDYXNl",
expected: "Test\x00Case",
},
{
name: "Valid without padding",
input: "VGVzdENhc2U",
expected: "TestCase",
},
{
name: "Valid without longer padding",
input: "PA==",
expected: "<",
},
{
name: "valid <TEST>",
input: "PFRFU1Q+",
expected: "<TEST>",
},
{
name: "decoded up to the space (invalid caracter)",
input: "PFR FU1Q+",
expected: "<T",
},
{
name: "decoded up to the dot (invalid caracter)",
input: "P.HNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==",
expected: "",
},
{
name: "decoded up to the dot (invalid caracter)",
input: "PHNjcmlwd.D5hbGVydCgxKTwvc2NyaXB0Pg==",
expected: "<scrip",
},
{
name: "decoded up to the dot (invalid caracter)",
input: "PHNjcmlwdD.5hbGVydCgxKTwvc2NyaXB0Pg==",
expected: "<script",
},
{
name: "decoded up to the dash (invalid caracter for base64.RawStdEncoding)",
input: "PFRFU1Q-",
expected: "<TEST",
},
}

func TestBase64Decode(t *testing.T) {
for _, tt := range b64DecodeTests {
t.Run(tt.name, func(t *testing.T) {
actual, _, err := base64decode(tt.input)
if err != nil {
t.Errorf("Unexpected error: %v", err)
}
if actual != tt.expected {
t.Errorf("Expected %q, but got %q", tt.expected, actual)
}
})
}
}
func BenchmarkB64Decode(b *testing.B) {
for _, tt := range b64DecodeTests {
b.Run(tt, func(b *testing.B) {
b.Run(tt.input, func(b *testing.B) {
for i := 0; i < b.N; i++ {
_, _, err := base64decode(tt)
_, _, err := base64decode(tt.input)
if err != nil {
b.Error(err)
}
Expand All @@ -31,7 +95,7 @@ func BenchmarkB64Decode(b *testing.B) {

func FuzzB64Decode(f *testing.F) {
for _, tc := range b64DecodeTests {
f.Add(tc)
f.Add(tc.input)
}
f.Fuzz(func(t *testing.T, tc string) {
data, _, err := base64decode(tc)
Expand Down
Loading