Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix panic in Writer #1384

Closed
wants to merge 1 commit into from
Closed

Conversation

Luap99
Copy link
Contributor

@Luap99 Luap99 commented May 17, 2023

Commit 766cfec introduced this bug by defining an incorrect split function. First it breaks the old behavior because it never splits at newlines now. Second, it causes a panic because it never tells the scanner to stop. See the bufio.ScanLines function, something like:

if atEOF && len(data) == 0 {
	return 0, nil, nil
}

is needed to do that.

This commit fixes it by restoring the old behavior and calling bufio.ScanLines but also keep the 64KB check in place to avoid buffering for to long.

Two tests are added to ensure it is working as expected.

Fixes #1383

Commit 766cfec introduced this bug by defining an incorrect split
function. First it breaks the old behavior because it never splits at
newlines now. Second, it causes a panic because it never tells the
scanner to stop. See the bufio.ScanLines function, something like:
```
if atEOF && len(data) == 0 {
	return 0, nil, nil
}
```
is needed to do that.

This commit fixes it by restoring the old behavior and calling
bufio.ScanLines but also keep the 64KB check in place to avoid buffering
for to long.

Two tests are added to ensure it is working as expected.

Fixes sirupsen#1383

Signed-off-by: Paul Holzinger <[email protected]>

lines := strings.Split(strings.TrimRight(buf.String(), "\n"), "\n")
// we should have 4 lines because we wrote more than 64 KB each time
assert.Len(t, lines, 4, "logger printed incorrect number of lines")
Copy link
Owner

@sirupsen sirupsen May 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey! Thank you for fixing this (I had to revert the other commit in master, because of this panic). Let's get this fixed properly.

I think there's two ways of dealing with this, and I'm interested in your take (and those on the original issue @kke @brahads @ozfive @ashmckenzie @scarroll32 @voidspooks).

Intuitively, I would expect this to yield 6 log lines:

# approach 1
line1: 1024 * 64 * A
line2: 100 * A
line3: 1024 * 64 * A
line4: 100 * A
line5: 1024 * 64 * A
line6: 100 * A

But, this implementation from @ozfive yields:

# approach 2 (implemented)
line1: 1024 * 64 * A
line2: 1024 * 64 * A 
line3: 1024 * 64 * A
line4: 300 * A

In other words, log lines bleed into each other. The As from line1 bleed into line2, and so on.

I prefer approach 1, this seems way more intuitive to me when reading logs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally do not care much, the current behavior is to only split at newlines. So if you write something without a newline at the end it also buffered it like we do now.

I am fine with changing the behavior to approach 1 but at that point we might as well just do not use the Scanner at all and exec the print function for each Write() call the caller makes, in this case we do not need to worry about the scanner limitations at all and do not need extra buffers.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sirupsen bumping this. I agree with @Luap99's explanation though.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great if we can put a fix for the panic soon so that we can upgrade to a version without the DoS vulnerability that's described in #1370

Copy link
Contributor

@ashmckenzie ashmckenzie Jun 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay @sirupsen (was on vacation 😉), using the following test program:

package main

import (
	"bufio"
	"strings"
	"time"

	logrus "github.com/sirupsen/logrus"
)

func main() {
	output1 := strings.Repeat("A", bufio.MaxScanTokenSize) + "Z"
	output2 := strings.Repeat("B", bufio.MaxScanTokenSize) + "Z"

	logger := logrus.New()
	writer := logger.Writer()

	writer.Write([]byte("start"))
	writer.Write([]byte(output1))
	writer.Write([]byte(output2))
	writer.Write([]byte("end"))

	writer.Close()

	time.Sleep(500 * time.Millisecond)
}

as well as an updated writerScanner() function:

func (entry *Entry) writerScanner(reader *io.PipeReader, printFunc func(args ...interface{})) {
	for {
		data := make([]byte, bufio.MaxScanTokenSize)
		c, err := reader.Read(data)
		if err != nil && err != io.EOF {
			entry.Errorf("Error while reading from Writer: %s", err)
			break
		}

		if c > 0 {
			chunks := strings.Split(strings.TrimRight(string(bytes.Trim(data, "\x00")), "\n"), "\n")
			for _, section := range chunks {
				printFunc(section)
			}
		}
	}

	// Close the reader when we are done
	reader.Close()
}

I believe this achieves the preferred # approach 1:

$ go run play.go 2>&1
time="2023-06-02T22:06:34+10:00" level=info msg=start
time="2023-06-02T22:06:34+10:00" level=info msg=A<lots of As>
time="2023-06-02T22:06:34+10:00" level=info msg=Z
time="2023-06-02T22:06:34+10:00" level=info msg=B<lots of Bs>
time="2023-06-02T22:06:34+10:00" level=info msg=Z
time="2023-06-02T22:06:34+10:00" level=info msg=end

Am just updating the TestWriterSplitsMax64KB() test in a new branch to demonstrate. WDYT?

@sirupsen
Copy link
Owner

sirupsen commented Jun 3, 2023

@ashmckenzie @Luap99 I've decided to merge this behaviour and it has been released as v1.9.3. Thank you both for your efforts!

It is the least invasive change, and I'm committed to maintaining as strong backwards compatibility as possible due to the # of users of logrus. Let me know if you find any behaviours with this version

@sirupsen sirupsen closed this Jun 3, 2023
@ashmckenzie
Copy link
Contributor

Thanks @sirupsen and @Luap99, am very pleased to see this merged 🙂:+1::bow:

@kke
Copy link

kke commented Jun 7, 2023

Great! But as noted here k0sproject/k0s#3182 (comment) - the downside is it's not unicode safe

@sirupsen
Copy link
Owner

sirupsen commented Jun 7, 2023

@kke yeah I believe breaking codepoints is acceptable to make this change as minimal as possible. You shouldn't be logging > 64 KiB lines, because as also noted in the comment—it's difficult to ascertain where things came from due to how the log lines are broken up (see #1384 (comment)). If you absolutely need to log huge lines like this, you should choose something more performant than logrus :)

@kke
Copy link

kke commented Jun 8, 2023

Yeah, for us the problem comes from an external program which is launched through os/exec and is given logrus a writer as cmd.Stdout, one of the external tools dumped a certificate or whatever in debug mode and made the writer crash.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

v1.9.1 causes panic in writer
6 participants