Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: fmt: don't recover panics except for dereferencing nil fmt.Stringer receivers #28150

Closed
bcmills opened this issue Oct 11, 2018 · 21 comments
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. Proposal
Milestone

Comments

@bcmills
Copy link
Contributor

bcmills commented Oct 11, 2018

When a type passed to a fmt function implements fmt.Stringer, the fmt package recovers any panics that the String method produces.

That's helpful for the common case of nil pointers (https://play.golang.org/p/YbmlVGcfPvO), but in other cases masks real, important bugs — and can lead to unexpected deadlocks (https://play.golang.org/p/GjOv8M75GqG; see also #25245).

I propose that, for Go 2, the fmt package should only recover a panic from a fmt.Stringer if:

  • the value underlying the fmt.Stringer is a nil value of a pointer type, and
  • the panic occurred as a result of dereferencing the receiver (not due to an explicit panic call or a dereference of some other value).

Enforcing the latter condition may require deeper runtime support.

@bcmills bcmills added v2 An incompatible library change Proposal labels Oct 11, 2018
@bcmills bcmills added this to the Go2 milestone Oct 11, 2018
@bcmills bcmills changed the title proposal: Go 2: fmt: don't recover panics except for nil fmt.Stringer receivers proposal: Go 2: fmt: don't recover panics except for dereferencing nil fmt.Stringer receivers Oct 11, 2018
@robpike
Copy link
Contributor

robpike commented Oct 11, 2018

The fmt package tries hard to do something always. The idea is that if you're printing something, and there's a problem, the problem is likely due to code that hasn't run before and/or is hard to get to run at all, so it's better to print something than crash or return error.

I'd like to keep it that way. It's worth not crashing a program because a log statement has a bad argument, for example. I admit this can cause problems sometimes, but all decisions do, and your proposal adds complexity and removes utility in the aid of relatively rare cases. The tradeoff doesn't feel right.

@bcmills
Copy link
Contributor Author

bcmills commented Oct 25, 2018

The idea is that if you're printing something, and there's a problem, the problem is likely due to code that hasn't run before and/or is hard to get to run at all, so it's better to print something than crash or return error.

If the problem is “due to code that hasn't run before and/or is hard to get to run at all,” that seems to make it even more important to preserve the stack trace that led to it — which is exactly the information that recover destroys.

@bcmills
Copy link
Contributor Author

bcmills commented Oct 25, 2018

The recover in fmt is especially insidious because — unlike the recover added for #28242 — it does not result in a user-visible error.

(Users often omit error checks for fmt calls, and even if they did check, the fmt functions don't return an error on panic anyway: https://play.golang.org/p/x92v-MWuzPF)

@robpike
Copy link
Contributor

robpike commented Oct 25, 2018

If the problem is “due to code that hasn't run before and/or is hard to get to run at all,” that seems to make it even more important to preserve the stack trace that led to it — which is exactly the information that recover destroys.

I know what you're trying to say, but I disagree. I'd rather see bogus output from a rare log statement than crash a production job.

I do not want to change this behavior in the fmt package.

@mvdan
Copy link
Member

mvdan commented Nov 16, 2018

Also worth noting that text/template now recovers panics encountered while calling template functions: #28242

One of the big points in favor was that fmt did something very similar. So if fmt changes in Go2, perhaps other pieces of the standard library like text/template should be reconsidered too.

There's one big difference, however. If fmt recovers a panic, it can be easy for it to go unnoticed, whereas Template.Execute will simply return the recovered panic as an error. That should be much more obvious.

@ianlancetaylor ianlancetaylor added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Nov 27, 2018
@mvdan
Copy link
Member

mvdan commented Jan 29, 2019

I just realised that my last comment was completely redundant given @bcmills' earlier comment; apologies for not reading the thread properly.

@nhooyr
Copy link
Contributor

nhooyr commented Feb 9, 2019

Why don't we make fmt show the stack trace as well instead of just the panic value? That would be a reasonable middle ground.

@nhooyr
Copy link
Contributor

nhooyr commented Feb 9, 2019

E.g. this code (https://play.golang.org/p/g0_AlCKKOa3) right now produces

%!s(PANIC=runtime error: invalid memory address or nil pointer dereference)

Along with the error message, it'd include a full stack trace to make it easier to debug.

@gopherbot gopherbot removed the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Aug 16, 2019
@gopherbot gopherbot added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Sep 3, 2019
@rsc rsc changed the title proposal: Go 2: fmt: don't recover panics except for dereferencing nil fmt.Stringer receivers proposal: fmt: don't recover panics except for dereferencing nil fmt.Stringer receivers Jun 21, 2023
@ianlancetaylor ianlancetaylor removed the v2 An incompatible library change label Jun 21, 2023
@ianlancetaylor ianlancetaylor modified the milestones: Go2, Proposal Jun 21, 2023
@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Jun 21, 2023
@rsc
Copy link
Contributor

rsc commented Jun 21, 2023

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc moved this from Incoming to Active in Proposals Jun 21, 2023
@dottedmag
Copy link
Contributor

Along with the error message, it'd include a full stack trace to make it easier to debug.

Stack trace includes internal identifiers and paths, so putting them into fmt output might inadvertently expose them to the clients/users/other parties who should not have access to this information, unlike template where recovered panics do not end up in the rendered output. Even panic message is arguably too much information revealed.

@robpike
Copy link
Contributor

robpike commented Jun 22, 2023

I am not in favor of adding stack traces automatically, as I always groan when I get dumped a stack trace by some program I'm not responsible for. Perhaps under a flag or something, but not by default, please.

Also, how would you do that? The runtime/debug package already depends on fmt. The fmt package is a critical piece of the core and most of the library already uses it. You can't have fmt call out to much of the library without creating a circular dependency.

@rsc
Copy link
Contributor

rsc commented Jun 28, 2023

Based on the discussion above, this proposal seems like a likely decline.
— rsc for the proposal review group

@rsc rsc moved this from Active to Likely Decline in Proposals Jun 28, 2023
@bcmills
Copy link
Contributor Author

bcmills commented Jun 29, 2023

I agree that adding stack traces to fmt output is a non-starter. It's too verbose, leaks too much information, and still doesn't address the fundamental problem of silently-corrupted output.

The goal of this proposal is to make a common class of bugs more likely to be found (in tests — especially fuzz tests — and in server logs), and thus more likely to be fixed before they cause real damage (like corrupting stored production data). Adding stack traces to fmt output would do almost nothing in the service of that goal.

@rsc
Copy link
Contributor

rsc commented Jul 5, 2023

The counterargument is that a lot of the time when you are debugging a mysterious problem, new prints get exercised. The last thing you want is to lose the info being printed - or for the whole server to crash - just because of a bad print. It is almost always better to format something and see a weird-looking print in a log than to lose the trace of the problem as a whole. I've seen that happen in other systems (not Go, since we guarded against this).

Have you seen the kind of corruption you are hypothesizing caused by fmt recovering panics in real systems?

@rsc rsc moved this from Likely Decline to Active in Proposals Jul 5, 2023
@rsc
Copy link
Contributor

rsc commented Jul 5, 2023

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@aarzilli
Copy link
Contributor

aarzilli commented Jul 5, 2023

The counterargument is that a lot of the time when you are debugging a mysterious problem, new prints get exercised. The last thing you want is to lose the info being printed - or for the whole server to crash - just because of a bad print.

I've been wondering if this means that this program https://go.dev/play/p/Cg-49ZzTcjn should not crash (even though the crash is implied by the documentation of fmt).

@bcmills
Copy link
Contributor Author

bcmills commented Jul 5, 2023

The last thing you want is to lose the info being printed - or for the whole server to crash - just because of a bad print.

I'm more sympathetic to that argument for functions that are specific to logging, such as in the log and log/slog packages; however, fmt.Fprintf and similar are not specific to logging, and are used in packages like text/template that are mostly not used in a logging context.

Have you seen the kind of corruption you are hypothesizing caused by fmt recovering panics in real systems?

I have definitely seen deadlocks caused by fmt recovering panics in real systems. (Google bug # 8658139 is one such example, which motivated me to file issue #5350 over ten years ago. 😅)

I don't have any handy examples of data corruption, although I'm also not entirely sure how to go about looking for those. (Although I have been maintaining Go programs for a long time, I haven't maintained many production systems in Go that produced long-lived or user-facing data using fmt or packages such as text/template that build on it.)

@Merovius
Copy link
Contributor

I know I'm unlikely to sway anyone, but I'll leave the obligatory "I don't think we should recover any panics" comment.

I do occasionally get frustrated having to debug a panic in a Stringer or error implementation, because of the lack of context available, so I would welcome if fmt would leave them alone. And I don't like the idea of inspecting the dynamic value of interface implementations - especially not the idea of nil pointers being treated differently. I realize fmt is probably already doing that, but to me, it's not an exception to this rule. I also don't really see a difference between, say, a nil-pointer dereference and an out-of-bounds slice access, except maybe that slice-receiver are less common than pointer-receiver.

So if it where up to me, fmt (and text/template) would just leave panics alone.

@rsc
Copy link
Contributor

rsc commented Oct 24, 2023

I looked into Google bug 8658139, which turns out to have been about html/template, not fmt. Specifically, a method was being called in a template to create a web page, and it panicked while holding a lock, which made future attempts to render that specific page block forever. Now, it may be that in this specific instance html/template had called fmt, which ate the panic, but if fmt hadn't eaten the panic, then html/template would have instead (via text/template). And if html/template hadn't, then net/http would have. That is, there were three different packages stacked up, all of which would have recovered the panic. The code in question was only reading a data structure, and it should have used defer to unlock its lock. (This was in the bad old days when people avoided defer due to perceived performance problems. Those were never very large, and now they are completely gone.)

The fundamental question is whether it is permissible for Go packages to defend against problems in the code they call by recovering from panics. There are good arguments on both sides.

  • Against: It is possible that recovering a panic will mean that a deferred lock is unlocked with the data it protects in an inconsistent state. Or it is possible, as in the deadlock above, that a lock won't be unlocked or some other cleanup won't be done. (Of course, if we decide that recovering is OK, then not deferring the unlock is a bug, so the main concern is the inconsistent data.)

  • In favor: Recovering most panics leads to more stable, reliable programs. Not every program runs in a context where it will be automatically restarted when it crashes. Even those that do pay a restart cost that can be quite high. Keeping the program running is what we already do today, and despite the potential for inconsistency, empirically it works very well most of the time.

Personally I think the arguments come out in favor of "in favor", although I could be convinced they are merely balanced. Either way, that suggests we should preserve the status quo rather than make a very significant semantic change in the many packages that recover from panics: not just fmt, but also text/template, net/http, and probably others I am forgetting.

I admit the technical incorrectness of recovering, but pragmatically it seems to be much better than not.

@rsc
Copy link
Contributor

rsc commented Nov 2, 2023

Based on the discussion above, this proposal seems like a likely decline.
— rsc for the proposal review group

@rsc rsc moved this from Active to Likely Decline in Proposals Nov 2, 2023
@rsc
Copy link
Contributor

rsc commented Nov 10, 2023

No change in consensus, so declined.
— rsc for the proposal review group

@rsc rsc moved this from Likely Decline to Declined in Proposals Nov 10, 2023
@rsc rsc closed this as completed Nov 10, 2023
@golang golang locked and limited conversation to collaborators Nov 9, 2024
@aclements aclements removed this from Proposals Nov 13, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. Proposal
Projects
None yet
Development

No branches or pull requests

10 participants