Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Error Reform and Failure Box #492

Closed
wants to merge 9 commits into from

Conversation

mitsuhiko
Copy link
Contributor

This RFC proposes four things:

  • a reform to the Error trait to support better messages.
  • a new Failure<T> wrapper for errors to support optional tracebacks.
  • a ConstructFailure trait to support interoperability between errors and failures.
  • New try! and fail! macros that use ConstructFailure instead of FromError.

Rendered View

solution to this problem is to box the error up and move it to the heap. While
heap allocation can sound controversial, the truth is that an `IoResult<u8>` is
currently 72 bytes large. Given that the vast majority of IO calls will actually
succeed this is a huge overhead.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still like to see some concrete examples showing that there is in fact "huge overhead" here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, i'm going to provide a proper benchmark for this. The main problem is making it representative :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet in the RFC because I need to validate those numbers, but I get this currently:

test bench_large_result_3    ... bench:       134 ns/iter (+/- 37)
test bench_large_result_7    ... bench:       205 ns/iter (+/- 128)
test bench_large_result_ok_3 ... bench:        14 ns/iter (+/- 8)
test bench_large_result_ok_7 ... bench:        26 ns/iter (+/- 27)
test bench_small_result_3    ... bench:       128 ns/iter (+/- 38)
test bench_small_result_7    ... bench:       134 ns/iter (+/- 41)
test bench_small_result_ok_3 ... bench:         8 ns/iter (+/- 4)
test bench_small_result_ok_7 ... bench:        13 ns/iter (+/- 12)

_N means that it went through N stack frames. large_result means that it's thrown without boxing, small_result is the same error in a Box. _ok means Ok branch taken, otherwise it's the Err branch. The error thrown is a regular IoError.

Code is in the rust-incidents repo.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Came here to ask for exactly these numbers. What is still needed to validate your benchmarks?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this is for IoError sized structs. The odd thing is that I even see performance improvements on the OK branch for boxed up plain enums which seems suspicious.


# Questions

Q: Why is `name()` and `description()` necessary? What about translations?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/is/are/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, fixed.

@bstrie
Copy link
Contributor

bstrie commented Nov 30, 2014

cc @aturon

@bstrie
Copy link
Contributor

bstrie commented Nov 30, 2014

cc @glaebhoerl

@pythonesque
Copy link
Contributor

+1, with this RFC I would feel comfortable using Rust's normal error handling in many more scenarios than I currently do (because of size concerns and lack traceback).

@arthurprs
Copy link

Thanks for bringing up the fat results issue for discussion.
Although useful the hole traceback thing is very complex. Mixed feelings about it.


```rust
trait ConstructFailure<A> {
fn construct_failure(args: A, loc: Option<LocationInfo>) -> Self;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this lead to unneccesary allocation? Perhaps loc could be a closure that returned an Option<LocationInfo>

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ideal case is loc bring a uint that holds the instruction pointer and getting debug info from dwarf. There is nothing that prevents this other than that it's more work :)

@arielb1
Copy link
Contributor

arielb1 commented Nov 30, 2014

@mitsuhiko

Could you open an issue for the missing null pointer optimisation?

@mitsuhiko
Copy link
Contributor Author

@arielb1 filed as rust-lang/rust#19419.

@alexcrichton
Copy link
Member

cc rust-lang/rust#18629, a PR about conventions with the values returned by these error methods.

@aturon
Copy link
Member

aturon commented Dec 8, 2014

@mitsuhiko Sorry for the delay in getting back to you on this -- just back from the Mozilla work week in Portland.

Thanks for writing up such a detailed and clear RFC!

At a high level, I really like this proposal. It's highly practical, and increases the ergonomics of debugging without undue burden elsewhere. So mostly, I just have quibbles/questions about the details.

  • Despite the entry in the FAQ, I'm still not clear on the distinction between name and description. The vast majority of exception types that I'm familiar with just use the description and detail fields as I originally did with the Error trait. Can you give some concrete examples that motivate having all three fields, and explain why you couldn't handle them with two?
  • Is the use of Box in the ndebug version of Failure strictly to make it more implicit/easier to worked with boxed error types? I had some questions there:
    • You say that the use of Box "forces developers to implement the error trait for the box as well." But why not give a blanket implementation of Error for Box<T> where T: Error? I tried this and it seems to work fine.
    • You say "it also makes pattern matching over data contained on the error a lot more awkward as it requires deref-ing the error first." But I don't see how the Failure set up changes this -- maybe I'm missing something?
  • To be clear, do you intend that "leaf" APIs (the ones that originate an error) would just return a Result over that error type directly, whereas applications and abstract libraries would opt in to Failure for tracebacks? You talk a bit about this, but I wanted to clarify this point.
  • I believe that the ConstructFailure can basically be totally hidden? (If only we had full visibility hygiene for macros!).
  • The Clone bound is problematic for the new object safety rules. I think right now there is a hole in the rules that allows unsafe super traits, but once this is plugged I believe you won't be able to use Clone here any more. I suspect in the cases where you need it, you can always demand it separately.
  • The change from Send to 'static is probably ok, though the motivation isn't entirely clear to me (and the result is that you're not guaranteed to be able to communicate an error from one thread to another).
  • The downcasting infrastructure I'd prefer to leave out for now, as we decided when we originally landed Error. The problem is that it somewhat subverts the goals of the FromError infrastructure and the conventional enum-based representation for errors. We could certainly consider adding it later.
  • Overall, I found the ConstructFailure implementation a bit daunting. I'd like to iterate with you to see if we can simplify (though it's possible we cannot). Still, seems like it will be mostly hidden from view.
  • I think it's OK to enforce Error when using the try! macro; as you say, this is probably a good practice anyhow.
  • Finally, I'm just wondering about the 1.0 implications here. Clearly, the change to the Error trait is breaking, as is the Error requirement for try!. But I don't think anything else is. And if I understand the design correctly, it doesn't seem like the Failure box will actually be used anywhere in libstd itself (which is only an originator of errors). So this infrastructure could plausibly live out of tree for a while. I'm sympathetic to the concern that we want to standardize on this infrastructure, but I also don't want to standardize too quickly. In general, we're hoping that crates.io can serve as a means of experimentation and "market testing" before landing in std. To be clear, I'm not opposed to landing this now -- I just want to lay out the options. After all, on the flip side, having beautiful tracebacks could help make a stronger impression at 1.0.

Is anyone using rust-incidents in their own projects yet? What's your experience been like?

cc @wycats

@mitsuhiko
Copy link
Contributor Author

@aturon

Despite the entry in the FAQ, I'm still not clear on the distinction between name and description. The vast majority of exception types that I'm familiar with just use the description and detail fields as I originally did with the Error trait. Can you give some concrete examples that motivate having all three fields, and explain why you couldn't handle them with two?

I started out with trying to understand the errors as they are reported currently. At present I miss the name the most because it's tricky to find out which exact error was actually raised. As the description is more free form text content and can differ greatly between errors.

The idea was that the error could be rendered like this: "{name}: {description) ({detail})".

The difference between name and description is that the name is not a sentence or phrase. The difference between description and detail is that the detail is computed on the fly whereas description is just a basic textual description of what the problem might be. It's also important to point out that currently very often description is the same as detail which makes generic error rendering read really weird.

You say that the use of Box "forces developers to implement the error trait for the box as well." But why not give a blanket implementation of Error for Box where T: Error? I tried this and it seems to work fine.

There are two problems with this. The first is that because at that point you cannot have tracebacks any more or each error needs to be able to store traceback and location information. The second problem is that I could not make generic interoperability between boxed and unboxed errors work while using generic implementations for Box<T>.

You say "it also makes pattern matching over data contained on the error a lot more awkward as it requires deref-ing the error first." But I don't see how the Failure set up changes this -- maybe I'm missing something?

You are right that the same issue applies to Failure<T> however the big difference is that Failure<T> does not implement Error. It becomes very confusing to have an error implementation on a box and on the inner type. The failure wrapper allows APIs to continue emitting the internal errors directly if they do not want to force memory allocations. Interoperability between errors and failures is still guaranteed through explicit semantics.

With failure it feels like there are two concepts: errors and a failure type that wraps an error. With Box<Error> and Error directly there are two things that both implement error but the way to pattern match on the data is different, one involves the box keyword.

However core problem is still that location information cannot be achieved with Box.

To be clear, do you intend that "leaf" APIs (the ones that originate an error) would just return a Result over that error type directly, whereas applications and abstract libraries would opt in to Failure for tracebacks? You talk a bit about this, but I wanted to clarify this point.

I would say it depends on the API. I would argue that the IoError currently is just too large and it really should be a simple enum. IO operations would just generally return a simple enum that implements Error and be done with it. However for some other APIs it might make sense to use Failures directly. I would not want to draw a clear line where which one is preferable. Voices have been raised that requiring heap allocations for error handling is not a good idea.

I believe that the ConstructFailure can basically be totally hidden? (If only we had full visibility hygiene for macros!).

Yes, that in a way would have been the plan. However because of the few different conversions it can provide I figured exposing it for documentation purposes is not the worst idea.

The Clone bound is problematic for the new object safety rules. I think right now there is a hole in the rules that allows unsafe super traits, but once this is plugged I believe you won't be able to use Clone here any more. I suspect in the cases where you need it, you can always demand it separately.

I agree that Clone is probably not the best idea but it would support tracebacks tremendously because you do not lose the original error information upon conversion. The issue with "in the cases where you need it" is that we do not have a way to say: but if the error is also Clone, tracebacks preserve it, otherwise not.

The change from Send to 'static is probably ok, though the motivation isn't entirely clear to me (and the result is that you're not guaranteed to be able to communicate an error from one thread to another).

The problem with Send is that you cannot have reference stored on them which makes some otherwise neat things impossible. It also currently (due to bugs?) requires lots of methods to be declared as X + Send even though X is already a subtype of send.

The downcasting infrastructure I'd prefer to leave out for now, as we decided when we originally landed Error. The problem is that it somewhat subverts the goals of the FromError infrastructure and the conventional enum-based representation for errors. We could certainly consider adding it later.

The traceback pretty much depends on it. Note that the API can be entirely internally for a start.

Overall, I found the ConstructFailure implementation a bit daunting. I'd like to iterate with you to see if we can simplify (though it's possible we cannot). Still, seems like it will be mostly hidden from view.

I agree with this. I would love to get some feedback on if it would be possible to a similar user experience differently.

Finally, I'm just wondering about the 1.0 implications here. Clearly, the change to the Error trait is breaking, as is the Error requirement for try!. But I don't think anything else is. And if I understand the design correctly, it doesn't seem like the Failure box will actually be used anywhere in libstd itself (which is only an originator of errors). So this infrastructure could plausibly live out of tree for a while. I'm sympathetic to the concern that we want to standardize on this infrastructure, but I also don't want to standardize too quickly. In general, we're hoping that crates.io can serve as a means of experimentation and "market testing" before landing in std. To be clear, I'm not opposed to landing this now -- I just want to lay out the options. After all, on the flip side, having beautiful tracebacks could help make a stronger impression at 1.0.

That's why I would like to split the rfc into two parts; the necessary changes to Error and then everything else. Once the Error trait has been adjusted to support name() and a 'static trait (and an undocumented TypeId :)) everything else can live externally. I can do without Clone but I'm pretty sure TypeId will be necessary but it can be entirely internal and only available for error handling experiments.

Is anyone using rust-incidents in their own projects yet? What's your experience been like?

Because the traits are different using incidents currently is basically too much work :(

@alexcrichton
Copy link
Member

It also currently (due to bugs?) requires lots of methods to be declared as X + Send even though X is already a subtype of send.

I just tried this out in Cargo (we have a done of Foo + Send where trait Foo: Send), and it looks like this bug has been fixed! (see rust-lang/cargo#1024)

The change from Send to 'static is probably ok

I'll note that in Cargo we rely on being able to send errors across threads (just as a data point). Our use case is that the worker threads (aka parallel builds) are the ones generating errors which need to be communicated back to the main thread.

The problem with Send is that you cannot have reference stored on them which makes some otherwise neat things impossible

I may have missed something, but doesn't 'static also prevent storing references? The main type that Send blocks out today is Rc I think.

The downcasting infrastructure I'd prefer to leave out for now

As another data point, Cargo is using its own Error-like infrastructure (predated the std::error module) and one of the methods we have on our CargoError trait is an is_human method which is a flag indicating whether an error can be displayed to the user or not. I think that a form of downcasting would help a lot in detecting classes of errors like this (e.g. err.is::<HumanError>()). Right now boxed errors are pretty opaque and tough to get any sort of extra info out of (again, just a data point!)

@alexcrichton
Copy link
Member

@mitsuhiko actually I'm a little confused about one aspect here. In the RFC you've defined the Error trait as:

trait Error: 'static + Send + Clone { ... }

But one your comments seems to contradict this:

The change from Send to 'static is probably ok ...

The problem with Send is that you cannot have reference stored on them which makes some otherwise neat things impossible

I think @aturon misconstrued the addition of 'static to think it was replacing Send, and I think I also did the same thing!

I just wanted to clarify what the intended outcome of the bounds on Error are. Note that Send implies 'static (so the bounds as written are redundant), and 'static implies no references can be contained (which I think contradicts what you were mentioning about storing references). Just making sure we're all on the same page!

@mitsuhiko
Copy link
Contributor Author

I may have missed something, but doesn't 'static also prevent storing references? The main type that Send blocks out today is Rc I think.

You are absolutely correct. I'm not sure why I missed that :-/

actually I'm a little confused about one aspect here. In the RFC you've defined the Error trait as:

As mentioned I'm not sure about the intended trait bounds. The RFC is the conservative approach I think. I will do some experiments with just 'static and Send to see if it works out.

I just wanted to clarify what the intended outcome of the bounds on Error are. Note that Send implies 'static (so the bounds as written are redundant), and 'static implies no references can be contained (which I think contradicts what you were mentioning about storing references). Just making sure we're all on the same page!

Yeah. Just to clarify the intentions of the bounds:

  • Send was originally added to the error trait to allow sending them around (sensible)
  • 'static was added (in addition) so that TypeId becomes available. I believe this is required to make this work? Maybe it's implied now due to Send?
  • Clone is from my experiments helpful to allow both error conversion as keeping the original error around. This might be not necessary and I will see if I can due the same without clone.

@alexcrichton
Copy link
Member

Cool, thanks for the clarification @mitsuhiko!

@aturon
Copy link
Member

aturon commented Dec 17, 2014

@mitsuhiko Just an update: we do plan to take some action here but are currently swamped with other issues getting ready for the alpha. I will be in touch ASAP.

@aturon
Copy link
Member

aturon commented Dec 27, 2014

@mitsuhiko I just had a thought: I wonder whether the new fmt::String/fmt::Show changes should have an impact on the Error design?

One issue in the original design was whether detail should return Option<&str> or Option<String>. The latter makes it possible to lazily allocate/compute the string from other fields, but means that each such computation results in a new allocation. And in general we prefer being able to write to a writer, since producing a string is a special case.

So perhaps Error should inherit from fmt::String? But if so, would we want to retain something like detail as a convenience?

@mzabaluev
Copy link
Contributor

@aturon If detail is made non-allocating, what does it provide on top of description?

As I see it, the expectation is that detail will be called once or a very few times, likely in some detailed error display or a catch-all logging handler. It may not be called at all, especially if the error is acted upon as non-fatal, chained, or ignored. When dynamic data are used to format the detail, allocating a string internally just in case it may be required would often be wasteful.

Maybe the detail method should take a writer to write to?

@aturon
Copy link
Member

aturon commented Feb 4, 2015

@mzabaluev Apologies, I didn't see your comment until just now.

To be clear, what I meant was that we could use what is now fmt::Display to replace detail -- and that API in fact takes a writer. This change has, in fact, already been made as part of a separate RFC.

@aturon
Copy link
Member

aturon commented Feb 4, 2015

@mitsuhiko I'm wondering your thoughts on the change to use fmt::Display in place of detail?

I would like to tease out any remaining backwards-incompatible aspects of this RFC; we are getting ready to stabilize std::error. I don't think there's been a lot of buy-in for further splitting up detail/description; I believe that was the main incompatible change? Please let me know.

Once we've extracted the backcompat hazards, I think this should probably be closed in favor of a crates.io implementation for the time being.

You mentioned splitting up the RFC into those two parts previously; are you still open to doing that? I'd like to make some progress here.

@aturon aturon added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label May 22, 2015
@alexcrichton
Copy link
Member

Since this RFC was opened, the Error trait in the standard library has changed somewhat significantly (and almost has stable downcasting support) allowing for many sorts of more flavorful error abstractions to be built. There are likely still many extensions, and perhaps concrete errors, which can be added to the standard library, but we think it may be good to take a fresh look at the approach with a new spin on modern Rust (especially with relation to @aturon's last set of questions).

For now I'm going to close this RFC, but thanks regardless for it @mitsuhiko!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-libs-api Relevant to the library API team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.