Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue: Procedural Macro Diagnostics (RFC 1566) #54140

Open
2 of 5 tasks
SergioBenitez opened this issue Sep 11, 2018 · 81 comments
Open
2 of 5 tasks

Tracking Issue: Procedural Macro Diagnostics (RFC 1566) #54140

SergioBenitez opened this issue Sep 11, 2018 · 81 comments
Labels
A-diagnostics Area: Messages for errors, warnings, and lints A-macros-1.2 Area: Declarative macros 1.2 B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@SergioBenitez
Copy link
Contributor

SergioBenitez commented Sep 11, 2018

This is a tracking issue for diagnostics for procedural macros spawned off from #38356.

Overview

Current Status

Next Steps

Summary

The initial API was implemented in #44125 and is being used by crates like Rocket and Diesel to emit user-friendly diagnostics. Apart from thorough documentation, I see two blockers for stabilization:

  1. Multi-Span Support

    At present, it is not possible to create/emit a diagnostic via proc_macro that points to more than one Span. The internal diagnostics API makes this possible, and we should expose this as well.

    The changes necessary to support this are fairly minor: a Diagnostic should encapsulate a Vec<Span> as opposed to a Span, and the span_ methods should be made generic such that either a Span or a Vec<Span> (ideally also a &[Vec]) can be passed in. This makes it possible for a user to pass in an empty Vec, but this case can be handled as if no Span was explicitly set.

  2. Lint-Associated Warnings

    At present, if a proc_macro emits a warning, it is unconditional as it is not associated with a lint: the user can never silence the warning. I propose that we require proc-macro authors to associate every warning with a lint-level so that the consumer can turn it off.

    No API has been formally proposed for this feature. I informally proposed that we allow proc-macros to create lint-levels in an ad-hoc manner; this differs from what happens internally, where all lint-levels have to be known apriori. In code, such an API might look lIke:

    val.span.warning(lint!(unknown_media_type), "unknown media type");

    The lint! macro might check for uniqueness and generate a (hidden) structure for internal use. Alternatively, the proc-macro author could simply pass in a string: "unknown_media_type".

@sgrif
Copy link
Contributor

sgrif commented Sep 11, 2018

I'd argue that support for associating warnings with lints should be a separate RFC, and shouldn't block moving forward with unsilenceable warnings (with the expectation that anything to associate warnings with lints would need to be an additional API)

@sgrif
Copy link
Contributor

sgrif commented Sep 11, 2018

Similarly, I'm not sure we actually need to address multi-span support before this API can be stabilized. The proposed change there involves changing some methods to be generic, which is considered a minor change under RFC #1105. It could also be done by changing Span itself to be an enum, rather than having a separate MultiSpan type

@SergioBenitez
Copy link
Contributor Author

SergioBenitez commented Sep 11, 2018

@sgrif I suppose the question is: should unsilenceable warnings be allowed at all? I can't think of a reason to remove this control from the end-user. And, if we agree that they shouldn't be allowed, then we should fix the API before stabilizing anything to account for this. I'd really rather not have four warning methods: warning, span_warning, lint_warning, lint_span_warning.

Similarly, I'm not sure we actually need to address multi-span support before this API can be stabilized.

Sure, but the change is so minor, so why not do it now? What's more, as much as I want this to stabilize as soon as possible, I don't think enough experience has been had with the current API to merit its stabilization. I think we should implement these two features, announce them broadly so others can play with them, gather feedback, and then stabilize.

It could also be done by changing Span itself to be an enum, rather than having a separate MultiSpan type.

Right, that works too.

@sgrif
Copy link
Contributor

sgrif commented Sep 11, 2018

I suppose the question is: should unsilenceable warnings be allowed at all? I can't think of a reason to remove this control from the end-user.

I think "having a thing we can ship" is a decent reason, but I also think an API that only supports error/help/note, but not errors is sufficiently useful to ship even without warnings. I'd support doing that if it meant we didn't block this on yet another API -- Mostly I just want to avoid having perfect be the enemy of good here.

Sure, but the change is so minor, so why not do it now?

Because we have a perfectly workable API that's being used in the wild right now that we could focus on stabilizing instead. Typically we always trend towards the more conservative option on this sort of thing, shipping an MVP that's forward compatible with extensions we might want in the future.

I don't think enough experience has been had with the current API to merit its stabilization.

So what needs to happen for that? Should we do a public call for testing? Definitely adding more docs is huge. I suppose it'd be good to see what serde looks like using this API as well.

@SergioBenitez
Copy link
Contributor Author

So what needs to happen for that? Should we do a public call for testing? Definitely adding more docs is huge. I suppose it'd be good to see what serde looks like using this API as well.

Yeah, that's exactly what I was thinking.

Because we have a perfectly workable API that's being used in the wild right now that we could focus on stabilizing instead. Typically we always trend towards the more conservative option on this sort of thing, shipping an MVP that's forward compatible with extensions we might want in the future.

I don't think this is an eccentric proposal in any way. When folks play with this, they should have this feature. In any case, I'll be implementing this soon, unless someone beats me to it, as Rocket needs it.

@sgrif
Copy link
Contributor

sgrif commented Sep 11, 2018 via email

@zackmdavis
Copy link
Member

Maybe it's too late (I'm lacking context here), but is there any hope of unifying proc-macro diagnostics with those emitted by the compiler itself? It seems sad and unmotivated to have two parallel implementations of diagnostics. (Rustc's diagnostics also have a suggestions API (albeit still somewhat in flux) that harbors a lot of promise given the new cargo fix subcommand that it would be nice for the Rocket/Diesel/&c. proc-macro world to also benefit from.)

@Havvy Havvy added A-diagnostics Area: Messages for errors, warnings, and lints A-macros-1.2 Area: Declarative macros 1.2 C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. labels Sep 12, 2018
@SergioBenitez
Copy link
Contributor Author

SergioBenitez commented Sep 12, 2018

@zackmdavis The API being exposed by the diagnostics API in proc_macro today is a refinement of the internal API; they're already quite unified, with minor differences to account for the context in which they are used. The implementation is a thin shell over the internal implementation.

In general, rustcs evolving needs and the proc_macro diagnostics API aim for stability prohibit the two from being identical. This is a good thing, however: rustc can experiment with unstable APIs as much as it wants without being concerned about stability while proc_macro authors can have a stable, concrete API to build with. Eventually, features from the former can makes their way into the latter.

@lambda-fairy
Copy link
Contributor

Maud also uses the diagnostic API. It would benefit from both features described in the summary:

  1. Multi-span support – Currently, the duplicate attribute check emits two separate diagnostics for each error. It would be cleaner to emit a single diagnostic instead.

  2. Lint-associated warnings – We want to warn on non-standard HTML elements and attributes. But we also want to let the user silence this warning, for forward compatibility with future additions to HTML.

@macpp
Copy link

macpp commented Sep 21, 2018

Is there any way to emit warning for arbitrary file? It could be usefull for macros that read additional data from external files (like derive(Template) in https://github.com/djc/askama ) .

If it's not possible, how problematic it is to add to Diagnostics something equivalent to :
fn new_raw<T: Into<String>>(start: LineColumn, end: LineColumn, file: &path::Path, level: Level, message: T) -> Diagnostic ?

@alexcrichton alexcrichton added B-unstable Blocker: Implemented in the nightly compiler and unstable. T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Oct 1, 2018
@dhardy
Copy link
Contributor

dhardy commented Oct 31, 2018

Something I find confusing about the current nightly API: does Diagnostic::emit() return? (It appears to do so sometimes but not others; even for errors.)

Currently I must use unreachable!() after cases where I think emit should not return... and sometimes this results in internal error: entered unreachable code while at other times it does not (I can't spot a functional difference between the two cases, except for different spans being used).

In my opinion:

  • either procedural macro functions should be revised to return a Result (probably difficult to do now)
  • or there should be a documented, reliable way of aborting (i.e. fn() -> !) with an error code (via panic!, emit() or whatever; currently panic! is reliable but does not give a nice error message)

@SergioBenitez
Copy link
Contributor Author

@dhardy Diagnostic::emit() should always return.

@dhardy
Copy link
Contributor

dhardy commented Nov 1, 2018

Okay, fair enough; not sure why I some panic messages sometimes and not others.

Then we could do with something that doesn't return; maybe Diagnostic::abort()?

I guess syn::parse::Error::to_compile_error and std::compile_error are what I was looking for.

@lambda-fairy
Copy link
Contributor

Personally I'd prefer not exposing a Diagnostic::abort() method, because it encourages macro authors to bail out at the first error instead of collecting as many errors as possible.

The compiler will not proceed with type checking if your macro emits at least one error, so you can get away with returning nonsense (like TokenStream::empty()) in that case.

@SergioBenitez
Copy link
Contributor Author

SergioBenitez commented Nov 4, 2018

@macpp It's not possible today. I agree that something like this should exist, but much thought needs to be given to the API that's exposed. The API you propose, for instance, puts the onus of tracking line and column information on the user, and makes it possible to produce a Diagnostic where Rust would make it impossible to do so due to Span restrictions.

The API that I've considered is having a mechanism by which a TokenStream can be retrieved given a file name:

impl TokenStream {
    fn from_source_file<P: AsRef<Path>>(path: P) -> Result<TokenStream>;
}

This would make it possible to get arbitrary (but "well-formed") spans anywhere in the file. As a bonus, it means you can reuse libraries that work with TokenStream already, and implementing this function is particularly easy given that Rust uses something like it internally already. The downside is that you can only use this when the source contains valid Rust tokens, which in practice means that delimiters must be balanced, which seems like a sane restriction.

@macpp
Copy link

macpp commented Nov 4, 2018

Well, of course if you deal with .rs files, then TokenStream::from_source_file is much better solution. However, i want to notice that this works only with valid rust files. To clarify, what i wanted to propose is api that allows me to emit warning for any kind of file - for example if my procedural macro reads some configuration from config.toml, then i want to be able to do something better than

panic("bad configuration in config.toml file: line X column Y")

Unfortunately, i forgot about Span restrictions, so api that i proposed earlier is simply wrong for this use case :/

@SergioBenitez
Copy link
Contributor Author

SergioBenitez commented Nov 4, 2018

@macpp No, that would work for almost any kind of text file. A TokenStream is only beholden to matching delimiters; it need not necessarily be Rust. If it can be used as input to a macro, it would be parseable in this way. It would work just fine for Askama, for instance.

@Arnavion
Copy link

Arnavion commented Nov 4, 2018

The downside is that you can only use this when the source contains valid Rust tokens, which in practice means that delimiters must be balanced, which seems like a sane restriction.

And that strings inside single quotes must contain a single codepoint, which eliminates a lot more things (including TOML files) that have arbitrary strings in single quotes.

Even the "delimiters must be balanced" requires that the file use the same idea of where a delimiter is and isn't valid. For example the source file must agree that a ") sequence does not end a previous ( because the Rust parser ) will treat it as part of a string.

In short, I don't have a problem with having a TokenStream::from_source_file, but I would not push this as a solution to the "diagnostics for arbitrary non-Rust source files" problem.

@jhpratt
Copy link
Member

jhpratt commented Mar 29, 2023

Not impossible, but it'd require you to keep the details for the diagnostic around yourself, rather than a type of the proc_macro crate providing that functionality for you.

I think it's important that the proc_macro crate remains simple and minimal. That doesn't mean we should make it inconvenient on purpose, but it does mean we don't need to provide layers of extra convenience on top of the basic functionality. (Just like how we provide TokenStream, but not a full parser with convenient structs for all grammar elements, which is left to ecosystem crates such as syn.)

Forcing the end user to either add a dependency or reinvent the wheel for something that can be trivially done (and is already implemented for the internal diagnostics) seems needlessly restrictive.

The default could be to eagerly emit, as that is likely what a majority of uses will be. However, having the ability to emit a diagnostic lazily on demand is something that more advanced uses cases will unquestionably want. I'll concede that I think proc_macro should have far more native tooling than it does (acknowledging that there would be tons of bikeshedding), and my belief of .emit() or similar is in line with that.

Even if there were a .lazy_error() method and similar, that would be more than sufficient for what I'd like to see.

More generally, if the diagnostic isn't being emitted on drop, how would the API even work? It would necessarily emit the diagnostic on .error(), but then you'd be adding labels, notes, etc. on a diagnostic that is already emitted, no?

@SergioBenitez
Copy link
Contributor Author

SergioBenitez commented Mar 29, 2023

There are some other differences.

Ah, I see now that the differences are rather significant, in fact. The methods for DiagnosticHandle in your comment don't have any receiver, so it wasn't clear to me what was going on (it's still not quite clear), but as far as I can understand, the proposal is that you have an opaque DiagnosticHandle to some part of a diagnostic which you can then add more parts to, and every time you do, you get another handle to that part.

impl DiagnosticHandle {
    pub fn label(span: impl Spans, message: impl Message) -> DiagnosticHandle;
    pub fn note(message: impl Message) -> DiagnosticHandle;
    pub fn help(message: impl Message) -> DiagnosticHandle;
    // pub fn suggestion(…) ?
    // pub fn lint(…) ?
    // pub fn primary_span(…) ?
    // … ?
}

Is the handle Clone? Or Copy? Or do all of those methods take &self?

For example, with the API you proposed, you can't add labels to the main diagnostic after adding a subdiagnostic. (It'd be added to the last subdiagnostic. With the 'handle' style api, you can keep a handle to the main diagnostic around to add a label to it later.)

That's somewhat true. If all we have are the by-move builder methods, this would be true, but I did suggest that we have &mut methods, which would enable this. I don't think I've ever needed to do this, though. But I don't think my API makes it impossible, but less convenient.

I think it's important that the proc_macro crate remains simple and minimal. That doesn't mean we should make it inconvenient on purpose, but it does mean we don't need to provide layers of extra convenience on top of the basic functionality. (Just like how we provide TokenStream, but not a full parser with convenient structs for all grammar elements, which is left to ecosystem crates such as syn.)

To be fair, this is because there is no mechanism for Rust to expose an AST in a compatibility-safe manner. I don't think the existence of syn in Rust is a good thing. I think we'd all love to avoid the detriments that come with needing a crate like syn, including compile times.

Not impossible, but it'd require you to keep the details for the diagnostic around yourself, rather than a type of the proc_macro crate providing that functionality for you.

But this is really inconvenient. You're effectively asking users to duplicate the entire logical Diagnostic type to get this functionality in exchange for lazily emitting diagnostics, which I still find quite problematic.

In the same way, I think that note(span, "abc"); for basic notes/warnings/errors is much prerrable over using Diagnostic::note("abc").mark(thing).emit();, even if there are some use cases where it might be useful to delay the .emit() call. I think it's important to keep simple things simple, and that we should be careful not to optimize only for the complex use cases.

The fair comparison would be note(span, "abc") vs. thing.note("abc").emit(), but even that isn't fair as the latter has clear semantics about when the note is emitted whereas the former does not. It's still not clear to me when I should expect the diagnostic to be emitted, or if there are cases in which it won't be emitted.

@SergioBenitez
Copy link
Contributor Author

SergioBenitez commented Mar 29, 2023

rustc's Diagnostic already accepts a MultiSpan in that place: the primary span is effectively Vec<Span> (as can be seen when you have too many formatting arguments in a format! formatting string.

Ah that's true. In fact, I implemented it in proc_macro. 😅

I am not in favor of making the constructors for diagnostics reside either on Span or AST nodes: it should either be a function or macro living in the proc_macro namespace.

I think that's fine, although I'd love to hear why you feel this way.

We discussed two alternatives: "registration order" (order of construction) or "drop order" (and just rely on an impl Drop for Diagnostic, but that approach could not trigger if a panic occurs).

Got it. Would diagnostics be emitted if a panic occurs? If so, this would mean that an in-progress diagnostic will be emitted even if it's not complete as this API provides no means by which rustc would know if a diagnostic is complete or not.

In short, I don't see what advantage lazily emitting diagnostics provides. It's really unclear to me when you should expect a diagnostic to be emitted, and if there are cases in which it won't be emitted.

@m-ou-se
Copy link
Member

m-ou-se commented Mar 29, 2023

Is the handle Clone? Or Copy?

They are Copy. (This is mentioned in my comment: "impl Copy for DiagnosticHandle".)

The methods for DiagnosticHandle in your comment don't have any receiver

Oh oops. Fixed.

It would necessarily emit the diagnostic on .error(), but then you'd be adding labels, notes, etc. on a diagnostic that is already emitted, no?

That's already explained in my comment: the diagnostics are emitted when the proc macro finishes executing. So, we can still add more information to diagnostics as long as the proc macro runs.

@m-ou-se
Copy link
Member

m-ou-se commented Mar 29, 2023

How often does one need lazy diagnostic functionality? Aren't most use cases easily restructured to work with eager diagnostics?

For lazy diagnostics, we need to decide between &mut self or self receivers. Taking self by (mut) reference makes it easy to miss .emit(), and taking self by value makes it annoying to change a mutable diagnostic variable/argument. (Which is why the rustc api chose for &mut.) And having methods for both gets messy. The copyable handle style api fixes/side steps these issues.

The main reason for keeping the stable surface of the proc_macro crate as minimal as possible is that what's stable will be stable forever. We don't really get to do a major version bump (unlike an ecosystem crate). (That's why I don't feel comfortable committing to having tokentree.error() or span.error() etc. What's convenient now might just get in the way in the future.)

@m-ou-se
Copy link
Member

m-ou-se commented Mar 29, 2023

Would diagnostics be emitted if a panic occurs? If so, this would mean that an in-progress diagnostic will be emitted even if it's not complete as this API provides no means by which rustc would know if a diagnostic is complete or not.

We can either not show any diagnostics if the proc macro panics, or just show everything until the panic. I prefer the latter. I don't think having an "incomplete diagnostic" is surprising or a problem, as it would work just the same as println!():

println!("a");
panic!();
println!("b");
// "a" shown, followed by a panic message.
// "b" not shown.
let diag = error(span, "a");
panic!();
diag.note("b");
// error "a" shown, followed by a panic message.
// note "b" not shown.

An error without a note or label is still useful. And it's directly followed by the "proc macro panicked" message that explains it went 💥.

@Qix-
Copy link

Qix- commented Mar 29, 2023

Subjectively and personally, to me it doesn't really matter what gets emitted when it panics. I won't trust any of the output up until that point anyway given my experience with compilers. I'd rather see the crash fixed and getting complete output again before I trust whatever I see up until that point.

@bjorn3
Copy link
Member

bjorn3 commented Mar 29, 2023

This is about when the proc macro panics, not when the entire compiler panics. Rustc catches proc macro panics and presents them as regular errors. Almost every proc macro currently uses panics to report errors as the diagnostics api isn't stable. It will take a while for the entire ecosystem to move to the diagnostics api, until which it is expected that errors will be reported as a mix of the diagnostics api and panics. As such I think it is important for panics to still cause diagnostics to be emitted.

@m-ou-se
Copy link
Member

m-ou-se commented Mar 29, 2023

In my view every diagnostic is either unsilenceable (error) or silenceable (warning), which implies that notes/helps should not be considered diagnostics in their own right by the API.

I forgot to mention that @estebank and I also discussed this. While I mostly agreed with the quote above, @estebank convinced me there are enough use cases for non-lint warnings/notes that we probably shouldn't block the warning and note diagnostic API on having a lint API. (For example, proc macros specifically used for their warnings, like a #[warn_when_using_casts] proc macro attribute. Or a proc macro that loads or writes external data/files, producing a note when something has been updated. Or simply using note diagnostics as a substitute for "println debugging". Or a proc macro with its own language/DSL with its own syntax for silencing warnings.)

We can always add a proc_macro::lint(…) function in the future to support opt-in/out lints that work with the #[allow()]/#[deny()]/etc. system, but that will have to come together with something (a macro?) to allow a proc macro crate to define lint names and their default reporting levels.

@programmerjake
Copy link
Member

Almost every proc macro currently uses panics to report errors as the diagnostics api isn't stable.

that's becoming less true since a decent amount of proc macros now use syn::Result which translates to generating compile_error! rather than panicking.

@pksunkara
Copy link
Contributor

But, they all still have to build a nice message with notes and help. Having diagnostics being stable lets all the proc macros have the same UI as rust compiler errors.

@akarahdev
Copy link

akarahdev commented Jun 2, 2023

I know the discussion was earlier, but I do agree we shouldn't block this from going anywhere due to the lint API. I think we should start off with something basic like what Sergio suggested with a simple DiagnosticHandle.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-diagnostics Area: Messages for errors, warnings, and lints A-macros-1.2 Area: Declarative macros 1.2 B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests