Unify attributes and macros to use @ sigil, redux #386

Closed
wants to merge 2 commits into from

nikomatsakis
Contributor

A revised version of @pcwalton's RFC, incorporating a more precise description of conflicts and a description of feedback.

Rendered view.

@nikomatsakis
Contributor Author

cc @pcwalton @aturon


**Token trees as the interface to rule them all.** Currently,
decorators like `deriving` are implemented as a transform from one AST
node to some number of AST nodes. Basically they take the AST node for
Member

Decorators are actually a bit more complex than that now. They're split into two different types of extensions: ItemDecorators and ItemModifiers. ItemDecorators are given a reference to an Item as input and produce some number of new items from it. deriving is an ItemDecorator: it takes in the tagged Item and produces impl items based on it. ItemModifiers take in an item and return a new item that will be used in the input's place.

This split was made to enforce some sanity on the operation of decorators in a world where third party decorators can be written and used together. If one isn't careful, you can end up in sad situations like this. Imagine we have a decorator #[munge] that alters names of fields (who knows why). Now say we want to use it along with another decorator like deriving:

#[deriving(PartialEq)]
#[munge]
struct Foo {
    f: int
}

Assuming that decorators evaluate top-down, we'll run into some problems. deriving will be evaluated first, and produce an impl like

impl PartialEq for Foo {
    fn eq(&self, other: &Foo) -> bool {
        self.f == other.f
    }
}

But then #[munge] is evaluated and renames f! If we order the attribute the other way, we might be safe, if munge is careful to preserve the attributes when it makes the new item. If it doesn't, we'll silently lose the PartialEq implementation. It gets even crazier if munge wants to add new attributes that may themselves be decorators!

In the new system, the compiler evaluates all ItemModifiers first, chaining the result through all of them. It can then run ItemDecorators afterward, guaranteeing that they'll see the item that's actually going to be output.
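
The ordering described above can be sketched roughly as follows. This is only an illustration in present-day Rust, not the compiler's actual extension API: `Item`, `Modifier`, `Decorator`, `expand`, `munge` and `derive_partial_eq` are all hypothetical names, and an item is reduced to a struct name plus its field names.

```rust
// Hypothetical, heavily simplified model: an "item" is a struct name plus field names.
#[derive(Clone, Debug, PartialEq)]
struct Item {
    name: String,
    fields: Vec<String>,
}

/// Stand-in for an ItemModifier: consumes an item and returns its replacement.
type Modifier = fn(Item) -> Item;

/// Stand-in for an ItemDecorator: reads an item and produces extra generated
/// items (modeled here as strings of source text).
type Decorator = fn(&Item) -> Vec<String>;

/// Run every modifier first, chaining the result through all of them, then
/// run the decorators against the final item only.
fn expand(item: Item, modifiers: &[Modifier], decorators: &[Decorator]) -> (Item, Vec<String>) {
    let final_item = modifiers.iter().fold(item, |acc, m| m(acc));
    let generated: Vec<String> = decorators.iter().flat_map(|d| d(&final_item)).collect();
    (final_item, generated)
}

/// Hypothetical `munge`: renames every field by appending an underscore.
fn munge(mut item: Item) -> Item {
    for field in item.fields.iter_mut() {
        field.push('_');
    }
    item
}

/// Hypothetical `deriving(PartialEq)`: emits an impl stub that compares each field.
fn derive_partial_eq(item: &Item) -> Vec<String> {
    let body = item
        .fields
        .iter()
        .map(|f| format!("self.{0} == other.{0}", f))
        .collect::<Vec<_>>()
        .join(" && ");
    vec![format!("impl PartialEq for {} {{ /* {} */ }}", item.name, body)]
}
```

Because the decorator only ever sees the output of the modifier chain, the generated impl refers to the renamed field regardless of the order in which the two attributes were written on the item.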

Member

It seems to me though that even with that system, two different attributes can still interact strangely depending on order.

For example, one item modifier might add a field, and another might change a property of all fields (e.g. make them public).

So in any case the order of attributes is important for both the user and the attribute impl to consider, which makes the internal distinction between them helpful, but not the full solution.

I think providing facilities to properly propagate all possible existing attributes down the expansion chain, and making it easy (or the default) to apply them, would be the better option here, in which case merging the two cases would not be an issue.

@Valloric

This is a good idea. People will bikeshed over the @ sigil, but the following facts will remain:

  1. Other popular languages (Python, Java etc) have set a precedent for @ being used for "meta things". A good argument would have to be made for Rust to be inconsistent here.
  2. As the RFC mentions, ! is valuable ASCII "real estate", it reads like "pay attention here" and thus should be used for error handling syntax sugar (or similar) and not for macros.

position.

`@foo{...}` and `@foo[...]` are *always* interpreted as
macro invocations.
Contributor

But are they interpreted as statement or expression macros (#364)? I would suggest that () and [] macros be always interpreted as expressions, and {} be interpreted as statements when in statement position (which are the rules used in #378).
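
The rule suggested in this comment can be written down as a small decision function. This is a sketch of the commenter's suggestion only, not of what the RFC specifies; the `Delim`, `Position`, `MacroKind` and `classify` names are made up for illustration.

```rust
/// Delimiter used at the macro invocation site.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Delim {
    Paren,   // @foo(...)
    Bracket, // @foo[...]
    Brace,   // @foo{...}
}

/// Where the invocation appears in the surrounding code.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Position {
    Statement,
    Expression,
}

/// How the invocation is interpreted.
#[derive(Clone, Copy, Debug, PartialEq)]
enum MacroKind {
    Statement,
    Expression,
}

/// Suggested rule: `()` and `[]` invocations are always expressions;
/// `{}` invocations are statements only when they sit in statement position.
fn classify(delim: Delim, position: Position) -> MacroKind {
    match (delim, position) {
        (Delim::Brace, Position::Statement) => MacroKind::Statement,
        _ => MacroKind::Expression,
    }
}
```

Under this rule only one of the six delimiter/position combinations yields a statement macro, which is what keeps the interpretation predictable from the call site alone.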

@reem

reem commented Oct 11, 2014

Unresolved Questions (might as well go here as a discussion of updating macros/syntax extensions):

How does this interact with syntax extensions like the IdentTT variant &c.?

Could syntax extensions be further generalized to allow the equivalent of StrTT (called like macro! "some string" { .. }) and others? Is this desirable?

@jfager

jfager commented Oct 11, 2014

The extended example, w/o the cheat codes:

@frobnicate(...)
@defenestrate(...)
@transmogrify(....); 

@frobnicate(....)
@transmogrify { ... } 

fn example_fn() {
    @frobnicate(...)
    let x = @transmogrify(...);

    @frobnicate(...)
    if some_cond == @transmogrify(...) {
        @!ossify(...)

        @frobnicate(...)
        @defenestrate(...)
        fn foo() { ... }

        let x = (@!frobnicate(...) foo * bar) + zed;

        @transmogrify(...);
        @obdurate(...);
        @transmogrify(...)
    } else {
        @!frobnicate(...)
    }
}

Adding a whole set of rules for resolving completely voluntarily created ambiguities seems less than ideal. The only benefit to the shared sigil that I'm reading is that it reflects current and planned similarities between the two features. But if the two will continue to need to be understood independently, and will continue to have different use cases and require disambiguation, is that really a benefit?

@Ericson2314
Contributor

This sounds great, glad you guys are sticking to the long term plans to unify them.

@vadimcn
Contributor

vadimcn commented Oct 11, 2014

  1. As mentioned in the RFC, @ stands out too much in most fonts and thus emphasizes macros much more than ! used to. Can we use # instead? IMO, #println(...) looks way less noisy than @println(...). println#(...) would be even better.
  2. I'd prefer to keep the #[...] syntax for attributes. Should be easier to disambiguate during parsing, and looks better. As was said by many before me, attributes and macros serve different purposes, so it's ok for them to look different.
  3. In fact, since Rust has aspirations to become a C++ replacement for high-performance code, let's reserve @ for a future matrix multiplication operator. For motivations see here :)

@CloudiDust
Contributor

I still think @foo() for macros and @foo(): & @!foo() for attributes are better than "completely unify and add many disambiguation rules". Also the attribute syntax will be lighter-weight than @[foo].

Just because two things are similar doesn't mean we should use the same notation for them, as long as they are only similar, not the same, but we can use similar notations.

@CloudiDust
Contributor

Eventually, @foo(some_token_tree): decorated will just be the alternative syntax for @foo(decorated, some_token_tree) if macros and attributes/decorators are unified under the hood.

@CloudiDust
Contributor

@vadimcn, # used to be Rust's macro sigil and got replaced because many didn't like it. (This happened before I got to know Rust.)

Personally I think # fits attributes/decorators but not macros, as # often means a tag that is attached to something, and attributes are somewhat "tags" while macros are not. If I am to choose between @ or # for macros I'd go with @. But yes this is subjective.

@CloudiDust
Contributor

Also, currently, the difference between inner attributes and outer attributes is smaller than that between outer attributes and macros. So why should outer attributes and macros have the same notation when the "gap" is larger?

In the future, the only difference between the three would be where they apply. And this difference is simply not going away.

So, we can unify the three's syntax with the leading @ to signify the similarities between them, but should also have the "where" part clearly indicated in the syntax.

I believe that, for this job, : is a better choice than [] or "nothing".

@CloudiDust
Contributor

OTOH, I believe making macro calls more consistent with function calls regarding ; is a good idea. I just don't think ; should be used for disambiguation purposes.

@Kimundi
Member

Kimundi commented Oct 12, 2014

Where does this leave the IDENT! IDENT ( TT ) / @IDENT IDENT ( TT ) form, as used by macro_rules?

In general I think I can live with @ as common macro sigil, but I'm also worried that the ambiguity rules might be too confusing in practice, and I wonder whether the "Add a : to differentiate" proposal might be worthwhile. (But then again, in practice you wouldn't use so many macros and attributes.)

Also, an interesting idea: If we had type macros, you'd be able to somewhat recover the old sigil prefix types with the macro syntax: type Foo = @Rc X; could be an attribute that expands to type Foo = Rc<X>. Assuming we make types and macros share the same namespace, we could even add that as a general syntax sugar for types, instead of custom one-off attributes.

@nrc
Member

nrc commented Oct 12, 2014

Strongly +1 to all of this.

Slightly tangential, but relevant to the future plans section: I would like to split the AST that libsyntax generates from the AST which librustc uses. I believe a clean separation here would be beneficial in terms of cleanly engineering the compiler. With this change, the libsyntax AST should match the source language very closely, should only change as the language changes, and shouldn't contain any implementation details which rustc relies on. We could then offer this as part of the API for syntax extensions and the compiler itself. Other parser libraries would also produce this, and thus be able to cleanly interact with the compiler. I don't think this changes anything laid out in the RFC, however, I believe it is still better to pass token trees to syntax extensions and provide the facility for parsing them.

However, this makes me wonder about how this works in practice - for syntax extensions to be composable, they must produce token trees as output, which means we need to offer a way to convert from an AST back to a token tree. Do we have that at the moment? Will there be difficulties? (I'm thinking of spans, in particular, but perhaps other stuff too).

@nrc
Member

nrc commented Oct 12, 2014

Another idea I was toying with (which is closely related to the proposal in this RFC) is to expand the information provided by token tree to include items. Currently token trees identify tokens and clauses identified by matching brackets. We could extend this to also match the keywords and the following scope - something like keyword tokens [{ tokens }|;]. Whether a macro is attribute-like or call-like could then be derived from the token tree and the macro definition would be able to make use of the extra info. I am not sure if this makes things simpler or more complex...

@liigo
Contributor

liigo commented Oct 12, 2014

+1
On Oct 12, 2014, at 1:04 PM, "Richard Zhang" [email protected] wrote:

I still think @foo() for macros and @foo(): & @!foo() for attributes are
better than "completely unify and add many disambiguation rules". Also the
attribute syntax will be lighter-weight than @[foo].

Just because two things are similar doesn't mean we should use the
same notation for them, as long as they are only similar, not the
same, but we can use similar notations.



@gifnksm

gifnksm commented Oct 13, 2014

I like @CloudiDust's idea that @foo(some_token_tree): decorated is sugar for @foo(decorated, some_token_tree).

In a similar way, decorated @!foo(some_token_tree) and mod decorated { @!foo(some_token_tree) ... } can also be desugared to @foo(decorated, some_token_tree) and @foo(mod decorated {...}, some_token_tree).
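
As a rough illustration of that desugaring, here is a sketch that treats every token tree as opaque source text. The `MacroCall` type and the `desugar` and `render` functions are hypothetical; nothing like them is proposed in the RFC itself.

```rust
/// A macro-style call `@name(arg, arg, ...)`, with each argument kept as
/// opaque source text rather than a real token tree.
#[derive(Debug, PartialEq)]
struct MacroCall {
    name: String,
    args: Vec<String>,
}

/// Desugar an annotation into a plain call whose first argument is the thing
/// being decorated. The same function covers both surface forms: the outer
/// `@name(attr_args): decorated` and the inner `@!name(attr_args)` written
/// inside `decorated`.
fn desugar(name: &str, attr_args: &str, decorated: &str) -> MacroCall {
    MacroCall {
        name: name.to_string(),
        args: vec![decorated.to_string(), attr_args.to_string()],
    }
}

/// Print the call back in the unified `@name(...)` notation.
fn render(call: &MacroCall) -> String {
    format!("@{}({})", call.name, call.args.join(", "))
}
```

For example, `render(&desugar("foo", "some_token_tree", "mod decorated {...}"))` produces the text `@foo(mod decorated {...}, some_token_tree)`, matching the second form in the comment.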

@reem

reem commented Oct 13, 2014

@nick29581 I have been messing around with syntax extensions a bit in the past week and I think that your suggestion of splitting librustc's AST and libsyntax's AST would be enormously helpful when it comes to cleaning up the syntax extension API.

@sfackler
Member

One benefit of having syntax extensions process TTs is that it'll give a simple way forward to allow decorators to work on non-Items. This is now the only blocker preventing #[cfg] from being implemented as a decorator.

Here are some alternatives that were considered:

1. Use `@foo` for attributes, keep `foo!` for macros (status quo-ish).
2. Use `@[foo]` for attributes and `@foo` for macros (a compromise).
Member

Will "@[foo] for attributes and @foo for macros" remove all the ambiguity issues? If so I much prefer this. It unifies the two with a common, prettier sigil, indicating that they both do code generation, while still showing that they have slightly different semantics. { @!foo } seems like a bit of an ugly workaround that would be hard to format nicely, and does not solve the problem of annotating conditionals. I suspect we would have to explain the reasoning over and over again to new folks years into the future... "You can't do this because it is ambiguous..."...

Member

I think @foo: rather than @[foo] would also resolve the ambiguity

These solutions would allow for helpful syntax highlighting. I like that.

@SiegeLord

Strong -1 to this proposal just like the old one. First, I'll have to re-iterate the points I made on the last RFC. ! is used for 4 things in Rust right now: boolean NOT, bitwise NOT, inner attribute/doc-comment syntax and macros. Only 1 out of those 4 things can be considered 'dangerous', and I somewhat disagree that macros are dangerous in any sense of that word. Pretending that ! has a uniform connotation is contradicted by its current and future usage (as this RFC does not alter the other 3 usages). The usage should be consistent within Rust first and foremost, and not between other historical languages.

! for error handling is an unproven idea, and it makes no sense to change the current syntax in a negative way (as ! looks subjectively better than @) for a hypothetical usage. Either way, it is a false dichotomy that keeping ! in macros will preclude it from being used as an error handling operator. The only ambiguity is when you're applying ! to an identifier, which can be avoided by wrapping the identifier in parentheses/a block. If the meaning of ! is '.unwrap()' then the vast majority of its use cases will be right after a function call, i.e. let a = b.foo()! which requires no disambiguation. The alternative of using ! (and ?) in identifiers is a terrible idea for this purpose, as it relies on a naming convention to accurately represent a pretty significant change in semantics.

In terms of subjectivity of ! looking good, the syntax-highlighting argument is completely inappropriate. Rust should not need to rely on tools to look good: Rust is not APL.

Lastly, I find the notion of 'valuable ASCII real estate' a bit schizophrenic. Rust has been moving away from using valuable ASCII real estate for months, and now you want to reserve some line noise symbols for future usage? Are ASCII symbols for novel semantics a bad thing or a good thing? I don't agree that the meaning of ! is immediately and unambiguously clear from historical usage alone.

Now moving on to the idea of the unification of syntax. If the enormous number of disambiguation rules doesn't tell you it's a terrible idea, I don't know what will. I'll defer to @jfager's comment above. In particular, I want to highlight the issue of the false syntax similarity given the differences (e.g. conflicting names of attributes and macros... why have them conflict if the syntax is the same?).

Lastly, I think the stabilization on token trees as the output format is a bad idea. While today, it is usually simpler to quasiquote token trees to produce the AST, this is only because the AST builder is hard to use directly (being poorly documented). I really find having to insert commas, braces and parentheses programmatically to be very inconvenient, and against the spirit of the AST macros. Frankly, at this point it becomes more attractive to standardize on a string output format, as it is easier to deal with. The typed token-constructing API provides virtually no benefit over it, in my opinion. I think chained macros are a niche use case that can co-exist with AST-outputting macro types, but are not common enough to be treated as if they were the sole use case.

@netvl

netvl commented Oct 13, 2014

I wholly agree with @SiegeLord. I would also like to add that unification of attributes and macros does not make sense with some of attributes being not even remotely macro-like, for example, #[inline] or #![feature] or #[test].

@pcwalton
Contributor

I don't think string output formats for macros will work for hygiene.

@nielsle

nielsle commented Oct 14, 2014

By the way the sigil syntax could potentially create a lot of spam references on github. The following four people will be spammed

https://github.com/try
https://github.com/cfg
https://github.com/println
https://github.com/fmt

@nikomatsakis
Contributor Author

Hmm, it occurs to me (which should have been obvious) that using (@!foo...) for expression (resp. block) annotations does not in fact remove the ambiguity. After all, (@!foo (bar)) (resp. {@!foo (bar)}) could be parsed in one of two ways.

@CloudiDust
Contributor

@netvl, in a sense, @ can mean "meta and magical stuff".

I am fine with the current syntax, but if we are to free up ! and #, then @foo/@foo:/@!foo is also fine in my eyes. (I would not agree with this change if it does not free up sigils. However, I am also not too fond of the "! as .unwrap()" idea.)

@nielsle, eh, I think we should use @try/@cfg/@println/@fmt anyway. :)

@CloudiDust
Contributor

@SiegeLord, while not freeing up ! may not stop it from being used as .unwrap, having foo! and (foo)! do different things is not pretty. So if ! is to be used as .unwrap, it should no longer be the macro sigil.

Also I believe the "suffix ! means caution" connotation comes from Scheme (at least for me), where ! conventionally means "side-effecting operations". When I first saw Rust's println!, I didn't realize ! meant "macro" here.

I believe "which sigil looks better" is subjective. And whether we should do this change depends on whether it is worthwhile to free up two sigils.

@glaebhoerl
Contributor

My thoughts briefly:

I think either @attr or #attr is a good syntax for attributes. (We should lose the brackets in either case!) @attr has precedent from other languages, while #attr has a useful intuition in terms of hashtags, which represent metadata, as attributes also do. However, because @ is a more visually appealing symbol, it has potential applications for many possible features, not just attributes, while # is not a reasonable choice for much else besides attributes. If we use # for attributes, we could later go on to use @ as syntax for some other valuable feature; while if we use @ for attributes, it is highly unlikely that we will ever find another use for #. To co-opt the language of the RFC, @ is much more valuable real estate, and by using it for attributes, we would be "losing it forever"; but in another sense, we would be "losing # forever" because we would no longer have any profitable use for it.

The goal of unifying the syntax of attribute-style (decorator) and "normal" macros is a very reasonable one, but the RFC seems to completely elide the fact that quite a few important attributes are not macros, such as inline and repr. Currently the syntax for decorator and normal macros is strangely different, but following this proposal, the syntax for macros and "normal" attributes would be strangely the same. In the process of removing an unnecessary distinction between two things, we would be unnecessarily conflating two others. It would be far more sensible to somehow unify the syntax for decorator and normal macros, and to use a different syntax for "real" attributes such as inline and repr. Then similar things would actually look similar and different things would look different.

I still think that println!("foo") is quite likely the nicest possible syntax for macros, and in any case is much nicer than @println(foo). With regards to intuitions, the connotation of ! for me is that it is "actively doing something" in a way that normal function calls aren't: it is being invoked at compile time and expanded into arbitrary syntax. At this stage, normal function calls are inert. If we care a lot about precedent, then D also uses ! for template instantiation in much the same way, and their templates have considerable overlap with our macros.

I'm not too fond of the idea that by taking ! away from macros, we would be "losing it forever" (my co-opting of the same language above notwithstanding). By that way of thinking, we have already "lost" &, *, +, -, /, :, and so on "forever". But that seems like a strange way of putting it. We might instead say that we are using them. The question is: are we using them effectively? I think the answer for all of these, including !, is clearly "yes". The reason it is being proposed for removal is to free it up for potential use as an unwrap operator in the future. If we wanted to have such an operator, then I agree that ! would be the best syntax for it. However, I strongly doubt that we should want to have it. In that sense, ! is good syntax for a bad idea.

(If this turns out to be mistaken, which would surprise me very much, then we could still finesse it along the lines of @SiegeLord's idea with foo!() being a macro and foo()! using the operator. So it's not even the case that we'd be "losing it forever". That sort of syntactic special-casing would be kind of gross, but not obviously more so than the syntactic special cases proposed for @attributes and @macros in this RFC.)

So for whatever it's worth, I think we should:

  • Keep ! for macros.
  • Change the syntax of decorator macros to also use ! in some way.
  • Keep # for non-macro attributes, but lose the brackets.

@CloudiDust
Contributor

@glaebhoerl, I think sometimes whether an attribute is "macro-like" is not so clear-cut, it somewhat feels like "implementation detail". If we use different sigils here, we would be leaking such detail. It is possible that a macro-like attribute would become "builtin" to the compiler in later versions, and vice versa, and I think most people don't care as long as it works as expected. But "where it applies" would not change.

I see both macros and attributes as doing "meta and/or magical stuff"; in this sense, their notations can (but don't have to) be unified.

@CloudiDust
Contributor

@nikomatsakis, to solve this problem, I think we can parse inner attributes greedily, and permit @[foo(...)]: and @![foo(...)] as alternative syntax, or rather, attribute arrays: { @![foo(...), bar(...)] baz } is the same as { @!foo(...) @!bar(...) baz }.

@Ericson2314
Contributor

Even repr and inline can be thought of as macros if the macro output AST is more expressive than the surface syntax.

@glaebhoerl
Contributor

Going down that road, just about every language construct can be thought of as a macro that expands to some internal compiler representation...

@Ericson2314
Contributor

Very true, but I kind of like that viewpoint. :D

@ftxqxd
Contributor

ftxqxd commented Oct 17, 2014

@glaebhoerl I agree about not wanting a ! operator. Even if for some reason we did want to have some kind of unwrap() operator (which I hope we will not, because I believe that failure should be extremely rare, preferring Result instead), we could easily steal !! from Kotlin. It certainly shouts ‘dangerous’ and is also slightly harder to type than ! (or even ?, promoting the use of exception propagation instead of outright failure).

To me the idea of unifying macros and attributes’ syntax still doesn’t make much sense. Not only does it not make much sense to consider things like #[inline] as macros, but it also introduces so much ambiguity which, while resolvable, seems like it could get in the way of expressing what you actually want to express. I don’t mind the idea of using @ for attributes, but the idea of using it for macros (quite a different construct) at the same time confuses me.

(Start of bikeshed.) I wouldn’t really mind syntax like #foo for macros and @foo for attributes. # already has ‘meta’ connotations: in C, # is used for preprocessor directives, which basically modify the way the compiler sees the file. In Rust, macros are similar (although not quite the same): they modify the way the compiler sees the AST. @ also has attribute-like connotations: in Python (and I think Java) it is used to decorate functions in quite a similar manner (although it works quite differently, but it’s the same idea and looks quite similar even in context). (End of bikeshed.)

But overall, I think this discussion has always been destined to contain far too much bikeshedding to be useful. I think that the problem here is a very small and possibly opinionated one, and coming up with a solution will always create a lot of debate.

@CloudiDust
Contributor

@P1start, I didn't think of "!!", but for .unwrap(), it is better than ! in my eyes. Yes, it shouts, which is a good thing here.

And my "solution" to the ambiguity with @!foo makes me think that maybe we should not drop the []s after all. (#!foo would have the same problem.)

So I now think we should just leave things as-is.

@CloudiDust
Contributor

Also, it is a backwards compatible change to allow optionally omitting [] when there are no parsing ambiguities.

@arcto

arcto commented Oct 18, 2014

I think that both of these alternatives are good in that they are easy to disambiguate:

@attrib(...):
@macro(...)

or

@attrib(...)
#macro(...)

@Ericson2314
Contributor

I think it would be possible to have @ "gobble up" the same tokens and then separately figure out how to present them to the attribute / macro based solely upon what the ID is bound to? Since the separate step is a completely independent parse, it keeps the original grammar simple and context-free. I'll try to make a formal proposal for this in a few.

@Ericson2314
Contributor

Formal proposal: macro / attribute component of main grammar:

EXPR_ATTR_MACRO = '@' ID {MIDDLE_TT} EXPR_FINAL_TT
EXPR_FINAL_TT   = MIDDLE_TT
                | '{' {TT} '}'

STMT_ATTR_MACRO = '@' ['!'] ID {MIDDLE_TT} STMT_FINAL_TT
STMT_FINAL_TT   = MIDDLE_TT ';'
                | '{' {TT} '}'

MIDDLE_TT       = '(' {TT} ')'
                | '[' {TT} ']'

post-parsing:

  1. Look up the id.
  2. Dispatch:
     • if unbound, error
     • if macro, concat token trees with new root node and pass as single token tree to macro. (Include the brackets/parentheses in the combined token tree.) Note that the semicolon in the statement case is NOT part of this token tree. Error on @!.
     • if attribute: parse the first n-1 token trees as attribute args, and parse the final one as normal Rust. Parsing errors HERE can be combined with errors the attribute itself may raise, if the parsing is successful. The Rust is parsed differently depending on whether the attribute was in statement or expression position.

post-parsing attribute grammar (parsing round 2):

ATTR_BODY  = '(' META_SEQ ')' RUST
META_SEQ   = META_ITEM {',' META_ITEM }
META_ITEM  = ID
           | ID '=' STRING_LITERAL

This makes it really easy to combine macros and attributes down the road: just chuck the post-parsing steps.

Edits: tweaking to remove potential ambiguities, basic idea remains the same.
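
The post-parsing dispatch step could look something like the sketch below. It is an illustration under several assumptions: token trees are modeled as plain strings, the `Binding` and `Dispatch` types and the `dispatch` function are invented names, and the error for an attribute with no trailing token tree is an addition of the sketch (the grammar above always requires a final tree).

```rust
use std::collections::HashMap;

/// What an identifier is bound to (hypothetical; the proposal only
/// distinguishes "macro" from "attribute").
#[derive(Clone, Copy, Debug, PartialEq)]
enum Binding {
    Macro,
    Attribute,
}

/// Result of the post-parsing dispatch step.
#[derive(Debug, PartialEq)]
enum Dispatch {
    /// All token trees, delimiters included, are handed to the macro together.
    MacroInput(Vec<String>),
    /// The first n-1 trees are attribute arguments; the last one is parsed
    /// as ordinary Rust in a second round.
    AttributeInput { args: Vec<String>, body: String },
}

/// `inner` is true when the invocation was written `@!id`.
fn dispatch(
    bindings: &HashMap<String, Binding>,
    id: &str,
    inner: bool,
    trees: Vec<String>,
) -> Result<Dispatch, String> {
    match bindings.get(id) {
        // Step 2, first bullet: unbound names are an error.
        None => Err(format!("unbound identifier `{}`", id)),
        // Second bullet: `@!` is rejected for macros.
        Some(Binding::Macro) if inner => Err(format!("`@!` used with macro `{}`", id)),
        Some(Binding::Macro) => Ok(Dispatch::MacroInput(trees)),
        // Third bullet: split off the final tree as the decorated Rust code.
        Some(Binding::Attribute) => {
            let mut args = trees;
            match args.pop() {
                Some(body) => Ok(Dispatch::AttributeInput { args, body }),
                // Assumption of this sketch, not part of the proposal.
                None => Err(format!("attribute `{}` has nothing to apply to", id)),
            }
        }
    }
}
```

Keeping this step separate from the main grammar is what lets the first parse stay ignorant of whether an identifier names a macro or an attribute.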

@nrc
Member

nrc commented Oct 22, 2014

I remembered why I wanted to make token-trees item-aware - it was so cfg could be implemented (as a macro) without requiring that the cfg'ed code parses, only that it tokenises.

@nikomatsakis
Contributor Author

@nick29581 oh, that'd be really nice. That said, I think I've come to the conclusion that I would prefer a clearer distinction between attributes and macros after all. This ambiguity seems to put me over the top.

@nikomatsakis
Contributor Author

Going to close this RFC for now.

wycats pushed a commit to wycats/rust-rfcs that referenced this pull request Mar 5, 2019
@kennytm kennytm mentioned this pull request Apr 20, 2021