Unify attributes and macros to use @ sigil, redux #386
**Token trees as the interface to rule them all.** Currently,
decorators like `deriving` are implemented as a transform from one AST
node to some number of AST nodes. Basically they take the AST node for
Decorators are actually a bit more complex than that now. They're split into two different types of extensions: `ItemDecorator`s and `ItemModifier`s. `ItemDecorator`s are given a reference to an `Item` as input and produce some number of new items from it. `deriving` is an `ItemDecorator`: it takes in the tagged `Item` and produces `impl` items based on it. `ItemModifier`s take in an item and return a new item that will be used in the input's place.
This split was made to enforce some sanity on the operation of decorators in a world where third party decorators can be written and used together. If one isn't careful, you can end up in sad situations like this. Imagine we have a decorator `#[munge]` that alters names of fields (who knows why). Now say we want to use it along with another decorator like `deriving`:
#[deriving(PartialEq)]
#[munge]
struct Foo {
    f: int
}
Assuming that decorators evaluate top-down, we'll run into some problems. `deriving` will be evaluated first, and produce an impl like
impl PartialEq for Foo {
    fn eq(&self, other: &Foo) -> bool {
        self.f == other.f
    }
}
But then `#[munge]` is evaluated and renames `f`! If we order the attributes the other way, we might be safe, if `munge` is careful to preserve the attributes when it makes the new item. If it doesn't, we'll silently lose the `PartialEq` implementation. It gets even crazier if `munge` wants to add new attributes that may themselves be decorators!
In the new system, the compiler evaluates all `ItemModifier`s first, chaining the result through all of them. It can then run `ItemDecorator`s afterward, guaranteeing that they'll see the item that's actually going to be output.
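The two-phase ordering described above can be sketched with a toy model. This is only an illustration, not the compiler's actual extension API: `Item`, `Extension`, `munge`, `derive_partial_eq` and `expand` are hypothetical stand-ins, and the generated impl is rendered as plain text to keep the example short.

```rust
// Toy stand-in for an AST item: a struct name plus its field names.
#[derive(Clone, Debug, PartialEq)]
struct Item {
    name: String,
    fields: Vec<String>,
}

// Hypothetical model of the two extension kinds from the comment above.
enum Extension {
    // Takes an item and returns the item used in its place.
    Modifier(fn(Item) -> Item),
    // Reads an item and produces extra items (here: rendered impl text).
    Decorator(fn(&Item) -> Vec<String>),
}

// Hypothetical `#[munge]`: renames every field.
fn munge(mut item: Item) -> Item {
    for field in item.fields.iter_mut() {
        field.push_str("_munged");
    }
    item
}

// Hypothetical `#[deriving(PartialEq)]`: emits an impl comparing each field.
fn derive_partial_eq(item: &Item) -> Vec<String> {
    let comparisons: Vec<String> = item
        .fields
        .iter()
        .map(|f| format!("self.{0} == other.{0}", f))
        .collect();
    let body = if comparisons.is_empty() {
        "true".to_string()
    } else {
        comparisons.join(" && ")
    };
    vec![format!(
        "impl PartialEq for {0} {{ fn eq(&self, other: &{0}) -> bool {{ {1} }} }}",
        item.name, body
    )]
}

// Phase 1 chains every modifier in source order; phase 2 runs every
// decorator on the final item, whatever order the attributes were written in.
fn expand(item: Item, extensions: &[Extension]) -> (Item, Vec<String>) {
    let mut current = item;
    for ext in extensions {
        if let Extension::Modifier(modify) = ext {
            current = modify(current);
        }
    }
    let mut generated = Vec::new();
    for ext in extensions {
        if let Extension::Decorator(decorate) = ext {
            generated.extend(decorate(&current));
        }
    }
    (current, generated)
}
```

Calling `expand` with the decorator listed before the modifier, as in the `Foo` example, still yields an impl that refers to the renamed field, because decorators only ever see the output of the modifier chain.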
It seems to me though that even with that system, two different attributes can still interact strangely depending on order.
For example, one item modifier might add a field, and another might change a property of all fields (e.g., make them public).
So in any case the order of attributes is important for both the user and the attribute implementation to consider, which makes the internal distinction between them helpful, but not the full solution.
I think providing facilities to properly propagate all possible existing attributes down the expansion chain, and making applying them easy (or the default), would be the better option here, in which case merging the two cases would not be an issue.
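The order dependence between two modifiers can be shown with a small sketch. The `Field` type and the two modifiers (`add_field`, `make_public`) are hypothetical and only mirror the example in the comment above; they are not real compiler extensions.

```rust
// Toy field: a name and a visibility flag.
#[derive(Clone, Debug, PartialEq)]
struct Field {
    name: String,
    public: bool,
}

// A modifier takes the field list and returns the list used in its place.
type Modifier = fn(Vec<Field>) -> Vec<Field>;

// Hypothetical modifier that adds a new private field.
fn add_field(mut fields: Vec<Field>) -> Vec<Field> {
    fields.push(Field { name: "extra".to_string(), public: false });
    fields
}

// Hypothetical modifier that makes every existing field public.
fn make_public(mut fields: Vec<Field>) -> Vec<Field> {
    for field in fields.iter_mut() {
        field.public = true;
    }
    fields
}

// Chains the modifiers in the order given, feeding each result to the next.
fn apply_all(fields: Vec<Field>, modifiers: &[Modifier]) -> Vec<Field> {
    modifiers.iter().fold(fields, |acc, modify| modify(acc))
}
```

Running `add_field` and then `make_public` leaves every field public, while the reverse order leaves the added field private, so the two modifiers do not commute even though both are applied before any decorator runs.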
This is a good idea. People will bikeshed over the
position.

`@foo{...}` and `@foo[...]` are *always* interpreted as
macro invocations.
Unresolved Questions (might as well go here as a discussion of updating macros/syntax extensions): How does this interact with syntax extensions like the IdentTT variant &c.? Could syntax extensions be further generalized to allow the equivalent of StrTT (called like |
The extended example, w/o the cheat codes:
Adding a whole set of rules for resolving completely voluntarily created ambiguities seems less than ideal. The only benefit to the shared sigil that I'm reading is that it reflects current and planned similarities between the two features. But if the two will continue to need to be understood independently, and will continue to have different use cases and require disambiguation, is that really a benefit? |
This sounds great, glad you guys are sticking to the long term plans to unify them. |
I still think Just because two things are similar doesn't mean we should use the same notation for them, as long as they are only similar, not the same, but we can use similar notations. |
Eventually, |
@vadimcn, Personally I think |
Also, currently, the difference between inner attributes and outer attributes is smaller than that between outer attributes and macros. So why should outer attributes and macros have the same notation when the "gap" is larger? In the future, the only difference between the three would be where they apply. And this difference is simply not going away. So, we can unify the three's syntax with the leading I believe that, for this job,
OTOH, I believe making macro calls more consistent with function calls regarding |
Where does this leave the In general I think I can live with Also, an interesting idea: If we had type macros, you'd be able to somewhat recover the old sigil prefix types with the macro syntax:
Strongly +1 to all of this.

Slightly tangential, but relevant to the future plans section: I would like to split the AST that libsyntax generates from the AST that librustc uses. I believe a clean separation here would be beneficial in terms of cleanly engineering the compiler. With this change, the libsyntax AST should match the source language very closely, should only change as the language changes, and shouldn't contain any implementation details that rustc relies on. We could then offer this as part of the API for syntax extensions and the compiler itself. Other parser libraries would also produce this, and thus be able to cleanly interact with the compiler.

I don't think this changes anything laid out in the RFC. However, I believe it is still better to pass token trees to syntax extensions and provide the facility for parsing them. That makes me wonder how this works in practice: for syntax extensions to be composable, they must produce token trees as output, which means we need to offer a way to convert from an AST back to a token tree. Do we have that at the moment? Will there be difficulties? (I'm thinking of spans, in particular, but perhaps other stuff too.)
Another idea I was toying with (which is closely related to the proposal in this RFC) is to expand the information provided by token tree to include items. Currently token trees identify tokens and clauses identified by matching brackets. We could extend this to also match the keywords and the following scope - something like |
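For readers unfamiliar with the term, here is a minimal sketch of what "tokens and clauses identified by matching brackets" means. It is an assumption-laden toy, not libsyntax's real data structure: the `TokenTree` enum, `closer`, `parse` and `parse_until` are hypothetical names, and tokens are plain strings that have already been split.

```rust
// A token tree is either a single token or a bracketed group of token trees.
#[derive(Debug, PartialEq)]
enum TokenTree {
    Token(String),
    // Opening delimiter plus the trees found before its matching closer.
    Delimited(char, Vec<TokenTree>),
}

// Maps an opening delimiter to its closing counterpart.
fn closer(open: char) -> Option<char> {
    match open {
        '(' => Some(')'),
        '[' => Some(']'),
        '{' => Some('}'),
        _ => None,
    }
}

// Groups a flat token list into trees; fails on unbalanced delimiters.
fn parse(tokens: &[&str]) -> Result<Vec<TokenTree>, String> {
    let mut pos = 0;
    parse_until(tokens, &mut pos, None)
}

// Consumes tokens until the expected closer (or end of input at top level).
fn parse_until(
    tokens: &[&str],
    pos: &mut usize,
    close: Option<char>,
) -> Result<Vec<TokenTree>, String> {
    let mut out = Vec::new();
    while *pos < tokens.len() {
        let tok = tokens[*pos];
        *pos += 1;
        match tok {
            "(" | "[" | "{" => {
                let open = tok.chars().next().unwrap();
                let inner = parse_until(tokens, pos, closer(open))?;
                out.push(TokenTree::Delimited(open, inner));
            }
            ")" | "]" | "}" => {
                let found = tok.chars().next().unwrap();
                return if close == Some(found) {
                    Ok(out)
                } else {
                    Err(format!("unexpected `{}`", found))
                };
            }
            _ => out.push(TokenTree::Token(tok.to_string())),
        }
    }
    match close {
        None => Ok(out),
        Some(expected) => Err(format!("missing `{}`", expected)),
    }
}
```

Note that this grouping needs only balanced delimiters: the contents of a group are never parsed as Rust, which is the property that makes token trees attractive as an extension interface.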
+1
I like @CloudiDust's idea that In the similar way, |
@nick29581 I have been messing around with syntax extensions a bit in the past week and I think that your suggestion of splitting librustc's AST and libsyntax's AST would be enormously helpful when it comes to cleaning up the syntax extension API. |
One benefit of having syntax extensions process TTs is that it'll give a simple way forward to allow decorators to work on non- |
Here are some alternatives that were considered:

1. Use `@foo` for attributes, keep `foo!` for macros (status quo-ish).
2. Use `@[foo]` for attributes and `@foo` for macros (a compromise).
Will "`@[foo]` for attributes and `@foo` for macros" remove all the ambiguity issues? If so, I much prefer this. It unifies the two with a common, prettier sigil, indicating that they both do code generation, while still showing that they have slightly different semantics. `{ @!foo }` seems like a bit of an ugly workaround that would be hard to format nicely, and does not solve the problem of annotating conditionals. I suspect we would have to explain the reasoning over and over again to new folks years into the future... "You can't do this because it is ambiguous..."...
I think `@foo:` rather than `@[foo]` would also resolve the ambiguity.
These solutions would allow for helpful syntax highlighting. I like that.
Strong -1 to this proposal just like the old one. First, I'll have to re-iterate the points I made on the last RFC.
In terms of subjectivity of Lastly, I find the notion of 'valuable ASCII real estate' a bit schizophrenic. Rust has been moving away from using valuable ASCII real estate for months, and now you want to reserve some line noise symbols for future usage? Are ASCII symbols for novel semantics a bad thing or a good thing? I don't agree that meaning of

Now moving on to the idea of the unification of syntax. If the enormous number of disambiguation rules doesn't tell you it's a terrible idea, I don't know what will. I'll defer to @jfager's comment above. In particular, I want to highlight the issue of the false syntax similarity given the differences (e.g. conflicting names of attributes and macros... why have them conflict if the syntax is the same?).

Lastly, I think the stabilization on token trees as the output format is a bad idea. While today it is usually simpler to quasiquote token trees to produce the AST, this is only because the AST builder is hard to use directly (being poorly documented). I really find having to insert commas, braces and parentheses programmatically to be very inconvenient, and against the spirit of the AST macros. Frankly, at this point it becomes more attractive to standardize on a string output format, as it is easier to deal with. The typed token-constructing API provides virtually no benefit over it, in my opinion. I think chained macros are a niche use case that can co-exist with AST-outputting macro types, but they are not common enough to be treated as if they were the sole use case.
I wholly agree with @SiegeLord. I would also like to add that unification of attributes and macros does not make sense with some of attributes being not even remotely macro-like, for example, |
I don't think string output formats for macros will work for hygiene. |
By the way the sigil syntax could potentially create a lot of spam references on github. The following four people will be spammed https://github.com/try |
Hmm, it occurs to me (which should have been obvious) that using |
@netvl, in a sense, I am fine with the current syntax, but if we are to free up @nielsle, eh, I think we should use |
@SiegeLord, while not freeing up Also I believe the "suffix I believe "which sigil looks better" is subjective. And whether we should do this change depends on whether it is worthwhile to free up two sigils. |
My thoughts briefly: I think either The goal of unifying the syntax of attribute-style (decorator) and "normal" macros is a very reasonable one, but the RFC seems to completely elide the fact that quite a few important attributes are not macros, such as I still think that I'm not too fond of the idea that by taking (If this turns out to be mistaken, which would surprise me very much, then we could still finesse it along the lines of @SiegeLord's idea with So for whatever it's worth, I think we should:
@glaebhoerl, I think sometimes whether an attribute is "macro-like" is not so clear-cut; it somewhat feels like an "implementation detail". If we use different sigils here, we would be leaking such detail. It is possible that a macro-like attribute would become "builtin" to the compiler in later versions, and vice versa, and I think most people don't care as long as it works as expected. But "where it applies" would not change. I see both macros and attributes as doing "meta and/or magical stuff"; in this sense, their notations can (but don't have to) be unified.
@nikomatsakis, to solve this problem, I think we can parse inner attributes greedily, and permit
Even |
Going down that road, just about every language construct can be thought of as a macro that expands to some internal compiler representation... |
Very true, but I kind of like that viewpoint. :D |
@glaebhoerl I agree about not wanting a To me the idea of unifying macros and attributes’ syntax still doesn’t make much sense. Not only does it not make much sense to consider things like (Start of bikeshed.) I wouldn’t really mind syntax like But overall, I think this discussion has always been destined to contain far too much bikeshedding to be useful. I think that the problem here is a very small and possibly opinionated one, and coming up with a solution will always create a lot of debate. |
@P1start, I didn't think of "!!", but for And my "solution" to the ambiguity with So I now think we should just leave things as-is. |
Also, it is a backwards compatible change to allow optionally omitting |
I think that both of these alternatives are good in that they are easy to disambiguate:
or
I think it would be possible to have |
Formal proposal: macro / attribute component of main grammar:
post-parsing:
post-parsing attribute grammar (parsing round 2):
This makes it really easy to combine macros and attributes down the road: just chuck the post-parsing steps. Edits: tweaking to remove potential ambiguities; the basic idea remains the same.
I remembered why I wanted to make token-trees item-aware - it was so cfg could be implemented (as a macro) without requiring that the cfg'ed code parses, only that it tokenises. |
@nick29581 oh, that'd be really nice. That said, I think I've come to the conclusion that I would prefer a clearer distinction between attributes and macros after all. This ambiguity seems to put me over the top. |
Going to close this RFC for now. |
A revised version of @pcwalton's RFC, incorporating a more precise description of conflicts and a description of feedback.
Rendered view.