Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor html tag peeking #1738

Merged
merged 10 commits into from
Feb 27, 2021

Conversation

lukechu10
Copy link
Contributor

@lukechu10 lukechu10 commented Feb 11, 2021

Description

Fixes #1737

Fix for parsing generic component tags with a lowercase type paramater, e.g.

html! {
    <COMP<usize>></COMP<usize>>
}

By fixing this, I found another bug related to generic components. The following doesn't work either.

html! { <COMP<A, B>></COMP<A,B> }

(Note that this is not fixed in this PR)

Checklist

  • I have run cargo make pr-flow
  • I have reviewed my own code
  • I have added tests

@lukechu10 lukechu10 changed the title Lukechu10/fix html component generics Fix html macro lowercase params in generic component tag Feb 11, 2021
@lukechu10 lukechu10 force-pushed the lukechu10/fix-html-component-generics branch from 09f27f2 to 437fac7 Compare February 11, 2021 23:30
@lukechu10 lukechu10 marked this pull request as ready for review February 11, 2021 23:34
@lukechu10 lukechu10 marked this pull request as draft February 12, 2021 04:42
@lukechu10
Copy link
Contributor Author

There are a few more problems related to generics that I would like to fix, e.g. <COMP::<A>></COMP::<A>> should be accepted: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=768021be463e044232e68a9b2c6a7964

@siku2
Copy link
Member

siku2 commented Feb 12, 2021

Replying to your comment here.

I am not familiar (at all) with syn but would it be possible to just reuse syn's default .parse()

Yes! That's exactly why I dislike the current approach. Normal parsing operates on a ParseStream which is totally capable of parsing literally any Rust syntax but our peeking logic operates on a Cursor (to avoid advancing the ParseStream) and because they aren't meant to do this sort of thing we're forced to rewrite the logic by hand.

If you'd like to get your hands dirty a bit I'd love to resolve this mess entirely.
The way I see it we either use a quick and dirty fix or we rewrite this logic entirely, I really don't want to use a middle of the road approach.

@lukechu10
Copy link
Contributor Author

Yes! That's exactly why I dislike the current approach. Normal parsing operates on a TokenStream which is totally capable of parsing literally any Rust syntax but our peeking logic operates on a Cursor (to avoid advancing the TokenStream) and because they aren't meant to do this sort of thing we're abusing them for so we're forced to rewrite the logic by hand.

How about using ParseStream::fork? I think I might as well do this the right way. You can merge this if you want and I'll create a new PR. Otherwise, I'll just continue working on this on the same PR.

@siku2
Copy link
Member

siku2 commented Feb 12, 2021

Yes, that's one way to do it, but I think the problem here is a bit bigger because Yew's Peek* traits receive a Cursor. So one way to go about this would be to change the definition of the Peek* traits to receive a forked ParseStream but this requires quite a few changes in many different places so I don't think that's the way to go.

I think the main problem here is that the tags implement PeekValue<Type> (i.e. that they need to "parse" and return the component type). Really all the peek needs to check is that it starts with < and is followed by either an uppercase letter (that's to distinguish it from normal elements) or a lowercase ident (optional) which itself is followed by a path separator :: (we don't really care whether the last segment of this path starts with an uppercase letter because if we're looking at a path a::b::c we can be quite sure that it's a component type, not an element name).
Basically we check for the following patterns <[A-Z], <[a-z]+::, and <::. Whether or not these are actually valid paths is up to the parser, which, as I mentioned in the issue, should handle everything correctly.
I think we should switch to just PeekValue<()> trait where the implementation just checks for the two things mentioned above.
The only significant code we need to adjust for this to work are the following lines:

if let Some(ty) = HtmlComponentClose::peek(input.cursor()) {
if open.ty == ty {
break;
}
}

but we can just replace that with this:

if HtmlComponentClose::peek(input.cursor()).is_some() {
  let close = input.parse::<HtmlComponentClose>()?;
  if close.ty == open.ty {
    break;
  } else {
    // TODO: feel free to improve this error message ex. by including the name of the opening tag.
    return Err(syn::Error::new_spanned(close.to_spanned(), "this closing tag doesn't match the corresponding opening tag");
  }
}

At some point in the future I would still like to get rid of this "speculative" parsing entirely though, but that's not something I want to burden you with.

btw, feel free to keep working on it in this pull request unless you want want to get in a fix as soon as possible :P

@lukechu10
Copy link
Contributor Author

lukechu10 commented Feb 12, 2021

Yes, that's one way to do it, but I think the problem here is a bit bigger because Yew's Peek* traits receive a Cursor. So one way to go about this would be to change the definition of the Peek* traits to receive a forked ParseStream but this requires quite a few changes in many different places so I don't think that's the way to go.

I'm not sure I understand.

impl Parse for HtmlTree {
fn parse(input: ParseStream) -> Result<Self> {
let html_type = HtmlTree::peek(input.cursor())
.ok_or_else(|| input.error("expected a valid html element"))?;
let html_tree = match html_type {
HtmlType::Empty => HtmlTree::Empty,
HtmlType::Component => HtmlTree::Component(Box::new(input.parse()?)),
HtmlType::Element => HtmlTree::Element(Box::new(input.parse()?)),
HtmlType::Block => HtmlTree::Block(Box::new(input.parse()?)),
HtmlType::List => HtmlTree::List(Box::new(input.parse()?)),
};
Ok(html_tree)
}
}
impl PeekValue<HtmlType> for HtmlTree {
fn peek(cursor: Cursor) -> Option<HtmlType> {
if cursor.eof() {
Some(HtmlType::Empty)
} else if HtmlList::peek(cursor).is_some() {
Some(HtmlType::List)
} else if HtmlComponent::peek(cursor).is_some() {
Some(HtmlType::Component)
} else if HtmlElement::peek(cursor).is_some() {
Some(HtmlType::Element)
} else if HtmlBlock::peek(cursor).is_some() {
Some(HtmlType::Block)
} else {
None
}
}
}

Aren't we basically parsing the html tag twice? (one time peeking and then discarding the result, the second time actually parsing and storing the result) Even though the peek implementation only needs to determine the HtmlType, couldn't we just try to parse the tag as a certain type and if it fails, try the next one? That way, we would completely remove the need for a Peek trait as well as manipulating Cursors.

(This is my first time writing proc macros and I never knew it could be this fun)

@siku2
Copy link
Member

siku2 commented Feb 12, 2021

Aren't we basically parsing the html tag twice? (one time peeking and then discarding the result, the second time actually parsing and storing the result)

Mostly, yes. To be fair, I don't think we're actually parsing the same thing twice anywhere. The code here is likely the worst offender in this regard. Other implementations of the peek trait are a fair bit better.
The problem is that we're generally pre-parsing way more than necessary for reasons I'll get into further down.

couldn't we just try to parse the tag as a certain type and if it fails, try the next one?

We definitely could, the reason we don't is mainly because that makes it much harder to emit good error messages. If you just try one after the other, what do you do when none of them succeed?

The easiest way to approach this is to use a little of both worlds. When you have a structure like HtmlTree you look at the absolute minimum amount of tokens required to infer the appropriate next step. Really, you shouldn't need more than three tokens for this.

Yew's implementation of this idea went wrong because of the Peek* traits. It's not even the fact that the peek method returns a value instead of just a boolean. That's definitely problematic, sure, but the bigger issue is that the amount of tokens we need to look at to determine the exact type entirely depends on the other available types that can also be parsed at this point.

To give an example: if we only want to distinguish between a normal HTML element <div> and a code block { "string" } we only need to look at the first token (which is either < or {) and we already know which one to parse.
Now, if we throw components into the mix, we have to properly distinguish them from normal elements.
To do this, we just have to add another rule: if the first token is <, we also look at the next token (which should be an identifier in this simplified example) and use it to distinguish between component and element based on the casing of the first letter.
syn even offers the peek, peek2, peek3, and the powerful lookahead1 functions for this purpose.

The point I'm getting at is that it doesn't make sense to have a Peek trait which is implemented on a specific type.
Given the previous example, should the peek implementation for HTML elements only check the first token or include the second token to distinguish them from components?
Well, as we know, the answer is that it depends.

For such a trait implementation to be correct in all cases it has no choice but to basically parse the entire thing.
In our case here, HtmlTree should be the one that actively determines which type to parse because only it "knows" how many tokens are really needed to produce a unique match.

@lukechu10
Copy link
Contributor Author

The point I'm getting at is that it doesn't make sense to have a Peek trait which is implemented on a specific type because in order for such a trait implementation to be correct in all cases it has no choice but to basically parse the entire thing.
In our case here, HtmlTree should be the one that actively determines which type to parse because only it "knows" how many tokens are really needed to produce a unique match.

If I understood correctly, I'll just get rid of all the Peek traits and instead perform the minimal amount of peeking directly in the parse() method on HtmlTree?

@siku2
Copy link
Member

siku2 commented Feb 12, 2021

If you want to put in the effort, go for it!
There's always the hacky approach I described before if you don't :)

@lukechu10 lukechu10 changed the title Fix html macro lowercase params in generic component tag Refactor html tag peeking Feb 12, 2021
@lukechu10
Copy link
Contributor Author

I created a new method called peek_html_type which tries to find the HtmlType and eagerly returns.

@lukechu10
Copy link
Contributor Author

Hmm, how come there is an error with yew-services in CI? I didn't change anything

@teymour-aldridge
Copy link
Contributor

I think that failure is a fluke.

@lukechu10 lukechu10 marked this pull request as ready for review February 16, 2021 02:55
@lukechu10
Copy link
Contributor Author

For some reason, <Generic::<String>></Generic::<String>> does not work but I think it's probably something to do with Peek still being used in parsing.

@siku2
Copy link
Member

siku2 commented Feb 18, 2021

The diff is a bit messed up now, could you try to fix it @lukechu10?

@lukechu10
Copy link
Contributor Author

The diff is a bit messed up now, could you try to fix it @lukechu10?

Sorry, I am not sure what you mean?

@siku2
Copy link
Member

siku2 commented Feb 20, 2021

Sorry, I am not sure what you mean?

@lukechu10, there are a few commits in here that really shouldn't be (check the commits).

@teymour-aldridge
Copy link
Contributor

You probably want to rebase on top of master (we really should change the name at some point :P).

@lukechu10 lukechu10 force-pushed the lukechu10/fix-html-component-generics branch from f2f438e to af6af1b Compare February 22, 2021 17:37
fn path_arguments(cursor: Cursor) -> Option<(PathArguments, Cursor)> {
let (punct, cursor) = cursor.punct()?;
/// Refer to the [`syn::parse::Parse`] implementation for [`AngleBracketedGenericArguments`].
fn path_arguments(mut cursor: Cursor) -> Option<(PathArguments, Cursor)> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... I still really want to get rid of this and the peek_type method. Did you run into an issue here?
Other than that this PR is pretty much done, should we just merge this and revisit this topic another time? I don't want to push you to do things you don't really want to do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Peek implementation depends on peek_type which dependends on path_arguments. I think it would be best if this were merged first and I'll try to work on removing this in another PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alright, sounds like a plan

@siku2 siku2 merged commit bc46062 into yewstack:master Feb 27, 2021
@jbg
Copy link
Contributor

jbg commented Feb 28, 2021

As of this branch being merged, I can't use <use> inside <svg> in html! {} any more. (Maybe other SVG elements are affected too; I haven't tried.) For example, the following:

html!{
  <a class="btn btn-primary" href="https://example.org/">
    <svg class="bi" fill="currentColor">
      <use href="/bootstrap-icons.svg#wrench"/>
    </svg>
    <span>{"Go to example.org"}</span>
  </a>
}

fails with this error:

error: expected a valid html element
   --> src/pages/example.rs:140:19
    |
140 |                   <use href="/bootstrap-icons.svg#wrench"/>
    |                   ^

@siku2
Copy link
Member

siku2 commented Feb 28, 2021

@jbg oh no, could you open a new issue for this? We'll get this fixed ASAP.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-yew-macro Area: The yew-macro crate A-yew-router Area: The yew-router crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Generic components with lower case type: "expected a valid html element"
5 participants