Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce StimulusReflex::HTML::Document, featuring Nokogiri::HTML5 #601

Conversation

marcoroth
Copy link
Member

@marcoroth marcoroth commented Aug 9, 2022

Type of PR

Bug Fix and Enhancement

Description

Extract fragment logic into StimulusReflex::HTML::Part and introduce StimulusReflex::HTML::Document (for page morphs) and StimulusReflex::HTML::Fragment (for selector morphs)

Edit: After a rework I was able to make page morphs and selector morphs work with just a single StimulusReflex::HTML::Document class.

Why should this be added

#594 refactored and simplified some of the logic into a new StimulusReflex::Fragment class. This new class was then also being used for page morphs. Nokogiri fragments produce weird HTML if you pass them a whole HTML document. Since both page and selector morphs were now being passed through Nokogiri fragments this broke the behaviour for page morphs.

Example:

Input:

<!DOCTYPE html>
<html>
  <head>
    <title>StimulusReflex Test</title>
    <meta name="viewport" content="width=device-width,initial-scale=1">
    <meta name="csrf-param" content="authenticity_token" />
    <meta name="csrf-token" content="token" />
    <link rel="stylesheet" href="/assets/application.css" data-turbo-track="reload" />
    <script src="/assets/application.js" data-turbo-track="reload" defer="defer"></script>
  </head>

  <body id="body">
    <h1>Home#index</h1>
    <p>Find me in app/views/home/index.html.erb</p>
  </body>
</html>

Output:

<body>
  <title>StimulusReflex Test</title> 
  <meta name="viewport" content="width=device-width,initial-scale=1"> 
  <meta name="csrf-param" content="authenticity_token"> 
  <meta name="csrf-token" content="token"> 
  <link rel="stylesheet" href="/assets/application.css" data-turbo-track="reload"> 
  <script src="/assets/application.js" data-turbo-track="reload" defer></script> 
  <h1>Home#index</h1> 
  <p>Find me in app/views/home/index.html.erb</p>"
</body>

Interestingly enough, morphdom was still able to process this generated HTML without any noticeable issues. Though, this inconsistency should be fixed.

For example, the Alpine.morph() function wasn't able to process the generated HTML. I discovered this inconsistency while working on the CableReady alpine_morph package.

Checklist

  • My code follows the style guidelines of this project
  • Checks (StandardRB & Prettier-Standard) are passing
  • This is not a documentation update

@marcoroth marcoroth force-pushed the fix-fragment-class-in-combination-with-page-morphs branch from 1b49caf to 3c2f031 Compare August 9, 2022 07:25
@marcoroth marcoroth changed the base branch from fix-fragment-class-in-combination-with-page-morphs to master August 9, 2022 07:25
@marcoroth marcoroth force-pushed the fix-fragment-class-in-combination-with-page-morphs branch 3 times, most recently from 11ab381 to 4dea4e4 Compare August 9, 2022 07:28
@marcoroth marcoroth force-pushed the fix-fragment-class-in-combination-with-page-morphs branch from 4dea4e4 to baec564 Compare August 9, 2022 07:34
@leastbad
Copy link
Contributor

leastbad commented Aug 15, 2022

I've been going over 594 and 601 really carefully the past few days. I'm feeling conflicted because we went from a working solution with predictable behaviour that was entirely self-contained in the two classes that required it, to something that is 80 lines longer, spread out over several classes and requires a lot more mental overhead to follow than what we had. That said, it does seem to be a logical refactoring and most of my initial concerns appear to be addressed.

To recap, there was a Nokogiri parsing bug introduced in 594. We originally chose the Nokogiri::HTML.parse operation because it can process a full document. Nokogiri::HTML.fragment will wrap a document in a body, as Marco pointed out.

It seems as though we could have fixed 594 by reverting to the use of parse for page morphs. I was initially going to propose an interstitial PR in between Julian and Marco's patches, with the explicit goal of restoring the main branch to the behaviour we relied on before 594. However, I'm comfortable moving forward with 601.

Digging into 601, I'm curious/nervous about any potential ramifications from switching page morph processing from Nokogiri::HTML.parse to Nokogiri::HTML5::Document.parse. What I can piece together from the Nokogiri RDOC is that Nokogiri::HTML is just an alias for Nokogiri::HTML4. This implies that there is some meaningful difference between the HTML4 and HTML5 processing modes; I just can't find what these differences might be explained anywhere. Does anyone have a reliable source?

Not knowing the minute ways these modules could differ from each other does create some anxiety. For example, according to https://nokogiri.org/rdoc/Nokogiri/HTML5.html the HTML5 module isn't available to JRuby(!!). It also seems as though the HTML5 parse just passes the string to the Gumbo parser.

Should we just stick to the original Nokogiri::HTML.parse? I'm fuzzy on the implementation details as so many things in Nokogiri are just aliases or wrappers over C bindings. At least the HTML4 module works with JRuby.

It seems like with any use of Nokogiri::HTML.fragment, selector morphs are vulnerable to the same failure we're currently seeing on page morphs. In other words, this breaks:

morph "html", render(template: "home/index", layout: true)

but this "works", only because the Nokogiri HTML5 fragment parser (see point number one) strips the html + head + body elements from the string:

morph "body", render(template: "home/index", layout: true)

While it's uncommon for someone to do this, it's not that weird. I can definitely imagine scenarios where people are rendering pages on demand; if they wanted to implement a navigation system, for example.

Should we allow people to render a full HTML document and pass it to a selector morph? I'm picturing dynamically switching between parse and fragment depending on what they pass in. We could also explain in the documentation that if people want to do this, they should just call CableReady directly. Open to discussion, here.

The last thing I'm a bit fuzzy on is that while Marco's StimulusReflex::HTML::Part class implements both to_html and outer_html methods, they don't appear to actually be called in the current codebase. Is this because of my reaction in the conversation a few weeks ago? I am still quite cautious about switching to morphing outer_html, but I made a promise that I'm staying open minded - especially if there's a test suite which covers https://docs.stimulusreflex.com/appendices/troubleshooting#morphing-sanity-checklist

Copy link
Contributor

@hopsoft hopsoft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR looks good to me. My only reservation is that we ensure that this doesn't change/break any existing behavior. Having said that, I think we can cross the JRuby bridge if/when we need to in the future.

@leastbad leastbad added this to the 3.5 milestone Aug 26, 2022
@leastbad leastbad added bug Something isn't working enhancement New feature or request ruby Pull requests that update Ruby code labels Aug 26, 2022
@leastbad leastbad changed the title Fix StimulusReflex::Fragment class in combination with page morphs Introduce StimulusReflex::HTML::Document, featuring Nokogiri::HTML5 Aug 26, 2022
@leastbad leastbad merged commit d5ba70c into stimulusreflex:master Sep 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request ruby Pull requests that update Ruby code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants