Introduce `StimulusReflex::HTML::Document`, featuring `Nokogiri::HTML5` #601

marcoroth · 2022-08-09T07:25:05Z

Type of PR

Bug Fix and Enhancement

Description

~~Extract fragment logic into StimulusReflex::HTML::Part and introduce StimulusReflex::HTML::Document (for page morphs) and StimulusReflex::HTML::Fragment (for selector morphs)~~

Edit: After a rework I was able to make page morphs and selector morphs work with just a single StimulusReflex::HTML::Document class.

Why should this be added

#594 refactored and simplified some of the logic into a new StimulusReflex::Fragment class. This new class was then also being used for page morphs. Nokogiri fragments produce weird HTML if you pass them a whole HTML document. Since both page and selector morphs were now being passed through Nokogiri fragments this broke the behaviour for page morphs.

Example:

Input:

<!DOCTYPE html>
<html>
  <head>
    <title>StimulusReflex Test</title>
    <meta name="viewport" content="width=device-width,initial-scale=1">
    <meta name="csrf-param" content="authenticity_token" />
    <meta name="csrf-token" content="token" />
    <link rel="stylesheet" href="/assets/application.css" data-turbo-track="reload" />
    <script src="/assets/application.js" data-turbo-track="reload" defer="defer"></script>
  </head>

  <body id="body">
    <h1>Home#index</h1>
    <p>Find me in app/views/home/index.html.erb</p>
  </body>
</html>

Output:

<body>
  <title>StimulusReflex Test</title> 
  <meta name="viewport" content="width=device-width,initial-scale=1"> 
  <meta name="csrf-param" content="authenticity_token"> 
  <meta name="csrf-token" content="token"> 
  <link rel="stylesheet" href="/assets/application.css" data-turbo-track="reload"> 
  <script src="/assets/application.js" data-turbo-track="reload" defer></script> 
  <h1>Home#index</h1> 
  <p>Find me in app/views/home/index.html.erb</p>"
</body>

Interestingly enough, morphdom was still able to process this generated HTML without any noticeable issues. Though, this inconsistency should be fixed.

For example, the Alpine.morph() function wasn't able to process the generated HTML. I discovered this inconsistency while working on the CableReady alpine_morph package.

Checklist

My code follows the style guidelines of this project
Checks (StandardRB & Prettier-Standard) are passing
This is not a documentation update

…a whole HTML document

leastbad · 2022-08-15T22:17:38Z

I've been going over 594 and 601 really carefully the past few days. I'm feeling conflicted because we went from a working solution with predictable behaviour that was entirely self-contained in the two classes that required it, to something that is 80 lines longer, spread out over several classes and requires a lot more mental overhead to follow than what we had. That said, it does seem to be a logical refactoring and most of my initial concerns appear to be addressed.

To recap, there was a Nokogiri parsing bug introduced in 594. We originally chose the Nokogiri::HTML.parse operation because it can process a full document. Nokogiri::HTML.fragment will wrap a document in a body, as Marco pointed out.

It seems as though we could have fixed 594 by reverting to the use of parse for page morphs. I was initially going to propose an interstitial PR in between Julian and Marco's patches, with the explicit goal of restoring the main branch to the behaviour we relied on before 594. However, I'm comfortable moving forward with 601.

Digging into 601, I'm curious/nervous about any potential ramifications from switching page morph processing from Nokogiri::HTML.parse to Nokogiri::HTML5::Document.parse. What I can piece together from the Nokogiri RDOC is that Nokogiri::HTML is just an alias for Nokogiri::HTML4. This implies that there is some meaningful difference between the HTML4 and HTML5 processing modes; I just can't find what these differences might be explained anywhere. Does anyone have a reliable source?

Not knowing the minute ways these modules could differ from each other does create some anxiety. For example, according to https://nokogiri.org/rdoc/Nokogiri/HTML5.html the HTML5 module isn't available to JRuby(!!). It also seems as though the HTML5 parse just passes the string to the Gumbo parser.

Should we just stick to the original Nokogiri::HTML.parse? I'm fuzzy on the implementation details as so many things in Nokogiri are just aliases or wrappers over C bindings. At least the HTML4 module works with JRuby.

It seems like with any use of Nokogiri::HTML.fragment, selector morphs are vulnerable to the same failure we're currently seeing on page morphs. In other words, this breaks:

morph "html", render(template: "home/index", layout: true)

but this "works", only because the Nokogiri HTML5 fragment parser (see point number one) strips the html + head + body elements from the string:

morph "body", render(template: "home/index", layout: true)

While it's uncommon for someone to do this, it's not that weird. I can definitely imagine scenarios where people are rendering pages on demand; if they wanted to implement a navigation system, for example.

Should we allow people to render a full HTML document and pass it to a selector morph? I'm picturing dynamically switching between parse and fragment depending on what they pass in. We could also explain in the documentation that if people want to do this, they should just call CableReady directly. Open to discussion, here.

The last thing I'm a bit fuzzy on is that while Marco's StimulusReflex::HTML::Part class implements both to_html and outer_html methods, they don't appear to actually be called in the current codebase. Is this because of my reaction in the conversation a few weeks ago? I am still quite cautious about switching to morphing outer_html, but I made a promise that I'm staying open minded - especially if there's a test suite which covers https://docs.stimulusreflex.com/appendices/troubleshooting#morphing-sanity-checklist

…e-morphs

lib/stimulus_reflex/html/fragment.rb

…age-morphs' into fix-fragment-class-in-combination-with-page-morphs

hopsoft

This PR looks good to me. My only reservation is that we ensure that this doesn't change/break any existing behavior. Having said that, I think we can cross the JRuby bridge if/when we need to in the future.

marcoroth added 2 commits July 30, 2022 00:52

Add some failing tests for StimulusReflex::Fragment when used with …

342c6ee

…a whole HTML document

Add extract assertion for extracting the body out of a whole document

8c10868

marcoroth force-pushed the fix-fragment-class-in-combination-with-page-morphs branch from 1b49caf to 3c2f031 Compare August 9, 2022 07:25

marcoroth changed the base branch from fix-fragment-class-in-combination-with-page-morphs to master August 9, 2022 07:25

marcoroth force-pushed the fix-fragment-class-in-combination-with-page-morphs branch 3 times, most recently from 11ab381 to 4dea4e4 Compare August 9, 2022 07:28

Fix StimulusReflex::Fragment class in combination with page morphs

baec564

marcoroth force-pushed the fix-fragment-class-in-combination-with-page-morphs branch from 4dea4e4 to baec564 Compare August 9, 2022 07:34

change to_html to default to outer_html instead of inner_html

cf28894

Merge branch 'master' into fix-fragment-class-in-combination-with-pag…

9ed785b

…e-morphs

leastbad reviewed Aug 22, 2022

View reviewed changes

lib/stimulus_reflex/html/fragment.rb Outdated Show resolved Hide resolved

marcoroth added 6 commits August 23, 2022 02:33

Consolidate Part, Document and Fragment into just HTML::Document

cacb898

Merge branch 'stimulusreflex:fix-fragment-class-in-combination-with-p…

a5c48cd

…age-morphs' into fix-fragment-class-in-combination-with-page-morphs

fix test name

3c2846c

standardize

faf1f78

remove duplicate test file

aadc8db

explictly use Nokogiri::HTML5::Document

a0725ce

hopsoft reviewed Aug 23, 2022

View reviewed changes

leastbad mentioned this pull request Aug 24, 2022

documentElement doesn't have a parent stimulusreflex/cable_ready#218

Merged

leastbad added this to the 3.5 milestone Aug 26, 2022

leastbad added bug Something isn't working enhancement New feature or request ruby Pull requests that update Ruby code labels Aug 26, 2022

leastbad changed the title ~~Fix StimulusReflex::Fragment class in combination with page morphs~~ Introduce StimulusReflex::HTML::Document, featuring Nokogiri::HTML5 Aug 26, 2022

marcoroth force-pushed the master branch from 92aa922 to f4572da Compare September 4, 2022 06:56

leastbad merged commit d5ba70c into stimulusreflex:master Sep 10, 2022

marcoroth mentioned this pull request Jan 10, 2023

Properly handle incomplete HTML fragments #622

Merged

3 tasks

marcoroth mentioned this pull request Apr 23, 2024

[bug] DocumentFragment parse behavior wrt body tag sparklemotion/nokogiri#3023

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Introduce `StimulusReflex::HTML::Document`, featuring `Nokogiri::HTML5` #601

Introduce `StimulusReflex::HTML::Document`, featuring `Nokogiri::HTML5` #601

marcoroth commented Aug 9, 2022 •

edited

Loading

leastbad commented Aug 15, 2022 •

edited

Loading

hopsoft left a comment

Introduce StimulusReflex::HTML::Document, featuring Nokogiri::HTML5 #601

Introduce StimulusReflex::HTML::Document, featuring Nokogiri::HTML5 #601

Conversation

marcoroth commented Aug 9, 2022 • edited Loading

Type of PR

Description

Why should this be added

Checklist

leastbad commented Aug 15, 2022 • edited Loading

hopsoft left a comment

Choose a reason for hiding this comment

Introduce `StimulusReflex::HTML::Document`, featuring `Nokogiri::HTML5` #601

Introduce `StimulusReflex::HTML::Document`, featuring `Nokogiri::HTML5` #601

marcoroth commented Aug 9, 2022 •

edited

Loading

leastbad commented Aug 15, 2022 •

edited

Loading