feat: add HTML & XML Inspectors API using Nokogiri #546

Merged
merged 6 commits on May 16, 2022

Conversation

jaredcwhite
Member

@jaredcwhite jaredcwhite commented May 3, 2022

@render

render bot commented May 3, 2022

@jaredcwhite jaredcwhite requested a review from KonnorRogers May 3, 2022 17:52
@jaredcwhite jaredcwhite added this to the 1.1 milestone May 3, 2022
Contributor

@andrewmcodes andrewmcodes left a comment

This looks incredible. I have been attempting to do something like this performantly (without JS of course) similar to what 11ty has and this is way better than I had even dreamed. This will open up the possibility to do some really cool transformations now. Definitely an area that plugins can flourish!

I left a few nitpicks in the docs in a hope you may like one or two of the suggestions.

Amazing job on this @jaredcwhite 👏

module RunInspectors
  def self.call(resource, inspectors) # rubocop:disable Metrics/CyclomaticComplexity
    return resource.output if !inspectors ||
      !resource.destination&.output_ext&.starts_with?(".htm") ||

Contributor

warning: I don't particularly have a concrete example use case for this and am just thinking out loud

How hard would it be to allow this to work for XML too? I am thinking about sitemaps and app manifests. If the answer is one or two lines of code changed, I could see it being useful to add. ¯_(ツ)_/¯

Member

Nokogiri ships an XML inspector so I imagine it should be possible.

https://nokogiri.org/tutorials/parsing_an_html_xml_document.html#from-a-string
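The guard quoted above only lets `.htm`/`.html` output through. As a rough illustration of how small the XML change could be, here is a sketch of an extension check that also recognises `.xml`. `inspector_type_for` is a hypothetical helper, not Bridgetown's actual code, and it uses plain Ruby's `start_with?` instead of the ActiveSupport `starts_with?` alias seen in the diff.

```ruby
# Hypothetical helper (an assumption for illustration, not Bridgetown's API):
# decides which parser family, if any, an output extension would be routed to.
def inspector_type_for(output_ext)
  return nil if output_ext.nil?

  ext = output_ext.downcase
  if ext.start_with?(".htm") # matches both .htm and .html
    :html
  elsif ext == ".xml"
    :xml
  end
  # Falls through to nil for anything else (JSON, CSS, and so on).
end

inspector_type_for(".html") # => :html
inspector_type_for(".xml")  # => :xml
inspector_type_for(".json") # => nil
```

Returning a symbol rather than a boolean would let the caller pick between an HTML and an XML parser in one place.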

Comment on lines +3 to +12
inspect_html do |document|
  document.query_selector_all("article h2[id], article h3[id]").each do |heading|
    heading << document.create_text_node(" ")
    heading << document.create_element(
      "a", "#",
      href: "##{heading[:id]}",
      class: "heading-anchor"
    )
  end
end

Contributor

😍 This is FANTASTIC

bridgetown-website/src/_docs/plugins/html-inspectors.md (outdated)
All resources which result in HTML output (rather than JSON or some other format) will be processed through any defined inspectors. For greater performance and fidelity, the Nokogiri document for a single resource will be the same across all inspectors (rather than instantiating a new Nokogiri document for each inspector).

{%@ Note type: :warning do %}
Nokogiri [relies on a C extension](https://nokogiri.org/#guiding-principles_1) which in turn uses `libxml2`, so generally you should see very fast performance unless the number of resources in your project is extremely large.

Contributor

Suggested change
- Nokogiri [relies on a C extension](https://nokogiri.org/#guiding-principles_1) which in turn uses `libxml2`, so generally you should see very fast performance unless the number of resources in your project is extremely large.
+ Nokogiri [relies on a C extension](https://nokogiri.org/#guiding-principles_1) which in turn uses `libxml2`. You should see fast performance unless the number of resources in your project is monstrous.

Just a suggestion 🙂

Contributor

Maybe enormous? Huge?
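
The docs excerpt in this thread says a resource's document is parsed once and shared by every inspector. A minimal sketch of that idea follows, under stated assumptions: `run_inspectors`, the injected `parser:` callable, and `FakeDocument` are all hypothetical stand-ins (Nokogiri would supply the real parser and document), chosen so the parse-once behaviour is observable without extra gems.

```ruby
# Hypothetical sketch, not Bridgetown's implementation: run every inspector
# block against one shared document instead of re-parsing per inspector.
def run_inspectors(output, inspectors, parser:)
  return output if inspectors.empty?

  document = parser.call(output) # parsed a single time
  inspectors.each { |inspector| inspector.call(document) }
  document.to_s # serialized once, after all inspectors have run
end

# Stand-in document type (an assumption); a Nokogiri document plays this role
# in the real feature.
FakeDocument = Struct.new(:nodes) do
  def to_s
    nodes.join
  end
end

parse_count = 0
parser = lambda do |html|
  parse_count += 1
  FakeDocument.new([html])
end

inspectors = [
  ->(doc) { doc.nodes << "<!-- first -->" },
  ->(doc) { doc.nodes << "<!-- second -->" }
]

result = run_inspectors("<p>Hi</p>", inspectors, parser: parser)
# result      => "<p>Hi</p><!-- first --><!-- second -->"
# parse_count => 1
```

Because both inspectors mutate the same object, the second one sees the first one's changes, which is the "fidelity" half of the trade-off the docs mention.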

bridgetown-website/src/_docs/plugins/html-inspectors.md (outdated)
category: plugins
---

The HTML Inspectors API, added in Bridgetown 1.1, provides a useful and safe way to manipulate the HTML output of your resources. Safe because instead of using string manipulation, regular expressions, and the like—which is prone to error—you'll be working on real node trees. This is thanks to [Nokogiri](https://nokogiri.org), a Ruby gem which lets you work with a DOM-like API directly on HTML documents.
Contributor

Safe because instead of ...

I think the beginning of this sentence needs a little love. What is safe? Why would it not be? etc.

@jaredcwhite
Member Author

jaredcwhite commented May 5, 2022

Thanks @andrewmcodes for taking a look and for your enthusiasm. It's been a super fun feature to work on and implement…once again gotta give Mr. Shoelace himself @claviska a big shoutout for coming up with the feature idea as he works on his own docs engine.

I'm about to be OOTO for a few days, but I'll take a close look at your feedback as soon as I can. 🙏

Member

@KonnorRogers KonnorRogers left a comment

This is sexy and looks great and is a surprisingly small amount of code, great work!

@jaredcwhite
Member Author

Thanks again @andrewmcodes for the feedback! I'll clean up the first few sentences, and I'll also look into supporting .xml extension files as well.

@jaredcwhite jaredcwhite changed the title feat: add HTML Inspectors API using Nokogiri feat: add HTML & XML Inspectors API using Nokogiri May 16, 2022
@jaredcwhite jaredcwhite merged commit d8cc141 into main May 16, 2022
@jaredcwhite jaredcwhite deleted the html-inspectors branch May 16, 2022 17:29
Development

Successfully merging this pull request may close these issues.

feat: allow resources/generated pages to be manipulated using DOM-style tooling
3 participants