Add support for custom non-AMP scripts #6528
* Add `unwrap_noscripts` arg to `AMP_Script_Sanitizer`. * Introduce `data-amp-no-unwrap` attribute on `noscript` to prevent unwrapping. Fixes #6030
…sition-observer@once` and `amp-font@on-*` attrs
…an be dynamically set by style sanitizer
Plugin builds for f35c4e9 are ready 🛎️!
@@ -1531,7 +1533,11 @@ public static function ensure_required_markup( Document $dom, $script_handles =
 		}

 		// When opting-in to POST forms, omit the amp-form component entirely since it blocks submission.
-		if ( amp_is_native_post_form_allowed() && $dom->xpath->query( '//form[ @action and @method and translate( @method, "POST", "post" ) = "post" ]' )->length > 0 ) {
+		if (
No more checking if `amp_is_native_post_form_allowed()`?
Correct, because now that the `AMP_Script_Sanitizer` can modify the args of `AMP_Form_Sanitizer` to enable the `native_post_forms_allowed` arg, the return value of `amp_is_native_post_form_allowed()` may not be accurate at this point. Its return value is used as the initial value for `native_post_forms_allowed`, but since that arg can be changed by a sanitizer, the function is only relevant for setting the initial value in `amp_get_content_sanitizers()`.
In fact, there's not really a need for a global `amp_is_native_post_form_allowed()` function anymore.
The only reason why the function was here was to offer a slight performance improvement by skipping the XPath query. But now we are checking to see if the `amp-form` extension was identified instead.
	 * @since 2.2
	 */
	protected function sanitize_script_elements() {
		$scripts = $this->dom->xpath->query( '//script[ not( @type ) or not( contains( @type, "json" ) ) ]' );
Any reason for whitelisting `type=json` scripts? What about scripts containing HTML templates, for example?
The reason for skipping JSON scripts is that they have no logic that can mutate the page, which is the main concern for script sanitization. If there is JSON on the page, it's not going to cause issues with tree shaking, for example. Nor will it negatively impact PX. So this is specifically sanitizing JS scripts (which the method used to be called, and it could get renamed back).
Got it.
> So this is specifically sanitizing JS scripts (which the method used to be called, and it could get renamed back).

According to MDN, if the `type` attribute has any value other than `module` or a permitted JavaScript MIME type, the script will be treated as a data block, so I'm wondering if skipping only JSON scripts would be a shortsighted approach.
Oh right. I forgot about examples like <script type="text/plain">
Addressed in 03fb8b7
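The distinction raised in this thread (JavaScript versus data block) can be sketched roughly as follows. This is only an illustration, not the code from 03fb8b7: the helper name `is_javascript_script()` is hypothetical, the MIME list follows the WHATWG "JavaScript MIME type" definition, and the legacy `language` attribute is deliberately ignored.

```php
<?php
/**
 * Hypothetical helper (not part of the plugin): would a browser execute this
 * <script> as JavaScript, or treat it as a data block?
 */
function is_javascript_script( DOMElement $script ): bool {
	// getAttribute() returns '' for both a missing and an empty type attribute;
	// in either case the script is classic JavaScript.
	$raw = $script->getAttribute( 'type' );
	if ( '' === $raw ) {
		return true;
	}

	// Strip ASCII whitespace and compare case-insensitively.
	$type = strtolower( trim( $raw, " \t\n\f\r" ) );
	if ( 'module' === $type ) {
		return true;
	}

	$js_mime_types = [
		'application/ecmascript',
		'application/javascript',
		'application/x-ecmascript',
		'application/x-javascript',
		'text/ecmascript',
		'text/javascript',
		'text/javascript1.0',
		'text/javascript1.1',
		'text/javascript1.2',
		'text/javascript1.3',
		'text/javascript1.4',
		'text/javascript1.5',
		'text/jscript',
		'text/livescript',
		'text/x-ecmascript',
		'text/x-javascript',
	];

	// Anything else (JSON, text/plain, HTML templates, ...) is a data block.
	return in_array( $type, $js_mime_types, true );
}

// Usage: classify every script in a small document.
$dom = new DOMDocument();
libxml_use_internal_errors( true );
$dom->loadHTML(
	'<!DOCTYPE html><html><head>'
	. '<script>var a = 1;</script>'
	. '<script type="module">import "./x.js";</script>'
	. '<script type="application/ld+json">{}</script>'
	. '<script type="text/plain">Hello</script>'
	. '</head><body></body></html>'
);
foreach ( $dom->getElementsByTagName( 'script' ) as $script ) {
	printf(
		"%-22s => %s\n",
		$script->getAttribute( 'type' ) ?: '(none)',
		is_javascript_script( $script ) ? 'JavaScript' : 'data block'
	);
}
```

An allowlist of executable types, rather than a denylist of `json`, means unknown types such as `text/plain` default to being skipped as data.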
			Extension::POSITION_OBSERVER === $element->tagName
			&&
			Attribute::ONCE === $event_handler_attribute->nodeName
		)
		||
		(
			Extension::FONT === $element->tagName
			&&
			substr( $event_handler_attribute->nodeName, 0, 3 ) === 'on-'
Nice catch!
		// When there are kept custom scripts, skip tree shaking since it's likely JS will toggle classes that have
		// associated style rules.
		// @todo There should be an attribute on script tags that opt-in to keeping tree shaking and/or to indicate what class names need to be included.
Wouldn't it be better to keep tree shaking enabled by default, and then provide an opt-in attribute to disable tree shaking?
My thinking here is this. Consider this common script in themes:
<script>document.documentElement.className = document.documentElement.className.replace( 'no-js', 'js' );</script>
If there are CSS rules such as:
.js .mobile-nav > ul {
	display: none;
}
.no-js .mobile-nav > ul {
	display: block;
}
The result is that with tree shaking, the `.js` rule (the first one) will get stripped out, because the tree shaker only sees the `no-js` class in the server-rendered markup and is not aware of the `className` change made by the script. So this is why tree shaking needs to be disabled by default. Nevertheless, there could be a way to opt in to keeping tree shaking by adding an attribute to the `script` to indicate which class names it would be mutating:
<script data-mutation-classlist="no-js js">
document.documentElement.className = document.documentElement.className.replace( 'no-js', 'js' );
</script>
If such a hypothetical attribute were present, then those class names could be added to `AMP_Style_Sanitizer::$used_class_names` and the style rule would not get stripped out by the tree shaker.
So that's the thinking behind this comment.
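To make the failure mode concrete, here is a toy tree shaker, assuming the server-rendered markup is `<html class="no-js">` so that only `no-js` is visible to the shaker. The function `shake_rules()` and the handling of `data-mutation-classlist` are hypothetical illustrations; this is not how `AMP_Style_Sanitizer` is implemented.

```php
<?php
/**
 * Toy tree shaker (illustrative only): keep a rule only when every class
 * name in its selector appears in the list of used class names.
 */
function shake_rules( array $rules, array $used_class_names ): array {
	$kept = [];
	foreach ( $rules as $selector => $declarations ) {
		// Simplified class-name extraction; real selectors need a proper parser.
		preg_match_all( '/\.([A-Za-z_][\w-]*)/', $selector, $matches );
		if ( array_diff( $matches[1], $used_class_names ) === [] ) {
			$kept[ $selector ] = $declarations;
		}
	}
	return $kept;
}

$rules = [
	'.js .mobile-nav > ul'    => 'display: none;',
	'.no-js .mobile-nav > ul' => 'display: block;',
];

// Class names found in the static markup (assumption: <html class="no-js">).
$used = [ 'no-js', 'mobile-nav' ];

// Without any hint, the rule depending on the script-added class is dropped.
$without_hint = shake_rules( $rules, $used );
print_r( array_keys( $without_hint ) );

// With the hypothetical data-mutation-classlist="no-js js" attribute, the
// listed class names are treated as used, so both rules survive.
$hint      = 'no-js js';
$with_hint = shake_rules( $rules, array_merge( $used, preg_split( '/\s+/', trim( $hint ) ) ) );
print_r( array_keys( $with_hint ) );
```

Because a script can add any class name at runtime, the shaker cannot safely guess which rules are dead, which is the argument for disabling it by default whenever a custom script is kept.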
Ah OK that makes sense 👍.
		// Capture the selector conversion mappings from the other sanitizers.
		foreach ( $this->sanitizers as $sanitizer ) {
Any reason for moving this `foreach` loop from `init()` to `sanitize()`?
Yes, because it allows us to gather up the selector mappings after all of the previous sanitizers have run, as opposed to gathering them before the other sanitizers have run. This is important, for example, if the script sanitizer enables `native_img_used` for the `AMP_Img_Sanitizer`, because when that arg is enabled the style sanitizer no longer needs to remap `img` to `amp-img`/`amp-anim`.
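The ordering issue can be illustrated with a few stand-in classes. Everything prefixed `Sketch_` is hypothetical and only mimics the shape of the discussion (an `update_args()` method, a `native_img_used` arg, a selector conversion mapping); it is not the plugin's actual code.

```php
<?php
// Stand-in base class: holds args and a reference to all sanitizers.
abstract class Sketch_Sanitizer {
	protected $args;
	protected $sanitizers = [];

	public function __construct( array $args = [] ) {
		$this->args = $args;
	}

	// Lets another sanitizer change this sanitizer's args after construction.
	public function update_args( array $args ) {
		$this->args = array_merge( $this->args, $args );
	}

	public function init( array $sanitizers ) {
		$this->sanitizers = $sanitizers;
	}

	public function get_selector_conversion_mapping() {
		return [];
	}

	abstract public function sanitize();
}

class Sketch_Script_Sanitizer extends Sketch_Sanitizer {
	public function sanitize() {
		// Pretend a custom script was kept: switch the image sanitizer to native images.
		$this->sanitizers['img']->update_args( [ 'native_img_used' => true ] );
	}
}

class Sketch_Img_Sanitizer extends Sketch_Sanitizer {
	public function sanitize() {}

	public function get_selector_conversion_mapping() {
		return empty( $this->args['native_img_used'] )
			? [ 'img' => [ 'amp-img', 'amp-anim' ] ]
			: [];
	}
}

class Sketch_Style_Sanitizer extends Sketch_Sanitizer {
	public $mapping_at_init     = [];
	public $mapping_at_sanitize = [];

	public function init( array $sanitizers ) {
		parent::init( $sanitizers );
		$this->mapping_at_init = $this->collect(); // Too early: args may still change.
	}

	public function sanitize() {
		$this->mapping_at_sanitize = $this->collect(); // After earlier sanitizers ran.
	}

	private function collect() {
		$mapping = [];
		foreach ( $this->sanitizers as $sanitizer ) {
			$mapping = array_merge( $mapping, $sanitizer->get_selector_conversion_mapping() );
		}
		return $mapping;
	}
}

$sanitizers = [
	'script' => new Sketch_Script_Sanitizer(),
	'img'    => new Sketch_Img_Sanitizer(),
	'style'  => new Sketch_Style_Sanitizer(),
];
foreach ( $sanitizers as $sanitizer ) {
	$sanitizer->init( $sanitizers );
}
foreach ( $sanitizers as $sanitizer ) {
	$sanitizer->sanitize();
}

var_dump( $sanitizers['style']->mapping_at_init );     // Stale: still remaps img.
var_dump( $sanitizers['style']->mapping_at_sanitize ); // Reflects the updated args.
```

In this sketch, the mapping captured in `init()` is already out of date by the time the style sanitizer runs, while the one gathered in `sanitize()` reflects the arg change made by the script sanitizer.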
Co-authored-by: Pierre Gordon <[email protected]>
LGTM!
QA passed. I tested this with the Test Custom Scripts plugin that Weston created. Things to note:
See #6443.
This pull request adds an opt-in to the `AMP_Script_Sanitizer` to do the sanitization of scripts itself, rather than just limiting itself to unwrapping `noscript` elements. The new behavior is controlled by the `sanitize_scripts` sanitizer arg, which is `false` by default and must be explicitly enabled.

When enabled, three new validation errors can be emitted by the script sanitizer:

* `CUSTOM_INLINE_SCRIPT`: A custom inline script is encountered.
* `CUSTOM_EXTERNAL_SCRIPT`: A custom external script is encountered (not loaded from the AMP CDN).
* `CUSTOM_EVENT_HANDLER_ATTR`: An event handler attribute is encountered (e.g. `onclick`).

Activating this Test Custom Scripts mini plugin and adding a `[custom_scripts]` shortcode to a post surfaces these validation errors. With this test plugin, when the scripts are removed, “Loading...” never changes in the resulting output, since the script was removed, and the `noscript` is unwrapped. The invalid markup for the validation errors can instead be marked as kept.

The effect of keeping a custom script is as follows:

* A `data-ampdevmode` attribute is added to prevent removal by the `AMP_Tag_And_Attribute_Sanitizer`.
* The `noscript` unwrapping behavior is disabled since this could likely result in duplication of JS and non-JS functionality.
* … `TransformedIdentifier` transformer to be configured amp-toolbox-php#319.)
* `img` elements are used instead of `amp-img`, since AMP validation is not going to happen anyway. See Experiment: Add native `img` opt-in to prevent conversion to `amp-img`/`amp-anim` #6518.
* … `action-xhr`, as there may very well be scripts that are adding custom form validation logic on the page. See Add opt-in to prevent `POST` forms from being converted to `amp-form` (with `action-xhr`) #6527.

Note: The `AMP_Script_Sanitizer` is moved to the beginning of the list of sanitizers because it's likely that keeping custom scripts will require behavior changes in other sanitizers beyond disabling tree shaking in the style sanitizer and using native images. The ability of a sanitizer to change the behavior of subsequently-run sanitizers is enabled by the new `AMP_Base_Sanitizer::update_args()` method, which allows the arguments to be updated.

Unwrapping `noscript` elements

As noted above, when custom scripts are kept the `noscript` elements do not get unwrapped, to avoid duplicated functionality (JS and non-JS fallback) from appearing on the page.

There is also now an `unwrap_noscripts` sanitizer argument that allows unwrapping to be turned off entirely, although it is enabled by default.

Lastly, when an individual `noscript` element has a `data-amp-no-unwrap` attribute, it will be selectively skipped from being unwrapped. Fixes #6030.

Checklist