Sanitize entire HTML output when theme support is present #888
…-tag Remove erroneous additional allowance of script[type=text/javascript]
includes/class-amp-theme-support.php
Outdated
self::$amp_scripts = array_merge( self::$amp_scripts, $scripts );
self::$amp_styles = array_merge( self::$amp_styles, $styles );

$output = preg_replace( '#(<body.*?>)(.+)(</body>)#si', '$1' . $sanitized_inner_body . '$3', $output );
@westonruter here we're replacing the body with the sanitised body. This is because the sanitize method returns AMP_DOM_Utils::get_content_from_dom( $dom ). Wouldn't this be easier if the method returned $dom->saveHTML() when the content passed is a complete document? That would remove the need to replace the body after it was sanitized.
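As a side note on the body replacement in the diff above: with preg_replace(), any `$1` or `\2` sequence that happens to occur inside the sanitized body is treated as a backreference in the replacement string. A minimal sketch of one way around that, assuming a callback-based replacement is acceptable here (the sample markup and values are made up for illustration and are not from the PR):

```php
<?php
// Hypothetical page output and sanitized body, for illustration only.
$output               = '<html><head><title>T</title></head><body class="home"><p>old</p></body></html>';
$sanitized_inner_body = '<p>Price: $1 and \\2</p>'; // Contains text that looks like backreferences.

// Same pattern as in the diff, but the replacement is built in a callback,
// so the sanitized body is inserted literally and never parsed for $n or \n.
$result = preg_replace_callback(
	'#(<body.*?>)(.+)(</body>)#si',
	function ( $matches ) use ( $sanitized_inner_body ) {
		return $matches[1] . $sanitized_inner_body . $matches[3];
	},
	$output,
	1 // Replace only the first match.
);

echo $result, "\n";
```

Whether this matters in practice depends on what the sanitizer can emit; it is only relevant when the body text may contain a dollar sign or backslash followed by a digit.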
Yes, potentially. The sanitize method is better called sanitize_content. Better handling of sanitizing entire documents should be done.
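The two return shapes discussed in this thread (body content only versus the whole document) can be sketched with plain DOMDocument calls. This is an illustration under assumptions, not the plugin's implementation: the script removal below is a stand-in for real sanitization, and the variable names are hypothetical.

```php
<?php
// Hypothetical complete document, for illustration only.
$html = '<!DOCTYPE html><html><head><title>Demo</title></head>'
	. '<body><p>Hello</p><script>alert(1)</script></body></html>';

$dom = new DOMDocument();
libxml_use_internal_errors( true ); // Suppress parser warnings for this sketch.
$dom->loadHTML( $html );
libxml_clear_errors();

// Stand-in for sanitization: remove every script element.
// Copy the live node list first so removal does not disturb iteration.
foreach ( iterator_to_array( $dom->getElementsByTagName( 'script' ) ) as $script ) {
	$script->parentNode->removeChild( $script );
}

// Shape A: serialize only the children of body (content-style return value).
$body       = $dom->getElementsByTagName( 'body' )->item( 0 );
$inner_body = '';
foreach ( $body->childNodes as $child ) {
	$inner_body .= $dom->saveHTML( $child );
}

// Shape B: serialize the complete document, so no body replacement is needed afterwards.
$full_document = $dom->saveHTML();

echo $inner_body, "\n";
```

With shape A the caller still has to splice the result back into the original output; with shape B the serialized document can be used directly, at the cost of the head also passing through the DOM parser.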
@DavidCramer how about this: 876c22f
@westonruter That's nice, it makes way for a full doc later on. I see that you made a comment that the head needs updating to mandatory_parent_blacklist, so that makes sense now.
Other than that I'm happy with this.
…hod w/ static return with a plain static variable
…ze always returning body content
includes/class-amp-theme-support.php
Outdated
 * as otherwise elements from the HEAD could get added to the BODY.
 */
$sanitized_inner_body = AMP_DOM_Utils::get_content_from_dom( $dom );
$output = preg_replace( '#(<body.*?>)(.+)(</body>)#si', '$1' . $sanitized_inner_body . '$3', $output );

// Inject required scripts.
$output = preg_replace(
@westonruter get_amp_component_scripts doesn't require the argument anymore.
Awesome work @westonruter. This will help to move forward, thanks!
Hummm, something isn't working quite right. If I add this to a template:
It is not getting stripped out as expected.
Hmm. Testing as well.
…utput-buffer-sanitization
Awesome work @westonruter @DavidCramer. This is really exciting!
$original_html = trim( ob_get_clean() );
$sanitized_html = AMP_Theme_Support::finish_output_buffering( $original_html );

$this->assertContains( '<meta charset="utf-8">', $sanitized_html );
@westonruter @DavidCramer these tests could have a bit more coverage here. I believe it is OK to do that at a later stage, since sanitization for the other components (form etc.) is still WIP.
See #875.
This largely eliminates the need for amp_component_scripts entirely, because the AMP component scripts are discovered via the whitelist sanitizer in #882. So for example, the Form sanitizer in #871 can be used without needing to define its get_scripts() method.
For another PR:
head to sanitization.
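The idea of discovering component scripts from the sanitized markup, rather than having each sanitizer declare them, could be sketched roughly as follows. This is not the whitelist sanitizer from #882: the tag-name prefix check, the fixed 0.1 version, and the variable names are all assumptions made for illustration.

```php
<?php
// Hypothetical sanitized markup containing two kinds of AMP components.
$html = '<html><body><amp-carousel></amp-carousel><p>Text</p>'
	. '<amp-carousel></amp-carousel><amp-sidebar></amp-sidebar></body></html>';

$dom = new DOMDocument();
libxml_use_internal_errors( true ); // Custom elements trigger parser warnings; ignore them here.
$dom->loadHTML( $html );
libxml_clear_errors();

// Collect one script URL per distinct amp-* element found in the document.
// Assumption: every component is served as version 0.1, which is not true for all components.
$scripts = array();
foreach ( $dom->getElementsByTagName( '*' ) as $element ) {
	$name = $element->nodeName;
	if ( 0 === strpos( $name, 'amp-' ) ) {
		$scripts[ $name ] = sprintf( 'https://cdn.ampproject.org/v0/%s-0.1.js', $name );
	}
}

print_r( $scripts );
```

Keying the array by element name deduplicates repeated components, so each script would be injected once regardless of how many times the element appears.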