Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use output buffer and HTML tag processor to inject directives on BODY tag for full-page client-side navigation #61212

Merged
merged 5 commits into from
May 27, 2024

Conversation

westonruter
Copy link
Member

@westonruter westonruter commented Apr 29, 2024

What?

This implements a resolution for this task from #60951:

  • Explore how to modify the body tag without the hack: link.

Why?

As discussed, the code is currently injecting an additional <body> tag after the BODY has already been opened, like so:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
  </head>
  <body>
    <h1>
      Hello World!
    </h1>
    <body data-wp-context="{}" data-wp-interactive="core/experimental">
  </body>
</html>

Surprisingly (to me at least), this actually works:

image

Maybe I shouldn't be surprised given HTML's loose parsing rules. But I can imagine other plugins that try to parse the HTML to, for example, apply various optimizations would get might confused when encountering multiple BODY tags.

The attributes could rather get injected on the existing BODY tag instead with the HTML Tag Processor. I see this is not currently possible since the BODY tag is hard-coded in template-canvas.php. This brings up the question again of output buffering for the entire template (Core-43258). It's something I've been working on in the context of the Performance team's Optimization Detective plugin which output buffers the entire template and then uses HTML Tag Processor to do optimizations.

How?

This PR copies the same output-buffering approach taken from the Optimization Detective plugin. Since there is no existing filter for the rendered template output, it starts an output buffer immediately before the template begins rendering. This is at the template_include filter.

Testing Instructions

  1. Enable the "Enable full page client-side navigation using the Interactivity API" experiment.
  2. Go to the frontend.
  3. Make sure that the data-wp-interactive and data-wp-context attributes appear on the BODY. (There should be only one <body> start tag when looking at the HTML source.)
  4. Ensure that client-side navigation is working as expected.

@dmsnell
Copy link
Member

dmsnell commented Apr 29, 2024

Surprisingly, this actually works

Actually this shouldn't be surprising because it's exactly how HTML5 specifies that the BODY and HTML tags work.

for each attribute on the token, check to see if the attribute is already present on the body element (the second element) on the stack of open elements, and if it is not, add the attribute and its corresponding value to that element.

While you highlight that downstream parsers may not pick up the right set of attributes for the BODY tag, every browser should, and this isn't "HTML's loose parsing rules" at play, but HTML's extremely deterministic parsing rules 😉. It's entirely reliable to add attributes to a BODY element by appending multiple BODY tags. It may not be ideal, but it's possible (this is one of the challenges with "retroactive changes" in the HTML API, because we may miss that a BODY has a given class name until we're deep into the document).

Copy link

github-actions bot commented Apr 29, 2024

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.

Co-authored-by: westonruter <[email protected]>
Co-authored-by: dmsnell <[email protected]>
Co-authored-by: cbravobernal <[email protected]>

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

Copy link
Member

@dmsnell dmsnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like where this is going, but I want to point out too that we can easily test this in Core in combination with Gutenberg PRs via the Playground.

Would it be workable to start by exploring a solution in Core? I've wanted to build this new pipeline anyway and we could potentially pioneer it through this need.

Was thinking less of gutenberg_template_output_buffer and more like one final pass by WordPress that includes hooks for things like on_tag__body and on_attribute__href. I don't know ohw these would all go or perform, but in this case the code could be something more like the following…

add_action( 'template_include', static function () {
	add_action( 'on_tag__body', static function ( $processor ) {
		// Check if these are already set...
		$processor->set_attribute( 'data-wp-…', … );
		$processor->set_attribute( 'data-wp-…', … );
	}, 1, 10 );
}, 0, PHP_INT_MAX );

I'm not as much of a fan of actions and filters in performance-sensitive code like this; maybe there's another option to explore. Still, the idea is of hooking into HTML events instead of creating content-oriented filter and actions.

lib/experimental/full-page-client-side-navigation.php Outdated Show resolved Hide resolved
);
return $passthrough;
}
add_filter( 'template_include', '_gutenberg_buffer_template_output', PHP_INT_MAX );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't an action just a filter whose return value is ignored? we could eliminate the $passthrough if we made this add_action() couldn't we?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, add_action() defined as an alias:

function add_action( $hook_name, $callback, $priority = 10, $accepted_args = 1 ) {
	return add_filter( $hook_name, $callback, $priority, $accepted_args );
}

So if the callback doesn't return a value, then the filter is borked. I just tested:

diff --git a/lib/experimental/full-page-client-side-navigation.php b/lib/experimental/full-page-client-side-navigation.php
index 259517cfc84..452abee6c63 100644
--- a/lib/experimental/full-page-client-side-navigation.php
+++ b/lib/experimental/full-page-client-side-navigation.php
@@ -68,12 +68,8 @@ add_filter( 'gutenberg_template_output_buffer', '_gutenberg_add_client_side_navi
  *          } elseif ( current_user_can( 'switch_themes' ) ) {
  *
  * @link https://core.trac.wordpress.org/ticket/43258
- *
- * @param string $passthrough Value for the template_include filter which is passed through.
- *
- * @return string Unmodified value of $passthrough.
  */
-function _gutenberg_buffer_template_output( string $passthrough ): string {
+function _gutenberg_buffer_template_output() {
 	ob_start(
 		static function ( string $output ): string {
 			/**
@@ -85,6 +81,5 @@ function _gutenberg_buffer_template_output( string $passthrough ): string {
 			return (string) apply_filters( 'gutenberg_template_output_buffer', $output );
 		}
 	);
-	return $passthrough;
 }
-add_filter( 'template_include', '_gutenberg_buffer_template_output', PHP_INT_MAX );
+add_action( 'template_include', '_gutenberg_buffer_template_output', PHP_INT_MAX );

The result is an empty page because the if ( $template ) condition isn't entered here: https://github.com/WordPress/wordpress-develop/blob/204a1bbf4e5f22b07a93c1f4a0b12bdd65d6483f/src/wp-includes/template-loader.php#L105-L112

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

darn. you've shattered my understanding of this. I guess do_action() simply ignores the return value, but apply_filter() doesn't care if something was added as a filter or an action.

@westonruter
Copy link
Member Author

westonruter commented Apr 29, 2024

Would it be workable to start by exploring a solution in Core? I've wanted to build this new pipeline anyway and we could potentially pioneer it through this need.

@dmsnell For core, I was thinking this would land in a patch like as follows:

diff --git a/wp-includes/template-loader.php b/wp-includes/template-loader.php
index 0fd08545..119df0f3 100644
--- a/wp-includes/template-loader.php
+++ b/wp-includes/template-loader.php
@@ -102,6 +102,9 @@ if ( wp_using_themes() ) {
 	 * @param string $template The path of the template to include.
 	 */
 	$template = apply_filters( 'template_include', $template );
+
+	ob_start( 'wp_template_output_buffer_callback' );
+
 	if ( $template ) {
 		include $template;
 	} elseif ( current_user_can( 'switch_themes' ) ) {

Where wp_template_output_buffer_callback() would be defined as:

function wp_template_output_buffer_callback( string $output ): string {
	/**
	 * Filters the template output buffer prior to sending to the client.
	 *
	 * @param string $output Output buffer.
	 * @return string Filtered output buffer.
	 */
	return (string) apply_filters( 'wp_template_output_buffer', $output );
}

(Aside: This also needs to account for non-HTML responses. Done in 300d9e3.)

But you're saying the raw HTML output shouldn't be filterable at all, but only be accessible via the HTML Tag Processor? I will note that there are other use cases for being able to access the raw response such as via a filter which are noted in Core-43258, for example caching plugins and optimization plugins. Currently all such plugins have to reinvent the wheel to do their own output buffering of the entire response.

@westonruter westonruter force-pushed the update/client-side-navigation-directive-addition branch from 04171bf to 300d9e3 Compare April 29, 2024 23:50
return (string) $p . '<body data-wp-interactive="core/experimental" data-wp-context="{}">';
function _gutenberg_add_client_side_navigation_directives( $response_body ) {
$is_html_content_type = false;
foreach ( headers_list() as $header ) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In WordPress/performance#1189 I've improved on the testability of this by prepending 'Content-Type: ' . ini_get( 'default_mimetype' ) to this list.

Copy link
Contributor

@cbravobernal cbravobernal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good to me. It is experimental and won't be included in version 6.6, but it's better not to leave this stalled for much longer.

@cbravobernal cbravobernal merged commit 58b454c into trunk May 27, 2024
63 checks passed
@cbravobernal cbravobernal deleted the update/client-side-navigation-directive-addition branch May 27, 2024 12:01
@github-actions github-actions bot added this to the Gutenberg 18.5 milestone May 27, 2024
@dmsnell
Copy link
Member

dmsnell commented May 28, 2024

But you're saying the raw HTML output shouldn't be filterable at all, but only be accessible via the HTML Tag Processor? I will note that there are other use cases for being able to access the raw response such as via a filter which are noted in Core-43258, for example caching plugins and optimization plugins. Currently all such plugins have to reinvent the wheel to do their own output buffering of the entire response.

No that's not it. What I was saying is that I want to find an interface that meets peoples' needs in a way that's easier, more reliable, and more performant than when everyone does their own filtering.

Specifically, if we can turn 20x+ iterations over all the content into one iteration with 20x+ filters for that "chunk" or "thing" then we can potentially win a big performance improvement. For example, instead of filtering raw HTML, if Core exposes a way to filter attribute values, given the tag it's on and the content, then maybe tons of plugins would voluntarily remove their own filtering code.

function add_blank_opener( $tag_name, $attribute_name, $value ) {
	return `{$value} _blank`;
}

add_filter(
	'html_attribute__a__target',
	'add_blank_opener',
	10,
	3
);

@gziolo
Copy link
Member

gziolo commented May 28, 2024

Specifically, if we can turn 20x+ iterations over all the content into one iteration with 20x+ filters for that "chunk" or "thing" then we can potentially win a big performance improvement

It would be great to measure the impact of running apply_filters() on every HTML token, as the numbers could quickly add up for larger documents even when they would end up only checking if the filters are defined. At the same time, I understand the reasoning as some of these operations are performed differently today on the same HTML partials or even full body.

@dmsnell
Copy link
Member

dmsnell commented May 28, 2024

@gziolo it will be measured extensively 😉

though I predict we may not use filters directly for performance reasons.

still, from what I can tell, the existing situation is still probably worse because every filter splits apart the entire document and iterates every token anyway. We’ll probably skip lots of work in the worst case

carstingaxion pushed a commit to carstingaxion/gutenberg that referenced this pull request Jun 4, 2024
… tag for full-page client-side navigation (WordPress#61212)

* Inject client-side navigation directives via Tag Processor and output buffering

* Fix typing for render_block phpdoc

* Eliminate use of render_block filter

* Remove copy-pasta since tag

Co-authored-by: dmsnell <[email protected]>

* Ensure client-side directives are only added to HTML response bodies

---------

Co-authored-by: westonruter <[email protected]>
Co-authored-by: dmsnell <[email protected]>
Co-authored-by: cbravobernal <[email protected]>
@nextend
Copy link

nextend commented Jun 5, 2024

@westonruter We use output buffering in our 900k+ plugin and based on my experience I suggest the following changes:

<?php
function _gutenberg_buffer_template_output( string $passthrough ): string {
	ob_start(
		static function ( string $output, ?int $phase): string {

			if ($phase & PHP_OUTPUT_HANDLER_FINAL) {
				/**
			 	* Filters the template output buffer prior to sending to the client.
			 	*
			 	* @param string $output Output buffer.
				 * @return string Filtered output buffer.
				 */
				return (string) apply_filters( 'gutenberg_template_output_buffer', $output );
			}

			return $output;
		}
	);
	return $passthrough;
}

As I remember these are related when ob_clean and ob_flush called during the output buffer is open. It skips the processing of the output when its dropped.

patil-vipul pushed a commit to patil-vipul/gutenberg that referenced this pull request Jun 17, 2024
… tag for full-page client-side navigation (WordPress#61212)

* Inject client-side navigation directives via Tag Processor and output buffering

* Fix typing for render_block phpdoc

* Eliminate use of render_block filter

* Remove copy-pasta since tag

Co-authored-by: dmsnell <[email protected]>

* Ensure client-side directives are only added to HTML response bodies

---------

Co-authored-by: westonruter <[email protected]>
Co-authored-by: dmsnell <[email protected]>
Co-authored-by: cbravobernal <[email protected]>
@westonruter
Copy link
Member Author

@nextend Thank you. I've opened #62770 to implement this.

@westonruter
Copy link
Member Author

@nextend I did quite a bit more research and I found that instead of checking for whether it is in the PHP_OUTPUT_HANDLER_FINAL phase, it should rather open the output buffer without the PHP_OUTPUT_HANDLER_FLUSHABLE flag and then short-circuit if the $phase is PHP_OUTPUT_HANDLER_CLEAN. See WordPress/performance#1317 (comment). This allows for the template to call ob_clean() as expected, and for ob_flush() to have no effect and thus to not result in fragments being sent through the filter which expects an entire document.

Comment on lines +88 to +98
ob_start(
static function ( string $output ): string {
/**
* Filters the template output buffer prior to sending to the client.
*
* @param string $output Output buffer.
* @return string Filtered output buffer.
*/
return (string) apply_filters( 'gutenberg_template_output_buffer', $output );
}
);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on WordPress/performance#1317, I believe this should now rather be improved as follows:

Suggested change
ob_start(
static function ( string $output ): string {
/**
* Filters the template output buffer prior to sending to the client.
*
* @param string $output Output buffer.
* @return string Filtered output buffer.
*/
return (string) apply_filters( 'gutenberg_template_output_buffer', $output );
}
);
ob_start(
static function ( string $output ): string {
// When the output is being cleaned (e.g. pending template is replaced with error page), do not send it through the filter.
if ( ( $phase & PHP_OUTPUT_HANDLER_CLEAN ) !== 0 ) {
return $output;
}
/**
* Filters the template output buffer prior to sending to the client.
*
* @param string $output Output buffer.
* @return string Filtered output buffer.
*/
return (string) apply_filters( 'gutenberg_template_output_buffer', $output );
},
0,
PHP_OUTPUT_HANDLER_STDFLAGS ^ PHP_OUTPUT_HANDLER_FLUSHABLE
);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Feature] Interactivity API API to add frontend interactivity to blocks. [Packages] Interactivity /packages/interactivity [Type] Experimental Experimental feature or API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants