Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eliminate synchronous loopback requests in favor of asynchronously re-checking AMP validity #2069

Closed
westonruter opened this issue Apr 4, 2019 · 33 comments · Fixed by #5741
Assignees
Labels
Enhancement New feature or improvement of an existing one Groomed P1 Medium priority WS:Core Work stream for Plugin core
Milestone

Comments

@westonruter
Copy link
Member

westonruter commented Apr 4, 2019

Feature description

Every time a post is saved in the editor, the AMP plugin performs a loopback request to that post's permalink on the frontend to then check it for validation errors. In the block editor, this loopback request is doing during the handling of the PUT request; this is done so that the PUT response can include all of the validation error data, but it also means that the PUT request takes much longer then it would normally.

In the classic editor this loopback request is performed when the the whole page is reloading after saving, and it is also done when activating a new plugin and switching between template modes. The loopback requests here also slow down the page load time, but with the benefit being the validation results can be easily displayed immediately in the page response.

Nevertheless, it would be much better to switch to doing asynchronous loopback requests. When updating a post in the block editor, a successful PUT request should then trigger another request to a new REST API endpoint dedicated to URL validation. So clicking the Save button would then kick off a new loading message and spinner that could say, “Checking AMP validity”. When this returns, any AMP validation messages would be added/removed in the block interface.

Similarly, when activating a new plugin, this loopback request should be performed via Ajax on the plugin screen; an admin notice can be added which contains, “Re-checking AMP validity…” with the notice then updated to he success or error based on the result.


Do not alter or remove anything below. The following sections will be managed by moderators only.

Acceptance criteria

  • Switch to asynchronous loopback requests
  • When updating a post in the block editor, a successful PUT request should trigger another request to a new REST API endpoint dedicated to URL validation.
  • Clicking the Save button should kick off a new loading message and spinner that could say, “Checking AMP validity”.
    • When this returns, any AMP validation messages should be added/removed in the block interface.

Implementation brief

  • Consider adding a new amp_uuid block attribute for all blocks which persists across page loads. This must be a UUID that remains unique even after duplicating a block. If such a UUID were in place, then it would be possible to store the amp_uuid with the block attributes in the validation error, and this could then be used to connect a validation error to the corresponding block in the editor.

QA testing instructions

Demo

Changelog entry

@westonruter westonruter added this to the v2.0 milestone Jul 15, 2019
@swissspidy swissspidy added the Enhancement New feature or improvement of an existing one label Jul 16, 2019
@archon810
Copy link

Can this solution also support async validation for the classic editor? We have specifically disabled the block editor on our site using https://wordpress.org/plugins/classic-editor/, and automatic validation essentially doubles the already slow Update/Publish action.

@archon810
Copy link

I just did some benchmarking on our dev server, out of a 12 second run after hitting Update, 9s is a sync call to AMP validate.

So to our writers, launching AMP means a 4x increase in wait time, which is excruciating.

Please let's change it to async ASAP.

@westonruter
Copy link
Member Author

westonruter commented Sep 11, 2019

Try this extension to the AMP plugin: https://gist.github.com/westonruter/31ac0e056b8b1278c98f8a9f548fcc1a

This is a workaround for sites encountering slowness when updating posts due to the synchronous loopback requests which happen at post save to report back validation errors to the editor; instead of running at the save_post action, the post is queued for validation in an immediate event via WP Cron. The AMP plugin intends to make this process asynchronous, but until that lands this plugin can be used. A side effect of turning off synchronous validation is that validation error warnings are no longer displayed in the editor. To see the errors, you have to go to the AMP Validated URLs screen.

@archon810
Copy link

Are you sure they won't be displayed? Because for me right now, they display every time I reload the edit page, not just after pressing update. They are persistent.

@westonruter
Copy link
Member Author

I do not see any warning, no. Not initially, not after update, not after reload.

The plugin prevents the admin notice from being added via this code:

remove_action( 'edit_form_top', [ 'AMP_Validation_Manager', 'print_edit_form_validation_status' ] );

If I comment that line out, then I see the admin notice again.

@archon810
Copy link

Oh, I didn't know you added that into this particular plugin, as it's a related feature but not necessarily in the same scope.

We discussed earlier hiding these from non-admin users. I think the plugin should just do that as a compromise. It's still valuable to see those warnings for admins.

@westonruter
Copy link
Member Author

See the plugin source for the amp_async_validation filter. There is some example code you can use to disable the async validation conditionally, such as when the user is an admin.

Hiding of the notices is really bound up with the async validation, because as soon as async validation is performed then the notices are no longer accurate.

@archon810
Copy link

I'd be OK with the notices showing and being delayed if the Recheck button continued to work and be synchronous.

@westonruter
Copy link
Member Author

Notices with stale information will be frustrating. If there is a script in the content, and a validation error notice appears because of it, removing the script should cause the notice to go away immediately after saving. This is only possible with synchronous validation. So when using async validation, the notice needs to be suppressed, at least in this little workaround plugin.

Figuring out how to do async validation and show a notice at the same time... that is what this issue is all about. Synchronous validation makes it easy to show the notice, because the currently saved state is the same state that was validated. But if async, then the user could make changes while validation is being performed, and this can cause the validated state and the editor state to be out of sync. This is particularly a challenge for the block editor, since we try to show validation warning notices contextually with each block.

@archon810
Copy link

Perhaps the solution is a metabox that has the live validation status and can refresh it via Ajax? Or a custom notice at the top that can do the same?

If you don't want to change much, what about only showing the notice after the cron job ran? If it's pending, suppress?

We're using the classic editor, not the block editor, so I'd love the solution to work for classic as well.

@archon810
Copy link

archon810 commented Sep 12, 2019

Mentioning as per Slack chat with @westonruter: currently, AMP_Validation_Manager::post_supports_validation() doesn't account for autosaves, which means the validator unnecessarily runs in the background for every autosave, which, if you leave a post open for a while, can amount to hundreds of autosaves and hundreds of wasted server resources.

Validation should only run on explicit post saves, not background ones. AMP_Validation_Manager::post_supports_validation() already filters out most of them in this code:

$supports_validation = (
// Skip if the post type is not viewable on the frontend, since we need a permalink to validate.
is_post_type_viewable( $post->post_type )
&&
! wp_is_post_autosave( $post )
&&
! wp_is_post_revision( $post )
&&
'auto-draft' !== $post->post_status
&&
'trash' !== $post->post_status
);

@westonruter's suggestion is to add a check for defined( 'DOING_AUTOSAVE' ) && DOING_AUTOSAVE.

@westonruter
Copy link
Member Author

This is something we'll need to look into further. The ideal is that the user is provided feedback as soon as possible after they have added something that is not AMP compatible. The idea behind autosave is that, especially in the context of Gutenberg, warnings can be added shortly after a block has been added which does something not valid in AMP.

In any case, I've updated the AMP async validation plugin to prevent validation from being queued during autosave requests, while also preventing validation for non-published posts (since WP Cron is not authenticated as a user and thus can't access non-published posts on the frontend): https://gist.github.com/westonruter/31ac0e056b8b1278c98f8a9f548fcc1a

@westonruter
Copy link
Member Author

For doing async validation of draft posts, we'll need to find a secure way for WP Cron to make requests as the authenticated user who last modified the post.

@archon810
Copy link

Perhaps wp-ajax is the right way to go here?

@schlessera
Copy link
Collaborator

The request passed to WP Cron could include a token that can only be validated using a private key information the user has.

@westonruter westonruter added the P1 Medium priority label Nov 26, 2019
@westonruter
Copy link
Member Author

Continuing a train of thought from an in-person conversation with @adamsilverstein

The issue with limiting validation results to autosave is that we need to handle the case of getting validation results when someone hits save draft or publish and no autosave has happened. We need to get the validation results and show them with the blocks they relate to. At the same time we shouldn't necessarily slow down the REST API call to save a post, especially for non-publish actions. Maybe that means triggering an autosave when the pre-publish flow is engaged, and prevent publishing until that save has completed and we have obtained the validation results. In other words, validation should be asynchronous and unblocking for saving posts in the editor UI, with the only exception being a soft-block during the presentation of the pre-publish panel. The pre-publish panel should display validation results once they have been asynchronously-obtained and block publishing until they have been addressed (marked as removed or kept). Certain limited user roles (e.g. author) would not necessarily see such validation UI, but that is to be tackled as part of #2673.

A closely related issue here is ensuring validation results can be applied to a document even after changes have been made since the time of getting the validation results. This is the idea above of adding a UUID to every block which is persistent, though which is ensured to be unique when duplicating blocks and when adding the same reusable block to the page. By keying block validation errors to the UUID we will no longer need to rely on the block index, which is brittle.

@westonruter
Copy link
Member Author

And continued further…

The synchronous validation being performed here is the loopback request that is performed to check the saved post on the frontend.

First of all, the AMP plugin registers a new amp-validity REST field for all supported posts:

register_rest_field(
$object_types,
self::VALIDITY_REST_FIELD_NAME,
[
'get_callback' => [ __CLASS__, 'get_amp_validity_rest_field' ],
'schema' => [
'description' => __( 'AMP validity status', 'amp' ),
'type' => 'object',
],
]
);

When a PUT request is being done, it will initiate the a loopback request to obtain the validation results of the just-saved post on the frontend and return them in the PUT response body:

/**
* Adds a field to the REST API responses to display the validation status.
*
* First, get existing errors for the post.
* If there are none, validate the post and return any errors.
*
* @param array $post_data Data for the post.
* @param string $field_name The name of the field to add.
* @param WP_REST_Request $request The name of the field to add.
* @return array|null $validation_data Validation data if it's available, or null.
*/
public static function get_amp_validity_rest_field( $post_data, $field_name, $request ) {
if ( ! current_user_can( 'edit_post', $post_data['id'] ) || ! self::has_cap() || ! self::post_supports_validation( $post_data['id'] ) ) {
return null;
}
$post = get_post( $post_data['id'] );
$validation_status_post = null;
if ( in_array( $request->get_method(), [ 'PUT', 'POST' ], true ) ) {
if ( ! isset( self::$posts_pending_frontend_validation[ $post->ID ] ) ) {
self::$posts_pending_frontend_validation[ $post->ID ] = true;
}
$results = self::validate_queued_posts_on_frontend();
if ( isset( $results[ $post->ID ] ) && is_int( $results[ $post->ID ] ) ) {
$validation_status_post = get_post( $results[ $post->ID ] );
}
}
if ( empty( $validation_status_post ) ) {
$validation_status_post = AMP_Validated_URL_Post_Type::get_invalid_url_post( amp_get_permalink( $post->ID ) );
}
$field = [
'results' => [],
'review_link' => null,
];
if ( $validation_status_post ) {
$field['review_link'] = get_edit_post_link( $validation_status_post->ID, 'raw' );
foreach ( AMP_Validated_URL_Post_Type::get_invalid_url_validation_errors( $validation_status_post ) as $result ) {
$field['results'][] = [
'sanitized' => AMP_Validation_Error_Taxonomy::VALIDATION_ERROR_ACK_ACCEPTED_STATUS === $result['status'],
'error' => $result['data'],
'status' => $result['status'],
'term_status' => $result['term_status'],
'forced' => $result['forced'],
];
}
}
return $field;
}

This slows down the request a lot. The whole thing needs to be refactored/revisited.

Again, if each block has a UUID this will help guard against a async delay in getting the validation results from causing validation errors to be displayed with the wrong blocks, since the block indices would no longer be used to match up the blocks with the errors.

Validation should no longer be performed in the REST API field callback. In fact, the amp-validity field should be removed entirely. Instead, there should be a REST API endpoint for validating any URL. This validation endpoint should be queried immediately after an autosave revision is saved, and the results can then update the UI with any changes for validation. When doing a Publish action, an autosave should be triggered followed by obtaining the validation results in a subsequent request… once these both complete, the pre-publish panel can display the validation results. (In the future, for some user roles, the presence of validation errors should perhaps block the post from being published. Again #2673.) This would be applicable to any template mode where validation results are provided. (Reader mode should get validation results as well: #2044 (comment)).

@westonruter
Copy link
Member Author

When doing a Publish action, an autosave should be triggered followed by obtaining the validation results in a subsequent request… once these both complete, the pre-publish panel can display the validation results.

One concern I have about this is it potentially means clicking Publish will require 3 HTTP requests (1 autosave, 1 validation, 1 publish) whereas at present only 1 HTTP request is required (the one to both update and publish at once). I suppose the latency is fine when the pre-publish checks are present, but if the user has turned them off then publishing should not be blocked by any other requests. In such case, validation results would appear in the UI after the user successfully published the post.

@adamsilverstein
Copy link
Collaborator

How about hooking into autosaves directly and adding the validation data to their response?

How/where is the validation actually performed against a URL, can you explain how this runs and I can try hooking up to autosaves and the autosave URL. Then we can show validation status on every autosave, and potentially lock publishing until validation passes.We can also trigger a recheck with a button that triggers an autosave.

Adding a UUID for each block makes sense if we need that, I'm still unclear why core GB doesn't have this in place already. We should be able to add a custom property via filters and set a UUID that way, I can try that.

@johnwatkins0
Copy link
Contributor

johnwatkins0 commented Dec 4, 2020

@westonruter After #5515 is merged, synchronous loopback requests will be eliminated for users with dev tools turned off. They'll still exist for users with dev tools turned on. Just wanted to confirm with you that we should still go forward with this, correct?

I think it will be fairly straightforward to create the REST endpoint and send a JS fetch request after post save in the editor. I think we'd also want to merge #5589 first, though, because we'd want to show the "Re-checking" state in the new sidebar, and we'd probably use the sidebar's data store as well. #5589 is already pretty huge, so this would be better as a followup.

@westonruter
Copy link
Member Author

After #5515 is merged, synchronous loopback requests will be eliminated for users with dev tools turned off. They'll still exist for users with dev tools turned on. Just wanted to confirm with you that we should still go forward with this, correct?

Synchronous loopback requests are already turned off for users who have DevTools turned off, no? No validation happens at all when a user has DevTools turned off.

And yes, for the moment users with DevTools enabled will continue to have synchronous validation being performed. To make validation asynchronous, I think we need to first address the concern of relying on the block index position in the content to identify which block goes with which errors. We discussed maybe using a UUID, but that can be problematic in that it would pollute the post_content.

@johnwatkins0
Copy link
Contributor

johnwatkins0 commented Dec 4, 2020

I'm going to test out a UUID idea on the editor sidebar branch and will follow up. I have had some bugginess on that branch with posts with a lot of blocks. The index is sometimes off.

@westonruter
Copy link
Member Author

Once this is implemented along with the plugin sidebar (#5589), then we can default to enabling DevTools for administrators regardless of template mode. This will revert part of #5333, specifically $enabled will by true by default here:

// Disable Developer Tools by default when in Reader mode.
$enabled = AMP_Theme_Support::READER_MODE_SLUG !== AMP_Options_Manager::get_option( Option::THEME_SUPPORT );

We disable DevTools by default in Reader mode because the UI is intrusive and it slows down saving of posts. But once these issues are solved, we can enable it again.

@johnwatkins0
Copy link
Contributor

I'm going to test out a UUID idea on the editor sidebar branch and will follow up. I have had some bugginess on that branch with posts with a lot of blocks. The index is sometimes off.

Promised followup: I have persistent block UUIDs working on the sidebar branch, and still testing it out. We obviously want to avoid causing Gutenberg block validation errors if the AMP plugin is turned off, so I'm double checking everything.

@westonruter
Copy link
Member Author

What about the issue of “polluting” block attributes with UUIDs? While most users won't notice this, anyone who uses the code editor would see a whole bunch of UUIDs all over the place, right?

@johnwatkins0
Copy link
Contributor

@westonruter That is exactly right.

Screen Shot 2020-12-09 at 1 45 27 PM

It does feel somewhat heavy handed to apply an override to every single block. Even though block order is currently unreliable, in working on the editor sidebar over several weeks I've only found it to be inaccurate once or twice. Maybe, on second thought, the amp_uuid is too heavy a solution for the problem.

When it comes to this issue, one thing we do know is that the core client IDs stay consistent within an editing session. Maybe instead of having a UUID that persists between editing sessions, we can do something like:

  1. On save, send a validation request to our new custom endpoint, and in the request include a list of the client IDs in the request POST data.
  2. In the REST response, return the errors with the client IDs.
  3. If any of the received errors includes a client ID that is no longer present in the editor, ignore it.

There's probably some holes to this, but I think a solution along these lines can be found. Let me know what you think -- removing the UUID update I made on the sidebar branch this morning (46a204e) would be no problem.

@westonruter
Copy link
Member Author

I think the client IDs are indeed the key here, but I think we don't even have to send them. What we need to know is what client IDs map to which block indices at the time of saving, right? That being the case, at the moment that a post is initiating a save request (or for the initial editor load), we can capture an ordered list of the client IDs for the blocks in the content. After the save request finishes, the request to validate the saved post can proceed. It then will return with validation errors with each including the block_content_index in the response. Then when deciding which block goes which which error, we can refer to that stored mapping of client ID to block_content_index. If a block no longer exists with a given client ID, then it would be skipped from being associated with the block. I believe this will solve our stability problem while also unblock us from performing asynchronous validation. What do you think?

@johnwatkins0
Copy link
Contributor

@westonruter Yes, I think we'll definitely be able to make something like that work when we have the validation REST endpoint. So for now I'm going to remove this UUID experiment from the sidebar branch and will switch to this issue immediately following the sidebar, so that hopefully both can be in 2.1.

@westonruter
Copy link
Member Author

westonruter commented Jan 7, 2021

This is something we'll need to look into further. The ideal is that the user is provided feedback as soon as possible after they have added something that is not AMP compatible. The idea behind autosave is that, especially in the context of Gutenberg, warnings can be added shortly after a block has been added which does something not valid in AMP.

Summarizing a discussion with @delawski regarding autosaves: I see there as being a continuum of how aggressive the validation requests could be performed in the lifecycle of a post. On the aggressive side, we could do validation for all autosaves even for auto-drafts. On the other end of the spectrum, we could limit validation exclusively to explicit save events for published posts.

Things have changed with the introduction of the plugin editor sidebar in #5589, which presents validation issues in the sidebar as opposed to inline warnings. The sidebar is not shown by default, and the user may not even have it pinned so its status icon (see screenshots in #5304 (comment)) may not be shown in the editor. The only way the user would see a validation error normally in the editor then is if the block were selected and the toolbar icon is displayed. (Note also in #5304 (comment) I've suggested that we show the AMP validation status in the Status & Visibility panel and the pre-publish panel.)

All this being said, I think the balance for when to do validation is at a point where the user would be expected to possibly see an issue when viewing the post on the frontend. And that is in two cases:

  1. They do an explicit save, regardless of the status (draft, private, publish).
  2. They click the preview button, at which point an autosave request is initiated before taking the user to the post on the frontend.
  3. 🆕 After making a change to the content and the post enters a dirty state, a notice is shown in the plugin sidebar indicating that the validation results are stale. A link is shown to update the results (in the same way as the Preview), which would initiate the autosave followed by the validation request. See Replace synchronous URL validation in editor with async validation #5741 (comment).

Once the user goes to view the post on the frontend, they may at that point notice something is broken with a block they added, and by the time they go back to the editor to investigate, the asynchronous AMP validation request will have finished and they'll see the validation issues when they click on the block.

@westonruter
Copy link
Member Author

QA Passed

On 2.0.11, the POST request to the REST API to save a post takes 3.58 seconds.

On 2.1, the same POST request takes only 459 ms. This shows that indeed validation is not being performed during the save request. This is also seen below: the POST request to save the post is followed immediately a POST request to validate the URL, with the first request taking <500ms and the second taking much longer. The AMP Validation sidebar shows a separate spinner while that validation request is happening:

async-validation.mov

@westonruter westonruter mentioned this issue Oct 26, 2021
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement New feature or improvement of an existing one Groomed P1 Medium priority WS:Core Work stream for Plugin core
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants