Skip to content
This repository has been archived by the owner on Dec 23, 2020. It is now read-only.

Optimize performance of regex for adding loading attribute #2

Closed
felixarntz opened this issue Jan 16, 2020 · 5 comments
Closed

Optimize performance of regex for adding loading attribute #2

felixarntz opened this issue Jan 16, 2020 · 5 comments
Labels
enhancement New feature or request

Comments

@felixarntz
Copy link
Member

felixarntz commented Jan 16, 2020

As previously pointed out by @azaozz, we should make sure we minimize the number of regular expressions used to modify content (via filters such as the_content).

A couple of things:

  • At a minimum, wp_lazy_load_content_media() should no longer run a preg_match() for each match from preg_replace_callback(). The preg_replace_callback() regular expression should be modified to cover both parts.
  • We should consider combining this regex with the one from core's wp_make_content_images_responsive().
    • In case we decide to only support img tags, we could use that as is.
    • In case we want to keep the current, more flexible approach, we could replace the img with (' . implode( '|', $tags ) . ') to cover different tags. In this case, the callback logic would need to account for that and only perform replacements for the right tags.
  • The regex in wp_make_content_images_responsive() only looks at img , not img(\s) like we do. We added that based on review by @westonruter - I think it makes the regex more robust, but it could also be considered overkill, given how WordPress image tags are typically printed.
  • If we combine both regexes, it should probably be in a new more generically named function that then calls the more specific functions (responsive-maker, lazy-loader) to run logic on the matched tags.

It would be great to do some profiling on how potential changes here would affect performance for content that contains a very high number of relevant tags.

@felixarntz felixarntz added the enhancement New feature or request label Jan 16, 2020
@azaozz
Copy link
Contributor

azaozz commented Jan 17, 2020

The regex in wp_make_content_images_responsive() only looks at img , not img(\s) like we do.

Generally (historically) having line breaks between HTML tag attributes doesn't work in user added content (because wpautop). It is also mentioned in the WP coding standards (I think) as being "not best practice" when outputting HTML from PHP. Wouldn't mind slowing things down a little bit by looking for \s instead of an empty space, but the regex for adding srcset and sizes has been in core for few years and I haven't seen "edge cases" where that is a problem :)

Also thinking that a tiny optimization would be to not capture the actual space character (\r, \n, \t, , etc.) and reinsert it afterwards. It is enough to look for its presence (so we don't match things that start with <img), but only replace <img with <img loading ="...". There's no need to replace/reinsert the space char.

@azaozz
Copy link
Contributor

azaozz commented Jan 18, 2020

If we combine both regexes, it should probably be in a new more generically named function that then calls the more specific functions (responsive-maker, lazy-loader) to run logic on the matched tags.

Right. Looked at this a bit. Ideally there should be a filter that runs on each <img> tag that is found, and replaces it with the filtered value (if different). That will let core and plugins filter or add other attributes besides srcset, sizes, and loading.

There are few "requirements" that make this somewhat more involved/a bit harder:

  • The regex should run only when there are filters that need it. This is easy to add but may introduce some filter timing issues/inconsistency. Also it doesn't seem possible to determine if the regex should run in certain cases. For example adding loading to post_content but not adding to comment_text or widget_text_content would still run it for comments and text widgets as there will be a filler using it.
  • Adding multiple attributes from the same filter should allow/make it easy for plugins to remove one but keep the others. This would need additional filters/actions while adding each attribute, i.e. shouldn't depend only on this filter.
  • Ideally the same filter should run when directly adding img tag attributes, for example from wp_get_attachment_image(). However the format there is different: string with the <img ...> tag vs. an associative array of attributes name => value.
  • In order for this to be useful, the (eventual) filter should provide more context.

@azaozz
Copy link
Contributor

azaozz commented Jan 18, 2020

Also, in order to be able to use that new filter for adding srcset and sizes, we will need to pass an array with all found img tags, not filter individual tags one by one. That's because wp_make_content_images_responsive() does "cache warming" before getting each individual image's meta from the DB. It needs all attachment IDs beforehand.

@felixarntz
Copy link
Member Author

What we need to here partially depends on the outcome of #3. If loading is intended specifically for images, we can combine the regexes easily. If not, we would need to expand the regex to maybe look at a specific whitelist of tags so that not only img tags are covered by this.

As long as the filter that we expose is img-specific for now, this should be fine though. We'd still be able to modify the regex in the future as it is internal. If the regex becomes e.g. (img|iframe), then we could internally handle it and simply add another filter for modifying iframe tags.

@felixarntz
Copy link
Member Author

Done via #23

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants