Skip to content

Commit

Permalink
Image Loading: Remove last_base_element_url_
Browse files Browse the repository at this point in the history
`blink::ImageLoader::last_base_element_url_` stores a snapshot of the
Document's base URL during UpdateFromElement() [1]. However, this
snapshot is not actually used to resolve the URL of the image request after the microtask checkpoint in DoUpdateFromElement(); rather,
relative URL resolution always happens with the Document's live base URL
mechanism [2] (more on this later).

Instead, the member exists solely to detect, at image insertion time
[3], if the Document's base URL has changed since the last time
UpdateFromElement() was called. The rationale for this used to be to
detect if a document's base URL changed before the image loading
microtask (DoUpdateFromElement()) was run, so that we could protect
against fetching the wrong image with a stale URL; see [4].

This was important because at the time, (before crrev.com/c/2103987)
ImageLoader used to store a snapshot of the resolved image URL in the
microtask state computed in UpdateFromElement(), so it was possible to
wind up in DoUpdateFromElement() with an image request URL that is stale
compared to the live Document's base URL (this was intentional, see
crbug.com/569760 & https://codereview.chromium.org/2105283002). The
solution [5] at the time was to detect, only during insertion, if the
ImageLoader's last known snapshot of the Document base URL is stale, so
that the image processing model can be re-entered, and the final DoUpdateFromElement() call is made with an up-to-date URL. This protects
against the very specific issue of:
  1. Creating an image (thus scheduling the microtask)
  2. Changing the Document's base URL
  3. The microtask that finally runs fetches an image with a stale URL

However, I have no idea how this could protect against the original bug
filed [4]; see
https://chromium-review.googlesource.com/c/chromium/src/+/3311225/comments/5ab9c13f_9c553169
where there is some discussion about how the original bug could have
behaved.

In any case, since then, crrev.com/c/2103987 has landed which ensures
that DoUpdateElement() always resolves relative URLs with the live
Document's base URL, so it is impossible for DoUpdateElement() to fetch
an outdated URL. This leaves us with the `last_base_element_url_` member
that is used as a discriminator to re-invoke the image processing model
upon insertion due to this member being stale compared to the Document's
base URL. This results in Chromium sometimes fetching the same exact
image twice (3-step scenario listed above), which makes this member
entirely obsolete dead weight.

Furthermore, the HTML Standard does not define general image insertion
as a "relevant mutation" [6] that would invoke #update-the-image-data,
and it certainly does not capture any base-URL-specific state in the
microtask machinery. After some discussion on
whatwg/html#7383, we've decided to align
Chromium with the specification, which prompts this CL.

Testing-wise, this CL is covered by:
 - image-load-reset-on-new-base.html
    - Created in response to [4]
 - svg-image-load-reset-on-new-base.html
    - Same as above, but the SVG variant
 - image-base-url.html
    - Added/moved in crrev.com/c/2103987
 - TODO: Add a test that ensures only one request is made
    - This test is important because while the others ensure that the
      final image is correct, we've discovered that some browsers are
      re-fetching the image due to base URLs changing before the image
      microtask is run, and sometimes that second fetch is using the
      correct URL, and sometimes it's wrong: whatwg/html#7383 (comment)

[1]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=646-647;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[2]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=724;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[3]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=297-302;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[4]: https://crbug.com/897545
[5]: https://crrev.com/c/1296506
[6]: https://html.spec.whatwg.org/multipage/images.html#relevant-mutations

Bug: 569760,897545
Change-Id: If6a93c091bfaea2944489e3660adc695d04c3e72
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3311225
Reviewed-by: Fredrik Söderquist <[email protected]>
Commit-Queue: Dominic Farolino <[email protected]>
Cr-Commit-Position: refs/heads/main@{#1099380}
  • Loading branch information
domfarolino authored and marcoscaceres committed Mar 28, 2023
1 parent 4f4b400 commit 882f4ce
Showing 1 changed file with 57 additions and 17 deletions.
74 changes: 57 additions & 17 deletions html/semantics/embedded-content/the-img-element/image-base-url.html
Original file line number Diff line number Diff line change
Expand Up @@ -7,24 +7,64 @@
</head>

<body>
<img id="img">
<script>
const t = async_test("An image request's parsed URL should be affected by a " +
"dynamically-inserted <base>, if it was inserted before " +
"the image request microtask runs");

t.step(() => {
const elm = document.getElementById('img');
elm.src = 'resources/image.png';
elm.onload = t.unreached_func("The image should have failed to load, as " +
"the request URL should be affected by the " +
"<base> element");
elm.onerror = t.step_func_done();

const base = document.createElement("base");
base.setAttribute("href", "bogus/");
document.head.appendChild(base);
});
// See https://github.com/whatwg/html/issues/7383 and
// https://chromium-review.googlesource.com/c/chromium/src/+/3311225.
// This test asserts two things:
// 1.) That Document base URL modifications that take place in between an
// image loading microtask being scheduled and executed are reflected in
// the final image request
// 2.) That subsequent changes to a Document's base URL before an image is
// inserted into the DOM do not lead to the image being refetched when it
// is inserted asynchronously later. This is because image insertion is
// not a relevant mutation
// (https://html.spec.whatwg.org/#relevant-mutations).
promise_test(async t => {
const image = new Image();
image.src = 'green.png';

// Dynamically insert a <base> tag that should influence the above image
// request because the above code triggers a microtask to continue fetching
// the image, which will run while we await `loadPromise` below.
const base = document.createElement('base');
base.setAttribute('href', 'resources/');
document.head.append(base);

const loadPromise = new Promise((resolve, reject) => {
image.addEventListener('load', e => {
resolve();
}, {once: true});

image.addEventListener('error', e => {
reject('The image must load');
}, {once: true});
});

// The image should load successfully, since its request was influenced by the
// <base> element which points the request to the right directory.
await loadPromise;

// Now manipulate the <base> element to point to a bogus directory.
base.setAttribute('href', 'bogus/');
document.body.append(image);

const timeoutPromise = new Promise(resolve => t.step_timeout(resolve, 1500));
const imageErrorPromise = new Promise((resolve, reject) => {
image.addEventListener('load', e => {
reject('The image should not be refetched upon insertion and load, ' +
'because (1) insertion is not a relevant mutation, and (2) the ' +
'new relative URL should not resolve to a real resource');
}, {once: true});

image.addEventListener('error', e => {
reject('The image should not be refetched upon insertion, because ' +
'insertion is not a relevant mutation');
}, {once: true});
});

await Promise.race([timeoutPromise, imageErrorPromise]);
}, "An image should not be refetched upon insertion asynchronously after its " +
"Document's base URL changes");
</script>
</body>
</html>

0 comments on commit 882f4ce

Please sign in to comment.