Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Images should NOT refetch upon insertion even if the document's base URL has changed #7383

Closed
domfarolino opened this issue Dec 2, 2021 · 17 comments

Comments

@domfarolino
Copy link
Member

domfarolino commented Dec 2, 2021

Consider the following:

const image = new Image();
image.src = "image.jpg";

const base = document.createElement('base');
base.href = '/resources/';
document.head.append(base);

// Some time later
document.body.append(image); // Does this refetch?

The answer to // Does this refetch? is:

  • Chrome: Yes
  • Firefox: Yes
  • Safari: No
  • Spec: No

Per spec the answer is no because general element insertion is not one of the img's relevant mutations. If we wanted to match Chrome and Firefox's behavior, we should make general insertion one of the relevant mutations -- but I think that's not enough. Neither Chrome nor Firefox always attempt to refetch upon insertion, they only do so if the last known snapshot of the document's base URL has changed. We should probably change the spec to account for this extra snapshot piece of information, but we should open this up for more discussion and opinions.

/cc @emilio

@domfarolino domfarolino added topic: img agenda+ To be discussed at a triage meeting labels Dec 2, 2021
@annevk
Copy link
Member

annevk commented Dec 3, 2021

I think this is whatwg/dom#61.

@domfarolino
Copy link
Member Author

I actually don't quite think it is. Perhaps this is kinda a subset of whatwg/dom#61, but the main question here seems to be: should an image element internally store a snapshot of the base URL that it encountered upon first invocation of #update-the-image-data, to compare with the document's live base URL when the next relevant mutation is experienced?

Chrome and Firefox do this, but the spec and Safari do not. Spec & Safari's behavior (no base URL snapshot) is the simplest, and I'm leaning towards making Chrome do the same in https://chromium-review.googlesource.com/c/chromium/src/+/3311225, which I'd like to land with a WPT. @emilio what do you think about this? If I can land that CL in Chromium, I'll close this issue, but in the meantime it would be good to hear from FF.

@emilio
Copy link
Contributor

emilio commented Jan 30, 2023

I'm confused. Gecko doesn't special case the base URI afaict, we just don't force-reload if the current source is already the same on insertion. Our model is more similar to "make insertion a relevant mutation": https://searchfox.org/mozilla-central/rev/861fb9abfcaff123aab45f6ac56a0106b116dc14/dom/html/HTMLImageElement.cpp#587

You kinda need to do that for picture / srcset to deal for when an image is moving from one document to another with different viewport sizes etc, right?

@emilio
Copy link
Contributor

emilio commented Jan 30, 2023

(or specify another point more specific to adoption perhaps)

@emilio
Copy link
Contributor

emilio commented Jan 30, 2023

Cc @EdgarChen

@emilio
Copy link
Contributor

emilio commented Jan 30, 2023

The note in https://html.spec.whatwg.org/multipage/images.html#img-environment-changes recommends running that algorithm with images get inserted into the document, which does run the source selection algorithm. That's basically what Gecko is doing, though I'd need to check how precisely we implement that algorithm, let me know if you'd need that.

@domfarolino
Copy link
Member Author

You kinda need to do that for picture / srcset to deal for when an image is moving from one document to another with different viewport sizes etc, right?

Yes, and insertion is considered a "relevant mutation" when the img is a child of a picture element: https://html.spec.whatwg.org/#the-img-element:html-element-insertion-steps / https://html.spec.whatwg.org/#when-to-obtain-images:relevant-mutations.


Let me make sure I've got the scenario right. When I run the following script, akin to the code in the OP:

const image = new Image();
image.src = "image.jpg";

const base = document.createElement('base');
base.href = '/resources/';
document.head.append(base);

setTimeout(() => {
  document.body.append(image);
}, 2000);

... in Firefox DevTools I see two image requests:

  • An immediate request for /image.jpg
  • Another request 2 seconds later for /resources/image.jpg

Two things here surprise me:

  • That FF's first request is not influenced by the base URL; per the #update-the-image-data algorithm, source selection takes place after the microtask, so any document state mutation that happens synchronously after image creation should impact the image request (that's basically the goal of that microtask)
  • That Firefox fetches a second request. I don't think element element insertion in this case is considered a relevant mutation, so I'm not sure what is triggering the image to be refetched. Do you?

Overall here is the current state of affairs:

  • Firefox's first request is wrong; its second request should not be made (but when made, is "correct")
  • Safari's first request is wrong; it does not make a second request (which is "correct")
  • Chrome's first request is right; nevertheless, it makes a second request when I think it should not, whose URL is also right (the second request is triggered by some bogus logic that I'm going planning on removing)

@emilio
Copy link
Contributor

emilio commented Jan 30, 2023

Yeah Firefox triggering the first load is a known long-standing bug.

@domfarolino
Copy link
Member Author

Do you agree that per spec, the second request should not be made (the one after 2 seconds), since general (non-<picture>) insertion is not considered a relevant mutation for #update-the-image-data?

@emilio
Copy link
Contributor

emilio commented Jan 30, 2023

Yes, iff the first request does account for the base change. I'd expect two different requests if there's a microtask checkpoint after the src assignment but before the base change.

@domfarolino
Copy link
Member Author

I'd expect two different requests if there's a microtask checkpoint after the src assignment but before the base change.

Why would changing a document's base URL after an image's microtask checkpoint has already been run somehow refetch that image, and re-resolve its URL?

  1. Microtask checkpoint reached; image request is made
  2. Document's base URL is manipulated
  3. Image is inserted into DOM — I don't think this should cause a re-fetch due to the document base URL changing from the last time the image is fetched; refetching would mean counting insertion as a "relevant mutation" and keeping track of the old state of the base URL, I think

@emilio
Copy link
Contributor

emilio commented Jan 30, 2023

Ah, you're right. I thought https://html.spec.whatwg.org/multipage/images.html#img-environment-changes would run on insertion, and the source would be different and thus we'd load the new image. I guess that since that algorithm has a:

We shouldn't really reload in the src case. For srcset / picture, I believe that per spec we also shouldn't reload, but that's a bit more subtle.

I guess the selected source that gets compared is the relative URI rather than the "resolved" URI against the document's base URI. Does that match what other browsers do? We compare the fully-resolved URI, though it seems Blink at least is following the spec (on this regard, but density doesn't seem to be checked, see #4646).

Having tests for all this would be great...

@domfarolino
Copy link
Member Author

I guess the selected source that gets compared is the relative URI rather than the "resolved" URI against the document's base URI. Does that match what other browsers do? We compare the fully-resolved URI, though it seems Blink at least is following the spec

How do we know that Blink is following the spec here? Just so we're clear, you're now talking about something different than the original post here right, since we've established that there should only be a single question here, and therefore any steps after the following are not run:

If the img element does not use srcset or picture, its node document is not fully active, has image data whose resource type is multipart/x-mixed-replace, or the pending request is not null, then return.


Having tests for all this would be great...

Agreed. I found that we have the image-base-url.html WPT that I added in 2020 which covers the 3-step scenario I mentioned above in #7383 (comment), but I think we need another one to assert that not only the correct request is finally made, but that only a single request is made, and that spurious ones are not triggered due to the base URL changing in between microtask scheduling and execution. I'm adding that in https://crrev.com/c/3311225.

chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Jan 31, 2023
`blink::ImageLoader::last_base_element_url_` stores a snapshot of the
Document's base URL during UpdateFromElement() [1]. However, this
snapshot is not actually used to resolve the URL of the image request after the microtask checkpoint in DoUpdateFromElement(); rather,
relative URL resolution always happens with the Document's live base URL
mechanism [2] (more on this later).

Instead, the member exists solely to detect, at image insertion time
[3], if the Document's base URL has changed since the last time
UpdateFromElement() was called. The rationale for this used to be to
detect if a document's base URL changed before the image loading
microtask (DoUpdateFromElement()) was run, so that we could protect
against fetching the wrong image with a stale URL; see [4].

This was important because at the time, (before crrev.com/c/2103987)
ImageLoader used to store a snapshot of the resolved image URL in the
microtask state computed in UpdateFromElement(), so it was possible to
wind up in DoUpdateFromElement() with an image request URL that is stale
compared to the live Document's base URL (this was intentional, see
crbug.com/569760 & https://codereview.chromium.org/2105283002). The
solution [5] at the time was to detect, only during insertion, if the
ImageLoader's last known snapshot of the Document base URL is stale, so
that the image processing model can be re-entered, and the final DoUpdateFromElement() call is made with an up-to-date URL. This protects
against the very specific issue of:
  1. Creating an image (thus scheduling the microtask)
  2. Changing the Document's base URL
  3. The microtask that finally runs fetches an image with a stale URL

However, I have no idea how this could protect against the original bug
filed [4]; see
https://chromium-review.googlesource.com/c/chromium/src/+/3311225/comments/5ab9c13f_9c553169
where there is some discussion about how the original bug could have
behaved.

In any case, since then, crrev.com/c/2103987 has landed which ensures
that DoUpdateElement() always resolves relative URLs with the live
Document's base URL, so it is impossible for DoUpdateElement() to fetch
an outdated URL. This leaves us with the `last_base_element_url_` member
that is used as a discriminator to re-invoke the image processing model
upon insertion due to this member being stale compared to the Document's
base URL. This results in Chromium sometimes fetching the same exact
image twice (3-step scenario listed above), which makes this member
entirely obsolete dead weight.

Furthermore, the HTML Standard does not define general image insertion
as a "relevant mutation" [6] that would invoke #update-the-image-data,
and it certainly does not capture any base-URL-specific state in the
microtask machinery. After some discussion on
whatwg/html#7383, we've decided to align
Chromium with the specification, which prompts this CL.

Testing-wise, this CL is covered by:
 - image-load-reset-on-new-base.html
    - Created in response to [4]
 - svg-image-load-reset-on-new-base.html
    - Same as above, but the SVG variant
 - image-base-url.html
    - Added/moved in crrev.com/c/2103987
 - TODO: Add a test that ensures only one request is made
    - This test is important because while the others ensure that the
      final image is correct, we've discovered that some browsers are
      re-fetching the image due to base URLs changing before the image
      microtask is run, and sometimes that second fetch is using the
      correct URL, and sometimes it's wrong: whatwg/html#7383 (comment)

[1]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=646-647;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[2]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=724;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[3]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=297-302;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[4]: https://crbug.com/897545
[5]: https://crrev.com/c/1296506
[6]: https://html.spec.whatwg.org/multipage/images.html#relevant-mutations

Bug: 569760,897545
Change-Id: If6a93c091bfaea2944489e3660adc695d04c3e72
@domfarolino
Copy link
Member Author

@emilio feel free to check out web-platform-tests/wpt#38275 where I modify the test I originally wrote for the OP of this issue and extend it to cover the "second fetching" case, where the base URL is modified after the image is fetched but before it is inserted.

Once that is landed, I'll file some browser bugs and close this issue.

@domfarolino domfarolino changed the title Images should refetch upon insertion, if the document's base URL has changed Images should NOT refetch upon insertion even if the document's base URL has changed Jan 31, 2023
@emilio
Copy link
Contributor

emilio commented Jan 31, 2023

I guess the selected source that gets compared is the relative URI rather than the "resolved" URI against the document's base URI. Does that match what other browsers do? We compare the fully-resolved URI, though it seems Blink at least is following the spec

How do we know that Blink is following the spec here?

Just by code inspection.

Just so we're clear, you're now talking about something different than the original post here right, since we've established that there should only be a single question here, and therefore any steps after the following are not run:

That's right. If you change src with srcset, then should it run? If we define the URL comparison on the parsed / resolved URI, then it should, but otherwise it should not.

Agreed. I found that we have the image-base-url.html WPT that I added in 2020 which covers the 3-step scenario I mentioned above in #7383 (comment), but I think we need another one to assert that not only the correct request is finally made, but that only a single request is made, and that spurious ones are not triggered due to the base URL changing in between microtask scheduling and execution. I'm adding that in https://crrev.com/c/3311225.

Thanks, that's great :)

chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Jan 31, 2023
`blink::ImageLoader::last_base_element_url_` stores a snapshot of the
Document's base URL during UpdateFromElement() [1]. However, this
snapshot is not actually used to resolve the URL of the image request after the microtask checkpoint in DoUpdateFromElement(); rather,
relative URL resolution always happens with the Document's live base URL
mechanism [2] (more on this later).

Instead, the member exists solely to detect, at image insertion time
[3], if the Document's base URL has changed since the last time
UpdateFromElement() was called. The rationale for this used to be to
detect if a document's base URL changed before the image loading
microtask (DoUpdateFromElement()) was run, so that we could protect
against fetching the wrong image with a stale URL; see [4].

This was important because at the time, (before crrev.com/c/2103987)
ImageLoader used to store a snapshot of the resolved image URL in the
microtask state computed in UpdateFromElement(), so it was possible to
wind up in DoUpdateFromElement() with an image request URL that is stale
compared to the live Document's base URL (this was intentional, see
crbug.com/569760 & https://codereview.chromium.org/2105283002). The
solution [5] at the time was to detect, only during insertion, if the
ImageLoader's last known snapshot of the Document base URL is stale, so
that the image processing model can be re-entered, and the final DoUpdateFromElement() call is made with an up-to-date URL. This protects
against the very specific issue of:
  1. Creating an image (thus scheduling the microtask)
  2. Changing the Document's base URL
  3. The microtask that finally runs fetches an image with a stale URL

However, I have no idea how this could protect against the original bug
filed [4]; see
https://chromium-review.googlesource.com/c/chromium/src/+/3311225/comments/5ab9c13f_9c553169
where there is some discussion about how the original bug could have
behaved.

In any case, since then, crrev.com/c/2103987 has landed which ensures
that DoUpdateElement() always resolves relative URLs with the live
Document's base URL, so it is impossible for DoUpdateElement() to fetch
an outdated URL. This leaves us with the `last_base_element_url_` member
that is used as a discriminator to re-invoke the image processing model
upon insertion due to this member being stale compared to the Document's
base URL. This results in Chromium sometimes fetching the same exact
image twice (3-step scenario listed above), which makes this member
entirely obsolete dead weight.

Furthermore, the HTML Standard does not define general image insertion
as a "relevant mutation" [6] that would invoke #update-the-image-data,
and it certainly does not capture any base-URL-specific state in the
microtask machinery. After some discussion on
whatwg/html#7383, we've decided to align
Chromium with the specification, which prompts this CL.

Testing-wise, this CL is covered by:
 - image-load-reset-on-new-base.html
    - Created in response to [4]
 - svg-image-load-reset-on-new-base.html
    - Same as above, but the SVG variant
 - image-base-url.html
    - Added/moved in crrev.com/c/2103987
 - TODO: Add a test that ensures only one request is made
    - This test is important because while the others ensure that the
      final image is correct, we've discovered that some browsers are
      re-fetching the image due to base URLs changing before the image
      microtask is run, and sometimes that second fetch is using the
      correct URL, and sometimes it's wrong: whatwg/html#7383 (comment)

[1]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=646-647;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[2]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=724;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[3]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=297-302;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[4]: https://crbug.com/897545
[5]: https://crrev.com/c/1296506
[6]: https://html.spec.whatwg.org/multipage/images.html#relevant-mutations

Bug: 569760,897545
Change-Id: If6a93c091bfaea2944489e3660adc695d04c3e72
aarongable pushed a commit to chromium/chromium that referenced this issue Jan 31, 2023
`blink::ImageLoader::last_base_element_url_` stores a snapshot of the
Document's base URL during UpdateFromElement() [1]. However, this
snapshot is not actually used to resolve the URL of the image request after the microtask checkpoint in DoUpdateFromElement(); rather,
relative URL resolution always happens with the Document's live base URL
mechanism [2] (more on this later).

Instead, the member exists solely to detect, at image insertion time
[3], if the Document's base URL has changed since the last time
UpdateFromElement() was called. The rationale for this used to be to
detect if a document's base URL changed before the image loading
microtask (DoUpdateFromElement()) was run, so that we could protect
against fetching the wrong image with a stale URL; see [4].

This was important because at the time, (before crrev.com/c/2103987)
ImageLoader used to store a snapshot of the resolved image URL in the
microtask state computed in UpdateFromElement(), so it was possible to
wind up in DoUpdateFromElement() with an image request URL that is stale
compared to the live Document's base URL (this was intentional, see
crbug.com/569760 & https://codereview.chromium.org/2105283002). The
solution [5] at the time was to detect, only during insertion, if the
ImageLoader's last known snapshot of the Document base URL is stale, so
that the image processing model can be re-entered, and the final DoUpdateFromElement() call is made with an up-to-date URL. This protects
against the very specific issue of:
  1. Creating an image (thus scheduling the microtask)
  2. Changing the Document's base URL
  3. The microtask that finally runs fetches an image with a stale URL

However, I have no idea how this could protect against the original bug
filed [4]; see
https://chromium-review.googlesource.com/c/chromium/src/+/3311225/comments/5ab9c13f_9c553169
where there is some discussion about how the original bug could have
behaved.

In any case, since then, crrev.com/c/2103987 has landed which ensures
that DoUpdateElement() always resolves relative URLs with the live
Document's base URL, so it is impossible for DoUpdateElement() to fetch
an outdated URL. This leaves us with the `last_base_element_url_` member
that is used as a discriminator to re-invoke the image processing model
upon insertion due to this member being stale compared to the Document's
base URL. This results in Chromium sometimes fetching the same exact
image twice (3-step scenario listed above), which makes this member
entirely obsolete dead weight.

Furthermore, the HTML Standard does not define general image insertion
as a "relevant mutation" [6] that would invoke #update-the-image-data,
and it certainly does not capture any base-URL-specific state in the
microtask machinery. After some discussion on
whatwg/html#7383, we've decided to align
Chromium with the specification, which prompts this CL.

Testing-wise, this CL is covered by:
 - image-load-reset-on-new-base.html
    - Created in response to [4]
 - svg-image-load-reset-on-new-base.html
    - Same as above, but the SVG variant
 - image-base-url.html
    - Added/moved in crrev.com/c/2103987
 - TODO: Add a test that ensures only one request is made
    - This test is important because while the others ensure that the
      final image is correct, we've discovered that some browsers are
      re-fetching the image due to base URLs changing before the image
      microtask is run, and sometimes that second fetch is using the
      correct URL, and sometimes it's wrong: whatwg/html#7383 (comment)


[1]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=646-647;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[2]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=724;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[3]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=297-302;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[4]: https://crbug.com/897545
[5]: https://crrev.com/c/1296506
[6]: https://html.spec.whatwg.org/multipage/images.html#relevant-mutations

Bug: 569760,897545
Change-Id: If6a93c091bfaea2944489e3660adc695d04c3e72
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3311225
Reviewed-by: Fredrik Söderquist <[email protected]>
Commit-Queue: Dominic Farolino <[email protected]>
Cr-Commit-Position: refs/heads/main@{#1099380}
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Jan 31, 2023
`blink::ImageLoader::last_base_element_url_` stores a snapshot of the
Document's base URL during UpdateFromElement() [1]. However, this
snapshot is not actually used to resolve the URL of the image request after the microtask checkpoint in DoUpdateFromElement(); rather,
relative URL resolution always happens with the Document's live base URL
mechanism [2] (more on this later).

Instead, the member exists solely to detect, at image insertion time
[3], if the Document's base URL has changed since the last time
UpdateFromElement() was called. The rationale for this used to be to
detect if a document's base URL changed before the image loading
microtask (DoUpdateFromElement()) was run, so that we could protect
against fetching the wrong image with a stale URL; see [4].

This was important because at the time, (before crrev.com/c/2103987)
ImageLoader used to store a snapshot of the resolved image URL in the
microtask state computed in UpdateFromElement(), so it was possible to
wind up in DoUpdateFromElement() with an image request URL that is stale
compared to the live Document's base URL (this was intentional, see
crbug.com/569760 & https://codereview.chromium.org/2105283002). The
solution [5] at the time was to detect, only during insertion, if the
ImageLoader's last known snapshot of the Document base URL is stale, so
that the image processing model can be re-entered, and the final DoUpdateFromElement() call is made with an up-to-date URL. This protects
against the very specific issue of:
  1. Creating an image (thus scheduling the microtask)
  2. Changing the Document's base URL
  3. The microtask that finally runs fetches an image with a stale URL

However, I have no idea how this could protect against the original bug
filed [4]; see
https://chromium-review.googlesource.com/c/chromium/src/+/3311225/comments/5ab9c13f_9c553169
where there is some discussion about how the original bug could have
behaved.

In any case, since then, crrev.com/c/2103987 has landed which ensures
that DoUpdateElement() always resolves relative URLs with the live
Document's base URL, so it is impossible for DoUpdateElement() to fetch
an outdated URL. This leaves us with the `last_base_element_url_` member
that is used as a discriminator to re-invoke the image processing model
upon insertion due to this member being stale compared to the Document's
base URL. This results in Chromium sometimes fetching the same exact
image twice (3-step scenario listed above), which makes this member
entirely obsolete dead weight.

Furthermore, the HTML Standard does not define general image insertion
as a "relevant mutation" [6] that would invoke #update-the-image-data,
and it certainly does not capture any base-URL-specific state in the
microtask machinery. After some discussion on
whatwg/html#7383, we've decided to align
Chromium with the specification, which prompts this CL.

Testing-wise, this CL is covered by:
 - image-load-reset-on-new-base.html
    - Created in response to [4]
 - svg-image-load-reset-on-new-base.html
    - Same as above, but the SVG variant
 - image-base-url.html
    - Added/moved in crrev.com/c/2103987
 - TODO: Add a test that ensures only one request is made
    - This test is important because while the others ensure that the
      final image is correct, we've discovered that some browsers are
      re-fetching the image due to base URLs changing before the image
      microtask is run, and sometimes that second fetch is using the
      correct URL, and sometimes it's wrong: whatwg/html#7383 (comment)

[1]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=646-647;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[2]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=724;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[3]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=297-302;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[4]: https://crbug.com/897545
[5]: https://crrev.com/c/1296506
[6]: https://html.spec.whatwg.org/multipage/images.html#relevant-mutations

Bug: 569760,897545
Change-Id: If6a93c091bfaea2944489e3660adc695d04c3e72
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3311225
Reviewed-by: Fredrik Söderquist <[email protected]>
Commit-Queue: Dominic Farolino <[email protected]>
Cr-Commit-Position: refs/heads/main@{#1099380}
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Jan 31, 2023
`blink::ImageLoader::last_base_element_url_` stores a snapshot of the
Document's base URL during UpdateFromElement() [1]. However, this
snapshot is not actually used to resolve the URL of the image request after the microtask checkpoint in DoUpdateFromElement(); rather,
relative URL resolution always happens with the Document's live base URL
mechanism [2] (more on this later).

Instead, the member exists solely to detect, at image insertion time
[3], if the Document's base URL has changed since the last time
UpdateFromElement() was called. The rationale for this used to be to
detect if a document's base URL changed before the image loading
microtask (DoUpdateFromElement()) was run, so that we could protect
against fetching the wrong image with a stale URL; see [4].

This was important because at the time, (before crrev.com/c/2103987)
ImageLoader used to store a snapshot of the resolved image URL in the
microtask state computed in UpdateFromElement(), so it was possible to
wind up in DoUpdateFromElement() with an image request URL that is stale
compared to the live Document's base URL (this was intentional, see
crbug.com/569760 & https://codereview.chromium.org/2105283002). The
solution [5] at the time was to detect, only during insertion, if the
ImageLoader's last known snapshot of the Document base URL is stale, so
that the image processing model can be re-entered, and the final DoUpdateFromElement() call is made with an up-to-date URL. This protects
against the very specific issue of:
  1. Creating an image (thus scheduling the microtask)
  2. Changing the Document's base URL
  3. The microtask that finally runs fetches an image with a stale URL

However, I have no idea how this could protect against the original bug
filed [4]; see
https://chromium-review.googlesource.com/c/chromium/src/+/3311225/comments/5ab9c13f_9c553169
where there is some discussion about how the original bug could have
behaved.

In any case, since then, crrev.com/c/2103987 has landed which ensures
that DoUpdateElement() always resolves relative URLs with the live
Document's base URL, so it is impossible for DoUpdateElement() to fetch
an outdated URL. This leaves us with the `last_base_element_url_` member
that is used as a discriminator to re-invoke the image processing model
upon insertion due to this member being stale compared to the Document's
base URL. This results in Chromium sometimes fetching the same exact
image twice (3-step scenario listed above), which makes this member
entirely obsolete dead weight.

Furthermore, the HTML Standard does not define general image insertion
as a "relevant mutation" [6] that would invoke #update-the-image-data,
and it certainly does not capture any base-URL-specific state in the
microtask machinery. After some discussion on
whatwg/html#7383, we've decided to align
Chromium with the specification, which prompts this CL.

Testing-wise, this CL is covered by:
 - image-load-reset-on-new-base.html
    - Created in response to [4]
 - svg-image-load-reset-on-new-base.html
    - Same as above, but the SVG variant
 - image-base-url.html
    - Added/moved in crrev.com/c/2103987
 - TODO: Add a test that ensures only one request is made
    - This test is important because while the others ensure that the
      final image is correct, we've discovered that some browsers are
      re-fetching the image due to base URLs changing before the image
      microtask is run, and sometimes that second fetch is using the
      correct URL, and sometimes it's wrong: whatwg/html#7383 (comment)

[1]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=646-647;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[2]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=724;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[3]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=297-302;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[4]: https://crbug.com/897545
[5]: https://crrev.com/c/1296506
[6]: https://html.spec.whatwg.org/multipage/images.html#relevant-mutations

Bug: 569760,897545
Change-Id: If6a93c091bfaea2944489e3660adc695d04c3e72
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3311225
Reviewed-by: Fredrik Söderquist <[email protected]>
Commit-Queue: Dominic Farolino <[email protected]>
Cr-Commit-Position: refs/heads/main@{#1099380}
@domfarolino
Copy link
Member Author

That's right. If you change src with srcset, then should it run? If we define the URL comparison on the parsed / resolved URI, then it should, but otherwise it should not.

Yep that makes sense to me; if you changed src/srcset or if we defined the URL comparison (that takes place in step 5 of #reacting-to-environment-changes) to not be relative, then it seems the image would be refetched upon insertion if the Document's base URL has changed. Do you think there is any appetite to do this? Do browser disagree on this case currently?

If not, since my CL with the test has landed, I'm happy to file a couple browser bugs and close this issue.

moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Feb 2, 2023
…_url_`, a=testonly

Automatic update from web-platform-tests
Image Loading: Remove `last_base_element_url_`

`blink::ImageLoader::last_base_element_url_` stores a snapshot of the
Document's base URL during UpdateFromElement() [1]. However, this
snapshot is not actually used to resolve the URL of the image request after the microtask checkpoint in DoUpdateFromElement(); rather,
relative URL resolution always happens with the Document's live base URL
mechanism [2] (more on this later).

Instead, the member exists solely to detect, at image insertion time
[3], if the Document's base URL has changed since the last time
UpdateFromElement() was called. The rationale for this used to be to
detect if a document's base URL changed before the image loading
microtask (DoUpdateFromElement()) was run, so that we could protect
against fetching the wrong image with a stale URL; see [4].

This was important because at the time, (before crrev.com/c/2103987)
ImageLoader used to store a snapshot of the resolved image URL in the
microtask state computed in UpdateFromElement(), so it was possible to
wind up in DoUpdateFromElement() with an image request URL that is stale
compared to the live Document's base URL (this was intentional, see
crbug.com/569760 & https://codereview.chromium.org/2105283002). The
solution [5] at the time was to detect, only during insertion, if the
ImageLoader's last known snapshot of the Document base URL is stale, so
that the image processing model can be re-entered, and the final DoUpdateFromElement() call is made with an up-to-date URL. This protects
against the very specific issue of:
  1. Creating an image (thus scheduling the microtask)
  2. Changing the Document's base URL
  3. The microtask that finally runs fetches an image with a stale URL

However, I have no idea how this could protect against the original bug
filed [4]; see
https://chromium-review.googlesource.com/c/chromium/src/+/3311225/comments/5ab9c13f_9c553169
where there is some discussion about how the original bug could have
behaved.

In any case, since then, crrev.com/c/2103987 has landed which ensures
that DoUpdateElement() always resolves relative URLs with the live
Document's base URL, so it is impossible for DoUpdateElement() to fetch
an outdated URL. This leaves us with the `last_base_element_url_` member
that is used as a discriminator to re-invoke the image processing model
upon insertion due to this member being stale compared to the Document's
base URL. This results in Chromium sometimes fetching the same exact
image twice (3-step scenario listed above), which makes this member
entirely obsolete dead weight.

Furthermore, the HTML Standard does not define general image insertion
as a "relevant mutation" [6] that would invoke #update-the-image-data,
and it certainly does not capture any base-URL-specific state in the
microtask machinery. After some discussion on
whatwg/html#7383, we've decided to align
Chromium with the specification, which prompts this CL.

Testing-wise, this CL is covered by:
 - image-load-reset-on-new-base.html
    - Created in response to [4]
 - svg-image-load-reset-on-new-base.html
    - Same as above, but the SVG variant
 - image-base-url.html
    - Added/moved in crrev.com/c/2103987
 - TODO: Add a test that ensures only one request is made
    - This test is important because while the others ensure that the
      final image is correct, we've discovered that some browsers are
      re-fetching the image due to base URLs changing before the image
      microtask is run, and sometimes that second fetch is using the
      correct URL, and sometimes it's wrong: whatwg/html#7383 (comment)

[1]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=646-647;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[2]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=724;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[3]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=297-302;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[4]: https://crbug.com/897545
[5]: https://crrev.com/c/1296506
[6]: https://html.spec.whatwg.org/multipage/images.html#relevant-mutations

Bug: 569760,897545
Change-Id: If6a93c091bfaea2944489e3660adc695d04c3e72
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3311225
Reviewed-by: Fredrik Söderquist <[email protected]>
Commit-Queue: Dominic Farolino <[email protected]>
Cr-Commit-Position: refs/heads/main@{#1099380}

--

wpt-commits: 05e83c2ea5c6ccd05517d8b788e6b547976b7961
wpt-pr: 38275
jamienicol pushed a commit to jamienicol/gecko that referenced this issue Feb 3, 2023
…_url_`, a=testonly

Automatic update from web-platform-tests
Image Loading: Remove `last_base_element_url_`

`blink::ImageLoader::last_base_element_url_` stores a snapshot of the
Document's base URL during UpdateFromElement() [1]. However, this
snapshot is not actually used to resolve the URL of the image request after the microtask checkpoint in DoUpdateFromElement(); rather,
relative URL resolution always happens with the Document's live base URL
mechanism [2] (more on this later).

Instead, the member exists solely to detect, at image insertion time
[3], if the Document's base URL has changed since the last time
UpdateFromElement() was called. The rationale for this used to be to
detect if a document's base URL changed before the image loading
microtask (DoUpdateFromElement()) was run, so that we could protect
against fetching the wrong image with a stale URL; see [4].

This was important because at the time, (before crrev.com/c/2103987)
ImageLoader used to store a snapshot of the resolved image URL in the
microtask state computed in UpdateFromElement(), so it was possible to
wind up in DoUpdateFromElement() with an image request URL that is stale
compared to the live Document's base URL (this was intentional, see
crbug.com/569760 & https://codereview.chromium.org/2105283002). The
solution [5] at the time was to detect, only during insertion, if the
ImageLoader's last known snapshot of the Document base URL is stale, so
that the image processing model can be re-entered, and the final DoUpdateFromElement() call is made with an up-to-date URL. This protects
against the very specific issue of:
  1. Creating an image (thus scheduling the microtask)
  2. Changing the Document's base URL
  3. The microtask that finally runs fetches an image with a stale URL

However, I have no idea how this could protect against the original bug
filed [4]; see
https://chromium-review.googlesource.com/c/chromium/src/+/3311225/comments/5ab9c13f_9c553169
where there is some discussion about how the original bug could have
behaved.

In any case, since then, crrev.com/c/2103987 has landed which ensures
that DoUpdateElement() always resolves relative URLs with the live
Document's base URL, so it is impossible for DoUpdateElement() to fetch
an outdated URL. This leaves us with the `last_base_element_url_` member
that is used as a discriminator to re-invoke the image processing model
upon insertion due to this member being stale compared to the Document's
base URL. This results in Chromium sometimes fetching the same exact
image twice (3-step scenario listed above), which makes this member
entirely obsolete dead weight.

Furthermore, the HTML Standard does not define general image insertion
as a "relevant mutation" [6] that would invoke #update-the-image-data,
and it certainly does not capture any base-URL-specific state in the
microtask machinery. After some discussion on
whatwg/html#7383, we've decided to align
Chromium with the specification, which prompts this CL.

Testing-wise, this CL is covered by:
 - image-load-reset-on-new-base.html
    - Created in response to [4]
 - svg-image-load-reset-on-new-base.html
    - Same as above, but the SVG variant
 - image-base-url.html
    - Added/moved in crrev.com/c/2103987
 - TODO: Add a test that ensures only one request is made
    - This test is important because while the others ensure that the
      final image is correct, we've discovered that some browsers are
      re-fetching the image due to base URLs changing before the image
      microtask is run, and sometimes that second fetch is using the
      correct URL, and sometimes it's wrong: whatwg/html#7383 (comment)

[1]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=646-647;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[2]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=724;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[3]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=297-302;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[4]: https://crbug.com/897545
[5]: https://crrev.com/c/1296506
[6]: https://html.spec.whatwg.org/multipage/images.html#relevant-mutations

Bug: 569760,897545
Change-Id: If6a93c091bfaea2944489e3660adc695d04c3e72
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3311225
Reviewed-by: Fredrik Söderquist <[email protected]>
Commit-Queue: Dominic Farolino <[email protected]>
Cr-Commit-Position: refs/heads/main@{#1099380}

--

wpt-commits: 05e83c2ea5c6ccd05517d8b788e6b547976b7961
wpt-pr: 38275
@domfarolino
Copy link
Member Author

marcoscaceres pushed a commit to web-platform-tests/wpt that referenced this issue Mar 28, 2023
`blink::ImageLoader::last_base_element_url_` stores a snapshot of the
Document's base URL during UpdateFromElement() [1]. However, this
snapshot is not actually used to resolve the URL of the image request after the microtask checkpoint in DoUpdateFromElement(); rather,
relative URL resolution always happens with the Document's live base URL
mechanism [2] (more on this later).

Instead, the member exists solely to detect, at image insertion time
[3], if the Document's base URL has changed since the last time
UpdateFromElement() was called. The rationale for this used to be to
detect if a document's base URL changed before the image loading
microtask (DoUpdateFromElement()) was run, so that we could protect
against fetching the wrong image with a stale URL; see [4].

This was important because at the time, (before crrev.com/c/2103987)
ImageLoader used to store a snapshot of the resolved image URL in the
microtask state computed in UpdateFromElement(), so it was possible to
wind up in DoUpdateFromElement() with an image request URL that is stale
compared to the live Document's base URL (this was intentional, see
crbug.com/569760 & https://codereview.chromium.org/2105283002). The
solution [5] at the time was to detect, only during insertion, if the
ImageLoader's last known snapshot of the Document base URL is stale, so
that the image processing model can be re-entered, and the final DoUpdateFromElement() call is made with an up-to-date URL. This protects
against the very specific issue of:
  1. Creating an image (thus scheduling the microtask)
  2. Changing the Document's base URL
  3. The microtask that finally runs fetches an image with a stale URL

However, I have no idea how this could protect against the original bug
filed [4]; see
https://chromium-review.googlesource.com/c/chromium/src/+/3311225/comments/5ab9c13f_9c553169
where there is some discussion about how the original bug could have
behaved.

In any case, since then, crrev.com/c/2103987 has landed which ensures
that DoUpdateElement() always resolves relative URLs with the live
Document's base URL, so it is impossible for DoUpdateElement() to fetch
an outdated URL. This leaves us with the `last_base_element_url_` member
that is used as a discriminator to re-invoke the image processing model
upon insertion due to this member being stale compared to the Document's
base URL. This results in Chromium sometimes fetching the same exact
image twice (3-step scenario listed above), which makes this member
entirely obsolete dead weight.

Furthermore, the HTML Standard does not define general image insertion
as a "relevant mutation" [6] that would invoke #update-the-image-data,
and it certainly does not capture any base-URL-specific state in the
microtask machinery. After some discussion on
whatwg/html#7383, we've decided to align
Chromium with the specification, which prompts this CL.

Testing-wise, this CL is covered by:
 - image-load-reset-on-new-base.html
    - Created in response to [4]
 - svg-image-load-reset-on-new-base.html
    - Same as above, but the SVG variant
 - image-base-url.html
    - Added/moved in crrev.com/c/2103987
 - TODO: Add a test that ensures only one request is made
    - This test is important because while the others ensure that the
      final image is correct, we've discovered that some browsers are
      re-fetching the image due to base URLs changing before the image
      microtask is run, and sometimes that second fetch is using the
      correct URL, and sometimes it's wrong: whatwg/html#7383 (comment)

[1]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=646-647;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[2]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=724;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[3]:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/loader/image_loader.cc;l=297-302;drc=3297fb6d70a02d8c4af8db2bc9e315053744442b
[4]: https://crbug.com/897545
[5]: https://crrev.com/c/1296506
[6]: https://html.spec.whatwg.org/multipage/images.html#relevant-mutations

Bug: 569760,897545
Change-Id: If6a93c091bfaea2944489e3660adc695d04c3e72
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3311225
Reviewed-by: Fredrik Söderquist <[email protected]>
Commit-Queue: Dominic Farolino <[email protected]>
Cr-Commit-Position: refs/heads/main@{#1099380}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

4 participants