An event for when entries disappear for other reasons than forward-pruning #72

Open
domenic opened this issue Mar 15, 2021 · 12 comments
Labels
addition A proposed addition which could be added later without impacting the rest of the API

Comments

@domenic
Collaborator

domenic commented Mar 15, 2021

We currently have the dispose event for when app history entries get dropped due to "forward-pruning", as discussed in https://github.com/WICG/navigation-api#per-entry-events.

However, talking with implementation folks, there are some other cases where entries get pruned:

  • The user clears their history. This can remove entries from the joint session history.
  • Most browsers have a maximum number of joint session history entries (Chrome's is 50), so once you exceed this, they start dropping off the back.
  • An app history can get "cut in half" in a situation like the following:
    • History contains https://example.com/1, https://example.com/2, https://example.com/3. All pages are in bfcache.
    • User navigates back to position 2, i.e. https://example.com/2
    • The page does location.replace("https://other.example.com/")
    • Now if we navigate (with bfcache) back to https://example.com/1, it will see 1 entry in its navigation API history entry list, instead of the 3 it saw before, because there is only one same-origin contiguous entry. (Same for navigating forward to https://example.com/3.)
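The cut-in-half case can be modeled outside the browser. The sketch below is a simplified simulation, not the browser's actual algorithm: it treats the joint session history as a plain array of URLs and assumes, as in the example above, that a page only sees the contiguous run of same-origin entries around the current one. The helper name `visibleEntries` is hypothetical.

```javascript
// Simplified model: the joint session history is a plain array of URL strings.
// `visibleEntries` is a hypothetical helper, not part of any browser API.
function visibleEntries(jointHistory, currentIndex) {
  const origin = new URL(jointHistory[currentIndex]).origin;
  let start = currentIndex;
  let end = currentIndex;
  // Extend backward and forward while the neighbors share the current origin.
  while (start > 0 && new URL(jointHistory[start - 1]).origin === origin) {
    start--;
  }
  while (
    end < jointHistory.length - 1 &&
    new URL(jointHistory[end + 1]).origin === origin
  ) {
    end++;
  }
  return jointHistory.slice(start, end + 1);
}

const before = [
  "https://example.com/1",
  "https://example.com/2",
  "https://example.com/3",
];

// location.replace() at position 2 swaps that entry for a cross-origin one.
const after = [...before];
after[1] = "https://other.example.com/";

const seenBefore = visibleEntries(before, 0); // all three example.com entries
const seenAfter = visibleEntries(after, 0); // only https://example.com/1
```

In this model the page at position 1 goes from seeing three entries to seeing one, without any forward-pruning having happened.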

The third of these is a bit different than the first two, and might warrant different treatment.

Anyway, the question is, should the dispose event fire for these scenarios? From what I can tell the same use cases, e.g. removing associated caches, apply.
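The cache-cleanup use case could look roughly like the following. This is a sketch under assumptions: `FakeEntry` stands in for a history entry object that has a `key` and fires a `dispose` event, and `pageCache` and `rememberEntry` are hypothetical app-side names. Whether `dispose` actually fires in the non-forward-pruning cases is exactly the open question here.

```javascript
// Stand-in for a history entry; only `key` and the "dispose" event are modeled.
class FakeEntry extends EventTarget {
  constructor(key) {
    super();
    this.key = key;
  }
}

// Hypothetical app-side cache, keyed by entry key.
const pageCache = new Map();

function rememberEntry(entry, data) {
  pageCache.set(entry.key, data);
  // Drop the cached data once the entry is gone, whatever the reason.
  entry.addEventListener("dispose", () => pageCache.delete(entry.key), {
    once: true,
  });
}

const firstEntry = new FakeEntry("entry-1");
rememberEntry(firstEntry, { scrollY: 120 });

// Simulate the browser removing the entry (forward-pruning, history limit, ...).
firstEntry.dispatchEvent(new Event("dispose"));
```

With this shape the app does not need to know why an entry went away, which is why a single event for all removal causes is attractive for this use case.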

Offline, @dvoytenko mentioned that if an event fires, it should not be the same as the event for forward-pruning. But I didn't quite understand why.

domenic added the "addition" label (A proposed addition which could be added later without impacting the rest of the API) Mar 15, 2021
@dvoytenko

dvoytenko commented Mar 16, 2021

@domenic I think I meant that it'd be good to always know with certainty when the user has navigated back (back button or API) vs forward (link, fragment, forward button or API). Currently both dispatch the popstate event, which is very inconvenient.

RE: "cut in half" example. It's an interesting case, and I'd certainly like to have an event for it. E.g. if an app keeps DOM/state for a previous navigation associated with the history stack then the app might want to clear out this DOM/state when the history records are pruned. However, a couple of comments on this:

  • From what I've usually seen, the client-nav DOM/state is not commonly kept in-sync with the history. E.g. I'd never keep a state for many previous navigations. I'd only keep the state for a couple of navigations for a quick return and try to clear it out quickly after navigation. I'd expect an app to do so only when the "back" navigation is very likely. Think: list -> 1up -> back to list.
  • Related, if I were to be able to tell a back navigation from forward, I'd like to immediately clear out the previous state because forward-button navigations are very rare.
  • I'm also curious, in your example, when such dispose events could arrive? There might not be enough time to send them before navigating to other.example.com. So, when the user navigates back to example.com?

@tbondwilkinson
Contributor

In my opinion:

  1. Yes, send the dispose.
  2. Yes, send the dispose.
  3. Send the dispose for https://example.com/2 but do not send the dispose for any other entry. At that point, I would expect the in-memory object representing an entry to no longer be reachable via appHistory.entries, but it is not technically disposed. It may be impossible to reach that entry while preserving the same JavaScript execution context, but you might have stored resources in places like session storage, and evicting them now may be premature. A new event to describe this state may be... possible, but I think we're getting into the weeds. The app could choose to observe the bisection in the navigate event for location.replace (or currentchange? do we send navigate events for location.replace? I hope we do) and choose to evict either the forward or the backward entries, given what's happening.

But when the bisection happens, I think we want to emulate as closely as possible the organic case of two same-origin websites sandwiching a non-same-origin website. We wouldn't expect to get dispose events in one website for something that happened in another part of the stack. Bisection, I think, should completely clear in-memory associations between two same-origin website instances. When you do that replace, nothing else that happens on the other side should reach you.

@jakearchibald

3. Send the dispose for https://example.com/2 but do not send the dispose for any other entry.

This means the client may hold AppHistoryEntry objects that have keys that cannot be used in appHistory.goTo(), and they weren't notified. I presumed that might be a use-case for the dispose event. However, yeah, clearing related data in origin storage would be bad.

Another complication is, the split may be un-splut. location.replace("https://example.com/foo") would make the entries contiguous again.

Although, if we fire "dispose" on these, and if AppHistoryEntry objects are === (which might be expected when comparing to .current), you end up with objects that can be "dispose"d multiple times.

Maybe we should just have an event that signals an AppHistoryEntry is no longer accessible for that client, as in it can't be used in appHistory.goTo(). A property of that event could indicate if it's permanently gone or not. We could also have an event for "hey it's back!".
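One way to picture the "no longer accessible" and "it's back" signals is as a diff between the entry keys a client could reach before and after a change. The sketch below is illustrative only: `diffReachableKeys` and its result field names are hypothetical, the keys are made up, and no such event exists in the proposal.

```javascript
// Hypothetical helper: compares the entry keys reachable before and after a change.
function diffReachableKeys(beforeKeys, afterKeys) {
  const before = new Set(beforeKeys);
  const after = new Set(afterKeys);
  return {
    // Keys the client may still hold but can no longer traverse to.
    unreachable: beforeKeys.filter((key) => !after.has(key)),
    // Keys that became reachable, for example after a split is undone.
    reachable: afterKeys.filter((key) => !before.has(key)),
  };
}

// Split: the page holding "a" can no longer reach "b" or "c".
const split = diffReachableKeys(["a", "b", "c"], ["a"]);

// Un-split (assumed for illustration): a same-origin entry "d" sits in the
// middle again, so "c" becomes reachable once more.
const unsplit = diffReachableKeys(["a"], ["a", "d", "c"]);
```

Note that "c" appears in `unreachable` for the first change and in `reachable` for the second, which is the repeated-notification problem raised above if both were reported as "dispose".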

@annevk

annevk commented Mar 10, 2022

With regards to the three items:

  1. If history is cleared, I wouldn't expect any state to remain. But perhaps it's a rather surgical operation?
  2. How does the limit for joint session history entries relate to this API? As this API is per-origin it sounds somewhat dangerous if it ends up exposing a cross-origin limit somehow.
  3. Agree with Jake's assessment above. This case doesn't seem dangerous and something you should be able to deal with, even though it's a bit of an edge case.

@domenic
Collaborator Author

domenic commented Mar 10, 2022

1. If history is cleared, I wouldn't expect any state to remain. But perhaps it's a rather surgical operation?

Agreed, this will remove entries from the navigation API. I believe the browser completely throws away any info about those entries. The page will get dispose events.

This needs to be specced, but seems uncontroversial.

2. How does the limit for joint session history entries relate to this API? As this API is per-origin it sounds somewhat dangerous if it ends up exposing a cross-origin limit somehow.

Well, basically most (all?) browsers seem to limit the total number of joint session history entries per top-level browsing context (TLBC); in Chrome's case I know the number is 50. Similar to the previous item, they completely throw away the entries. So the plan is to spec that, with dispose events firing as they fall off the end.

I think this would most clearly show up if you navigated to 50 same-origin (or even same-document) pages: at some point navigation.entries().length would stop increasing and stay at 50, and every time you got a currententrychange you would also get a dispose for what used to be navigation.entries()[0].
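The plateau described above can be simulated with a capped list. This is a simplified model rather than browser code: `CappedHistory` is a hypothetical name, the cap of 50 is the Chrome figure mentioned above, and a "dispose" is represented by recording the dropped key.

```javascript
// Simplified model of an entry list with a maximum length.
class CappedHistory {
  constructor(limit = 50) {
    this.limit = limit;
    this.entries = [];
    this.disposed = []; // keys of entries that fell off the start
  }

  push(key) {
    this.entries.push(key);
    // Drop the oldest entries once the cap is exceeded, recording a "dispose".
    while (this.entries.length > this.limit) {
      this.disposed.push(this.entries.shift());
    }
  }
}

const capped = new CappedHistory(50);
for (let i = 0; i < 60; i++) {
  capped.push(`entry-${i}`);
}
// capped.entries.length stays at 50; the ten oldest keys were disposed in order.
```

After the cap is reached, every further push pairs one new current entry with one disposal of what used to be the first entry.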

Could this reveal cross-origin information? Hmm, it seems possible, in the same way history.length could today... but it's not too obvious how. Since you only get NavigationHistoryEntry objects for your own origin, you wouldn't be able to observe anything before you in the same BC. But maybe in cross-origin iframes which contribute to the joint session history? I'll think on it a bit more...

@annevk

annevk commented Mar 10, 2022

@domenic I guess I'm wondering why the page would still be up and not reloaded for 1. We haven't had a chance to revisit that UI in Firefox recently though.

For 2 I think the problem is similar to history.length, but that's also not a leak we're happy with. (And could maybe fix, by always reporting 1 or 2 or some such after a top-level cross-origin navigation.)

@domenic
Collaborator Author

domenic commented Mar 16, 2022

Refreshing myself on whatwg/html#2018 I am currently thinking the boundary is that we are OK leaking the fact that a cross-origin iframe navigated. (We fire load events after all.) We are not OK leaking the URL that it navigated to.

I don't think dispose events on exceeding the 50-entry-limit can enable you to guess the URL that it navigated to, on their own. It has to be combined with some other gadget that causes new entries to be generated, or not generated, conditional on whether you guessed correctly. Currently the HTML spec has such a gadget: it makes the replace-or-push decision based on the iframe's current URL. But we should fix that separately. For example, we could adopt Chromium's solution, which is to always push when the navigation initiator is cross-origin. (Which is, I guess, another case where you can get two entries in a row with the same URL---relevant to #111.)
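To make the "gadget" concrete, here is a deliberately simplified model of the two push-or-replace rules mentioned above. Both function names are hypothetical, the URLs are placeholders, and real browsers take more inputs into account; in particular, comparing the initiator's origin against the frame's current origin is an assumption made for this sketch.

```javascript
// URL-based rule, as described above: navigating to the current URL replaces.
function urlBasedDecision(currentURL, targetURL) {
  return currentURL === targetURL ? "replace" : "push";
}

// Chromium-style rule, as described above: a cross-origin initiator always pushes.
function initiatorAwareDecision(currentURL, targetURL, initiatorOrigin) {
  const frameOrigin = new URL(currentURL).origin;
  if (initiatorOrigin !== frameOrigin) return "push";
  return urlBasedDecision(currentURL, targetURL);
}

const secretURL = "https://victim.example/account/42";
const otherURL = "https://victim.example/";
const parentOrigin = "https://attacker.example";

// URL-based rule: the outcome depends on whether the guess was right.
const rightGuess = urlBasedDecision(secretURL, secretURL);
const wrongGuess = urlBasedDecision(secretURL, otherURL);

// Initiator-aware rule: both guesses produce the same outcome.
const rightGuessFixed = initiatorAwareDecision(secretURL, secretURL, parentOrigin);
const wrongGuessFixed = initiatorAwareDecision(secretURL, otherURL, parentOrigin);
```

Under the URL-based rule a correct guess yields no new entry while a wrong guess yields one, and at the entry limit that difference would show up as a dispose event or the lack of one. Under the initiator-aware rule the two cases are indistinguishable.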

I am trying to confirm this with our security folks to see if there's anything I missed.

@annevk

annevk commented Mar 17, 2022

I realized that there is a difference with history.length in that once history.length returns 50 it will not increase beyond that. Whereas dispose events will continue to function.

Given that you only have access to contiguous same-origin entries it seems that the only information you can get out of this is how much the user "used" a tab before getting to your site. That's correct, right? That doesn't seem particularly severe, though it would still be good if it could be tackled in some way.

@domenic
Collaborator Author

domenic commented Mar 17, 2022

I realized that there is a difference with history.length in that once history.length returns 50 it will not increase beyond that. Whereas dispose events will continue to function.

But, the fact that history.length is not increasing also gives you information. If you do a push navigation, and history.length does not increase, that basically tells you a dispose event happened.

Given that you only have access to contiguous same-origin entries it seems that the only information you can get out of this is how much the user "used" a tab before getting to your site. That's correct, right?

I think that's right. And if we implement any of the ideas in whatwg/html#6356 then that limits the damage a bit. That might be the right way to tackle things.

(Note, session scoping on manual navigation is quite different from BCG scoping; see also #71.)

@domenic
Collaborator Author

domenic commented Mar 18, 2022

  • The user clears their history. This can remove entries from the joint session history.

@domenic I guess I'm wondering why the page would still be up and not reloaded for 1. We haven't had a chance to revisit that UI in Firefox recently though.

In Chrome I tested this and:

  • We do not clear history entries if you delete them from your history list (Ctrl+H). They stick around.
  • We do clear history entries if you clear all browsing data, which I can't figure out how to do in a site-specific manner. We do not reload the page.

I'll ask around a bit more about the threat model here that motivated us not reloading.

@domenic
Collaborator Author

domenic commented Mar 31, 2022

I was informed that simultaneously reloading all open pages when you clear all browsing data would give a cross-site identifier that you can join. Right, that makes sense!

You could mitigate this somewhat, at the expense of a worse user experience, if you shut down all pages, waited for them all to shut down, cleared all browsing data while there were no pages open, and then reopened the pages. This would then give a weaker cross-site tracking identifier, based on the first-visit time for a domain.

You could imagine mitigations to try to fuzz these times, similar to other times that we fuzz for global events (like language changes). But basically this is a hard area.

@domenic
Collaborator Author

domenic commented Apr 22, 2022

@annevk suggested that the simultaneous reload timing issue might be best mitigated by browsers unloading all tabs during a clear-browsing-data instance, and then only reloading them once the user navigates to them. Similar to what happens on a browser restart. That makes a lot of sense to me.

With regard to the navigation API, we think the conclusion is that dispose events should fire whenever the relevant entries are removed. But history clearing may or may not remove an entry: as described above, some UIs remove entries and some don't. And if we go with @annevk's suggested improvements to the "clear browsing data" UI, this will likely result in no dispose events being fired at all, because everything will just be unloaded and the new load will get a fresh view. (dispose is only for when an already-open page sees entry changes.)

So, I'm removing "might block v1", as we think we have a clear path here with no compat issues. However I'm leaving the issue open to track properly specifying dispose for non-forward-pruning cases, including both the cut-in-half situation discussed in the OP and history-clearing cases. (And, we want to write tests for the cut-in-half situation.)

That specification work might be best done as part of #221.

domenic removed this from the Might block v1 milestone Apr 22, 2022