Navigation and session history rewrite #6315
This is looking pretty reasonable; no real concerns!
At some point it'd be good to have a series of examples of how these concepts relate and manifest in the real world, perhaps first in GitHub comments, then eventually in the spec with diagrams and pictures. For example, I'd be interested in seeing the browsing session/navigable/session history diagram for scenarios like:
- Two independent top-level tabs, each of which navigates around
- One top-level tab with two iframes, each of which navigates around, and then the top-level tab also navigates
- Any interesting scenarios that illustrate tricky cases where the current set of concepts (browsing context/joint session history) are not sufficient
Yeah, I want to create some table-like diagrams, like in the "Specifying the history as a 'timeline'" section of #5767.
Since we seem to be arriving at some kind of consensus in #6356, I'm going to refactor some stuff. @domenic, shout up if any of this sounds bad:
Currently a "browsing session" has a navigable, and some additional state. Instead, I'm going to make it a subclass of navigable, a "top level navigable". It will have the same state as browsing session currently has, except a navigable, since it'd be that navigable.
- A "browsing session" will have associated session storage, as it's currently defined.
- A browsing context group will have a browsing session that cannot change throughout the life of the group.
- Multiple browsing context groups may share the same session.
- Manual navigations to cross-origin URLs will always create a new browsing context group with a new session.
- Other navigations may create a new browsing context group (depending on cross-origin isolation) but they will have the same session.
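To make the group/session rules above concrete, here is a minimal sketch in Python. It is only an illustration of the comment, not spec text: the class names mirror the concepts, while `group_for_navigation` and its `manual`, `cross_origin` and `needs_new_group` parameters are hypothetical names invented for this example (`needs_new_group` stands in for the cross-origin isolation decision).

```python
from dataclasses import dataclass, field
from itertools import count

_session_ids = count(1)


@dataclass
class BrowsingSession:
    """Holds session storage; may be shared by several browsing context groups."""
    id: int = field(default_factory=lambda: next(_session_ids))
    session_storage: dict = field(default_factory=dict)


@dataclass(frozen=True)
class BrowsingContextGroup:
    """frozen=True models 'the session cannot change for the life of the group'."""
    session: BrowsingSession


def group_for_navigation(current, manual, cross_origin, needs_new_group=False):
    """Pick the browsing context group a navigation ends up in (hypothetical helper)."""
    if manual and cross_origin:
        # Manual cross-origin navigation: new group AND new session.
        return BrowsingContextGroup(BrowsingSession())
    if needs_new_group:
        # Other group switches: new group, but the same session is kept.
        return BrowsingContextGroup(current.session)
    return current
```

With this shape, two groups can point at one session object, which matches "multiple browsing context groups may share the same session".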
I got into a bit of a mess with this in terms of which threads could read & write from which data. I now think that the top-level navigable needs to have its own copy of session history as a list of trees, so it can figure out things like history length without having to post tasks with every navigable to read that information. This session history copy will be serialisable, which we need anyway for restoring session history to iframes on reload. The copy will become temporarily out of date when session history is modified in a synchronous way, but a parallel task will be queued to keep it up to date. Here are the synchronous things:
I just need to make sure that, if a navigable makes 5 synchronous changes, it ignores all the synchronisations from the top level until it gets to the one that concerns its latest synchronous change. @domenic, these two copies, with strict rules around which thread can access which, should make it easier for your new history API to have synchronous access to its own session history.
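One way to picture the "ignore stale synchronisations" rule is with a version counter. The sketch below is an assumption about how it could work, not something the PR specifies: `NavigableHistoryView`, `local_version` and `acked_version` are hypothetical names. The navigable bumps a counter on every synchronous change, and a synchronisation from the top level is applied only if it acknowledges the latest counter value.

```python
class NavigableHistoryView:
    """A navigable's own copy of its session history entries (hypothetical)."""

    def __init__(self):
        self.entries = []
        self.local_version = 0  # bumped on every synchronous change

    def change_synchronously(self, entries):
        """Apply a synchronous change and return the version sent to the top level."""
        self.entries = list(entries)
        self.local_version += 1
        return self.local_version

    def receive_sync(self, acked_version, entries):
        """Apply a synchronisation from the top level unless it is stale."""
        if acked_version < self.local_version:
            # The top level has not yet seen our latest synchronous change.
            return False
        self.entries = list(entries)
        return True
```

After five synchronous changes, synchronisations acknowledging versions 1 to 4 are dropped and only the one for version 5 is applied.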
Interesting. My understanding of how this is implemented, generally, is that the "browser process" has the list of trees, and individual frames' processes only have the /cc @natechapin for his thoughts, as he's been looking at this recently and can correct me if I've got this wrong.
This seems reminiscent of the interop issues in this document.
I agree in theory, although I worry about the potential mismatch with implementation in practice...
Ok, I won't over-index on those requirements. We can figure out what should be copied across & when some other time.
I've been playing around with some pseudocode to get a feel for the shape of the algorithms. Right now it's roughly like this: New navigations select a target context then go to navigate. History traversals go to traverse the history by a delta. Navigate handles:
Traverse the history to an entry handles:
Traverse the history by a delta handles:
Issues with this approach:
Here's what I'm going to try and do: "Navigate":
"Attempt populate history entry":
"Sync to a history step":
Any red flags there?
Generally no red flags. I must admit I'm a bit sad that just after I feel like I've got the navigation algorithm straight in my head, you plan to change it dramatically, but your "issues with this approach" (2)-(4) are compelling.
You might get yourself into trouble here with "initiator": that's #1130. Maybe you can just preserve the existing brokenness there, or maybe I can try to fix some of it to prepare the way.
Yeah, I saw the hand-waving with "initiator" 😞. I was going to leave it vague in this first pass. Or maybe it'll bother me too much. Thanks for the link, I didn't realise there was an existing issue for this!
5828aa9 to 393dd31
@domenic I've just done my read-through & tweaking of "clicking on a hash-navigation link" if you want to review that path too. I'll let you know as other paths are ready for review.
I traced through that path as best I could, and probably commented on some stuff that isn't applicable for such cases as well. I think the trickiest things I found were:
- Be careful with document's URL vs. session history entry URL
- I don't understand "nested histories"
Otherwise it's mostly minor requests for clarification or suggested touchups.
I'm currently looking at rearranging the path of "navigate" a bit. Right now the spec follows this pattern:
- Navigate unloads the current document, then calls out to process a navigate fetch, process a navigate response, or process a navigate URL scheme.
- Process a navigate fetch handles redirects and such, and calls process a navigate response.
- Process a navigate response handles CSP, 204, 205, and download triggering, then calls out to the correct handler for the type.
- Process a navigate URL scheme is a little hand-wavey and broken.
- The handler for the type (e.g. plain text file) takes care of creating the document, then calling update the session history.
- Creating the document selects the browsing context (following COOP), records navigation timing, and handles declarative refreshing.
- Update the session history does what you'd expect.
It feels odd that this is pretty much a sequence of calls. Because of this it requires a bag of state to be passed around. You also get weirdness like the document creation steps handling navigation timing. I'm going to try and make it like this: Navigate:
Populate history item fetches the entry's URL, handles CSP, 204, 205, and downloads, otherwise calls populate history item with response with the history entry and response. Populate history item with response creates the document depending on the response type, records navigation timing, switches the browsing context in the history entry if needed, and handles declarative refreshing.
In this case it feels like "navigate" handles the navigation, and calls out to other things for part of that, rather than it being a sequence of calls that each hand off to another. I'm hoping this will reduce the amount of state that needs to be passed around, making things easier to follow. It might even turn out that the history entry itself is enough state.
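A rough Python sketch of that shape, for discussion only. The function names follow the comment, but everything else is assumed: the `Response` and `HistoryEntry` fields, the injected `fetch` callable, the body of `navigate`, and the simplification that CSP blocks, 204, 205 and downloads all just mean "no document for this entry".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Response:
    status: int
    body: str = ""
    is_download: bool = False
    blocked_by_csp: bool = False


@dataclass
class HistoryEntry:
    url: str
    document: Optional[str] = None  # stand-in for a real Document


def populate_history_item_with_response(entry, response):
    # Stand-in for "create the document depending on the response type".
    entry.document = "document(%s): %s" % (entry.url, response.body)
    return True


def populate_history_item(entry, fetch):
    """Fetch the entry's URL; return False if no document should be created."""
    response = fetch(entry.url)
    if response.blocked_by_csp or response.status in (204, 205) or response.is_download:
        return False  # simplified handling; see the assumptions above
    return populate_history_item_with_response(entry, response)


def navigate(session_history, url, fetch):
    """'navigate' owns the flow; the history entry is the only state passed around."""
    entry = HistoryEntry(url)
    if populate_history_item(entry, fetch):
        session_history.append(entry)
    return entry
```

Here `navigate` stays in charge and the helpers return to it, rather than each step handing off to the next.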
Hmm. I have a bit of status quo bias because I've spent a good amount of time fixing up and understanding the existing algorithms, but I don't quite see why your proposed alternative is better.
In particular, what I find intuitive about the current spec is that, at a high level, there's a pipeline: input -> response -> document -> update history. Each step feels nicely separated and feeds into the next, as e.g. we can see by how "process a navigate response" gets reused.
Your new proposal seems confusing to me in that it starts creating history entries before we even have a response, much less a document. It seems like it's overloading the concept of history entry to be two things: a bag of state that may one day turn into a real history entry, and then eventually a real history entry. In other words, I like how in the current spec history entries are only ever created right before they're put into a real session history list.
Also just on a naming level, I think "populate history item" is much less clear than "process a navigate fetch".
…s not fire beforeunload, a=testonly Automatic update from web-platform-tests Test that javascript: URL navigation does not fire beforeunload The current spec fires beforeunload, but the rewrite in whatwg/html#6315 does not. -- wpt-commits: e5144d4daa5979805e0e1360c2bc69abf6825bff wpt-pr: 36488
…nts use the initiator as referrer, a=testonly Automatic update from web-platform-tests Test that javascript: URL-created documents carry over referrer Follows whatwg/html#6315. -- wpt-commits: 5ad834e4a682b4b4acd5428be83254318b81ad0f wpt-pr: 36709
…n. (#354) The reference to "active document" was broken after whatwg/html#6315 (it now belongs to a "navigable"). Rather than trying to make sense of the changes to the HTML spec, just remove the note since it was not adding a lot of information anyway.
Co-authored-by: Domenic Denicola <[email protected]>
Since whatwg/html#6315, the HTML spec suggests other specifications use "navigable" and associated concepts (along with Document) rather than "browsing context" in most cases. In this specific case, however, we can simply remove the step that checks if `document`'s browsing context is null -- there is no case in which a document is fully active _and_ has a null browsing context, as confirmed by whatwg/html#9509. Fixes #362.
This monster completely rewrites everything to do with navigation and traversal.
It introduces the "navigable" and "traversable navigable" concepts, which take on many of the roles that browsing contexts previously did, but better. A navigable can present a sequence of browsing contexts, which to the user seem to all be the same, but due to browsing context group switches, have different WindowProxys and are allocated in different agent clusters. A traversable navigable manages the session history for itself and all its descendant navigables, providing a synchronization point and source of truth.
The general flow of navigation and traversal is now geared toward creating a session history entry, populated with the appropriate document, before finally applying the history "step". The step concept for session history, managed by the traversable, replaces the previous idea of joint session history, which was a sort of deduplicated union of individual session histories for each browsing context within a top-level browsing context.
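As an illustration of the "step" idea, consider one top-level traversable with two iframes, each of which navigates once. The sketch below is a simplified model rather than the spec algorithm: `Entry`, `target_entry` and `apply_history_step` are names chosen for this example, and the selection rule (each navigable shows its entry with the greatest step not exceeding the target step) is an assumption of the sketch, not something stated in the description above.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    step: int  # position on the traversable's shared timeline
    url: str


def target_entry(entries, step):
    """Entry with the greatest step <= the target step.

    Assumes every navigable has an entry at or before the target step.
    """
    candidates = [e for e in entries if e.step <= step]
    return max(candidates, key=lambda e: e.step)


def apply_history_step(histories, step):
    """Map each navigable name to the URL it should show at the given step."""
    return {name: target_entry(entries, step).url
            for name, entries in histories.items()}


histories = {
    "top": [Entry(0, "/")],
    "iframe1": [Entry(0, "/a1"), Entry(1, "/b1")],  # navigates at step 1
    "iframe2": [Entry(0, "/a2"), Entry(2, "/b2")],  # navigates at step 2
}
```

In this model, traversing back from step 2 to step 1 changes only `iframe2`, and no per-navigable history has to be merged into a joint list.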
Notable things to still do before merging:
Notable things we won't tackle this round, but are much easier to tackle in the future:
Closes #854 by clarifying the javascript: URL origin and origin-checking setup.
Closes #1073 by properly resetting active-ness of documents when they are removed.
Closes #1130 by removing the source browsing context concept, using a sourceDocument argument instead, and taking source snapshot params at the appropriate early time.
Closes #1191 by properly sharing document state across documents, as well as overlapping same-document navigations plus cross-document traversals.
Closes #1336 by properly handling child browsing contexts.
Closes #1382 by only unloading after we are sure we have a new document (i.e., not a 204 or download).
Closes #1454 by rewriting session history closer to what implementations do, with the nested history concept in particular taking care of the issues discussed there.
Closes #1524 by introducing the POST data concept and storing it in the document state.
Closes #2436 by rewriting the spec for history.go() to be clear about the results. Tests: web-platform-tests/wpt#36366.
Closes #2566 by introducing an explicit "history object" definition. Tests: web-platform-tests/wpt#36367.
Closes #2649 through clear creation of srcdoc documents, including during history traversal.
Closes #3215 by preserving POST data and reusing it on reloads.
Closes #3447 by specifying a precise mechanism (the ongoing navigation) for canceling navigations, and the points at which that mechanism is consulted. It also stops queuing a task for hyperlink navigations.
Closes #3497 by posting appropriate tasks for cross-event-loop navigations.
Closes #3615 by rewriting traverse a history by a delta, which eventually calls into apply the history step, to navigate all relevant navigables.
Closes #3625 by storing information in the document state (not just the URL), so that future traversals can reconstruct the request appropriately.
Closes #3730 by doing proper task queuing for navigation, including one for javascript: URLs but not including one for normal same-frame navigations. Tests: web-platform-tests/wpt#36358.
Closes #3734 by rewriting the definition of script-closable to use well-defined concepts.
Closes #3812 by removing all uses of "active document" as a predicate instead of a property.
Closes #4054 by introducing the session history traversal queue and renaming the previous "history traversal task source" to "navigation and traversal task source".
Closes #4121 by doing the "allowed to navigate" check at the top of apply the history step.
Closes #4428 by keeping a strong reference from documents (including bfcached documents) to their containing browsing context.
Closes #4782 by introducing the top-level traversable and navigable concepts.
Closes #4838 by doing sandbox checking in a much more precise manner, in particular snapshotting the relevant flags early in any traversals.
Closes #4852 by using document state (in particular history policy container, request referrer, and request referrer policy) in reloads.
Closes #5103 by properly restoring scroll positions for everything that is traversed, as part of properly traversing more than one navigable.
Closes #5350 by properly restoring window names across browsing context group switches, and going back to the same browsing context as was previously there when traversing back across a BCG switch boundary. (Implementations could create new browsing contexts, as long as they restore the WindowProxy scripting relationships and other browsing context features; the result is observably equivalent.)
Closes #5597 by rewriting "allowed to download" to just take booleans, derived from the appropriate snapshotted or computed sandboxing flags.
Closes #5767, modulo bugs and oversights we made, by rewriting everything :).
Closes #5877 by respecifying "fully active" in terms of navigables, instead of browsing contexts.
Closes #6446 by properly firing beforeunload to all descendant navigables, although whether or not they actually prompt still allows implementation leeway.
Closes #6483 by introducing the distinction between current session history entry and active session history entry.
Closes #6514 by settling on using a single origin for these checks.
Closes #6628 by storing window.name values in the document state, so even in strange splitting situations like described there, they remain.
Closes #6652 by no longer changing history.state when reactivating a document from bfcache ("restore the history object state" is called only when documentsEntryChanged is true). Tests: web-platform-tests/wpt#36368.
Closes #6773 by having careful handling of synchronous navigations during traversals. Test updates: web-platform-tests/wpt#36364.
Closes #6798 by treating javascript: URL navigations as replacements.
Works towards #6809 by storing srcdoc resources in the document state.
Closes #6813 by storing referrer in the document state. Tests for the repopulation case: web-platform-tests/wpt#36352. (No tests yet for the reload case.)
Closes #6947 by rolling its contents into this change: PDF documents are put in the same category as other inaccessible, no-DOM documents.
Closes #7107 by clearing history state on redirects and when origin changes by other means, such as CSP.
Closes #7441 by making window.blur() a no-op because that was simpler than updating it to operate on navigables.
Closes #7722 by incorporating its contents into the rewritten version.
Helps with #8395 by at least ensuring the javascript: case does not fire beforeunload. Tests: web-platform-tests/wpt#36488. (The other cases remain open for investigation and testing.)
/browsers.html
/browsing-the-web.html
/canvas.html
/common-microsyntaxes.html
/comms.html
/custom-elements.html
/dnd.html
/dom.html
/dynamic-markup-insertion.html
/embedded-content-other.html
/embedded-content.html
/form-control-infrastructure.html
/form-elements.html
/iana.html
/iframe-embed-object.html
/imagebitmap-and-animations.html
/images.html
/index.html
/indices.html
/infrastructure.html
/input.html
/interaction.html
/interactive-elements.html
/introduction.html
/links.html
/media.html
/microdata.html
/obsolete.html
/parsing.html
/references.html
/rendering.html
/scripting.html
/sections.html
/semantics-other.html
/semantics.html
/server-sent-events.html
/structured-data.html
/syntax.html
/system-state.html
/timers-and-user-prompts.html
/urls-and-fetching.html
/web-messaging.html
/webappapis.html
/webstorage.html
/workers.html
/worklets.html
/document-lifecycle.html
/document-sequences.html
/nav-history-apis.html