
performance.memory #85

Closed · past opened this issue Apr 30, 2018 · 49 comments
Labels
venue: W3C CG Specifications in W3C Community Groups (e.g., WICG, Privacy CG)

Comments

@past commented Apr 30, 2018

Request for Mozilla Position on an Emerging Web Specification

Other information

The main goal of this specification is to "provide developers a mechanism to identify releases of their site that regress memory usage". Chrome has implemented an older proposal that has significant issues with privacy and accuracy. Some Google properties already use this API for capturing in-the-wild memory usage data to help:

  • A/B test and gradually roll out new features
  • Monitor for regressions in memory usage (including browser changes)

There has been some related discussion recently in blink-dev.
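For context, a rough sketch of the kind of in-the-wild collection described above, using the existing Chrome-only API (the /memory-metrics endpoint and release tag are hypothetical; performance.memory is non-standard and Chrome-only):

```js
// Sketch of field collection with the legacy Chrome-only API.
// performance.memory and its usedJSHeapSize/totalJSHeapSize fields
// are non-standard, so feature-detect before using them.
function reportMemorySample(releaseTag) {
  const mem = performance.memory;
  if (!mem) return; // unsupported outside Chrome
  navigator.sendBeacon('/memory-metrics', JSON.stringify({
    release: releaseTag, // which deploy / experiment arm produced this sample
    usedJSHeapSize: mem.usedJSHeapSize,
    totalJSHeapSize: mem.totalJSHeapSize,
    userAgent: navigator.userAgent, // aggregate per UA on the server
  }));
}
```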

Open questions

  1. Do people want fast measurements or detailed measurements (that may jank the browser)?
  2. How can we achieve zero space/time overhead when not queried, O(1) performance in allocated memory when queried, and accurate measurements?

Some notes about a potential implementation:

  • With Fission, we might be able to report per-process statistics that have all these properties, but without much granularity
  • Can we cache the periodically-generated memory reports that we use for telemetry reporting?

One explicit non-goal for this specification is comparing memory measurements between different browsers.

@marcoscaceres (Contributor)

@past, is there anyone in particular you think would be good to get input from? Do you have a personal opinion about this API?

@froydnj commented May 1, 2018

@glandium, @nnethercote, or @EricRahm may have comments surrounding platform specifics.

Regarding open questions, it looks like the API (at least for memory.privateMemoryFootprint) was designed for zero overhead when not being queried, and constant-time performance when queried. The API also looks like it was designed for fast measurement; detailed reporting appears to have been discarded as too difficult to implement cross-platform. So jankiness shouldn't be a problem.

Regarding the implementation notes, we could probably reuse some of the infrastructure that about:memory or Telemetry's memory reports rest on, but caching the memory reports shouldn't be necessary.
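For illustration, the read path this implies (a hedged guess at the IDL shape based on the attribute name above; the proposal's actual surface may differ):

```js
// Hypothetical usage, assuming privateMemoryFootprint is exposed as a
// synchronous attribute in bytes: reading it should cost O(1) time,
// and a UA pays nothing for it when no page ever reads it.
const bytes = performance.memory.privateMemoryFootprint;
console.log(`private footprint: ${(bytes / 1048576).toFixed(1)} MiB`);
```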

@glandium commented May 1, 2018

The API pretty much relies on the fact that Chrome uses one process per site.

@EricRahm commented May 1, 2018

  1. The language in Appendix D is definitely Chrome-dependent.
  2. I'm also not convinced that memory should be reported per frame. It seems like you'd want the memory usage for your overall page, including any iframes, but that's debatable.
  3. Roughly equating overall process memory with overall page memory is a bit sketchy. For instance, some Firefox content processes will contain activity-stream info, but not all; that memory has nothing to do with the page.
  4. Excluding all shared memory might be a mistake. I agree that ignoring bits shared by the OS makes sense, but perhaps not ignoring mmappings for graphics. A possibly flawed example: a site that makes heavy use of canvases, where you'd probably want that memory overhead reported even if it's backed by shared memory.

@glandium do you have any feedback on the proposed implementation as far as the terminology and the portions they plan to measure (including the pseudo-code)? For example, I'm not sure using PrivateUsage is the right thing on Windows (that's commit charge rather than resident-unique).

@glandium commented May 1, 2018

I don't think the details of what they want to measure matter much, because well before we get to those, the assumption that the process<->site model is Chrome's pretty much kills it as a useful cross-browser standard, from my POV.

@bzbarsky (Contributor) commented May 2, 2018

I would like to urge everyone to read the thread at https://groups.google.com/a/chromium.org/forum/#!searchin/blink-dev/memory$20api/blink-dev/no00RdMnGio/hA9SZ3kmBAAJ

The basic problem with the initial proposals there was that they were tied to V8's memory reporting.

The current proposal is sort of an attempt to define an API that satisfies the three properties I listed at https://groups.google.com/a/chromium.org/d/msg/blink-dev/no00RdMnGio/b1Wvn2_uBwAJ. You're right that it assumes a particular process model; a process model that no browser actually ships right now. This is somewhat better for interop than assuming a particular JS implementation, especially if browsers were to converge on that process model.

That said, I think the Google folks would welcome suggestions for ways to improve this stuff and get it to the point where it could be standardized. If we have concrete suggestions we should offer them...

@annevk (Contributor) commented May 2, 2018

Given #81 (comment) I suspect at least some Mozillians would object to making (parts of) the web platform dependent on a particular architecture. It doesn't seem ideal to me either.

@EricRahm commented May 2, 2018

@annevk I don't see the link with CORB. Finding a memory measurement that is useful cross-browser/cross-platform is pretty much a lost cause. Each browser has a different engine, each browser defines a different algorithm for how many sites are in a content process (if at all), each OS defines different rules for how memory is allocated, freed, and cached.

Defining a consistent way for sites to perform measurement that is useful even if it's only good for comparisons within a given browser on a given platform seems like an okay goal. It would be a huge win to push the triage of memory regressions from browser developers (read: me) to the sites themselves.

@glandium commented May 2, 2018

I would like to urge everyone to read the thread at https://groups.google.com/a/chromium.org/forum/#!searchin/blink-dev/memory$20api/blink-dev/no00RdMnGio/hA9SZ3kmBAAJ

That thread was instructive, because it wasn't clear to me what this spec was meant to achieve.

The current proposal is sort of an attempt to define an API that satisfies the three properties I listed at https://groups.google.com/a/chromium.org/d/msg/blink-dev/no00RdMnGio/b1Wvn2_uBwAJ.

AIUI, those properties are what prompted the API to rely on system metrics. OTOH, using system metrics breaks most of the interesting points in the requirements:

Context free: Actions that a user takes on other sites or on other applications should not affect the memory footprint of the site calling the API.

Without the Chrome process model, that can't be fulfilled with system metrics.

Comprehensive: Should account for all memory allocated on behalf of the web page, including memory required by the browser's internal structures.

System metrics don't provide this information for internal structures used on behalf of the web page in other processes.

Actionable: (...)

With "Context free" being out of the picture outside Chrome, I'm afraid getting a signal from the aggregate is only possible for sites the size of Facebook or Gmail (both mentioned in the thread). That's good for such large sites, but less so for the web at large.

Definition consistent on all supported platforms.

I'm not sure system metrics actually allow consistency here.

Now, the problem is: if not system metrics, then what? I'm afraid the answer is that you can only pick one of the properties you listed. Maybe two if you're lenient. I guess the question becomes what tradeoffs could make this useful to more than Facebook and Gmail in Chrome. I don't have an answer to that, but I'm not convinced this spec provides one either.

@bzbarsky (Contributor) commented May 2, 2018

Finding a memory measurement that is useful cross-browser/cross-platform is pretty much a lost cause.

Part of the problem is that any API like this that exists across browsers will get used for cross-browser comparisons. :(

I'm afraid the answer is that you can only pick one of the properties you listed. Maybe two if you're lenient.

Yeah, that's about where I was...

@bzbarsky (Contributor) commented May 2, 2018

System metrics don't provide this information for internal structures used on behalf of the web page in other processes.

Oh, this is an important point, and I think we should raise it with the Chrome folks now. I hadn't thought of it, and it does make the entire "use the system metrics for the process" approach not really right even in the best case.

@glandium commented May 2, 2018

Some more thoughts. It seems this (old) blog post covers the use cases for this: http://www.html5rocks.com/en/tutorials/memory/effectivemanagement/. There are essentially two things in there:

  • using devtools locally to track memory usage and debug it. I don't think that needs web-exposed APIs.
  • collecting data from the field. That's the only use case actually listed on https://github.com/WICG/performance-memory/blob/master/explainer.md#use-cases, and it explicitly says users should aggregate per navigator.userAgent, which means that different browsers, the same browser running on different OSes, and even different versions of the same browser should be aggregated separately.

The latter, as hinted in my previous comment, may require a lot of data, even if the measurements are perfect.
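A minimal sketch of that aggregation rule, assuming field samples arrive as { userAgent, bytes } records (the record shape is an assumption):

```js
// Bucket samples by the full UA string so different browsers, browser
// versions, and OSes are never mixed, per the explainer's guidance.
function aggregateByUA(samples) {
  const buckets = new Map();
  for (const { userAgent, bytes } of samples) {
    if (!buckets.has(userAgent)) buckets.set(userAgent, []);
    buckets.get(userAgent).push(bytes);
  }
  // Report a per-bucket median; means are easily skewed by outliers.
  const median = xs => xs.slice().sort((a, b) => a - b)[xs.length >> 1];
  return new Map([...buckets].map(([ua, xs]) => [ua, median(xs)]));
}
```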

Part of the problem is that any API like this that exists across browsers will get used for cross-browser comparisons. :(

All of the above tempts me to say, half-jokingly, that an API returning a "bag of JSON, consult your browser vendor" might actually be a better option.

@amccreight

froydnj kind of alluded to this, but I thought I'd point out that in Firefox we could actually report a good chunk of per-site memory, including JS and DOM (including things like memory entrained by arrays and the like), with little additional overhead, and without assuming that a process only contains a single website. This is because we're periodically running our memory reporter for telemetry purposes anyway, and it gathers that information. (If you look at about:memory in Firefox, you can see what is being gathered, though for telemetry purposes the sites are removed, which I guess would be an issue if we wanted to use the data for another purpose that does care about the particular site.)

@amccreight

WICG/performance-memory#7 and WICG/performance-memory#8 are interesting reading about what two sites are interested in knowing via a memory API.

@glandium commented May 2, 2018

This is because we're periodically running our memory reporter for telemetry purposes anyway, and it gathers that information

I wonder how long that takes on my profile, because about:memory takes multiple seconds. But I don't have telemetry enabled, so I've never noticed we did that. I sure hope most of the time spent in about:memory is on the JS side of things.

@amccreight

The telemetry version skips some things that make it slow, like hashing strings.

@EricRahm commented May 3, 2018

FWIW, our memory measurement is roughly asynchronous; it shouldn't really be causing jank. Perhaps it makes sense to declare it asynchronous in the spec?

@tschneidereit

[I]n Firefox we could actually report a good chunk of per-site memory, including JS and DOM

Note that we share some JS data across domains, notably functions' bytecode. Hopefully we'll be able to transfer some or all of that into a post-Fission world. I'm not sure if that makes a difference for this use case, as long as we include the shared memory in the reported numbers, but it's something to keep in mind.

@dbaron (Contributor) commented May 3, 2018

One other thought about how to deal with the fact that many of these measurements may be browser-specific and may change over time within a given browser:

  • It may make sense to expose some browser-specific details to sites. At a minimum, there should be an expectation that numbers from different browsers (or even different browser versions) aren't apples-to-apples comparisons. But if the purpose of the API is for sites to detect memory-usage regressions, they might be willing to write some browser-specific handling, or, even better, might be able to do regression detection without anything browser-specific if, for example, the memory report came in the form of a list of named buckets and their numeric values, where the set of named buckets might vary between browsers (see the sketch after this list). I think it may well be reasonable for Firefox to expose a different set of metrics than Chrome, based on how the different browsers work, and if the API were a set of named buckets with sizes, this would still allow sites to do useful regression detection.
  • Exposing browser-specific things gives us the risk of forward-compatibility problems, since sites might depend on something we want to change. One possible way to mitigate this risk might be to borrow an idea from Blink's origin trials and only expose the API to 75% of [Edit: clarified; misstated this the first time] users of a web page. (It could be exposed based on, say, a hash of an install-specific ID and the eTLD+1, so that different users have it turned off on different domains.) This would mean that if we ended up in a case where major sites depended on something in the API that we needed to change, we could just rename the API, so that all of the new users would appear to be in the "API not available" 25% bucket (which should still work). But exposing the API to 75% of a site's users would still satisfy regression-detection goals for that site.
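A sketch of how a site could consume such named buckets (getMemoryBuckets and the record shape are hypothetical; nothing like it exists in the proposal yet):

```js
// Compare each named bucket against the same bucket in a stored
// baseline. Bucket names are allowed to differ between browsers and
// versions; names missing from the baseline are simply skipped, so
// browser-specific buckets still work for regression detection.
function findRegressions(baseline, current, threshold = 1.2) {
  const regressions = [];
  for (const { name, bytes } of current) {
    const before = baseline.get(name); // baseline: Map<string, number>
    if (before !== undefined && bytes > before * threshold) {
      regressions.push({ name, before, after: bytes });
    }
  }
  return regressions;
}
// const current = await performance.getMemoryBuckets(); // hypothetical API
```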

@past (Author) commented May 4, 2018

We discussed this proposal at the web-perf call yesterday and, based on Apple's feedback and ours (I summarized this thread), there is going to be another iteration to try to smooth things out. Specifically, the spec will be less prescriptive about process models and requiring system metrics, and the asynchronicity aspect will be added. I will update the thread when there is a new proposal to consider.

One point in this thread that I'm not sure I understood well enough is the one @glandium raised about this API being useful only to large sites. Is this about sites that require large amounts of memory, or about sites that users spend a lot of time on (hence more opportunities for signal collection)? If it's the latter, a point that was raised is that PWAs will let long-tail web sites gather enough signal, too.

Given the difficulty of reconciling web sites' hunger for more information with browsers' reluctance to dole it out, I suggested that it may be useful to consider whether an MVP-1 spec would be implementable. That is, if even a very limited spec that web sites would find too constrained to be useful failed to get browser consensus on implementability, then perhaps this effort is not going to succeed after all.

@wanderview commented May 4, 2018

@past Does the spec handle resources that might be shared by multiple open tabs for a single origin? For example, a SharedWorker can be attached to multiple tabs simultaneously. Do both tabs count that memory, is it divided, or is it ignored? What about a service worker that is not explicitly attached to a tab, but might be controlling zero or more tabs of the origin?

(Sorry if this was covered already. I've only been skimming this thread.)

Edit: these are also recurring issues we run into with devtools when trying to map resources to a single tab, btw.

@glandium commented May 4, 2018

@past What I was saying is that the spec as described, using system metrics without the Chrome process model, would yield a lot of noise in the data due to other sites, browser internals, etc. (mostly the former). For the API to be of any use in that case, you need large amounts of data to compensate for the large noise.

@dbaron (Contributor) commented May 4, 2018

I think it's more than just noise; it's signal. If the spec requires that we include everything in the process even when we know it's for a different site, then a memory regression in Gmail or Facebook would make every site see a memory-use regression in their metrics for Firefox users, since a significant portion of Firefox users will have Gmail or Facebook in the same process.

I think it's important that the spec lets us give sites metrics that are useful for them to be looking at given our process model.

@past (Author) commented May 7, 2018

There have been spec updates in the last few days based on the feedback from the WG. They are very much in line with what we have been discussing here, so I would encourage folks to take another look and see how it holds up now.

@wanderview the spec mentions shared memory, but not workers explicitly. It recommends that all shared memory be attributed to a single owner. The service worker point that you made sounds like something we should raise an issue for.

@past (Author) commented May 7, 2018

The updated spec suggests at least a 30-second time quantization. How often do we update the telemetry data we collect around memory usage?
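For the skimming reader, a sketch of what that quantization could mean from the querying side (the wrapper and constant are illustrative, and a UA would enforce this internally; getMemoryEstimate is the method name discussed later in this thread, assumed here to hang off performance):

```js
// Refresh the estimate at most once per 30-second quantum and hand
// back the cached value otherwise, so rapid polling reveals nothing new.
const QUANTUM_MS = 30_000;
let cached = null;
let cachedAt = -Infinity;
async function quantizedEstimate() {
  if (performance.now() - cachedAt >= QUANTUM_MS) {
    cached = await performance.getMemoryEstimate(); // proposed API
    cachedAt = performance.now();
  }
  return cached;
}
```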

@wanderview

I filed WICG/performance-memory#17 for the SharedWorker/ServiceWorker issue. I recommended possibly reporting memory for each client or environment independently.

@wanderview

Note, the discussion in WICG/performance-memory#17 suggests that perhaps the proposal does not clearly define how this API should work across origin/site/domain boundaries. I think that must be defined in order for us to consider prototyping, etc.

@pweis8 commented May 14, 2018

Google G Suite engineer here. We're very excited to see this proposal take shape. To help motivate our use cases for this API, we've compiled a document summarizing our existing uses of the Chrome-only API, its shortcomings, and how a standardized API would provide significant value to us and across the web (already referenced earlier here via WICG/performance-memory#8).

Quick summary of our main use cases: A/B testing, detection of regressions in our apps, and better understanding of memory usage from real users.

A few brief comments from our perspective on issues raised in this thread:

  • While a high-level summary of memory usage is already very useful for our use cases, detailed information (e.g. canvas or image memory) would add a lot of value for pinpointing regressions and understanding usage patterns. Having at least some of these details standardized seems like a good idea, but there will likely be browser-specific fields here, and we'd strongly prefer having those available to apps.
  • Time-limiting how often the data is refreshed, or using existing cached data, would limit the usefulness of the memory measurements. Ideally we would be able to tie memory measurements to user actions, so that we can get reliable measurements after specific interactions with our apps and detect regressions that way (see the sketch after this list). We understand that these measurements wouldn't be perfect because of GC behavior, but believe that in aggregate they would still provide very useful information.
  • The current Chrome-only API does not include memory from shared workers or service workers. We understand this might depend on browser implementation details, but we would be happy with excluding worker memory from these measurements and ideally making worker memory data available in the worker context, separate from the main frame.
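A sketch of the user-action pattern from the second bullet (the endpoint, element selector, and double-rAF settling heuristic are illustrative; getMemoryEstimate follows the proposal as discussed later in this thread):

```js
// Measure after a specific interaction: wait two animation frames so
// the interaction's work has a chance to settle, then sample and beacon.
// GC timing still adds noise, which aggregation is expected to absorb.
async function measureAfterInteraction(label) {
  await new Promise(r => requestAnimationFrame(() => requestAnimationFrame(r)));
  const bytes = await performance.getMemoryEstimate(); // proposed API
  if (bytes !== null) {
    navigator.sendBeacon('/memory-metrics', JSON.stringify({ label, bytes }));
  }
}
document.querySelector('#open-doc') // hypothetical UI element
  ?.addEventListener('click', () => measureAfterInteraction('open-doc'));
```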

@erikchen commented Jun 8, 2018

Hi!

I worked with wanderview@ to clean up the proposal. Please take another look.

Question: What would be an appropriate communication medium for me to follow up on comments and feedback from this group?

Aside: All of your feedback has been insightful and helpful. As someone new to the web-spec space, I've been heartened by this friendly community. Thanks again for the support! =D

@past (Author) commented Jul 25, 2018

Could people please take another look at the updated proposal and see if there is anything objectionable there?

@bzbarsky (Contributor)

I just took a look; it looks like the updated proposal is trying to take the privacy issues quite seriously, which I appreciate.

I have two questions about the spec IDL:

  1. Why a callback instead of a promise return value?

  2. Why null as the "no info" value instead of 0? They seem equally easy to test, but null has the problem that it wouldn't be too hard to accidentally write code that works as long as the value is not null but throws when it is, and then only test it in cases where null is not returned (see the snippet below).
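A minimal illustration of that hazard (`label` is an assumed DOM element; written with the promise shape agreed on later in this thread, inside an async function):

```js
// Works whenever a number comes back, but throws a TypeError the
// first time the UA returns null -- which easy-to-miss testing may
// never exercise. With 0 as the sentinel this would render "0 bytes"
// instead of crashing.
const estimate = await performance.getMemoryEstimate(); // proposed API
label.textContent = `${estimate.toFixed(0)} bytes`; // throws when estimate is null
```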

Apart from that, the main question is how implementable the must-level requirements about including everything are....

@erikchen commented Aug 1, 2018

Why a callback instead of a promise return value?

I'm insufficiently familiar with standards best practices. I'll switch it to a promise.

Why null as the "no info" value instead of 0?

Since we're adding normalized noise, it's possible for the return value to be 0 [or even negative] and have different semantics than null.
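To make that concrete, a sketch of how zero-mean ("normalized") noise can push a small true value to zero or below (the Box-Muller transform below is standard; the sigma value is purely illustrative):

```js
// Perturb the true measurement with zero-mean Gaussian noise. For
// small trueBytes, the noisy result can legitimately be 0 or negative,
// which is why 0 must keep a meaning distinct from null ("no info").
function noisyEstimate(trueBytes, sigma = 512 * 1024) {
  const u1 = 1 - Math.random(); // in (0, 1], avoids log(0)
  const u2 = Math.random();
  const gaussian = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
  return trueBytes + sigma * gaussian;
}
```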

Apart from that, the main question is how implementable the must-level requirements about including everything are....

For Chrome's implementation, I was intending to rely on the assumption of site-per-process isolation, at which point we can mostly just use process memory metrics, with some touch-ups. We'll still need to return "null" for certain edge cases.

I'm insufficiently familiar with Firefox's implementation to know whether the current form of the spec is possible to implement.

If it helps -- achieving a "perfect" measurement of memory has never been the goal [and quickly devolves into semantics discussions]. I mostly want web developers to be able to catch large, obvious regressions [e.g. accidentally retaining 100 canvases, or unconditionally allocating a large array buffer on startup].

The one case I do think we need to get right is including memory for related similar-origin browsing contexts [e.g. a same-origin subframe], which support synchronous scripting and can therefore participate in retain cycles, while not including memory for different-origin browsing contexts [e.g. ads], which would add too much noise.

We should probably have a conformance test suite that checks all the cases we care about (the first case is sketched after the list), e.g.:

  1. Start with empty site. Add a million DOM nodes. Memory estimate should go up.
  2. Start with empty site. Add 100 canvases [1000x1000, draw into them to make sure backing buffer is dirty]. Memory estimate should go up.
  3. Start with empty site. Add a different-origin iframe. In that iframe, make a million DOM nodes. Memory estimate should not change.
  etc.
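A rough sketch of the first case as an automated check (plain console.assert rather than any real test harness; getMemoryEstimate follows the proposal's promise-returning shape and is assumed to hang off performance):

```js
// Conformance test 1, sketched: adding a million DOM nodes should
// raise the estimate. A real test would also handle a null result.
async function testDomNodesIncreaseEstimate() {
  const before = await performance.getMemoryEstimate();
  const container = document.createElement('div');
  for (let i = 0; i < 1_000_000; i++) {
    container.appendChild(document.createElement('span'));
  }
  document.body.appendChild(container);
  const after = await performance.getMemoryEstimate();
  console.assert(after > before, 'estimate should grow after adding 1M DOM nodes');
}
```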

@bzbarsky (Contributor) commented Aug 1, 2018

I was intending to rely on the assumption of site-per-process isolation, at which point we can mostly just use process memory metrics

I guess I'm not sure how that goes along with the "Memory estimate should be directly attributable to the web developer" guideline. There's a bunch of process memory that web developers don't really have much control over, if any. I guess that's a guideline, not a normative requirement...

Also not sure how this would handle things like GPU memory (relevant to the canvases example).

I agree that there might not be perfect answers here; just want to make sure the spec doesn't paint browsers into corners, mostly. I think that it's in pretty decent shape at this point, for what it's worth.

@erikchen commented Aug 1, 2018

I guess I'm not sure how that goes along with the "Memory estimate should be directly attributable to the web developer" guideline. There's a bunch of process memory that web developers don't really have much control over, if any. I guess that's a guideline, not a normative requirement...

Good point. I've been implicitly relying on the assumption that the absolute value does not matter, so any fixed process overhead that gets included is not relevant. We could theoretically measure process memory right before loading any third-party code, and use that as a baseline [subtracting it from future measurements]. Then there's the matter of non-fixed process memory not directly attributable to the web developer. I think that mostly becomes a semantics discussion about the definition of "attributable".
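A sketch of that baseline approach in site code (function names are illustrative; getMemoryEstimate is assumed to hang off performance and resolve to bytes or null):

```js
// Snapshot the estimate before any third-party/app code runs, then
// report deltas so fixed process overhead cancels out.
let baselineBytes = null;
async function captureBaseline() {
  baselineBytes = await performance.getMemoryEstimate(); // proposed API
}
async function attributableBytes() {
  const now = await performance.getMemoryEstimate();
  return (now === null || baselineBytes === null) ? null : now - baselineBytes;
}
```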

Also not sure how this would handle things like GPU memory (relevant to the canvases example).

In Chrome, we have pretty good tracking of memory that crosses process boundaries [canvas, WebGL] and I was planning on adding those onto the process memory measurement to get the final result.

I agree that there might not be perfect answers here; just want to make sure the spec doesn't paint browsers into corners, mostly. I think that it's in pretty decent shape at this point, for what it's worth.

Thanks!

I guess for most specs, we want there to be a lot of normative text so that browsers have the same behavior. Maybe this API is an exception? As long as the implementation is internally consistent and comprehensive [and helps web developers find bugs!], the exact behavior is less important. Differences between browser vendors' implementations are expected, since memory is an implementation detail.

Maybe we should add a disclaimer like this to the proposal?

@bzbarsky (Contributor) commented Aug 1, 2018

There's non-fixed stuff that may still not really be under the control of the web page (thread stacks in thread pools, font caches, etc.).

I think that mostly becomes a semantics discussion about the definition of "attributable".

Right.

Maybe we should add a disclaimer like this to the proposal?

Makes sense to me.

@overholt commented Aug 1, 2018

One thing @smaug---- brought up in conversation with me is that getMemoryEstimate doesn't imply the estimate is very UA- (and UA build-) specific. I should open this on the other repo ...

@erikchen commented Aug 1, 2018

Disclaimer added: https://github.com/WICG/performance-memory/pull/18/files

getMemoryEstimate doesn't imply the estimate is very UA- (and UA build-) specific.

The current proposal states this several times. Did you mean the name of the API itself? What about getMemoryEstimateUASpecific()?

@overholt commented Aug 1, 2018

Yes, I meant the name of the method :) What you propose is fine, but I'm sure others more knowledgeable will have input (sorry to have asked for bikeshedding).

@erikchen commented Aug 2, 2018

@overholt: No one else has voiced opinions, so I've gone ahead and changed it to getMemoryEstimateUASpecific.

@wanderview

Not to continue the bikeshedding, but I think adding "UASpecific" makes the name excessively long. Not a major issue, I guess.

Another way to make it clear that this is UA-specific is to drop the absolute units: instead of reporting MB, simply report a unitless, UA-specific "memory load" value. It would be useful for comparisons within a single browser, but not across browsers.
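One possible reading of that suggestion, purely illustrative (the reference value would be chosen by the UA and could change between versions):

```js
// Unitless "memory load": the byte estimate divided by a UA-chosen,
// version-specific reference value. Comparable within one browser
// release, deliberately meaningless across UAs.
function memoryLoad(estimateBytes, uaReferenceBytes) {
  return estimateBytes / uaReferenceBytes; // e.g. 1.0 ~= "typical page"
}
```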

@smaug---- (Collaborator)

How is that clear? People would then assume the number means something reasonable and compare it between UAs. If the API returns something very UA- and even UA-version-specific, it is better to say so.

In other specs where some value is defined as UA-specific and random, we've still seen web devs complain when the API doesn't return the same or similar values in all UAs (PointerEvents' pointerId, for example).

@wanderview

Ok. Happy to see this go forward without my comments.

@dbaron added the venue: W3C CG label Aug 9, 2018
@birtles (Member) commented Aug 23, 2018

One very naive question: is there any need for this API to be available in regular browsing sessions? For the purpose of memory regression testing, would it be sufficient to enable it only when, for example, a certain pref or command-line flag is set?

@marcoscaceres (Contributor)

One very naive question: is there any need for this API to be available in regular browsing sessions?

In such a case, we wouldn't need a standard; we could just build our own custom thing.

@birtles (Member) commented Aug 23, 2018

One very naive question: is there any need for this API to be available in regular browsing sessions?

In such a case, we wouldn't need a standard; we could just build our own custom thing.

By that logic we wouldn't need standards such as WebDriver either.

@marcoscaceres (Contributor)

By that logic we wouldn't need standards such as WebDriver either.

I believe that was debated :) But seriously, is the value here for web developers or browser developers? And if it's for browser developers, what's the value of having shared tests across browsers related to memory?

@erikchen

This is intended for web developers. Facebook, Gmail, and G Suite have expressed interest in using the API. Memory usage of these sites depends on the content being displayed, and local testing may not be representative of real usage in the wild.

@erikchen commented Oct 2, 2018

Status update: I am pausing work on the API proposal.

In the blink-dev intent-to-implement thread, the concern was raised that the API would leak information about the size of cross-origin resources.

One proposal to mitigate this is to gate the API behind a feature policy that turns all resource requests into CORS requests. This seems generally desirable anyway, but is a long way off from being possible on the majority of websites.

Another possibility would be to manually track the size of cross-origin resources [and associated metadata] and exclude them from the total. That is prone to error and edge cases -- I don't want potentially easy-to-commit errors to have severe privacy implications.

Figuring out how to help Web Developers reduce the memory usage of their sites still seems like a high priority, but this doesn't seem to be the right approach.

Thank you all for the support during this process, and the many iterations on the proposal. Hopefully we'll get another chance to work together in the future. :)

@marcoscaceres (Contributor)

Closing, as incubation of this proposal was halted.
