Skip to content
This repository has been archived by the owner on Feb 20, 2023. It is now read-only.

Issue #12731: Collect telemetry about the content process of tabs getting killed. #17864

Merged
merged 1 commit into from
Feb 19, 2021

Conversation

pocmo
Copy link
Contributor

@pocmo pocmo commented Feb 5, 2021

AC issue:
mozilla-mobile/android-components#9366

Required change on the AC side:
mozilla-mobile/android-components#9634

This still needs a data-review.

r? @csadilek @dblohm7

Copy link
Contributor

@csadilek csadilek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, one comment below, and should we combine this with the existing TelemetryMiddleware or prefer keeping it separate?

}

@Suppress("MagicNumber")
private fun EngineState.ageNanos(): Long? {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait this confused me. This is returning ageMillis, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see, elapsedRealtime actually returns millis, but you want to record nanos. I guess nowNanos should be nowMillis and timeStamp is millis too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you are right, I changed this part of the code a couple of times and now the names do not make any sense. :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah OK, let's update to nowMillis and timeStampMillis then? :)

@csadilek csadilek self-assigned this Feb 9, 2021
@pocmo
Copy link
Contributor Author

pocmo commented Feb 10, 2021

Looks great, one comment below, and should we combine this with the existing TelemetryMiddleware or prefer keeping it separate?

Yeah, I saw TelemetryMiddleware when I wrote this and thought about it, but it felt like I'd be mixing two independent things and turning TelemetryMiddleware into something that's hard to digest. wdyt?

@pocmo
Copy link
Contributor Author

pocmo commented Feb 10, 2021

(EDIT: Updated to current version of data collection review form)

Request for data collection review form

All questions are mandatory. You must receive review from a data steward peer on your responses to these questions before shipping new data collection.

  1. What questions will you answer with this data?
  • How often does the content process of foreground/background tabs get killed?
  • How "old" was the foreground/background session when it got killed.

See issue: mozilla-mobile/android-components#9366

  1. Why does Mozilla need to answer these questions? Are there benefits for users? Do we need this information to address product or business requirements?

For debugging purposes. It seems like a large amount of users see unexpected "reloads" of websites they are on. This seems to be caused by content process kills by the Android system. With those probes we want to see how big this problem is across our user base and also see whether mitigation strategies improve the situation.

See:

  1. What alternative methods did you consider to answer these questions? Why were they not sufficient?
  • We tried manually debugging the situation. However this didn't give us a a good picture on how often/likely those content process kills happen with normal usage on a variety of devices.
  1. Can current instrumentation answer these questions?
  • No.
  1. List all proposed measurements and indicate the category of data collection for each measurement, using the Firefox data collection categories found on the Mozilla wiki.

Note that the data steward reviewing your request will characterize your data collection based on the highest (and most sensitive) category.

Measurement Description Data Collection Category Tracking Bug #
How often was the content process of a foreground (selected) or background tab killed. Category 1 “Technical data” mozilla-mobile/android-components/issues/9366
Age of the engine session of a foreground (selected) tab at the time its content process got killed. Category 1 “Technical data” mozilla-mobile/android-components/issues/9366
Age of the engine session of a background tab at the time its content process got killed. Category 1 “Technical data” mozilla-mobile/android-components/issues/9366

I assume all collected data qualifies as Category 1 “Technical data”. Maybe it is possible to interpolate a rough estimate about how many (crashed) tabs a user had open at some point (but no information about those tabs). Would that be Category 2 “Interaction data”?

  1. Please provide a link to the documentation for this data collection which describes the ultimate data set in a public, complete, and accurate way.

See metrics.yaml attached to this PR. After landing this PR the documentation will be available at:

  1. How long will this data be collected? Choose one of the following:
  • I picked end of the year (2021-12-31) for now. We want those probes to ride the trains to release and also see the effect of bugfixes in this area in future releases.
  1. What populations will you measure?
  • All release, beta, and nightly users with telemetry enabled.
  1. If this data collection is default on, what is the opt-out mechanism for users?

This data collection hooks into the data collection mechanism of the app and can be disabled in the app settings (Settings > Data collection)

  1. Please provide a general description of how you will analyze this data.
  • Glean / Redash
  1. Where do you intend to share the results of your analysis?
  • Only on Glean, Redash and with mobile / GeckoView teams.
  1. Is there a third-party tool (i.e. not Telemetry) that you are proposing to use for this data collection? If so:
  • No

@pocmo pocmo added the needs:data-review PR is awaiting a data review label Feb 10, 2021
@csadilek
Copy link
Contributor

Yeah, I saw TelemetryMiddleware when I wrote this and thought about it, but it felt like I'd be mixing two independent things and turning TelemetryMiddleware into something that's hard to digest. wdyt?

@pocmo Makes sense to keep separate, but we should probably nest them into a TelemeteryMiddleware.create, similar to EngineMiddleware.create, otherwise it will be confusing that we have two?

@pocmo
Copy link
Contributor Author

pocmo commented Feb 17, 2021

Updated the data collection form.

And updated the PR. Looking at it again your proposal to integrate the changes into TelemetryMiddleware seemed fine and that's what it did now.

@pocmo pocmo marked this pull request as ready for review February 17, 2021 10:56
@pocmo pocmo requested review from a team as code owners February 17, 2021 10:56
Copy link
Contributor

@csadilek csadilek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@boek
Copy link
Contributor

boek commented Feb 19, 2021

Data Review Form (to be filled by Data Stewards)

  1. Is there or will there be documentation that describes the schema for the ultimate data set in a public, complete, and accurate way?
    Yes, see metrics.yaml and metrics.md

  2. Is there a control mechanism that allows the user to turn the data collection on and off?
    Yes, data collection settings

  3. If the request is for permanent data collection, is there someone who will monitor the data over time?
    Has expiry

  4. Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?
    Type 1

  5. Is the data collection request for default-on or default-off?
    Default on

  6. Does the instrumentation include the addition of any new identifiers (whether anonymous or otherwise; e.g., username, random IDs, etc. See the appendix for more details)?
    No

  7. Is the data collection covered by the existing Firefox privacy notice? If unsure: escalate to legal if:
    Yes

  8. Does there need to be a check-in in the future to determine whether to renew the data?
    Fenix team will check-in at expiry

  9. Does the data collection use a third-party collection tool?
    No

@pocmo pocmo merged commit a2566f9 into mozilla-mobile:master Feb 19, 2021
@pocmo pocmo deleted the engine-telemetry branch February 19, 2021 10:32
@dblohm7
Copy link
Contributor

dblohm7 commented Feb 22, 2021

Thank you @pocmo! 🎉

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
needs:data-review PR is awaiting a data review needs:review PRs that need to be reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants