This repository has been archived by the owner on Jan 12, 2023. It is now read-only.

Add data review for Legacy Ids #5512

Merged: 2 commits, Jan 28, 2022


jonalmeida
Contributor

@jonalmeida jonalmeida commented Sep 29, 2021

As requested, I'm separating out the data review for the Legacy ID here from the activation ID to avoid confusion.


Request for data collection review form

All questions are mandatory. You must receive review from a data steward peer on your responses to these questions before shipping new data collection.

  1. What questions will you answer with this data?

    • We send our legacy telemetry ID as part of the deletion request for users who wish to have their reported Telemetry data deleted.
  2. Why does Mozilla need to answer these questions? Are there benefits for users? Do we need this information to address product or business requirements?

    • This is the supported mechanism: the deletion-request ping needs this ID so that users can opt out of data collection.
  3. What alternative methods did you consider to answer these questions? Why were they not sufficient?

    • N/A
  4. Can current instrumentation answer these questions?

    • Currently, we hold telemetry from a deprecated, outdated system, which we can give our users the opportunity to opt out of.
  5. List all proposed measurements and indicate the category of data collection for each measurement, using the Firefox data collection categories found on the Mozilla wiki.

| Measurement Description | Data Collection Category | Tracking Bug # |
| --- | --- | --- |
| legacy_ids.client_id | Category 4 | #4901 |
  6. How long will this data be collected?

    • Always.
  7. What populations will you measure?

    • All release, beta, and nightly users with telemetry enabled.
  8. Please provide a general description of how you will analyze this data.

    • Glean will connect the legacy ID to collected metrics from prior systems to schedule deletion requests when users opt out of data collection.
  9. Where do you intend to share the results of your analysis?

    • Nowhere. Only Glean internal systems will have references to this ID.

@jonalmeida jonalmeida added the needs:data-review PR is awaiting a data review label Sep 29, 2021
@jonalmeida jonalmeida requested a review from travis79 September 29, 2021 20:29
@nshadowen314

Hi @jonalmeida, thanks for this request. Please answer every question on the data request form so that we can complete the review. "N/A" is not an adequate response to any question.

To move forward in the process, this request still needs responses to the following questions:

  1. What alternative methods did you consider to answer these questions? Why were they not sufficient?
  2. Please provide a link to the documentation for this data collection which describes the ultimate data set in a public, complete, and accurate way.
  3. If this data collection is default on, what is the opt-out mechanism for users?
  4. Is there a third-party tool (i.e. not Telemetry) that you are proposing to use for this data collection?

Please submit a new request with all questions completed. If you need support completing it, please let me or another data steward know and we can assist. Thanks.

@mcarare
Contributor

mcarare commented Jan 26, 2022

Request for data collection review form

All questions are mandatory. You must receive review from a data steward peer on your responses to these questions before shipping new data collection.

  1. What questions will you answer with this data?

We need this legacy telemetry ID in order to perform a deletion request for users who wish to have their reported Telemetry data deleted.

  2. Why does Mozilla need to answer these questions? Are there benefits for users? Do we need this information to address product or business requirements? Some example responses:

This is the only way a deletion request can be completed for users to opt out of data collection.

  3. What alternative methods did you consider to answer these questions? Why were they not sufficient?

There are no alternative ways to match the client that requests the data deletion with the stored data.

  4. Can current instrumentation answer these questions?

No.

  5. List all proposed measurements and indicate the category of data collection for each measurement, using the Firefox data collection categories found on the Mozilla wiki.

Note that the data steward reviewing your request will characterize your data collection based on the highest (and most sensitive) category.

| Measurement Description | Data Collection Category | Tracking Bug # |
| --- | --- | --- |
| legacy_ids.client_id | Category 4 “Highly sensitive or clearly identifiable personal data” | #4901 |

  6. Please provide a link to the documentation for this data collection which describes the ultimate data set in a public, complete, and accurate way.

This collection is documented in the Glean Dictionary at https://dictionary.telemetry.mozilla.org/apps/focus_android/metrics/legacy_ids_client_id
More info on the usage of this ID is documented at https://dictionary.telemetry.mozilla.org/apps/focus_android/pings/deletion-request

  7. How long will this data be collected? Choose one of the following:

Permanently.

This data is needed permanently in order to perform deletion requests.

  8. What populations will you measure?

All release, beta, and nightly users with telemetry enabled.

  9. If this data collection is default on, what is the opt-out mechanism for users?

Users can opt out of data collection by disabling Usage and technical data from Settings -> Privacy and security -> Data choices.

  10. Please provide a general description of how you will analyze this data.

Glean will connect the legacy ID to collected metrics from prior systems to schedule deletion requests when users opt out of data collection.

  11. Where do you intend to share the results of your analysis?

This data is not shared; it is only used by Glean internal systems that will have references to this ID.

  12. Is there a third-party tool (i.e. not Telemetry) that you are proposing to use for this data collection? If so:

There is no third-party tool proposed.
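
To illustrate the matching step described in the analysis answer above, here is a minimal sketch of how a legacy client ID carried by a deletion request could be used to select stored records for deletion. This is not Glean's actual pipeline: `LegacyRecord`, `schedule_deletions`, and `apply_deletions` are hypothetical names, and the in-memory list stands in for whatever storage the prior telemetry system uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LegacyRecord:
    """A stored record from the prior telemetry system (hypothetical shape)."""
    client_id: str
    payload: str


def schedule_deletions(legacy_client_id, stored_records):
    """Select every stored record whose client ID matches the legacy ID
    carried by the deletion request."""
    return [r for r in stored_records if r.client_id == legacy_client_id]


def apply_deletions(stored_records, scheduled):
    """Return the store contents with the scheduled records removed."""
    doomed = set(scheduled)
    return [r for r in stored_records if r not in doomed]
```

Without the legacy ID there is nothing to match on, which is the point made in the answer to question 3.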

@mcarare
Contributor

mcarare commented Jan 26, 2022

@nshadowen314 I am submitting the above data request for analysis and approval. Thank you!

@travis79
Member

Request for data collection review form

All questions are mandatory. You must receive review from a data steward peer on your responses to these questions before shipping new data collection.

1. What questions will you answer with this data?

We need this legacy telemetry ID in order to perform a deletion request for users who wish to have their reported Telemetry data deleted.

2. Why does Mozilla need to answer these questions?  Are there benefits for users? Do we need this information to address product or business requirements? Some example responses:

This is the only way a deletion request can be completed for users to opt out of data collection.

3. What alternative methods did you consider to answer these questions? Why were they not sufficient?

There are no alternative ways to match the client that requests the data deletion with the stored data.

4. Can current instrumentation answer these questions?

No.

5. List all proposed measurements and indicate the category of data collection for each measurement, using the [Firefox data collection categories](https://wiki.mozilla.org/Data_Collection) found on the Mozilla wiki.

Note that the data steward reviewing your request will characterize your data collection based on the highest (and most sensitive) category.

| Measurement Description | Data Collection Category | Tracking Bug # |
| --- | --- | --- |
| legacy_ids.client_id | Category 4 “Highly sensitive or clearly identifiable personal data” | #4901 |

6. Please provide a link to the documentation for this data collection which describes the ultimate data set in a public, complete, and accurate way.

This collection is documented in the Glean Dictionary at https://dictionary.telemetry.mozilla.org/apps/focus_android/metrics/legacy_ids_client_id
More info on the usage of this ID is documented at https://dictionary.telemetry.mozilla.org/apps/focus_android/pings/deletion-request

7. How long will this data be collected?  Choose one of the following:

Permanently.

This data is needed permanently in order to perform deletion requests.

8. What populations will you measure?

All release, beta, and nightly users with telemetry enabled.

9. If this data collection is default on, what is the opt-out mechanism for users?

Users can opt out of data collection by disabling Usage and technical data from Settings -> Privacy and security -> Data choices.

10. Please provide a general description of how you will analyze this data.

Glean will connect the legacy ID to collected metrics from prior systems to schedule deletion requests when users opt out of data collection.

11. Where do you intend to share the results of your analysis?

This data is not shared; it is only used by Glean internal systems that will have references to this ID.

12. Is there a third-party tool (i.e. not Telemetry) that you are proposing to use for this data collection? If so:

There is no third-party tool proposed.

Data Review

  1. Is there or will there be documentation that describes the schema for the ultimate data set in a public, complete, and accurate way?

Yes, through the metrics.yaml file and the Glean Dictionary.

  2. Is there a control mechanism that allows the user to turn the data collection on and off?

Yes, through the "Send Usage Data" setting in the application menu. This metric is a special case since the only purpose of sending the legacy-id here is to delete the data associated with it.

  3. If the request is for permanent data collection, is there someone who will monitor the data over time?

Permanent collection to be monitored by Jon Almeida.

  4. Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Category 4

  5. Is the data collection request for default-on or default-off?

Default-on, but this is a special case since once we collect the identifiers we will use them to delete any stored data associated with them.

  6. Does the instrumentation include the addition of any new identifiers (whether anonymous or otherwise; e.g., username, random IDs, etc. See the appendix for more details)?

No new identifiers added, only existing telemetry identifiers.

  7. Is the data collection covered by the existing Firefox privacy notice? If unsure: escalate to legal if:

Yes

  8. Does the data collection use a third-party collection tool?

No

Result

data-review+

Member

@travis79 travis79 left a comment

data-review+

@mcarare mcarare added 🛬 needs landing PRs that are ready to land and removed needs:data-review PR is awaiting a data review labels Jan 28, 2022
@mergify mergify bot merged commit fe45443 into mozilla-mobile:main Jan 28, 2022
@jonalmeida jonalmeida deleted the legacy-id branch January 28, 2022 18:57