Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Connect] Emit analytic events #9873

Open
wants to merge 8 commits into
base: master
Choose a base branch
from
Open

Conversation

simond-stripe
Copy link
Contributor

@simond-stripe simond-stripe commented Jan 8, 2025

Summary

This emits analytic events for the Connect SDK, as described here: https://docs.google.com/document/d/1nzHGofoclzij4nP5n5qgSpYLia-93heStXvt7vOfrkQ/edit?usp=sharing

Verified through Splunk

Motivation

#9780 set up the infrastructure for analytics, this PR emits the analytics.

Testing

  • Added tests
  • Modified tests
  • Manually verified

Copy link
Contributor

github-actions bot commented Jan 8, 2025

Diffuse output:

OLD: identity-example-release-base.apk (signature: V1, V2)
NEW: identity-example-release-pr.apk (signature: V1, V2)

          │          compressed          │         uncompressed         
          ├───────────┬───────────┬──────┼───────────┬───────────┬──────
 APK      │ old       │ new       │ diff │ old       │ new       │ diff 
──────────┼───────────┼───────────┼──────┼───────────┼───────────┼──────
      dex │     2 MiB │     2 MiB │  0 B │   4.1 MiB │   4.1 MiB │  0 B 
     arsc │     1 MiB │     1 MiB │  0 B │     1 MiB │     1 MiB │  0 B 
 manifest │   2.3 KiB │   2.3 KiB │  0 B │     8 KiB │     8 KiB │  0 B 
      res │ 301.8 KiB │ 301.8 KiB │  0 B │ 455.5 KiB │ 455.5 KiB │  0 B 
   native │   6.2 MiB │   6.2 MiB │  0 B │  15.8 MiB │  15.8 MiB │  0 B 
    asset │   7.1 KiB │   7.1 KiB │  0 B │   6.9 KiB │   6.9 KiB │  0 B 
    other │  90.2 KiB │  90.2 KiB │ +7 B │ 170.3 KiB │ 170.3 KiB │  0 B 
──────────┼───────────┼───────────┼──────┼───────────┼───────────┼──────
    total │   9.6 MiB │   9.6 MiB │ +7 B │  21.5 MiB │  21.5 MiB │  0 B 

 DEX     │ old   │ new   │ diff      
─────────┼───────┼───────┼───────────
   files │     1 │     1 │ 0         
 strings │ 19971 │ 19971 │ 0 (+0 -0) 
   types │  6191 │  6191 │ 0 (+0 -0) 
 classes │  4982 │  4982 │ 0 (+0 -0) 
 methods │ 29772 │ 29772 │ 0 (+0 -0) 
  fields │ 17542 │ 17542 │ 0 (+0 -0) 

 ARSC    │ old  │ new  │ diff 
─────────┼──────┼──────┼──────
 configs │  164 │  164 │  0   
 entries │ 3622 │ 3622 │  0
APK
    compressed    │   uncompressed   │                                           
──────────┬───────┼───────────┬──────┤                                           
 size     │ diff  │ size      │ diff │ path                                      
──────────┼───────┼───────────┼──────┼───────────────────────────────────────────
 28.5 KiB │ +10 B │  62.9 KiB │  0 B │ ∆ META-INF/CERT.SF                        
    271 B │  -1 B │     120 B │  0 B │ ∆ META-INF/version-control-info.textproto 
  1.2 KiB │  -1 B │   1.2 KiB │  0 B │ ∆ META-INF/CERT.RSA                       
 25.3 KiB │  -1 B │  62.8 KiB │  0 B │ ∆ META-INF/MANIFEST.MF                    
──────────┼───────┼───────────┼──────┼───────────────────────────────────────────
 55.2 KiB │  +7 B │ 127.1 KiB │  0 B │ (total)

val code: Int,
val file: String,
val line: Int
val error: String,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👀 I talked with @mludowise-stripe and since this is a catch-all error for capturing unexpected issues, we agreed the schema doesn't need to match between iOS and Android. I picked a schema that makes sense for Android independent of what iOS has chosen.

Copy link

@mludowise-stripe mludowise-stripe Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. We need to ensure that we never log any PII or sensitive data. On iOS, we explicitly don't log the error's message since we can't control its contents (ex: json deserialization errors can contain the raw json which could have user-entered data like SSN or name). If we have full control over the error message on Android, then we can log it, but maybe we can add a comment that explicitly instructs adopters to ensure no PII or sensitive data is included.

    Similarly, since error can be an arbitrary string, maybe we should add a similar comment or change this to error_name or error_code.

  2. Is error meant to be enough information to uniquely identify the callsite the error is coming from?

    On iOS, we don't control the domain+code (it could be an iOS system error), so we use the file+line number to uniquely identify the call-site. On Android, if error is meant to be unique and something we can directly control, then should we make this an enum (e.g. AnalyticClientErrorCode or something)? Okay to leave it as a string, but maybe we add a comment stating that it should be uniquely identifiable so it can be used to trace the call site in code.

Copy link

@mludowise-stripe mludowise-stripe Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To add more context: we plan to setup per-platform alerts for these these generic client errors. So adding a uniquely identifiable error_code could also be useful if we ever wanted to configure those alerts to be more fine-grained, but not a requirement.

@simond-stripe simond-stripe marked this pull request as ready for review January 8, 2025 20:33
@simond-stripe simond-stripe requested a review from a team as a code owner January 8, 2025 20:33
Base automatically changed from simond/connect-analytics to master January 9, 2025 21:28
# Conflicts:
#	connect/src/main/java/com/stripe/android/connect/webview/StripeConnectWebViewContainerController.kt
@@ -31,7 +31,7 @@ internal sealed class ConnectAnalyticsEvent(
* Note: This should happen before component_loaded, so we won't yet have a page_view_id.
*/
data class WebPageLoaded(
val timeToLoad: Double
val timeToLoad: Long
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document the unit (ms, seconds)?

Copy link

@mludowise-stripe mludowise-stripe Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On iOS, this is a Double and the unit is in seconds. I think we want it to be consistent across platforms:
https://docs.google.com/document/d/1nzHGofoclzij4nP5n5qgSpYLia-93heStXvt7vOfrkQ/edit?tab=t.0#bookmark=id.1au4f34vo3wd

Was this changed to Long to be compatible with the Android Clock API? Maybe we could store it milliseconds using Long and then convert it inside mapOf to keep it easy to construct?

val timeToLoad: Double,
val perceivedTimeToLoad: Double
val timeToLoad: Long,
val perceivedTimeToLoad: Long
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. Even better, could you add @param docs for all the events? It's not obvious what many of the params mean, nor their shape.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

➕ I have comments documenting these in the Analytic Events section of our scoping doc. I think you could copy these over verbatim (that's what I did on iOS)

"file" to file,
"line" to line.toString()
"error" to error,
"errorMessage" to errorMessage,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Convention seems to be snake_case

Suggested change
"errorMessage" to errorMessage,
"error_message" to errorMessage,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

} catch (e: IllegalArgumentException) {
controller?.onErrorDeserializingWebMessage(
webMessage = message,
error = "Unable to deserialize message from $webFunctionName",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems redundant. The event name is already component.web.error.deserialize_message. We should replace error with function and value $webFunctionName, similar to event component.web.warn.unrecognized_setter_function.

) {
analyticsService.track(
ConnectAnalyticsEvent.WebErrorDeserializeMessage(
message = webMessage,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any security/privacy risks of sending this arbitrary data to analytics? We're sanitizing the URL for component.web.error.unexpected_navigation -- why isn't that necessary here?

Copy link

@mludowise-stripe mludowise-stripe Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we do need to be sensitive of PII – please see my comment in the analytic event scoping doc:
https://docs.google.com/document/d/1nzHGofoclzij4nP5n5qgSpYLia-93heStXvt7vOfrkQ/edit?tab=t.0#bookmark=id.t7fuyluvgru2

When I originally scoped this, I checked that the JS messages we currently send shouldn't contain PII, but I'm considering removing it for future-proofing in case we eventually use a JS message that does contain PII. So maybe we drop it? – I'm open to suggestions

logger.warning("($loggerTag) Received pop-up for allow-listed host: $url")
analyticsService.track(
ConnectAnalyticsEvent.ClientError(
error = "Unexpected pop-up",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO error_code would be better than a verbose error.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

➕ Related to my comment above, I think giving it a name like error_code means it's less likely we'd inadvertently include an error message that contained PII.

@@ -139,7 +199,7 @@ internal class StripeConnectWebViewContainerController<Listener : StripeEmbedded
embeddedComponentManager.appearanceFlow
.collectLatest { appearance ->
updateState { copy(appearance = appearance) }
if (stateFlow.value.receivedPageDidLoad) {
if (stateFlow.value.pageViewId != null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link

@mludowise-stripe mludowise-stripe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some feedback on some analytic inconsistencies and PII concerns, but I didn't look too closely at the android-y implementation bits.

But overall this looks great!

val code: Int,
val file: String,
val line: Int
val error: String,
Copy link

@mludowise-stripe mludowise-stripe Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. We need to ensure that we never log any PII or sensitive data. On iOS, we explicitly don't log the error's message since we can't control its contents (ex: json deserialization errors can contain the raw json which could have user-entered data like SSN or name). If we have full control over the error message on Android, then we can log it, but maybe we can add a comment that explicitly instructs adopters to ensure no PII or sensitive data is included.

    Similarly, since error can be an arbitrary string, maybe we should add a similar comment or change this to error_name or error_code.

  2. Is error meant to be enough information to uniquely identify the callsite the error is coming from?

    On iOS, we don't control the domain+code (it could be an iOS system error), so we use the file+line number to uniquely identify the call-site. On Android, if error is meant to be unique and something we can directly control, then should we make this an enum (e.g. AnalyticClientErrorCode or something)? Okay to leave it as a string, but maybe we add a comment stating that it should be uniquely identifiable so it can be used to trace the call site in code.

@@ -31,7 +31,7 @@ internal sealed class ConnectAnalyticsEvent(
* Note: This should happen before component_loaded, so we won't yet have a page_view_id.
*/
data class WebPageLoaded(
val timeToLoad: Double
val timeToLoad: Long
Copy link

@mludowise-stripe mludowise-stripe Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On iOS, this is a Double and the unit is in seconds. I think we want it to be consistent across platforms:
https://docs.google.com/document/d/1nzHGofoclzij4nP5n5qgSpYLia-93heStXvt7vOfrkQ/edit?tab=t.0#bookmark=id.1au4f34vo3wd

Was this changed to Long to be compatible with the Android Clock API? Maybe we could store it milliseconds using Long and then convert it inside mapOf to keep it easy to construct?

val timeToLoad: Double,
val perceivedTimeToLoad: Double
val timeToLoad: Long,
val perceivedTimeToLoad: Long

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

➕ I have comments documenting these in the Analytic Events section of our scoping doc. I think you could copy these over verbatim (that's what I did on iOS)

"file" to file,
"line" to line.toString()
"error" to error,
"errorMessage" to errorMessage,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

) {
analyticsService.track(
ConnectAnalyticsEvent.WebErrorDeserializeMessage(
message = webMessage,
Copy link

@mludowise-stripe mludowise-stripe Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we do need to be sensitive of PII – please see my comment in the analytic event scoping doc:
https://docs.google.com/document/d/1nzHGofoclzij4nP5n5qgSpYLia-93heStXvt7vOfrkQ/edit?tab=t.0#bookmark=id.t7fuyluvgru2

When I originally scoped this, I checked that the JS messages we currently send shouldn't contain PII, but I'm considering removing it for future-proofing in case we eventually use a JS message that does contain PII. So maybe we drop it? – I'm open to suggestions

logger.warning("($loggerTag) Received pop-up for allow-listed host: $url")
analyticsService.track(
ConnectAnalyticsEvent.ClientError(
error = "Unexpected pop-up",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

➕ Related to my comment above, I think giving it a name like error_code means it's less likely we'd inadvertently include an error message that contained PII.

analyticsService.track(
ConnectAnalyticsEvent.ClientError(
error = "Unexpected permissions request",
errorMessage = "Unexpected permissions '${request.resources.joinToString()}' requested"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible that request.resources may contain PII? If so, we shouldn't include it in the analytic.

analyticsService.track(
ConnectAnalyticsEvent.ClientError(
error = "Unexpected pop-up",
errorMessage = "Received pop-up for allow-listed host: $url"
Copy link

@mludowise-stripe mludowise-stripe Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should only log the URL's host in the message, rather than the entire URL. It's possible the URL could contain sensitive data (e.g. if we get a bug in FinancialConnections integration, it could be a bank auth redirect and contain an oauth token as a query param).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants