I think we should be testing dapp-initiated transaction confirmations.
This includes ETH sends & contract interactions (both are "transactions")
Open questions
Should we test on mainnet only?
Should we include signing data? It would introduce a confounding variable since it has a different conversion rate. I would be confident in applying the winner of a tx confirmation A/B test to signing data as well.
For conversion rate
The "Confirm: Started" and "Transaction Completed" events can fire for the extension in multiple views. We should make sure we're looking at the confirmation rate for transactions through the standalone window, called the notification inside of Matomo.
413 transactions seems unreasonably small for the month. Let's widen the scope to all of 2019 and make sure we're on the right data set.
Digging into this custom report ("Events by environment type"), setting the timeframe to 2019, and drilling into the "notification" category:
Confirm: Started | 4,177,867
Transaction Completed | 3,421,816
Conversion rate: 81.9%
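A quick check of the arithmetic behind that rate, as a minimal sketch. The two counts are copied from the report above; the variable names are only illustrative. Note that the "not completed" share mixes rejected and ignored requests, since the report does not split them.

```python
# Event counts copied from the Matomo report above ("notification" category, 2019).
confirm_started = 4_177_867
transaction_completed = 3_421_816

conversion_rate = transaction_completed / confirm_started
# Everything that was started but not completed: rejected OR ignored.
not_completed_rate = 1 - conversion_rate

print(f"Conversion rate: {conversion_rate:.1%}")    # Conversion rate: 81.9%
print(f"Not completed:   {not_completed_rate:.1%}")  # Not completed:   18.1%
```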
For sample size calculation
Minimum detectable effect could be much smaller. If the full screen transaction confirmations are even 5% more effective than the standalone window, that's a huge efficiency gain I'd advocate for shipping! 20% MDE would mean our test doesn't detect anything smaller than 20%.
We should have no trouble with reaching sample size given how many users confirm tx! Updating to 81% conversion and 5% MDE gives me a 1k sample size, which we could reach in a couple days.
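For anyone who wants to sanity-check the calculator numbers offline, here is a rough sketch using the classical fixed-horizon formula for comparing two proportions. This is not Optimizely's method (their calculator uses its own stats engine), so the results will not match it exactly; they should only land in the same ballpark. Assumptions: the MDE is relative to the baseline, the test is two-sided at 95% significance, and power is 80%. The function name `sample_size_per_variation` is made up for this example.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variation(baseline, relative_mde, alpha=0.05, power=0.80):
    """Users needed per variation for a two-sided two-proportion z-test."""
    p1 = baseline
    p2 = baseline * (1 + relative_mde)  # assumption: MDE is relative to the baseline
    if relative_mde == 0 or not 0 < p1 < 1 or not 0 < p2 < 1:
        raise ValueError("need a non-zero MDE and both rates strictly between 0 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for power = 0.80
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


# The two scenarios discussed in this thread.
for baseline, mde in [(0.40, 0.20), (0.81, 0.05)]:
    n = sample_size_per_variation(baseline, mde)
    print(f"baseline={baseline:.0%}, MDE={mde:.0%}: about {n} users per variation")
```

The useful takeaway is the shape of the trade-off: the required sample grows roughly with the inverse square of the MDE, so a smaller MDE costs many more users per variation.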
@omnat commented on Mon Jul 08 2019
A/B test which action is better: Pop-up (current) or Full-screen (proposed change)
@omnat commented on Mon Jul 08 2019
Pop-up
Pros: Keeps user in context of dapps
Cons: A pain point for users and dapp builders: pop-ups get lost behind windows, are easy to ignore, and feel like an odd part of the experience (they break the flow, and unprompted pop-ups from an extension aren't a common pattern for web users)
Full-screen
Pros: Could direct the user to the actions that require attention. Might feel more natural, as opening a new tab from a site is a common web pattern. More real estate as the permissions system grows and confirmation screens get bundled
Cons: Could backfire and impact the user's dapp experience flow
@omnat commented on Thu Jul 18 2019
A/B metrics (notes from Bobby chat):
Goal: reduce the # of pop-ups that are being ignored (pop-ups shown minus pop-ups confirmed/rejected)
Things to find to set up the A/B test: sample size, number of flow visits
Implementation detail: each user should consistently see either Fullscreen OR pop-up
@omnat commented on Tue Jul 23 2019
A/B test
Problem: Right now it seems lots of pop-ups are being "lost" or not visible to the users. This is a problem both seen in metrics and learnt about as a frustration from users and dapp builders.
A variation: Pop-up (control condition)
B variation: Fullscreen (test condition)
Success metric: Our hypothesis is that opening tx requests full-screen in a new tab will solve our problem. We aim to reduce "ignored" transaction requests, i.e., increase the number of times a request is responded to by the user (either with confirm or cancel).
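To make the success metric concrete, here is a minimal sketch of how the response rate could be computed per variation and compared with a standard pooled two-proportion z-test. The function names and the counts in the usage example are hypothetical and only for illustration; they are not real data from this project.

```python
from math import sqrt
from statistics import NormalDist


def response_rate(shown, confirmed, rejected):
    """Share of shown requests that the user answered (confirm or cancel)."""
    return (confirmed + rejected) / shown


def two_proportion_p_value(responded_a, shown_a, responded_b, shown_b):
    """Two-sided p-value for H0: both variations have the same response rate."""
    rate_a = responded_a / shown_a
    rate_b = responded_b / shown_b
    pooled = (responded_a + responded_b) / (shown_a + shown_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / shown_a + 1 / shown_b))
    z = (rate_b - rate_a) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical counts, for illustration only (not real data).
popup = {"shown": 1000, "confirmed": 780, "rejected": 40}
fullscreen = {"shown": 1000, "confirmed": 820, "rejected": 50}

p_value = two_proportion_p_value(
    popup["confirmed"] + popup["rejected"], popup["shown"],
    fullscreen["confirmed"] + fullscreen["rejected"], fullscreen["shown"],
)
print(f"pop-up response rate:     {response_rate(**popup):.1%}")
print(f"fullscreen response rate: {response_rate(**fullscreen):.1%}")
print(f"significant at p < 0.05:  {p_value < 0.05}")
```

The "ignored" rate is simply one minus the response rate, so reducing ignored requests and increasing responses are the same goal.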
Sample size calculation
Trying to calculate the sample size for this A/B test, that is, how many users need to go through each variation to determine which one performs better.
Per Optimizely's sample size calculator, here's the result. Below I share how I entered the Baseline Conversion Rate and Minimal Detectable Effect.
For Baseline Conversion Rate, I looked at our past month's data for Confirm and Sign Request tx pop-up conversions and took the Confirm screen number, since that screen is triggered more often overall: 40%.
For Minimal Detectable Effect, I am taking 20%, just to see if we can improve the conversion rate with fullscreen mode by ~50%; that would be a strong enough reason to switch to Fullscreen mode. We can change this if we have good reason to.
Questions
@bdresser thoughts on above baseline calculation? I couldn't find the numbers for 'Contract Interaction' pop-ups. Since those were majority events, that'd probably give the best baseline indication? Any other ideas on making it more robust for reliable results? 95% statistical significance is good enough I think (p<0.05). I am taking MDE quite randomly - should we aim for higher / lower?
@danjm Implementation detail for this test: Will we be able to treat each user consistently? That is, user1 always sees full-screen prompts, while user2 always sees pop-ups (like now). And will we be able to roll this out to only a small number of users (e.g., 330 people see the fullscreen variation?)
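One common way to get both properties asked about here (each user always sees the same variation, and only a small share of users gets fullscreen) is to hash a stable identifier into a bucket. Below is a language-agnostic sketch written in Python; it is not how the extension implements anything. `assign_variation`, the experiment name, and the 5% default are all hypothetical, and it assumes some stable anonymous per-install ID is available.

```python
import hashlib


def assign_variation(user_id: str,
                     experiment: str = "fullscreen-confirmation",
                     rollout_fraction: float = 0.05) -> str:
    """Return "fullscreen" for a stable subset of users and "popup" otherwise."""
    # Salting with the experiment name keeps buckets independent across experiments.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "fullscreen" if bucket < rollout_fraction else "popup"


# The same ID always maps to the same variation, with no stored state needed.
print(assign_variation("install-1234"))
print(assign_variation("install-1234"))
```

Because the assignment is a pure function of the ID, nothing needs to be persisted, and raising `rollout_fraction` later only adds users to the fullscreen group without reshuffling the ones already in it.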
@bdresser commented on Wed Jul 24 2019
Good stuff @omnat!
For the actual structure of the A/B test:
For conversion rate
For sample size calculation
@omnat commented on Tue Aug 06 2019
Per Kyokan sync:
Implementation of this is not a lot of work, but it will probably not make it into this Kyokan sprint.
@bdresser did it make it into this sprint? Shall we pick this up when a Kyokan sprint has bandwidth? I'll drop this from this design sprint.
@bdresser commented on Wed Aug 07 2019
Not in Kyo sync, hopefully next one.
Just to confirm, we're testing
New tab behavior
@cjeria @omnat @danfinlay sound good?
@omnat commented on Wed Aug 07 2019
Yes all the above sounds good. Thanks Bobby!