Close #26041: Re-set TrackingProtectionPolicy after Nimbus SDK is initialized #26228
I think this is a good minimal fix for TCP. I might also recommend we flip the default for TCP in the nimbus.fml.yaml to true. I've asked for bugs to be filed, if they haven't been already, for the tabs prioritization.
// value again so that we can still access the NimbusApi that is wrapped
// in `FxNimbus.initialize.getSdk`.
// See: https://github.com/mozilla-mobile/fenix/issues/26041
FxNimbus.api = this
This is fixed in mozilla/application-services#4999, first released in v93.6.0.
It should already be in main and releases_v104.0.0.
Added mozilla-mobile/android-components#12567 to land on main; this should be uplifted to releases_v104.0.0.
> The FeatureHolder within FxNimbus has a lazy block around it which also ends up caching the first result it evaluates.

This isn't quite accurate: prior to v93.6.0, the feature holder didn't get to hear about a call to connect the Nimbus SDK (the Rust code that connects to the server and calculates enrollment) if it had been used already.
So if the feature was used before the very earliest steps to initialize the SDK had started, we return the default values: when the SDK isn't available, the feature just returns the defaults every time it's called.
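The behavior described above can be illustrated with a minimal Kotlin sketch. Every name here (`SdkSketch`, `CachingHolderSketch`, `LiveHolderSketch`) is hypothetical and is not the real FML or Nimbus API; the `lazy` capture is only one assumed way a holder could miss a later connection.

```kotlin
// Hypothetical stand-in for the connected SDK; not the real Nimbus API.
interface SdkSketch {
    fun isFeatureEnabled(featureId: String): Boolean
}

// Assumed pre-fix shape: the SDK lookup is resolved once, on first use.
// If the first read happens before the SDK is connected, null is captured
// and every later read falls back to the default.
class CachingHolderSketch(getSdk: () -> SdkSketch?, private val default: Boolean) {
    private val sdk: SdkSketch? by lazy { getSdk() }
    fun value(featureId: String): Boolean = sdk?.isFeatureEnabled(featureId) ?: default
}

// Assumed post-fix shape: the lookup is repeated on every read, so a
// connection made after the first read is still picked up.
class LiveHolderSketch(private val getSdk: () -> SdkSketch?, private val default: Boolean) {
    fun value(featureId: String): Boolean = getSdk()?.isFeatureEnabled(featureId) ?: default
}
```

With both holders reading from the same nullable SDK reference, a read before connection returns the default in both cases; after connection only the live holder reflects the SDK's answer.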
> This is fixed in mozilla/application-services#4999, first released in v93.6.0.
> It should already be in main and releases_v104.0.0.
> Added mozilla-mobile/android-components#12567 to land on main; this should be uplifted to releases_v104.0.0.
@jhugman Upgrading to the latest version (that is in mozilla-mobile/android-components#12592), I'm still not able to get the experiment to work without adding this line.
At this point, I'm more inclined to leave this line in here and file a bug to have it fixed.
@jonalmeida @jhugman I don't think we can land the `FxNimbus.api = this` assignment because it's not thread-safe. It's a public var without any visibility guarantees, i.e., it's not volatile or synchronized, and it's also executed on an IO thread here.
Maybe that's also part of the problem? I know @jhugman made FeatureHolder thread-safe, but this assignment isn't (to be fair, I'd think it's also not meant to be called like this).
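To make the visibility concern concrete, here is a hedged Kotlin sketch of two common ways to publish a reference safely across threads. The names (`ApiSketch`, `VolatileHolderSketch`, `AtomicHolderSketch`) are hypothetical and do not reflect how `FxNimbus` is actually implemented.

```kotlin
import java.util.concurrent.atomic.AtomicReference

// Hypothetical stand-in for the API object being published.
interface ApiSketch

// Option 1: @Volatile guarantees that a write on one thread is visible
// to subsequent reads on other threads.
object VolatileHolderSketch {
    @Volatile
    var api: ApiSketch? = null
}

// Option 2: AtomicReference additionally allows a set-once policy, so a
// second connection attempt cannot silently replace the first.
object AtomicHolderSketch {
    private val ref = AtomicReference<ApiSketch?>(null)

    // Returns true only for the first successful connection.
    fun connect(api: ApiSketch): Boolean = ref.compareAndSet(null, api)

    fun current(): ApiSketch? = ref.get()
}
```

A plain `var` offers neither guarantee, which is the gap pointed out in the comment above.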
> to be fair I'd think it's also not meant to be called like this

This is correct. This is a workaround, using the old API, which we've deprecated, and are trying to remove.
As discussed, the fix is to upgrade the FML tooling to at least v93.6.0 (currently on v93.5.0).
That said, there is an issue with upgrading to latest (currently v93.8.0) due to mozilla/application-services#5079.
@jhugman Above @jonalmeida mentioned that upgrading to the latest version didn't fix the problem. We should take another look together, please.
…bus SDK is initialized

There are three issues here that we have uncovered while investigating this bug:

1. Settings.kt has a lazy block around `enabledTotalCookieProtection` which ends up caching the first result it evaluates.
2. The `FeatureHolder` within FxNimbus caches the incorrectly evaluated value and returns this value henceforth.
3. Nimbus is not ready to return a result for an engine experiment when we need it early on in the dependency tree initialization.

There are multiple systems that require engine to be initialized for them to work (e.g. Glean, Profiler, concept-fetch). In our TCP experiment, we need to apply these engine settings during the engine initialization. So when we try to evaluate Nimbus that early on, it has not had time to initialize itself correctly or even use the engine's concept-fetch client to return the correct experiment result.

This bug is made worse by the first two caching bugs, where we are always holding onto a cached value of the wrong result.

Our temporary solution is to:

1. Remove the `lazy` around `Settings.enabledTotalCookieProtection`.
2. Set the `FxNimbus.api` value right after we are done initializing `FxNimbus` and `NimbusApi` so that all future queries to FxNimbus will be made against a real instance of `NimbusApi`. This is a short-term fix for the `FeatureHolder` caching bug.
3. Set a new TrackingProtectionPolicy that will evaluate Nimbus, now that it is in the correct state, when we receive the `NimbusInterface.Observer.onUpdatesApplied` callback.

Co-authored-by: jhugman
Co-authored-by: Christian Sadilek
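As an illustration of steps 1 and 3 of the temporary solution, here is a minimal Kotlin sketch. All classes (`ExperimentSourceSketch`, `LazySettingsSketch`, `LiveSettingsSketch`, `EngineSketch`, `PolicyUpdaterSketch`) are hypothetical stand-ins, not the real Fenix `Settings`, engine, or Nimbus observer types; only the property name `enabledTotalCookieProtection` and the callback name `onUpdatesApplied` come from the description above.

```kotlin
// Hypothetical experiment source that only gives the real answer once ready.
class ExperimentSourceSketch {
    var ready = false
    fun isTcpEnabled(): Boolean = ready // default (false) until ready
}

// Before: `lazy` evaluates once and keeps the first (possibly wrong) result.
class LazySettingsSketch(source: ExperimentSourceSketch) {
    val enabledTotalCookieProtection: Boolean by lazy { source.isTcpEnabled() }
}

// After (step 1): a plain getter re-evaluates on every read.
class LiveSettingsSketch(private val source: ExperimentSourceSketch) {
    val enabledTotalCookieProtection: Boolean
        get() = source.isTcpEnabled()
}

// Hypothetical engine holding the currently applied policy flag.
class EngineSketch {
    var tcpPolicyEnabled = false
}

// Step 3: re-apply the policy when the experiment source reports updates.
class PolicyUpdaterSketch(
    private val settings: LiveSettingsSketch,
    private val engine: EngineSketch,
) {
    fun onUpdatesApplied() {
        engine.tcpPolicyEnabled = settings.enabledTotalCookieProtection
    }
}
```

If the setting is read once before the source is ready, the lazy variant keeps returning `false` afterwards, while the getter variant lets the re-applied policy pick up the corrected value.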
Added further comments in the review.
@jonalmeida asks: why does setting
However, if they’re instantiated before the The initial value of So changing the value of For more context: That it still works is a lucky accident for us, but definitely not something we want anyone to rely on.
@jhugman But this workaround here calls
We've landed the 93.6.0 Nimbus plugin change, so we can safely land this change as well.
QA verified the fix we have in Nightly, so let's uplift this to the 104.0 release; the AC uplift has already landed (albeit it needs a new AC release for it to be used).
@Mergifyio backport releases_v104.0.0
@Mergifyio backport releases_v104.0.0
✅ Backports have been created
There are three issues here that we have uncovered while investigating this bug:

1. Settings.kt has a lazy block around `enabledTotalCookieProtection` which ends up caching the first result it evaluates.
2. The `FeatureHolder` within FxNimbus caches the incorrectly evaluated value and returns this value henceforth.
3. Nimbus is not ready to return a result for an engine experiment when we need it early on in the dependency tree initialization.

There are multiple systems that require engine to be initialized for them to work (e.g. Glean, Profiler, concept-fetch). In our TCP experiment, we need to apply these engine settings during the engine initialization. So when we try to evaluate Nimbus that early on, it has not had time to initialize itself correctly or even use the engine's concept-fetch client to return the correct experiment result.

This bug is made worse by the first two caching bugs, where we are always holding onto a cached value of the wrong result.

Our temporary solution is to:

1. Remove the `lazy` around `Settings.enabledTotalCookieProtection`.
2. Set the `FxNimbus.api` value right after we are done initializing `FxNimbus` and `NimbusApi` so that all future queries to FxNimbus will be made against a real instance of `NimbusApi`. This is a short-term fix for the `FeatureHolder` caching bug.
3. Set a new TrackingProtectionPolicy that will evaluate Nimbus, now that it is in the correct state, when we receive the `NimbusInterface.Observer.onUpdatesApplied` callback.

Co-authored-by: jhugman
Co-authored-by: Christian Sadilek
Pull Request checklist
QA
To download an APK when reviewing a PR (after all CI tasks finished running):

1. Click on `Checks` at the top of the PR page.
2. Click on the `firefoxci-taskcluster` group on the left to expand all tasks.
3. Click on the `build-debug` task.
4. Click on `View task in Taskcluster` in the new `DETAILS` section.

GitHub Automation
Fixes #26041