In the standard Brazil band plot that ROOT's HypoTestInverter produces, there is a bit of an imprecision: the expected limits are calculated in $\hat{\mu}/\sigma$ space, but for significant upward fluctuations it returns p-values that are not actually realizable with any data. If the expected $\hat{\mu} > \mu$, the test statistic can at most be the one returned for $\mu = \hat{\mu}$, namely $q_\mu = 0$, which in turn means that the p-value is capped at $\mathrm{CL}_{s,\mathrm{obs}}$.
Specifically, just trying to understand a bit more -- as to why this needs to get capped. I'll just dump some quoted blocks of text from offline messages
hm, thinking out loud here
q_obs is the test stat that we observe for the observed data (looking for equations)
so q_mu, obs is q(mu | mu-hat) = f(mu | obs data) as mu-hat is a random variable of the data
so if CLs is a function of q_mu which is a function of mu and the obs data, and we've shown that for a given set of observations (and so a given mu-hat) q_mu is constant for certain values of mu
so I think that what happens is that CL_s+b becomes a constant when mu < mu-hat, but CL_b doesn't, and so there is a max value that everything is capped at (but this last part I'm not 100% on and will need to pencil and paper it to show myself) so I guess I don't know either
oh wait I think I have it for CL_s+b, obs
CL_s+b, obs = 1 - F(q_obs(mu) | mu') but HERE what we were calling mu is mu-hat and mu' is mu
for signal strength 𝜇 and model hypothesis signal strength 𝜇′
So
CL_s+b, obs = 1 - F(q_obs(mu-hat) | mu)
and for mu < mu-hat q_obs(mu-hat) is a constant of 0. So then
CL_s+b, obs = 1 - F(constant | mu) = g(mu). https://pyhf.readthedocs.io/en/clipped_expected/_generated/pyhf.infer.hypotest.html
but CL_b is
CL_b, obs = 1 - F (q_obs (mu-hat) | 0) and so that whole thing is a constant.
so then CL_s_obs = CL_s+b, obs / CL_b,obs = g(mu) / constant.
you knew all this already
but the important thing is that as q_obs(mu-hat) is a constant for mu < mu-hat this means that this CLs, obs value we've found is the maximum possible CL_s value you can calculate for a given mu
so it doesn't matter if your variations of the signal strength for the expected give you an expected mu-hat > mu, as the observed test stat has already ruled that out. So you can use this information from the data to cap it
Oh. looking at the bottom plots on https://scikit-hep.org/pyhf/examples/notebooks/toys.html
As higher values of the test stat represent greater incompatibility between the model (the test hypothesis value of mu) and the observed data, then that means that for q_mu = 0 you've reached the minimum test stat (-2 ln lambda can't be less than 0) -- and so the minimum level of incompatibility. By definition this means that you've also reached the largest possible p-value the data can give you.
So regardless of what the expected values are, this is the highest p-value and so the highest CL_s value that you're going to be able to calculate with your data set for that given value of mu, so you can cap it there.
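To make the capping argument concrete, here is a minimal sketch in plain Python. It is not pyhf's or ROOT's implementation. It assumes the asymptotic formulas for the (non-tilde) q_mu test statistic, where CL_s+b = 1 - Phi(sqrt(q_mu)) and CL_b = 1 - Phi(sqrt(q_mu) - sqrt(q_mu,A)), with sqrt(q_mu,A) roughly mu/sigma on the background-only Asimov data. The function names and the `nsigma` convention (mu-hat = nsigma * sigma under background-only) are hypothetical and chosen just for this illustration.

```python
import math


def norm_cdf(x):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cls_from_sqrtq(sqrt_q, sqrt_q_asimov):
    """Asymptotic CLs for q_mu (assumed formulas, see the text above).

    sqrt_q        : square root of the test statistic value
    sqrt_q_asimov : square root of q_mu on background-only Asimov data (~ mu / sigma)
    """
    cl_sb = 1.0 - norm_cdf(sqrt_q)
    cl_b = 1.0 - norm_cdf(sqrt_q - sqrt_q_asimov)
    return cl_sb / cl_b


def max_cls(sqrt_q_asimov):
    """Largest CLs any data set can give at this mu: the value at q_mu = 0."""
    return cls_from_sqrtq(0.0, sqrt_q_asimov)


def expected_cls(nsigma, sqrt_q_asimov, clip=True):
    """Expected CLs when mu-hat fluctuates to nsigma * sigma (hypothetical convention).

    In mu-hat / sigma space, sqrt(q_mu) = (mu - mu-hat) / sigma = sqrt_q_asimov - nsigma.
    For mu-hat > mu this goes negative, which no data can realize because q_mu = 0 there.
    """
    sqrt_q = sqrt_q_asimov - nsigma
    if clip:
        # q_mu is 0 whenever mu-hat > mu, so sqrt(q_mu) cannot drop below 0
        sqrt_q = max(sqrt_q, 0.0)
    return cls_from_sqrtq(sqrt_q, sqrt_q_asimov)


if __name__ == "__main__":
    a = 1.0  # assumed mu / sigma for the tested signal strength
    print("cap:", max_cls(a))
    for n in (-2, -1, 0, 1, 2):
        print(n, expected_cls(n, a, clip=False), expected_cls(n, a, clip=True))
```

With `a = 1.0`, the +2 sigma fluctuation has mu-hat > mu. The unclipped value then sits above the cap `max_cls(a)`, and the clipped value lands exactly on it. Fluctuations with mu-hat <= mu are unchanged by the clipping.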
Related issue: #993