feat: Add kwarg for AsymptoticCalculator base distribution #993

lukasheinrich · 2020-07-27T15:31:03Z

Description

Resolves #1159

In the standard Brazil band plot that ROOT's HypoTestInverter produces, there is a slight imprecision: the expected limits are calculated in $\hat{\mu}/\sigma$ space, but for significant upward fluctuations it returns p-values that are not actually realizable with any data. If the expected $\hat{\mu} > \mu$, the test stat can at most be the one returned for $\mu=\hat{\mu}$, namely $q_\mu = 0$, which in turn means that the p-value is capped at $\mathrm{CLs}_\mathrm{obs}$.

For compatibility and recognizability the default behavior is unchanged, but it's now possible to get the correct asymptotic behavior with calc_kwargs = {'base_dir': 'clipped_normal'}
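As a rough illustration of what the clipping does (a minimal sketch under the Wald approximation, not the code in this PR; the function `expected_cls` and its argument names are made up for the example), one can write the shifted test statistic as $t = -\hat{\mu}/\sigma$, which is a unit normal centered at 0 under background-only and at $-\mu/\sigma$ under signal-plus-background. Since $\hat{\mu} > \mu$ gives $q_\mu = 0$, $t$ cannot go below $-\mu/\sigma$:

```python
from statistics import NormalDist

_norm_cdf = NormalDist().cdf


def expected_cls(mu_over_sigma, muhat_over_sigma, clipped=False):
    """Expected CLs for a given mu-hat fluctuation (hypothetical helper).

    mu_over_sigma: tested signal strength in units of sigma (assumed known).
    muhat_over_sigma: assumed best-fit mu-hat in units of sigma.
    clipped: if True, cap the fluctuation at mu-hat = mu, where q_mu = 0.
    """
    # Shifted test statistic in the Wald approximation: t = -mu_hat / sigma
    t = -muhat_over_sigma
    if clipped:
        # For mu_hat > mu the test statistic is q_mu = 0, so t >= -mu / sigma
        t = max(t, -mu_over_sigma)
    # Right-tail p-values of unit normals centered at -mu/sigma (s+b) and 0 (b)
    cl_sb = _norm_cdf(-(t + mu_over_sigma))
    cl_b = _norm_cdf(-t)
    return cl_sb / cl_b


# Compare the two conventions for mu/sigma = 1 over the usual band edges
for shift in (-2, -1, 0, 1, 2):
    print(
        shift,
        expected_cls(1.0, shift, clipped=False),
        expected_cls(1.0, shift, clipped=True),
    )
```

In this sketch the two conventions agree as long as $\hat{\mu} \le \mu$ and only differ for upward fluctuations beyond the tested $\mu$, where the clipped value stays at the $q_\mu = 0$ result.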

ReadTheDocs build: https://pyhf.readthedocs.io/en/clipped_expected/api.html#inference

Checklist Before Requesting Reviewer

Tests are passing
"WIP" removed from the title of the pull request
Selected an Assignee for the PR to be responsible for the log summary

Before Merging

For the PR Assignees:

Summarize commit messages into a comprehensive review of the PR

* Add calc_base_dist kwarg to AsymptoticCalculator to control cutoff for AsymptoticTestStatDistribution
* Add test for calc_base_dist behavior for 'clipped_normal'
* Add docstring example for pvalue
* Update docstring for AsymptoticCalculator and ToyCalculator

Co-authored-by: Matthew Feickert <[email protected]>

kratsg

The changes look fine. Additional docstrings are needed to describe the new APIs. Otherwise, most of my concern is on the description:

In the standard brazil band plot ROOT HypotestInverter produces, there is a bit of an imprecision as the expected limits are calculated in $\hat{\mu}/\sigma$ space but for significant upward fluctuations it returns p-values that are not actually realizable with any data since if expected $\hat{\mu} > \mu$ the test stat can at most be the one returned for $\mu=0$ that in turn means that the p-value is capped at $$

I don't follow this logic. Let's suppose, for example, scanning $\mu$ from [0, 10], then if $\hat{\mu} > 10$ (a significant upward fluctuation?), wouldn't the test stat likely be the one returned for $\mu=10$? If so, I understand that you want to apply a cap.. but what I don't understand is how, if we have bounds on $\mu$, that we're able to get unrealizable p-values.

codecov · 2020-07-29T21:57:16Z

Codecov Report

Merging #993 (2d4e6fd) into master (00af4ba) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master     #993   +/-   ##
=======================================
  Coverage   97.49%   97.49%           
=======================================
  Files          63       63           
  Lines        3749     3758    +9     
  Branches      535      537    +2     
=======================================
+ Hits         3655     3664    +9     
  Misses         55       55           
  Partials       39       39

| Flag | Coverage Δ |
|---|---|
| unittests | `97.49% <100.00%> (+<0.01%)` ⬆️ |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ |
|---|---|
| src/pyhf/infer/calculators.py | `100.00% <100.00%> (ø)` |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Last update 00af4ba...2d4e6fd.

matthewfeickert · 2021-02-09T18:32:24Z

@lukasheinrich I rebased this but have a pre-rebase version at backup-clipped_expected in case I borked anything too badly. 😬.

We should discuss all of this again, but as we've introduced kwargs since this PR started, I changed calc_kwargs to just be a calc_base_dist kwarg. With this rebase, the following code snippet reproduces the plots you made earlier.

```python
import pyhf
import pyhf.contrib.viz.brazil as brazil_band
import numpy as np
import matplotlib.pyplot as plt

# Single-bin model: 20 signal events on a background of 50 +/- 7
model = pyhf.simplemodels.hepdata_like(
    signal_data=[20.0], bkg_data=[50.0], bkg_uncerts=[7.0]
)
data = [70.0] + model.config.auxdata

# Signal strength values to scan
poi_vals = np.linspace(0, 2.5)

# Default behavior: unclipped normal base distribution
results = [
    pyhf.infer.hypotest(
        test_poi, data, model, return_expected_set=True, calc_base_dist="normal"
    )
    for test_poi in poi_vals
]

fig, ax = plt.subplots()
brazil_band.plot_results(ax, poi_vals, results)
fig.savefig("base_dist_normal.png")

# New behavior: clipped normal base distribution
results = [
    pyhf.infer.hypotest(
        test_poi, data, model, return_expected_set=True, calc_base_dist="clipped_normal"
    )
    for test_poi in poi_vals
]

fig, ax = plt.subplots()
brazil_band.plot_results(ax, poi_vals, results)
fig.savefig("base_dist_clipped_normal.png")
```

[Plots: base_dist_normal.png (calc_base_dist="normal") and base_dist_clipped_normal.png (calc_base_dist="clipped_normal")]

The tests also pass, apart from having to fix up the doctest, so I think this is moving in the right direction.

matthewfeickert · 2021-02-09T21:30:03Z

This PR also addresses some of Issue #1268, which, while not the goal, is nice. 👍

matthewfeickert · 2021-02-09T21:57:19Z

The changes look fine. Additional docstrings are needed to describe the new APIs.

In addition to the APIs being updated the docstring for AsymptoticCalculator

pyhf/src/pyhf/infer/calculators.py

Lines 166 to 182 in 9d56349

```python
        test_stat="qtilde",
    ):
        r"""
        Asymptotic Calculator.

        Args:
            data (:obj:`tensor`): The observed data.
            pdf (~pyhf.pdf.Model): The statistical model adhering to the schema ``model.json``.
            init_pars (:obj:`tensor`): The initial parameter values to be used for fitting.
            par_bounds (:obj:`tensor`): The parameter value bounds to be used for fitting.
            fixed_params (:obj:`tensor`): Whether to fix the parameter to the init_pars value during minimization
            test_stat (:obj:`str`): The test statistic to use as a numerical summary of the data.
            qtilde (:obj:`bool`): When ``True`` perform the calculation using the alternative
             test statistic, :math:`\tilde{q}_{\mu}`, as defined under the Wald
             approximation in Equation (62) of :xref:`arXiv:1007.1727`
             (:func:`~pyhf.infer.test_statistics.qmu_tilde`).
             When ``False`` use :func:`~pyhf.infer.test_statistics.qmu`.
```

should also get fixed up, as qtilde is no longer an arg

matthewfeickert · 2021-02-09T21:59:34Z

@kratsg @lukasheinrich This PR still needs to address the docstring requests that @kratsg and I made, but beyond that I think it is ready for review. If you both make comments, I will implement anything else that needs to get taken care of. 👍

matthewfeickert · 2021-02-10T00:04:13Z

Also maybe worth discussing if the changes to the EmpiricalDistribution from PR #1160 should be migrated to this PR as well.

lukasheinrich · 2021-02-11T00:34:56Z

Can't approve my own PR, but thanks for rebasing. LGTM.

As kwargs are now used it is easier to just add a "base_distr" keyword arg rather than adding a calc_kwargs specifically.

Seems more explicit, to Feickert at least

The bool qtilde is now no longer used and so should be removed

kratsg

Will continue discussion in #1310 to not hold up this PR any longer.

matthewfeickert assigned lukasheinrich Jul 27, 2020

matthewfeickert added the feat/enhancement New feature or request label Jul 27, 2020

kratsg requested changes Jul 27, 2020

lukasheinrich changed the title ~~calculator kwargs~~ feat: calculator kwargs Jul 29, 2020

lukasheinrich force-pushed the clipped_expected branch 2 times, most recently from f97c9a6 to e3275ff Compare July 30, 2020 01:05

matthewfeickert force-pushed the clipped_expected branch from 512a420 to f3aaad9 Compare August 15, 2020 06:19

This was referenced Feb 4, 2021

feat: clipping asymptotics and incl pvalue for q-like test stats #1160

Open

Toys seems to give strange results #892

Closed

matthewfeickert force-pushed the clipped_expected branch from f3aaad9 to 9125564 Compare February 9, 2021 18:27

matthewfeickert force-pushed the clipped_expected branch from 9125564 to a1c0e4d Compare February 9, 2021 21:28

matthewfeickert changed the title ~~feat: calculator kwargs~~ feat: Add kwarg for AsymptoticCalculator base distribution Feb 9, 2021

matthewfeickert added docs Documentation related tests pytest labels Feb 9, 2021

matthewfeickert requested a review from kratsg February 9, 2021 21:58

matthewfeickert self-assigned this Feb 9, 2021

matthewfeickert reviewed Feb 9, 2021

lukasheinrich added 5 commits February 11, 2021 16:57

calculator kwargs

3838094

1000!

3152982

more content

023198f

default value

345ed4c

make tests pass

3fabda9

lukasheinrich and others added 11 commits February 11, 2021 16:57

what

b40ce2a

clipped test

2541786

Drop calc_kwargs for base_distr keyword after rebase

01ff7e5

As kwargs are now used it is easier to just add a "base_distr" keyword arg rather than adding a calc_kwargs specifically.

Rename base_distr kwarg to calc_base_dist

2764b7a

Seems more explicit, to Feickert at least

Fix doctest

ab0c104

Make test_clipped_normal_calc more verbose

8c434c6

Correct AsymptoticCalculator docstring to use test_stat not qtilde

204fe39

The bool qtilde is now no longer used and so should be removed

Correct ToyCalculator docstring to use test_stat not qtilde

c7ec4a0

Add docstring for calc_base_dist kwarg behavior

d5ae4a5

Revise text on calc_base_dist

887c1c2

Bullet-ize test_stat docstring

0c1a0e1

matthewfeickert force-pushed the clipped_expected branch from da9cbdd to 0c1a0e1 Compare February 11, 2021 22:58

Add colons

5335701

matthewfeickert approved these changes Feb 11, 2021

Correct to mu-hat/sigma space

2d4e6fd

kratsg mentioned this pull request Feb 14, 2021

Learn notebook to explain p-values and test statistics #1310

Open

kratsg approved these changes Feb 14, 2021

kratsg merged commit ae9e39c into master Feb 14, 2021

kratsg deleted the clipped_expected branch February 14, 2021 23:46

matthewfeickert mentioned this pull request Feb 22, 2021

Harmonize docstring for test_stat between AsymptoticCalculator and ToyCalculator #1335

Closed
