Confidence levels are not p-values #162

Closed
pberkes opened this issue Nov 16, 2024 · 9 comments · Fixed by #188

Comments

@pberkes (Collaborator) commented Nov 16, 2024

Sometimes the docs and the code refer to the confidence levels (0.95, 0.9, etc.) used to compute the confidence intervals as "p-values", but they really are not. A p-value is a frequentist concept that is far removed from Bayesian confidence intervals.

Do you agree? If so, we should change it.

@pberkes (Collaborator, Author) commented Dec 4, 2024

Is there consensus on this in principle?

In practice, it's mostly a documentation and variable-naming change.

@dekuenstle (Member)

Yes, they should be renamed to credibility intervals.

@otizonaizit (Collaborator)

> Yes, they should be renamed to credibility intervals.

I think this depends on what term gets used in the field. If people are used to calling these things confidence intervals, I don't think it is a service to the user to call them differently, even if the new name is more accurate. The more accurate name can be mentioned in the documentation, to make it clear that the authors know what they are talking about and use the conventional name for simplicity.

Regarding p-value vs confidence level, I think I approved too soon. For that renaming the same argument as above applies: what is the term used most commonly in the field? Even if a name is not correct, I think it is still more useful than the more correct name if no one is using it.

Again, this very much depends on the field. So @guillermoaguilar, @lschwetlick, @FelixWichmann: what do you think?

@guillermoaguilar (Collaborator)

I definitely agree with Tiziano: the clarification can be added to the documentation, but it is more common to call these confidence intervals, even if we know the correct name is credible intervals.
This also has to do with the users: if they read 'credible', they might think it is not a confidence interval, get confused, and ask themselves how to get confidence intervals.

I reckon that's also why the documentation of the MATLAB version also uses 'confidence'.

@guillermoaguilar (Collaborator)

> Regarding p-value vs confidence level, I think I approved too soon. For that renaming the same argument as above applies: what is the term used most commonly in the field? Even if a name is not correct, I think it is still more useful than the more correct name if no one is using it.

I missed what was changed. But in the MATLAB documentation 'confidence level' is used, and that's also how I would expect the field to understand it: 0.95 indicates a 95% confidence interval.

@guillermoaguilar (Collaborator)

> Sometimes the docs and the code refer to the confidence levels (0.95, 0.9, etc.) used to compute the confidence intervals as "p-values", but they really are not.

So IMO this should be "confidence level" all over. In the Jupyter Book documentation I don't see that we use p-value.

@pberkes (Collaborator, Author) commented Dec 5, 2024

In the code we use p-value in a bunch of places, e.g.:

def confidence_intervals(probability_mass: np.ndarray, grid_values: np.ndarray,
                         p_values: Sequence[float], mode: str) -> Dict[str, list]:
    """ Confidence intervals on probability grid.

    Supports two methods:

        - 'project', projects the confidence region down each axis.
          Implemented using :func:`grid_hdi`.
        - 'percentiles', finds alpha/2 and 1-alpha/2 percentiles (alpha = 1-p_value).
          Implemented using :func:`percentile_intervals`.

    Args:
        probability_mass: Probability mass at each grid point, shape (n_points, n_points, ...)
        grid_values: Parameter values along grid axis in the same order as zerocentered_normal_mass dimensions,
                     shape (n_dims, n_points)
        p_values: Probabilities of confidence in the intervals.
        mode: Either 'project' or 'percentiles'.
    Returns:
        A dictionary mapping p_values as a string to a list containing the start and end grid-values for the
        confidence interval per dimension, shape (n_dims, 2).
    Raises:
        ValueError for unsupported mode or sum(probability_mass) != 1.
     """

@guillermoaguilar (Collaborator) commented Dec 5, 2024

OK, so there are two things here; let's not mix them up.

One thing is the nomenclature of confidence vs credible interval. In frequentist statistics we talk about confidence intervals, in Bayesian statistics about credible intervals. They mean different things mathematically, but people use them interchangeably, and it's better to stick to confidence. In the documentation we can clarify that they are really credible intervals.

The other point is confidence level vs p-value. And yes, confidence levels are not p-values (at all!), as the title of the issue says.

So

> p_values: Probabilities of confidence in the intervals.

is really wrong, even from a frequentist perspective.

What we do in psignifit is to set 'confidence levels', say 0.95 and 0.9, and calculate confidence (credible) intervals with that confidence.

So in the functions the argument should be confidence_level, and the description:

confidence_level: the confidence level for the computed confidence intervals.
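For illustration, a hedged sketch of how the renamed signature might read. The plural name confidence_levels and the docstring wording below are assumptions for this sketch, not necessarily what was merged in #188.

from typing import Dict, Sequence

import numpy as np

def confidence_intervals(probability_mass: np.ndarray, grid_values: np.ndarray,
                         confidence_levels: Sequence[float], mode: str) -> Dict[str, list]:
    """ Confidence (credible) intervals on the probability grid.

    Args:
        probability_mass: Probability mass at each grid point.
        grid_values: Parameter values along each grid axis, shape (n_dims, n_points).
        confidence_levels: Confidence levels (e.g. 0.95, 0.9) for the computed intervals.
        mode: Either 'project' or 'percentiles'.
    Returns:
        A dictionary mapping each confidence level (as a string) to the start and end
        grid values of the interval per dimension, shape (n_dims, 2).
    """
    # Body omitted; only the renamed signature and docstring are sketched here.
    ...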

@pberkes (Collaborator, Author) commented Dec 5, 2024

Yes, the issue is about the p-levels, and it seems like everyone agrees they should be renamed to confidence levels.
