
improving summary output for BFBayesFactor objects #620

Closed
IndrajeetPatil opened this issue Oct 20, 2021 · 16 comments
Labels
Consistency 🍏 🍎 Expected output across functions could be more similar

Comments

@IndrajeetPatil
Member

IndrajeetPatil commented Oct 20, 2021

We should include the name of the frequentist effect size estimate in the Parameter column:

For example:

current behavior

library(BayesFactor)
library(parameters)

ttestBF(mtcars$wt, mu = 3) |> parameters() |> as.data.frame()
#>     CI     CI_low   CI_high  Parameter     Median     pd ROPE_Percentage
#> 1 0.95 -0.5512066 0.1682628 Difference -0.2039748 0.8745       0.2365167
#> 2 0.95 -0.1729238 0.5331445       <NA>         NA     NA              NA
#>   Prior_Distribution Prior_Location Prior_Scale        BF  Cohens_d
#> 1             cauchy              0   0.7071068 0.3868518        NA
#> 2               <NA>             NA          NA        NA 0.1977558
#>            Method
#> 1 Bayesian t-test
#> 2 Bayesian t-test

Created on 2021-10-20 by the reprex package (v2.0.1)

expected behavior

library(BayesFactor)
library(parameters)

ttestBF(mtcars$wt, mu = 3) |> parameters() |> as.data.frame()
#>     CI     CI_low   CI_high  Parameter     Median     pd ROPE_Percentage
#> 1 0.95 -0.5512066 0.1682628 Difference -0.2039748 0.8745       0.2365167
#> 2 0.95 -0.1729238 0.5331445   Cohens_d  0.1977558     NA              NA
#>   Prior_Distribution Prior_Location Prior_Scale        BF
#> 1             cauchy              0   0.7071068 0.3868518
#> 2               <NA>             NA          NA        NA
#>            Method
#> 1 Bayesian t-test
#> 2 Bayesian t-test

The same applies to output from contingency table analyses, where we return Cramer's V, phi, etc.

@IndrajeetPatil IndrajeetPatil added the Consistency 🍏 🍎 Expected output across functions could be more similar label Oct 20, 2021
@strengejacke

This comment has been minimized.

@mattansb

This comment has been minimized.

@IndrajeetPatil
Member Author

I would personally vote for "One row for estimate + effect size", but if and only if we also make the behavior of model_parameters() for BFBayesFactor objects similar to that for htest objects: return frequentist effect sizes only when the user asks for them.

So, for example, the following would return only Median + CI columns:

BayesFactor::ttestBF(mtcars$wt, mu = 3) |> 
  parameters::model_parameters()

While this would return Median + CI columns plus Cohen's d + CI columns:

BayesFactor::ttestBF(mtcars$wt, mu = 3) |> 
  parameters::model_parameters(standardized_d = TRUE)

@strengejacke
Member

Currently, we merge the results from describe_posterior() and effectsize(), so new rows are added. Instead, we could just cbind(), or not?
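The difference between the two merge strategies can be sketched in base R. This is a minimal illustration with hypothetical stand-in data frames (not the actual describe_posterior()/effectsize() output), using values from the reprex above:

```r
# Hypothetical stand-ins for the two result tables (assumed shapes):
posterior <- data.frame(Parameter = "Difference", Median = -0.204, BF = 0.387)
effsize   <- data.frame(Cohens_d = 0.198, d_CI_low = -0.173, d_CI_high = 0.533)

# Suggested approach: cbind() keeps everything in a single row, no NA padding
one_row <- cbind(posterior, effsize)

# Current approach, roughly: row-binding with NA fill, which adds a
# second row that is mostly NA
all_cols <- union(names(posterior), names(effsize))
pad <- function(df) {
  df[setdiff(all_cols, names(df))] <- NA  # add missing columns as NA
  df[all_cols]                            # align column order
}
two_rows <- rbind(pad(posterior), pad(effsize))

nrow(one_row)   # 1
nrow(two_rows)  # 2
```

The cbind() variant avoids the `<NA>` cells seen in the current output, at the cost of a wider data frame.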

@IndrajeetPatil

This comment has been minimized.

IndrajeetPatil added a commit that referenced this issue Oct 22, 2021
@IndrajeetPatil
Member Author

What remains to be decided:
Should we bind the frequentist effect size plus CIs as additional rows or columns?

@strengejacke
Member

I think we decided on columns, as we have for htests?

@IndrajeetPatil
Member Author

Yes, that will work for t-tests, but the output from contingency table analysis is an exception, since it's not a one-row return:

library(BayesFactor)
library(parameters)
data(raceDolls)

contingencyTableBF(raceDolls, sampleType = "indepMulti", fixedMargin = "cols") |>
  parameters() |>
  as.data.frame()
#> Warning: Could not estimate a good default ROPE range. Using 'c(-0.1, 0.1)'.
#>     CI     CI_low   CI_high Parameter    Median pd ROPE_Percentage
#> 1 0.95 0.23724792 0.3778516 cell[1,1] 0.3088749  1      0.00000000
#> 2 0.95 0.08060299 0.1836630 cell[2,1] 0.1331295  1      0.09129177
#> 3 0.95 0.23433029 0.3741541 cell[1,2] 0.2979956  1      0.00000000
#> 4 0.95 0.19290301 0.3234083 cell[2,2] 0.2563099  1      0.00000000
#> 5   NA         NA        NA     Ratio        NA NA              NA
#> 6 0.95 0.02164193 0.3011174      <NA>        NA NA              NA
#>        Prior_Distribution Prior_Location Prior_Scale       BF Cramers_v
#> 1                    <NA>             NA          NA 1.814856        NA
#> 2                    <NA>             NA          NA 1.814856        NA
#> 3                    <NA>             NA          NA 1.814856        NA
#> 4                    <NA>             NA          NA 1.814856        NA
#> 5 independent multinomial              0           1 1.814856        NA
#> 6                    <NA>             NA          NA       NA 0.1621184
#>                                Method
#> 1 Bayesian contingency table analysis
#> 2 Bayesian contingency table analysis
#> 3 Bayesian contingency table analysis
#> 4 Bayesian contingency table analysis
#> 5 Bayesian contingency table analysis
#> 6 Bayesian contingency table analysis

Compare it to the htest return:

library(parameters)

chisq.test(table(mtcars$am, mtcars$cyl)) |> 
  parameters(cramers_v = TRUE) |> 
  as.data.frame()
#> Warning in chisq.test(table(mtcars$am, mtcars$cyl)): Chi-squared approximation
#> may be incorrect
#>       Chi2 df Cramers_v   CI Cramers_CI_low Cramers_CI_high          p
#> 1 8.740733  2 0.5226355 0.95       0.180431               1 0.01264661
#>                       Method
#> 1 Pearson's Chi-squared test

Created on 2021-10-24 by the reprex package (v2.0.1)

I am wondering if we should return only the Ratio parameter for the Bayesian contingency table?

JASP also does the same:

[screenshot of JASP output omitted]

@mattansb
Member

I think it's redundant to return the posterior cell proportions/counts from the BFBayesFactor model. The analysis is intended to estimate and evaluate the dependency - which is captured not by the individual cells, but by the effect size(s).

@mattansb
Member

(@IndrajeetPatil why do you as.data.frame() everything?? @strengejacke worked so hard on those beautiful printing methods!)

@IndrajeetPatil
Member Author

Sorry 🙈
As a user, I appreciate the pretty-printing, but as a developer, I want to see everything that exists in the data frame.

I think it's redundant to return the posterior cell proportions/counts

I think so, too! So if we decide not to return those rows, the output will be a one-row output, and it will be consistent with the htest summary outputs!

@strengejacke
Member

So from your example, that would be rows 5 and 6 that will be merged into one row, right?

@IndrajeetPatil
Member Author

Exactly!

When cramers_v = NULL: only BF values
When TRUE: also frequentist effect sizes in the same row
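The proposed dispatch could be sketched roughly as follows. This is a hypothetical illustration, not the actual model_parameters() internals; bf_parameters() is an invented name, and the BF and Cramer's V values are taken from the reprex above:

```r
# Hypothetical sketch of the proposed behavior: effect size columns are
# appended to the same row, and only when explicitly requested.
bf_parameters <- function(model, cramers_v = NULL) {
  out <- data.frame(Parameter = "Ratio", BF = 1.814856)  # placeholder summary
  if (isTRUE(cramers_v)) {
    # same row, extra column(s) instead of an extra NA-padded row
    out <- cbind(out, data.frame(Cramers_v = 0.1621184))
  }
  out
}

bf_parameters(NULL)                    # only the BF column
bf_parameters(NULL, cramers_v = TRUE)  # BF + Cramer's V in one row
```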

@mattansb
Member

(The effect size isn't frequentist; it is based on the posteriors!)

@IndrajeetPatil
Member Author

Can we please fix this soon? I haven't been able to work on my package for the past two weeks because of this issue.

The GitHub version of {effectsize} breaks {statsExpressions}, so the next update will also need to inform CRAN about that 🙈

I am really looking forward to the 1.0.0 release 😅

IndrajeetPatil added a commit that referenced this issue Nov 5, 2021
don't include proportions by default
#620 (comment)
@IndrajeetPatil
Member Author

Since the outputs have changed since then and this issue has become too murky, I am closing it in favor of the cleaner issue #630.
