[REVIEW]: bayestestR: Describing Effects and their Uncertainty, Existence and Significance within the Bayesian Framework #1541

Closed
35 of 36 tasks
whedon opened this issue Jul 2, 2019 · 59 comments
Labels: accepted, published, recommend-accept, review

Comments

@whedon

whedon commented Jul 2, 2019

Submitting author: @DominiqueMakowski (Dominique Makowski)
Repository: https://github.com/easystats/bayestestR
Version: 0.2.5
Editor: @cMadan
Reviewer: @paul-buerkner, @tjmahr
Archive: 10.5281/zenodo.3361605

Status

Status badge code:

HTML: <a href="http://joss.theoj.org/papers/1d180e6004a0dd1e6b235eb24fe66276"><img src="http://joss.theoj.org/papers/1d180e6004a0dd1e6b235eb24fe66276/status.svg"></a>
Markdown: [![status](http://joss.theoj.org/papers/1d180e6004a0dd1e6b235eb24fe66276/status.svg)](http://joss.theoj.org/papers/1d180e6004a0dd1e6b235eb24fe66276)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@paul-buerkner & @tjmahr, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @cMadan know.

Please try and complete your review in the next two weeks

Review checklist for @paul-buerkner

Conflict of interest

Code of Conduct

General checks

  • Repository: Is the source code for this software available at the repository url?
  • License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • Version: 0.2.5
  • Authorship: Has the submitting author (@DominiqueMakowski) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • Authors: Does the paper.md file include a list of authors with their affiliations?
  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?

Review checklist for @tjmahr

Conflict of interest

Code of Conduct

General checks

  • Repository: Is the source code for this software available at the repository url?
  • License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • Version: 0.2.5
  • Authorship: Has the submitting author (@DominiqueMakowski) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • Authors: Does the paper.md file include a list of authors with their affiliations?
  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?
@whedon
Author

whedon commented Jul 2, 2019

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @paul-buerkner, @tjmahr it looks like you're currently assigned to review this paper 🎉.

⭐ Important ⭐

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' https://github.com/openjournals/joss-reviews:

  2. You may also like to change your default settings for watching repositories in your GitHub profile here: https://github.com/settings/notifications

For a list of things I can do to help you, just type:

@whedon commands

@whedon
Author

whedon commented Jul 2, 2019

Attempting PDF compilation. Reticulating splines etc...

@whedon
Author

whedon commented Jul 2, 2019

@paul-buerkner

I have just finished my review and have very few minor comments.

  • All the citations in the software paper should list a DOI (if they have one) as per reviewer checklist above.
  • The definition of the maximum a-posteriori value as "the most probable value" is not entirely correct for continuous parameters. Instead, it is the value with the highest density (which still has probability zero for continuous parameters).
  • You seem to use the abbreviation MAP both for the maximum a posteriori and the maximum a-priori value. This will likely confuse readers.
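
The density-versus-probability point in the second comment can be illustrated with a minimal sketch in Python (standard library only). It is not bayestestR's implementation: `map_estimate` and `share_within` are hypothetical helpers, and the Gaussian kernel with a rule-of-thumb bandwidth is an assumption made for the illustration.

```python
import math
import random
import statistics

def kde(samples, x, bandwidth):
    """Gaussian kernel density estimate of the samples, evaluated at x."""
    scale = len(samples) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / scale

def map_estimate(samples, grid_size=200):
    """Hypothetical helper: the grid point where the estimated density peaks."""
    # Rule-of-thumb bandwidth (an assumption, not taken from the package).
    bandwidth = 1.06 * statistics.stdev(samples) * len(samples) ** (-1 / 5)
    low, high = min(samples), max(samples)
    grid = [low + (high - low) * i / (grid_size - 1) for i in range(grid_size)]
    return max(grid, key=lambda x: kde(samples, x, bandwidth))

def share_within(samples, center, eps):
    """Share of draws closer than eps to center (an estimate of probability mass)."""
    return sum(1 for s in samples if abs(s - center) < eps) / len(samples)

random.seed(1)
# Stand-in for posterior draws of a continuous parameter (true mode at 0.5).
draws = [random.gauss(0.5, 0.2) for _ in range(2000)]

map_value = map_estimate(draws)
# The mass around the MAP value shrinks toward zero as the window narrows,
# even though the density at that point is the highest of the whole curve.
masses = [share_within(draws, map_value, eps) for eps in (0.1, 0.01, 0.001)]
print(round(map_value, 2), masses)
```

The MAP value is where the density peaks, but the probability of any window around it goes to zero with the window width, which is why "most probable value" is misleading for continuous parameters.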

@cMadan
Member

cMadan commented Jul 8, 2019

Thank you for the thorough review, @paul-buerkner!

@DominiqueMakowski

Dear @paul-buerkner, thanks a lot for your comments! We addressed them in this PR:

Reviewer 1 (@paul-buerkner)

  • All the citations in the software paper should list a DOI (if they have one) as per reviewer checklist above.

  • Added DOIs for all refs but the following (none was found):

    • see package (here)
    • rstanarm package
    • BayesFactor package
    • Mills' "Objective Bayesian Precise Hypothesis Testing"
    • Multiple Comparisons with BayesFactor, Part 2 (Morey's blog, 2015)
    • Practical bayesian optimization of machine learning algorithms (Snoek's proceedings, 2012)
    • Jeffreys' Theory of Probability book
  • The definition of the maximum a-posteriori value as "the most probable value" is not entirely correct for continuous parameters. Instead, it is the value with the highest density (which still has probability zero for continuous parameters).

  • We changed its definition to the following:

"find the Highest Maximum A Posteriori (MAP) estimate of a posterior, i.e., the value associated with the highest probability density (the "peak" of the posterior distribution). In other words, it is an estimation of the mode for continuous parameters."

  • You seem to use the abbreviation MAP both for the maximum a posteriori and the maximum a-priori value. This will likely confuse readers.
  • This was likely an error and was addressed by replacing instances of the latter by the former (the maximum a posteriori).

Please note that there are still some references for which we did not find a DOI: we continue our search in the meantime. We hope you will find the revised version satisfying ☺️

@paul-buerkner

Looks good to me.

@cMadan
Member

cMadan commented Jul 21, 2019

@tjmahr, are you still able to review this submission?

@paul-buerkner, thanks again!!

@tjmahr

tjmahr commented Jul 22, 2019

I would like to review it but won't be able to look at it until next week.

@cMadan
Member

cMadan commented Jul 24, 2019

@tjmahr, no problem, thanks for following up!

@strengejacke

@whedon generate pdf

@whedon
Author

whedon commented Jul 25, 2019

Attempting PDF compilation. Reticulating splines etc...

@whedon
Author

whedon commented Jul 25, 2019

@BarryDeCicco

BarryDeCicco commented Jul 25, 2019 via email

@cMadan
Member

cMadan commented Jul 28, 2019

@BarryDeCicco, you should unsubscribe from the GitHub notifications for this repository, or otherwise change your notification settings. See the second post (#1541 (comment)) in this review thread (same information is available in every review thread).

@tjmahr

tjmahr commented Jul 29, 2019

I worked with version 0.2.4, the most recent on the GitHub repository, although the version mentioned in the checklist is 0.2.3

Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

I do not see anything for contribution guidelines. I would add a CONTRIBUTING.md file.

Comments

The amount of documentation to support the package is very generous. For my review, however, I focused on the README and the software paper.

This package borrows a lot from the Kruschke school of Bayesian inference. HDI and ROPE are a distinct feature of his tutorials and textbook; one does not see them very often in works by, say, Stan developers. Therefore, this package is tremendously useful for those reading his tutorials, or for people like me, who occasionally will quantify an effect with a ROPE percentage or who want to learn about using Bayes factors.

The ROPE procedure and other indices use the highest density interval. Is there any option to use an equal-tailed interval?
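
For readers unfamiliar with the two interval types, the difference can be sketched in Python (standard library only). This is an illustration under assumptions, not the package's code: `eti` and `hdi` here are hypothetical helpers operating on raw draws, and the lognormal draws stand in for a skewed posterior.

```python
import math
import random

def eti(samples, ci=0.89):
    """Equal-tailed interval: cut (1 - ci) / 2 of the draws from each tail."""
    xs = sorted(samples)
    n = len(xs)
    lower = xs[math.floor((1 - ci) / 2 * (n - 1))]
    upper = xs[math.ceil((1 + ci) / 2 * (n - 1))]
    return lower, upper

def hdi(samples, ci=0.89):
    """Highest density interval: the narrowest window holding a share ci of the draws."""
    xs = sorted(samples)
    n = len(xs)
    window = math.ceil(ci * n)  # number of draws inside the interval
    best = min(range(n - window + 1), key=lambda i: xs[i + window - 1] - xs[i])
    return xs[best], xs[best + window - 1]

random.seed(42)
# Right-skewed stand-in for a posterior (e.g. a variance-like parameter).
draws = [random.lognormvariate(0.0, 0.75) for _ in range(5000)]

eti_low, eti_high = eti(draws)
hdi_low, hdi_high = hdi(draws)
print("ETI:", round(eti_low, 2), round(eti_high, 2))
print("HDI:", round(hdi_low, 2), round(hdi_high, 2))
```

For a symmetric posterior the two intervals nearly coincide; for a skewed one the HDI is narrower and shifts toward the mode, which is why the choice matters for ROPE-based indices.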

README

I was confused by the README. When I see R code followed immediately by a plot, I assume that the R code produced the plot. But the functions in the README produce text output (which is not included in the README) and they do not produce plots. I would include the text output of the R code. I would also note that the figures there are diagrams meant to illustrate the statistical concept. The software paper does a good job of making this point clear.

I don't see a demo for eti().

Moreover, 89 is the highest prime number that does not exceed the already unstable 95% threshold (McElreath, 2015).

The primeness of 89 is not important. McElreath's choice of 89 in the Statistical Rethinking text was to illustrate that interval widths are arbitrary and that there is nothing special about 95 or 90 compared to 89.

equivalence_test() a Test for Practical Equivalence based on the

Needs a verb.

I don't understand the Bayes Factor diagram in the README.

a range of -0.05 to -0.05.

This range is the same number twice.

Savage-Dickey density ratio is computed

Should this have a reference?

Probability of a Value

density_at() isn't computing a probability. I would remove estimate_probability() and probability_at() because they are just aliases for density functions and density is the more appropriate term.
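
A short sketch in Python (standard library only) shows why the value of a density function is not a probability. The narrow normal distribution is an assumed stand-in for a posterior, and nothing here reproduces the package's functions.

```python
from statistics import NormalDist

# Stand-in posterior: a narrow normal distribution (an assumption for illustration).
posterior = NormalDist(mu=0.0, sigma=0.1)

# Height of the density curve at 0: about 3.99, which cannot be a probability.
density_at_zero = posterior.pdf(0.0)

# A probability needs an interval, e.g. P(-0.05 < theta < 0.05).
prob_near_zero = posterior.cdf(0.05) - posterior.cdf(-0.05)

print(round(density_at_zero, 2), round(prob_near_zero, 2))
```

A density can exceed 1, whereas the probability of an interval is the area under the curve and always lies between 0 and 1.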

I don't see a demo for the area under the curve functions.

Documentation

A ROPE-based p of 97% means that there is a probability of .97 that a parameter (described by its posterior distribution) is outside the ROPE. On the contrary, a ROPE-based p of -97% means that there is a probability of .97 that the parameter is inside the ROPE. (R/p_rope.R)

I don't understand how a p-value can get a negative percentage. What would a 0% p-value mean? If this index doesn't act like a familiar p-value, it is probably the wrong name for it.

Software paper

The first mention of bayestestR in the second paragraph is awkward. Specifically, the text shifts from talking about common ways to describe effects in a Bayesian framework to talking about the features of the package:

Additionally, bayestestR also focuses on implementing a Bayesian null-hypothesis testing framework ...

It's great that the output of point_estimate() prints out mean/median/map to make it clear what value is being used.

However, bayestestR functions also include plotting capabilities via the see package (Lüdecke, Waggoner, Ben-Shachar, & Makowski, 2019).

I don't see any plotting examples in the README or documentation pages. I see plotting methods in the NAMESPACE.

I think it would be worthwhile to demonstrate that the functions demoed in the article also work on models. For example, I can call p_direction() and bayesfactor_parameters() directly on a model and get the results for each parameter. One of the key contributions of this package is that it can make these indices immediately available to users who are comfortable with brms and rstanarm.
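
As a language-neutral sketch of what "results for each parameter" means for an index such as the probability of direction, the following Python snippet (standard library only) applies one function to every parameter's draws. The dictionary of draws is a hypothetical stand-in for a fitted model, and `p_direction` here is an illustrative helper that assumes the index is the share of the posterior with the same sign as the median; it is not the package's function.

```python
import random
import statistics

def p_direction(samples):
    """Share of draws that have the same sign as the posterior median."""
    median = statistics.median(samples)
    if median >= 0:
        same_sign = sum(1 for s in samples if s > 0)
    else:
        same_sign = sum(1 for s in samples if s < 0)
    return same_sign / len(samples)

random.seed(7)
# Hypothetical posterior draws for two coefficients of a fitted model.
posterior = {
    "intercept": [random.gauss(2.0, 0.5) for _ in range(4000)],
    "slope": [random.gauss(-0.1, 0.3) for _ in range(4000)],
}

# One index value per parameter, as a model-level call would report.
pd_by_parameter = {name: p_direction(draws) for name, draws in posterior.items()}
print(pd_by_parameter)
```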

Proofreading concerns

Every reference of Kruschke spells out the author's full name.

Figure 2 should be referenced in the text.

(i.e. the difference

Needs a comma.

The Bayesian framework allows to neatly delineate

Allows one.

developped

Nevertheless, in the absence of user-provided values, bayestestR will automatically find an appropriate range

Nevertheless doesn't make sense.

bases on prior and posterior samples

Based.

The system for building the references section should protect some words from being converted to lowercase. (In LaTeX, this is done with {}). Right now, for example, it says Brms: An r package for bayesian multilevel models using stan but I would make sure that the system produces brms: An R package for Bayesian multilevel models using Stan.

@DominiqueMakowski

Dear @tjmahr, thanks a lot for your thorough review. We have addressed your comments in this PR:

Features

  • The ROPE procedure and other indices use the highest density interval. Is there any option to use an equal-tailed interval?

We added a ci_method argument in rope() to allow for ETI to be used.
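
The effect of switching the interval type can be sketched in Python (standard library only). This is not the package's code: `rope_percentage` and its `method` parameter are hypothetical names that only mirror the idea of a `ci_method` switch, and the sketch assumes the ROPE percentage is the share of draws inside the credible interval that also fall inside the ROPE.

```python
import math
import random

def interval(samples, ci, method):
    """Return an HDI (narrowest window) or ETI (equal tails) from raw draws."""
    xs = sorted(samples)
    n = len(xs)
    if method == "ETI":
        return xs[math.floor((1 - ci) / 2 * (n - 1))], xs[math.ceil((1 + ci) / 2 * (n - 1))]
    if method == "HDI":
        window = math.ceil(ci * n)
        best = min(range(n - window + 1), key=lambda i: xs[i + window - 1] - xs[i])
        return xs[best], xs[best + window - 1]
    raise ValueError("method must be 'HDI' or 'ETI'")

def rope_percentage(samples, rope=(-0.1, 0.1), ci=0.89, method="HDI"):
    """Share of the draws inside the credible interval that lie in the ROPE."""
    low, high = interval(samples, ci, method)
    inside_ci = [s for s in samples if low <= s <= high]
    in_rope = sum(1 for s in inside_ci if rope[0] <= s <= rope[1])
    return in_rope / len(inside_ci)

random.seed(3)
# Skewed stand-in posterior, so that HDI and ETI differ noticeably.
draws = [random.lognormvariate(-2.0, 1.0) for _ in range(5000)]

hdi_pct = rope_percentage(draws, method="HDI")
eti_pct = rope_percentage(draws, method="ETI")
print(round(hdi_pct, 3), round(eti_pct, 3))
```

With this skewed example the HDI reaches further toward zero than the ETI, so a larger share of it falls inside the ROPE.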

  • density_at() isn't computing a probability. I would remove estimate_probability() and probability_at() because they are just aliases for density functions and density is the more appropriate term.

We removed the two aliases with probability. We also clarified in the documentation that it is pertaining to the value of the density function.

README

  • I do not see anything for contribution guidelines. I would add a CONTRIBUTING.md file.

We added a contributing file.

  • I would include the text output of the R code [in the README].

We have additionally included the text output from the R Code in the README.

  • I would also note that the figures there are diagrams meant to illustrate the statistical concept.

We have added a sentence to point out that these figures are meant to illustrate the statistical concepts, and pointed the readers to the see-package, where plotting-methods are provided:

"The following figures are meant to illustrate the (statistical) concepts behind the functions. However, for most functions, plot()-methods are available from the see-package."

  • I don't see a demo for eti().

We have added a demo for eti() to the README.

  • "Moreover, 89 is the highest prime number that does not exceed the already unstable 95% threshold (McElreath, 2015)." The primeness of 89 is not important. McElreath's choice of 89 in the Statistical Rethinking text was to illustrate that interval widths are arbitrary and that there is nothing special about 95 or 90 compared to 89.

We have rephrased the sentence to emphasize the idea behind choosing the 89 as CI-level:

"Moreover, 89 indicates the arbitrariness of interval limits - its only remarkable property is being the highest prime number that does not exceed the already unstable 95% threshold (McElreath, 2015)"

Furthermore, although it was already implied in the paper, we emphasized the point of arbitrariness there as well.

  • "equivalence_test() a Test for Practical Equivalence based on the" Needs a verb.

We added a verb to the sentence:

equivalence_test() is a Test for Practical Equivalence based on the...

  • I don't understand the Bayes Factor diagram in the README.

We have added a paragraph to explain the figure in more detail:

The lollipops represent the density of a point-null on the prior distribution (the blue lollipop on the dotted distribution) and on the posterior distribution (the red lollipop on the yellow distribution). The ratio between the two - the Svage-Dickey ratio - indicates the degree by which the mass of the parameter distribution has shifted away from or closer to the null.
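
The ratio described in this paragraph can be sketched numerically in Python (standard library only). The closed-form normal prior and posterior are assumptions chosen so the two "lollipop" heights can be computed exactly; the sketch does not show how the package estimates densities from samples.

```python
from statistics import NormalDist

# Assumed prior and posterior for a single parameter (illustrative values).
prior = NormalDist(mu=0.0, sigma=1.0)
posterior = NormalDist(mu=0.4, sigma=0.15)
null_value = 0.0

# Heights of the two lollipops: density of the point-null under each distribution.
prior_density = prior.pdf(null_value)
posterior_density = posterior.pdf(null_value)

# Savage-Dickey density ratio, in both directions.
bf_01 = posterior_density / prior_density  # evidence for the null
bf_10 = prior_density / posterior_density  # evidence against the null

print(round(bf_10, 2))
```

Here the posterior density at the null is lower than the prior density, so mass has shifted away from the null and the ratio against the null is larger than 1.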

  • "a range of -0.05 to -0.05." This range is the same number twice.

Thanks, we fixed the typo!

  • "Savage-Dickey density ratio is computed" Should this have a reference?

Thanks, we have added a reference, and furthermore added a reference-list to the end of the README.

  • I don't see a demo for the area under the curve functions.

See comment from TJ below, no longer necessary.

Documentation

  • ROPE-based p documentation: I don't understand how a p-value can get a negative percentage. What would a 0% p-value mean? If this index doesn't act like a familiar p-value, it is probably the wrong name for it.

We have clarified the documentation of this index and underlined its exploratory nature. We also made clear that the negative sign reflects the direction of the index (whether it corresponds to significance or non-significance), rather than actual negative probabilities, which indeed make no sense.

The ROPE-based \emph{p}-value is an exploratory and non-validated index representing the 
maximum percentage of \link[=hdi]{HDI} that does not contain (or is entirely contained, in which 
case the value is prefixed with a negative sign), in the negligible values space defined by the 
\link[=rope]{ROPE}. It differs from the ROPE percentage, \emph{i.e.}, from the proportion of a given
 CI in the ROPE, as it represents the maximum CI values needed to reach a ROPE proportion of 0\% 
or 100\%. Whether the index reflects the ROPE reaching 0\% or 100\% is indicated through the sign:
 a negative sign is added to indicate that the probability corresponds to the probability of a not 
significant effect (a percentage in ROPE of 100\%). For instance, a ROPE-based \emph{p} of 97\% 
means that there is a probability of .97 that a parameter (described by its posterior distribution) is 
outside the ROPE. In other words, the 97\% HDI is the maximum HDI level for which the percentage 
in ROPE is 0\%. On the contrary, a ROPE-based p of -97\% indicates that there is a probability of .97 
that the parameter is inside the ROPE (percentage in ROPE of 100\%). A value close to 0\% would 
indicate that the mode of the distribution falls perfectly at the edge of the ROPE, in which case the 
percentage of HDI needed to be on either side of the ROPE becomes infinitely small. Negative values 
do not refer to negative values \emph{per se}, simply indicating that the value corresponds to non-
significance rather than significance.
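
One possible reading of this definition can be sketched in Python (standard library only). It is an interpretation of the documentation text above, not the package's algorithm: `rope_based_p` is a hypothetical helper that scans HDI levels in steps of 1% and returns the largest level whose HDI lies entirely outside the ROPE (positive sign) or entirely inside it (negative sign).

```python
import math
import random

def hdi(sorted_draws, ci):
    """Narrowest window holding a share ci of the (already sorted) draws."""
    n = len(sorted_draws)
    window = max(2, math.ceil(ci * n))
    best = min(range(n - window + 1),
               key=lambda i: sorted_draws[i + window - 1] - sorted_draws[i])
    return sorted_draws[best], sorted_draws[best + window - 1]

def rope_based_p(samples, rope=(-0.1, 0.1)):
    """Signed maximum HDI level that is fully outside (+) or fully inside (-) the ROPE."""
    xs = sorted(samples)
    outside, inside = 0.0, 0.0
    for k in range(1, 100):  # HDI levels 1% .. 99%
        level = k / 100
        low, high = hdi(xs, level)
        if high < rope[0] or low > rope[1]:
            outside = level  # percentage in ROPE is 0% at this level
        if rope[0] <= low and high <= rope[1]:
            inside = level   # percentage in ROPE is 100% at this level
    return outside if outside >= inside else -inside

random.seed(11)
# Stand-in posteriors: one clearly away from the ROPE, one concentrated inside it.
away = [random.gauss(0.5, 0.2) for _ in range(2000)]
within = [random.gauss(0.0, 0.04) for _ in range(2000)]

p_away = rope_based_p(away)
p_within = rope_based_p(within)
print(p_away, p_within)
```

The sign carries only the direction (outside versus inside the ROPE); the magnitude is an HDI level, not a probability in the usual p-value sense.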

Paper

  • The first mention of bayestestR in the second paragraph is awkward. Specifically, the text shifts from talking about common ways to describe effects in a Bayesian framework to talking about the features of the package: "Additionally, bayestestR also focuses on implementing a Bayesian null-hypothesis testing framework ..."
  • "However, bayestestR functions also include plotting capabilities via the see package (Lüdecke, Waggoner, Ben-Shachar, & Makowski, 2019).": I don't see any plotting examples in the README or documentation pages. I see plotting methods in the NAMESPACE.
  • I think it would be worthwhile to demonstrate that the functions demoed in the article also work on models. For example, I can call p_direction() and bayesfactor_parameters() directly on a model and get the results for each parameter.

Proofreading

  • Every reference of Kruschke spells out the author's full name.

Hopefully fixed (changed the name in the .bib file). However, I am not sure why that would happen. One possible reason is disambiguation, yet all instances were written the same way...

  • Figure 2 should be referenced in the text.
  • "(i.e. the difference": Needs a comma.
  • "The Bayesian framework allows to neatly delineate": Allows one.
  • developped
  • Nevertheless, in the absence of user-provided values, bayestestR will automatically find an appropriate range: Nevertheless doesn't make sense.
  • bases on prior and posterior samples: Based
  • The system for building the references section should protect some words from being converted to lowercase. (In LaTeX, this is done with {}). Right now, for example, it says Brms: An r package for bayesian multilevel models using stan but I would make sure that the system produces brms: An R package for Bayesian multilevel models using Stan.

Typos have been fixed.

We hope you will be satisfied with the revisions ☺️

@strengejacke

@whedon generate pdf

@whedon
Author

whedon commented Jul 30, 2019

Attempting PDF compilation. Reticulating splines etc...

@whedon
Author

whedon commented Jul 30, 2019

@tjmahr

tjmahr commented Jul 30, 2019

We have added a paragraph to explain the figure in more detail:

The lollipops represent the density of a point-null on the prior distribution (the blue lollipop on the dotted distribution) and on the posterior distribution (the red lollipop on the yellow distribution). The ratio between the two - the Svage-Dickey ratio - indicates the degree by which the mass of the parameter distribution has shifted away from or closer to the null.

Just fix the typo in Savage-Dickey, and I'm satisfied.

@tjmahr

tjmahr commented Jul 30, 2019

Also, thanks, in particular, for adding the ETI functionality for the ROPE methods.

@danielskatz

👋 @DominiqueMakowski - please see easystats/bayestestR#217 and merge it - also carefully check the rest of the bib to make sure I didn't miss anything else (e.g., words in lower case that should be in upper case, odd periods at the end of titles, etc.)

@strengejacke

@danielskatz Thanks for the thorough reading! I have read the paper and checked all hyperlinks, everything looks good so far.

I'll go through the paper and check the references now.

@strengejacke

@whedon generate pdf

@whedon
Author

whedon commented Aug 12, 2019

Attempting PDF compilation. Reticulating splines etc...

@whedon
Author

whedon commented Aug 12, 2019

@danielskatz

danielskatz commented Aug 12, 2019

In addition, here are some changes for the paper. (in a PR that I forgot to add but has now been merged) :)

@strengejacke

Thanks for the language editing! I have gone through the references and found some minor changes. I will hand over to @DominiqueMakowski for the final check.

@danielskatz

Ok - please let me know when you & @DominiqueMakowski are done, then we can proceed.

@DominiqueMakowski

@whedon generate pdf

@whedon
Author

whedon commented Aug 13, 2019

Attempting PDF compilation. Reticulating splines etc...

@whedon
Author

whedon commented Aug 13, 2019

@DominiqueMakowski

@danielskatz Thanks a lot for your changes!
@cMadan I think we are good to go ☺️

@danielskatz

@whedon accept

@whedon
Author

whedon commented Aug 13, 2019

Attempting dry run of processing paper acceptance...

@whedon
Author

whedon commented Aug 13, 2019

Check final proof 👉 openjournals/joss-papers#901

If the paper PDF and Crossref deposit XML look good in openjournals/joss-papers#901, then you can now move forward with accepting the submission by compiling again with the flag deposit=true e.g.

@whedon accept deposit=true

@danielskatz

@whedon accept deposit=true

@whedon
Author

whedon commented Aug 13, 2019

Doing it live! Attempting automated processing of paper acceptance...

@whedon
Author

whedon commented Aug 13, 2019

🐦🐦🐦 👉 Tweet for this paper 👈 🐦🐦🐦

@whedon
Author

whedon commented Aug 13, 2019

🚨🚨🚨 THIS IS NOT A DRILL, YOU HAVE JUST ACCEPTED A PAPER INTO JOSS! 🚨🚨🚨

Here's what you must now do:

  1. Check final PDF and Crossref metadata that was deposited 👉 Creating pull request for 10.21105.joss.01541 joss-papers#902
  2. Wait a couple of minutes to verify that the paper DOI resolves https://doi.org/10.21105/joss.01541
  3. If everything looks good, then close this review issue.
  4. Party like you just published a paper! 🎉🌈🦄💃👻🤘

Any issues? notify your editorial technical team...

@whedon
Author

whedon commented Aug 13, 2019

🎉🎉🎉 Congratulations on your paper acceptance! 🎉🎉🎉

If you would like to include a link to your paper from your README use the following code snippets:

Markdown:
[![DOI](https://joss.theoj.org/papers/10.21105/joss.01541/status.svg)](https://doi.org/10.21105/joss.01541)

HTML:
<a style="border-width:0" href="https://doi.org/10.21105/joss.01541">
  <img src="https://joss.theoj.org/papers/10.21105/joss.01541/status.svg" alt="DOI badge" >
</a>

reStructuredText:
.. image:: https://joss.theoj.org/papers/10.21105/joss.01541/status.svg
   :target: https://doi.org/10.21105/joss.01541

This is how it will look in your documentation:

DOI

We need your help!

Journal of Open Source Software is a community-run journal and relies upon volunteer effort. If you'd like to support us please consider doing either one (or both) of the following:

@DominiqueMakowski

@cMadan @danielskatz @paul-buerkner @tjmahr Thanks a lot again for your time and contributions! 😍

@ajstewartlang

@paul-buerkner

Hey! I am currently swamped with reviews that I still have to complete so I cannot accept new ones right now unfortunately.
