[JOSS REVIEW] Main feedback #93
Comments
Edited to be clear that |
Making one minor change: Adding |
Hi @IndrajeetPatil, great job on addressing the comments above. I still have one minor issue remaining with the manuscript. With reference to providing a concrete example on L13-21, I think your reasoning is sensible and the examples can be omitted (see your response below).
However, I still believe the manuscript would benefit from addressing my second point ("If you are suggesting that in order to..."). To clarify this point, I was not necessarily stating that you need explicit examples, but that the text on L29-31 is unclear.
My assumed interpretation is that: In many packages, the statistical test objects returned by the test functions
If my explanation above is correct, this is a different meaning than what is implied by saying that "functions from the same package can return different types", with the type "depending on the function", which implies either multiple function return values or multiple functions for a test object. I believe it is important to revise this small sentence and make it clear, because it is a major strength of your package!
Further information
Basically, I am assuming that what you are getting at is that for a single object (let's take data(sleep)
myobject = BayesFactor::ttestBF(x = sleep$extra[sleep$group==1], y = sleep$extra[sleep$group==2], paired=TRUE)
bayes_factor = BayesFactor::extractBF(myobject)
sample_size = nrow(attributes(myobject)$data)
attributes(myobject)$bayesFactor[["bf"]]
ps_median = parameters::model_parameters(myobject)$Median
And once you factor in switching between multiple approaches (robust/parametric), it becomes completely unmanageable. |
Thanks, Michael, for pushing me to be clearer here and for your suggested rephrasing. This is what I have changed the paragraph to. Do you find this more reasonable?
|
Perfect @IndrajeetPatil ! |
Submission: openjournals/joss-reviews#3236
Title: statsExpressions: R Package for Tidy Dataframes and Expressions with Statistical Details
Review Outcome: Minor Revisions
Overview
statsExpressions is an R package for quickly running various forms of statistical tests using a single framework (e.g., consistent syntax) and offers users "pre-formatted statistical expressions" for annotating plots, as well as consistent dataframe outputs. The package covers a comprehensive range of commonly used statistical tests (at least in the behavioral sciences).
I commend the author on their effort; the package appears to be well written and documented, with comprehensive tests and exhaustive use of continuous integration. I installed the software successfully, and ran a few statistical tests myself. I could see myself using this software in the future.
At this stage, there are a number of relatively minor revisions that should be made to the package and manuscript. Below, I list the changes which would allow me to recommend acceptance.
Manuscript and Documentation
I've made some minor typographical corrections and commentary to the manuscript in this attached PDF (10.21105.joss.03236_WithComments.pdf). Below are more general or substantial issues to be addressed.
Package summary and need statement
There is currently inconsistency across sources regarding the exact purpose and scope of the package which needs to be resolved.
These various sources emphasize somewhat different package features. For instance, the statement of need does not
mention expressions or consistent dataframe outputs, while the vignette description downplays what appears to be a key selling point of the package: consistent syntax and outputs.
There is a need for these various descriptions to become more homogeneous (consistent) with each other. This will
require updating in documentation also (README and Vignettes). I foresee that a hybrid of all three descriptions that clearly explains the purpose of the package is warranted.
Note that I have seen mixed answers regarding whether an explicit statement of need is required if it is already
covered in the summary (in the case you experience a lot of overlap). @mikldk
You should mention the specific user-base (target audience) this package would appeal to in the statement of need (if applicable).
I also want to flag that it was not immediately clear to me what an "expression" was, and I initially had to read
further into the README to know what the term "expression" was referring to. I recommend placing some added description in parentheses when first introducing the term expression (in both manuscript and README). For instance, "expression (i.e., a pre-formatted in-text statistical result)".
Comparison to Other Packages
I had a few issues with the contents of this section which require revision.
Only the last paragraph (L43-46) is directly discussing comparisons to other packages. The remaining content could be moved into a separate subsection, or alternatively, have a new header (e.g., "Package overview").
The language in L23-36 is somewhat speculative and encroaches on informality in tone, particularly from L32. I suggest revising for objective tone, noting the following:
statsExpressions covers, which of these have an object that return different data types for similar statistical tests?
statsExpressions …. "
Tidy Dataframes from Statistical Analysis
In the Tidy section starting from L47, the examples demonstrate that effectively only a single string needs to be changed in the statsExpressions::two_sample_test function to use a parametric/robust t-test. This section can be simplified and reduced by removing the code examples (which verge on being API examples), and stating directly in-line exactly what the functionality is. An example replacement could be: "A user can simply modify one argument in the statsExpressions::two_sample_test function to change between a robust, parametric or Bayesian analysis."
The output code is not needed in the manuscript. If you believe it is crucial to show somewhere the dataframe format, I would dedicate a specific section and description to it, rather than show it for all functions by default.
You do not need full reproducible examples (setting seed; library loading) in the paper, particularly as these are already contained on the README.
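For readers of this thread (rather than for the manuscript itself), the one-argument switch described above can be sketched roughly as follows. This is a minimal illustration, not the author's own example: it assumes the built-in ToothGrowth dataset, and that "parametric", "robust" and "bayes" are the accepted values of the type argument; the names test_types and results are hypothetical.

```r
library(statsExpressions)

# Assumed values of the `type` argument; only this string changes between runs.
test_types <- c("parametric", "robust", "bayes")

# Run the same two-sample comparison (tooth length by supplement type)
# under each statistical approach, keeping every other argument fixed.
results <- lapply(test_types, function(test_type) {
  two_sample_test(
    data = ToothGrowth,
    x    = supp, # grouping variable (factor with two levels)
    y    = len,  # numeric outcome
    type = test_type
  )
})
names(results) <- test_types

# Each element is a dataframe summarising one analysis.
str(results, max.level = 1)
```

Because only the string passed to type differs, a single sentence in the manuscript should be enough to convey this behaviour.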
I would also specify the dplyr grouped example in text (i.e., say it can be done, mention the function used), and point the reader to the vignettes or README where further details can be obtained.
Expressions for Plots
Figure 2 appears very useful and important, and immediately makes clear what the package offers. I suggest it appear first, with a description of the philosophy (i.e., gold standard), and then show the plot afterwards as an example output. You also can probably remove the generating code here, given the only unique argument is one string (df$expression[1]).
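As a rough sketch of why the generating code can be dropped, the only package-specific step in such a plot is passing the stored expression to a label. The example below is an assumption-laden illustration rather than the code behind Figure 2: it uses the built-in ToothGrowth dataset, a plain ggplot2 boxplot, and [[1]] indexing on the assumption that the expression column is a list column; the names df and p are hypothetical.

```r
library(ggplot2)
library(statsExpressions)

# Run the test once; the result is assumed to carry an `expression` column.
df <- two_sample_test(
  data = ToothGrowth,
  x    = supp,
  y    = len,
  type = "parametric"
)

# Everything below is ordinary ggplot2; only the subtitle comes from the package.
p <- ggplot(ToothGrowth, aes(x = supp, y = len)) +
  geom_boxplot() +
  labs(subtitle = df$expression[[1]])

p
```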
References
The APA reference is appearing incorrectly; you can address this in LaTeX by wrapping the name in {{name}}, so that American Psychological Association is treated as a single full name. Also ensure you specify which APA version (6 or 7), rather than the original text from 1985.
Other Feedback (related to package)
if or how much data was omitted due to this. I did wonder if it would be beneficial to have this information in the
dataframe (or warning), as it seems particularly important for grouped analyses. That said, I also do not see this as a concern precluding publication. I leave this to the discretion of the author to decide.
…
notation, meaning additional arguments to the underlying packages are not supported. This itself is not an issue, but I do think it would be worthwhile mentioning in the manuscript or documentation that some arguments (likely only to be used for specific/advanced reasons) are not supported in the package (as a footnote perhaps).
Please let me know if you have any questions or require clarification. Thank you for inviting me to review this work.