-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Updated badge criteria, and completed for Allen. Detailed uncertainit…
…ies in logbook. Plus a few other minor changes.
- Loading branch information
1 parent
ba875be
commit eaf2f53
Showing
4 changed files
with
102 additions
and
53 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
--- | ||
title: "Day 5" | ||
author: "Amy Heather" | ||
date: "2024-06-04" | ||
categories: [guidelines] | ||
--- | ||
|
||
::: {.callout-note} | ||
|
||
Evaluated study against badges. | ||
|
||
::: | ||
|
||
## Work log | ||
|
||
### Badges | ||
|
||
Evaluating artefacts as in https://zenodo.org/records/3760626 (as has been copied into `original_study/`). | ||
|
||
Felt uncertain around these criteria: | ||
|
||
* `documentation_sufficient` - it was missing one package from environment - but it had an environment file, package versions, and a README explaining how to set up environment and referring to a notebook which runs the code that produces the output - hence, despite missing one, I felt it met this criteria | ||
* `documentation_careful` - I feel the documentation is minimal, but that it was actually still sufficient for running the model - and, as such, presume it might meet this criteria? Unless "reuse" refers to being able to use and change - in which case, there is not much guidance on how to change the parameters, only on how to run it matching up with the paper? In which case, I don't think it meets these. I think that interpretation also makes sense, as it then distinguishes from "sufficient"? | ||
* The criteria this relates to is from @association_for_computing_machinery_acm_artifact_2020 : "The artifacts associated with the paper are of a quality that significantly exceeds minimal functionality. That is, they have all the qualities of the Artifacts Evaluated – Functional level, but, in addition, they are very carefully documented and well-structured to the extent that reuse and repurposing is facilitated. In particular, norms and standards of the research community for artifacts of this type are strictly adhered to" | ||
* `execute` - yes, but with one change to environment. Do they allow changes? I've selected yes, as I'm assuming minor troubleshooting is allowed, and the distinction between this and reproduction is that this is just about running scripts, whilst reproduction is about getting sufficiently similar results. Execution required in ACM and IEEE criteria... | ||
* From @association_for_computing_machinery_acm_artifact_2020: "Included scripts and/or software used to generate the results in the associated paper can be successfully executed, and included data can be accessed and appropriately manipulated", with no mention of whether minor troubleshooting is allowed | ||
* From @institute_of_electrical_and_electronics_engineers_ieee_about_nodate: "runs to produce the outputs desired", with no mention of whether minor troubleshooting is allowed | ||
* `regenerated` - as above, unclear whether modification is allowed. For simplicity, have assumed definition we allowed - that you can troubleshoot - but this might not align with journals. Is this an issue? | ||
* `hour` - failed this, but we weren't trying to do this within an hour. If I had been asked to do that (rather than spend time reading and thinking beforehand), I anticipate I could've run it (without going on to then add seeds etc). Hence, does it pass this? | ||
* Psychological Science (@hardwicke_transparency_2023, @association_for_psychological_science_aps_psychological_2023) require reproduction within an hour, which I think implies some minor troubleshooting would be allowed? | ||
* Others don't specify either way | ||
* `documentation_readme` - I wouldn't say it explicitly meets this criteria, although it was simple enough that it could do it anyway | ||
|
||
## Suggested changes for protocol/template | ||
|
||
✅ = Made the change. | ||
|
||
Protocol: | ||
|
||
* Suggest that uncertainties on whether a badge criteria is met or not could be discussed within the STARS team |