Curating ranked papers for relevance #1351
Thanks @nalikapalayoor! One minor question, also for @nagutm: I think most of the irrelevant entries are irrelevant fundamentally because the paper is not about an identifiers resource (or about a provider for an identifiers resource). Are we overusing the
Also, once we clarify the curation tags, we should lint this file to make sure it's sorted, as the test suggests.
I think that the I think that the The I think these revised definitions/tags would help make the distribution of tags for irrelevant papers more balanced and less one-sided than it currently is.
@nagutm is it straightforward to make these changes to the table overall (i.e., past |
Changing the tags for all previous papers tagged as |
Hi, thank you for looking at this! I can go in and reclassify the papers associated with this PR based on those redefined tags, and I will also relint the file. I will push these changes later today!
I have updated the tags for the papers determined to be irrelevant, based on the new definitions discussed:
- not_identifiers_resource: papers with software repos, data visualization tools, externally linked resources, as well as databases not related to defining new identifiers
- non_resource_papers: self-contained papers that don't link to external resources
- irrelevant_other: all other irrelevant papers

I also linted the curated_papers.tsv file after updating these tags.
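For readers following along, the linting step mentioned above could be sketched roughly as below. This is only an illustration, not the project's actual lint command: the function names `lint_tsv` and `is_linted`, the column names in the example, and the choice to sort rows by their column values from left to right are all assumptions.

```python
import csv
import io


def lint_tsv(text: str) -> str:
    """Return TSV text with the header kept first and the data rows sorted.

    Assumption: rows are compared as lists of strings, column by column,
    so numeric-looking values sort lexicographically ("10" before "2").
    """
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    if not rows:
        return text
    header, body = rows[0], rows[1:]
    body.sort()
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(body)
    return out.getvalue()


def is_linted(text: str) -> bool:
    """Check whether the text is already in the sorted form."""
    return lint_tsv(text) == text


# Hypothetical two-column example; the real file's columns may differ.
example = "pmid\trelevancy_type\n2\tirrelevant_other\n1\tnot_identifiers_resource\n"
linted = lint_tsv(example)
```

A test that asserts `is_linted(...)` on the file's contents would fail whenever a new curation is appended out of order, which is presumably the kind of check referred to earlier in the thread.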
This looks good, we just need to update all the places where the |
See #1359 for these updated changes.
This pull request updates the Curation Relevance vocabulary to:
- Expand the definition of `not_identifier_resource`
- Replace the `no_website` tag with `non_resource_paper`

See #1351 (comment) for a full explanation of what these tags mean and why they were implemented.
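As a rough illustration of the tag replacement described above, the rename could be applied to existing rows along these lines. This is a sketch under assumptions: it treats the change as a one-to-one rename, which the thread does not confirm (some papers may need manual reclassification), and `RENAMED_TAGS`, `migrate_tag`, `migrate_rows`, and the `relevancy_type` column name are hypothetical.

```python
# Hypothetical mapping from a retired tag to its replacement.
RENAMED_TAGS = {"no_website": "non_resource_paper"}


def migrate_tag(tag: str) -> str:
    """Return the replacement for a retired tag, or the tag unchanged."""
    return RENAMED_TAGS.get(tag, tag)


def migrate_rows(rows: list[dict[str, str]], tag_column: str) -> list[dict[str, str]]:
    """Return copies of the rows with retired tags replaced in one column."""
    return [{**row, tag_column: migrate_tag(row[tag_column])} for row in rows]


# Hypothetical rows; the real table's columns may differ.
rows = [
    {"pmid": "1", "relevancy_type": "no_website"},
    {"pmid": "2", "relevancy_type": "irrelevant_other"},
]
migrated = migrate_rows(rows, "relevancy_type")
```

Returning new dictionaries rather than mutating in place keeps the original rows available for a before/after comparison during review.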
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1351      +/-   ##
==========================================
+ Coverage   42.51%   46.58%   +4.07%
==========================================
  Files         117      118       +1
  Lines        8327     8297      -30
  Branches     1963     1364     -599
==========================================
+ Hits         3540     3865     +325
+ Misses       4582     4245     -337
+ Partials      205      187      -18
This pull request adds the curations for papers that I have found to be irrelevant to the Bioregistry. These papers came from the potentially relevant paper ranking table. I will curate relevant entries in separate PRs.