Group-10-Sanityze #6
Package Review
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide.
Documentation
The package includes all the following forms of documentation:
Readme file requirements
The README should include, from top to bottom:
NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be wider than it is high. (Note that a badge for pyOpenSci peer-review will be provided upon acceptance.)
Usability
Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole.
Functionality
It should be
For packages also submitting to JOSS
Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted. The package contains a
Final approval (post-review)
Estimated hours spent reviewing: 2.5 hours
Review Comments
The design of the package is quite robust, as it can support different types of spotters, i.e. future enhancements can be added easily. Nonetheless, I have a few observations:
Good suggestion, we will look into it.
Regarding the following two badges, I believe we have them at the top of the README.md.
Agreed, my apologies. I overlooked it.
Package Review
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide.
Documentation
The package includes all the following forms of documentation:
Readme file requirements
The README should include, from top to bottom:
NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be wider than it is high. (Note that a badge for pyOpenSci peer-review will be provided upon acceptance.)
Usability
Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole.
Functionality
For packages also submitting to JOSS
Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted. The package contains a
Final approval (post-review)
Estimated hours spent reviewing: 2
Review Comments
Overall this is a very useful and interesting package, great job! Here are a few pointers:
    from sanityze import Cleanser, EmailSpotter
However, this leads to an import error, as other reviewers have mentioned. Based on your script file names in
    from sanityze.cleanser import Cleanser
    from sanityze.spotters import EmailSpotter
This way I have been able to successfully import the classes and functions.
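For context, one common cause of this kind of import error is a package `__init__.py` that does not re-export its classes. Whether that is what happens in sanityze is an assumption; the sketch below does not touch the real package. It builds a throwaway package called `demo_pkg` (a hypothetical name) with the same `cleanser.py` / `spotters.py` layout and shows both the failure and a possible fix.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package named `demo_pkg` (hypothetical; it only mirrors
# the cleanser.py / spotters.py module layout mentioned in the review).
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
pkg.mkdir()
(pkg / "cleanser.py").write_text("class Cleanser:\n    pass\n")
(pkg / "spotters.py").write_text("class EmailSpotter:\n    pass\n")
(pkg / "__init__.py").write_text("")  # empty: nothing is re-exported
sys.path.insert(0, str(root))

# With an empty __init__.py, the top-level import fails.
try:
    from demo_pkg import Cleanser
    top_level_ok = True
except ImportError:
    top_level_ok = False

# Importing from the submodules works, as the reviewer found.
from demo_pkg.cleanser import Cleanser
from demo_pkg.spotters import EmailSpotter

# Possible fix (an assumption, not the maintainers' actual change):
# re-export the classes from the package's __init__.py.
(pkg / "__init__.py").write_text(
    "from demo_pkg.cleanser import Cleanser\n"
    "from demo_pkg.spotters import EmailSpotter\n"
)
importlib.invalidate_caches()
import demo_pkg
importlib.reload(demo_pkg)

# After the re-export, the short import form resolves to the same class.
from demo_pkg import Cleanser as TopLevelCleanser
```

With the re-export in place, the short import form and the submodule form resolve to the same class object, so either style could be documented in a README.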
Submitting Author: Name @tzoght
All current maintainers: ( @tzoght, @xXJohamXx, @caesarw0)
Package Name: Sanityze
One-Line Description of Package: This package provides utilities to spot and redact PII from Pandas data frames.
Repository Link: https://github.com/UBC-MDS/sanityze
Version submitted: 0.1.3
Editor: @Fdandrea
Reviewer 1: Chenyang Wang
Reviewer 2: Markus Nam
Reviewer 3: Marian Agyby
Reviewer 4: Chen Li
Description
Data scientists often need to remove or redact Personally Identifiable Information (PII) from their data. This package provides utilities to spot and redact PII from Pandas data frames.
PII can be used to uniquely identify a person. It includes names, addresses, credit card numbers, phone numbers, email addresses, and social security numbers. Regulations such as the European Union's General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) therefore require that PII be removed or redacted from data sets before they are shared and further processed.
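To make "spot and redact" concrete, here is a minimal standard-library sketch of the general idea. It is not sanityze's API: the names `redact_emails` and `redact_table`, the `<EMAIL>` token, and the simplified regular expression are all assumptions made for illustration, and a plain dictionary of columns stands in for a data frame.

```python
import re

# Simplified pattern for illustration only; real email matching is harder.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(value, token="<EMAIL>"):
    """Replace email-like substrings in string cells; pass other values through."""
    if isinstance(value, str):
        return EMAIL_RE.sub(token, value)
    return value


def redact_table(columns):
    """Apply redact_emails to every cell of a column-oriented table."""
    return {
        name: [redact_emails(cell) for cell in cells]
        for name, cells in columns.items()
    }


# A dict of columns stands in for a data frame in this sketch.
table = {
    "id": [1, 2],
    "note": ["contact jane.doe@example.com today", "no PII here"],
}
clean = redact_table(table)
```

Non-string cells are passed through unchanged, so numeric columns survive the redaction step; the same cell-level function could in principle be applied to each cell of a data frame.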
Scope
For all submissions, explain how and why the package falls under the categories you indicated above. In your explanation, please address the following points (briefly, 1-2 sentences for each):
Data munging: This package provides utilities to spot and redact PII from Pandas data frames.
Who is the target audience and what are scientific applications of this package?
Data scientists working with files that contain PII that will be used for analysis.
Are there other Python packages that accomplish the same thing? If so, how does yours differ?
Yes, the closest Python package in functionality to sanityze is scrubadub, which is a package for finding and removing PII from text. However, that package is not designed to work with Pandas data frames or other data structures.
Technical checks
For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:
Publication options
JOSS Checks
paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
Note: Do not submit your package separately to JOSS
Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?
This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.
Code of conduct
Editor and Review Templates
The editor template can be found here.
The review template can be found here.