Submission: pysketball #31

carlinakim · 2020-03-17T21:18:43Z

Submitting Author: Andres Pitta (@AndresPitta ), Kenneth Foo (@kfoofw ), Anand Shankar(@vanandsh ), Carlina Kim (@carlinakim )
Package Name: pysketball
One-Line Description of Package: A python package to explore ESPN online data through statistics and graphs.
Repository Link: pysketball
Version submitted: 2.0.0
Editor: Varada Kolhatkar (@kvarada)
Reviewer 1: Zhengyang Pan (@zoepan00)
Reviewer 2: Varada Kolhatkar (@kvarada)
Archive: TBD
Version accepted: TBD

Description

Include a brief paragraph describing what your package does:

This package works to scrape online tabular data from ESPN NBA website into a csv file. It also includes various functions to create graphs and statistical analysis for your interest (such as boxplots, player rankings by stats, and a summary statistics table).

Scope

Please indicate which category or categories this package falls under:
- Data retrieval
- Data extraction
- Data munging
- Data deposition
- Reproducibility
- Geospatial
- Education
- Data visualization*

* Please fill out a pre-submission inquiry before submitting a data visualization package. For more info, see this section of our guidebook.

Explain how the and why the package falls under these categories (briefly, 1-2 sentences):

Our package works to scrape online tabular data from the ESPN NBA website into a csv file.

Who is the target audience and what are scientific applications of this package?

The target audience is data scientists and NBA enthusiasts. They can use our package for interest in basketball, to predict playoffs, fantasy drafts, etc.

Are there other Python packages that accomplish the same thing? If so, how does yours differ?

nbastatR is another package that take data from other sources (NBA Stats API, Basketball Insiders, Basketball-Reference, HoopsHype, and RealGM), but no package that we currently know of takes data from ESPN NBA specifically.

If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted:

Technical checks

For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:

does not violate the Terms of Service of any service it interacts with.
has an OSI approved license
contains a README with instructions for installing the development version.
includes documentation with examples for all functions.
contains a vignette with examples of its essential functions and uses.
has a test suite.
has continuous integration, such as Travis CI, AppVeyor, CircleCI, and/or others.

Publication options

Do you wish to automatically submit to the Journal of Open Source Software? If so:

JOSS Checks

The package has an obvious research application according to JOSS's definition in their submission requirements. Be aware that completing the pyOpenSci review process does not guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS.
The package is not a "minor utility" as defined by JOSS's submission requirements: "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria.
The package contains a paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
The package is deposited in a long-term repository with the DOI:

Note: Do not submit your package separately to JOSS

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.

Yes I am OK with reviewers submitting requested changes as issues to my repo. Reviewers will then link to the issues in their submitted review.

Code of conduct

I agree to abide by pyOpenSci's Code of Conduct during the review process and in maintaining my package should it be accepted.

P.S. Have feedback/comments about our review process? Leave a comment here

Editor and Review Templates

Editor and review templates can be found here

The text was updated successfully, but these errors were encountered:

zoepan00 · 2020-03-20T05:15:10Z

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

A statement of need clearly stating problems the software is designed to solve and its target audience in README
Installation instructions: for the development version of package and any non-standard dependencies in README
Vignette(s) demonstrating major functionality that runs successfully locally
Function Documentation: for all user-facing functions
Examples for all user-facing functions
Community guidelines including contribution guidelines in the README or CONTRIBUTING.
Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a setup.py file or elsewhere.

Readme requirements
The package meets the readme requirements below:

Package has a README.md file in the root directory.

The README should include, from top to bottom:

The package name
Badges for continuous integration and test coverage, the badge for pyOpenSci peer-review once it has started (see below), a repostatus.org badge, and any other badges. If the README has many more badges, you might want to consider using a table for badges, see this example, that one and that one. Such a table should be more wide than high.
Short description of goals of package, with descriptive links to all vignettes (rendered, i.e. readable, cf the documentation website section) unless the package is small and there’s only one vignette repeating the README.
Installation instructions
Any additional setup required (authentication tokens, etc)
Brief demonstration usage
Direction to more detailed documentation (e.g. your documentation files or website).
If applicable, how the package compares to other similar packages and/or how it relates to other packages
Citation information

Functionality

Installation: Installation succeeds as documented.
Functionality: Any functional claims of the software been confirmed.
Performance: Any performance claims of the software been confirmed.
Automated tests: Tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
Continuous Integration: Has continuous integration, such as Travis CI, AppVeyor, CircleCI, and/or others.
Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines.

For packages co-submitting to JOSS

The package has an obvious research application according to JOSS's definition in their submission requirements.

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

A short summary describing the high-level functionality of the software
Authors: A list of authors with their affiliations
A statement of need clearly stating problems the software is designed to solve and its target audience.
References: with DOIs for all those that have one (e.g. papers, datasets, software).

Final approval (post-review)

The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing:

Review Comments

Hi, I really enjoyed reviewing the package. Functions are great and intuitive to use. Here are my thoughts and some recommendations:

General:

I seem can't find the metadata for authors etc. information
Nice installation instructions and pointing out test might run a little longer
~~- I had the error when run pip install -i https://test.pypi.org/simple/ pysketball~~
(Sorry my mistake. It was the setting in my computer. Package installed successfully)
Nice edge test and unit test
Usage example:
- it would be nice to have the rendered outputs for nba_ranking, nba_boxplot, nba_team_stats
- Step 3 example nba_ranking(nba_2018_regular, 'PLAYER' , 'GP', top = 2, ascending = False, fun = 'mean') is not working

Function specific:

nba_scrapper: Nice user-friendly message for webscraping. Docstring for season_type argument is not updated. I think it would be either regular or postseason
nba_boxplot: position argument and the docstring for position is a little confusing. I thought I can put string of any columns names.
Error message helps for the case if both position and teams have values. I think it would be nice to also include an error message if I input other columns names to the position argument(for example an error message for position="TEAM"), or change the position argument type to boolean.
nba_ranking: Error message for fun argument is not updated. Currently it says The fun argument should be either var or mean
nba_team_stats: Nice summary options. It would be nice if you can round the numbers for mean and std columns.

Overall it's a very good package for NBA enthusiasts! If you have any questions about my review, please feel free to let me know.

kvarada · 2020-03-22T00:15:24Z

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

A statement of need clearly stating problems the software is designed to solve and its target audience in README
Installation instructions: for the development version of package and any non-standard dependencies in README
Vignette(s) demonstrating major functionality that runs successfully locally
Function Documentation: for all user-facing functions
Examples for all user-facing functions
Community guidelines including contribution guidelines in the README or CONTRIBUTING.
Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a setup.py file or elsewhere.

Readme requirements
The package meets the readme requirements below:

Package has a README.md file in the root directory.

The README should include, from top to bottom:

The package name
Badges for continuous integration and test coverage, the badge for pyOpenSci peer-review once it has started (see below), a repostatus.org badge, and any other badges. If the README has many more badges, you might want to consider using a table for badges, see this example, that one and that one. Such a table should be more wide than high.
Short description of goals of package, with descriptive links to all vignettes (rendered, i.e. readable, cf the documentation website section) unless the package is small and there’s only one vignette repeating the README.
Installation instructions
Any additional setup required (authentication tokens, etc)
Brief demonstration usage
Direction to more detailed documentation (e.g. your documentation files or website).
If applicable, how the package compares to other similar packages and/or how it relates to other packages
Citation information

Functionality

Installation: Installation succeeds as documented.
Functionality: Any functional claims of the software been confirmed.
Performance: Any performance claims of the software been confirmed.
Automated tests: Tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
Continuous Integration: Has continuous integration, such as Travis CI, AppVeyor, CircleCI, and/or others.
Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines.

For packages co-submitting to JOSS

The package has an obvious research application according to JOSS's definition in their submission requirements.

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

A short summary describing the high-level functionality of the software
Authors: A list of authors with their affiliations
A statement of need clearly stating problems the software is designed to solve and its target audience.
References: with DOIs for all those that have one (e.g. papers, datasets, software).

Final approval (post-review)

The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 1

Review Comments

Great job, team! I enjoyed playing around with your package. Here are some things I thought are really neat.

Good idea to include your dataset description file!
The warning to the user about the the wait time for the scraping function is helpful. (I wonder why it takes so long though.)
Helpful installation instructions for the chromedriver. (It gave me some security warning though. Might be worth mentioning that in the documentation.)

Here are some comments that might help improve your package.

Installation instructions

Installation directions might benefit from some clarification. I had to do the following in order to get it working.
- conda install selenium
- pip install python-semantic-release
Good that you include the instruction to put the chromedriver path in the system path variable. I guess you do not mean PATH_TO_CHROMEDRIVER.EXE in the following command for Linux and MacOS.

export PATH="<PATH_TO_CHROMEDRIVER.EXE>:$PATH"

BTW, on my MacBook Pro, it was installed under /usr/local/bin by default and I didn't haven't to add the path explicitly.

Tests

When I ran your tests, two tests failed with the following error.

AttributeError: module 'altair.vegalite' has no attribute 'v4'

README, documentation, and functions.

In your README, in the following usage example of the function nba_ranking, the 'PLAYER' column is undefined in the scraped CSV and so it gives an error. Replacing it with 'NAME' worked for me.

nba_ranking(nba_2018_regular, 'PLAYER' , 'GP', top = 2, ascending = False, fun = 'mean')

In your docstrings, be consistent with the default value of keyword arguments. The usual practice is to write (default = 10) after the data type.
Some of the usage examples in the docstrings are not complete. In many cases, the output is missing. For example, in your docstring of nba_ranking function, I would give an example with a toy CSV which has similar to your scraped CSV (e.g., NAME, GP columns). Also, showing the output of the example (the dataframe and not the plot) would help.

Possible extensions

Currently your scraper takes year (an integer) as an argument. A possible extension could be allowing passing a list of years.

carlinakim · 2020-03-25T22:03:31Z

Thank you for all the comments and feedback. Please refer here for addressed feedback changes and here for our latest release for our final milestone.

carlinakim assigned zoepan00 Mar 17, 2020

kvarada self-assigned this Mar 17, 2020

This was referenced Mar 23, 2020

Feedback andres UBC-MDS/pysketball#60

Merged

Review fixes UBC-MDS/pysketball#59

Merged

implementing changes to boxplot and altair tests changed UBC-MDS/pysketball#58

Merged

Peer Review UBC-MDS/pysketball#63

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Submission: pysketball #31

Submission: pysketball #31

carlinakim commented Mar 17, 2020 •

edited by kfoofw

Loading

zoepan00 commented Mar 20, 2020 •

edited

Loading

kvarada commented Mar 22, 2020

carlinakim commented Mar 25, 2020 •

edited

Loading

Submission: pysketball #31

Submission: pysketball #31

Comments

carlinakim commented Mar 17, 2020 • edited by kfoofw Loading

Description

Scope

Technical checks

Publication options

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

Code of conduct

Editor and Review Templates

zoepan00 commented Mar 20, 2020 • edited Loading

Package Review

Documentation

Functionality

For packages co-submitting to JOSS

Final approval (post-review)

Review Comments

kvarada commented Mar 22, 2020

Package Review

Documentation

Functionality

For packages co-submitting to JOSS

Final approval (post-review)

Review Comments

Installation instructions

Tests

README, documentation, and functions.

Possible extensions

carlinakim commented Mar 25, 2020 • edited Loading

carlinakim commented Mar 17, 2020 •

edited by kfoofw

Loading

zoepan00 commented Mar 20, 2020 •

edited

Loading

carlinakim commented Mar 25, 2020 •

edited

Loading