Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JOSS Review #7

Closed
11 tasks done
cheginit opened this issue Jan 24, 2023 · 6 comments
Closed
11 tasks done

JOSS Review #7

cheginit opened this issue Jan 24, 2023 · 6 comments

Comments

@cheginit
Copy link

cheginit commented Jan 24, 2023

I am one of the reviewers of your JOSS submission. I am going to include my comments and suggestions in this issue and will edit until all issues are resolved.

Comments on the accompanying paper:

  • The paper is missing some important sections:
    • No discussion on the current state of the field and the knowledge gap that this package is aiming to fill.
    • The discussion on the software capabilities are very limited. You need to provide more details on high-level functionalities of the software, such as, Urban processes that it can model, required input data, number of calibrated vs non-calibrated parameters, and its applicability range and limitations.
  • Line 42 should be removed.

Comment on the package:

  • [x Suggestions (non-blockers):
    • Including a (long) TOC in the README is not helpful. It'd be better if it includes contents similar to the About section of the software's website and some figures demonstrating its capabilities. Essentially, it should give users a quick glance to the content of the repository.
    • Since the software includes a test suite, it's easy to set up a testing GitHub Actions workflow.
    • Using linters could help improve the code quality and catch possible errors. For example, pre-commit can be a good starting point.
    • In my opinion, the way that input data are structured and users have to make all components manually is not user-friendly at all. Since there are many components that required user input, using a config file is much more user-friendly. Many modeling packages follow this input structure, such as SWMM. It's easy to implement such a config file-based input using yaml files in Python.
  • Issues (blockers):
    • There is no clear and direct guidance or discussion in the documentation regarding the required input data and pre-processing them in a form that the model accepts. There are some hints in the examples, but there should be a dedicated section for processing raw input data from different formats to the required format.
    • Considering that this package combines several models, at least a brief discussion on their assumptions and governing equations are required so a user can make informed decisions regarding the necessary components that need to be included for getting acceptable results. A very detailed and technical discussion is not required like a journal paper, but at least a user should be able to have a good understanding of the model assumptions by reading the documentation.
@barneydobson
Copy link
Collaborator

Hi, @cheginit. Thanks so much for these comments, they have been incredibly helpful. I just thought I would provide a short update on these tasks. Note that I have not yet been able to meet with co-authors so I haven't focussed on paper edits. Changes are currently on the https://github.com/barneydobson/wsi/tree/joss_revisions branch.

  • TOC in readme edited according to suggestion.
  • Testing added in a GitHub Actions workflow
  • Linters I need to discuss with collaborators, I might create this as a GitHub issue to address in the future since it seems less urgent but would be quite a bit of work to address now.
  • Agreed RE input data, this has been a super helpful suggestion. I have started doing exactly as you suggest (a yaml based config file that contains parameters and data addresses of input data). Though I still need to do further testing on this. I will aim to make a tutorial on this, but need to carefully consider what to include, because I think this could also address your comment on required input data/format.
  • Agree with your comment RE guidance around input data, which I hope to address as above. I think generalising pre-processing will be too exhaustive a task, as there are many different modules included in WSIMOD, and each country/region/water company will have differently formatted raw data needed to inform each model element. Our intention was not for this WSIMOD package to be a geospatial data processing package (though we may aim to work on this in the future). If you have any detailed suggestions here though I would be grateful!
  • I could possibly incorporate model assumptions/equations into the paper via a table format, though it will just be the assumptions/equations at this present time (I will discuss the merits of this with the co-authors). Perhaps preferable, and to ensure continuity without creating an unmanageable documentation maintanence burden, I will ensure that the docstrings for different model components list key assumptions and require in the contributing guidlines (which I will be adding on the suggestion of another reviewer) that new model components do the same.

@cheginit
Copy link
Author

Thanks for starting to address the suggestions/issues.

Regarding the changes related to a user-friendly input, you can take your time since it's just a suggestion and not a publication blocker.

I think generalising pre-processing will be too exhaustive a task, as there are many different modules included in WSIMOD, and each country/region/water company will have differently formatted raw data needed to inform each model element.

I agree with this, and that's not what I asked for. I was referring to adding clear discussion and guidance on the input data structure that WSIMOD accepts, such as variable names, units, and data types. Essentially, a user should be able to read this "input data processing" section of the documentation and transform their data to the format that WSIMOD can read.

I will ensure that the docstrings for different model components list key assumptions and require in the contributing guidlines (which I will be adding on the suggestion of another reviewer) that new model components do the same.

This will be helpful, but docstring alone is not enough, a user should be able to read the software documentation and understand the key assumptions behind model components. This is necessary for community contribution. I didn't explicitly add a discussion on the missing contributing guidelines and code of conduct, since @jlarsen-usgs included it in #6.

@barneydobson
Copy link
Collaborator

@cheginit Just to double check with you before I implement it. I can add the key assumptions and input data to the model components' docstrings. See below an example for the Land node:

image

And I can write a script so that the documentation pulls all of these together on one page, so people have the ability to quickly identify what the model can do, what are its assumptions, and input data requirements. As I see it this would address both your blockers about input data guidance and key assumptions. But I wanted to check first before implementing it

@cheginit
Copy link
Author

Yes, this is very helpful and will address those issues.

You can also add citations. For example, you can check out one of my packages called PyDaymet and its associated lines in its source code, to see how referencing docstrings would look like. For the method parameter, I cite the paper that I used for the supported methods.

@barneydobson
Copy link
Collaborator

Hi @cheginit , thanks so much for the review, it's been incredibly helpful for us. Below I will go through the open tasks and provide links to where they have been addressed and add text for any responses:

  • Paper (literature)
    • No discussion on the current state of the field and the knowledge gap that this package is aiming to fill.
    • The discussion on the software capabilities are very limited. You need to provide more details on high-level functionalities of the software, such as, Urban processes that it can model, required input data, number of calibrated vs non-calibrated parameters, and its applicability range and limitations.

Addressed in:
https://github.com/barneydobson/wsi/blob/main/docs/paper/paper.md
https://barneydobson.github.io/wsi/paper/paper/
Note that this is a reasonably small amount of text added, we are currently writing up a full research paper with a more complete explanation of theory. Although RE capabilities, more information provided below

  • Including a (long) TOC in the README is not helpful. It'd be better if it includes contents similar to the About section of the software's website and some figures demonstrating its capabilities. Essentially, it should give users a quick glance to the content of the repository.
    Updated https://github.com/barneydobson/wsi/blob/main/README.md

  • Using linters could help improve the code quality and catch possible errors. For example, pre-commit can be a good starting point.
    New issue, to be addressed in future: Implement linters #9

  • Input data

    • In my opinion, the way that input data are structured and users have to make all components manually is not user-friendly at all. Since there are many components that required user input, using a config file is much more user-friendly. Many modeling packages follow this input structure, such as SWMM. It's easy to implement such a config file-based input using yaml files in Python.
    • There is no clear and direct guidance or discussion in the documentation regarding the required input data and pre-processing them in a form that the model accepts. There are some hints in the examples, but there should be a dedicated section for processing raw input data from different formats to the required format.

Addressed as specified in:
https://github.com/barneydobson/wsi/blob/main/wsimod/orchestration/model.py
https://barneydobson.github.io/wsi/wsimod_models/ (tutorial)
https://barneydobson.github.io/wsi/component-library/ (data requirements)

  • Considering that this package combines several models, at least a brief discussion on their assumptions and governing equations are required so a user can make informed decisions regarding the necessary components that need to be included for getting acceptable results. A very detailed and technical discussion is not required like a journal paper, but at least a user should be able to have a good understanding of the model assumptions by reading the documentation.
    Addressed as discussed above: https://barneydobson.github.io/wsi/component-library/

@cheginit
Copy link
Author

Thanks for taking the time to address the issues. For the most part, you have addressed my concerns, there's still room for improvement, but in my opinion, the package is useful at its current state.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants