Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Phenopype: a phenotyping pipeline for Python #24

Closed
11 of 22 tasks
mluerig opened this issue May 4, 2020 · 65 comments
Closed
11 of 22 tasks

Phenopype: a phenotyping pipeline for Python #24

mluerig opened this issue May 4, 2020 · 65 comments

Comments

@mluerig
Copy link

mluerig commented May 4, 2020

Submitting Author: Moritz Lürig (@mluerig)
All current maintainers: Moritz Lürig (@mluerig)
Package Name: phenopype
One-Line Description of Package: a phenotyping pipeline for Python
Repository Link: https://github.com/phenopype/phenopype
Version submitted: 1.0.5
Editor: @jbencook
Reviewer 1: @agporto
Reviewer 2: @sdonoughe
Archive: DOI
JOSS DOI: N/A
Version accepted: 2.0.1
Date accepted (month/day/year): 05/13/2021

Edit: Bumped from 1.0.4 to 1.0.5 since submission.


Description

Phenopype is a high throughput phenotyping pipeline for Python to support biologists in extracting high dimensional phenotypic data from digital images. The program provides intuitive, high level computer vision functions for image preprocessing, segmentation, and feature extraction. Users can assemble their own function-stacks that can be stored in the human-readable yaml-format along with raw data and results, facilitating high throughput and full data reproducibility. Phenopype can be run from Python or from a Python Integrated Development Environment (IDE), like Spyder. Phenopype is designed to provide robust image analysis workflows that can be implemented with little or no Python experience.

Scope

  • Please indicate which category or categories this package falls under:
    • Data retrieval
    • Data extraction
    • Data munging
    • Data deposition
    • Reproducibility
    • Geospatial
    • Education
    • Data visualization*

* Please fill out a pre-submission inquiry before submitting a data visualization package. For more info, see this section of our guidebook.

  • Explain how the and why the package falls under these categories (briefly, 1-2 sentences). Please note any areas you are unsure of:

Phenopype is designed to extract phenotypic data (https://en.wikipedia.org/wiki/Phenotype) of plants, animals, and other organisms from images and videos. By processing images in Python, through reproducible code and human readable configuration files, phenotyping becomes higher throughput and more reproducible than established GUI programs like "ImageJ".

  • Who is the target audience and what are scientific applications of this package?

Phenopype is intended for ecologists and evolutionary biologists that work with phenotypic data. Phenotypic data are an essential component of ecological and evolutionary research (https://www.nature.com/articles/nrg2897).

  • Are there other Python packages that accomplish the same thing? If so, how does yours differ?

Only low level computer vision packages like OpenCV or scikit-image are out there that require a lot of configuring and a relatively deep understanding of computer vision and Python in general. Phenopype offers high level functions so that users can focus on the relevant analytic parts of image analysis.

  • Any other questions or issues we should be aware of?:

    • Full functionality is currently given for Windows and Linux builds. Under macOS, the high throughput workflow does not work due to two conflicting dependencies (watchdog and OpenCV)
    • Although phenopype attempts to minimize user interaction with images (e.g. setting ROIs or landmarks), this is a still a key component. I have done what I can to mimic user input for testing and CI, but I could not find a way to fake mouse movement and clicking, which is why the testing suite does not cover some of the lowlevel functions. Therefore, coverage is about 10% lower than what it could be with these interactions.
    • This is the presubmission inquiry: Phenopype: a phenotyping pipeline for Python #23
  • does not violate the Terms of Service of any service it interacts with.

  • has an OSI approved license

  • contains a README with instructions for installing the development version.

  • includes documentation with examples for all functions.

  • contains a vignette with examples of its essential functions and uses.

  • has a test suite.

  • has continuous integration, such as Travis CI, AppVeyor, CircleCI, and/or others.

Publication options

JOSS Checks
  • The package has an obvious research application according to JOSS's definition in their submission requirements. Be aware that completing the pyOpenSci review process does not guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS.
  • The package is not a "minor utility" as defined by JOSS's submission requirements: "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria.
  • The package contains a paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
  • The package is deposited in a long-term repository with the DOI:

Note: Do not submit your package separately to JOSS

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.

  • Yes I am OK with reviewers submitting requested changes as issues to my repo. Reviewers will then link to the issues in their submitted review.

Code of conduct

P.S. Have feedback/comments about our review process? Leave a comment here

Editor and Review Templates

Editor and review templates can be found here

note: original repo url: https://github.com/mleurig/phenopype

@mluerig
Copy link
Author

mluerig commented May 18, 2020

@lwasser could you let me know whether I should contact some potential reviewers, so whenever you are ready to initiate the review process they are ready as well? thanks

@lwasser
Copy link
Member

lwasser commented May 18, 2020

@mluerig my apologies - in my mind i was waiting for work on this package. let me see if i can find a guest editor for this one so I don't hold things up. i'll get back to you this week so we can kick off the review process!!

UPDATE: i just tweeted to see if we can find a guest editor to move this along!! hoping to hear back!!

@mluerig
Copy link
Author

mluerig commented May 18, 2020

Ok great! In any case, Seth Donoughe (@sdonoughe) and Arthur Porto (@agporto) could be potential reviewers without conflict of interest. I have not yet contacted them.

@lwasser
Copy link
Member

lwasser commented May 18, 2020

awesome. thank you @mluerig !!! if i don't hear back on twitter i'll reach out to some folks directly.

@lwasser
Copy link
Member

lwasser commented May 19, 2020

ok @mluerig i found another person who might be interested as well. @sdonoughe and @agporto are you still interested in reviewing this package? please let me know... i'm still looking for someone to serve as a guest editor. I also have a ping from Ben Cook who i don't have a github handle for yet who may be available for review or editor role!! All -- please let me know if you are available!! i will watch this thread for responses. And i'll keep looking for a guest editor in case everyone wants to review! thank you!!

@jbencook
Copy link

Hey All! I'm new here, but I'm available to review if it's helpful. I lead the Applied Machine Learning team at Hudl where we build computer vision services for sports video. I have some background in scientific computing and I'd love to help if I can.

@jbencook
Copy link

I'm also happy to serve as editor instead if that's where the need is

@lwasser
Copy link
Member

lwasser commented May 19, 2020

@jbencook ok awesome!! Let's see if our other two colleagues respond!! I can absolutely walk you through serving as an editor. thank you again for being so responsive and willing to help! and welcome to pyopensci!!

@agporto
Copy link

agporto commented May 19, 2020

Hi everyone. I am happy to review the package. Count me in

@lwasser
Copy link
Member

lwasser commented May 19, 2020

perfect. thank you @agporto ! i'll followup with instructions once.i see if we have a third person or not!! :) so appreciative of you all being willing to work with pyopensci!!

@sdonoughe
Copy link

sdonoughe commented May 19, 2020 via email

@lwasser
Copy link
Member

lwasser commented May 19, 2020

👋 there @sdonoughe sure... tell me a bit more. do you have enough programming background that you feel comfortable reviewing with another person as a team? You could also test out the functionality and documentation whereas @agporto could focus on the code / back end. please just let me know what works best for you and if you have questions you can also email me directly at [email protected] !! thank you for being willing to participate.

@lwasser
Copy link
Member

lwasser commented May 22, 2020

ok everyone... i think we are in good shape here. @jbencook if you are still open to serving as a guest editor, that would be great. I will be out next week. While i am out, @jlpalomino has agreed to answer any questions that you have about the review process. We just updated our contributing guide so i'm hoping better instructions will be available to you ONCE we figure out why github actions is acting funny when trying to deploy the updates!

@agporto your review is much appreciated! if you can focus more on the technical end of things - code, tests etc that would be great! @sdonoughe if you can please focus on the usability and docs + applications of the package for your needs / and general use cases in the community that would be excellent. I am working on also finding another person to support this review so you can ask questions as you go!! that can be me once i'm back on june 1.

@mluerig i just also want to confirm that you are NOT interested in submitting this to JOSS. i noticed that was not checked. This process requires that you write a short paper. but JOSS accepts our review once your package is approved so it is not a challenging additional step.

@jbencook the next steps are for you to do the following following our guidelines

All please look at our documentation here: https://www.pyopensci.org/contributing-guide/ we are working on improving it so any feedback is much appreciated!!

Please all get in touch with any questions or concerns (knowing i'll just be offline for one week if you need feedback from me!). Thank you ALL for supporting pyopensci!!

@jlpalomino
Copy link
Member

Thanks @lwasser for providing @jbencook with the next steps. Happy to answer any questions about the editor process, so feel free to reach out!

While we wait for the updates to the guideline docs to render, we can see the latest version here.

@jbencook
Copy link

jbencook commented May 23, 2020

Editor checks:

  • Fit: The package meets criteria for fit and overlap.
  • Automated tests: Package has a testing suite and is tested via Travis-CI or another CI service.
  • License: The package has an OSI accepted license
  • Repository: The repository link resolves correctly
  • Archive (JOSS only, may be post-review): The repository DOI resolves correctly
  • Version (JOSS only, may be post-review): Does the release version given match the GitHub release (v1.0.0)?

Editor comments

  • Some good notes on fit in the pre-submission issue.
  • LGPL license
  • Automated tests are passing after xvfb is installed on Ubuntu 18.04. A short dev guide in the README or in the docs would make it easier to get started. I usually use automated tests as an entry point to understand a new project. Making sure people know the environment requirements would be helpful. You could also do that by including a working Dockerfile.
  • I suggest explicitly saying which versions of Python are supported, again in the README or docs. Also, if you can support more than just 3.7 that's a win -- maybe 3.6-3.8?

Reviewers: @agporto, @sdonoughe
Due date: 2020-06-19

@jbencook
Copy link

Thanks @lwasser and good to meet you @jlpalomino!

I just added the editor checks. @agporto and @sdonoughe does 19 June work for the review deadline? I'll start watching the repo and work on my checks.

Also, @mluerig I noticed the repo has an LGPL license, but the README says MIT. Looks like both are approved OSI licenses, but they should probably match.

Looking forward to working with you all!

@mluerig
Copy link
Author

mluerig commented May 24, 2020

@lwasser @jbencook @jlpalomino yupp correct I am NOT planning to submit to JOSS, but to Methods in Ecology and Evolution. The manuscript is already written and can be made available to anyone involved in the review process, if that is needed and helpful

@mluerig
Copy link
Author

mluerig commented May 24, 2020

@jbencook I changed it to LGPL in the readme

@lwasser
Copy link
Member

lwasser commented Jun 3, 2020

hi all!! @jbencook just a note that i have slowed the process down here as we are looking for someone to support the second review. With pyopensci we want to mentor folks through reviews if it is their first. i however was out last week and i hadn't found someone prior to being out. So i'm on the hunt again and we will get back to you with a timeline for the second review! if you know of any people who have experience with code reviews that could support a second review please do say the word.

@lwasser
Copy link
Member

lwasser commented Jun 4, 2020

Update: I have found a person to support the second review and it looks like the second review can be done next week! we are in good shape still - cheers all! :)

@jbencook
Copy link

I've completed the editor checks - I think now we're just waiting on the reviews. How are things coming @agporto and @sdonoughe? Any questions so far?

@agporto
Copy link

agporto commented Jun 16, 2020

@jbencook I have started working on it but still have a lot of ground to cover.

@agporto
Copy link

agporto commented Jun 19, 2020

Hi everyone. I am going to need a few more days to finish the review (I expect Tuesday next week). I apologize for the delay. The source code of the package is more extensive than I initially thought.

@sdonoughe
Copy link

sdonoughe commented Jun 19, 2020 via email

@agporto
Copy link

agporto commented Jun 26, 2020

Package Review

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README

  • Installation instructions: for the development version of package and any non-standard dependencies in README

  • Vignette(s) demonstrating major functionality that runs successfully locally

  • Function Documentation: for all user-facing functions

  • Examples for all user-facing functions

  • Community guidelines including contribution guidelines in the README or CONTRIBUTING.

    I could not find any reference to community guidelines
    
  • Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a setup.py file or elsewhere.

Readme requirements
The package meets the readme requirements below:

  • Package has a README.md file in the root directory.

The README should include, from top to bottom:

  • The package name

  • Badges for continuous integration and test coverage, a repostatus.org badge, and any other badges. If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the badge for pyOpenSci peer-review will be provided upon acceptance.)

  • Short description of goals of package, with descriptive links to all vignettes (rendered, i.e. readable, cf the documentation website section) unless the package is small and there’s only one vignette repeating the README.

  • Installation instructions

  • Any additional setup required (authentication tokens, etc)

  • Brief demonstration usage

      The readme itself does not contain usage demonstration, but it links to it. 
    
  • Direction to more detailed documentation (e.g. your documentation files or website).

  • If applicable, how the package compares to other similar packages and/or how it relates to other packages

  • Citation information

      The package is still unpublished, so it does not contain citation information. 
    

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole.
Package structure should follow general community best-practices. In general please consider:

  • The documentation is easy to find and understand
  • The need for the package is clear
  • All functions have documentation and associated examples for use

Functionality

  • Installation: Installation succeeds as documented.

  • Functionality: Any functional claims of the software been confirmed.

    See review below for a few exceptions.
    
  • Performance: Any performance claims of the software been confirmed.

  • Automated tests: Tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.

    While not all functions contain automated tests, the exceptions are related to difficulties in replicating human input. 
    
  • Continuous Integration: Has continuous integration, such as Travis CI, AppVeyor, CircleCI, and/or others.

  • Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines.

Estimated hours spent reviewing: 12 hours


Review Comments

I have now reviewed the package entitled “Phenopype: a phenotyping pipeline for Python”. This package represents a tremendous amount of work, especially considering that it has a single author/contributor.
I strongly support its publication at pyOpenSci. The python code is clearly written and easy to follow. I believe it will be a critically important package for biologists interested in computer vision, because it is not only comprehensive but also offers a lot of flexibility in terms of usage. I have only a few comments with regards to the usage and code.

General suggestions

The documentation is quite extensive and well organized, but I think one thing that would help new users is moving the Tutorial on GUI interactions (currently, Tutorial 5) to the beginning. This tutorial explains in more detail how to interact with OpenCV's High GUI module. In my experience with the current package, this module was by far the hardest one to interact with. It is somewhat unstable, crashing the jupyter kernel with some frequency, and its keystrokes are not intuitive. Given that this module is important in mediating user interactions with Phenopype, I think it would be useful to have a tutorial on it front and center. It might prevent new users from getting overly frusfrated and ending up abandoning the package. In the long run (beyond this review), it might be worth considering other options for GUI interaction.

Another aspect that might need some attention is the standardization of the root directory. When going through the tutorials, perhaps it would be useful to explicitely set the root directory, since the root directory jumps around a little across the different tutorials (sometimes it starts on one place, sometimes on another)

Tutorial and Examples - Code Errors

When running Tutorial 2 and going through the 'Low throughput workflow' I get the following error:

~/phenopype/utils_lowlevel.py in _auto_line_width(image, **kwargs)
    642 def _auto_line_width(image, **kwargs):
    643     factor = kwargs.get("factor", 0.001)
--> 644     image_height, image_width = image.shape[0:2]
    645     image_diagonal = (image_height + image_width) / 2
    646     line_width = max(int(factor * image_diagonal), 1)

AttributeError: 'NoneType' object has no attribute 'shape'

Based on an inspection of the source code, it seems that, when first initializing a container and trying to draw a contour, no image is getting assigned to container.canvas. As a consequence, any attempt to extract the image parameters (height, width) from an empty object leads to the error above. It could be an easy fix, but my attempts at tracing it back have not yielded anything other than what is described above.

Similarly, when running Tutorial 5 and going through the polylines example, I get the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-a3de9e316ea9> in <module>
      2                                  line_width=2,
      3                                  line_colour="green") 
----> 4 canvas = pp.visualization.draw_polylines(canvas, df_polylines=df_lines,
      5                                          line_width=2,
      6                                          line_colour="green")    

TypeError: draw_polylines() got an unexpected keyword argument 'df_polylines'

This one seems more straightforward, since the draw_polylines function definition does not contain the keyword in question. So, my assumption is that some recent changes to the code have removed the need to explicitly define df_polylines and the tutorial has not been updated to reflect those changes.

Code architecture - Suggestions

I have a suggestion in terms of code architecture that I do not expect the author to address in this review (but the author might find it useful in the long run). As of now, phenopype makes an explicit choice with regards to how to store data (both raw and processed data). Whether working with masks, polylines, landmarks, etc, phenopype will store that information (most of the time) in human-readable csv files or similar. It will also store different types of information (e.g., landmarks or masks) in different files using different functions. While this makes accessing the data for each specimen quite easy and in human-readable format, it also creates a lot of code duplication due to the need of having one function for each type of annotation. Given that this package will likely become a reference for computer vision in ecology and evolution, it will also set data standards for future work. Having followed the success of the COCO (https://cocodataset.org/#home) data format, I wonder if phenopype would benefit from having a similar architecture for data storage. By storing different types of annotations in a single JSON file, COCO has created an almost universal file format that makes it remarkably easy to train different machine learning algorithms using the same dataset (whether trying to predict landmarks or masks, for example). Again, I do not expect this change to occur, but I think it might be worth considering in the future.

Final remarks

In general, this is a well written and user friendly python package. My comments here are purely to provide the author with some (hopefully) useful feedback.

Other minor comments
  • Some images in the image folder have 'JPG' (capitalized) extensions. These do not match some of the example notebooks (e.g. Example 2). These should probably be corrected, so as not to throw out a 'File not found' error simply due to capitalization.

  • Hyperlink on Readme.md still links to MIT license.

@mluerig
Copy link
Author

mluerig commented Mar 14, 2021

Hi again!

So, below are the responses to your comments, I have updated phenopype accordingly, it is now version 2.0.0. Since v1.0.5, which was the one I originally submitted, I have changed a few things that how users the yaml configuration files (I moved to list structures for greater flexibility when configuring the high throughput workflow). Therefore, it is not backwards compatible and thus got a major version update. I also added some minor code changes in the background that improve code readability, but should not affect user experience. All changes are documented in the changelog.

I modified what I could, but some things (failing on macOS, HighGUI instability, changing how annotations are stored) were beyond the scope of these revisions (as both of you already acknowledged in your reviews). However I'd be very happy to have a chat with you about what can be done in the future.

Below I reply to your comments - hope they make sense, otherwise let me know.

Arthur

The documentation is quite extensive and well organized, but I think one thing that would help new users is moving the Tutorial on GUI interactions (currently, Tutorial 5) to the beginning. This tutorial explains in more detail how to interact with OpenCV's High GUI module. In my experience with the current package, this module was by far the hardest one to interact with. It is somewhat unstable, crashing the jupyter kernel with some frequency, and its keystrokes are not intuitive. Given that this module is important in mediating user interactions with Phenopype, I think it would be useful to have a tutorial on it front and center. It might prevent new users from getting overly frusfrated and ending up abandoning the package.

I followed your (and Seth's suggestion) and rearranged the tutorials: tutorial 2 now covers GUI interactions. I completely overhauled this tutorial and I now also discuss at the most common HighGUI issues in OpenCV, so that users are prepared and can try the workarounds I suggested. Indeed, in the long term I will have to consider alternatives to the OpenCV GUIs.

Another aspect that might need some attention is the standardization of the root directory. When going through the tutorials, perhaps it would be useful to explicitely set the root directory, since the root directory jumps around a little across the different tutorials (sometimes it starts on one place, sometimes on another)

This should be fixed: every tutorial now assumes the folder tutorials in the downloaded phenopype-master folder as root.

When running Tutorial 2 and going through the 'Low throughput workflow' I get the following error:

fixed: https://github.com/mluerig/phenopype/issues/7

Similarly, when running Tutorial 5 and going through the polylines example, I get the following error:

fixed: https://github.com/mluerig/phenopype/issues/8

Having followed the success of the COCO (https://cocodataset.org/#home) data format, I wonder if phenopype would benefit from having a similar architecture for data storage. By storing different types of annotations in a single JSON file, COCO has created an almost universal file format that makes it remarkably easy to train different machine learning algorithms using the same dataset (whether trying to predict landmarks or masks, for example). Again, I do not expect this change to occur, but I think it might be worth considering in the future.

This is something I would love to chat about. I had a look at the COCO json files, and I think I would like to follow a similar, albeit not identical structure (I think phenotypic data would need to have somewhat different structure than pure segmentation labels).

Seth

Brief demonstration usage: There currently isn't a demonstration usage directly in the README, but there are, of course, lots of links to demonstrations. If a demo is needed here, perhaps the best would be have code that opens an image, takes user mouse clicks to select a region, and then does, say, a threshold operation on the contents of that region, then displays and saves the results.

I followed your suggestion and added a gif to the readme that shows some basic image processing operations and the results in real time.

How the package compares to similar packagese: I think it would be useful to have an explanation for how this package relates to the most widely used Python image manipulation packages, and why a user might want to use phenopype in combination with one or more of those packages. As for the graphical interface for taking user input interactively, the closest Python tool that I'm aware of is napari, even though it is still in alpha. For users that do not have much coding experience, another option for them is to use Fiji/ImageJ, and then record a macro of their image processing steps. Here, phenopype offers functionality that is not present in any of these alternatives, so it would be useful to point that out too.

I used some of what you wrote below (actually, almost one-to-one, so you should get credit) in the review to point out that phenopype is a wrapper for OpenCV in combination with a project management ecosystem.

Citation information: If/when this package is described in a journal article, then that citation can be added, but even in the meantime I can see biologists wanting to try phenopype out and potentially even using it in publications. So I'd definitely recommend having a "how to cite phenopype" section.

I added citation instructions

One might wonder: why wouldn't a biologist just use opencv by itself? Or scikit-image, or any other Python library of image processing tools, for that matter? Answer: Because phenopype is aiming to augment, rather than replace, the utility of existing CV libraries for scientists measuring phenotypes. Put differently, phenopype is not a library of granular image processing functions. Instead, it is a set of wrappers around some of opencv's most broadly used functions, combined with a simplified GUI for taking user input; these collectively act as a scientist's bookkeeper during an image processing project. In doing so, it fills a major void in the scientific Python toolkit. As far as I know there is no other Python tool that enables the user to create a pipeline of image-processing steps that incorporates manual input along the way. phenopype is also (to my knowledge) unique in defining a terse, yet legible format for recording and replicating the parameters of such a pipeline.

I actually think that this text gives a nice summary of what phenopype is providing, and how it is distinct from other packages. I took the liberty of using some of your formulations, and incorporating them into the readme.

There are some tradeoffs involved in structuring the package this way. For one, the set of image-manipulation functions is impressive, but it understandably covers only a small subset of the image-parsing tasks that are regularly encountered in image analysis work for biological data. This is not a knock against phenotype in the slightest. But a user that wanted to integrate additional image-processing functions into the middle of a workflow would need to invest some time to work out how phenopype works under the hood if they wanted to take advantage of the replicability tools that are the core of the package.

So, I wonder if, in the long term, there might be a better way of wrapping OpenCV (and potentially scikit-image) functions, maybe even through means of a toolchain that does this in an automatic fashion. Would be great to hear your and @agporto's thoughts on this.

Notes on the Installation page itself:

I decluttered and streamlined the installation instructions:

  • I made it clear that managing environments via conda and installing phenopype via pip are the default. In addition, I provided some more information regarding Python installation and package management, as this appears to be a common source of confusion.
  • using Spyder is, of course, optional - I now state this in a tip-bubble. I do think that many potential users will be coming from R/Rstudio, and are accustomed to that user-experience and workflow, so I want to make this a strong recommendation.
  • I removed the options to install hotfixes and development versions, which carried over from my inexperience with the git workflow.
  • I streamlined the information regarding text editors.

On MacOS, I came across several issues with the opencv image viewer windows that are used as a part of the phenopype workflow:

I am currently not able to solve this, but in the long term I am planning to make phenopype usable on all platforms (including macOS) by either fixing the OpenCV GUI issues OR by finding other means of visualization and user interactions (napari for instance looks very promising).

Until then, to make it clear that phenopype doesn't run on macOS, I added a little section to the readme and documented your efforts by transferring them to a phenopype issue


Tutorials

Note that tutorial 5 got moved to position 2 - I commenting here in the new order

Tutorial 1

It is very thoughtful that this tutorial is here to introduce Python, and that there are also pointers to other introductions. My sense is that this intro might feel somewhat bewildering to a beginner. That is potentially okay, of course, since new things are almost always confusing. But I suspect the best way to make this tutorial maximally accessible would be to recruit an actual beginner and have them spend an hour or so trying to work through this section and then have them tell you how comfortable they feel about moving on to the next one. Since I'm not a beginner, I can't precisely point out where the hurdles are.

I added a little balloon to ask for feedback.

There are several references to the "Resources" page. This is great! But there are a so many links there and it is not clear where someone should start. I suggest that, in general, the author take a slightly more authoritative tone in introducing material. "If you are new to Python, do [THIS] beginner overview first, then come back." Then next: "If have not done much image processing with code, read through [THIS] walkthrough second." And finally: "Okay now you are ready to begin with the phenopype tutorials."

Fully agree - I added a Further Reading balloon to this and all other tutorials (and will do so for the examples).

I was somewhat puzzled as to the usefulness of opening lots of images at once. That is, since I couldn't think of why a user would want to do that, it might be better to demo a different functionlity as a way to illustrate the principles that are being taught for that section.

I sometimes like to look at several images to look for differences in traits, or to compare different thresholding outcomes - I thought it was useful, but if you really think it's confusing I can remove it.

I expected a link at the bottom of Tutorial 1 that takes me to Tutorial 2, but ultimately found one in the top nav bar.

fixed - now all tutorials have links to the next one

Tutorial 2

Consider putting this tutorial earlier in the series. Say, right after tutorial 1? I would have found it helpful to get these GUI basics before coming across image processing steps that involve the GUI in the other tutorials.

done - Tutorial 5 is now the second one

Also, my personal opinion is that the GUI---and the capability to do semi-automated analysis that includes human interaction at more or more steps---is probably going to be the part of phenopype that will be most exciting and potentially useful to the research community, since it is a lot harder to build one's own GUI than it is to record a sequence of image transformations. I could be wrong about this, but if I'm right, maybe consider that a bit when describing the package on its landing page.

good point - I mentioned it now in the reworked readme

Tutorial 3

I cleaned this tutorial up a bit, i.e. I cut back on intro-text, removed the "morphology" operations, I added instructions for what the expected interactions are, and streamlined the end about yaml and pype (pype-behavor section got moved to API)

"Fig. 2: Schematic of Phenopype’s prototyping workflow.": This figure only includes some of the functions that are in the code block immediately beneath it. I wondered: "what happened to the morphology operation?"

I removed morphology operation to keep it simple

Under Figure 2, in cell 10 there is an example sequence of image manipulation operations. As the user, I naturally ran the code, but there weren't any prompts in the tutorial or in the terminal telling me what to do. I think the reader needs something like:

I added some instructions at the very beginning to show users what they have to do

Note to the author: I had to press CTRL twice to end a selection, but this does not appear to be the intended behavior from the docstring of the create_mask function. In general, in my hands, the opencv keystroke codes occasionally vary across operating systems.

Since for this example only one polygon-mask is needed, I changed the text to "finish and close with Enter" to keep complexity low. Multi-polygon masks are also covered in Example 5

For the Low-Throughput section, the output from one code block for me didn't match what was shown in the tutorial. This was the one:

should be fixed

The contours.csv file was present in the _temp directory, but it looks like it isn't being found?

should be fixed

In the High-Throughput section, tell the user explicitly to press 'y' to make a new YAML file (I can see it in the terminal input/output lines, but I'm just trying to make it more approachable to a newcomer).

this has changed

The section on the pype is nice and detailed, but it feels like a lot to append to the end of this tutorial. I think the reader would be better served by splitting this tutorial into two, perhaps? It looks like this pype section sort of serves as an overview/preview of some of the material that also gets covered in the subsequent tutorial. That is fine! It just seemed like sort of a lot to digest in one go.

agreed - I sourced it out to the API. I first thought to make another tutorial or section only for the pype, but I think it's better to keep all relevant information in one place, so the API made more sense to me. let me know if this helps or whether it complicates things.

This behavior did not occur for me: "Editing and saving the opened configuration file will close the image window, run the functions in the control file, and show the updated results." I think this is possibly due to the ongoing misbehavior of the opencv image viewer window.

I think this is another macOS problem - sorry :-)

Tutorial 4

I imagine most readers would run the notebook from within the tutorials folder, which sets that as the working directory. It might confuse the reader that the working directory gets changed, which then breaks the relative path to the images dir in the tutorials directory. Is it necessary for the _temp folder to be where it is? Could it be in the tutorials folder itself, so that the reader doesn't have to update the path, and you could just go with the same relative paths for everyone?

fixed - now all tutorials and examples start from phenopype/tutorials, and save output under phenopype/tutorials/_temp

The block of text that begins "The remaining settings..." is pretty dense. I recommend using bullets or a table to organize it.

fixed

All the embedded images that illustrate how phenopype works are much appreciated. It is nice touch. The black windows in "Step 2", however, are hard to read. That is, it's not clear what those represent.

I added some text to show that they are the YAML config files

This cell runs indefinitely, even after I close the text file that gets opened automatically:

Those functions got reworked ... hope it works now

In the section on loading an existing project, again I had to update the path given in the tutorial, because it seems like the tutorial expects me to have the tutorial notebook to be running in a different directory? Or at least expects the working directory to be a different one. Here I had to update to:

should be fixed

Tutorial 5

I recommend omitting the option for users to load the previous project or make a new one, since it is a bit confusing with the commented out new directory line. Better, I think, to just have a minimal code block that generates an example project for this notebook

agreed, and done!

Again, the tutorial expects that we are working in a different directory than the one that actually has the ipynb files, I think? In the cell that follows this line: "We will use the image stickleback_side.jpg from the image folder in tutorials"

no, the images are just a subfolder - the working directory shouldn't move anymore

Specifically: when I run the function, I can click the two points for setting a reference length. Then I can enter a number (by the way, I would update the prompt to include the requested measurement units). But then it is not clear what I am I supposed to do next. The tutorial says to draw a rectangle around the scale image, but when I try to click on the image again the terminal prints: "already two points selected", and I don't know what to do next. Then pressing a key brings up red text on the image, and the terminal prints "- add column length". It is not clear what this means. I couldn't tell, in any case from the tutorial or the add_scale docstring. Bottom line: I think it would help to hit this function with a coat of polish, so to speak, and put a little more explanation in the tutorial and in its docstring.

I think this may have been a bug I fixed in the meantime - should be working now

Tutorial 6

This is super cool. My impression is that there are several other published tools out there for specifically tracking animals in arenas (used for fish, flies, ants, birds, migrating animals, etc). This review is seven years old, but it suggests that a lot options were available. Maybe all the alternatives are too hard to use, or not open, or whatever, but still I think it would be helpful to explain how this tracking tool fits into the existing ecosystem.

I agree that the tracking module at this point seems more of an appendix. My goal is that, at some point, all the core processing functions can be used on objects found in every single frame.

@lwasser
Copy link
Member

lwasser commented Mar 17, 2021

@mluerig thank you for this thoughtful response. I will come back to it shortly so i can really read it in more detail.!! I just wanted you to know that it's on my radar. More later.

@lwasser
Copy link
Member

lwasser commented Apr 30, 2021

@agporto @sdonoughe can you please have a look at @mluerig responses to your review above? are you satisfied with the responses? the one issue i see is the MAC compatibility issue. i am going to see how ropensci handles these types of issues in a review and will respond accordingly. That aside are you both content with t he review r esponse above OR are there additional items that you'd like to see addressed?
my apologies for the huge delay - a lot has been going on here locally (not work related) and it's really thrown me off.

@sdonoughe
Copy link

Three main notes

  1. The reorganization and revision of the tutorials is very impressive. Even before my review they were already quite clear, and now they're even better.
  2. As before, I couldn't get the image windows to behave as intended on MacOS, so I wasn't able to actually test all the functionality.
  3. That said, I read through the tutorials, and I found them very thoughtfully structured and explained. All I found were a few miscellaneous typos:

Tutorial 1

I think this is a perfectly pitched intro.

Tutorial 2

Overall, I find the updated tutorials to be extremely thorough and helpful, and this one in particular.

A small typo: "Currently Phenopype uses the standard HighGUI libraries that ship with the most recent precompiled opencv-contrib-python package that is listed on the, which is Qt for Linux and macOS, and Win32 UI on Windows."

Something is missing after "listed on the"

Tutorial 3

Small typo: "To learn how to analyze entir data sets with the high-throughput method, move on to the next Tutorial 4."

Tutorial 4

Small typo: "initiatlized"

Tutorial 5

I found this very useful! I think it would be helpful to add a small summary bit at the end, like you've done for the previous tutorials.

Tutorial 6

Small typo: "exlcude"
Small typo: "applied matters The following"

Bottom line

@lwasser I'm totally content with the review response.

Congratulations on a comprehensive and useful tool, @mluerig!

@agporto
Copy link

agporto commented May 10, 2021

@lwasser I found a few issues with the examples, but @mluerig has already fixed them.
I am completely satisfied with the responses provided by the author. This is an impressive tool and I hope biologists will rejoice.

@mluerig
Copy link
Author

mluerig commented May 10, 2021

thanks @sdonoughe and @agporto - I have released a new version (2.0.1) that addresses the typos and issues you mentioned.

@mluerig
Copy link
Author

mluerig commented May 13, 2021

pinging @jbencook and @lwasser. both reviewers are happy - so, what comes next? I would be very grateful if we could conclude the review process soonish, given that it started almost a year ago!

@lwasser
Copy link
Member

lwasser commented May 13, 2021

hi @mluerig
To begin - congratulations - 🎉 phenopype is now a part of the pyopensci ecosystem!

Next steps:


🎉 Thank you @jbencook @sdonoughe @agporto for supporting this review process! 😸

There are a few things left to do to wrap up this submission:

  • Add the badge for pyOpenSci peer-review to the README.md of . The badge should be [![pyOpenSci](https://tinyurl.com/y22nb8up)](https://github.com/pyOpenSci/software-review/issues/issue-number)
  • Add phenopype to the pyOpenSci website. @mluerig , please open a PR to update this file: to add your package and name to the list of contributors
  • If you have time and are open to being listed on our website, please add yourselves to this file via a pr so we can list you on our website as contributors!

All -- if you have any feedback for us about the review process please feel free to share it here. We are always looking to improve our process and our documentation in the contributing-guide. We have also been updating our documentation to improve the process so all feedback is appreciated!


@mluerig i'm so sorry that this has been a painfully slow process. I took some time to review the package, feedback and documentation given the tool doesn't work on MAC Os currently (for understandable reasons!). I wanted to be careful about setting precedent in the future. However, you have that clearly stated in your README file and also requested help / are interested in building out that functionality so we can consider that if a tool comes through with similar functionality for MAC.

The past year has been extremely challenging but starting in August / possibly July this project will have my full attention and future reviews will NOT be so slow. Thank you again for your patience. Please let me know if you have any questions / concerns! A review should not take a year so I have that commented noted already :)

@mluerig
Copy link
Author

mluerig commented May 13, 2021

@lwasser: done, done and done! Thanks for wrapping this up so quickly. I will submit this to Methods in Ecology and Evolution in the next few days. Any suggestion for how I should bring up this review there? There are options for rOpenSci, but no pyOpenSci - should I just write this in the cover letter?

Feedback: I would not change anything in the review process; there were templates, instructions and at the start the roles were distributed clearly. My only suggestion for improvement is, as you may have guessed, to work with compulsory deadlines, and a stronger role of the editor to make sure they are kept. Otherwise it gets swept away by the flood of other work we all are involved with - at least I would prefer to have a deadline, simply to mark this in my agenda and get my butt going :-).

In any case, I am extremely grateful for the time you spent on reviewing and editing, and I thank everyone involved. pyOpenSci is a great consortium, and I would submit here again!

@lwasser
Copy link
Member

lwasser commented May 13, 2021

wonderful. Thank you for this feedback.

And I hear you @mluerig i really do. I will note the need for deadlines and followup to ensure a more efficient process. Given I have your ear I have another question for you.
Right now, you have citation information in your readme. I'm curious if you had thought about a more formal DOI / citation approach through a tool such as zenodo ?

I'm asking because i've been thinking a lot about citations. JOSS provides one by default through their review process. But Zenodo can track releases and provide a citable DOI as well. Do you have any thoughts on this by chance?

i will merge both pr's shortly to our website. Thank you for doing that so quickly!

https://github.com/pyOpenSci/pyopensci.github.io/pull/62/files
pyOpenSci/pyopensci.github.io#63

@mluerig
Copy link
Author

mluerig commented May 14, 2021

the need for deadlines and followup

this can really just be a short post in the pyOpenSci issue to remind everyone

@lwasser good point about citations. phenopype both has a zenodo DOI and a bioRxiv DOI - would it make sense to provide the zenodo reference in the manuscript, and then place the manuscript reference in the readme?

I will submit this to Methods in Ecology and Evolution in the next few days. Any suggestion for how I should bring up this review there? There are options for rOpenSci, but no pyOpenSci - should I just write this in the cover letter?

do you have any thoughts on this?

@lwasser
Copy link
Member

lwasser commented May 17, 2021

hi @mluerig my apologies i was out of the office Friday.

I think for now it does make sense to include the zenodo reference in the manuscript. And then absolutely include the manuscript reference once you have it. If the article is really methods based (i presume it is given the journal you are submitting to) then i would think once you have a final journal DOI, that citation will be preferred over zenodo.

Yes, rOpenSci has a relationship with them. I have yet to build that relationship but can and would like to once we really get moving on the project in August. I need to speak with Karthik more about how that relationship works to better understand it. Yes please add this to your cover letter. And if there is a person that I can reach out to there, please feel free to forward their information to me OR give them my contact information. How does that sound?

@lwasser lwasser closed this as completed May 17, 2021
@mluerig
Copy link
Author

mluerig commented May 18, 2021

yupp that sounds good - I proceeded with these suggestions!

@lwasser
Copy link
Member

lwasser commented May 18, 2021 via email

@lwasser
Copy link
Member

lwasser commented Sep 15, 2022

hey 👋 @mluerig @jbencook @agporto @sdonoughe ! I hope that you are all well. I am reaching out here to all reviewers and maintainers about pyOpenSci now that i am working full time on the project (read more here). We have a survey that we'd like for you to fill out so we can:

🔗 HERE IS THE SURVEY LINK 🔗

  1. invite you to our slack channel to participate in our community (if you wish to join - no worries if that is not how you prefer to communicate / participate).
  2. Collect information from you about how we can improve our review process and also better serve maintainers.
    The survey should take about 10 minutes to complete depending upon how much you decide to write. This information will help us greatly as we make decisions about how pyOpenSci grows and serves the community. Thank you so much in advance for filling it out.

NOTE: this is different from the form designed for reviewers to sign up to review.
If there are other maintainers for this project, please ping them here and ask them to fill out the survey as well. It is important that we ensure packages are supported long term or sunsetted with sufficient communication to users. Thus we will check in with maintainers annually about maintenance.

Thank you in advance for doing this and supporting pyOpenSci.

@lwasser
Copy link
Member

lwasser commented Sep 28, 2022

hey there @mluerig @jbencook @agporto 👋 Just a friendly reminder to take 5-10 minutes to fill out our survey . We really appreciate it. Thank you in advance for helping us by filling out the survey!! 🙌 Moritz, it's really important for us to collect information from our maintainers so that we can both stay in touch with you regarding package maintenance and also support you through time. We really appreciate your time in filling this out. Also are you the sole maintainer of this package? if not, please have your co-maintainers also fill it out and please list them here as well. Many thanks in advance!

✨ Seth thank you so much for taking the time to fill it out 🙌

🔗 HERE IS THE SURVEY LINK 🔗

@mluerig
Copy link
Author

mluerig commented Oct 3, 2022 via email

@lwasser
Copy link
Member

lwasser commented Mar 6, 2023

hey there @mluerig i've going through and updating our package catalog and noticed that there is no pyopensci badge on your readme for phenopype! this only came up because the original url here was updated to phenopype/phenopype so i had to search for ya!

i wondered if you were open to adding the reviewed badge to your project readme? Many thanks!

[![pyOpenSci](https://tinyurl.com/y22nb8up)](https://github.com/pyOpenSci/software-review/issues/24)

@mluerig
Copy link
Author

mluerig commented Mar 8, 2023 via email

@lwasser
Copy link
Member

lwasser commented Mar 8, 2023 via email

@NickleDave
Copy link
Contributor

Hi @mluerig just checking back as we're updating metadata for our pyOpenSci sprint at PyCon -- are you back from vacation and if so could you add our badge when you get a chance?

#24 (comment)

We're happy to make a PR to phenophype adding it to the README if that would help!

@mluerig
Copy link
Author

mluerig commented Apr 28, 2023

finally done!

@NickleDave
Copy link
Contributor

Great, thanks so much @mluerig!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: pyos-accepted
Development

No branches or pull requests

8 participants