Installation process #83

afrubin · 2020-07-12T07:56:55Z

Currently the cerebra installation process requires some system packages to be installed. These are all relatively standard requirements that many users will already have. It's good to see that users have the option of running this in Docker. Providing a Nix expression is another option.

However, the documentation should be improved with an explanation of why these system packages are needed. They seem to be requirements for cerebra's dependencies rather than the package itself. This transparency is especially important if the dependencies-of-dependencies change in the future.

There are no installation instructions for Windows users, although there is a large population of bioinformatics specialists who use Windows as their primary environment and might benefit from this software. I'm guessing this is because the package won't install on Windows due to the pysam dependency, but it should be explicitly stated.

The installation instructions for the package itself are not ideal. The primary method of installation (especially for research software with pinned requirements) should be a virtual environment, but the primary method currently suggested is a system-level installation.

There are instructions for installing in a virtual environment via conda, but there is no reason I could see to require conda rather than using the standard library's venv. The virtual environment instructions also install the package in editable mode, which is ideal for development but not what most users would be expecting.

The installation section could be split into a "for users" section and a "for developers" section, where the latter describes an editable installation with test requirements and instructions for running unit tests locally.

openjournals/joss-reviews#2432

lincoln-harris · 2020-07-15T21:20:19Z

As far as I can tell, pysam just doesn't support Windows: pysam-developers/pysam#575. We don't use pysam directly; it's imported by a package that we call (though I can't figure out which one).

Because we're not calling it directly, it's not a matter of converting to a pure Python BAM parser like bamnostic.

are there any potential workarounds to this?

tagging @betteridiot

afrubin · 2020-07-16T03:37:25Z

I think that if you clearly document that you don't support Windows installs due to the pysam dependency that should be sufficient. Explicitly stating the nature of the incompatibility is important in case pysam changes or the package that requires pysam changes in the future.

There are a lot of useful bioinformatics tools that won't run/compile on Windows, so most users are familiar with navigating this situation. Windows Subsystem for Linux should be a good workaround for users who want to run things locally, and very few academic HPC environments are Windows-based.
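One way to make the limitation explicit at runtime as well as in the README is a small platform check. This is only a sketch: the function name `windows_install_note` and the message wording are hypothetical and not part of cerebra. It relies on `sys.platform`, which starts with `win` on native Windows and reports `linux` under WSL.

```python
import sys


def windows_install_note(platform_name=sys.platform):
    """Return a short note for native Windows users, or None elsewhere.

    Hypothetical helper for illustration; not part of cerebra.
    """
    if platform_name.startswith("win"):
        return (
            "Native Windows installs are not supported because a dependency "
            "requires pysam, which does not support Windows. "
            "Docker or WSL may work but are not officially supported."
        )
    return None


note = windows_install_note()
if note is not None:
    print(note)
```

Passing the platform name as an argument keeps the check easy to test without having to run on Windows.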

betteridiot · 2020-07-16T15:12:21Z

I appreciate the tag @lincoln-harris. Attempting to be platform agnostic is a lofty goal when working in the genomics space. I see that you have removed pysam from your requirements.txt, but the dependency still exists in your environment.yml.

In developing bamnostic, I quickly realized how prevalent the norm of a Linux-like environment is. It is perfectly valid to state that (as of now) your software works in Linux-based environments. This is because many of your dependencies are inextricably tied to Linux-based software: pathos, cython (which is touch and go on Windows at the best of times), and even colorama can cause some issues.

If the intention of the package is to analyze VCF files, then it is perfectly fine to rely on these other tools that have shown a long history of maintenance/support.

I agree with @afrubin that the expectation of a Linux-like development environment is completely acceptable, so long as you state it. An easy place to do this is in the setup.py using classifiers for your program.
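As a rough illustration of the classifier suggestion, a setup.py could carry something like the following. The specific classifier choices are assumptions rather than cerebra's actual metadata, and `setup_kwargs` is a hypothetical name. Classifiers are descriptive metadata only, so they document the supported platform on PyPI without blocking installation elsewhere.

```python
# Hypothetical excerpt of a setup.py; the classifier choices are assumptions.
CLASSIFIERS = [
    "Programming Language :: Python :: 3",
    "Operating System :: POSIX :: Linux",
    "Intended Audience :: Science/Research",
    "Topic :: Scientific/Engineering :: Bio-Informatics",
]

# Keyword arguments that would be passed to setuptools.setup().
setup_kwargs = {
    "name": "cerebra",
    "classifiers": CLASSIFIERS,
}

# In a real setup.py this would be followed by:
#   from setuptools import setup
#   setup(**setup_kwargs)
```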

afrubin · 2020-07-31T22:36:39Z

The new installation section in the README is greatly improved, but I still have some comments and concerns.

You should mention the Windows limitations at the start of the installation section. It may actually be possible to run the software using docker or WSL (maybe you or one of your co-authors can test this?) and it's fair for you to state that these solutions exist but are not officially supported.

Instructions for users and developers are mixed together. I recommend that you split this into two suitably-titled subsections, possibly with a third subsection describing the additional dependencies required by all pip install methods.

The docker build seems to be working nicely and you might even want to suggest this as the recommended way to run the software, with pip installation instructions provided for those who don't want to or can't run docker.

User instructions for pip installation should not install the package from source in editable mode. Instead, this should have the commands for creating a venv or conda environment and then installing from pypi.

Developer instructions should begin by reminding the reader to create their own fork and clone that. As written, these instructions only work for you and other contributors to the main repo.

Instructions for system-wide pypi install should be removed, since it is almost always better to use a virtual environment and this creates a pitfall for novice users.

+1 for adding the classifiers to setup.py as suggested by @betteridiot. I should point out that there is a Topic :: Scientific/Engineering :: Bio-Informatics that might be a better fit than Information Technology.

lincoln-harris · 2020-08-26T22:58:45Z

this is addressed with #109

afrubin · 2020-08-27T02:50:22Z

I am still confused by the installation instructions for end users. Why do you recommend installing from source via git? It should be as simple as creating the virtual environment and using pip install to get the latest cerebra from PyPI.

lincoln-harris · 2020-09-15T22:35:14Z

ok so you think

conda create -n cerebra python=3.7
conda activate cerebra
pip install cerebra

is a better way of structuring it?

afrubin · 2020-09-16T03:04:55Z

Yes, thanks. These conda instructions and the venv instructions in the README are much better!
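For readers who prefer to script the venv route, the same flow (create an environment, then install from PyPI) can be sketched with the standard library alone. The helper names below are hypothetical, and the sketch assumes a POSIX layout where the environment's interpreter lives at `bin/python`, which matches the Linux-like environments discussed above.

```python
import subprocess
import venv
from pathlib import Path


def create_env(env_dir):
    """Create a virtual environment with pip available, using stdlib venv."""
    venv.create(env_dir, with_pip=True)


def pip_install_command(env_dir, package="cerebra"):
    """Build the command that installs `package` from PyPI into the environment.

    Assumes a POSIX layout (env_dir/bin/python); hypothetical helper.
    """
    env_python = Path(env_dir) / "bin" / "python"
    return [str(env_python), "-m", "pip", "install", package]


def install(env_dir, package="cerebra"):
    """Create the environment, then run a regular (non-editable) pip install."""
    create_env(env_dir)
    subprocess.run(pip_install_command(env_dir, package), check=True)


# Example (not run here, since it needs network access):
#   install("cerebra-env")
```

Calling the environment's own interpreter with `-m pip` avoids having to activate the environment first, and the install is a normal one from PyPI rather than an editable install from source.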

afrubin mentioned this issue Jul 12, 2020

[REVIEW]: cerebra: A tool for fast and accurate summarizing of variant calling format (VCF) files openjournals/joss-reviews#2432

Closed

38 tasks

lincoln-harris mentioned this issue Jul 16, 2020

pysam build fails on python3.8 #86

Open

lincoln-harris mentioned this issue Aug 25, 2020

JOSS review comments -- redux #109

Merged

20 tasks

afrubin closed this as completed Sep 16, 2020
