This repository has been archived by the owner on Mar 17, 2023. It is now read-only.

Installation process #83

Closed
afrubin opened this issue Jul 12, 2020 · 8 comments

Comments

@afrubin

afrubin commented Jul 12, 2020

Currently the cerebra installation process requires some system packages to be installed. These are all relatively standard requirements that many users will already have. It's good to see that users have the option of running this in Docker. Providing a Nix expression is another option.

However, the documentation should be improved with an explanation of why these system packages are needed. They seem to be requirements for cerebra's dependencies rather than the package itself. This transparency is especially important if the dependencies-of-dependencies change in the future.

There are no installation instructions for Windows users, although there is a large population of bioinformatics specialists who use Windows as their primary environment and might benefit from this software. I'm guessing this is because the package won't install on Windows due to the pysam dependency, but it should be explicitly stated.

The installation instructions for the package itself are not ideal. The primary method of installation (especially for research software with pinned requirements) should be a virtual environment, but the primary method currently suggested is a system-level installation.

There are instructions for installing in a virtual environment via conda, but there is no reason I could see to require conda rather than using the standard library's venv. The virtual environment instructions also install the package in editable mode, which is ideal for development but not what most users would be expecting.

The installation section could be split into a "for users" section and a "for developers" section, where the latter describes an editable installation with test requirements and instructions for running unit tests locally.

openjournals/joss-reviews#2432

@lincoln-harris
Collaborator

lincoln-harris commented Jul 15, 2020

as far as I can tell pysam just doesn't support Windows: pysam-developers/pysam#575. We don't directly use pysam; it's imported by a package that we call (though I can't figure out which one).

because we're not calling it directly, it's not a matter of converting to a pure Python BAM parser like bamnostic.

are there any potential workarounds to this?

tagging @betteridiot

@afrubin
Author

afrubin commented Jul 16, 2020

I think that if you clearly document that you don't support Windows installs due to the pysam dependency that should be sufficient. Explicitly stating the nature of the incompatibility is important in case pysam changes or the package that requires pysam changes in the future.

There are a lot of useful bioinformatics tools that won't run/compile on Windows, so most users are familiar with navigating this situation. Windows Subsystem for Linux should be a good workaround for users who want to run things locally, and very few academic HPC environments are Windows-based.

@betteridiot

I appreciate the tag @lincoln-harris. Attempting to be platform agnostic is a lofty goal when working in the genomics space. I see that you have removed pysam from your requirements.txt, but the dependency still exists in your environment.yml.

In developing bamnostic, I quickly realized how prevalent the norm of a Linux-like environment is. It is perfectly valid to state that (as of now) your software works in Linux-based environments, because many of your dependencies are inextricably tied to Linux-based software: pathos, cython (which is touch and go on Windows at the best of times), and even colorama can cause some issues.

If the intention of the package is to analyze VCF files, then it is perfectly fine to rely on these other tools that have shown a long history of maintenance/support.

I agree with @afrubin that the expectation of a Linux-like development environment is completely acceptable, so long as you state it explicitly. An easy place to do this is in the setup.py, using classifiers for your program.

@afrubin
Author

afrubin commented Jul 31, 2020

The new installation section in the README is greatly improved, but I still have some comments and concerns.

You should mention the Windows limitations at the start of the installation section. It may actually be possible to run the software using docker or WSL (maybe you or one of your co-authors can test this?) and it's fair for you to state that these solutions exist but are not officially supported.

Instructions for users and developers are mixed together. I recommend that you split this into two suitably-titled subsections, possibly with a third subsection describing the additional dependencies required by all pip install methods.

The docker build seems to be working nicely and you might even want to suggest this as the recommended way to run the software, with pip installation instructions provided for those who don't want to or can't run docker.

User instructions for pip installation should not install the package from source in editable mode. Instead, this should have the commands for creating a venv or conda environment and then installing from pypi.

Developer instructions should begin by reminding the reader to create their own fork and clone that. As written, these instructions only work for you and other contributors to the main repo.

Instructions for system-wide pypi install should be removed, since it is almost always better to use a virtual environment and this creates a pitfall for novice users.

+1 for adding the classifiers to setup.py as suggested by @betteridiot. I should point out that there is a Topic :: Scientific/Engineering :: Bio-Informatics classifier that might be a better fit than Information Technology.
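
To make the classifier suggestion concrete, here is a minimal sketch of what the list could look like. The strings are standard PyPI trove classifiers, but which ones cerebra should actually declare is an assumption on my part, and the `CLASSIFIERS` name is only illustrative.

```python
# Hypothetical excerpt for setup.py. The classifier strings are standard PyPI
# trove classifiers; the particular selection is an assumption, not the
# project's actual metadata.
CLASSIFIERS = [
    "Intended Audience :: Science/Research",
    # States the Linux-like platform expectation discussed in this thread.
    "Operating System :: POSIX :: Linux",
    "Programming Language :: Python :: 3",
    "Topic :: Scientific/Engineering :: Bio-Informatics",
]

# In setup.py this list would be passed alongside the existing arguments:
#     setup(name="cerebra", ..., classifiers=CLASSIFIERS)
```

Classifiers are metadata only: they show up on the PyPI project page and in search filters, but pip does not use them to block installation on an unsupported platform, so the README statement is still needed.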

@lincoln-harris
Collaborator

this is addressed with #109

@afrubin
Author

afrubin commented Aug 27, 2020

I am still confused by the installation instructions for end users. Why do you recommend installing from source via git? It should be as simple as creating the virtual environment and using pip install to get the latest cerebra from PyPI.
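
As a rough illustration of those two steps, here is a sketch that uses only the standard library's `venv` module and then runs pip inside the new environment. The helper names and the `cerebra-env` directory are hypothetical, and on the command line the same thing is simply `python3 -m venv cerebra-env`, activating the environment, and `pip install cerebra`.

```python
import os
import subprocess
import venv
from pathlib import Path


def env_python(env_dir: Path) -> Path:
    """Path of the interpreter inside the venv (the layout differs on Windows)."""
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    return env_dir / bin_dir / "python"


def install_command(env_dir: Path, package: str = "cerebra") -> list:
    """Build the pip command that installs `package` from PyPI into the venv."""
    return [str(env_python(env_dir)), "-m", "pip", "install", package]


def create_env_and_install(env_dir: Path, package: str = "cerebra") -> None:
    """Create the venv with pip available, then install the package into it."""
    venv.EnvBuilder(with_pip=True).create(env_dir)
    subprocess.run(install_command(env_dir, package), check=True)


# Example (needs network access; directory name is illustrative):
#     create_env_and_install(Path("cerebra-env"))
```

Note that nothing here clones the repository or uses editable mode, which is the point: an end user only needs the released package from PyPI.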

@lincoln-harris
Collaborator

ok so you think

conda create -n cerebra python=3.7
conda activate cerebra
pip install cerebra

is a better way of structuring it?

@afrubin
Author

afrubin commented Sep 16, 2020

Yes, thanks. These conda instructions and the venv instructions in the README are much better!

afrubin closed this as completed on Sep 16, 2020