Add docker #236

Merged
merged 26 commits into from
Apr 9, 2017

Conversation

blezek
Collaborator

@blezek blezek commented Apr 5, 2017

Added two extra Dockerfiles and associated documentation.

fedorov and others added 17 commits February 23, 2017 09:23
A PyRadiomics Docker based on Jupyter notebook Dockers
(https://github.com/jupyter/docker-stacks), specifically jupyter/datascience-notebook
located here: https://github.com/jupyter/docker-stacks/tree/master/datascience-notebook

The Docker exposes the /data volume.  The work directory, where the Jupyter
notebook starts, has a symlink to /data, making the host directory easily
accessible.
* datascience-docker:
  Rename Dockerfile to Dockerfile.notebook in prep for merging with Docker PR
  Docker for PyRadiomics based on a Jupyter notebook
  STYLE: fix syntax
  ENH: check if feature class name is valid
  STYL: Remove unused variable
  BUG: Fix pyradiomicsbatch error when using python 3
  BUG: Fix error when image or mask is not loaded correctly
* master: (42 commits)
  ENH: Small bugfix in commandlinebatch.py
  DOCS: Document adding a progress reporter
  ENH: Simplify specifying a progress reporter
  ENH: Remove dependency for `tqdm`
  ENH: Add pandas example
  ENH: Allow variable number of columns in input CSV
  BUG: Correct Np when weighting is applied
  STYL: Don't replace , with ; in output of general info
  BUG: Error in geometry check during resampling
  STYL: Add FAQ for image and mask of different geometry
  ENH: Add physical space check when resampling
  ENH: Add checks for mask size and dimensions.
  STYL: Update version in README
  DOCS: Update reference
  DOCS: Add reference for gray level discretization method
  STYL: Add suggested value for voxelArrayShift
  STYL: Add discretization formula
  MATH: Change default value of voxelArrayShift
  BUG: Change binning
  STYL: Rename `force2Dextraction` to `force2D`
  ...
@blezek blezek requested review from fedorov and JoostJM April 5, 2017 18:26
Collaborator

@fedorov fedorov left a comment

This is great! Just a few minor stylistic comments.


FROM jupyter/datascience-notebook

MAINTAINER "Daniel Blezek" [email protected]
Collaborator

Probably should be swapped to pyradiomics as in the above, unless you really want to sign up to maintain this personally!


FROM ubuntu:16.04

MAINTAINER "Daniel Blezek" [email protected]
Collaborator

same as above

@fedorov
Collaborator

fedorov commented Apr 5, 2017

Another suggestion is to organize dockerfiles into a dedicated directory


# Make a global directory and link it to the work directory
RUN mkdir /data
RUN ln -s /data /home/jovyan/work/data
Collaborator

Instead of creating a custom image for this, I would suggest to:

  • use the official Jupyter images
  • ask the user to simply clone the repository containing the notebooks
  • run the Jupyter image with that folder mounted
  • make installing PyRadiomics and its dependencies step 1 of the notebook

Doing so, we would have one less image to maintain.

Collaborator

@jcfr jcfr Apr 5, 2017

The instructions would then be:

docker pull jupyter/datascience-notebook
git clone repo/containing/the-notebooks-dir
cd the-notebooks-dir
docker run -it --rm -p 8888:8888 -v $(pwd):/home/jovyan/work jupyter/datascience-notebook

README.md Outdated

docker build -t radiomics/cli .
docker build -t radiomics/notebook -f Dockerfile.notebook .
docker build -t radiomics/cli2 -f Dockerfile.ubuntu .
Collaborator

The image should probably be named differently; it is not clear what the difference between cli and cli2 is.


# Install Scipy https://www.scipy.org/install.html
USER root
RUN apt-get update && apt-get install -y python-pip python-numpy python-scipy python-nose
Collaborator

@jcfr jcfr Apr 5, 2017

Some of these requirements are redundant. Using pip install -r requirements.txt should be sufficient to pull most of the requirements.

Dockerfile Outdated
python setup.py install

WORKDIR /usr/src
ENTRYPOINT ["pyradiomics"]
Collaborator

If the goal is simply to have an image providing the pyradiomics executable, there is no need to include the compiler in the distributed image. Instead, I would suggest building a pyradiomics wheel using manylinux. The wheel would simply be installed in this image, which would then be much smaller.

Collaborator

@jcfr jcfr left a comment

Nice to see docker images coming up for pyradiomics 👍

I would suggest:

  • revisiting the need for the Jupyter-based one
  • considering a lighter image to provide pyradiomics; I think there is no need to include the compiler toolchain in the image
  • considering a pyradiomics wheel based on manylinux that we can then easily integrate in any container
  • squashing all the commits

@blezek
Collaborator Author

blezek commented Apr 6, 2017

@fedorov, @jcfr, Based on the comments, I'm going to propose that we settle down on 1 Dockerfile. It will be based on the jupyter/datascience-notebook and install the example notebooks, as well as the pyradiomics command. In the documentation, we can give examples on how to start up the notebook interface, and use the command in the Docker image. I do realize the datascience-notebook is heavyweight, but it does come with a huge amount of helpful packages already baked in. Having our examples (and more would be better) already baked in is a huge benefit.

We could also try to produce the smallest possible image for the command line version, if we want to have 2 images.

I hadn't planned on squashing the commits; it didn't seem to be the style in this repo. Is it necessary?

@fedorov
Collaborator

fedorov commented Apr 6, 2017

Personally, I really really would hate the perfect to be the enemy of the good here. We can always improve later, but I think it is better to provide this functionality to the user early than have this PR pending much longer.

@blezek
Collaborator Author

blezek commented Apr 6, 2017

@fedorov Happy to make changes, but I also agree with you. We can iterate on Docker support. I'll let someone else approve the PR.

@fedorov
Collaborator

fedorov commented Apr 6, 2017

@blezek would be great if you do those changes that you can and agree with today or tomorrow, we really would like to add docker support before the end of the week in time for the QIN meeting.

@pieper
Contributor

pieper commented Apr 6, 2017 via email

@jcfr
Collaborator

jcfr commented Apr 6, 2017

offering a stripped down version later.

You are right, the size of the images is not a critical problem.

That said, the other issue with the current approach is that we do not know whether the created images will work.

To keep up with the high bar maintained so far 😄 , I would highly recommend adding test(s) that check that the built Docker images work as expected. For example:

  • build and publish the image as part of the CI process (e.g. execute a basic notebook that imports pyradiomics and does a simple computation, or even run the complete test suite). The new CircleCI 2.0 makes working with Docker even easier.

If not, we should at least update the release checklist to ensure the images are manually tested before each release.

Finally, for any given Docker image, we should be able to track the corresponding pyradiomics SHA. This can be done easily by adding metadata. See https://microbadger.com/labels#!
For example: https://github.com/QIICR/dcmqi/blob/f621a32eed0eadac852cee9860addbffb5d92460/docker/Makefile#L65-L72
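
As a rough illustration of the metadata idea, the sketch below assembles a `docker build` invocation that attaches label-schema.org labels through the standard `--label key=value` option. The helper name `docker_build_command`, the image name, the short SHA, and the fixed build date are all hypothetical placeholders for illustration; this is not what the linked dcmqi Makefile or this PR actually does.

```python
def docker_build_command(image, vcs_ref, vcs_url, build_date, context="."):
    """Return a `docker build` argument list carrying provenance labels.

    Hypothetical helper: the label keys follow the label-schema.org
    convention; every value is supplied by the caller.
    """
    labels = {
        "org.label-schema.schema-version": "1.0",
        "org.label-schema.build-date": build_date,
        "org.label-schema.vcs-url": vcs_url,
        "org.label-schema.vcs-ref": vcs_ref,
    }
    cmd = ["docker", "build", "-t", image]
    for key, value in labels.items():
        # Each label is passed as a separate --label key=value pair.
        cmd += ["--label", f"{key}={value}"]
    cmd.append(context)
    return cmd


# Illustrative values only; in CI the SHA would typically come from
# `git rev-parse --short HEAD`.
cmd = docker_build_command(
    image="radiomics/pyradiomics",
    vcs_ref="abc1234",
    vcs_url="https://github.com/Radiomics/pyradiomics",
    build_date="2017-04-09T00:00:00Z",
)
print(" ".join(cmd))
```

Once an image is built this way, `docker inspect` shows the labels, so the image can be traced back to the commit it was built from.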

I know it is tempting to move forward, but I really think having systematic testing and provenance tracking for the Docker images will pay off.

@pieper
Contributor

pieper commented Apr 6, 2017 via email

@jcfr
Collaborator

jcfr commented Apr 6, 2017

A self test command that the user can easily run would be very helpful. Let's then add running the tests to a notebook tutorial.

Great. To achieve this, we could:

With that, anyone installing the package can execute the tests. In this particular case, since Docker provides a consistent and repeatable environment, we can ensure the tests run in the environment we provide right after we build the Docker images.
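
One possible shape for such a self test is sketched below. It is an assumption, not the mechanism proposed in this thread: the hypothetical `smoke_test` helper only checks that a module can be imported and reports its version if one is exposed. Inside the image it would be called with the `radiomics` module name; the demonstration uses the standard-library `json` module so the snippet runs anywhere.

```python
import importlib


def smoke_test(module_name="radiomics"):
    """Try to import a module and return (ok, detail).

    Hypothetical helper for illustration; it verifies importability only
    and does not run the package's test suite.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        return False, f"import failed: {exc}"
    # Not every module exposes __version__, so fall back to a placeholder.
    version = getattr(module, "__version__", "unknown")
    return True, f"{module_name} imported, version {version}"


# Demonstrated with a standard-library module so it runs without pyradiomics.
ok, detail = smoke_test("json")
print(ok, detail)

missing_ok, missing_detail = smoke_test("no_such_module_for_demo")
print(missing_ok, missing_detail)
```

A fuller self test would go on to run the test suite, but a check like this already catches an image in which the package failed to install.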

@pieper
Contributor

pieper commented Apr 6, 2017 via email

@jcfr
Collaborator

jcfr commented Apr 6, 2017

Great.

@blezek Let us know if you have any questions. I would be happy to review further.

@blezek
Collaborator Author

blezek commented Apr 7, 2017

Hey @jcfr @pieper @fedorov & @JoostJM,

Bunch of commits, but I slimmed down to one Dockerfile that runs the example notebooks as part of the build with Python 2 and 3. Also added a CircleCI v2 configuration that installs radiomics under Python 2 & 3, tests the package and executes the notebooks.
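
How the notebooks are executed is not shown in this thread. One common approach, sketched here purely as an assumption, is to run each notebook through `jupyter nbconvert --to notebook --execute`, which fails if a cell raises an error. The hypothetical `notebook_commands` helper only builds the command lines, and the file names are placeholders.

```python
def notebook_commands(paths):
    """Build one `jupyter nbconvert` command per notebook path.

    Hypothetical helper: non-notebook paths are skipped and the commands
    are returned in sorted order so runs are reproducible.
    """
    commands = []
    for path in sorted(paths):
        if not path.endswith(".ipynb"):
            continue
        commands.append(
            ["jupyter", "nbconvert", "--to", "notebook", "--execute", path]
        )
    return commands


# Placeholder file names for illustration.
cmds = notebook_commands(["b.ipynb", "README.md", "a.ipynb"])
for cmd in cmds:
    # In CI each command could be run with subprocess.run(cmd, check=True).
    print(" ".join(cmd))
```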

A working example build is at https://circleci.com/gh/blezek/pyradiomics/9.

Added labels to the Docker (https://github.com/rossf7/label-schema-automated-build) but can't figure out how to test it without going through DockerHub.

If we configure DockerHub to auto build, we should have a nice pipeline going.

@pieper
Contributor

pieper commented Apr 7, 2017

Hi @blezek I checked out the branch and ran 'docker build' but got the error below.

---> ebf7a1a7a6a3
Removing intermediate container 66d0c72f5b2d
Step 13 : ADD bin/Notebooks/FeatureVisualizationWithClustering.ipynb /home/jovyan/work/
lstat bin/Notebooks/FeatureVisualizationWithClustering.ipynb: no such file or directory

The file is referenced here, but looks like it's not in your branch.

https://github.com/Radiomics/pyradiomics/pull/236/files#diff-3254677a7917c6c01f55212f86c57fbfR33

@blezek
Collaborator Author

blezek commented Apr 7, 2017

Sorry @pieper, I missed it. Try again. Need someone to look at that file and decide if it should be added to the Docker or not.

@fedorov
Collaborator

fedorov commented Apr 7, 2017

I confirm I could build docker without problems. Thank you Dan!

@fedorov
Collaborator

fedorov commented Apr 9, 2017

I am merging this so we officially have docker support in time for the QIN meeting that starts tomorrow.

@fedorov fedorov merged commit d151d44 into AIM-Harvard:master Apr 9, 2017
@fedorov
Collaborator

fedorov commented Apr 9, 2017

@JoostJM you will need to configure Docker Hub yourself; it does not work for me, perhaps because of a permission issue. You will need to create an Automated Build rather than the Repository you have at the moment. I cannot do this, because the radiomics organization does not show up in the list when I start creating a new Automated Build.

@fedorov
Collaborator

fedorov commented Apr 10, 2017

I was not able to access the radiomics/pyradiomics organization from Docker Hub because of the third-party access restriction policy. After disabling it, I can see the organization and the repo. However, I am still not able to create an automated build under the radiomics organization on Docker Hub.

@JoostJM
Collaborator

JoostJM commented Apr 10, 2017

@fedorov, I just added it. Setting it up now.

@JoostJM
Collaborator

JoostJM commented Apr 10, 2017

PyRadiomics docker is building.

@@ -1,481 +0,0 @@
{
Collaborator

@blezek, you deleted this example. Was there a specific reason for this? It was used in the short video we made on PyRadiomics usage, which is why it was made available in the repository.

Collaborator Author

Oops, @JoostJM I'm terribly sorry, I mistakenly deleted this notebook! I thought it was one of my mistakes. I'll submit a PR and fix it.
