Running a jupyter notebook using docker

petermr edited this page Jul 31, 2020 · 6 revisions

This page sketches the files needed for a dockerised Jupyter notebook. Place all of these files in one folder, together with a subfolder called data where the notebooks you produce will be kept. We recommend putting the cproject folder of the fetched papers inside the data folder so that the papers can be accessed from within the notebook. The folder has the following layout:

jupyter-
       |-Dockerfile
       |-requirements.txt
       |-run_notebook.sh
       |-data-
              |-results
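
The layout above can be created in one go. This is a minimal sketch, assuming you want the skeleton in the current directory; the root variable is an illustrative name, and the three files are created empty, to be filled with the contents listed in the sections that follow.

```shell
# Root folder name taken from the listing above; override by setting ROOT.
root="${ROOT:-jupyter}"

# data/ holds the notebooks; results/ sits inside it, as in the listing.
mkdir -p "$root/data/results"

# Empty placeholders for the three files described on this page.
touch "$root/Dockerfile" "$root/requirements.txt" "$root/run_notebook.sh"

# The launch script must be executable to be run as ./run_notebook.sh.
chmod +x "$root/run_notebook.sh"
```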

Dockerfile

FROM python:3.6

ADD ./requirements.txt /

RUN pip install -r requirements.txt

RUN apt-get update

RUN apt-get -y install ipython

RUN pip install jupyter

CMD ["jupyter", "notebook", "--ip=0.0.0.0", "--allow-root", "--notebook-dir=/data/"]

requirements.txt

numpy==1.17.4
scipy==1.4.1
requests==2.22.0
mysqlclient==1.4.4
python-dateutil==2.8.1
networkx==2.3
psycopg2==2.8.4
scikit-learn==0.22.1
pandas==0.25.3
wordcloud==1.6.0
nltk==3.4.5
dateparser==0.7.2
tensorflow
keras==2.3.1
neo4j==1.7.6
h5py==2.10.0
spacy==2.2.4
pdfminer3
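
Most entries in this list are pinned with ==, but tensorflow and pdfminer3 are not, so pip installs whatever version is current at build time (the build log further down shows tensorflow-2.3.0 and pdfminer3-2018.12.3.0). A small sketch for spotting unpinned entries, using an abbreviated copy of the list held in an illustrative variable:

```shell
# Abbreviated sample of the requirements list above (illustrative subset).
requirements='numpy==1.17.4
tensorflow
keras==2.3.1
pdfminer3'

# Keep only the lines that carry no "==" version pin.
unpinned=$(printf '%s\n' "$requirements" | grep -v '==')
echo "$unpinned"
```

Against the real file, the same filter would be grep -v '==' requirements.txt.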

run_notebook.sh

#!/usr/bin/env bash

docker build -t openvirus_tests_notebook:0.0.0 .

docker run -it \
 -p 8888:8888 \
 -v "$(pwd)/data:/data" \
 openvirus_tests_notebook:0.0.0

Once you have all these files, run run_notebook.sh (you may first need to run chmod +x run_notebook.sh to make the script executable) and wait for the image to finish building. The long build is only needed the first time; later runs reuse Docker's cached layers. When the notebook server has started you will be shown a localhost address. Copy and paste it into your browser and you will be good to go.
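
Before starting the build, it can help to confirm that the folder holds everything listed at the top of this page. The sketch below is a hypothetical helper (check_layout is not part of the original files) that reports anything missing and returns a non-zero status if the layout is incomplete.

```shell
# Hypothetical pre-flight check: succeeds only if the three files and
# the data folder are all present in the given directory.
check_layout() {
    dir="$1"
    missing=0
    for f in Dockerfile requirements.txt run_notebook.sh; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing file: $f" >&2
            missing=1
        fi
    done
    if [ ! -d "$dir/data" ]; then
        echo "missing folder: data" >&2
        missing=1
    fi
    return "$missing"
}

# Typical use, from inside the folder:
#   check_layout . && chmod +x run_notebook.sh && ./run_notebook.sh
```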

Alpha test (PeterMR)

NOTE: replace <my/workspace/> by your location for code.

Stuart posted recent instructions on Slack, which I have followed: https://github.com/bauhuasbadguy/exampleJupyterDockerisation#instructions

To use this repo, clone it as normal and run run_notebook.sh. Wait a few minutes for the tool to start, then copy the 127.0.0.1 address in full into your browser to start using the notebook. This was tested on Ubuntu, so further work may be needed to make the tool easy to use for Windows users.

PMR on MacOSX:

git clone

cd <my/workspace/>
git clone https://github.com/bauhuasbadguy/exampleJupyterDockerisation.git

produced

pm286macbook:workspace pm286$ git clone https://github.com/bauhuasbadguy/exampleJupyterDockerisation.git
Cloning into 'exampleJupyterDockerisation'...
remote: Enumerating objects: 17, done.
remote: Counting objects: 100% (17/17), done.
remote: Compressing objects: 100% (13/13), done.
remote: Total 17 (delta 5), reused 12 (delta 3), pack-reused 0
Unpacking objects: 100% (17/17), done.

this created a new directory

.
├── Dockerfile
├── README.md
├── data
├── requirements.txt
└── run_notebook.sh

Time: < 1 minute

install @Stuart's system

cd <my/workspace>/exampleJupyterDockerisation/
sh ./run_notebook.sh

There is a lot of output, and the build took about 10 minutes on a good connection; I expect it will be much slower in some cases. Here are some of the steps you will probably see.

Sending build context to Docker daemon  78.34kB
Step 1/7 : FROM python:3.6
3.6: Pulling from library/python
31dd5ebca5ef: Pull complete 
3ed641c4ae98: Pull complete 
bcd57146431e: Pull complete 
ac34a4d7c330: Pull complete 
3b0a7e6f20fb: Pull complete 
ebb3a49d6a6e: Pull complete 
68216b6edce5: Pull complete 
122eb670adba: Pull complete 
dc0778ea9227: Pull complete 
Digest: sha256:61aaf7a0ae69997cf041c4f41b405ff6365e73f2709cebbaf443902cf2871baa
Status: Downloaded newer image for python:3.6
 ---> 3cfab35f43d8
Step 2/7 : ADD ./requirements.txt /
 ---> 61212b0b9111
Step 3/7 : RUN pip install -r requirements.txt
 ---> Running in fc35185e4105
Collecting numpy==1.17.4
  Downloading numpy-1.17.4-cp36-cp36m-manylinux1_x86_64.whl (20.0 MB)
Collecting scipy==1.4.1
  Downloading scipy-1.4.1-cp36-cp36m-manylinux1_x86_64.whl (26.1 MB)
...

Collecting oauthlib>=3.0.0
  Downloading oauthlib-3.1.0-py2.py3-none-any.whl (147 kB)
Building wheels for collected packages: mysqlclient, networkx, psycopg2, nltk, neo4j, pdfminer3, absl-py, termcolor, wrapt, pyyaml, neobolt, neotime, wasabi
  Building wheel for mysqlclient (setup.py): started
  Building wheel for mysqlclient (setup.py): finished with status 'done'
  Created wheel for mysqlclient: filename=mysqlclient-1.4.4-cp36-cp36m-linux_x86_64.whl size=112291 sha256=c443a1e3562f827b4c3a04ff9f1ae49871ce680cfea19c2dc616f7b4c6abb016
  Stored in directory: /root/.cache/pip/wheels/5b/ed/b7/1f71e860fe48ad66c84d8a88abbb4b39854cfddb28c2d35392
  Building wheel for networkx (setup.py): started
  Building wheel for networkx (setup.py): finished with status 'done'
  Created wheel for networkx: filename=networkx-2.3-py2.py3-none-any.whl size=1555989 sha256=163db1a92a04c897688a6c298a4fd8672cda9cb7b2a0f9e0c6dc0c2d92c6f84b
...

The next lines show the Python infrastructure and the libraries we shall use.

  Stored in directory: /root/.cache/pip/wheels/2f/a5/75/e4c94412ac76860c05f9042cfa656900f3635903dee6696629
  Building wheel for wasabi (setup.py): started
  Building wheel for wasabi (setup.py): finished with status 'done'
  Created wheel for wasabi: filename=wasabi-0.7.1-py3-none-any.whl size=20834 sha256=c5b649e698fb77450f5521c21c10649da9c076b1a5c604ef81816513a8060da0
  Stored in directory: /root/.cache/pip/wheels/81/48/90/cf81833b3dfce6eaf7eab4bd5fdc0e75dbca4418b263f444b8
Successfully built mysqlclient networkx psycopg2 nltk neo4j pdfminer3 absl-py termcolor wrapt pyyaml neobolt neotime wasabi
Installing collected packages: numpy, scipy, idna, chardet, certifi, urllib3, requests, mysqlclient, six, python-dateutil, decorator, networkx, psycopg2, joblib, scikit-learn, pytz, pandas, cycler, pillow, pyparsing, kiwisolver, matplotlib, wordcloud, nltk, tzlocal, regex, dateparser, grpcio, absl-py, zipp, importlib-metadata, markdown, werkzeug, tensorboard-plugin-wit, protobuf, pyasn1, pyasn1-modules, rsa, cachetools, google-auth, oauthlib, requests-oauthlib, google-auth-oauthlib, tensorboard, google-pasta, gast, h5py, termcolor, wrapt, tensorflow-estimator, astunparse, keras-preprocessing, opt-einsum, tensorflow, keras-applications, pyyaml, keras, neobolt, neotime, neo4j, cymem, srsly, catalogue, tqdm, wasabi, murmurhash, preshed, blis, plac, thinc, spacy, pycryptodome, sortedcontainers, pdfminer3
Successfully installed absl-py-0.9.0 astunparse-1.6.3 blis-0.4.1 cachetools-4.1.1 catalogue-1.0.0 certifi-2020.6.20 chardet-3.0.4 cycler-0.10.0 cymem-2.0.3 dateparser-0.7.2 decorator-4.4.2 gast-0.3.3 google-auth-1.20.0 google-auth-oauthlib-0.4.1 google-pasta-0.2.0 grpcio-1.30.0 h5py-2.10.0 idna-2.8 importlib-metadata-1.7.0 joblib-0.16.0 keras-2.3.1 keras-applications-1.0.8 keras-preprocessing-1.1.2 kiwisolver-1.2.0 markdown-3.2.2 matplotlib-3.3.0 murmurhash-1.0.2 mysqlclient-1.4.4 neo4j-1.7.6 neobolt-1.7.17 neotime-1.7.4 networkx-2.3 nltk-3.4.5 numpy-1.17.4 oauthlib-3.1.0 opt-einsum-3.3.0 pandas-0.25.3 pdfminer3-2018.12.3.0 pillow-7.2.0 plac-1.1.3 preshed-3.0.2 protobuf-3.12.4 psycopg2-2.8.4 pyasn1-0.4.8 pyasn1-modules-0.2.8 pycryptodome-3.9.8 pyparsing-2.4.7 python-dateutil-2.8.1 pytz-2020.1 pyyaml-5.3.1 regex-2020.7.14 requests-2.22.0 requests-oauthlib-1.3.0 rsa-4.6 scikit-learn-0.22.1 scipy-1.4.1 six-1.15.0 sortedcontainers-2.2.2 spacy-2.2.4 srsly-1.0.2 tensorboard-2.3.0 tensorboard-plugin-wit-1.7.0 tensorflow-2.3.0 tensorflow-estimator-2.3.0 termcolor-1.1.0 thinc-7.4.0 tqdm-4.48.0 tzlocal-2.1 urllib3-1.25.10 wasabi-0.7.1 werkzeug-1.0.1 wordcloud-1.6.0 wrapt-1.12.1 zipp-3.1.0
Removing intermediate container fc35185e4105
 ---> 31b290c38d50
Step 4/7 : RUN apt-get update
 ---> Running in d4c20d3f9566
Get:1 http://security.debian.org/debian-security buster/updates InRelease [65.4 kB]
Get:2 http://security.debian.org/debian-security buster/updates/main amd64 Packages [217 kB]
Get:3 http://deb.debian.org/debian buster InRelease [121 kB]
Get:4 http://deb.debian.org/debian buster-updates InRelease [51.9 kB]
Get:5 http://deb.debian.org/debian buster/main amd64 Packages [7905 kB]
Get:6 http://deb.debian.org/debian buster-updates/main amd64 Packages [7868 B]
Fetched 8369 kB in 3s (2765 kB/s)
Reading package lists...
Removing intermediate container d4c20d3f9566
 ---> 473d1b386d66
...

Step 5/7 : RUN apt-get -y install ipython
 ---> Running in 3af739632db3
Reading package lists...
Building dependency tree...
Reading state information...
The following additional packages will be installed:
  python-backports-shutil-get-terminal-size python-chardet python-decorator
  python-enum34 python-ipython python-ipython-genutils python-pathlib2
  python-pexpect python-pickleshare python-pkg-resources python-prompt-toolkit
  python-ptyprocess python-pygments python-scandir python-simplegeneric
  python-six python-traitlets python-wcwidth
Suggested packages:
  python-enum34-doc python-pexpect-doc python-setuptools python-pygments-doc
  ttf-bitstream-vera
The following NEW packages will be installed:
  ipython python-backports-shutil-get-terminal-size python-chardet
  python-decorator python-enum34 python-ipython python-ipython-genutils
  python-pathlib2 python-pexpect python-pickleshare python-pkg-resources
  python-prompt-toolkit python-ptyprocess python-pygments python-scandir
  python-simplegeneric python-six python-traitlets python-wcwidth
0 upgraded, 19 newly installed, 0 to remove and 0 not upgraded.
Need to get 1727 kB of archives.
After this operation, 8524 kB of additional disk space will be used.  !!!PMR NOTE SIZE !!!
Get:1 http://deb.debian.org/debian buster/main amd64 python-decorator all 4.3.0-1.1 [14.4 kB]
Get:2 http://deb.debian.org/debian buster/main amd64 python-ptyprocess all 0.6.0-1 [13.1 kB]
Get:3 http://deb.debian.org/debian buster/main amd64 python-pexpect all 4.6.0-1 [51.6 kB]
...

Setting up python-ipython (5.8.0-1) ...
Setting up ipython (5.8.0-1) ...
Removing intermediate container 3af739632db3
 ---> 85f76b61be78
Step 6/7 : RUN pip install jupyter
 ---> Running in f2a0da4efc8a
Collecting jupyter
  Downloading jupyter-1.0.0-py2.py3-none-any.whl (2.7 kB)
Collecting ipykernel
  Downloading ipykernel-5.3.4-py3-none-any.whl (120 kB)
Collecting qtconsole
  Downloading qtconsole-4.7.5-py2.py3-none-any.whl (118 kB)

...

Successfully built tornado pandocfilters pyrsistent
Installing collected packages: ipython-genutils, traitlets, pygments, pickleshare, ptyprocess, pexpect, wcwidth, prompt-toolkit, backcall, parso, jedi, ipython, tornado, jupyter-core, pyzmq, jupyter-client, ipykernel, qtpy, qtconsole, jupyter-console, pyrsistent, attrs, jsonschema, nbformat, terminado, prometheus-client, Send2Trash, mistune, defusedxml, entrypoints, packaging, webencodings, bleach, testpath, pandocfilters, MarkupSafe, jinja2, nbconvert, notebook, widgetsnbextension, ipywidgets, jupyter
Successfully installed MarkupSafe-1.1.1 Send2Trash-1.5.0 attrs-19.3.0 backcall-0.2.0 bleach-3.1.5 defusedxml-0.6.0 entrypoints-0.3 ipykernel-5.3.4 ipython-7.16.1 ipython-genutils-0.2.0 ipywidgets-7.5.1 jedi-0.17.2 jinja2-2.11.2 jsonschema-3.2.0 jupyter-1.0.0 jupyter-client-6.1.6 jupyter-console-6.1.0 jupyter-core-4.6.3 mistune-0.8.4 nbconvert-5.6.1 nbformat-5.0.7 notebook-6.0.3 packaging-20.4 pandocfilters-1.4.2 parso-0.7.1 pexpect-4.8.0 pickleshare-0.7.5 prometheus-client-0.8.0 prompt-toolkit-3.0.5 ptyprocess-0.6.0 pygments-2.6.1 pyrsistent-0.16.0 pyzmq-19.0.1 qtconsole-4.7.5 qtpy-1.9.0 terminado-0.8.3 testpath-0.4.4 tornado-6.0.4 traitlets-4.3.3 wcwidth-0.2.5 webencodings-0.5.1 widgetsnbextension-3.5.1
Removing intermediate container f2a0da4efc8a
 ---> 9e1a53797700
Step 7/7 : CMD ["jupyter", "notebook", "--ip=0.0.0.0", "--allow-root", "--notebook-dir=/data/"]
 ---> Running in 9a680b965707
Removing intermediate container 9a680b965707
 ---> babf4c303358
Successfully built babf4c303358
Successfully tagged openvirus_tests_notebook:0.0.0
[I 08:54:32.561 NotebookApp] Writing notebook server cookie secret to /root/.local/share/jupyter/runtime/notebook_cookie_secret
[I 08:54:32.909 NotebookApp] Serving notebooks from local directory: /data
[I 08:54:32.910 NotebookApp] The Jupyter Notebook is running at:
[I 08:54:32.910 NotebookApp] http://5b02a6a7c63b:8888/?token=51ed4ea89cde9e861299033a86e319418eb09e501f558115
[I 08:54:32.910 NotebookApp]  or http://127.0.0.1:8888/?token=51ed4ea89cde9e861299033a86e319418eb09e501f558115
[I 08:54:32.910 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[W 08:54:32.915 NotebookApp] No web browser found: could not locate runnable browser.
[C 08:54:32.915 NotebookApp] 
    
    To access the notebook, open this file in a browser:
        file:///root/.local/share/jupyter/runtime/nbserver-1-open.html
    Or copy and paste one of these URLs:
        http://5b02a6a7c63b:8888/?token=51ed4ea89cde9e861299033a86e319418eb09e501f558115
     or http://127.0.0.1:8888/?token=51ed4ea89cde9e861299033a86e319418eb09e501f558115
[I 08:55:39.686 NotebookApp] 302 GET /?token=51ed4ea89cde9e861299033a86e319418eb09e501f558115 (172.17.0.1) 1.27ms

The output stops at this stage: the notebook server is now running and waiting for a browser to connect.

Browser

I pasted http://127.0.0.1:8888/?token=51ed4ea89cde9e861299033a86e319418eb09e501f558115 into my Chrome browser and got an empty Jupyter Notebook; this is the expected result so far.
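
If the address is hard to spot in the output, it can be picked out with grep. This is a minimal sketch, assuming the log line has the same format as in the output above; the variable names are illustrative, and the sample line is copied from that output.

```shell
# One log line as printed by the notebook server (copied from the output above).
log_line='[I 08:54:32.910 NotebookApp]  or http://127.0.0.1:8888/?token=51ed4ea89cde9e861299033a86e319418eb09e501f558115'

# -o prints only the matching part: the 127.0.0.1 address with its token.
url=$(printf '%s\n' "$log_line" | grep -o 'http://127\.0\.0\.1:[0-9]*/?token=[0-9a-f]*' | head -n 1)
echo "$url"
```

The token changes every time the server starts, so the address has to be taken from your own output, not from this page.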

Thanks to Stuart.
