Adding Dockerfile for Ubuntu18.04-pytorch1.12.1-cuda11.3-cudnn8 #572

Merged: 8 commits, Sep 20, 2022
106 changes: 98 additions & 8 deletions docker/README.md
@@ -1,24 +1,114 @@
# icefall dockerfile

We provide a dockerfile for some users, the configuration of dockerfile is : Ubuntu18.04-pytorch1.7.1-cuda11.0-cudnn8-python3.8. You can use the dockerfile by following the steps:
Two configurations are provided: (a) Ubuntu18.04-pytorch1.12.1-cuda11.3-cudnn8, and (b) Ubuntu18.04-pytorch1.7.1-cuda11.0-cudnn8.

If your NVIDIA driver supports CUDA 11.3 or later, please go for case (a) Ubuntu18.04-pytorch1.12.1-cuda11.3-cudnn8.

Otherwise, since the older PyTorch images have not been updated for the [apt-key rotation by NVIDIA](https://developer.nvidia.com/blog/updating-the-cuda-linux-gpg-repository-key), you have to go for case (b) Ubuntu18.04-pytorch1.7.1-cuda11.0-cudnn8. Ensure that your NVIDIA driver supports at least CUDA 11.0.

You can check the highest CUDA version that your NVIDIA driver supports with the `nvidia-smi` command below. In this example, the highest CUDA version is 11.0, so case (b) Ubuntu18.04-pytorch1.7.1-cuda11.0-cudnn8 applies.

```bash
$ nvidia-smi
Tue Sep 20 00:26:13 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.119.03 Driver Version: 450.119.03 CUDA Version: 11.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 TITAN RTX On | 00000000:03:00.0 Off | N/A |
| 41% 31C P8 4W / 280W | 16MiB / 24219MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 TITAN RTX On | 00000000:04:00.0 Off | N/A |
| 41% 30C P8 11W / 280W | 6MiB / 24220MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 2085 G /usr/lib/xorg/Xorg 9MiB |
| 0 N/A N/A 2240 G /usr/bin/gnome-shell 4MiB |
| 1 N/A N/A 2085 G /usr/lib/xorg/Xorg 4MiB |
+-----------------------------------------------------------------------------+

```
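The choice between case (a) and case (b) can be sketched as a small shell helper. This is only an illustration and not part of the PR: the function names `parse_cuda_version` and `pick_icefall_dockerfile_dir` are hypothetical, and the sketch assumes the `CUDA Version:` field appears in the `nvidia-smi` header as in the sample output above.

```bash
# Hypothetical helpers (not part of icefall). They only define functions;
# nothing runs until you call them.

# Read nvidia-smi output on stdin and print the first "CUDA Version" value,
# e.g. "11.0" for the sample header shown above.
parse_cuda_version() {
  sed -n 's/.*CUDA Version: *\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Map a "major.minor" CUDA version to one of the two Dockerfile directories.
pick_icefall_dockerfile_dir() {
  version="${1:-}"
  # Accept exactly "<digits>.<digits>"; reject anything else.
  case "$version" in
    *[!0-9.]*|*.*.*|.*|*.) bad=1 ;;
    *.*) bad=0 ;;
    *) bad=1 ;;
  esac
  if [ "$bad" -eq 1 ]; then
    echo "expected a version like 11.3, got '$version'" >&2
    return 2
  fi
  major="${version%%.*}"
  minor="${version#*.}"
  if [ "$major" -gt 11 ] || { [ "$major" -eq 11 ] && [ "$minor" -ge 3 ]; }; then
    # Case (a): driver supports CUDA >= 11.3.
    echo "docker/Ubuntu18.04-pytorch1.12.1-cuda11.3-cudnn8"
  elif [ "$major" -eq 11 ]; then
    # Case (b): driver supports CUDA 11.0 <= x < 11.3.
    echo "docker/Ubuntu18.04-pytorch1.7.1-cuda11.0-cudnn8"
  else
    echo "CUDA $version is below 11.0; neither configuration applies" >&2
    return 1
  fi
}
```

On a machine with a GPU, something like `pick_icefall_dockerfile_dir "$(nvidia-smi | parse_cuda_version)"` would print the directory to `cd` into before building.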

## Building images locally
If your environment requires a proxy to access the Internet, remember to add that information to the Dockerfile directly.
In most cases, you can uncomment these lines in the Dockerfile and fill in your proxy details.

```dockerfile
ENV http_proxy=http://aaa.bb.cc.net:8080 \
https_proxy=http://aaa.bb.cc.net:8080
```

Then, proceed with these commands.

### If you are case (a), i.e. your NVIDIA driver supports CUDA version >= 11.3:

```bash
cd docker/Ubuntu18.04-pytorch1.12.1-cuda11.3-cudnn8
docker build -t icefall/pytorch1.12.1 .
```

### If you are case (b), i.e. your NVIDIA driver can only support CUDA versions 11.0 <= x < 11.3:
```bash
cd docker/Ubuntu18.04-pytorch1.7.1-cuda11.0-cudnn8
docker build -t icefall/pytorch1.7.1:latest -f ./Dockerfile ./
docker build -t icefall/pytorch1.7.1 .
```

## Using built images
Sample usage of the GPU based images:
## Running your built local image
Sample usage of the GPU based images. These commands are written with case (a) in mind, so please make the necessary changes to your image name if you are case (b).
Note: use [nvidia-docker](https://github.com/NVIDIA/nvidia-docker) to run the GPU images.

```bash
docker run -it --runtime=nvidia --name=icefall_username --gpus all icefall/pytorch1.7.1:latest
docker run -it --runtime=nvidia --shm-size=2gb --name=icefall --gpus all icefall/pytorch1.12.1
```

Sample usage of the CPU based images:
### Tips:
1. Since your data and models most probably won't be inside the container, you must use the -v flag to access the host machine. Do this by specifying `-v {/path/in/host/machine}:{/path/in/docker}` (host path first, then the path inside the container).

2. Also, if your environment requires a proxy, this would be a good time to add it in too: `-e http_proxy=http://aaa.bb.cc.net:8080 -e https_proxy=http://aaa.bb.cc.net:8080`.

Overall, your docker run command should look like this.

```bash
docker run -it --runtime=nvidia --shm-size=2gb --name=icefall --gpus all -v {/path/in/host/machine}:{/path/in/docker} -e http_proxy=http://aaa.bb.cc.net:8080 -e https_proxy=http://aaa.bb.cc.net:8080 icefall/pytorch1.12.1
```

You can explore more docker run options [here](https://docs.docker.com/engine/reference/commandline/run/) to suit your environment.
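
Docker's `-v` flag takes the host path first and the container path second, and a host path without a leading slash is treated as a named volume instead of a bind mount. The sketch below assembles the run command and prints it without executing it, so it can be checked first. The function name `icefall_run_cmd` is hypothetical, and the sketch assumes paths without spaces.

```bash
# Hypothetical helper (not part of icefall): print, but do not execute,
# a docker run command with a bind mount and optional proxy settings.
# Arguments: <absolute host path> <absolute container path> [image]
icefall_run_cmd() {
  host_dir="${1:-}"
  container_dir="${2:-}"
  image="${3:-icefall/pytorch1.12.1}"

  # Both sides of a bind mount should be absolute paths.
  case "$host_dir" in
    /*) ;;
    *) echo "host path must be absolute: '$host_dir'" >&2; return 1 ;;
  esac
  case "$container_dir" in
    /*) ;;
    *) echo "container path must be absolute: '$container_dir'" >&2; return 1 ;;
  esac

  cmd="docker run -it --runtime=nvidia --shm-size=2gb --name=icefall --gpus all"
  # Host path comes first, container path second.
  cmd="$cmd -v ${host_dir}:${container_dir}"

  # Forward proxy settings only if they are set in the calling shell.
  if [ -n "${http_proxy:-}" ]; then
    cmd="$cmd -e http_proxy=${http_proxy}"
  fi
  if [ -n "${https_proxy:-}" ]; then
    cmd="$cmd -e https_proxy=${https_proxy}"
  fi

  printf '%s %s\n' "$cmd" "$image"
}
```

For case (b), pass `icefall/pytorch1.7.1` as the third argument. Once the printed command looks right, it can be copied into the terminal and run.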

### Linking to icefall in your host machine

If you already have icefall downloaded onto your host machine, you can use that repository instead so that changes in your code are visible inside and outside of the container.

Note: Remember to set the -v flag above during the first run of the container, as that is the only way for your container to access your host machine.
Warning: Check that the icefall in your host machine is visible from within your container before proceeding to the commands below.

Use these commands once you are inside the container.

```bash
rm -r /workspace/icefall
ln -s {/path/in/docker/to/icefall} /workspace/icefall
```
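
The warning above can be turned into a guard, so the copy of icefall inside the image is only removed once the mounted copy is confirmed to be visible. This is a minimal sketch, not part of the PR: the function name `link_host_icefall` is hypothetical, and it uses `rm -rf` rather than `rm -r` so that a missing destination is not an error.

```bash
# Hypothetical helper (not part of icefall): replace the icefall checkout in
# the container with a symlink to the mounted one, but only if the mounted
# directory is actually visible.
# Arguments: <path in container to the mounted icefall> [link location]
link_host_icefall() {
  src="${1:-}"
  dest="${2:-/workspace/icefall}"

  if [ ! -d "$src" ]; then
    echo "error: '$src' is not visible here; was the container started with -v?" >&2
    return 1
  fi

  # Remove the existing checkout (or an older symlink), then create the link.
  rm -rf -- "$dest"
  ln -s -- "$src" "$dest"
}
```

Inside the container, `link_host_icefall {/path/in/docker/to/icefall}` would then stand in for the two commands above.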

## Starting another session in the same running container.
```bash
docker exec -it icefall /bin/bash
```

## Restarting a killed container that has been run before.
```bash
docker start -ai icefall
```
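
Which of the two commands above applies depends on whether the container is still running. The sketch below maps a container state to the matching command and prints it. The function name `icefall_reattach_cmd` is hypothetical, and the state string is assumed to come from `docker inspect -f '{{.State.Status}}' icefall`.

```bash
# Hypothetical helper (not part of icefall): given a container state string,
# print the command that opens a shell in that container.
# Arguments: <state> [container name]
icefall_reattach_cmd() {
  state="${1:-}"
  name="${2:-icefall}"

  case "$state" in
    running)
      # Container is up: open another session in it.
      echo "docker exec -it $name /bin/bash"
      ;;
    exited|created)
      # Container exists but is stopped: start it again, attached.
      echo "docker start -ai $name"
      ;;
    *)
      echo "unhandled container state: '$state'" >&2
      return 1
      ;;
  esac
}
```

A typical call would be `icefall_reattach_cmd "$(docker inspect -f '{{.State.Status}}' icefall)"`.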

## Sample usage of the CPU based images:
```bash
docker run -it icefall/pytorch1.7.1:latest /bin/bash
docker run -it icefall/pytorch1.12.1 /bin/bash
```
72 changes: 72 additions & 0 deletions docker/Ubuntu18.04-pytorch1.12.1-cuda11.3-cudnn8/Dockerfile
@@ -0,0 +1,72 @@
FROM pytorch/pytorch:1.12.1-cuda11.3-cudnn8-devel

# ENV http_proxy=http://aaa.bbb.cc.net:8080 \
# https_proxy=http://aaa.bbb.cc.net:8080

# install normal source
RUN apt-get update && \
apt-get install -y --no-install-recommends \
g++ \
make \
automake \
autoconf \
bzip2 \
unzip \
wget \
sox \
libtool \
git \
subversion \
zlib1g-dev \
gfortran \
ca-certificates \
patch \
ffmpeg \
valgrind \
libssl-dev \
vim \
curl

# cmake
RUN wget -P /opt https://cmake.org/files/v3.18/cmake-3.18.0.tar.gz && \
cd /opt && \
tar -zxvf cmake-3.18.0.tar.gz && \
cd cmake-3.18.0 && \
./bootstrap && \
make && \
make install && \
rm -rf cmake-3.18.0.tar.gz && \
find /opt/cmake-3.18.0 -type f \( -name "*.o" -o -name "*.la" -o -name "*.a" \) -exec rm {} \; && \
cd -

# flac
RUN wget -P /opt https://downloads.xiph.org/releases/flac/flac-1.3.2.tar.xz && \
cd /opt && \
xz -d flac-1.3.2.tar.xz && \
tar -xvf flac-1.3.2.tar && \
cd flac-1.3.2 && \
./configure && \
make && make install && \
rm -rf flac-1.3.2.tar && \
find /opt/flac-1.3.2 -type f \( -name "*.o" -o -name "*.la" -o -name "*.a" \) -exec rm {} \; && \
cd -

RUN pip install kaldiio graphviz && \
conda install -y -c pytorch torchaudio

#install k2 from source
RUN git clone https://github.com/k2-fsa/k2.git /opt/k2 && \
cd /opt/k2 && \
python3 setup.py install && \
cd -

# install lhotse
RUN pip install git+https://github.com/lhotse-speech/lhotse

RUN git clone https://github.com/k2-fsa/icefall /workspace/icefall && \
cd /workspace/icefall && \
pip install -r requirements.txt

ENV PYTHONPATH /workspace/icefall:$PYTHONPATH

WORKDIR /workspace/icefall
65 changes: 32 additions & 33 deletions docker/Ubuntu18.04-pytorch1.7.1-cuda11.0-cudnn8/Dockerfile
@@ -1,7 +1,13 @@
FROM pytorch/pytorch:1.7.1-cuda11.0-cudnn8-devel

# install normal source
# ENV http_proxy=http://aaa.bbb.cc.net:8080 \
# https_proxy=http://aaa.bbb.cc.net:8080

RUN rm /etc/apt/sources.list.d/cuda.list && \
rm /etc/apt/sources.list.d/nvidia-ml.list && \
apt-key del 7fa2af80

# install normal source
RUN apt-get update && \
apt-get install -y --no-install-recommends \
g++ \
@@ -21,20 +27,25 @@ RUN apt-get update && \
patch \
ffmpeg \
valgrind \
libssl-dev \
vim && \
rm -rf /var/lib/apt/lists/*


RUN mv /opt/conda/lib/libcufft.so.10 /opt/libcufft.so.10.bak && \
libssl-dev \
vim \
curl

# Add new keys and reupdate
RUN curl -fsSL https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/3bf863cc.pub | apt-key add - && \
curl -fsSL https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/7fa2af80.pub | apt-key add - && \
echo "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 /" > /etc/apt/sources.list.d/cuda.list && \
echo "deb https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 /" > /etc/apt/sources.list.d/nvidia-ml.list && \
rm -rf /var/lib/apt/lists/* && \
mv /opt/conda/lib/libcufft.so.10 /opt/libcufft.so.10.bak && \
mv /opt/conda/lib/libcurand.so.10 /opt/libcurand.so.10.bak && \
mv /opt/conda/lib/libcublas.so.11 /opt/libcublas.so.11.bak && \
mv /opt/conda/lib/libnvrtc.so.11.0 /opt/libnvrtc.so.11.1.bak && \
mv /opt/conda/lib/libnvToolsExt.so.1 /opt/libnvToolsExt.so.1.bak && \
mv /opt/conda/lib/libcudart.so.11.0 /opt/libcudart.so.11.0.bak
# mv /opt/conda/lib/libnvToolsExt.so.1 /opt/libnvToolsExt.so.1.bak && \
mv /opt/conda/lib/libcudart.so.11.0 /opt/libcudart.so.11.0.bak && \
apt-get update && apt-get -y upgrade

# cmake

RUN wget -P /opt https://cmake.org/files/v3.18/cmake-3.18.0.tar.gz && \
cd /opt && \
tar -zxvf cmake-3.18.0.tar.gz && \
@@ -45,11 +56,7 @@ RUN wget -P /opt https://cmake.org/files/v3.18/cmake-3.18.0.tar.gz && \
rm -rf cmake-3.18.0.tar.gz && \
find /opt/cmake-3.18.0 -type f \( -name "*.o" -o -name "*.la" -o -name "*.a" \) -exec rm {} \; && \
cd -

#kaldiio

RUN pip install kaldiio


# flac
RUN wget -P /opt https://downloads.xiph.org/releases/flac/flac-1.3.2.tar.xz && \
cd /opt && \
@@ -62,15 +69,8 @@ RUN wget -P /opt https://downloads.xiph.org/releases/flac/flac-1.3.2.tar.xz &&
find /opt/flac-1.3.2 -type f \( -name "*.o" -o -name "*.la" -o -name "*.a" \) -exec rm {} \; && \
cd -

# graphviz
RUN pip install graphviz

# kaldifeat
RUN git clone https://github.com/csukuangfj/kaldifeat.git /opt/kaldifeat && \
cd /opt/kaldifeat && \
python setup.py install && \
cd -

RUN pip install kaldiio graphviz && \
conda install -y -c pytorch torchaudio=0.7.1

#install k2 from source
RUN git clone https://github.com/k2-fsa/k2.git /opt/k2 && \
@@ -79,14 +79,13 @@ RUN git clone https://github.com/k2-fsa/k2.git /opt/k2 && \
cd -

# install lhotse
RUN pip install torchaudio==0.7.2
RUN pip install git+https://github.com/lhotse-speech/lhotse
#RUN pip install lhotse
RUN pip install git+https://github.com/lhotse-speech/lhotse

RUN git clone https://github.com/k2-fsa/icefall /workspace/icefall && \
cd /workspace/icefall && \
pip install -r requirements.txt

ENV PYTHONPATH /workspace/icefall:$PYTHONPATH

# install icefall
RUN git clone https://github.com/k2-fsa/icefall && \
cd icefall && \
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

ENV PYTHONPATH /workspace/icefall:$PYTHONPATH
WORKDIR /workspace/icefall