Micromamba not initializing correctly #359

Closed
jolespin opened this issue Aug 22, 2023 · 7 comments

@jolespin

Here's my current image building from this environment. Here's the repo for reference: https://github.com/jolespin/veba

install/docker/Dockerfile_test

# v2023.8.22
# =================================
FROM mambaorg/micromamba:1.4.9

ARG ENV_NAME

#SHELL ["bash", "-c"]
SHELL ["/usr/local/bin/_dockerfile_shell.sh"]

WORKDIR /tmp/

# Data
USER root
RUN mkdir -p /volumes/
RUN mkdir -p /volumes/input
RUN mkdir -p /volumes/output
RUN mkdir -p /volumes/database

# Retrieve VEBA repository
RUN mkdir -p veba/
USER $MAMBA_USER
COPY --chown=$MAMBA_USER:$MAMBA_USER ./install/ veba/install/
COPY --chown=$MAMBA_USER:$MAMBA_USER ./src/ veba/src/
COPY --chown=$MAMBA_USER:$MAMBA_USER ./VERSION veba/VERSION
COPY --chown=$MAMBA_USER:$MAMBA_USER ./LICENSE veba/LICENSE

# RUN echo "channel_priority: flexible" > ~/.condarc && \
#     echo "channels:" >> ~/.condarc && \
#     echo "  - conda-forge" >> ~/.condarc && \
#     echo "  - bioconda" >> ~/.condarc && \
#     echo "  - jolespin" >> ~/.condarc && \
#     echo "  - defaults" >> ~/.condarc && \
#     echo "  - qiime2" >> ~/.condarc && \
#     echo "report_errors: true" >> ~/.condarc

RUN micromamba install -y -n base -f veba/install/environments/${ENV_NAME}.yml
RUN micromamba clean -a -y -f
# =================================
#RUN micromamba shell init --shell bash --root-prefix=~/micromamba

# Add environment scripts to environment bin
RUN cp -rf veba/src/* /opt/conda/bin/
RUN ln -sf /opt/conda/bin/scripts/*.py /opt/conda/bin/
RUN ln -sf /opt/conda/bin/scripts/*.r /opt/conda/bin/

# Add conda bin to path
ENV PATH=/opt/conda/bin:$PATH


ENTRYPOINT ["/usr/local/bin/_entrypoint.sh"]

The image builds without error. Here is the command I use:

docker build --build-arg ENV_NAME=VEBA-preprocess_env -t jolespin/veba_preprocess:1.2.0-test -f install/docker/Dockerfile_test .

When I try to run the image to check some basics, it doesn't look like it's properly initialized:

(base) $ docker run --name VEBA-preprocess --rm -it ${DOCKER_IMAGE}  -c "echo $PATH"
/usr/local/bin/_entrypoint.sh: line 24: /tmp/echo /Users/jespinoz/anaconda3/bin:/Users/jespinoz/anaconda3/condabin:/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/munki:/opt/X11/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/local/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/appleinternal/bin: No such file or directory
(base) $ docker run --name VEBA-preprocess --rm -it ${DOCKER_IMAGE}  -c "micromamba -h"
/usr/local/bin/_entrypoint.sh: line 24: exec: micromamba -h: not found
(base) $ docker run --name VEBA-preprocess --rm -it ${DOCKER_IMAGE}  -c "which python"
/usr/local/bin/_entrypoint.sh: line 24: exec: which python: not found
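
A note on the first error above: the message lists macOS host directories (/Users/jespinoz/anaconda3/bin and so on), which indicates that $PATH inside double quotes was expanded by the local shell before docker run started. The sketch below reproduces that effect without Docker. WHERE is a hypothetical variable, and a child sh process stands in for the container.

```shell
# WHERE is a made-up variable; the child `sh` stands in for the container.
WHERE=host

# Double quotes: the outer shell substitutes $WHERE first, so the child
# receives the literal command string "echo host".
env WHERE=container sh -c "echo $WHERE"    # prints: host

# Single quotes: the string reaches the child untouched, and the child
# expands $WHERE from its own environment.
env WHERE=container sh -c 'echo $WHERE'    # prints: container
```

By the same reasoning, single quotes around echo $PATH would be needed to see the container's PATH rather than the host's.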

For my image, I'm doing the following:

  • Making a specific directory in which I can mount my volumes
  • Copying over the relevant "install" files and scripts
  • Installing the dependencies in the environment yml into the base environment
  • Copying over the scripts into the base environment bin/
  • Symlinking some of the scripts to the parent location in bin/

My goal is to be able to run the container interactively like this:

# Version
VERSION=1.2.0-test

# Image
DOCKER_IMAGE="jolespin/veba_preprocess:${VERSION}"

# Test it
docker run --name VEBA-preprocess --rm -it ${DOCKER_IMAGE}  -c "preprocess.py -h"

# Interactive
docker run --name VEBA-preprocess --rm -it ${DOCKER_IMAGE}  -c "bash"

However, the main use will be to mount volumes and deploy on personal computers and AWS:

# Directories
LOCAL_WORKING_DIRECTORY=$(pwd)
LOCAL_WORKING_DIRECTORY=$(realpath -m ${LOCAL_WORKING_DIRECTORY})
LOCAL_OUTPUT_PARENT_DIRECTORY=../
LOCAL_OUTPUT_PARENT_DIRECTORY=$(realpath -m ${LOCAL_OUTPUT_PARENT_DIRECTORY})
LOCAL_DATABASE_DIRECTORY=${VEBA_DATABASE} # /path/to/VEBA_DATABASE/
LOCAL_DATABASE_DIRECTORY=$(realpath -m ${LOCAL_DATABASE_DIRECTORY})

CONTAINER_INPUT_DIRECTORY=/volumes/input/
CONTAINER_OUTPUT_DIRECTORY=/volumes/output/
CONTAINER_DATABASE_DIRECTORY=/volumes/database/

# Parameters
ID=S1
R1=Fastq/${ID}_1.fastq.gz
R2=Fastq/${ID}_2.fastq.gz
NAME=VEBA-preprocess__${ID}
RELATIVE_OUTPUT_DIRECTORY=veba_output/preprocess/

# Command
CMD="preprocess.py -1 ${CONTAINER_INPUT_DIRECTORY}/${R1} -2 ${CONTAINER_INPUT_DIRECTORY}/${R2} -n ${ID} -o ${CONTAINER_OUTPUT_DIRECTORY}/${RELATIVE_OUTPUT_DIRECTORY} -x ${CONTAINER_DATABASE_DIRECTORY}/Contamination/chm13v2.0/chm13v2.0"

# Docker

# Run
docker run \
    --name ${NAME} \
    --rm \
    --volume ${LOCAL_WORKING_DIRECTORY}:${CONTAINER_INPUT_DIRECTORY}:ro \
    --volume ${LOCAL_OUTPUT_PARENT_DIRECTORY}:${CONTAINER_OUTPUT_DIRECTORY}:rw \
    --volume ${LOCAL_DATABASE_DIRECTORY}:${CONTAINER_DATABASE_DIRECTORY}:ro \
    ${DOCKER_IMAGE} \
    -c "${CMD}"

@wholtz
Member

wholtz commented Aug 22, 2023

Hello @jolespin. Thank you for the detailed report.

I think there is some confusion about what command the image is executing by default. _entrypoint.sh performs an exec "$@" and does not do /bin/bash "$@". Therefore, I think you want to do:

$ docker run --name VEBA-preprocess --rm -it ${DOCKER_IMAGE}  /bin/bash -c 'echo $PATH'
$ docker run --name VEBA-preprocess --rm -it ${DOCKER_IMAGE}  micromamba -h
$ docker run --name VEBA-preprocess --rm -it ${DOCKER_IMAGE}  which python

Please let me know if that does or does not allow you to reach your goal.
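
To illustrate the difference described above, here is a minimal sketch that mimics an entrypoint ending in exec "$@", without needing Docker. fake_entrypoint is a hypothetical stand-in, not the real _entrypoint.sh, and the subshell is only there to keep exec from replacing the calling shell.

```shell
# Hypothetical stand-in for an entrypoint whose last step is `exec "$@"`.
# The parentheses run it in a subshell so the calling shell survives.
fake_entrypoint() {
    ( exec "$@" )
}

# The first argument is the program and the rest are its arguments.
fake_entrypoint echo hello                     # prints: hello

# Naming a shell explicitly hands the command string to that shell.
fake_entrypoint sh -c 'echo hello from sh'     # prints: hello from sh

# No shell is named here, so nothing interprets -c as "run this string";
# exec fails and the subshell exits with a non-zero status.
fake_entrypoint -c 'echo hello' 2>/dev/null || echo "failed with status $?"
```

In Bash specifically, exec accepts its own -c option (run the command with an empty environment), which is consistent with the "exec: micromamba -h: not found" message in the original report: the whole quoted string was taken as the program name.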

@jolespin
Author

Thanks for getting back to me so quickly. It's working now. I'm still learning Docker, so please forgive any basics I might be missing.

Can you explain why this works:

docker run --name VEBA-preprocess --rm -it ${DOCKER_IMAGE}  which python

but this doesn't work?

docker run --name VEBA-preprocess --rm -it ${DOCKER_IMAGE}  "which python"

@wholtz
Member

wholtz commented Aug 22, 2023

That is due to how your shell parses commands into individual arguments. Quoting "which python" makes the full string a single argument. As a result, the exec in the entrypoint looks for an executable literally named which python (space included), and that doesn't exist.

@maresb
Collaborator

maresb commented Aug 22, 2023

Ya. In contrast, if you replace "which python" with bash -c "which python" it will work again, at the cost of an extra Bash subprocess, because Bash will split the argument "which python".

I only understood what was really happening once I looked at shlex.split and shlex.join. It shows really simply in a Pythonic way how POSIX shells parse arguments. https://docs.python.org/3/library/shlex.html
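
The same splitting can be observed directly in the shell. In this sketch, show_args is a hypothetical helper that prints each argument it receives on its own bracketed line, so the argument boundaries are visible.

```shell
# Hypothetical helper: print every received argument inside brackets.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

show_args which python        # two arguments: [which] and [python]
show_args "which python"      # one argument:  [which python]

# Running the quoted string as a command name fails, because no program
# is called "which python" (with the space).
( "which python" ) 2>/dev/null || echo "no such command"

# Handing the same string to a shell lets that shell split it again.
sh -c "echo split again"      # prints: split again
```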

@jolespin
Author

Here's my updated Dockerfile:

# v2023.8.22
# =================================
FROM mambaorg/micromamba:1.4.9

ARG ENV_NAME

SHELL ["/usr/local/bin/_dockerfile_shell.sh"]

WORKDIR /tmp/

# Data
USER root
RUN mkdir -p /volumes/
RUN mkdir -p /volumes/input
RUN mkdir -p /volumes/output
RUN mkdir -p /volumes/database

# Retrieve VEBA repository
RUN mkdir -p veba/
USER $MAMBA_USER
COPY --chown=$MAMBA_USER:$MAMBA_USER ./install/ veba/install/
COPY --chown=$MAMBA_USER:$MAMBA_USER ./src/ veba/src/
COPY --chown=$MAMBA_USER:$MAMBA_USER ./VERSION veba/VERSION
COPY --chown=$MAMBA_USER:$MAMBA_USER ./LICENSE veba/LICENSE

# Install dependencies
RUN micromamba install -y -n base -f veba/install/environments/${ENV_NAME}.yml && \
    micromamba clean -a -y -f

# Add environment scripts to environment bin
RUN cp -rf veba/src/* /opt/conda/bin/ && \
    ln -sf /opt/conda/bin/scripts/*.py /opt/conda/bin/ && \
    ln -sf /opt/conda/bin/scripts/*.r /opt/conda/bin/


ENTRYPOINT ["/usr/local/bin/_entrypoint.sh"]

Building it like this:

docker build --build-arg ENV_NAME=VEBA-preprocess_env -t jolespin/veba_preprocess:1.2.0 -f install/docker/Dockerfile .

I'm running it like this:

# Version
VERSION=1.2.0

# Image
DOCKER_IMAGE="jolespin/veba_preprocess:${VERSION}"

# Directories
LOCAL_WORKING_DIRECTORY=$(pwd)
LOCAL_WORKING_DIRECTORY=$(realpath -m ${LOCAL_WORKING_DIRECTORY})
LOCAL_OUTPUT_PARENT_DIRECTORY=../
LOCAL_OUTPUT_PARENT_DIRECTORY=$(realpath -m ${LOCAL_OUTPUT_PARENT_DIRECTORY})
LOCAL_DATABASE_DIRECTORY=${VEBA_DATABASE} # /path/to/VEBA_DATABASE/
LOCAL_DATABASE_DIRECTORY=$(realpath -m ${LOCAL_DATABASE_DIRECTORY})

CONTAINER_INPUT_DIRECTORY=/volumes/input/
CONTAINER_OUTPUT_DIRECTORY=/volumes/output/
CONTAINER_DATABASE_DIRECTORY=/volumes/database/

# Parameters
ID=S1
R1=Fastq/${ID}_1.fastq.gz
R2=Fastq/${ID}_2.fastq.gz
NAME=VEBA-preprocess__${ID}
RELATIVE_OUTPUT_DIRECTORY=veba_output/preprocess/

# Command
CMD="preprocess.py -1 ${CONTAINER_INPUT_DIRECTORY}/${R1} -2 ${CONTAINER_INPUT_DIRECTORY}/${R2} -n ${ID} -o ${CONTAINER_OUTPUT_DIRECTORY}/${RELATIVE_OUTPUT_DIRECTORY} -x ${CONTAINER_DATABASE_DIRECTORY}/Contamination/chm13v2.0/chm13v2.0"

# Docker

# Run
docker run \
    --name ${NAME} \
    --rm \
    --volume ${LOCAL_WORKING_DIRECTORY}:${CONTAINER_INPUT_DIRECTORY}:ro \
    --volume ${LOCAL_OUTPUT_PARENT_DIRECTORY}:${CONTAINER_OUTPUT_DIRECTORY}:rw \
    --volume ${LOCAL_DATABASE_DIRECTORY}:${CONTAINER_DATABASE_DIRECTORY}:ro \
    ${DOCKER_IMAGE} \
    ${CMD}

Hope this helps someone figure out how to use Docker.

Your help was GREATLY appreciated today. Can't thank you enough for creating this and being helpful in running it.

Looking forward to rebuilding all my images w/ micromamba.
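
A note on the final ${CMD} line of the run script above: leaving the variable unquoted lets the shell split it on whitespace into separate arguments, which is what an exec "$@" entrypoint needs. The sketch below shows that splitting and one caveat. count_args and the shortened example commands are illustrative assumptions, not part of the script above.

```shell
# Hypothetical helper: report how many arguments were received.
count_args() {
    echo "$#"
}

# Shortened, illustrative command string.
CMD="preprocess.py -n S1 -o /volumes/output/veba_output"

count_args ${CMD}       # unquoted: split on whitespace, prints 5
count_args "${CMD}"     # quoted: one single argument, prints 1

# Caveat: unquoted splitting also breaks a path that contains a space.
CMD_WITH_SPACE="preprocess.py -o /volumes/my output"
count_args ${CMD_WITH_SPACE}    # prints 4, the path became two arguments

# One portable alternative: keep the arguments in the positional
# parameters, where each quoted item stays intact.
set -- preprocess.py -o "/volumes/my output"
count_args "$@"         # prints 3
```

None of the paths in the thread contain spaces, so the unquoted ${CMD} form works as posted.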

@jolespin
Author

Does the ENV_NAME variable I created cause any conflict since ENV_NAME is used internally?

@maresb
Collaborator

maresb commented Aug 23, 2023

Probably not, but for simplicity I'd recommend using the base environment unless you have a particular reason to use multiple environments in your container.
