class: center, middle
Pablo Escobar Lopez
https://scicore-unibas-ch.github.io/singularity-slides
This work was done by sciCORE - Center for Scientific Computing at the University of Basel and the SIB Swiss Institute of Bioinformatics
Except where otherwise noted, this presentation is licensed under Creative Commons Attribution 4.0 International (CC BY 4.0).
.small[These slides are based on previous work by Alexander Kashev at http://www.scits.unibe.ch/]
- Tutorial intro
- The problem we're solving
- Virtual machines vs containers
- History of containers
- Docker vs Singularity
- Singularity workflow
- Examples
- Explain the basic technical concepts about containers
- Differences between virtual machines and containers: pros and cons
- Differences between Docker and Singularity: pros and cons
- Some practical examples with Singularity:
  - Execute Singularity containers
  - Create Singularity containers from scratch
  - Share Singularity containers
- Your laptop with:
  - VirtualBox and Vagrant
- Linux shell knowledge
- If you are using a Windows laptop, you also need Cygwin to get a bash shell. During the Cygwin installation, make sure to install all the ssh-related packages as in this screenshot:
class: center, middle
Suppose you're writing some software. It works great on your machine.
However, eventually it has to leave your machine to run on your colleague's machine, or be deployed in its production environment.
It could be a completely different flavour of OS, with a different set of libraries and supporting tools.
It can be difficult to test if you accounted for all these variations on your own development system. You may have things in your environment you're not even aware of which make a difference.
Your users may also lack the technical skills to deal with dependencies themselves. You may wish to decrease this friction.
Suppose you have a piece of scientific software you used to obtain some result.
Then someone halfway across the globe tries to reproduce it and can't get it to run, or worse, gets different results for the same inputs. What is to blame?
Or, even simpler: your group tries to use your software a couple of years after you left, and nobody can get it to work.
To enable reproducible science with the help of software packaging, just the source code might not be enough; the environment should also be predictable.
Suppose you have one hundred users, each requesting certain software.
Some of those applications have ancient dependencies that are very hard (or impossible) to compile from source on your system.

Some of those applications only work on Linux distribution "XXX": the developer said "it works on my computer" ;). Of course, your cluster runs a different distro.
Some of the software works with mutually-incompatible library versions. Possibly these have known security issues.
And finally, you most certainly don't trust any of this software not to mess up your OS. You've been there before.
- .highlight[A turnkey solution]: a recipe that can build a working instance of your software, reliably and fast.
- .highlight[BYOE: Bring Your Own Environment]: a way to capture the prerequisites and environment together with the software.
class: center, middle
A virtual machine is an isolated instance of a .highlight[whole other "guest" OS] running under your "host" OS.
A .highlight[hypervisor] is responsible for handling the situations where this isolation causes issues for the guest.
From the point of view of the guest, it runs under its own, dedicated hardware. Hence, it's called .highlight[hardware-level virtualization].
Most* guest/host OS combinations can run: you can run Windows on Linux, Linux on Windows, etc.

* macOS being a stinker here, of course, with its license.
.small[
- The BYOE principle is fully realized: whatever your environment is, you can package it fully, OS and everything.
- Easy to precisely measure out resources: the contained application, together with its OS, has restricted access to hardware: you measure out its disk, memory and allotted CPU.
- More flexible: more operating systems are supported.
- Security risks are truly minimized: the very narrow and secured bridge between the guest and the host means little opportunity for a bad actor to break out of isolation.
]
- Operational overhead: for every piece of software, the full underlying OS has to be run.
- Setup overhead: starting and stopping a virtual machine is not very fast, and/or requires saving its state.
- Hardware availability: the isolation between the host and the guest can hinder access to specialized hardware on the host system (e.g. GPUs).
- There are a lot of different hypervisors (e.g. Xen, KVM, VirtualBox); there is very little chance that you get a hypervisor installed in your HPC cluster.
- No integration with HPC schedulers (Slurm, SGE).
If your host OS is Linux and your software expects Linux, there's a more direct and lightweight way to reach similar goals.
Recent kernel advances allow the isolation of processes from the rest of the system, presenting them with their own view of the system.
You can package different Linux distributions: with the exception of the host kernel, the entire userland can be different for the process.
From the point of view of the application, it's running on the same hardware as the host, hence containers are called .highlight[operating-system-level virtualization].
image source: http://www.admin-magazine.com/HPC/Articles/Singularity-A-Container-for-HPC
- Lower operational overhead: you don't need to run a whole second OS to run an application.
- Lower startup overhead: setup and teardown of a container is much less costly.
- More hardware flexibility: you don't have to dedicate a set portion of memory to your VM well in advance, or contain your files in a fixed-size filesystem. Also, the level of isolation is up to you: you may present devices on the system directly to containers if you so desire (e.g. GPUs).
- Integration with HPC schedulers is trivial.
.small[
- Kernel compatibility: the kernel is shared between the host and the container, so there may be some incompatibilities. Plus, container support is (relatively) new, so it needs a recent kernel on the host.
- Security concerns: the isolation is thinner than in the VM case. The host OS kernel and hardware are directly exposed, so the attack surface is larger.
- Linux on Linux: containers are inherently a Linux technology. You need a Linux host (or a Linux VM) to run containers, and only Linux software can run.
]
The idea of running an application in a different environment is not new to UNIX-like systems.
Perhaps the first effort in that direction is the `chroot` command and concept (1982): presenting applications with a different view of the filesystem (a different root directory, `/`).

This minimal isolation was improved in FreeBSD with `jail` (2000), separating other resources (processes, users) and restricting how applications can interact with each other and the kernel.

Linux developed facilities for isolating processes and controlling their resource use: namespaces (2002) and cgroups (2007).

Those facilities led to the creation of containerization solutions, notably LXC (2008), Docker (2013) and Singularity (2016).
All recent container solutions (LXC, Docker, Singularity, etc.) use namespaces and cgroups to isolate the containers from the host.

These two technologies (namespaces and cgroups) are provided by the Linux kernel.

The container solution "only" provides an easy way for the end user to automate the creation of the required namespaces and cgroups. In the case of Docker, it's the docker daemon that does the job; in Singularity, it's a setuid binary.
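To make this concrete, here is a minimal sketch (not Singularity itself) using the `unshare` tool from util-linux, which requests new namespaces from the kernel directly, the same primitive the container runtimes automate:

```terminal
vagrant@vm:~$ sudo unshare --pid --fork --mount-proc /bin/bash
root@vm:~# ps aux   # inside the new PID namespace: bash is PID 1, host processes are invisible
root@vm:~# exit
```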
Docker came about in 2013 and since has been on a meteoric rise as a gold standard for containerization technology.
A huge number of tools are built around Docker to build, run, orchestrate and integrate Docker containers.
Many cloud service providers can directly integrate Docker containers.
Docker encourages splitting software into microservice chunks that can be portably used as needed.
Docker uses a pretty complicated model of images/volumes/metadata, and it is not always very transparent how these are stored.

Also, the isolation features require superuser privileges: Docker has a persistent daemon running with root privileges, and many container operations require root as well. In practice, .highlight[a user who can run Docker containers can gain root access].

Docker is mainly designed and used to run microservices such as web servers, databases or mail servers, not the command-line utilities most users run on an HPC cluster. Microservices also require orchestration tools like docker-compose, Docker Swarm or Kubernetes.
The user IDs inside and outside Docker containers are not easily mapped: processes running inside the container can run with a different UID than the user executing the container. Because of this, the generated files won't be accessible to the user "outside" the container.
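A quick way to see the problem for yourself, assuming Docker is installed on the host (image and file name are just examples):

```terminal
user@laptop:~$ docker run --rm -v "$PWD":/data ubuntu:16.04 touch /data/made-in-docker
user@laptop:~$ ls -l made-in-docker   # the file is owned by root on the host, not by you
```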
These concerns make Docker undesirable in multi-user, shared HPC environments.
Singularity is quite similar in principle to Docker. In fact, it's pretty straightforward to convert a Docker container to a Singularity image.
Singularity uses a monolithic, image-file based approach. Instead of dynamically overlaid "layers" of Docker, you have a single file you can build once and simply copy over to the target system.
The privilege problem was a concern from the ground up, and was solved by having a `setuid`-enabled binary that can accomplish container startup and drops privileges completely as soon as practical.
Privilege escalation inside a Singularity container is impossible (in theory ;)): to be root inside, you have to be root outside.

Users don't need explicit root access to execute existing containers.

Singularity automatically maps user IDs: all processes inside the container run with the same user ID as the user executing the container. This means that files generated inside the container will also have the right access permissions outside the container.

root is only required for the initial build of the container, but you can build in a virtual machine (as we will do in this course) and then move the container to your HPC cluster. .highlight[You don't need root to execute the containers]

If the container is already available in a registry (a container repository), you can download it directly to your HPC cluster without root permissions.
Thanks to the above improvements over Docker, HPC cluster operators are much more welcoming to the idea of Singularity support.
Once your software is packaged in Singularity, it should work across all Science IT platforms supporting the technology.
user@scicore:~$ sacct -j 86719 -o State,ExitCode,Elapsed,MaxRSS
State ExitCode Elapsed MaxRSS
---------- -------- ---------- ----------
COMPLETED 0:0 00:55:47
COMPLETED 0:0 00:55:47 31754776K (31.754776GB)
user@scicore:~$ sacct -j 86720 -o State,ExitCode,Elapsed,MaxRSS
State ExitCode Elapsed MaxRSS
---------- -------- ---------- ----------
COMPLETED 0:0 00:58:46
COMPLETED 0:0 00:58:46 31755136K (31.755136GB)
class: center, middle
.small[The general idea: prepare the container on a local machine where you have full control, then move it to the cluster and run it there without root. You can use scp or upload it to a container registry.

If you upload the container image to a registry like Docker Hub, Singularity Hub or a private registry, you can download the container directly to the cluster as a regular user. ]
.small[Every shell prompt is in format:]
USER @ MACHINE: WORKING_DIR
.small[ Some examples:]
.small[Username "user" in machine "laptop" in folder "" ( means $HOME)]
user@laptop:~$ mkdir ~/singularity-vm
.small[Username "vagrant" in machine "vm" (virtual machine) in folder "~"]
vagrant@vm:~$ id
.small[Username "root" in machine "vm" in folder "~/singularity-vm"]
root@vm:~/singularity-vm#
.small[In fact we are installing Singularity inside a VirtualBox VM running on your laptop. First boot an Ubuntu 16.04 VM and test that you can ssh into it and become root]
user@laptop:~$ mkdir ~/singularity-vm
user@laptop:~$ cd ~/singularity-vm
user@laptop:~/singularity-vm$ vagrant init ubuntu/xenial64
user@laptop:~/singularity-vm$ vagrant up
user@laptop:~/singularity-vm$ vagrant ssh
vagrant@vm:~$ id
uid=1000(vagrant) gid=1000(vagrant) groups=1000(vagrant)
vagrant@vm:~$ sudo -s
root@vm:~# id
uid=0(root) gid=0(root)
root@vm:~# exit
vagrant@vm:~$ exit
user@laptop:~/singularity-vm$
.small[Verify that the folder /vagrant inside the VM is mapped to the local folder ~/singularity-vm on your laptop
user@laptop:~$ cd ~/singularity-vm
user@laptop:~/singularity-vm$ touch ~/singularity-vm/dummy.file
user@laptop:~/singularity-vm$ vagrant ssh
vagrant@vm:~$ ls /vagrant
Vagrantfile dummy.file ubuntu-xenial-16.04-cloudimg-console.log
vagrant@vm:~$ touch /vagrant/dummy2.file
vagrant@vm:~$ exit
user@laptop:~/singularity-vm$ ls ~/singularity-vm
dummy2.file dummy.file ubuntu-xenial-16.04-cloudimg-console.log Vagrantfile
We will use this to copy files between the VM and the laptop. We can also use it to edit files inside the VM with our preferred editor running on our laptop.

.highlight[Don't generate your containers directly in /vagrant. Generate the containers in /home/vagrant and then copy them to /vagrant if needed] ]
Copy&Paste this in your laptop's terminal to install singularity in the VM
user@laptop:~$ cd ~/singularity-vm/
user@laptop:~/singularity-vm$ vagrant ssh -c /bin/bash <<EOF
sudo apt-get update
sudo apt-get -y install build-essential squashfs-tools curl git sudo man \
vim autoconf libtool emacs24-nox python bash-completion libarchive-dev
mkdir ~/src/ && cd ~/src/
wget https://github.com/singularityware/singularity/archive/2.6.0.tar.gz
tar xvf 2.6.0.tar.gz
cd singularity-2.6.0/
./autogen.sh
./configure --prefix=/usr/local
make
sudo make install
sudo ln -s /usr/local/etc/bash_completion.d/singularity /etc/profile.d/singularity.sh
EOF
Check the official docs if you want to create an RPM or DEB package for your distribution (not needed for this course)
If you followed the build instructions, you should now have `singularity` available in your VM:
user@laptop:~$ cd ~/singularity-vm/
user@laptop:~/singularity-vm$ vagrant ssh
vagrant@vm:~$ singularity --version
2.6.0-dist
The general format of Singularity commands is:
singularity [global options...] <command> [command options...] ...
Singularity is pretty sensitive to the order of those.
Use `singularity help [<command>]` to check the built-in help.
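For example (a sketch using the image we will build shortly; `--debug` is a global option and must come before the command, while command options like `--pwd` come after it):

```terminal
vagrant@vm:~$ singularity --debug exec centos-6.simg cat /etc/centos-release
vagrant@vm:~$ singularity exec --pwd /tmp centos-6.simg pwd
```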
You can find the configuration of Singularity under /usr/local/etc/singularity
A container needs to be somehow bootstrapped to contain a base operating system before further modifications can be made.
A container image is a filesystem tree that will be presented to the applications running inside it.
A Docker container is built with a series of layers that are stacked upon each other to form the filesystem. Singularity collapses those into a single, portable file.
We will do this exercise in the VM:
.small[
```terminal
user@laptop:~/singularity-vm$ vagrant ssh
vagrant@vm:~$ singularity pull docker://centos:6
WARNING: pull for Docker Hub is not guaranteed to produce the
WARNING: same image on repeated pull. Use Singularity Registry
WARNING: (shub://) to pull exactly equivalent images.
Docker image path: index.docker.io/library/centos:6
Cache folder set to /home/vagrant/.singularity/docker
[1/1] |===================================| 100.0%
Importing: base Singularity environment
Importing: /home/vagrant/.singularity/docker/sha256:ca9499a209fd9abe7f919a5c99989fd26d3410164e499fe577be504341f0a352.tar.gz
Importing: /home/vagrant/.singularity/metadata/sha256:b4d98049dd466efa5cfbf4b07aa672ced350fdbec3cd2faa139b194623df29ef.tar.gz
WARNING: Building container as an unprivileged user. If you run this container as root
WARNING: it may be missing some functionality.
Building Singularity image...
Singularity container built: ./centos-6.simg
Cleaning up...
```
]
.small[This will download the layers of the Docker container to your
machine and assemble them into an image.
Note that this .highlight[does not require `sudo` or Docker]!]
---
# Entering shell in the container
To test our freshly-created container, we can invoke an interactive shell
to explore it with .highlight[`shell`]:
```terminal
vagrant@vm:~$ singularity shell centos-6.simg
Singularity: Invoking an interactive shell within container...
Singularity centos-6.simg:~>
```

At this point, you're within the environment of the container.
We can verify we're "running" CentOS:
Singularity centos-6.simg:~> cat /etc/centos-release
CentOS release 6.10 (Final)
Inside the container, we are the same user:
Singularity centos-6.simg:~> whoami
vagrant
Singularity centos-6.simg:~> groups
vagrant
.small[We will also have the same groups, so if any host resources are mounted in the container, we'll have the same access privileges. You will also have access from outside the container to files generated inside it. As you can imagine, this is quite convenient when using Singularity on an HPC cluster.

One of the biggest problems when working with Docker is dealing with different user UIDs inside and outside the container, but this is "automagically" solved by Singularity, so you don't need to worry about it.

If we launch `singularity` with `sudo`, we would be `root` inside the container.
]
.small[Additionally, your home folder and the folder we've invoked Singularity from are accessible inside the container (by default):
Singularity centos-6.simg:~> ls ~
centos-6.simg src
We have access to the bound folders with the same rights as outside, so we can for example write to:
Singularity centos-6.simg:~> touch ~/test_container
Singularity centos-6.simg:~> exit
vagrant@vm:~$ ls ~/test_container
/home/vagrant/test_container
The current working directory inside the container is the same as outside at launch time.
You can mount custom folders inside the container using the `--bind`/`-B` flag:
vagrant@vm:~$ touch /var/tmp/file.dummy
vagrant@vm:~$ singularity shell -B /var/tmp:/var/tmp:rw centos-6.simg
Singularity: Invoking an interactive shell within container...
Singularity centos-6.simg:~> ls /var/tmp/
file.dummy
Besides the interactive shell, we can execute any command inside the container directly with .highlight[`exec`]:
vagrant@vm:~$ singularity exec centos-6.simg cat /etc/centos-release
CentOS release 6.9 (Final)
.exercise[ Invoke the `python` interpreter with `exec`. ]
Standard input/output are processed as normal by Singularity. You can redirect them:
vagrant@vm:~$ singularity exec centos-6.simg echo Boo! > ~/test_container
vagrant@vm:~$ singularity exec centos-6.simg cat < ~/test_container
Boo!
You can use containers in pipelines:
$ singularity exec centos-6.simg echo Boo! | singularity exec centos-6.simg cat
Boo!
.exercise[ Count the number of words in the host's `ls /etc` output using the container's copy of `wc`, then the other way around. Hint: `ls /etc | wc -w` ]
#!/bin/bash
module load Singularity/2.4
# note: use $HOME instead of ~, since ~ is not expanded inside quotes
RefGenome="$HOME/scicore-pipeline2-input-data/RefGenome/igenomes/Saccharomyces_cerevisiae/Ensembl/R64-1-1/Sequence"
echo 'SAM to sorted BAM...'
singularity exec pipeline2.simg samtools view -bS output/bowtie2/yeast_reseq_ds-bt2aln.sam | singularity exec pipeline2.simg samtools sort - --threads ${SLURM_CPUS_PER_TASK} -o output/samtools/yeast_reseq_ds-bt2aln.s.bam
echo 'calling variants with FreeBayes...'
singularity exec pipeline2.simg freebayes -f $RefGenome/WholeGenomeFasta/genome.fa output/samtools/yeast_reseq_ds-bt2aln.s.bam > output/freebayes/yeast_reseq_ds-bt2aln.s.vcf
.small[First create an empty writable image, 2048 MB in size, and initialize it with a CentOS 7 Docker image:
vagrant@vm:~$ sudo singularity image.create --size 2048 centos7.simg
vagrant@vm:~$ sudo singularity build --writable centos7.simg docker://centos:7
vagrant@vm:~$ sudo singularity shell --writable centos7.simg
Singularity centos7.simg:~> whoami
root
Let's get a missing application, `fortune`. It's not available in the default distribution, so we will need to enable a community repository and install it:
Singularity centos7.simg:~> yum -y --enablerepo=extras install epel-release
Singularity centos7.simg:~> yum -y install fortune-mod
.exercise[ Follow those steps, exit the container, and try out `fortune` with `exec`. ]
.small[Another approach is to create a sandbox for development. This is the approach recommended by the Singularity developers.

A sandbox is just a plain folder which contains all the files of the container.

One of the advantages of this approach is that you don't need to define a fixed image size in advance. ]
vagrant@vm:~$ sudo singularity build --sandbox centos7_sandbox docker://centos:7
vagrant@vm:~$ ls centos7_sandbox/
vagrant@vm:~$ sudo singularity shell --writable centos7_sandbox/
Singularity centos7_sandbox:~> whoami
root
Singularity centos7_sandbox:~> cat /etc/redhat-release
CentOS Linux release 7.4.1708 (Core)
A container can have a "default" command which is run without specifying it. Inside the container, it's `/singularity`. Let's try modifying it:
vagrant@vm:~$ sudo singularity exec -w centos7.simg vi /singularity
By default you'll see the following:
#!/bin/sh
exec /bin/bash
This is a script that will execute `/bin/bash`.
We installed `fortune`, so let's use that instead by modifying `/singularity` like this:
#!/bin/sh
exec /usr/bin/fortune "$@"
.exercise[ Make the same modification to your container. ]
Now we can invoke it with .highlight[`run`] or even by .highlight[running the image]:
vagrant@vm:~$ singularity run centos7.simg
[..some wisdom or humor..]
vagrant@vm:~$ ./centos7.simg
[..some more wisdom or humor..]
Instead of taking some base image and making changes to it by hand, we want to make this build process reproducible.
This is achieved with Singularity recipes (also called definition files).
Let's try to retrace our steps to obtain a fortune-telling CentOS container.
.exercise[ Open a file called "fortune.def" in an editor, and prepare to copy along ]
The definition file starts with a header section. The key part of it is the `Bootstrap:` configuration.
There are currently two types of bootstrap methods:

- using `yum`/`apt`/`pacman` on the host system to bootstrap a similar one
- pulling a Docker image

We'll be using the Docker method.
Bootstrap: docker
From: centos:7
There are two sections for setup commands (essentially shell scripts):

- %setup, for commands to be executed .highlight[outside the container]. You can use `$SINGULARITY_ROOTFS` to access the container's filesystem, as it is mounted on the host during the build.
- %post, for commands to be executed .highlight[inside] the container. This is a good place to set up the OS, such as installing packages.
Let's save the name of the build host and install `fortune`:
Bootstrap: docker
From: centos:7
%setup
hostname -f > $SINGULARITY_ROOTFS/etc/build_host
%post
yum -y --enablerepo=extras install epel-release
yum -y install fortune-mod
An additional section, %files, allows you to copy files or folders to the container. We won't be using it here, but the format is as follows (like `cp`, but the destination is inside the container):
%files
some/file /some/other/file some/path/
some/directory some/path/
Note that this happens after %post. If you need the files earlier, copy them manually in %setup.
You can specify a script to be sourced when something is run in the container. This goes in the %environment section. Treat it like `.bash_profile`.
%environment
export HELLO=World
All the environment is defined in profile files in the folder /.singularity.d/env/ inside the container.

Note that the host environment variables are passed on by default as well, unless `-e | --cleanenv` is specified.
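A small illustration of both behaviours, once an image with the `%environment` above has been built (variable names are arbitrary, output shown indicatively):

```terminal
vagrant@vm:~$ SECRET=42 singularity exec fortune.simg env | grep -e HELLO -e SECRET
HELLO=World
SECRET=42
vagrant@vm:~$ SECRET=42 singularity exec --cleanenv fortune.simg env | grep SECRET   # no output
```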
.small[Feel free to explore the contents of the folder /.singularity.d/ ;)]
The runscript (`/singularity`) is specified in the %runscript section. Let's use the file we copied at %setup and run `fortune`:
%runscript
read host < /etc/build_host
echo "Hello, $HELLO! Fortune Teller, built by $host"
exec /usr/bin/fortune "$@"
You can specify commands to be run at the end of the build process, inside the container, to perform sanity checks. Use the %test section for this:
%test
test -f /etc/build_host
test -f /usr/bin/fortune
All commands must return successfully or the build will fail.
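The checks run automatically at the end of `singularity build`. If your Singularity version provides the `test` subcommand, you should also be able to re-run them on a finished image (a hedged sketch):

```terminal
vagrant@vm:~$ singularity test fortune.simg
```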
Bootstrap: docker
From: centos:7
%setup
hostname -f > $SINGULARITY_ROOTFS/etc/build_host
%post
yum -y --enablerepo=extras install epel-release
yum -y install fortune-mod
%environment
export HELLO="World"
%runscript
read host < /etc/build_host
echo "Hello, $HELLO! Fortune Teller, built by $host"
exec /usr/bin/fortune "$@"
%test
test -f /etc/build_host
test -f /usr/bin/fortune
.exercise[ Check that your `fortune.def` is the same as above. ]
Build the container using the definition (.highlight[requires root]):
vagrant@vm:~$ sudo singularity build fortune.simg fortune.def
.exercise[
- Create the container as shown above.
- Test running it directly.
]
Download this Singularity definition file
.exercise[ Build this container and try to execute the STAR and htseq-count utilities using "singularity exec" ]
A Singularity container can have more host resources exposed.
To provide access to more directories, one can specify bind options at runtime with `-B`:

-B source[:destination[:mode]]

where .highlight[source] is the path on the host, .highlight[destination] is the path in the container (if different) and .highlight[mode] is optionally `ro` if you don't want to give write access.
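For example, to expose a (hypothetical) host folder of reference data read-only under a different path inside the container:

```terminal
vagrant@vm:~$ singularity shell -B /data/genomes:/refs:ro centos-6.simg
Singularity centos-6.simg:~> ls /refs
```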
Sysadmins can define default mounts in the global Singularity config file (e.g. always mount /local_scratch to /tmp inside the container).

Additionally, devices on the host can be exposed, e.g. the GPU; see the sketch below.

OpenMPI should also work; check the official docs.
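For the GPU case, recent Singularity versions provide the `--nv` flag, which binds the host's NVIDIA driver libraries into the container (image and script names here are hypothetical):

```terminal
user@cluster:~$ singularity exec --nv tensorflow.simg python train.py
```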
Quoting the official docs: "Can I run X11 apps through Singularity? Yes. This works exactly as you would expect it to."

Assuming you have X forwarding enabled in your current ssh session, you only need to launch the graphical application inside the container.
By default, a container is allowed a lot of "windows" into the host system (dictated by Singularity configuration).
For an untrusted container, you can further restrict this with options like `--contain` or `--containall`.

In this case, you have to manually define where standard binds like the home folder or /tmp point.

See `singularity help run` for more information.
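A sketch of running an untrusted image with maximal isolation, manually pointing the home bind at a scratch folder (`-H source:destination` overrides the home mount):

```terminal
vagrant@vm:~$ mkdir /tmp/fake-home
vagrant@vm:~$ singularity shell --containall -H /tmp/fake-home:/home/vagrant centos-6.simg
```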
.small[This is an example of what a Slurm submit script could look like]
#!/bin/bash
#SBATCH --job-name=using-singularity
#SBATCH --cpus-per-task=8
#SBATCH --mem=10G
module load Singularity/2.4
singularity exec pipeline2.simg samtools view -bS yeast_reseq_ds-bt2aln.sam | singularity exec pipeline2.simg samtools sort - --threads ${SLURM_CPUS_PER_TASK} -o yeast_reseq_ds-bt2aln.s.bam
.small[This is the same job without Singularity]
#!/bin/bash
#SBATCH --job-name=without-containers
#SBATCH --cpus-per-task=8
#SBATCH --mem=10G
module load SAMtools/1.7-goolf-1.7.20
samtools view -bS yeast_reseq_ds-bt2aln.sam | samtools sort - --threads ${SLURM_CPUS_PER_TASK} -o yeast_reseq_ds-bt2aln.s.bam
.small[The Scientific Filesystem (SCIF) provides internal modularity of containers, and makes it easy for the creator to give the container implied metadata about its software.

For example, metadata allows defining the list of applications available inside the container, so the user doesn't need to know the details of the container internals.

In this example I am listing all the utilities available in a container named "pescobar-OpenStructure-Singularity-master.simg".

This container can be downloaded from Singularity Hub. ]
$ singularity apps pescobar-OpenStructure-Singularity-master.simg
chemdict_tool
lddt
molck
ost
tmalign
tmscore
.exercise[ Download this Singularity definition file and modify it so "singularity apps" lists the available applications in the container (STAR and htseq-count). ]
Hint: you have to add some %apprun lines, as sketched below. See the docs.
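A sketch of what those lines could look like in the definition file (the app names are labels of your choice; each one gets its own %apprun section):

```
%apprun star
    exec STAR "$@"

%apprun htseq
    exec htseq-count "$@"
```

After rebuilding, `singularity apps <image>` should list the apps, and `singularity run --app star <image>` runs the corresponding one.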
Singularity 2.4 introduced the ability to run container instances, allowing you to run services (e.g. Nginx, MySQL, etc.) using Singularity.
.exercise[ Read the docs about Singularity instances and do the hello world exercise to boot an Nginx webserver with Singularity. ]
Using the container after creation on another Linux machine is simple: you just copy the image file there. You can use any file transfer tool, e.g. scp, ftp or even a USB stick.

Note that you can't just run the image file on a host without Singularity installed!
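For example, with scp (hostname and paths are placeholders):

```terminal
user@laptop:~$ scp fortune.simg user@cluster.example.org:~/containers/
```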
Singularity Hub allows you to build your containers in the cloud from definition files, which you can then simply `pull` on a target host.

This requires a GitHub repository with a Singularity definition file.

After creating an account on Singularity Hub and connecting it to your GitHub account, you can select a repository and the branches from GitHub to be built.
.small[Example:
https://github.com/pescobar/OpenStructure-Singularity
https://www.singularity-hub.org/collections/607 ]
Once your container is in Singularity Hub you can pull the result:
vagrant@vm:~$ singularity pull shub://pescobar/OpenStructure-Singularity
vagrant@vm:~$ ./pescobar-OpenStructure-Singularity-master-latest.simg
You can check the official docs to learn how to upload your images to the official Singularity Hub.

You can also host your own private Singularity Hub.

Optional exercise: upload your first container to Singularity Hub.
Instead of writing a Singularity file, you may write a `Dockerfile`, build a Docker container and convert that.
Pros:
- More portable: for some, using Docker or some other container solution is preferable.

Cons:
- Blackbox: Singularity understands less about the build process, in terms of container metadata.
- Complexity: another tool to learn if you don't know Docker already.
If you have a Docker image you want to convert to Singularity, you have at least two options:

- Upload the image to a Docker registry (such as Docker Hub) and `pull` from there.
- Convert locally with Docker and docker2singularity (see the sketch below):
  https://github.com/singularityware/docker2singularity https://hub.docker.com/r/vanessa/singularity/
Another work in progress:
https://github.com/dctrud/docker2singularity-go
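A sketch of the documented docker2singularity invocation (Docker must be running locally; the output folder is an example):

```terminal
user@laptop:~$ docker run -v /var/run/docker.sock:/var/run/docker.sock \
    -v ~/containers:/output --privileged -t --rm \
    singularityware/docker2singularity ubuntu:16.04
```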
.exercise[ Execute a docker container from BioContainers project ]
vagrant@vm:~$ singularity pull docker://biocontainers/blast
vagrant@vm:~$ singularity exec blast.simg blastx -h
Check this Singularity definition file
.exercise[ Download and build the Singularity/Conda container ]
.exercise[ Extend the container to install some of your preferred bioinfo tools from BioConda ]
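If you prefer to start from scratch, here is a minimal sketch of a Singularity/Conda definition (the base image and the BioConda tool are assumptions, not the course's file):

```
Bootstrap: docker
From: continuumio/miniconda3

%post
    # hypothetical example: install a BioConda tool into the container
    /opt/conda/bin/conda install -y -c bioconda -c conda-forge samtools

%runscript
    exec /opt/conda/bin/samtools "$@"
```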
EasyBuild is a tool to build scientific software. By building your software inside a Singularity container, you keep full control over which dependencies your tool is using.

See an example container here

Optional exercise: build this container and use EasyBuild inside the container to build an application
.exercise[ Try to convert one of your pipelines to Singularity ]
https://singularity-tutorial.github.io/
https://sci-f.github.io/tutorials