Docker (fMRIPrep): From Intro to Usage

Introduction to Docker

Setting up an environment can be painful even for an experienced programmer. There are different operating systems, such as macOS, Windows, and Linux. Linux alone has so many distributions that their underlying structures differ too. On top of that, the packages you use may have compatibility issues with the OS or with each other. This is where Docker comes in.

Docker is a containerization platform that allows you to run applications in an isolated environment, called a container. With Docker, you can package an application and all its dependencies into a single container, which can then be run on any platform that supports Docker without worrying about compatibility issues.

For neuroimaging scientists, Docker can be particularly useful for running neuroimaging software and tools, which often have complex dependencies and configurations. With Docker, you can easily share and distribute pre-configured containers that contain the necessary software and libraries for neuroimaging analysis. For example, it is much easier to use NiPreps software such as fMRIPrep with a container.

💡

You + Docker = Ready for Action!

Docker Basics

Docker works as a stack of layers. The server and OS at the bottom belong to the machine you are using. On top of them runs the Docker software, which manages multiple containers, each running a separate piece of software or project together with its dependencies.

Containers are isolated areas of an operating system with a resource usage limit applied. Isolated means that whatever your machine is, as long as it has Docker installed, you can run the project. So ideally you don’t need to buy a MacBook, a Windows PC, and a Linux machine!

Docker structure

Docker is built on top of many great open-source projects, and how Docker itself works is described below. None of this is required knowledge for an end user, though. As an end user, the Docker CLI is the only thing you need to learn, so feel free to go straight to the next part.

Docker CLI (Command Line Interface) is a tool that allows users to interact with Docker through a command-line interface. With the Docker CLI, developers can build container images, run containers, and manage Docker networks and volumes.

The Docker daemon is the core service that runs on the host machine and manages the lifecycle of containers. The Docker daemon is responsible for creating and managing containers, downloading and managing images from registries, and managing Docker networks and volumes.

containerd is an open-source container runtime that provides a simple and secure way to run containers on a variety of platforms, including Linux and Windows. containerd is integrated with Docker, and it provides a lower-level interface for container management than Docker itself. containerd is responsible for creating, starting, stopping, and deleting containers. It also manages images, network interfaces, and storage volumes used by containers.

Shim is a lightweight component that acts as a bridge between the container runtime and the container engine. It was introduced to allow for a separation of concerns between the container engine (such as Docker) and the container runtime (such as runc) to provide enhanced security and isolation. Shim launches containers on behalf of the container engine and then communicates with the runtime to create and manage the container.

runc is an open-source command-line tool used to create and run containers. It is a lightweight alternative to other container engines like Docker, but it only provides container runtime functionality. It is designed to be used as a low-level container runtime, providing all the necessary tools to manage containers, such as creating, starting, stopping, and deleting them.

Installation

I mainly work with Red Hat systems (Red Hat Enterprise Linux for HPC and Fedora for PC), so the following instructions might not be suitable for other distributions such as Ubuntu.

If you use a distribution optimized for scientific computing, you may already have Docker installed. Type docker --version in your terminal, and you should see something like:


Docker version 24.0.2, build cb74dfc

Remove moby-engine if it’s on your PC. It is built from Moby, the upstream project of Docker, but the two packages are not fully compatible.


sudo dnf remove moby-engine

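Add the Docker repository to your package manager:


sudo dnf -y install dnf-plugins-core
sudo dnf config-manager --add-repo https://download.docker.com/linux/fedora/docker-ce.repo
# On newer Fedora releases that ship DNF 5, the syntax should instead be:
# sudo dnf config-manager addrepo --from-repofile=https://download.docker.com/linux/fedora/docker-ce.repo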

Install the latest version of Docker CE and CLI:


sudo dnf install docker-ce docker-ce-cli containerd.io

Start and enable the Docker service:


sudo systemctl start docker
sudo systemctl enable docker
# to stop the service later: sudo systemctl stop docker

Verify the installation:


docker --version
# Docker version 24.0.2, build cb74dfc
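
A working docker --version only proves that the CLI is installed. It says nothing about whether the daemon is running or whether your user may talk to it. Here is a minimal sketch of a check that tells these cases apart. The function name docker_status is made up for this post and is not part of Docker.

```shell
# docker_status is a hypothetical helper, not a Docker command.
# It prints one of: cli-missing, daemon-unreachable, ready.
docker_status() {
    # Is the docker client on PATH at all?
    if ! command -v docker >/dev/null 2>&1; then
        echo "cli-missing"
        return 1
    fi
    # docker info has to talk to the daemon, so it fails when the service
    # is stopped or when the current user may not access the Docker socket.
    if ! docker info >/dev/null 2>&1; then
        echo "daemon-unreachable"
        return 2
    fi
    echo "ready"
}

docker_status || true
```

If it prints daemon-unreachable, starting the service with systemctl or fixing your group membership is usually the next thing to look at.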

Docker CLI

Docker CLI (Command Line Interface) is a tool that allows users to interact with the Docker platform using commands in the terminal or command prompt. It provides a convenient way to manage Docker containers, images, networks, and volumes. With Docker CLI, you can perform various tasks such as building, running, and managing Docker containers, as well as pushing and pulling images from Docker registries.

Container Commands


# list all running containers on your PC
docker container ps
## or
docker ps

# list all containers, including stopped ones
docker container ps -a
## or
docker ps -a

# stop a container
docker stop <id or name> # a unique prefix of the id is enough

# remove a (stopped) container
docker rm <id or name>

# remove all stopped containers
docker container prune
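
In practice stop and rm usually go together when you want to get rid of one specific container. Below is a small sketch of that, assuming a container name of your choice. Both run and cleanup_container are hypothetical helper names. By default the script only prints the commands; set DRY_RUN=0 to really execute them.

```shell
# run prints the command when DRY_RUN is 1 (the default here) and
# executes it otherwise. It is a helper for this example only.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Stop a container and then remove it. Errors are ignored so the helper
# also works when the container is already stopped or does not exist.
cleanup_container() {
    name="$1"
    run docker stop "$name" || true
    run docker rm "$name" || true
}

# The container name is a placeholder.
cleanup_container old_fmriprep
```

With the default dry run this prints the two docker commands, one per line, so you can check them before letting them loose.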

Image Commands


# list all images
docker images
docker image ls

# remove an image that is not used by any container
docker rmi <name>
docker image rm <name>

# remove dangling (untagged) images
docker image prune
# remove all images that are not used by any container
docker image prune -a

# download an image
docker pull <name> # latest
docker pull <name>:<version>

# push an image to a registry
docker push <name>
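
A pull without a version silently means the latest tag, which is not what you want for a reproducible analysis. The sketch below shows how an image reference splits into name and tag, using plain shell string handling. image_tag is a hypothetical helper, and it deliberately ignores digest references of the form name@sha256:...

```shell
# image_tag is a hypothetical helper: print the tag of an image
# reference, or "latest" when no tag is given (Docker's default).
image_tag() {
    ref="$1"
    # Keep only the last path component, e.g. "fmriprep:23.1.2".
    # This way a colon in a registry port is not mistaken for a tag.
    last="${ref##*/}"
    case "$last" in
        *:*) echo "${last##*:}" ;;
        *)   echo "latest" ;;
    esac
}

image_tag nipreps/fmriprep:23.1.2   # prints 23.1.2
image_tag nipreps/fmriprep          # prints latest
image_tag localhost:5000/mysql      # prints latest
```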

Docker Run commands

docker run is a command used to create and start a new container from a Docker image. It has several parameters that can be used to customize the container's behavior and configuration. For simple usage:


# create and start a new container from an image
# (add --rm to remove the container automatically when it exits)
docker run <name>

# run one command inside a container that is already running
docker exec <name or id> <command you want to run>

# show detailed information about a container
docker inspect <name or id>

Most of the time, a simple command won’t be enough. For example, you may want your container to read a local file, read an environment variable or publish on a port of localhost. You may need to specify one of the following parameters:

Mapping Parameters

-p or --publish: Publish a container's port(s) to the host. This maps a host port to a container port.

-v or --volume: Bind mount a volume. This is used to share data between the host and the container.


docker run -p host_port:container_port -v host_path:container_path image_name
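
One detail that bites people with -v: the host side has to be an absolute path, because a bare name such as data is treated as a named volume instead of a folder. Here is a small sketch that resolves a folder to an absolute path before building the mount argument. bind_mount_arg is a made-up helper, and the nginx example is only printed, not executed.

```shell
# bind_mount_arg is a hypothetical helper: print "host_path:container_path"
# with the host directory resolved to an absolute path.
bind_mount_arg() {
    host_dir="$1"
    container_dir="$2"
    # cd fails (and so does the helper) when the directory does not exist.
    abs_dir="$(cd "$host_dir" && pwd)" || return 1
    printf '%s:%s\n' "$abs_dir" "$container_dir"
}

# Example: serve the current directory with nginx on http://localhost:8080.
# echo only prints the command, so this is safe to run anywhere.
echo docker run --rm -p 8080:80 -v "$(bind_mount_arg . /usr/share/nginx/html)" nginx
```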

Set env

-e or --env: Set environment variables in the container.

--env-file: Read environment variables from a file.


docker run -e "ENV_VAR_NAME=value" --env-file env_file_path image_name
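
An env file is just a text file with one KEY=value pair per line, and lines starting with # are comments. As far as I know the values are taken literally, so leave out the quotes. The sketch below writes such a file to a temporary folder and prints the command that would use it. The variable names and values are invented for illustration.

```shell
# Write a small env file to a temporary directory.
env_dir="$(mktemp -d)"
cat > "$env_dir/app.env" <<'EOF'
# Lines starting with # are comments
SUBJECT=sub-1001
N_THREADS=8
EOF

# Show the file, then print (not run) a command that combines a single
# -e variable with the variables from the file.
cat "$env_dir/app.env"
echo docker run --rm -e "SESSION=ses-01" --env-file "$env_dir/app.env" image_name
```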

Handy Functions

-d or --detach: Run the container in the background and print the container ID.

--rm: Automatically remove the container when it exits. This is useful for cleaning up temporary containers.

--restart: Set the restart policy for the container. Options include "no", "on-failure", "always", and "unless-stopped".

Others

These are just a few of the many parameters available for docker run. You can find the full list of parameters in the official Docker documentation. Here I only list several:

--name: Assign a custom name to the container. This makes it easier to identify and manage the container.

--network: Connect the container to a specific network.


docker run --name my_container_name --network network_name image_name

-it: Run the container in interactive mode with a TTY attached (-i keeps STDIN open, -t allocates the terminal). This is useful for running a shell or other interactive applications.


docker run -it image_name
docker exec -it <name or id> <command you want to run> # e.g. bash

Docker Volume Command

💡

As mentioned before, a volume lets a Docker container access files on your local machine.

💡

This is an advanced usage of docker run -v.

Docker volumes are directories or file systems that exist outside of a container's file system, allowing for persistent and shared data storage. They come in two types: named volumes, managed by Docker, and bind mounts, which link specific directories on the host system to directories within the container.

With the docker volume command you can create a volume managed by Docker and then mount it into a container (a volume mount):


# List all volumes
docker volume ls
# Create a new volume
docker volume create data_volume
# Volume mount
docker run -v data_volume:/var/lib/mysql mysql
# Bind mount
docker run -v /data/mysql:/var/lib/mysql mysql
docker run --mount type=bind,source=/data/mysql,target=/var/lib/mysql mysql
# Delete a volume
docker volume rm data_volume
# Delete all unused volumes
docker volume prune
# Read-only mount (no spaces are allowed inside the --mount value)
docker container run --mount source=data_volume,destination=/var/lib/data,readonly httpd
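
To see why a named volume counts as persistent, it helps to write a file from one container and read it back from another. Here is a minimal sketch of that round trip, assuming the small alpine image. The helper names, the volume name and the file name are all made up. By default the commands are only printed; set DRY_RUN=0 to execute them.

```shell
# run prints the command when DRY_RUN is 1 (the default here) and
# executes it otherwise. Note that the printed form loses the quoting.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

volume_round_trip() {
    vol="$1"
    run docker volume create "$vol"
    # The first container writes a file into the volume and exits.
    run docker run --rm -v "$vol:/data" alpine sh -c 'echo hello > /data/note.txt'
    # A second, fresh container still sees the file (mounted read-only).
    run docker run --rm -v "$vol:/data:ro" alpine cat /data/note.txt
    # Clean up the volume again.
    run docker volume rm "$vol"
}

volume_round_trip demo_volume
```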

💡

Named volumes provide Docker-managed persistence, while bind mounts offer more flexibility and control over the volume's location.

Run fMRIPrep

OK, now you know the basics of Docker. This is not enough to develop a Docker app, but it is enough to use one.


# pull the fMRIPrep image
sudo docker pull nipreps/fmriprep:23.1.2
## If you are not a sudoer, you can ask the admin to add you
## to the 'docker' group. Check your groups with:
(base) [nij@neils-fedora ~]$ groups
nij wheel docker
# Once you are in the group, you can just run
docker pull nipreps/fmriprep:23.1.2

Now you can start using it. See? It’s very simple, just like git clone.

To run fMRIPrep, you can use a Python package called fmriprep-docker, which is a Python wrapper that builds the docker commands for you. Or you can use the docker command directly.

The following script runs the fMRI preprocessing for a single subject:


docker run --rm -e DOCKER_VERSION_8395080871=20.10.23 -it \
-v <bids_dir>/derivatives/license.txt:/opt/freesurfer/license.txt:Z \
-v <bids_dir>:/data:Z -v <bids_dir>/derivatives:/out nipreps/fmriprep:23.1.2 \
/data /out participant --participant-label sub-1001 --skip-bids-validation \
--md-only-boilerplate --fs-no-reconall --nthreads 25 \
--stop-on-first-crash --mem_mb 15000 --output-spaces MNI152NLin2009cAsym:res-1

The first three lines are Docker options (cleanup, environment, mounts, and the image name), which you should now be able to read. The last three lines are the command that runs inside the container, that is, the input parameters for fMRIPrep. We will get into the details another time, but they are fairly intuitive as you can see.
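
Once a single subject works, the natural next step is a loop over subjects. Below is a minimal sketch that wraps a shortened version of the command in a function. The paths and subject labels are placeholders, run and run_fmriprep are hypothetical helper names, and only a few of the fMRIPrep options from the command above are kept. By default the commands are only printed; set DRY_RUN=0 to execute them.

```shell
# Placeholder locations; override them through the environment.
BIDS_DIR="${BIDS_DIR:-/path/to/bids}"
OUT_DIR="${OUT_DIR:-$BIDS_DIR/derivatives}"
FS_LICENSE="${FS_LICENSE:-$OUT_DIR/license.txt}"

# run prints the command when DRY_RUN is 1 (the default here) and
# executes it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Preprocess one subject with the pinned fMRIPrep image.
run_fmriprep() {
    subject="$1"
    run docker run --rm \
        -v "$FS_LICENSE:/opt/freesurfer/license.txt:Z" \
        -v "$BIDS_DIR:/data:Z" \
        -v "$OUT_DIR:/out" \
        nipreps/fmriprep:23.1.2 \
        /data /out participant \
        --participant-label "$subject" \
        --fs-no-reconall
}

# Subject labels are examples only.
for subject in sub-1001 sub-1002; do
    run_fmriprep "$subject"
done
```

I left out -it here on purpose, since a batch loop has no need for an interactive terminal.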

Then you can just wait to read the output report. Cheers!

Tips

Docker Web

Docker is also a great platform for web applications. Many open-source projects are distributed as Docker images, so it is worth a look.

Docker Hub

The ‘GitHub’ of Docker. Search for the published images you need at hub.docker.com.

Docker Directory

The /var/lib/docker directory is the default location where Docker stores its data, including images, containers, volumes, and networks. This directory contains all the files and metadata Docker needs to function. If you started by uninstalling an older version of Docker and want a clean slate, remember to rm -rf this directory, but note that this permanently deletes every image, container, and volume stored there.
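
Before deleting anything, it is worth checking where your daemon actually keeps its data, since the location can be changed in the configuration. Here is a small sketch that asks the daemon and falls back to the default path when the daemon cannot be reached. docker_root_dir is a hypothetical helper name.

```shell
# docker_root_dir is a hypothetical helper: print the data directory
# reported by the daemon, or the default path if the daemon is unreachable.
docker_root_dir() {
    dir="$(docker info --format '{{.DockerRootDir}}' 2>/dev/null)" || dir=""
    if [ -n "$dir" ]; then
        echo "$dir"
    else
        echo "/var/lib/docker"
    fi
}

docker_root_dir
```

Running docker system df next to it shows how much space images, containers, and volumes take up.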

Debug Mode


# stop the service first: sudo systemctl stop docker
sudo dockerd # start the daemon manually in the foreground
sudo dockerd --debug # same, with debug logging

References

https://www.coursera.org/projects/docker-for-absolute-beginners