Introduction to Docker
Setting up an environment can be painful even for an experienced programmer. There are different operating systems, such as macOS, Windows, and Linux. Linux alone has so many distributions that their underlying structures differ as well. On top of that, the packages you use may have compatibility issues with the OS or with each other. This is where Docker comes in.
Docker is a containerization platform that allows you to run applications in an isolated environment, called a container. With Docker, you can package an application and all its dependencies into a single container, which can then be run on any platform that supports Docker without worrying about compatibility issues.
For neuroimaging scientists, Docker can be particularly useful for running neuroimaging software and tools, which often have complex dependencies and configurations. With Docker, you can easily share and distribute pre-configured containers that contain the necessary software and libraries for neuroimaging analysis. For example, it is much easier to use NiPreps software such as fMRIPrep with a container.
You + Docker = Ready for Action!
Docker Basics
Docker works in the layered logic shown above. The server and OS belong to the machine you are using. Docker is the Docker software itself. It manages multiple containers, each of which runs a separate piece of software or project together with its dependencies.
A container is an isolated area of an operating system with a resource usage limit applied. Isolated means that, whatever your machine is, you can run a project as long as Docker is installed. So ideally you don’t need to buy a MacBook, a Windows machine, and a Linux machine!
Docker structure
Docker is built on top of many great open-source projects, and how Docker itself works is outlined below. None of this is required knowledge for an end user, though. As an end user, the Docker CLI is the only thing you need to learn, so feel free to skip to the next part.
- Docker CLI (Command Line Interface) is a tool that allows users to interact with Docker through a command-line interface. With the Docker CLI, developers can build container images, run containers, and manage Docker networks and volumes.
- The Docker daemon is the core service that runs on the host machine and manages the lifecycle of containers. The Docker daemon is responsible for creating and managing containers, downloading and managing images from registries, and managing Docker networks and volumes.
- containerd is an open-source container runtime that provides a simple and secure way to run containers on a variety of platforms, including Linux and Windows. containerd is integrated with Docker, and it provides a lower-level interface for container management than Docker itself. containerd is responsible for creating, starting, stopping, and deleting containers. It also manages images, network interfaces, and storage volumes used by containers.
- Shim is a lightweight component that acts as a bridge between the container runtime and the container engine. It was introduced to allow for a separation of concerns between the container engine (such as Docker) and the container runtime (such as runc) to provide enhanced security and isolation. Shim launches containers on behalf of the container engine and then communicates with the runtime to create and manage the container.
- runc is an open-source command-line tool used to create and run containers. Unlike a full container engine such as Docker, it only provides container runtime functionality. It is designed to be used as a low-level container runtime, providing the necessary tools to manage containers, such as creating, starting, stopping, and deleting them.
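On a Linux host you can glimpse this layering by looking at the running processes. The sketch below assumes the usual process names of a standard Docker CE install (dockerd, containerd, containerd-shim-runc-v2) and that pgrep is available; component_status is a helper name I made up. runc is not in the list because it exits once the container has been started.

```shell
# Report which Docker components are currently running on this host.
# Process names are assumptions based on a standard Docker CE install on Linux.
component_status() {
    # $1 = label to print, remaining arguments = pgrep options and pattern
    label=$1
    shift
    if pgrep "$@" >/dev/null 2>&1; then
        echo "$label: running"
    else
        echo "$label: not running"
    fi
}

component_status "dockerd" -x dockerd
component_status "containerd" -x containerd
# One shim per running container; pgrep matches the (truncated) process name.
component_status "containerd-shim" containerd-shim
```

With Docker running and at least one container started, all three lines should typically report "running".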
Installation
I mainly work with Red Hat systems (Red Hat Enterprise Linux for HPC and Fedora on my PC), so the following instructions might not suit other distributions such as Ubuntu.
- If you use a distribution optimized for scientific computing, you may already have Docker installed. Type
docker --version
in your terminal, and you should see something like:
Docker version 24.0.2, build cb74dfc
- Remove
moby-engine
if it’s on your PC. It is the upstream project of Docker, but the two packages are not fully compatible.
sudo dnf remove moby-engine
- Add the docker source to your package manager
sudo dnf -y install dnf-plugins-core
sudo dnf config-manager --add-repo https://download.docker.com/linux/fedora/docker-ce.repo
- Install the latest version of Docker CE and CLI:
sudo dnf install docker-ce docker-ce-cli containerd.io
- Start and enable the Docker service:
sudo systemctl start docker
sudo systemctl enable docker
# to stop the service later: sudo systemctl stop docker
- Verify the installation:
docker --version # Docker version 24.0.2, build cb74dfc
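If a later command fails, it helps to know whether the CLI is missing or the daemon is simply not running. Here is a minimal sketch of such a check; docker_status is a hypothetical helper name, and it assumes that docker info fails when it cannot reach the daemon.

```shell
# Report whether Docker is usable from this shell.
docker_status() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "missing"              # the docker CLI is not on PATH
    elif ! docker info >/dev/null 2>&1; then
        echo "daemon-unreachable"   # CLI found, but it cannot talk to the daemon
    else
        echo "ready"
    fi
}

docker_status
```

If you get "daemon-unreachable", the service may not be started yet, or your user may lack permission to access it.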
Docker CLI
Docker CLI (Command Line Interface) is a tool that allows users to interact with the Docker platform using commands in the terminal or command prompt. It provides a convenient way to manage Docker containers, images, networks, and volumes. With Docker CLI, you can perform various tasks such as building, running, and managing Docker containers, as well as pushing and pulling images from Docker registries.
Container Commands
# list all running containers on your PC
docker container ps
## or
docker ps
# list all containers, including stopped ones
docker container ps -a
## or
docker ps -a
# stop a container
docker stop <id or name> # a unique prefix of the id is enough
# remove a container
docker rm <id or name>
# remove all stopped containers
docker container prune
Image Commands
# list all images
docker images
docker image ls
# remove an image that is not used by any container
docker rmi <name>
docker image rm <name>
# remove dangling images (add -a to remove every image not used by any container)
docker image prune
# download an image
docker pull <name> # latest
docker pull <name>:<version>
# push an image
docker push <name>
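Pulling an image without a version is shorthand for pulling its latest tag. As a small sketch of that rule (with_default_tag is a hypothetical helper, not part of the Docker CLI):

```shell
# Print an image reference with the tag Docker assumes when none is given.
# with_default_tag is a made-up helper for illustration only.
with_default_tag() {
    ref=$1
    last=${ref##*/}    # text after the final slash, e.g. "fmriprep:23.1.2"
    case "$last" in
        *:*|*@*) echo "$ref" ;;            # already has a tag or a digest
        *)       echo "${ref}:latest" ;;   # no tag: Docker falls back to "latest"
    esac
}

with_default_tag nipreps/fmriprep          # prints nipreps/fmriprep:latest
with_default_tag nipreps/fmriprep:23.1.2   # prints nipreps/fmriprep:23.1.2
```

For reproducible analyses it is usually safer to write the version explicitly, since latest can point to a different image over time.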
Docker Run commands
docker run
is a command used to create and start a new container from a Docker image. It has several parameters that can be used to customize the container's behavior and configuration. For simple usage:
# run the image's default command in a new container
# (add --rm to remove the container when it exits)
docker run <name>
# run one command in a running container
docker exec <name or id> <command you want to run>
# check the information of a container
docker inspect <name or id>
Most of the time, a simple command won’t be enough. For example, you may want your container to read a local file, read an environment variable or publish on a port of localhost. You may need to specify one of the following parameters:
Mapping Parameters
-p or --publish: Publish a container's port(s) to the host. This maps a host port to a container port.
-v or --volume: Bind mount a volume. This is used to share data between the host and the container.
docker run -p host_port:container_port -v host_path:container_path image_name
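To make the placeholders concrete, here is a sketch that fills them in for the public nginx image, which serves /usr/share/nginx/html on port 80. publish_and_mount is a hypothetical helper that only prints the command so you can check it before running it; it assumes paths without spaces.

```shell
# Print a `docker run` command that publishes a port and bind-mounts a directory.
# publish_and_mount is a made-up helper; it prints the command, it does not run Docker.
publish_and_mount() {
    host_port=$1; container_port=$2; host_dir=$3; container_dir=$4; image=$5
    # Resolve the host directory to an absolute path for the bind mount.
    abs_dir=$(cd "$host_dir" && pwd) || return 1
    echo "docker run --rm -p ${host_port}:${container_port} -v ${abs_dir}:${container_dir} ${image}"
}

# Serve the current directory with nginx on http://localhost:8080
publish_and_mount 8080 80 . /usr/share/nginx/html nginx
```

The host side always comes first in both mappings: host port before container port, host path before container path.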
Set env
-e or --env: Set environment variables in the container.
--env-file: Read environment variables from a file.
docker run -e "ENV_VAR_NAME=value" --env-file env_file_path image_name
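An env file is plain text with one KEY=value pair per line. The sketch below writes one to a temporary file; the variable names and values are made up for illustration, and the final docker command is left as a comment because it needs a running daemon.

```shell
# Write a small env file in the KEY=value format that --env-file reads.
# The variable names and values are made up for illustration.
env_file=$(mktemp)
cat > "$env_file" <<'EOF'
# lines starting with # are treated as comments
SUBJECT_ID=sub-1001
N_THREADS=4
EOF

# Show the variables a container would receive from this file.
grep -v '^#' "$env_file"

# With Docker running, the file can be passed like this:
#   docker run --rm --env-file "$env_file" -e "EXTRA_FLAG=1" alpine env
```

Keeping settings in a file like this makes it easier to rerun the same container later with identical variables.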
Handy Functions
-d or --detach: Run the container in the background and print the container ID.
--rm: Automatically remove the container when it exits. This is useful for cleaning up temporary containers.
--restart: Set the restart policy for the container. Options include "no", "on-failure", "always", and "unless-stopped".
Others
These are just a few of the many parameters available for docker run. You can find the full list of parameters in the official Docker documentation. Here I only list several:
--name: Assign a custom name to the container. This makes it easier to identify and manage the container.
--network: Connect the container to a specific network.
docker run --name my_container_name --network network_name image_name
-it (short for -i -t): Run the container in interactive mode with a TTY attached. This is useful for running a shell or other interactive applications.
docker run -it image_name
docker exec -it <name or id> <command you want to run>
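Putting several of these flags together, the sketch below walks through the life of one container. The name demo_web and the nginx image are placeholders of mine. By default the run helper only prints each command, which is a cautious habit when trying out new flags; set DRY_RUN=0 to really execute them.

```shell
# Walk through the life of a container: start, check, run a command, stop, remove.
# By default the commands are only printed; set DRY_RUN=0 to execute them.
# The container name "demo_web" and the nginx image are placeholders.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

run docker run -d --name demo_web -p 8080:80 nginx   # start in the background
run docker ps --filter name=demo_web                 # confirm it is running
run docker exec demo_web nginx -v                    # run one command inside it
run docker stop demo_web                             # stop it
run docker rm demo_web                               # remove the stopped container
```

Because the container has a name, every later command can refer to demo_web instead of a generated id.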
Docker Volume Command
As mentioned before, volumes allow a container to access files on your local machine. This section covers more advanced usage of docker run -v.
Docker volumes are directories or file systems that exist outside of a container's file system, allowing for persistent and shared data storage. They come in two types: named volumes, managed by Docker, and bind mounts, which link specific directories on the host system to directories within the container.
With the docker volume command you can create a volume inside Docker and then attach it to a container as a volume mount:
# List all volumes
docker volume ls
# Create a new volume
docker volume create data_volume
# Volume mount
docker run -v data_volume:/var/lib/mysql mysql
# Bind mount
docker run -v /data/mysql:/var/lib/mysql mysql
docker run --mount type=bind,source=/data/mysql,target=/var/lib/mysql mysql
# Delete a volume
docker volume rm data_volume
# Delete unused volumes (recent Docker versions only remove anonymous ones unless you add -a)
docker volume prune
# Read-only mount
docker container run --mount source=data_volume,destination=/var/lib/data,readonly httpd
Named volumes provide Docker-managed persistence, while bind mounts offer more flexibility and control over the volume's location.
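The --mount value is a comma-separated list of key=value pairs with no spaces in between, which is easy to get wrong by hand. A minimal sketch that assembles it from its parts (mount_spec is a hypothetical helper, and the paths are the ones used in this post):

```shell
# Build the value for `docker run --mount` from its parts.
# mount_spec is a made-up helper; it only assembles the string.
mount_spec() {
    kind=$1; src=$2; dst=$3; mode=${4:-rw}
    spec="type=${kind},source=${src},target=${dst}"
    if [ "$mode" = "ro" ]; then
        spec="${spec},readonly"    # mount read-only inside the container
    fi
    echo "$spec"
}

mount_spec volume data_volume /var/lib/data ro
# prints type=volume,source=data_volume,target=/var/lib/data,readonly

# Usage with Docker:
#   docker run --rm --mount "$(mount_spec volume data_volume /var/lib/data ro)" httpd
```

The type is volume for a Docker-managed named volume and bind for a directory on the host.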
Run fMRI prep
OK, now you know the basics of Docker. They are not enough for developing a Docker app, but they are enough for using one.
# pull the fmriprep image
sudo docker pull nipreps/fmriprep:23.1.2
## If you are not a sudoer,
## you can ask the admin to add you to the 'docker' group, for instance:
(base) [nij@neils-fedora ~]$ groups
nij wheel docker
# Once you are in, you can just run
docker pull nipreps/fmriprep:23.1.2
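If you are unsure whether you are already in the group, a quick check could look like the sketch below. in_docker_group is a helper name I made up, and the usermod line is only printed as a hint for your admin; a new group membership normally takes effect after you log in again.

```shell
# Check whether the current user belongs to the "docker" group,
# which allows running docker without sudo.
in_docker_group() {
    id -nG | tr ' ' '\n' | grep -qx docker
}

if in_docker_group; then
    echo "OK: you are in the docker group"
else
    echo "Not in the docker group; an admin can add you with:"
    echo "  sudo usermod -aG docker $(id -un)"
fi
```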
Now you can start using it. See? It’s very simple, just like git clone. To run fMRIPrep, you can use a Python package called fmriprep-docker, which is a Python wrapper around the docker commands. Or you can use the docker command directly. The following script runs the fMRI preprocessing for a single subject:
docker run --rm -e DOCKER_VERSION_8395080871=20.10.23 -it \
-v <bids_dir>/derivatives/license.txt:/opt/freesurfer/license.txt:Z \
-v <bids_dir>:/data:Z -v <bids_dir>/derivatives:/out nipreps/fmriprep:23.1.2 \
/data /out participant --participant-label sub-1001 --skip-bids-validation \
--md-only-boilerplate --fs-no-reconall --nthreads 25 \
--stop-on-first-crash --mem_mb 15000 --output-spaces MNI152NLin2009cAsym:res-1
The top three lines are Docker options, which you should now be able to read. The last three lines are the command to run in the container, that is, the input parameters for fmriprep. We will go into the details another time (they are relatively intuitive, as you can see). Then you can just wait for the output report. Cheers!
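For more than one subject, the same command can be generated once per subject folder. The sketch below is a trimmed-down variant under my own assumptions: it keeps only a few of the flags shown above, drops -it so it can run unattended, assumes paths without spaces, and only prints the commands. fmriprep_commands and the demo directory are hypothetical.

```shell
# Print one fMRIPrep command per subject folder found in a BIDS directory.
# Nothing is executed; review the output, then run the lines you want.
fmriprep_commands() {
    bids_dir=$1
    for subject_dir in "$bids_dir"/sub-*/; do
        [ -d "$subject_dir" ] || continue    # no sub-* folders: print nothing
        label=$(basename "$subject_dir")
        echo "docker run --rm" \
            "-v $bids_dir/derivatives/license.txt:/opt/freesurfer/license.txt:Z" \
            "-v $bids_dir:/data:Z -v $bids_dir/derivatives:/out" \
            "nipreps/fmriprep:23.1.2 /data /out participant" \
            "--participant-label $label --fs-no-reconall"
    done
}

# Example with a throwaway directory holding two empty subject folders.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub-1001" "$demo_dir/sub-1002"
fmriprep_commands "$demo_dir"
```

Printing first and running later makes it easy to spot a wrong path before hours of preprocessing start.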
Tips
Docker Web
Docker is also a great platform for web applications. Many open-source projects are shared as Docker images, and they are worth a look.
Docker Hub
‘GitHub’ for Docker. Search for the published images you need at
hub.docker.com
Docker Directory
The /var/lib/docker directory is the default location where Docker stores its data, including images, containers, volumes, and networks. This directory contains all the necessary files and metadata required for Docker to function properly. If you started by uninstalling an older version of Docker, remember to rm -rf this directory. Be aware that this permanently deletes all existing images, containers, and volumes.
Debug Mode
sudo dockerd # to start manually
sudo dockerd --debug # debug mode