Wednesday 16 March 2016

Using docker on HPC

NERSC have a new project called “shifter”, which allows docker images to run on their HPC systems (edison.nersc.gov and cori.nersc.gov).

docker

  • What is docker?
    Docker is a virtualisation system, but unlike traditional VM systems, which emulate a complete machine, docker uses “containers”, which are more lightweight and share the kernel of the host OS. It’s probably better explained on the docker website, but basically, that’s it.
    Docker images can be layered, i.e. you build up new images on the basis of existing images. For example, a base image might be a plain Ubuntu installation, and then you might add a layer with some extra packages installed, then another layer with your application software installed.

  • Why do I want it?
    In a word: consistency. Suppose I have a complex package, with loads of dependencies (e.g. FEniCS!) - it is much easier to define a docker image with everything specified exactly, rather than some installation instructions which will probably fail, depending on the particular weirdness of the machine you are trying to install on.
    That is a real advantage for HPC systems, which are notoriously diverse and difficult to build on.
    Only one problem: until now, I couldn’t run docker on HPC.

shifter

So the guys at NERSC in Berkeley, CA have come up with a system to load docker images on HPC nodes. It is also a repo on github.

Cool.

Actually, it looks really complex, and I’m glad I didn’t have to figure it out - but what is it like to use?

First you have to download your image from dockerhub onto the HPC machine.

module load shifter
shifterimg -v pull docker:image_name

This will result in some output like:

Message: {
  "ENTRY": "MISSING", 
  "ENV": "MISSING", 
  "WORKDIR": "MISSING", 
  "groupAcl": [], 
  "id": "MISSING", 
  "itype": "docker", 
  "last_pull": "MISSING", 
  "status": "INIT", 
  "system": "cori", 
  "tag": [], 
  "userAcl": []
}

You can keep on running “shifterimg -v pull docker:image_name” and “status” will cycle through “INIT”, “PULLING”, “CONVERSION” and finally “READY”. I’ve no idea where it stores the image, that is a mystery…
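
If you don’t fancy re-typing the pull command by hand, the polling can be scripted. This is only a minimal sketch: the helper names image_status and wait_for_image and the POLL_INTERVAL variable are made up for illustration and are not part of shifter, and it assumes the “status” field keeps the format shown in the output above.

```shell
# Hypothetical helpers (not part of shifter): poll the pull until it is READY.

# Read shifterimg's message on stdin and print the value of its "status" field.
image_status() {
    sed -n 's/.*"status": *"\([^"]*\)".*/\1/p' | tail -n 1
}

# Usage: wait_for_image docker:image_name [max_tries]
# Re-runs the pull command and returns 0 once the status is READY,
# or 1 if it is still not READY after max_tries attempts.
wait_for_image() {
    image="$1"
    max_tries="${2:-20}"
    try=1
    while [ "$try" -le "$max_tries" ]; do
        status=$(shifterimg -v pull "$image" 2>&1 | image_status)
        echo "try ${try}: status=${status:-unknown}"
        if [ "$status" = "READY" ]; then
            return 0
        fi
        # Seconds between attempts; POLL_INTERVAL is an assumed knob, default 30.
        sleep "${POLL_INTERVAL:-30}"
        try=$((try + 1))
    done
    return 1
}
```

After “module load shifter”, something like “wait_for_image docker:image_name 40” would then block until the image is usable, or give up after 40 attempts.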

OK, so now the image is “READY”, what next? Well, as on all HPC systems, you have to submit to the queue. I’m not going to repeat what is on the NERSC page, but they don’t show any examples of how to run with MPI. This is possible as follows in a SLURM script:

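#!/bin/bash
# Image to make available on the compute nodes (placeholder name)
#SBATCH --image=docker:repo/image:latest
#SBATCH --nodes=2
#SBATCH -p debug
#SBATCH -t 00:05:00
# Start mpirun inside the container: 32 MPI processes across the 2 nodes
shifter --image=docker:repo/image:latest mpirun -n 32 python demo.py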

… which runs a 32-core MPI demo nicely on 2 nodes.

Of course, one of the nice things about this is that, because a docker image contains a complete local filesystem, you avoid the penalty for loading python libraries in parallel that is seen on most HPC systems.