SINGULARITY: HPC CONTAINER
Most of us, at least those in IT or scientific computing, will have heard of Docker, a container platform for packaging code together with its dependencies and running applications inside containers.
What are Containers?
Containers are a lightweight virtualisation technology that lets you run an application in an isolated process, with its dependencies packaged alongside it. They are helpful for running applications across heterogeneous infrastructure, from a virtual machine in a data center to a virtual machine in a private or public cloud. Containers have recently also been widely adopted in research computing, because they offer a practical route to reproducibility.
Containers for HPC
Building and deploying software on high-end computing systems is a challenging task. HPC applications must be able to run across multiple environments while dealing with complicated hardware and software stack dependencies. Containers offer a more convenient way to manage these dependencies and to make applications portable across different environments.
With the recent rise of Deep Learning (DL), containerisation is growing in importance: each DL framework has many dependencies, specific version requirements and frequent updates. In addition, the best-supported OS across DL frameworks is Ubuntu, while RedHat/CentOS is what most data centers typically run. Containers help developers and administrators overcome these challenges.
Singularity
There are many containerisation technologies. The most popular is Docker; others include LXC (Linux Containers), LXD (a container manager built on top of LXC), Shifter and Singularity. Although Docker leads the market and is widely adopted in the community, it poses a serious security problem in an HPC environment. Most Docker images run as root, which gives the user superuser privileges inside the container; on a shared system, that user could then access other users' data from within the container.
Singularity, on the other hand, works from a different premise: it treats all users as untrusted. By default Singularity uses a setuid model, raising the effective permission set only when necessary and dropping it immediately afterwards. Containers therefore run as a regular user. Singularity can also use Docker container images directly. When it imports a Docker image, it removes any element that can only run as root, so that the resulting image can be run by a regular user.
We will be provisioning a GPU cluster in Q4 this year to support deep learning and machine learning research. Singularity will be used to manage the different DL frameworks and other analytics applications. Singularity is already available on HPC, where for now it can be used only for CPU-bound applications.
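As a rough illustration of the Docker import described above, here is a minimal sketch. It assumes Singularity 2.4 or later, network access to Docker Hub from the node you are on, and an arbitrary example image (ubuntu:16.04) and output file name (ubuntu.simg), neither of which comes from the HPC setup.

```shell
# Example Docker Hub image and local file name (both chosen only for illustration).
DOCKER_URI=docker://ubuntu:16.04
LOCAL_IMAGE=ubuntu.simg

if command -v singularity >/dev/null 2>&1; then
    # Convert the Docker image into a local Singularity image (needs network access).
    singularity pull --name "$LOCAL_IMAGE" "$DOCKER_URI"
    # Numeric user ID on the host ...
    id -u
    # ... and inside the container; the two are expected to match (no root inside).
    singularity exec "$LOCAL_IMAGE" id -u
else
    echo "singularity not found; load the singularity module first"
fi
```

If the two user IDs match, the container is running with your own account rather than as root.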
Using Singularity on HPC
Singularity is available as a module on HPC. HPC maintains a few Singularity images that users can run directly to execute their applications (e.g. TensorFlow). Users can also request that specific images be made available on HPC by writing to DataEngineering@nus.edu.sg. The walkthrough that follows is a simple example of running sample TensorFlow code on CPU.
SSH to the atlas8 login node and load the singularity module.
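A minimal sketch of this step, assuming the cluster uses Environment Modules and that the module is simply named singularity. The user ID is a placeholder, and the fully qualified host name of the login node is not given here, so the ssh line is left as a comment to adapt.

```shell
# From your workstation (placeholder user ID; add the domain your site uses):
#   ssh your_user_id@atlas8

# On the login node: load Singularity if the `module` command is available.
if command -v module >/dev/null 2>&1; then
    module load singularity
fi

# Confirm that the singularity binary is now on the PATH.
if command -v singularity >/dev/null 2>&1; then
    singularity --version
    SINGULARITY_READY=yes
else
    echo "singularity not found; check 'module avail singularity'"
    SINGULARITY_READY=no
fi
```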
The prebuilt Singularity images are available at /app1/centos6.3/gnu/singularity/2.5.2/images. For now, Ubuntu and tensorflow_cpu images are provided. You can also look inside a container to verify its OS and TensorFlow version.
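One way to inspect an image might look like the sketch below. The image file name tensorflow_cpu.simg is a guess based on the image name mentioned above, and the interpreter inside the image is assumed to be called python; check the directory listing and adjust both as needed.

```shell
# Directory of prebuilt images, as given in the text.
IMAGE_DIR=/app1/centos6.3/gnu/singularity/2.5.2/images
# Hypothetical file name: replace with the actual file shown by `ls`.
IMAGE="$IMAGE_DIR/tensorflow_cpu.simg"

if command -v singularity >/dev/null 2>&1 && [ -f "$IMAGE" ]; then
    # List the images that HPC maintains.
    ls "$IMAGE_DIR"
    # OS release of the container (not of the host).
    singularity exec "$IMAGE" cat /etc/os-release
    # TensorFlow version installed in the image.
    singularity exec "$IMAGE" python -c 'import tensorflow as tf; print(tf.__version__)'
else
    echo "Run this on the login node after loading the singularity module."
fi
```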
Working with Files inside Singularity Image
For all HPC-maintained images, both scratch spaces, /hpctmp and /hpctmp2, are mounted automatically along with the home directory. Users can therefore access all their files from within each container without transferring them. Here, I run my TensorFlow code, which sits in my hpctmp scratch folder, directly from the container. This code implements a convolutional neural network to classify the MNIST digits dataset.
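Running a script from scratch space could be sketched as follows. The image file name and the script path (mnist_cnn.py under a per-user folder in /hpctmp) are hypothetical placeholders, not the actual files used for this article.

```shell
# Hypothetical image file name under the HPC image directory.
IMAGE=/app1/centos6.3/gnu/singularity/2.5.2/images/tensorflow_cpu.simg
# Hypothetical script location in the user's scratch folder.
SCRIPT="/hpctmp/${USER:-$(id -un)}/mnist_cnn.py"

if command -v singularity >/dev/null 2>&1 && [ -f "$IMAGE" ] && [ -f "$SCRIPT" ]; then
    # /hpctmp is mounted automatically in HPC-maintained images,
    # so the host path can be used unchanged inside the container.
    singularity exec "$IMAGE" python "$SCRIPT"
else
    echo "Would run: singularity exec $IMAGE python $SCRIPT"
fi
```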
For custom images, users can bind directories with the --bind parameter, for example to make the /hpctmp directory visible inside the image.
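A bind mount for a custom image might be sketched like this. The image path is a hypothetical placeholder; the bind specification uses the host_path:container_path form accepted by --bind.

```shell
# Hypothetical custom image in the user's home directory.
CUSTOM_IMAGE="$HOME/my_custom.simg"
# host_path:container_path (same path on both sides here).
BIND_SPEC=/hpctmp:/hpctmp

if command -v singularity >/dev/null 2>&1 && [ -f "$CUSTOM_IMAGE" ]; then
    # List the scratch directory from inside the container to confirm the bind.
    singularity exec --bind "$BIND_SPEC" "$CUSTOM_IMAGE" ls /hpctmp
else
    echo "Would run: singularity exec --bind $BIND_SPEC $CUSTOM_IMAGE ls /hpctmp"
fi
```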
Singularity tries to mount both scratch directories, but it issues a warning if the corresponding directories are missing inside the image. Either way, users keep seamless access to their files and directories, with their usual access permissions unchanged.
Users can also execute Singularity through PBS, using a job script that runs the same TensorFlow code.
Users can reach out to DataEngineering@nus.edu.sg for any further support on Singularity.
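A PBS job script for this could look roughly like the sketch below. It assumes PBS Pro style resource requests; the job name, resource sizes, image file name and script path are all illustrative placeholders, and no queue is named because valid queue names are site specific.

```shell
#!/bin/bash
#PBS -N tf_mnist_cpu
#PBS -l select=1:ncpus=4:mem=8gb
#PBS -l walltime=01:00:00
#PBS -j oe
# Add a "#PBS -q <queue>" line with a queue name that is valid on your cluster.

# PBS starts the job in the home directory; move to where qsub was invoked.
cd "${PBS_O_WORKDIR:-$PWD}" || exit 1

# Hypothetical image file name and script path.
IMAGE=/app1/centos6.3/gnu/singularity/2.5.2/images/tensorflow_cpu.simg
SCRIPT="/hpctmp/${USER:-$(id -un)}/mnist_cnn.py"

# Load Singularity on the compute node if the module command is available.
if command -v module >/dev/null 2>&1; then
    module load singularity
fi

if command -v singularity >/dev/null 2>&1 && [ -f "$IMAGE" ] && [ -f "$SCRIPT" ]; then
    singularity exec "$IMAGE" python "$SCRIPT"
else
    echo "Would run: singularity exec $IMAGE python $SCRIPT"
fi
```

Saved under a name such as tf_mnist.pbs (again a placeholder), the script would be submitted from the login node with qsub tf_mnist.pbs.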