EFFICIENT EXECUTION OF MASS SERIAL JOBS ON NUS HPC
Prof. Tenen’s group (https://www.csi.nus.edu.sg/ws/team/pi/daniel-g-tenen) and Prof. Chow’s group (https://www.csi.nus.edu.sg/ws/team/pi/edward-kai-hua-chow) at the Cancer Science Institute of Singapore have a long history and distinguished record of deciphering the mechanisms underlying tumorigenesis and tumor progression in liver cancer. Our group collaborates with these two groups to target the druggable proteins they have identified, using computational approaches that shed light on new therapeutics. Given the known 3D structures of the proteins and libraries of organic compounds, molecular docking calculations are performed to select compounds with strong binding affinities that are potential inhibitors of the protein targets. This is one of the virtual screening applications used to narrow down candidates from a large number of compounds, so that the effort and cost invested in biomedical assays are spent more efficiently.
In this technical article, I would like to share my experience and tips on using the computational resources provided by NUS HPC. Our project is to dock ~50,000 organic compounds to the proteins of interest. From a computational point of view, this means ~50,000 serial docking calculations, for which we adopted the serial version of the docking software AutoDock4. The tips presented here therefore suit users who need to run a large number of serial applications on NUS HPC at the same time.
NUS HPC uses PBS Pro as its queuing system and provides a range of queue types for different end-user applications (as shown in https://nusit.nus.edu.sg/services/hpc/getting-started/introductory-guide-for-new-hpc-users/). Intuitively, the suitable queue types for our application, the serial version of AutoDock4, are "serial" and "short". Note that the "short" queue is designed for simpler applications or quick test jobs that finish within 24 hours on either a single core or multiple cores; if your job does not fit that profile, the "short" queue should not be considered. However, both the "serial" and "short" queues allow only a few dozen running jobs per user.
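The limits that apply to each queue can be inspected with PBS Pro's qstat command. The exact attributes shown (run limits, walltime caps, and so on) depend on how the site configures its queues, so the following is only a sketch:

# Summary of all queues: job counts, states, and whether they accept jobs
qstat -Q

# Full attribute list for one queue (here "serial"), including any run limits
qstat -Qf serial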
Given ~50,000 docking jobs and these per-user limits, we would need more than 500 rounds of job turnover to finish the project. To achieve a higher turnover in the current queuing system, we can make use of queue types such as "parallel8", "parallel12" and "parallel24", which are created for parallelised applications (e.g. MPI software). Packing 24 serial docking runs into each "parallel24" node job, for example, reduces ~50,000 queue entries to roughly 2,100 node jobs, each consuming only one slot of the per-user limit. It should be emphasised that queue types such as "gpu", "openmp" and "matlab" are not applicable here, because they are reserved for specific hardware demands (i.e. NVIDIA graphics cards, or multiple CPU cores with large memory) or for licensed commercial software (i.e. MATLAB).
Thus, my idea is to distribute multiple jobs within a single node by means of a simple script (e.g. a shell or PERL script). For example, when claiming a node with 8 cores through the "parallel8" queue, I used the following bash script as the PBS job submission script. In Example 1, "./autodock4 -p compound_1.dpf -l compound_1.dlg" is the command for one docking run. The trailing "&" executes the application in the background, and "$!" retrieves the PID (process ID) of the most recent background job. Each retrieved PID ("$pid") is appended to "PID_LIST", a space-separated string of PIDs. Finally, the "wait" command blocks until all processes in "PID_LIST" have terminated.
Example 1. Bash shell job submission script
#!/bin/bash
#PBS -o job_log.stdout
#PBS -e job_log.stderr
#PBS -N docking
#PBS -l select=1:ncpus=8
#PBS -q parallel8

# PBS starts jobs in $HOME; move to the directory the job was submitted from
cd "$PBS_O_WORKDIR"

# Launch eight docking runs in the background and record each PID
./autodock4 -p compound_1.dpf -l compound_1.dlg & pid=$! ; PID_LIST+=" $pid";
./autodock4 -p compound_2.dpf -l compound_2.dlg & pid=$! ; PID_LIST+=" $pid";
./autodock4 -p compound_3.dpf -l compound_3.dlg & pid=$! ; PID_LIST+=" $pid";
./autodock4 -p compound_4.dpf -l compound_4.dlg & pid=$! ; PID_LIST+=" $pid";
./autodock4 -p compound_5.dpf -l compound_5.dlg & pid=$! ; PID_LIST+=" $pid";
./autodock4 -p compound_6.dpf -l compound_6.dlg & pid=$! ; PID_LIST+=" $pid";
./autodock4 -p compound_7.dpf -l compound_7.dlg & pid=$! ; PID_LIST+=" $pid";
./autodock4 -p compound_8.dpf -l compound_8.dlg & pid=$! ; PID_LIST+=" $pid";

# Block until every recorded background process has terminated
wait $PID_LIST
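Writing out each command line by hand does not scale well to wider nodes (e.g. "parallel24"), and the same pattern can be generated with a loop. A minimal bash sketch, assuming the compound_<n>.dpf/.dlg file naming used above (place it after the #PBS directives):

# Launch the runs in a loop, collecting the PID of each background job
PID_LIST=""
for i in $(seq 1 8); do
    ./autodock4 -p compound_$i.dpf -l compound_$i.dlg &
    PID_LIST+=" $!"
done

# Block until all background runs have finished
wait $PID_LIST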
If the user is familiar with interpreted languages such as PERL or Python, a similar implementation can easily be achieved: the autodock4 commands in the job submission file can be replaced by a call to a PERL script.
As shown in Example 2, the idea is to use a "system" call inside a "for" loop to execute the application commands in the background. More PERL functions can be integrated into the script to check whether the processes completed successfully, or even to process the output files directly.
Example 2. PERL script (partial) to distribute jobs in a node
# Launch eight docking runs; the shell started by each system() call
# backgrounds the command and returns immediately
for ($i = 1; $i <= 8; $i++) {
    system("./autodock4 -p compound_$i.dpf -l compound_$i.dlg &");
}
# Note: the enclosing job script must still wait for the runs to finish
# (cf. the "wait" in Example 1), or they will be killed when the job exits.
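As one concrete form of such a completion check, the verification can also be done from the enclosing shell script once the docking round has finished. The sketch below simply flags runs whose expected output file is missing or empty; this is a heuristic criterion of my own, so adapt it to whatever success marker your application writes:

# Flag docking runs whose .dlg output is missing or empty
for i in $(seq 1 8); do
    if [ ! -s compound_$i.dlg ]; then
        echo "compound_$i.dlg missing or empty" >> failed_runs.log
    fi
done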
By applying such scripts, users can submit mass serial applications like ours to the "serial", "parallel8", "parallel12" and "parallel24" queues. In my experience, the job turnover rate can be increased by more than 5-fold compared with submitting exclusively to the "serial" queue, although the gain also depends on the total job load on the HPC system. This article provides examples in bash and PERL; other shells (e.g. C shell, Bourne shell, Korn shell) and interpreted languages (e.g. Python) offer equivalent functions to achieve the same implementation. We hope the experience shared here helps NUS HPC users who run large numbers of serial jobs to carry out their computational projects more efficiently.
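For completeness, standard Unix utilities can express the same pattern without any explicit PID bookkeeping. The one-line sketch below uses GNU xargs to keep a fixed number of runs in flight, again assuming the compound_<n> file naming from the examples above; it is an alternative to Examples 1 and 2, not part of our original workflow:

# Run the eight dockings with at most 8 concurrent processes;
# xargs itself blocks until all of them have finished
seq 1 8 | xargs -P 8 -I{} ./autodock4 -p compound_{}.dpf -l compound_{}.dlg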