PBS JOB SCHEDULER
With the release of the new HPC cluster Atlas8, which consists of 64 compute nodes with 192GB of memory and 24 cores each and a 100Gbps interconnect, the PBS job scheduler was also introduced to users on the cluster. PBS is a trusted job scheduling toolkit for complex Top500 systems as well as smaller clusters. With the national site licenses for institutes of higher learning facilitated by the National Supercomputing Center (NSCC), it is more cost-effective to implement and use the PBS job scheduler.
In a few weeks, we will migrate all HPC clusters to the PBS job scheduler. You are encouraged to try it on the Atlas8 cluster now.
PBS vs LSF
The existing LSF job scheduler has been used in the central HPC systems for more than a decade. In terms of functionality, the PBS job scheduler is very similar to LSF. The table below summarises the most commonly used commands for batch job submission, monitoring and management.
Action | PBS | LSF |
---|---|---|
Submit a Job | # qsub job_script.txt | # bsub < job_script.txt # bsub -q parallel -o …. |
List & Check Job(s) | # qstat # qstat -asn1 | # bjobs # bjobs -w |
Check Job Output | # qcat | # bpeek |
Check a Job Info | # qstat -f 1234 | # bjobs -l 1234 |
Terminate a Job | # qdel 1234 | # bkill 1234 |
Besides the change of command prefix from b* to q*, there will be a short learning curve for users switching to PBS. A customized help interface is enabled on the HPC system to list instructions on how to use the PBS job scheduler. To begin, enter this command on the Atlas8 cluster:
[atlas8-c01]> hpc pbs
Furthermore, if you would like to know how to submit Abaqus or Gromacs jobs, you can enter the following commands:
[atlas8-c01]> hpc pbs Abaqus
[atlas8-c01]> hpc pbs Gromacs
Pros
The PBS job scheduler gives easy-to-understand messages when you check the pending reasons of your batch jobs. For example, the following two messages state plainly why your jobs are pending when you check via “qstat -ans1”:
[atlas8-c01]> qstat -ans1
Not Running: Queue parallel24 per-user limit reached on resource ncpus
Not Running: Not enough free nodes available
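When many jobs are listed, it can help to show only the pending reasons. The following is a minimal sketch, not part of the cluster's own tooling: the function name pending_reasons is hypothetical, and it simply assumes the reasons appear on lines containing "Not Running:" as in the messages above.

```shell
# Hypothetical helper: keep only the scheduler's "Not Running" comments
# from qstat-style output read on standard input, and strip any leading
# indentation so each reason starts at the beginning of the line.
pending_reasons() {
    grep 'Not Running:' | sed 's/^[[:space:]]*//'
}

# On the cluster (assumes qstat is on your PATH):
#   qstat -ans1 | pending_reasons
```

The helper prints nothing when no job is pending.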
Cons
You need to prepare a job submission script in order to submit a job with the PBS job scheduler. This could be a little tedious if you are not familiar with editing a text file on a Linux system.
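To give a rough idea of what such a script looks like, here is a minimal sketch. The queue name parallel24 is taken from the pending message shown earlier; the job name, the resource request and the walltime are illustrative assumptions, so treat the sample scripts shown by the help interface on the cluster as the authoritative reference.

```shell
#!/bin/bash
# Lines starting with "#PBS" are read by qsub as job options.
# All values below are illustrative assumptions, not site defaults.
#PBS -N sample_job
#PBS -q parallel24
#PBS -l select=1:ncpus=24
#PBS -l walltime=01:00:00
#PBS -j oe

# PBS sets PBS_O_WORKDIR to the directory where qsub was run.
# Fall back to the current directory so the script also runs outside PBS.
workdir="${PBS_O_WORKDIR:-$PWD}"
cd "$workdir" || exit 1

# PBS sets PBS_JOBID inside a job; use a placeholder when testing locally.
job_id="${PBS_JOBID:-local-test}"
echo "Job $job_id started on $(uname -n) in $workdir"

# Replace the line below with the command that runs your application.
echo "Replace this line with your application command"
```

Save the script as job_script.txt and submit it with "qsub job_script.txt", as listed in the table above.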
Tips for you
• Run “hpc pbs [app_name]” to display the instructions and sample script on the cluster.
• Copy the sample script file as a reference, and modify the necessary parts for your own job submission needs.
• Use the nano text editor on the Linux system; it is very similar to Notepad on Windows, except that it does not support mouse operation.
Summary
Feel free to contact us at nusit-hpc@nus.edu.sg should you have any questions about using the PBS job scheduler on the cluster.