HPC Technical Updates

» The Evolving Roles of HPC in Research and Enterprise Computing Support

By Tan Chee Chiang, Research Computing, NUS Information Technology, on 21 January 2021

In the last issue, I wrote about the growing enterprise adoption of HPC resources and technologies, driven by the emergence of AI as a common application. This time, we look at how the NUS IT Research Computing (REC) team has evolved to stay relevant in this new HPC-AI era.

» Understand Your HPC Usage Profile

By Wang Junhong, Research Computing, NUS Information Technology, on 24 September 2020

Each month, an intuitive usage profile of HPC resources is generated for every HPC user. It improves the user experience and helps users understand how well their jobs are performing in terms of the number of jobs completed, waiting time versus running time, parallel speedup, efficiency, and memory usage. This profiling data not only helps users identify room for improvement in the parallel performance or memory utilisation of their HPC jobs, but also improves overall HPC resource utilisation and informs planning for future expansion. Read on for more details.
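As a rough illustration of the metrics named above, the sketch below uses the standard textbook definitions of parallel speedup and efficiency, plus simple ratios for waiting time and memory use. The function names and sample figures are hypothetical; the exact formulas behind the monthly profile are not given in this summary.

```python
def parallel_speedup(serial_seconds, parallel_seconds):
    """Speedup = serial run time divided by parallel run time."""
    return serial_seconds / parallel_seconds


def parallel_efficiency(serial_seconds, parallel_seconds, cores):
    """Efficiency = speedup divided by the number of cores used (1.0 is ideal)."""
    return parallel_speedup(serial_seconds, parallel_seconds) / cores


def wait_to_run_ratio(wait_seconds, run_seconds):
    """Time spent queued relative to time spent running."""
    return wait_seconds / run_seconds


def memory_utilisation(used_gb, requested_gb):
    """Fraction of the requested memory that the job actually used."""
    return used_gb / requested_gb


# Hypothetical job: 100 s on one core, 25 s on 8 cores,
# queued for 50 s, used 4 GB of a 16 GB request.
print(parallel_speedup(100.0, 25.0))        # 4.0
print(parallel_efficiency(100.0, 25.0, 8))  # 0.5
print(wait_to_run_ratio(50.0, 25.0))        # 2.0
print(memory_utilisation(4.0, 16.0))        # 0.25
```

An efficiency well below 1.0, or a memory utilisation far below the request, is the kind of signal that suggests asking for fewer cores or less memory on the next run.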

» renv – Managing R Packages and Environment

By Vamshidhar Gangu, Research Computing, NUS Information Technology, on 24 September 2020

One of the major complaints within the R community is that project-level package dependencies are very hard to maintain, because R evolves continuously, with several releases a year.

The renv package is a new effort to bring project-local R dependency management to your projects. This is a robust, stable replacement for the Packrat package, with fewer surprises and better default behaviours.

» New Computational Cluster in the Cloud

By Yeo Eng Hee, Research Computing, NUS Information Technology, on 15 May 2020

Over the past few articles in the HPC Newsletter, I have written about Cloud resources and how computational jobs can be run in the Cloud. The Research Computing team has been working hard to make Cloud resources available to registered HPC users in a secure and simple way, so that our HPC workloads can run in the Cloud as well.

» Friendly Email Alert for HPC Batch Jobs

By Wang Junhong, Research Computing, NUS Information Technology, on 15 May 2020

A customised email alerting function has been developed and enabled in the HPC system. It sends each user an email alert summarising the jobs completed in the last hour, so users can get almost instant updates on their jobs through the email app on a mobile device, anytime and anywhere. This removes the inconvenience of having to log in to the HPC system from a computer terminal to check. Other useful job information can also be added to the alert report. Read on for more details.

» Anomaly Detection: A Machine Learning Use Case

By Kuang Hao, Research Computing, NUS Information Technology, on 15 May 2020

Anomaly detection is mainly a data-mining process and is widely used in behavioural analysis to determine the types of anomaly occurring in a given data set. It is applicable in domains such as fraud detection, intrusion detection, fault detection and system health monitoring in sensor networks. Since the definition of an anomaly is often complicated and depends on historical data, machine learning is well suited to this type of application.
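To make the idea concrete, here is a minimal sketch of a simple statistical baseline: flagging readings whose modified z-score, based on the median absolute deviation, exceeds a threshold. It is not the machine learning approach covered in the article. The function name, the sample sensor readings, and the commonly used constants (0.6745 and a threshold of 3.5) are assumptions chosen for illustration.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds the threshold.

    The modified z-score scales each value's distance from the median by the
    median absolute deviation (MAD), which is less sensitive to the outliers
    themselves than the mean and standard deviation.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # More than half of the values are identical; the score is undefined,
        # so this sketch reports no outliers.
        return []
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical sensor readings with one obvious fault at index 5.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 42.0, 10.2]
print(mad_outliers(readings))  # [5]
```

A fixed rule like this only works when "normal" is easy to describe. When normal behaviour has to be learned from historical data, as the paragraph above notes, a trained model becomes the more practical choice.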

» What is Data Engineering

By Tan Chee Chiang, Research Computing, NUS Information Technology, on 12 May 2020

We launched Data Engineering support services a while ago to support and accelerate data-centric research in areas such as Analytics and AI. We will discuss the similarities and differences between Data Engineers and Data Scientists, and how Data Engineers can help in both data-centric research and enterprise computing.

» Projecting 2020 HPC Trends

By Tan Chee Chiang, Research Computing, NUS Information Technology, on 20 January 2020

Based on current market trends and technology roadmaps from market leaders, we can expect greater convergence of HPC and AI technologies, more processor (CPU) and accelerator options, further adoption of HPC Cloud and higher demand for storage capacity in 2020.

» Acceleration of Data Pre-processing

By Kuang Hao, HPC Specialist (Research Computing), NUS Information Technology, on 20 January 2020

Data pre-processing (DP) is the first step in the machine learning pipeline, and its importance should never be neglected. Researchers and data science learners can find plenty of clean, generalised datasets online for research and study, thanks to the open source community and machine learning enthusiasts. In practice, however, DP plays an important role because real-life data is almost never well organised.

» Accelerating Deep Learning & Other Data-Intensive GPU Applications

By Ku Wee Kiat, Research Computing, NUS Information Technology, on 20 January 2020

We have recently deployed a new all-flash storage appliance for the NUS HPC clusters. This new storage system provides about 100 TB of high-performance parallel file storage, distributed among users of the Atlas 9 and Volta GPU clusters. The storage system is purpose-built to accelerate Deep Learning (GPU) applications and other applications with mixed and demanding I/O workloads. In this article, we explain how to access the storage and present results from Deep Learning benchmarks we ran on the new storage system, the existing NAS system (hpctmp) and the Volta nodes' local SSDs.