» Running Abaqus in HPC Cloud: A Personal Experience
By Tang Haibin, Research Fellow, Mechanical Engineering, on 24 September 2020
A personal account of how running in the HPC Cloud improved the computational efficiency of research work.
» Hadoop on AWS: Benefits of EMR
By Kumar Sambhav, Research Computing, NUS Information Technology, on 15 May 2020
Managing Big Data on Hadoop clusters has seen several paradigm shifts in recent times, from sysadmin-managed clusters at the command-line level to centrally managed on-prem platforms such as Cloudera, Hortonworks and MapR. All of these platforms share one primary problem: they depend on physical hardware resources. Read on to discover how EMR addresses this shortcoming in the cloud.
» Tackling HPC issue for parallel computing in MATLAB
By Vamshidhar Gangu, Research Computing, NUS Information Technology, on 15 May 2020
Tips on how to run MATLAB Parallel Computing Toolbox jobs properly in our HPC cluster. The article also addresses potential conflicts between multiple concurrent jobs.
» New Computational Cluster in the Cloud
By Yeo Eng Hee, Research Computing, NUS Information Technology, on 15 May 2020
Over the past few articles in the HPC Newsletter, I have been writing on Cloud resources and how computational jobs can be run in the Cloud. The Research Computing team here has been working hard to make the Cloud resources available to registered HPC users in a secure and simple way, so that our HPC workloads can be run in the Cloud as well.
» Friendly Email Alert for HPC Batch Jobs
By Wang Junhong, Research Computing, NUS Information Technology, on 15 May 2020
A customised email alerting function has been developed and enabled in the HPC system. It sends each user an email alert summarising the jobs completed in the last hour, so users can get almost instant updates on their jobs via an email app on a mobile device, anytime and anywhere. This removes the inconvenience of having to log into the HPC system from a computer terminal to check. Other useful information about the jobs can also be added to the alert report. Read on for more details.
» Data Manipulation and More with the Command Line
By Ku Wee Kiat, Research Computing, NUS Information Technology, on 15 May 2020
Ever needed to rename a directory of files to a certain format? Extract lines with certain keywords from log files? Even create CSV files from semi-structured logs? There is no need to bust out custom Python or R scripts or install any software when most simple tasks can be solved much faster using Bash tools.
» Anomaly Detection: A Machine Learning Use Case
By Kuang Hao, Research Computing, NUS Information Technology, on 15 May 2020
Anomaly detection is mainly a data-mining process and is widely used in behavioral analysis to determine the types of anomaly occurring in a given data set. It is applicable in domains such as fraud detection, intrusion detection, fault detection and system health monitoring in sensor networks. Since the definition of an anomaly is often complicated and depends on historical data, machine learning is well suited to this type of application.
» What is Data Engineering
By Tan Chee Chiang, Research Computing, NUS Information Technology, on 12 May 2020
We launched Data Engineering support services a while ago to support and accelerate data-centric research such as Analytics and AI. We will discuss the similarities and differences between Data Engineers and Data Scientists, and how Data Engineers can help in both data-centric research and enterprise computing.
» Projecting 2020 HPC Trends
By Tan Chee Chiang, Research Computing, NUS Information Technology, on 20 January 2020
Based on current market trends and technology roadmaps from market leaders, we can expect greater convergence of HPC and AI technologies, more processor (CPU) and accelerator options, further adoption of HPC Cloud and higher demand for storage capacity in 2020.
» Acceleration of Data Pre-processing
By Kuang Hao, HPC Specialist (Research Computing), NUS Information Technology, on 20 January 2020
As the first step in the machine learning pipeline, data pre-processing (DP) should never be neglected. Thanks to the open source community and machine learning enthusiasts, researchers and data science learners can find many clean, generalized datasets online for research and study. Real-life data, however, is almost never well organized, which is why DP plays such an important role.