AI-HPC RESOURCES
With the recent announcement of the AI.SG national initiative, we explore the synergy between AI and HPC, and how HPC technologies and resources can be tapped to support related research and applications.
AI, Machine/Deep Learning, Big Data and HPC
AI refers to technology that can mimic human intelligence. Machine Learning is a branch of AI in which a program learns from data by itself, much as humans learn from experience. Deep Learning is a form of Machine Learning, built on multi-layer neural networks, that has proven especially effective for certain types of problems, specifically those related to computer vision and natural language processing.
Over the past few years, we have discussed the emergence of Big Data related research and the use of Hadoop and HPC resources for large-scale data analytics. It is clear that the parallel computing power of HPC systems is needed to enable such large-scale computation. Moving forward, the recent rise of Machine Learning and Deep Learning in AI development is closely linked to Big Data and HPC. According to some experts, the increase in the volume of data collected in the Big Data era and the advancement of parallel computing power in HPC (such as the parallel computing capability of GPU and Xeon Phi) have helped to fuel AI development. The availability of a large amount of data facilitates the training process in Machine/Deep Learning. A powerful HPC system enables compute-intensive Machine/Deep Learning jobs to be completed within a reasonable timeframe.
Given this relationship, it is no surprise to see Big Data and Machine/Deep Learning applications co-existing in the HPC environment. GPU and Xeon Phi HPC systems are being used for traditional compute-intensive HPC applications (molecular dynamics, financial analysis, image processing, etc.) as well as for Machine/Deep Learning and data analytics, while Hadoop systems running on HPC clusters are being used for Big Data analytics and for some Machine Learning applications.
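To make the link between parallel computing and training more concrete, here is a minimal sketch of data-parallel training in Python. It is an illustration only, not how any particular HPC system or library is implemented: the function names, the toy dataset and the learning rate are all hypothetical, and threads stand in for the nodes or GPU cores of a real cluster (in standard Python, threads running pure Python code show the structure of the computation but do not deliver a real speed-up).

```python
from concurrent.futures import ThreadPoolExecutor

def partial_gradient(shard, w, b):
    """Summed squared-error gradient for the model y = w*x + b over one data shard."""
    gw = gb = 0.0
    for x, y in shard:
        err = (w * x + b) - y
        gw += 2 * err * x
        gb += 2 * err
    return gw, gb

def train(data, workers=4, epochs=500, lr=0.5):
    """Data-parallel gradient descent: every step, each worker processes one shard."""
    # Split the dataset into one shard per worker (hypothetical round-robin split).
    shards = [data[i::workers] for i in range(workers)]
    w = b = 0.0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(epochs):
            # Workers compute their partial gradients independently...
            grads = list(pool.map(lambda s: partial_gradient(s, w, b), shards))
            # ...and the results are combined into one averaged update.
            gw = sum(g[0] for g in grads) / len(data)
            gb = sum(g[1] for g in grads) / len(data)
            w -= lr * gw
            b -= lr * gb
    return w, b

# Hypothetical toy data generated from y = 3x + 1, with x in [0, 1).
data = [(i / 100, 3 * (i / 100) + 1) for i in range(100)]
w, b = train(data)
```

The more data there is, the more work each training step involves, and the more it pays to spread the shards across many processors. That is the pattern GPU and Xeon Phi systems exploit at a much larger scale.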
AI Resources and Services
At NUS IT, we have been working with our partners (e.g. the National Supercomputing Centre, NSCC) to develop resources that support data-intensive applications over the past few years. It turns out that some of these resources can also be extended to support AI research.
- Two commonly used programming languages in HPC systems for Bioinformatics research, Java and Python, can now be used to enable Machine/Deep Learning applications. More details of our central HPC resources and services can be found at our website.
- The Hadoop data repository and analytics system developed by NUS IT will be available for use in July. The system will support Machine Learning through the Spark libraries. Programming languages commonly used for Machine Learning, such as Java, Python and Scala, will be supported. Check out the introductory article from my colleague for more details.
- We will be tapping the Cloud to complement our HPC resources. We are currently working on making some AWS Cloud resources available to our users. One of our plans is to enable TensorFlow (a software library for Machine Learning) support using GPUs in the Cloud. Register as an HPC user here to receive future updates on this.
- NSCC provides a 128-node GPU cluster to support HPC applications including Machine Learning. The resources are open for use by NUS researchers. You may register to use NSCC resources as an NUS user here.