
A PREVIEW OF JUPYTERHUB ON HPC

By Ku Wee Kiat – NUS IT on 27 Aug, 2018
What is Jupyter?

It is an open source web application that allows users to create and share documents that contain code, equations, visualisations and narrative text. It is useful for data exploration, cleaning and transformation, numerical simulation, statistical modelling, data visualisation, machine learning and much more.

What is Jupyterhub?

It is simply a multi-user version of Jupyter. It is designed for companies, classrooms and research labs with user management and authentication via PAM (Pluggable Authentication Modules), OAuth or other directory services like Active Directory.

The following is a brief introduction to accessing and using Jupyterhub. Exact steps might differ when Jupyterhub is finally deployed. This article also previews Jupyterhub on HPC and one interesting thing users can do with it: visualising neural networks.

Accessing Jupyterhub
Users will be able to sign in to Jupyterhub using their NUSNET credentials; there is no need for a separate set of credentials.

After signing in, users will be presented with the home screen. Simply click “Start My Server” to proceed to the next step, which is to select your job profile.

Selecting a Job Profile
Example Job Profile Options

More job profiles might become available in the future as our resources expand, for example GPU (Graphics Processing Unit) job profiles for data analytics, AI and deep learning.

Select “serial” for running non-parallel code, and “parallel” with the required computing resource for parallel code.

Launching Your Job
After selecting your job profile, click to launch your Jupyter notebook session.

In the backend, Jupyterhub will submit a PBS (Portable Batch System) job script to launch a Jupyter notebook session for you. As with PBS job scripts submitted from the command line/terminal, your Jupyter notebook session job may have to wait in the queue.
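
To make the idea of a generated job script concrete, here is a minimal sketch in Python of how a job profile could be turned into PBS script text. It is not the script Jupyterhub on HPC actually submits: the function name build_pbs_script, the queue name "parallel8", the job name and the walltime are all hypothetical, and the resource line assumes PBS Professional "select" syntax.

```python
def build_pbs_script(queue: str, ncpus: int, walltime: str = "01:00:00") -> str:
    """Return the text of a PBS job script that starts a notebook server.

    All values are illustrative; the real deployment may use different
    queues, resource requests and launch commands.
    """
    lines = [
        "#!/bin/bash",
        f"#PBS -q {queue}",                  # queue chosen via the job profile
        f"#PBS -l select=1:ncpus={ncpus}",   # one chunk with the requested cores
        f"#PBS -l walltime={walltime}",      # maximum run time of the session
        "#PBS -N jupyter-notebook",          # hypothetical job name
        "cd $PBS_O_WORKDIR",                 # directory the job was submitted from
        "jupyter notebook --no-browser --ip=0.0.0.0",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical "parallel" profile with 8 CPU cores.
script = build_pbs_script(queue="parallel8", ncpus=8)
print(script)
```

Because the session is an ordinary batch job, the queue and resource lines are what determine how long it may wait before starting.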

Waiting for server to start up
The Jupyter Notebook

When the server starts up, users will be redirected to a standard Jupyter notebook environment with their home directory displayed. From this screen, you can create new Jupyter Python notebooks and more.

Parallel 8 queue showing a CPU core count of 8
Working with Jupyter Notebooks
Plotting with Matplotlib in Jupyter Notebook (https://matplotlib.org/gallery/lines_bars_and_markers/simple_plot.html)
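
For readers who want to try this, the following is a small sketch in the spirit of the Matplotlib gallery example linked above. It assumes NumPy and Matplotlib are installed in the notebook environment; the output file name simple_plot.png is an arbitrary choice for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Data for plotting: two periods of a sine wave shifted up by 1.
t = np.linspace(0.0, 2.0, 201)
s = 1 + np.sin(2 * np.pi * t)

fig, ax = plt.subplots()
ax.plot(t, s)
ax.set_xlabel("time (s)")
ax.set_ylabel("voltage (mV)")
ax.set_title("A simple line plot")
ax.grid(True)

# In a notebook the figure is normally rendered below the cell;
# saving it as well keeps a copy in the working directory.
fig.savefig("simple_plot.png")
```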

Besides the usual plotting of graphs, Jupyter notebooks are useful for visualising neural networks.

Visualising a Simple Convolutional Neural Network (CNN)

Visualising a neural network can give us useful information about what it is learning. While designing a neural network for the task of image recognition, it is sometimes useful to be able to interpret and understand the model’s predictions.

Libraries used: Tensorflow, Keras, Numpy, matplotlib

We train a simple convolutional neural network (CNN) with the following architecture on the MNIST (Modified National Institute of Standards and Technology database) dataset. The network was trained for 10 epochs with a batch size of 256 using the Adam (Adaptive Moment Estimation) optimiser.

• Input, size 28×28×1

• Convolution, 32 filters, kernel size 3×3, ReLU activation

• Convolution, 64 filters, kernel size 3×3, ReLU activation

• Max Pool, pool size 2×2

• Dropout 0.25

• Convolution, 128 filters, kernel size 3×3, ReLU activation

• Max Pool, pool size 2×2

• Dropout 0.25

• Fully connected, 128 neurons, ReLU activation

• Dropout 0.5

• Fully connected, 10 neurons, Softmax activation
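
The architecture listed above can be checked with a little arithmetic. The sketch below walks through the layers in plain Python and prints each output shape and parameter count. It assumes unpadded ("valid") convolutions with stride 1 and non-overlapping pooling, which are the Keras defaults; the article does not state these settings, so the numbers hold only under that assumption.

```python
from math import prod

KERNEL = 3  # all convolutions in the list use 3x3 kernels

# (layer kind, argument): filters, pool size, dropout rate or neurons.
LAYERS = [
    ("conv", 32), ("conv", 64), ("pool", 2), ("dropout", 0.25),
    ("conv", 128), ("pool", 2), ("dropout", 0.25),
    ("dense", 128), ("dropout", 0.5), ("dense", 10),
]


def summarise(layers, input_shape=(28, 28, 1)):
    """Return (rows, total_params); each row is (kind, output shape, params)."""
    shape, rows, total = input_shape, [], 0
    for kind, arg in layers:
        params = 0
        if kind == "conv":
            h, w, c = shape
            params = (KERNEL * KERNEL * c + 1) * arg   # weights plus one bias per filter
            shape = (h - KERNEL + 1, w - KERNEL + 1, arg)
        elif kind == "pool":
            h, w, c = shape
            shape = (h // arg, w // arg, c)
        elif kind == "dense":
            n_in = prod(shape)                         # implicit flatten before the first dense layer
            params = (n_in + 1) * arg
            shape = (arg,)
        # dropout changes neither the shape nor the parameter count
        rows.append((kind, shape, params))
        total += params
    return rows, total


rows, total = summarise(LAYERS)
for kind, shape, params in rows:
    print(f"{kind:8s} {str(shape):14s} {params:>8d}")
print("total parameters:", total)
```

Under these assumptions most of the parameters sit in the first fully connected layer, which receives the flattened 5×5×128 feature map.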

Looking at the first 36 images in MNIST (Modified National Institute of Standards and Technology database) with Jupyter
Validation accuracy of 0.994 after 10 epochs

With the trained model, we can now visualise the weights of the layers, layer activations given an input, saliency maps and much more.

Some activations in the third convolution layer
Saliency Map of the different inputs (Input, coloured map, grayscale map, coloured map, smoothed grayscale map)

The goal of a saliency map is to identify the pixels of an image that contribute the most towards a particular class prediction. In other words, the most important pixels are those that lead the network to decide that the image belongs to a certain class. Saliency maps are also useful for segmenting images, since they can be used to localise the area of interest in the image.
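
The idea can be illustrated without a deep learning framework. The sketch below estimates how strongly a class score reacts to each pixel by nudging the pixels one at a time (central finite differences) and taking the absolute change. This is a stand-in for the backpropagated gradient normally used with a trained network, and the names saliency_map and toy_score, as well as the toy weights, are hypothetical.

```python
import numpy as np


def saliency_map(score_fn, image, eps=1e-3):
    """Approximate |d score / d pixel| for every pixel of a 2-D image."""
    image = image.astype(float)
    saliency = np.zeros_like(image)
    for idx in np.ndindex(image.shape):
        up, down = image.copy(), image.copy()
        up[idx] += eps
        down[idx] -= eps
        saliency[idx] = abs(score_fn(up) - score_fn(down)) / (2 * eps)
    return saliency


# Toy "class score": a weighted sum that only looks at the centre of the image.
weights = np.zeros((5, 5))
weights[1:4, 1:4] = 1.0
weights[2, 2] = 3.0


def toy_score(img):
    return float(np.sum(weights * img))


image = np.random.default_rng(0).random((5, 5))
sal = saliency_map(toy_score, image)
print(np.round(sal, 3))
```

For this toy score the map highlights the centre pixels and is zero at the border, mirroring how a saliency map localises the region a network relies on.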

As shown above, Jupyter notebooks can be used for more than just plotting graphs; they can also be used for examining and understanding neural networks and for many other analytical applications.

Users can look forward to more visual and interactive software for Data Analytics, Machine Learning and Deep Learning in the future.

Reference
  1. Project Jupyter: http://jupyter.org/
  2. More on Visualising Convolutional Neural Networks: Understanding CNN http://cs231n.github.io/understanding-cnn/
  3. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps https://arxiv.org/abs/1312.6034v2

Please contact the Data Engineering Technology team at DataEngineering@nus.edu.sg if you have any query on the above developments.
