DATA ENGINEERING TECHNOLOGY CONSULTING AND SUPPORT
Many HPC technologies and resources can be extended to support data-centric research computing such as Big Data Analytics and Deep Learning. The newly formed Data Engineering Technology team supports the services described below.
From HPC to Data Engineering
Data engineers build tools, infrastructure, frameworks and services that enable effective Big Data Analytics and the running of data-centric applications. Although HPC and Big Data use different programming languages, computing platforms and visualisation tools, both depend on two common characteristics to be effective: scalability and parallel computation.
HPC engineers have worked with scalable systems such as clusters and grids for years. They have also enabled parallel computing using technologies such as MPI, OpenMP, parallel file systems and, more recently, accelerators (e.g. GPUs). Their expertise in scalable system development and parallel computing support is useful but not sufficient for Big Data support. Additional knowledge and skills in data management, processing and visualisation are essential for effective data engineering support, and some of the related work, such as tool development, involves coding. The Data Engineering Technology (DET) team is equipped with both HPC and data engineering expertise.
Scope of Data Engineering Technology Consulting and Support for Research
As part of the IT strategic plan established with the Office of Deputy President (R&T), a Data Engineering Technology (DET) team was formed to provide data engineering support to researchers. We believe this will relieve researchers of the overhead of managing systems, technology, software tools and data-related activities, so that they can dedicate their efforts to research work. The functions provided by the DET team include:
- IT consultation on exploring solutions for Data Analytics, Deep Learning, Image Processing and Bioinformatics
- Project management for the implementation of computing systems and solutions
- Technical support for data workflow development and management: data collection, storage, preparation, processing, transfer and archiving
- Technical support for the use of the following related computing resources:
  - Hadoop data repository and analytics system
  - Accelerator system such as GPGPU server/cluster
  - Cloud computing resources
  - High-speed research network
- Parallel computing and performance optimisation support for the following software and application development:
  - R
  - Python
  - Deep Learning frameworks (TensorFlow, Caffe2, CNTK, Torch, etc.)
  - GPGPU programming, CUDA, etc.
  - Spark and other data analytics related software
How to engage the Data Engineering Technology (DET) Team
Here are some scenarios that exemplify the types of services provided by the DET team:
Scenario I
A Principal Investigator (PI) has general computing requirements for a new Big Data Analytics/AI project
Possible outcome 1: The DET team helps the PI optimise and port applications to central systems such as the Hadoop Data Repository and Analytics platform. The DET team can also help to implement or develop tools to automate data ingestion and processing.
Possible outcome 2: The DET team helps the PI port the application to the Cloud. AI platforms such as TensorFlow can be provisioned instantly on the Cloud.
Scenario II
The PI has special requirements for a compute- and data-intensive project
Possible outcome 1: The DET team explores solutions with the PI. If a solution is available and can be implemented in-house, the DET team will help implement it.
Possible outcome 2: If there is no existing solution, the DET team will work with the PI on designing, budgeting, procuring and implementing the solution (project management support).
Scenario III
The PI needs to scale up or speed up computation
Possible outcome: The DET team works with the PI on code optimisation and parallelisation.
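As a rough illustration of the kind of parallelisation work described in Scenario III, the sketch below splits an independent, CPU-bound task into chunks and runs the chunks on separate processes using Python's standard library. It is only a minimal example under stated assumptions, not the DET team's actual method: the prime-counting workload and the names is_prime, count_primes, split_range and count_primes_parallel are hypothetical and chosen purely for illustration.

```python
import math
from concurrent.futures import ProcessPoolExecutor


def is_prime(n):
    """Trial-division primality test (deliberately CPU-bound)."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    for d in range(3, math.isqrt(n) + 1, 2):
        if n % d == 0:
            return False
    return True


def count_primes(bounds):
    """Count primes in the half-open range [lo, hi)."""
    lo, hi = bounds
    return sum(1 for n in range(lo, hi) if is_prime(n))


def split_range(lo, hi, parts):
    """Split [lo, hi) into at most `parts` contiguous chunks."""
    step = max(1, math.ceil((hi - lo) / parts))
    return [(start, min(start + step, hi)) for start in range(lo, hi, step)]


def count_primes_parallel(lo, hi, parts=4):
    """Count primes in [lo, hi) by running each chunk in its own process."""
    chunks = split_range(lo, hi, parts)
    with ProcessPoolExecutor(max_workers=parts) as pool:
        # Each chunk is independent, so partial counts can simply be summed.
        return sum(pool.map(count_primes, chunks))
```

To try it, call count_primes_parallel(0, 200_000) from a script, inside an `if __name__ == "__main__":` guard so that worker processes can be started safely on every platform. Whether this pattern gives a real speed-up depends on the workload: it suits tasks that split into independent pieces, and the gain is limited by the number of available cores and by how evenly the chunks divide the work.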
Please contact the Data Engineering Technology team at DataEngineering@nus.edu.sg for any potential consulting engagements.