RESEARCH OBJECTIVE

With the advent of parallel programming languages like OpenCL and CUDA, heterogeneous computing platforms comprising multiple CPUs and GPUs are now used extensively by researchers across diverse domains of science. However, the toolsets currently available for both languages require the programmer to understand the performance bottlenecks of the application under consideration and to have detailed knowledge of the architecture of the target computing platform in order to execute programs efficiently. This motivates the need for intelligent computing frameworks that provide the end user with the following benefits:

  1. An Intelligent IDE that would ease parallel programming on heterogeneous platforms.
  2. A Compilation Frontend that would analyze program structure and generate optimized binaries.
  3. A Robust Backend that would intelligently schedule data-parallel workloads on a heterogeneous system.

Our primary research endeavour is to develop a rich heterogeneous programming framework targeting a vast array of heterogeneous platforms. The framework shall be built upon the existing OpenCL programming framework, which is known for its program portability across different types of devices, e.g., general-purpose (CPU), data-parallel (GPU), and task-parallel (CELL/B.E.) devices.

The components of this heterogeneous framework are illustrated by the schematic diagram below.

The intelligent programming framework depicted in the figure will provide the end user with a rich, GUI-integrated programming aid for easily developing workloads from varied scientific domains. The intelligent IDE will let researchers focus more on designing algorithms and less on writing the code that implements them.

Once the algorithm has been designed using the IDE, a common program representation will be generated that can be executed on heterogeneous multicores, heterogeneous embedded systems as well as heterogeneous clusters.

The highlight of the framework would be the Compilation and Runtime System, which shall mine feature information from the tasks constituting a parallel application and use trained machine learning classifiers for the following tasks:

  1. Ascertaining which compiler optimizations should be used for the program on a target architecture.
  2. Ascertaining task-to-device mapping functions that minimize application execution time.

The proposed framework therefore provides an interactive programming environment that shall not only aid the programmer in designing high-performance applications with ease, but also suggest optimization techniques relevant to the target heterogeneous platform.
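
As a rough illustration of the classifier-driven task-to-device mapping described above, the following Python sketch trains a toy classifier on per-task features and uses it to pick a device. It is only a sketch under stated assumptions and not the framework's actual implementation: the feature names (`arithmetic_intensity`, `branch_fraction`, `log2_work_items`), the nearest-centroid classifier, and the training values are all hypothetical placeholders chosen for illustration, not measurements.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class KernelFeatures:
    """Hypothetical static features mined from one task (kernel)."""
    arithmetic_intensity: float  # compute operations per memory access
    branch_fraction: float       # fraction of instructions that are branches
    log2_work_items: float       # log2 of the number of work items

    def as_vector(self):
        return (self.arithmetic_intensity,
                self.branch_fraction,
                self.log2_work_items)


class NearestCentroidMapper:
    """Toy classifier: maps a task to the device with the closest centroid."""

    def fit(self, samples, labels):
        sums, counts = {}, {}
        for feats, label in zip(samples, labels):
            vec = feats.as_vector()
            acc = sums.setdefault(label, [0.0] * len(vec))
            for i, value in enumerate(vec):
                acc[i] += value
            counts[label] = counts.get(label, 0) + 1
        # One mean feature vector (centroid) per device label.
        self.centroids = {
            label: tuple(total / counts[label] for total in acc)
            for label, acc in sums.items()
        }
        return self

    def predict(self, feats):
        vec = feats.as_vector()
        return min(self.centroids,
                   key=lambda label: dist(vec, self.centroids[label]))


# Made-up placeholder training set (NOT measured data): each entry pairs
# a task's features with the device assumed to run it fastest.
TRAINING_SET = [
    (KernelFeatures(8.0, 0.02, 20.0), "gpu"),
    (KernelFeatures(6.0, 0.05, 18.0), "gpu"),
    (KernelFeatures(0.5, 0.30, 8.0), "cpu"),
    (KernelFeatures(1.0, 0.25, 10.0), "cpu"),
]


def build_mapper():
    samples = [feats for feats, _ in TRAINING_SET]
    labels = [label for _, label in TRAINING_SET]
    return NearestCentroidMapper().fit(samples, labels)


if __name__ == "__main__":
    mapper = build_mapper()
    new_task = KernelFeatures(7.5, 0.03, 19.0)
    print("suggested device:", mapper.predict(new_task))
```

A real system would presumably normalize the features and use a stronger classifier trained on profiled executions; the sketch leaves the features unscaled to stay short.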

Awards

Publications

Support from Industry and Government Agencies