HP3 Syllabus

OpenMP and MPI

i. Introduction to parallel computing
ii. Parallel programming with MPI: Point-to-Point Communications
iii. Parallel programming with MPI: Collective Communications and Communicators
iv. Working with clusters: Workload manager and job scheduler
v. Parallel programming with OpenMP: Directives and Run-time Library Functions
vi. Hybrid Programming: An Application in Image Processing/Computer Vision

CUDA

i. Introduction to CUDA Program Structure - Host and Kernels
ii. Multidimensional Kernels + GPU Architecture and Warp Scheduling
iii. GPU Memory Spaces
iv. Case Study on the Reduction Operation and its Optimizations: Divergence and Shared Memory Bank Conflicts
v. Fusion of multiple CUDA kernels for interleaved execution
vi. Writing efficient CUDA libraries for Neural Network Training and Testing
vii. Bitonic Sorting