HIGH PERFORMANCE PARALLEL PROGRAMMING (CS61064)

Spring 2025



Instructors

Soumyajit Dey (GPU, CUDA)
Pralay Mitra (OpenMP/MPI paradigm)

Teaching Assistants

Siddharth Sharma (siddharth.d.sharma@kgpian.iitkgp.ac.in), Debraj Das

Class timing

THURS (15:00-17:00), FRI (14:00-16:00)

Venue

Lectures: CSE room 119
Tutorials: PC Lab, Annex building, CSE

Announcements

First meeting: 17th July 2025, at the scheduled class time

Prerequisites: Proficiency in C programming

(Programming environments used in the course are restricted to OpenMP, MPI, and CUDA.)

Syllabus Click here

Course Modules

Serial no   Topic (GPU/CUDA)                                     Slides
Module 0    Basics of Computer Architecture, GPU Architecture    Download, Download
Module 1    Introduction to CUDA, Multi-dimensional Mapping      Download, Download
Module 2    Warp Scheduling and Divergence                       Download
Module 3    Memory, Tiled Matrix Multiplication, Transpose       Download
Module 4    Reduction Operations                                 Download
Module 5    Fusion and Coarsening                                Download

Serial no   Topic (OpenMP/MPI)
Module 0    OpenMP introduction, worksharing constructs
Module 1    OpenMP scheduling, array handling
Module 2    OpenMP synchronization; announcement of term project on OpenMP
Module 3    OpenMP matrix handling
Module 4    MPI Part I
Module 5    MPI Part II

CUDA resources

Additional study materials

  • NVIDIA CUDA Programming Guide
  • Transpose: NVIDIA developer document
  • Transpose: NVIDIA developer blog
CUDA Tutorials

Tutorial No.   Date        Topic
Tutorial 1     24th July   Hands-on basic programming in CUDA
Tutorial 2                 Divergence in CUDA
Tutorial 3                 Reduction, fusion and coarsening in CUDA

CUDA Assignments

Assignment 1 (a)     Link
Assignment 1 (b)     Link
Practice questions   Link

Link to Submission System

Test Schedule

Test Name   Solution
Mid sem     Section 1: GPU/CUDA
End sem     Section 1: GPU/CUDA

Marking Scheme

End sem: 40, Mid sem: 30, Assignments: OpenMP (5) + MPI (5) + CUDA (10)

References

1. “Using OpenMP” by Barbara Chapman, Gabriele Jost and Ruud van der Pas
2. “MPI: The Complete Reference” by Marc Snir, Jack Dongarra, Janusz S. Kowalik, Steven Huss-Lederman, Steve W. Otto and David W. Walker
3. “Parallel Programming with MPI” by Peter Pacheco
4. “Programming Massively Parallel Processors” by David Kirk and Wen-mei Hwu
5. CUDA Reference Manual
6. “Computer Architecture: A Quantitative Approach” by John L. Hennessy and David A. Patterson

If you are absent for two consecutive classes, bring documented reasons to the next class. Attendance below institute guidelines will lead to deregistration. Yes, we are following this strictly.