Philippos Mordohai
Associate Professor
Department of Computer Science
Stevens Institute of Technology

Office: Home...
Phone Number: +1 201 216 5611
E-mail: Philippos.Mordohai_at_stevens.edu

CS 677: Parallel Programming for Many-core Processors

Spring 2021



Homepage

Location
Zoom

Time
Wednesday 6:30-9:00 PM.

Office Hours
Monday 5-6 and by appointment.

Pre-requisites
One of the following courses or demonstrated experience in C/C++.
  • CS 537: Interactive Computer Graphics.
  • CS 511: Concurrent Programming.
  • CS 631: Advanced Programming in the UNIX Environment.


Syllabus

Textbook
The required textbook is the following. I will also use notes outside the textbook, mostly in the second half of the semester.
Programming Massively Parallel Processors: A Hands-on Approach
by David Kirk and Wen-mei Hwu
Morgan Kaufmann, 2016 (3rd edition)

Evaluation

Homework assignments (40%)
Homework will be assigned almost every week up to Week 7, and each assignment will be due a week later.

Quizzes (20%)

Project (40%)
Each student will select a project, which I must approve for relevance and feasibility. I will also provide suggestions for potential projects and pointers to relevant material. Students actively involved in research can select a project related to their research, but new work has to be done during the semester. Large projects can be performed by groups of two students. In Week 8, each student will briefly present a proposal of his or her project, which must be approved by then. Longer status updates will be given three weeks later, in Week 11, and the final presentations will be given in the last week of classes. The written reports will be due on the date of the (non-existent) final exam.

Class Schedule

Week 1: Introduction to massively parallel programming and CUDA (Kirk & Hwu Ch. 1, 2 and 3)
Lecture 1 slides (pdf)

Week 2: CUDA threads and atomics; CUDA memories (Kirk & Hwu Ch. 4 and 5)
Lecture 2 slides (pdf)

Week 3: Performance considerations (Kirk & Hwu Ch. 5)
Lecture 3 slides (pdf)

Week 4: More performance considerations; reduction trees; parallel patterns: prefix sum (Kirk & Hwu Ch. 5 and 8)
Lecture 4 slides (pdf)

Week 5: Project ideas; Convolution, constant memory and cache (Kirk & Hwu Ch. 7)

Week 6: Case study: MRI reconstruction; Timers (Kirk & Hwu Ch. 14)

Week 7: Case study: electrostatic potential calculation; Binning (Kirk & Hwu Ch. 15)

Week 8: Computational thinking (Kirk & Hwu Ch. 17)

Week 9: Pinned memory; streams; Thrust (Kirk & Hwu Ch. 13 and notes)

Week 10: Parallel Sorting; Sparse matrix and vector operations; Summed area tables (Notes)

Week 11: Project status reports; Bitonic sorting; More libraries; OpenCL (Kirk & Hwu Ch. 14 and notes)

Week 12: More OpenCL; OpenACC; OpenMP (Notes)

Week 13: Deep learning; Latest GPU and CUDA features

Week 14: Project presentations


Resources

Textbook companion site, 3rd edition.

Textbook companion site, 2nd edition.

The most recent CUDA toolkit. The toolkit includes the NVIDIA CUDA Compiler and other software necessary to develop CUDA applications.