Machine Learning (CS60050)
Spring semester 2017-18
Announcements
- Weightage: Mid-semester: 20%, Assignments: 30%, End-semester: 50%
- All topics, including those discussed in guest lectures, are included in the end-semester syllabus, unless specifically mentioned otherwise on this page.
- Guest lectures by Dr. Parth Gupta, Amazon. Dr. Gupta will take the regular classes on April 9th and 10th. He will also deliver an invited talk on "Machine Learning at Amazon" on Monday, April 9th, 5.30 pm - 7.30 pm. Venue for the invited talk: CSE 120 (CSE department main building, ground floor)
- Assignment 2 declared. Solutions to be submitted via Moodle. Submission deadline (extended): April 15.
- Guest lecture on "Discrimination and Fairness in Machine Learning" by Professor Krishna P. Gummadi, Max Planck Institute for Software Systems (MPI-SWS): March 7, 5.30 pm - 7.00 pm, CSE 119 (CSE department main building, ground floor)
- Assignment 1 declared. Solutions to be submitted via Moodle. Submission deadline (extended): March 2. No further extension will be made.
- Mid-semester examination schedule announced; check the central examination timetable.
- All registered students must enrol in the Moodle course "MLSPG2018 (CS60050) Machine Learning" (follow the "Moodle" link on the CSE department site; enrolment key: student)
- The first class will be on Monday, January 8, 2018, at 12:00.
Instructor
Saptarshi Ghosh
Contact: saptarshi [AT] cse.iitkgp.ernet.in
Course Timings (3 lectures per week)
Monday 12:00 - 12:55
Tuesday 10:00 - 10:55, 11:00 - 11:55
Class venue: NC231 (Nalanda Classroom Complex)
Teaching Assistants
- Soumya Sarkar (portkey1996 [AT] gmail [DOT] com)
- Surjya Ghosh (surjya.ghosh [AT] gmail [DOT] com)
- Soumajit Pramanik (soumajit.pramanik [AT] gmail [DOT] com)
- Divyansh Gupta (divyanshgupta95 [AT] gmail [DOT] com)
- Sayan Mukhopadhyay (sayanm.kgp [AT] gmail [DOT] com)
- Lovekesh Garg (lovekeshgarg13 [AT] gmail [DOT] com)
Course evaluation
Assignments: 30%
Mid-semester exam: 20%
End-semester exam: 50%
Topics (outline)
- Introduction: Basic principles, Applications, Challenges
- Supervised learning: Linear Regression (with one variable and multiple variables), Gradient Descent, Classification (Logistic Regression, Overfitting, Regularization, Support Vector Machines), Artificial Neural Networks (Perceptrons, Multi-layer networks, Backpropagation), Decision Trees
- Unsupervised learning: Clustering (K-means, Hierarchical), Dimensionality reduction, Principal Component Analysis, Anomaly detection
- Theory of Generalization: In-sample and out-of-sample error, VC inequality, VC analysis, Bias and Variance analysis
- Applications: Spam filtering, recommender systems, and others
- Advanced topics: Bias and fairness in Machine Learning
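To make the gradient-descent topic above concrete, here is a minimal illustrative sketch (not course material, assuming NumPy): batch gradient descent for linear regression in one variable, minimising the mean squared error.

```python
import numpy as np

# Illustrative sketch only: batch gradient descent for one-variable
# linear regression, fitting y = theta0 + theta1 * x by minimising
# the mean squared error over the training points.
def gradient_descent(x, y, alpha=0.1, iters=1000):
    theta0, theta1 = 0.0, 0.0              # intercept and slope, start at zero
    for _ in range(iters):
        err = theta0 + theta1 * x - y      # per-point prediction error
        # simultaneous update of both parameters (learning rate alpha)
        theta0 -= alpha * err.mean()
        theta1 -= alpha * (err * x).mean()
    return theta0, theta1

x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0                          # noiseless data on the line y = 2x + 1
t0, t1 = gradient_descent(x, y)            # converges towards (1.0, 2.0)
```

With noiseless data the iterates approach the true intercept and slope; too large a learning rate `alpha` would instead make the updates diverge.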
Text and Reference Literature
- Christopher M. Bishop. Pattern Recognition and Machine Learning (Springer)
- David Barber. Bayesian Reasoning and Machine Learning (Cambridge University Press). Online version available here.
- Tom Mitchell. Machine Learning (McGraw Hill)
- Richard O. Duda, Peter E. Hart, David G. Stork. Pattern Classification (John Wiley & Sons)
Other interesting stuff
- "10 Things Everyone Should Know About Machine Learning" by Daniel Tunkelang
- Ali Rahimi's Test-of-Time Award presentation at NIPS 2017 (comparing Machine Learning with Alchemy)
Slides
| Topic | Slides | References / Comments |
| --- | --- | --- |
| Introduction | Slides | Introduction to the course, utility of ML, applications of ML |
| Linear Regression | Slides | Linear regression in one variable, multiple variables, gradient descent, polynomial regression |
| Classification using Logistic Regression | Slides | Binary classification; logistic regression; multi-class classification |
| Overfitting and Regularization | Slides | Overfitting and regularization in linear regression and logistic regression |
| Feasibility of learning | Slides | Feasibility of learning an unknown target function; in-sample (training set) error and out-of-sample error |
| Theory of Generalization | No slides | Training vs. testing, bounding the testing error, break points, Vapnik-Chervonenkis inequality, VC dimension, bias-variance tradeoff. Resources: proof of the VC inequality (pdf); the paper "An Overview of Statistical Learning Theory" by Vapnik (pdf); several resources are available on the Web, e.g., MIT OpenCourseWare, Lectures 3-5 |
| Discrimination and Fairness in ML | Slides | Guest lecture by Professor Krishna P. Gummadi, MPI-SWS [The material up to Slide 60 (end of Part 3) is included in the end-semester syllabus.] |
| Neural networks | Slides | Perceptrons, neural networks, the backpropagation algorithm, stochastic gradient descent |
| Error analysis | Slides | Error analysis, validation, learning curves |
| Support vector machines | Slides | Margin, large-margin optimization, kernel methods |
| Decision trees and random forests | Slides | Guest lecture by Dr. Parth Gupta, Amazon |
| Representation Learning | Slides | Guest lecture by Dr. Parth Gupta, Amazon [Not included in the end-semester syllabus] |
| Unsupervised Learning | Slides | Clustering: K-means clustering, hierarchical clustering. Other unsupervised learning problems: Principal Component Analysis, topic modeling [PCA and topic modeling are not included in the end-semester syllabus] |
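The K-means clustering covered in the Unsupervised Learning lecture can be sketched in a few lines. The following is an illustrative toy example (1-D data, two clusters, assuming NumPy), not material from the slides:

```python
import numpy as np

# Illustrative sketch only: Lloyd's algorithm for K-means clustering.
# Alternates between assigning each point to its nearest centre and
# moving each centre to the mean of its assigned points.
def kmeans(points, centers, iters=10):
    points = np.asarray(points, dtype=float)
    centers = np.asarray(centers, dtype=float)
    for _ in range(iters):
        # assignment step: index of the nearest centre for each point
        labels = np.argmin(np.abs(points[:, None] - centers[None, :]), axis=1)
        # update step: each centre moves to the mean of its cluster
        centers = np.array([points[labels == k].mean()
                            for k in range(len(centers))])
    return centers, labels

# two well-separated groups near 1.0 and 10.0
centers, labels = kmeans([1.0, 1.2, 0.8, 10.0, 10.5, 9.5], centers=[0.0, 5.0])
```

On this toy data the centres settle near 1.0 and 10.0 after the first update; in general, K-means converges to a local optimum that depends on the initial centres.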
