Machine Learning (CS60050)

Spring semester 2017-18

Announcements

  • Weightage: Mid-semester: 20%, Assignments: 30%, End-semester: 50%
  • All topics, including those discussed in guest lectures, are included in the end-semester syllabus, unless specifically mentioned otherwise on this page.
  • Guest lectures by Dr. Parth Gupta, Amazon -- Dr. Gupta will take the regular classes on April 9th and 10th. He will also deliver an invited talk on "Machine Learning at Amazon" on Monday, April 9th, 5:30 pm - 7:30 pm. Venue for the invited talk: CSE 120 (CSE department main building, ground floor)
  • Assignment 2 declared. Solutions to be submitted via Moodle. Submission deadline (extended): April 15.
  • Guest lecture on Discrimination and Fairness in Machine Learning by Professor Krishna P. Gummadi, Max Planck Institute for Software Systems (MPI-SWS): March 7, 5:30 pm - 7:00 pm, CSE 119 (CSE department main building, ground floor)
  • Assignment 1 declared. Solutions to be submitted via Moodle. Submission deadline (extended): March 2. No further extension will be made.
  • Mid-semester examination schedule announced -- check the central examination time-table
  • All registered students must enrol in the Moodle course "ML-SPG-2018 (CS60050) Machine Learning" (follow the "Moodle" link on the CSE department site; enrolment key: student)
  • First class will be on Monday, January 8, 2018, 12:00


Instructor

Saptarshi Ghosh

Contact: saptarshi [AT] cse.iitkgp.ernet.in


Course Timings (3 lectures per week)

Monday 12:00 - 12:55

Tuesday 10:00 - 10:55, 11:00 - 11:55

Class venue: NC231 (Nalanda Classroom Complex)


Teaching Assistants

  1. Soumya Sarkar (portkey1996 [AT] gmail [DOT] com)
  2. Surjya Ghosh (surjya.ghosh [AT] gmail [DOT] com)
  3. Soumajit Pramanik (soumajit.pramanik [AT] gmail [DOT] com)
  4. Divyansh Gupta (divyanshgupta95 [AT] gmail [DOT] com)
  5. Sayan Mukhopadhyay (sayanm.kgp [AT] gmail [DOT] com)
  6. Lovekesh Garg (lovekeshgarg13 [AT] gmail [DOT] com)

Course Evaluation

Assignments: 30%

Mid-semester exam: 20%

End-semester exam: 50%


Topics (outline)

  1. Introduction: Basic principles, Applications, Challenges
  2. Supervised learning: Linear Regression (with one variable and multiple variables), Gradient Descent, Classification (Logistic Regression, Overfitting, Regularization, Support Vector Machines), Artificial Neural Networks (Perceptrons, Multilayer networks, back-propagation), Decision Trees
  3. Unsupervised learning: Clustering (K-means, Hierarchical), Dimensionality reduction, Principal Component Analysis, Anomaly detection
  4. Theory of Generalization: In-sample and out-of-sample error, VC inequality, VC analysis, Bias and Variance analysis
  5. Applications: Spam filtering, recommender systems, and others
  6. Advanced topics: Bias and fairness in Machine Learning
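
As a small taste of the supervised-learning topics above, here is a minimal sketch of batch gradient descent for linear regression in one variable. The data, learning rate, and iteration count are illustrative choices for this sketch, not course material:

```python
# Batch gradient descent for one-variable linear regression.
# Model: h(x) = w*x + b, fitted by minimizing the mean squared error.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]   # toy data generated by y = 2x + 1

w, b = 0.0, 0.0
lr = 0.05                   # learning rate (step size)
n = len(xs)

for _ in range(2000):
    # Gradients of the MSE loss with respect to w and b.
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    # Step downhill along the negative gradient.
    w -= lr * dw
    b -= lr * db

print(round(w, 2), round(b, 2))   # converges near w = 2.0, b = 1.0
```

The same update rule extends to multiple variables (one gradient component per parameter) and to polynomial regression by adding powers of x as extra features.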

Text and Reference Literature

  1. Christopher M. Bishop. Pattern Recognition and Machine Learning (Springer)
  2. David Barber, Bayesian Reasoning and Machine Learning (Cambridge University Press). Online version available here.
  3. Tom Mitchell. Machine Learning (McGraw Hill)
  4. Richard O. Duda, Peter E. Hart, David G. Stork. Pattern Classification (John Wiley & Sons)

Other interesting stuff

  1. 10 Things Everyone Should Know About Machine Learning - Daniel Tunkelang
  2. Ali Rahimi's Test-of-time award presentation at NIPS 2017 (comparing Machine Learning with Alchemy)


Slides

  • Introduction (Slides): Introduction to the course, utility of ML, applications of ML
  • Linear Regression (Slides): Linear regression in one variable and in multiple variables, gradient descent, polynomial regression
  • Classification using Logistic Regression (Slides): Binary classification; logistic regression; multi-class classification
  • Overfitting and Regularization (Slides): Overfitting and regularization in linear regression and logistic regression
  • Feasibility of Learning (Slides): Feasibility of learning an unknown target function; in-sample (training set) error and out-of-sample error
  • Theory of Generalization (no slides): Training vs. testing, bounding the testing error, breakpoints, Vapnik-Chervonenkis inequality, VC dimension, bias-variance tradeoff
    Resources:
      Proof of the VC inequality: pdf
      Paper "An Overview of Statistical Learning Theory" by Vapnik: pdf
      Several resources are available on the Web, e.g., MIT OpenCourseware, Lectures 3-5
  • Discrimination and Fairness in ML (Slides): Guest lecture by Professor Krishna P. Gummadi, MPI-SWS [the material up to Slide 60 (end of Part 3) is included in the end-semester syllabus]
  • Neural Networks (Slides): Perceptrons, neural networks, backpropagation algorithm, stochastic gradient descent
  • Error Analysis (Slides): Error analysis, validation, learning curves
  • Support Vector Machines (Slides): Margin, large-margin optimization, kernel methods
  • Decision Trees and Random Forests (Slides): Guest lecture by Dr. Parth Gupta, Amazon
  • Representation Learning (Slides): Guest lecture by Dr. Parth Gupta, Amazon [not included in end-semester syllabus]
  • Unsupervised Learning (Slides): Clustering - K-means clustering, hierarchical clustering; other unsupervised learning problems - Principal Component Analysis, topic modeling [PCA and topic modeling are not included in end-semester syllabus]
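
The K-means clustering covered in the unsupervised-learning lectures can be sketched in a few lines of Lloyd's algorithm. The 1-D toy data and K = 2 below are illustrative choices for this sketch, not from the course material:

```python
import random

# K-means (Lloyd's algorithm) on toy 1-D data with K = 2 clusters.
points = [1.0, 1.2, 0.8, 8.0, 8.2, 7.8]
random.seed(0)
centroids = random.sample(points, 2)    # initialize centroids from the data

for _ in range(10):
    # Assignment step: attach each point to its nearest centroid.
    clusters = {0: [], 1: []}
    for p in points:
        nearest = min((0, 1), key=lambda k: abs(p - centroids[k]))
        clusters[nearest].append(p)
    # Update step: move each centroid to the mean of its cluster.
    centroids = [sum(c) / len(c) if c else centroids[i]
                 for i, c in clusters.items()]

print(sorted(round(c, 1) for c in centroids))   # two cluster centres: [1.0, 8.0]
```

On this well-separated data the two centroids settle at the means of the low and high groups after a couple of iterations; in practice K-means is run with several random restarts because the result depends on initialization.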