Machine Learning (CS60050)

Spring semester 2017-18

Announcements

  • Weightage: Mid-semester: 20%, Assignments: 30%, End-semester: 50%
  • All topics, including those discussed in guest lectures, are included in the end-semester syllabus, unless specifically mentioned otherwise on this page.
  • Guest lectures by Dr. Parth Gupta, Amazon -- Dr. Gupta will take the regular classes on April 9th and 10th. He will also deliver an invited talk on "Machine Learning at Amazon" on Monday, April 9th, 5:30 pm - 7:30 pm. Venue for the invited talk: CSE 120 (CSE department main building, ground floor)
  • Assignment 2 declared. Solutions to be submitted via Moodle. Submission deadline (extended): April 15.
  • Guest lecture on Discrimination and Fairness in Machine Learning by Professor Krishna P. Gummadi, Max Planck Institute for Software Systems (MPI-SWS): March 7, 5:30 pm - 7:00 pm, CSE 119 (CSE department main building, ground floor)
  • Assignment 1 declared. Solutions to be submitted via Moodle. Submission deadline (extended): March 2. No further extension will be made.
  • Mid-semester examination schedule announced -- check the central examination time-table
  • All registered students must enrol in the Moodle course "ML-SPG-2018 (CS60050) Machine Learning" (follow the "Moodle" link on the CSE department site; enrolment key: student)
  • First class will be on Monday, January 8, 2018, 12:00


Instructor

Saptarshi Ghosh

Contact: saptarshi [AT] cse.iitkgp.ernet.in


Course Timings (3 lectures per week)

Monday 12:00 - 12:55

Tuesday 10:00 - 10:55, 11:00 - 11:55

Class venue: NC231 (Nalanda Classroom Complex)


Teaching Assistants

  1. Soumya Sarkar (portkey1996 [AT] gmail [DOT] com)
  2. Surjya Ghosh (surjya.ghosh [AT] gmail [DOT] com)
  3. Soumajit Pramanik (soumajit.pramanik [AT] gmail [DOT] com)
  4. Divyansh Gupta (divyanshgupta95 [AT] gmail [DOT] com)
  5. Sayan Mukhopadhyay (sayanm.kgp [AT] gmail [DOT] com)
  6. Lovekesh Garg (lovekeshgarg13 [AT] gmail [DOT] com)

Course Evaluation

Assignments: 30%

Mid-semester exam: 20%

End-semester exam: 50%


Topics (outline)

  1. Introduction: Basic principles, Applications, Challenges
  2. Supervised learning: Linear Regression (with one variable and multiple variables), Gradient Descent, Classification (Logistic Regression, Overfitting, Regularization, Support Vector Machines), Artificial Neural Networks (Perceptrons, Multilayer networks, back-propagation), Decision Trees
  3. Unsupervised learning: Clustering (K-means, Hierarchical), Dimensionality reduction, Principal Component Analysis, Anomaly detection
  4. Theory of Generalization: In-sample and out-of-sample error, VC inequality, VC analysis, Bias and Variance analysis
  5. Applications: Spam filtering, recommender systems, and others
  6. Advanced topics: Bias and fairness in Machine Learning
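
As a small taste of the supervised-learning topics above, here is a minimal sketch of batch gradient descent for linear regression in one variable. The data, learning rate, and iteration count are illustrative choices for this sketch, not course material:

```python
# Batch gradient descent for one-variable linear regression.
# Model: h(x) = w*x + b, fitted by minimizing the mean squared error.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]   # toy data generated by y = 2x + 1

w, b = 0.0, 0.0
lr = 0.05                   # learning rate (step size)
n = len(xs)

for _ in range(2000):
    # Gradients of the MSE loss with respect to w and b.
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    # Step downhill along the negative gradient.
    w -= lr * dw
    b -= lr * db

print(round(w, 2), round(b, 2))   # converges near w = 2.0, b = 1.0
```

The same update rule extends to multiple variables (one gradient component per parameter) and to polynomial regression by adding powers of x as extra features.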

Text and Reference Literature

  1. Christopher M. Bishop. Pattern Recognition and Machine Learning (Springer)
  2. David Barber, Bayesian Reasoning and Machine Learning (Cambridge University Press). Online version available here.
  3. Tom Mitchell. Machine Learning (McGraw Hill)
  4. Richard O. Duda, Peter E. Hart, David G. Stork. Pattern Classification (John Wiley & Sons)

Other interesting stuff

  1. 10 Things Everyone Should Know About Machine Learning - Daniel Tunkelang
  2. Ali Rahimi's Test-of-time award presentation at NIPS 2017 (comparing Machine Learning with Alchemy)


Slides

  • Introduction (Slides): Introduction to the course, utility of ML, applications of ML
  • Linear Regression (Slides): Linear regression in one variable and in multiple variables, gradient descent, polynomial regression
  • Classification using Logistic Regression (Slides): Binary classification; logistic regression; multi-class classification
  • Overfitting and Regularization (Slides): Overfitting and regularization in linear regression and logistic regression
  • Feasibility of Learning (Slides): Feasibility of learning an unknown target function; in-sample (training set) error and out-of-sample error
  • Theory of Generalization (no slides): Training vs. testing, bounding the testing error, breakpoints, Vapnik-Chervonenkis inequality, VC dimension, bias-variance tradeoff
    Resources:
      Proof of the VC inequality: pdf
      Paper "An Overview of Statistical Learning Theory" by Vapnik: pdf
      Several resources are available on the Web, e.g., MIT OpenCourseware, Lectures 3-5
  • Discrimination and Fairness in ML (Slides): Guest lecture by Professor Krishna P. Gummadi, MPI-SWS [the material up to Slide 60 (end of Part 3) is included in the end-semester syllabus]
  • Neural Networks (Slides): Perceptrons, neural networks, backpropagation algorithm, stochastic gradient descent
  • Error Analysis (Slides): Error analysis, validation, learning curves
  • Support Vector Machines (Slides): Margin, large-margin optimization, kernel methods
  • Decision Trees and Random Forests (Slides): Guest lecture by Dr. Parth Gupta, Amazon
  • Representation Learning (Slides): Guest lecture by Dr. Parth Gupta, Amazon [not included in end-semester syllabus]
  • Unsupervised Learning (Slides): Clustering - K-means clustering, hierarchical clustering; other unsupervised learning problems - Principal Component Analysis, topic modeling [PCA and topic modeling are not included in end-semester syllabus]
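
The K-means clustering covered in the unsupervised-learning lectures can be sketched in a few lines of Lloyd's algorithm. The 1-D toy data and K = 2 below are illustrative choices for this sketch, not from the course material:

```python
import random

# K-means (Lloyd's algorithm) on toy 1-D data with K = 2 clusters.
points = [1.0, 1.2, 0.8, 8.0, 8.2, 7.8]
random.seed(0)
centroids = random.sample(points, 2)    # initialize centroids from the data

for _ in range(10):
    # Assignment step: attach each point to its nearest centroid.
    clusters = {0: [], 1: []}
    for p in points:
        nearest = min((0, 1), key=lambda k: abs(p - centroids[k]))
        clusters[nearest].append(p)
    # Update step: move each centroid to the mean of its cluster.
    centroids = [sum(c) / len(c) if c else centroids[i]
                 for i, c in clusters.items()]

print(sorted(round(c, 1) for c in centroids))   # two cluster centres: [1.0, 8.0]
```

On this well-separated data the two centroids settle at the means of the low and high groups after a couple of iterations; in practice K-means is run with several random restarts because the result depends on initialization.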