Machine Learning (CS60050)
Spring semester 2018-19
Announcements
- End-semester syllabus includes all topics taught in the course.
- Assignment 4 declared - deadline April 19 - see below.
- Assignment 3 declared - deadline March 31 - see below.
- Assignment 2 declared - deadline March 15 - see below.
- Assignment 1 declared - deadline Feb 15 - see below.
- Every student should create an account on the Moodle submission system of the CSE department. This system will be used for submission and grading of assignments. Go to this link and follow the link "Moodle" (bottom-left of the page). Create a new account for yourself (unless you already have one), giving a username, password, and email id. After creating the account, log in to the system and follow the link "Spring Semester (2018-19)". Choose the course "Machine Learning" and join it as "Student", using the Student Enrolment Key: STUML.
- All registered students should join the mailing group https://groups.google.com/d/forum/machinelearning2019
Instructor
Saptarshi Ghosh
(Contact: saptarshi @ cse . iitkgp . ac . in)
Teaching Assistants
- Abhisek Dash (assignmentad @ gmail . com)
- Paheli Bhattacharya (pahelibhattacharya @ gmail . com)
- Shalmoli Ghosh (shalmolighosh94 @ gmail . com)
- Ainuddin Khan (ainuddin.india @ gmail . com)
- Harish Yadav (harishyadav394 @ gmail . com)
- Midatala Surya (surya.midatala @ gmail . com)
Course Timings (3 lectures)
Wednesday 11:00 - 11:55
Thursday 12:00 - 12:55
Friday 08:00 - 08:55
Class venue: NR421 (Nalanda complex)
Course evaluation
Assignments: 40% (There will be 4-5 assignments that will involve programming in C/C++/Java/Python)
Mid-semester exam: 20%
End-semester exam: 40%
Topics (outline)
- Introduction: Basic principles, Applications, Challenges
- Supervised learning: Linear Regression (with one variable and multiple variables), Gradient Descent, Classification -- Logistic Regression, Decision Trees, Naive Bayes, Support Vector Machines, Artificial Neural Networks (Perceptrons, Multilayer networks, back-propagation)
- Unsupervised learning: Clustering (K-means, Hierarchical), Dimensionality reduction
- Ensemble learning: Bagging, boosting
- Theory of Generalization: In-sample and out-of-sample error, Bias and Variance analysis, Overfitting, Regularization, VC inequality, VC analysis
- Advanced topics: Bias and fairness in Machine Learning
Text and Reference Literature
- Christopher M. Bishop. Pattern Recognition and Machine Learning (Springer)
- David Barber, Bayesian Reasoning and Machine Learning (Cambridge University Press). Online version available here.
- Tom Mitchell. Machine Learning (McGraw Hill)
- Richard O. Duda, Peter E. Hart, David G. Stork. Pattern Classification (John Wiley & Sons)
Assignments
Assignment 1 (Linear regression):
Question
Deadline: February 15, 2019, 23:59 IST
Evaluation by TAs: Sl. no. 01-31: Shalmoli | 32-62: Ainuddin | 63-92: Harish | 93-123: Surya
[Serial numbers according to the list of students given below]
Assignment 2 (Decision Trees):
Question
Deadline: March 15, 2019, 23:59 IST
Evaluation by TAs: Sl. no. 01-31: Surya | 32-62: Shalmoli | 63-92: Ainuddin | 93-123: Harish
[Serial numbers according to the list of students given below]
Assignment 3 (Clustering):
Question
Deadline: March 31, 2019, 23:59 IST
Evaluation by TAs: Sl. no. 01-31: Harish | 32-62: Surya | 63-92: Shalmoli | 93-123: Ainuddin
[Serial numbers according to the list of students given below]
Assignment 4 (Neural Networks):
Question
Deadline: April 19, 2019, 23:59 IST
Evaluation by TAs: Sl. no. 01-31: Ainuddin | 32-62: Harish | 63-92: Surya | 93-123: Shalmoli
[Serial numbers according to the list of students given below]
List of 123 students in the course: pdf
Slides
Topic | Slides | References / Comments
Introduction | Slides | Introduction to the course, utility of ML, applications of ML
Demo of ML tools | Material | Demonstration of ML tools (slides, datasets, sample scripts)
Linear Regression | Slides | Linear regression in one variable and multiple variables, concept of cost function, gradient descent, polynomial regression
Logistic Regression | Slides | Binary classification; logistic regression; multi-class classification
Evaluation and Overfitting | Slides | Evaluation and error analysis; Bias and Variance; Overfitting, validation and regularization
Decision Trees | Slides | Classification using Decision Trees, Hunt's algorithm, Impurity measures (Gini index, entropy), overfitting and pruning a Decision Tree
Fairness in Machine Learning | Slides | Fairness and bias, and how to deal with them
Unsupervised Learning: Clustering | Slides | Prototype based clustering, hierarchical clustering, graph clustering, density-based clustering
Dimensionality Reduction | Slides | Supervised and unsupervised ways of dimensionality reduction, Principal Component Analysis
Naive Bayes classifier | Slides | Bayesian classifiers, Naive Bayes
Neural networks | Slides | Perceptrons, Multilayer Perceptrons, Neural networks, backpropagation algorithm, Stochastic Gradient Descent
Support vector machines | Slides | Margin, margin optimization, Kernel methods
Ensemble Learning | Slides | Bagging, Boosting
Introduction to Theory of Generalization | No slides | Bounding the testing error, Breakpoints, Vapnik-Chervonenkis inequality, VC Dimension
Other interesting stuff
- 10 Things Everyone Should Know About Machine Learning - Daniel Tunkelang
- Ali Rahimi's Test-of-time award presentation at NIPS 2017 (comparing Machine Learning with Alchemy)
- Machine Learning resources
- Datasets for Machine Learning