Machine Learning (CS60050)
Instructor: Sourangshu Bhattacharya
TAs:
- Abir De
- Tejas DP
- Payal Priyadarshini
Class Schedule: MON(9:30-10:30) , WED(8:30-9:30) , THURS(9:30-10:30)
Classroom: CSE-107 / NR-323 (Nalanda Classroom complex)
Website: http://cse.iitkgp.ac.in/~sourangshu/cs60050_15A.html
Announcements:
- Course "CS60050: Machine Learning" floated on Moodle: http://10.5.30.126/moodle/. Join as a student to receive and submit assignments.
- First Meeting: Monday, 20th July, at 9:30 am in CSE-107.
Content:
Syllabus:
Basic Principles: Introduction, Experimental Evaluation: Over-fitting, Cross-Validation. PAC learning. Sample complexity. VC-dimension, Reinforcement Learning.
Supervised Learning: Decision Tree Learning, k-NN classification, SVMs, Ensemble learning: boosting, bagging. Artificial Neural Networks: Perceptrons, Multilayer networks and back-propagation.
Probabilistic Models: Maximum Likelihood Estimation, MAP, Bayes Classifiers, Naive Bayes. Markov Networks, Bayesian Networks, Factor Graphs, Inference in Graphical Models.
Unsupervised Learning: K-means and Hierarchical Clustering, Gaussian Mixture Models, EM algorithm, Hidden Markov Models.
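To give a flavour of the unsupervised learning topics above, here is a minimal k-means sketch in plain Python. It is a simplification for illustration only: centroids are initialised to the first k points (real implementations use random restarts or k-means++), and no convergence check is done.

```python
def kmeans(points, k, iters=20):
    """Minimal k-means: alternate the assignment and update steps.

    points: list of tuples; k: number of clusters.
    Initialisation is the first k points (a simplification; standard
    practice is random or k-means++ initialisation).
    """
    centroids = list(points[:k])
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centroids[c])))
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for j, cl in enumerate(clusters):
            if cl:
                centroids[j] = tuple(sum(dim) / len(cl) for dim in zip(*cl))
    return centroids, clusters

# Two well-separated 2-D blobs; k-means should recover one centroid per blob.
pts = [(0.0, 0.0), (0.1, 0.2), (-0.1, 0.1),
       (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
cents, _ = kmeans(pts, k=2)
```

The EM algorithm for Gaussian mixtures generalises this: the hard assignment step becomes a soft (posterior-weighted) E-step, and the centroid update becomes an M-step that also re-estimates covariances and mixing weights.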
Textbooks:
- Tom Mitchell. Machine Learning. McGraw Hill, 1997.
- Christopher M. Bishop. Pattern Recognition and Machine Learning. Springer 2006.
- Richard O. Duda, Peter E. Hart, David G. Stork. Pattern Classification. John Wiley & Sons, 2006.
- Trevor Hastie, Robert Tibshirani, Jerome Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer 2009.
Prerequisites:
- Elementary Mathematics: Sets, vector spaces, dot products, distances, geometry, basic graph theory, combinatorics, etc.
- Probability theory: Random variables, Distributions, Expectations, Moments, Conditioning, Independence etc.
- Linear Algebra: Matrices, functions of matrices and vectors, derivatives of matrix formulae, etc.
- Optimization (will be discussed): Stationarity, KKT conditions, Gradient descent, etc.
Slides:
Note that the slides are not a comprehensive reference for everything taught in the class; they are designed to help you follow what is being taught. You are encouraged to take notes in class.
Some of the material is borrowed from Christopher Bishop's slides.
- Introduction - 22, 23, 27 July - slides
- Linear regression - 28, 30 July, and 3, 5, 6, 10 August - slides
- Classification 1 (Basics, Multi-class, Least squares, Fisher LDA) - 12, 13, 17 August - slides
- Classification 2 (Testing, cross-validation, Bayes classifier, Logistic regression, IRLS) - 20, 24, 26 August - slides
- Classification 3 (SVM, Kernels, VC Theory) - 27, 31 August; 2, 3 September - slides
- Algorithms for loss minimization (SMO for SVM, Online learning, Perceptron convergence, SGD convergence) - dates - slides
- Decision trees - 12, 14 and 15 October - slides
- Clustering, Mixture of Gaussians, EM Algorithm - 26, 28 and 29 October - slides
- Graphical Models - 26, 28, 29 October and 2, 5, 9, 11, 12 November - slides
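As a companion to the cross-validation material in Classification 2, here is a minimal k-fold splitting sketch. It uses contiguous folds without shuffling (a simplification; in practice data are shuffled first), and `kfold_indices` is an illustrative helper name, not from any particular library:

```python
def kfold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds.

    Yields (train, test) index lists; each index appears in exactly
    one test fold, so every example is held out exactly once.
    """
    # Distribute the n examples as evenly as possible over k folds.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

# 10 examples, 3 folds: fold sizes 4, 3, 3.
splits = list(kfold_indices(10, 3))
```

The cross-validated error estimate is then the average of the test-fold errors of a model refit on each training split.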
Assignments:
- Assignment 1: here. Due date: 23 August.
- Assignment 2: here. Due date: 27 September.
- Assignment 3: here. Due date: 8 November.
Class Tests:
- Class test 1: 19 August, 7:30 - 8:30, NR 323.
- Class test 2: 9 September, 7:30 - 8:25, NR 323.
- Class test 3: 4 November, 7:30 - 8:25, NR 323.