CS674: Machine Learning and Knowledge Discovery

(Semester I, 2004-2005)

Announcements:

Instructor: Pabitra Mitra
(Room No. 224, CSE Dept., EMail: pmitra@iitk.ac.in)

Lecture Timings:

Day        Time         Room
Tuesday    11:30-1:00   CS102
Thursday   11:30-1:00   CS102

Teaching Assistants:
Amit Madaan & Jaya Kawale

Course Contents:

Machine Learning is the study of computer algorithms that improve automatically through experience. Applications range from data mining programs that discover general rules in large data sets, to information filtering systems that automatically learn users' interests.
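As a concrete (if minimal) illustration of an algorithm that improves through experience, here is a perceptron — one of the linear discriminants covered later in the course — trained on an invented toy data set. The data, learning rate, and epoch count are illustrative only, not course material:

```python
# A minimal perceptron: a linear classifier whose weights improve with
# each labelled example it sees (toy data invented for illustration).

def train_perceptron(examples, epochs=10, lr=1.0):
    """examples: list of (feature_vector, label) with label in {-1, +1}."""
    n = len(examples[0][0])
    w = [0.0] * n          # weight vector, initialised to zero
    b = 0.0                # bias term
    for _ in range(epochs):
        for x, y in examples:
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:          # misclassified: update
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# AND-like data: linearly separable, so the perceptron converges.
data = [([0, 0], -1), ([0, 1], -1), ([1, 0], -1), ([1, 1], 1)]
w, b = train_perceptron(data)
print([predict(w, b, x) for x, _ in data])   # matches the labels [-1, -1, -1, 1]
```

Because the toy data is linearly separable, the perceptron convergence theorem guarantees the loop above stops making updates after finitely many mistakes.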

The course covers the fundamental theory and algorithms of machine learning, as well as recent research topics. It is outlined below:

1. Introduction to learning systems.

2. Concept learning

3. Decision trees

4. Linear discriminants and support vector machines

5. Neural networks

6. Bayesian methods

7. Instance based learning

8. Inductive logic programming

9. Model selection and error estimation

10. Unsupervised learning

11. Online, active and reinforcement learning

12. Computational and statistical learning theory
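As a small taste of topic 3, decision-tree learners in the ID3/C4.5 family choose the attribute to split on by information gain. The following sketch computes entropy and gain on an invented four-example data set (the data and function names are assumptions for illustration):

```python
# Entropy and information gain as used by ID3/C4.5-style decision trees.
from math import log2
from collections import Counter

def entropy(labels):
    """Shannon entropy H(S) = -sum_i p_i log2 p_i over class proportions."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, attr_index):
    """Gain(S, A) = H(S) - sum_v |S_v|/|S| * H(S_v)."""
    base = entropy(labels)
    by_value = {}
    for row, y in zip(rows, labels):
        by_value.setdefault(row[attr_index], []).append(y)
    remainder = sum(len(sub) / len(labels) * entropy(sub)
                    for sub in by_value.values())
    return base - remainder

# Hypothetical data: attribute 0 predicts the label perfectly,
# attribute 1 is uninformative.
rows = [("sunny", "hot"), ("sunny", "cool"), ("rain", "hot"), ("rain", "cool")]
labels = ["yes", "yes", "no", "no"]
print(information_gain(rows, labels, 0))  # 1.0 bit
print(information_gain(rows, labels, 1))  # 0.0 bits
```

A tree learner would therefore split on attribute 0 first, since it reduces the label entropy from 1.0 bit to zero.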


Books and References:

1. Machine Learning, Tom M. Mitchell, McGraw-Hill International Edition, 1997
2. Pattern Classification, Duda, Hart and Stork, Wiley Singapore Edition, 2000.
3. Neural Networks: A Comprehensive Foundation, Simon Haykin, Prentice Hall, 1998.

4. An Introduction to Computational Learning Theory, M. Kearns and U. Vazirani, MIT Press, 1994.

5. Recent literature from the journals Machine Learning, Data Mining and Knowledge Discovery, IEEE Transactions on Neural Networks, and conferences ICML, COLT etc.


Course Policies:
Assignments

Term Projects: (Overview), (Proposed Topics), (List of Groups)

Lectures:

1. Introduction
Reference: Chapter 1, Mitchell
Further reading: Does Machine Learning Really Work?, Tom Mitchell.

2. Concept Learning
Reference: Chapter 2, Mitchell
Further reading: Version Spaces and the Consistency Problem, Haym Hirsh, Nina Mishra, and Leonard Pitt, Artificial Intelligence, 156(2):115-138, 2004.

3. Decision Trees
Reference: Chapter 3, Mitchell
Further reading: C4.5 Tutorial, http://www2.cs.uregina.ca/~hamilton/courses/831/notes/ml/dtrees/c4.5/tutorial.html

4. Support Vector Machines
Reference: A Tutorial on Support Vector Machines for Pattern Recognition, C. J. C. Burges, Data Mining and Knowledge Discovery, 2(2), 1998.
Further reading: A Short Tutorial on Kernels, B. Schölkopf, NIPS'00 Kernel Workshop, 2000.

5. Neural Networks
Reference: Chapter 4, Mitchell
Further reading: Neural Networks, Haykin.
The Self-Organizing Map, T. Kohonen, Proceedings of the IEEE, 78(9):1464-1480, September 1990.
Self-Organization of a Massive Document Collection, T. Kohonen, S. Kaski, K. Lagus, J. Salojarvi, J. Honkela, V. Paatero, and A. Saarela, IEEE Transactions on Neural Networks, 11(3):574-585, May 2000.

6. Bayesian Learning
Reference: Chapter 6, Mitchell
Further reading: Chapter 2, Duda, Hart and Stork.

7. Bayesian Networks
Reference: Chapter 6, Mitchell
Further reading: A Tutorial on Learning with Bayesian Networks, D. Heckerman. In Learning in Graphical Models, M. Jordan, ed., MIT Press, Cambridge, MA, 1999.

8. Hidden Markov Models
Reference: An Introduction to Hidden Markov Models, L. Rabiner and B. Juang, IEEE ASSP Magazine, 3(1):4-16, January 1986.

9. Inductive Logic Programming: Sequential Covering, FOIL, PROGOL
Reference: Chapter 10, Mitchell
Further reading: Inductive Logic Programming: Theory and Methods, S. Muggleton and L. De Raedt, 1994.

10. Clustering
Reference: Chapter 10, Duda, Hart & Stork
Further reading: Data Clustering: A Review, A. K. Jain, M. N. Murty, and P. J. Flynn, ACM Computing Surveys, 31(3):264-323, 1999.

11. Feature Selection
Reference: Chapter 10, Duda, Hart & Stork (Section 10.13)
Further reading: A Taxonomy of Feature Selection Methods, H. Liu et al.

12. Query Learning, Boosting, Ensemble Classifiers
Further reading: Ensemble Learning, T. G. Dietterich. In The Handbook of Brain Theory and Neural Networks, Second Edition (M. A. Arbib, ed.), Cambridge, MA: The MIT Press, 2002.
A Short Introduction to Boosting, Y. Freund and R. E. Schapire, Journal of Japanese Society for Artificial Intelligence, 14(5):771-780, September 1999.
Solving Multiclass Learning Problems via Error-Correcting Output Codes, T. G. Dietterich and G. Bakiri, Journal of Artificial Intelligence Research, 2:263-286, 1995.
Query by Committee, H. S. Seung, M. Opper, and H. Sompolinsky, Proceedings of the Fifth Annual Workshop on Computational Learning Theory, 1992.

13. Reinforcement Learning
Reference: Chapter 13, Mitchell
Further reading: Reinforcement Learning: A Survey, L. P. Kaelbling, M. L. Littman, and A. W. Moore, Journal of Artificial Intelligence Research, 4:237-285, 1996.

14. Computational Learning Theory, PAC Learning
Reference: Chapter 1, Kearns and Vazirani
Further reading: A Theory of the Learnable, Leslie Valiant, Communications of the ACM, 27(11), 1984.
An Overview of Statistical Learning Theory, V. N. Vapnik, IEEE Transactions on Neural Networks, 10(5), 1999.

15. VC-Dimension
Reference: Chapter 3, Kearns and Vazirani
Further reading: Learnability and the Vapnik-Chervonenkis Dimension, A. Blumer, A. Ehrenfeucht, D. Haussler, and M. K. Warmuth, Journal of the ACM, 36(4), 1989.

16. Online Learning and Mistake Bound Models: Halving, Weighted Majority, Perceptron, Winnow
Reference: Chapter 7, Mitchell
Further reading: On-Line Algorithms in Machine Learning (a survey), Avrim Blum.
Learning Quickly when Irrelevant Attributes Abound: A New Linear-Threshold Algorithm, Nick Littlestone, Machine Learning, 2(4):285-318, 1988.
The Weighted Majority Algorithm, N. Littlestone and M. K. Warmuth, Information and Computation, 108(2):212-261, 1994.
How to Use Expert Advice, Nicolò Cesa-Bianchi, Yoav Freund, David Haussler, David P. Helmbold, Robert E. Schapire, and Manfred K. Warmuth, Journal of the ACM, 44(3):427-485, May 1997.
For fun: the Mindreading program of Rob Schapire and Yoav Freund, compiled for SPARC or Linux (note: hit Ctrl-D to quit).

17. Weak and Strong Learnability
Reference: Chapter 4, Kearns and Vazirani
Further reading: A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting, Yoav Freund and Robert Schapire, Journal of Computer and System Sciences, 55(1):119-139, 1997.

18. Learning in the Presence of Noise, Statistical Query Model
Reference: Chapter 5, Kearns and Vazirani
Further reading: Queries and Concept Learning, Dana Angluin, Machine Learning, 2(4):319-342, 1988.

19. Computational Game Theory
Further reading: Computational Game Theory: A Tutorial, Michael Kearns, NIPS 2002.
Michael Kearns' course on Computational Game Theory: http://www.cis.upenn.edu/~mkearns/teaching/cgt
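The Weighted Majority algorithm from lecture 16 (Littlestone and Warmuth) can be sketched in a few lines. The expert predictions and outcomes below are invented for illustration, and beta = 0.5 (halving the weight of each mistaken expert) is one common choice of penalty:

```python
# Weighted Majority (Littlestone & Warmuth): maintain one weight per
# expert, predict by weighted vote, and multiply the weight of every
# expert that errs by beta.  Mistakes of the combined predictor are
# bounded in terms of the best single expert's mistakes.

def weighted_majority(expert_predictions, outcomes, beta=0.5):
    """expert_predictions: per round, a tuple of {0,1} guesses, one per expert.
    outcomes: the true {0,1} label of each round.
    Returns (learner's mistake count, final weights)."""
    n = len(expert_predictions[0])
    weights = [1.0] * n
    mistakes = 0
    for preds, outcome in zip(expert_predictions, outcomes):
        vote_1 = sum(w for w, p in zip(weights, preds) if p == 1)
        vote_0 = sum(w for w, p in zip(weights, preds) if p == 0)
        guess = 1 if vote_1 >= vote_0 else 0
        if guess != outcome:
            mistakes += 1
        # Penalise every expert that predicted wrongly this round.
        weights = [w * beta if p != outcome else w
                   for w, p in zip(weights, preds)]
    return mistakes, weights

# Invented run: expert 0 is always right, experts 1 and 2 are unreliable.
preds = [(1, 0, 1), (0, 0, 1), (1, 1, 0), (1, 0, 0)]
truth = [1, 0, 1, 1]
m, w = weighted_majority(preds, truth)
print(m, w)   # 0 mistakes; the reliable expert keeps weight 1.0
```

Note how the unreliable experts' weights decay geometrically, so the weighted vote comes to track the best expert, which is exactly the intuition behind the mistake bound proved in the Littlestone-Warmuth paper.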


Useful Links:

Weka 3 - Open Source Machine Learning Software in Java

UCI Machine Learning Data Set Repository

Introduction to Machine Learning (online book), N. J. Nilsson

Knowledge Discovery and Data Mining, KDNuggets