Information Retrieval (CS60092)
Instructors: Niloy Ganguly and Sourangshu Bhattacharya
TAs: Bidisha Samanta and Hussain Jagirdar
Content:
Syllabus:
We will mostly follow the standard syllabus. See here or here. Additionally, we have covered topics in information extraction and question answering.
Course Material:
The following material is for the post-midsem portion.
- Probabilistic IR and relevance feedback: slides
- Language models for IR: slides
- Text categorization, k-NN, Naive Bayes, etc.: slides
- SVM and Learning to rank : slides. Also covered from:
  - Thorsten Joachims. Optimizing Search Engines using Clickthrough Data. KDD 2002.
  - Yisong Yue, Thomas Finley, Filip Radlinski, Thorsten Joachims. A Support Vector Method for Optimizing Average Precision. SIGIR 2007.
  - Wei Chu and S. Sathiya Keerthi. New Approaches to Support Vector Ordinal Regression. ICML 2005.
- Semantic representation of words: LSA, word2vec, GloVe, retrieval: slides. Also covered from:
  - Mikolov, Tomas, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. "Distributed representations of words and phrases and their compositionality." NIPS 2013.
  - Pennington, Jeffrey, Richard Socher, and Christopher Manning. "GloVe: Global vectors for word representation." EMNLP 2014.
  - Mitra, Bhaskar, Eric Nalisnick, Nick Craswell, and Rich Caruana. "A dual embedding space model for document ranking." arXiv preprint arXiv:1602.01137 (2016).
- Information extraction: named entity recognition, structured learning, CRF, named entity disambiguation: slides. Also covered from:
  - Jenny Rose Finkel, Trond Grenager, and Christopher Manning. Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling. ACL 2005. Paper
  - Dernoncourt, Franck, Ji Young Lee, Ozlem Uzuner, and Peter Szolovits. "De-identification of patient notes with recurrent neural networks." Journal of the American Medical Informatics Association 24, no. 3 (2017): 596-606. Github Link: here
  - Ganea, Octavian-Eugen, and Thomas Hofmann. "Deep Joint Entity Disambiguation with Local Neural Attention." EMNLP 2017.
  - Lazic, N.; Subramanya, A.; Ringgaard, M.; and Pereira, F. Plato: A selective context model for entity resolution. TACL 2015.
  - He, Z.; Liu, S.; Li, M.; Zhou, M.; Zhang, L.; and Wang, H. Learning entity representation for entity disambiguation. ACL 2013.
  - Johannes Hoffart, Yasemin Altun, and Gerhard Weikum. Discovering emerging entities with ambiguous names. WWW 2014.
- Question answering: memory networks: slides. Also covered from:
  - Kumar, Ankit, Ozan Irsoy, Peter Ondruska, Mohit Iyyer, James Bradbury, Ishaan Gulrajani, Victor Zhong, Romain Paulus, and Richard Socher. "Ask me anything: Dynamic memory networks for natural language processing." ICML 2016.
  - Sukhbaatar, Sainbayar, Jason Weston, and Rob Fergus. "End-to-end memory networks." NIPS 2015.
  - Jason Weston, Sumit Chopra, Antoine Bordes. Memory Networks. ICLR 2015.
Textbooks:
- Manning, Christopher D., Prabhakar Raghavan, and Hinrich Schütze. Introduction to Information Retrieval. Cambridge: Cambridge University Press, 2008.