... somewhere something incredible is waiting to be known.

Information Retrieval (CS60092)

Instructors: Niloy Ganguly and Sourangshu Bhattacharya

TAs: Bidisha Samanta and Hussain Jagirdar


Content:


Syllabus:

We will mostly follow the standard syllabus (see here or here). In addition, we have covered topics in information extraction and question answering.


Course Material:

The following material is for the post-midsem portion.

  • Probabilistic IR and relevance feedback: slides
  • Language models for IR: slides
  • Text categorization, k-NN, Naive Bayes, etc.: slides
  • SVM and learning to rank: slides. Also covered from:
    • Optimizing Search Engines using Clickthrough Data. Thorsten Joachims. KDD 2002.
    • A Support Vector Method for Optimizing Average Precision. Yisong Yue, Thomas Finley, Filip Radlinski, Thorsten Joachims. SIGIR 2007.
    • New Approaches to Support Vector Ordinal Regression. Wei Chu and S. Sathiya Keerthi. ICML 2005.
  • Semantic representation of words (LSA, word2vec, GloVe) and retrieval: slides. Also covered from:
    • Mikolov, Tomas, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. "Distributed representations of words and phrases and their compositionality." NIPS 2013.
    • Pennington, Jeffrey, Richard Socher, and Christopher Manning. "GloVe: Global vectors for word representation." EMNLP 2014.
    • Mitra, Bhaskar, Eric Nalisnick, Nick Craswell, and Rich Caruana. "A dual embedding space model for document ranking." arXiv preprint arXiv:1602.01137 (2016).
  • Information extraction (named entity recognition, structured learning, CRF, named entity disambiguation): slides. Also covered from:
    • Jenny Rose Finkel, Trond Grenager, and Christopher Manning. Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling. ACL 2005. Paper
    • Dernoncourt, Franck, Ji Young Lee, Ozlem Uzuner, and Peter Szolovits. "De-identification of patient notes with recurrent neural networks." Journal of the American Medical Informatics Association 24, no. 3 (2017): 596-606. GitHub link: here
    • Ganea, Octavian-Eugen, and Thomas Hofmann. "Deep Joint Entity Disambiguation with Local Neural Attention." EMNLP 2017.
    • Lazic, N.; Subramanya, A.; Ringgaard, M.; and Pereira, F. Plato: A selective context model for entity resolution. TACL 2015.
    • He, Z.; Liu, S.; Li, M.; Zhou, M.; Zhang, L.; and Wang, H. Learning entity representation for entity disambiguation. ACL 2013.
    • Johannes Hoffart, Yasemin Altun, and Gerhard Weikum. Discovering emerging entities with ambiguous names. WWW 2014.
  • Question answering (memory networks): slides. Also covered from:
    • Kumar, Ankit, Ozan Irsoy, Peter Ondruska, Mohit Iyyer, James Bradbury, Ishaan Gulrajani, Victor Zhong, Romain Paulus, and Richard Socher. "Ask me anything: Dynamic memory networks for natural language processing." ICML 2016.
    • Sukhbaatar, Sainbayar, Jason Weston, and Rob Fergus. "End-to-end memory networks." NIPS 2015.
    • Jason Weston, Sumit Chopra, and Antoine Bordes. Memory Networks. ICLR 2015.


Textbooks:

  • Manning, Christopher D., Prabhakar Raghavan, and Hinrich Schütze. Introduction to Information Retrieval. Cambridge: Cambridge University Press, 2008.