Speech and Natural Language Processing - CS60057

Fall Semester - 2015-16


Pawan Goyal

Course Timings


Monday - 11:30 - 12:25 (NC243)

Tuesday - 9:30 - 11:25 (NC243)

Thursday - 7:30 - 8:25 (NC243) [Reserved Slot]

Office Hours: Friday - 5:30 - 7:00 PM (CSE - 308)

Teaching Assistants

Mayank Singh - mayank4490@gmail.com

Koustav Rudra - krudra5@gmail.com

Paheli Bhattacharya - pahelibhattacharya@gmail.com

Sruthi M - sruthiwarrier@gmail.com

Lecture Slides

The lecture slides for the course are uploaded to Moodle every Tuesday. Go to the CSE webpage, where you will find a link to Moodle, and follow the instructions on that page. You will need to self-enrol in the course "Speech and Natural Language Processing (CS60057)".


October 27th: The SNLP project final presentations will be held on November 15th (Sunday). Details will be mailed soon.

September 29th: The next set of lectures by Dr. Monojit Choudhury will be held on October 12-14, 5:30 - 7:00 PM. The venues are as follows:

October 12-13th: V-1 (Vikramshila Complex)

October 14th: F-127 (Main Building)

September 29th: The SNLP project mid-term progress report is to be submitted via Moodle by midnight on October 3rd.

August 6th: The next NLP lecture will be on August 17th.

August 4th: Flipkart has announced a grant of $1K for the course projects, which will be awarded to the top 3 projects after the final evaluation.

Dr. Monojit Choudhury will give the first set of lectures on August 19th and August 20th (Wed-Thu) from 5:30 - 7:30 PM. The venue will be F-127, directly opposite the Central Library.

The course will start on July 20th in Nalanda Complex - 243.

The course will also feature lectures on "Working with the Social Media Data" and "Opinion, Sentiments and User Behavior", delivered by Dr. Monojit Choudhury, Microsoft Research.

Reference Books

  1. Daniel Jurafsky and James H. Martin. 2009. Speech and Language Processing: An Introduction to Natural Language Processing, Speech Recognition, and Computational Linguistics. 2nd edition. Prentice-Hall.
  2. Christopher D. Manning and Hinrich Schütze. 1999. Foundations of Statistical Natural Language Processing. MIT Press.

Course Contents

The major components of the course include:
  1. Basic Text Processing: Tokenization, Stemming
  2. Language Modeling: N-grams, smoothing
  3. Morphology, Part-of-Speech Tagging
  4. Syntax: PCFGs, Dependency Parsing
  5. Distributional Semantics
  6. Lexical Semantics, Word Sense Disambiguation
  7. Information Extraction: Relation extraction
  8. Text Classification

SNLP Projects

Results are out. The top 3 teams are:

OCR++: An Open-Source Framework for Extracting Information from Scholarly Articles, Sidhartha Satapathy and Team

Characterization and Analysis of Non-Situational Microblogs during Disasters, Ankesh Anand and Team

Studying and Analysing the Evolution of a User’s Answering Style Over a Period of Time on Quora and Developing Models to Predict it, Vatsalya Chauhan and Team