Abhik Jana


About me

I have been a Ph.D. student in the Department of Computer Science and Engineering, IIT Kharagpur, since January 2015. My supervisor is Dr. Pawan Goyal. My research interests lie in Natural Language Processing and Cognitive Computing.

 




RESEARCH ABSTRACT

Word sense change detection
Word Sense Induction (WSI) methods induce word senses from raw text by clustering word occurrences on the basis of the distributional hypothesis. Approaches based on context clustering either use context vectors for each word and cluster them into groups denoting the senses, or build a word co-occurrence graph and cluster the open neighborhood to obtain the word senses. Word sense discovery methods, on the other hand, attempt to discover novel senses by comparing sense clusters across two time periods. For sense induction, one can use a distributional-thesaurus-based network built from a large dataset such as Google syntactic n-grams. After the network is clustered for each target word, the different clusters for that word are taken to denote its various senses. Such sense clusters can be constructed at different time points, and new senses can be discovered by comparing the two sets of clusters. On manual inspection, however, it appears that a "new sense" cluster does not necessarily correspond to a genuinely new sense. Our proposal is to use network properties to enhance the existing framework so that word sense change is detected more accurately. We treat the words in the sense cluster of a particular target word as an ego network for that word and measure the properties of this ego network across different time points. We find that this helps to improve the accuracy of word sense change detection.
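
The ego-network comparison described above can be illustrated with a minimal sketch. The text does not say which network properties are measured, so edge density among the cluster words is used here purely as an assumed example. The function name `ego_network_density`, the toy edge lists, and the cluster words are all hypothetical and are not taken from the actual system or data.

```python
def ego_network_density(edges, cluster_words):
    """Fraction of possible undirected links among the cluster words that exist.

    `edges` is an iterable of (word, word) pairs from a distributional
    thesaurus network at one time point; `cluster_words` are the words in
    one sense cluster of the target word. Both are illustrative inputs.
    """
    nodes = set(cluster_words)
    if len(nodes) < 2:
        return 0.0
    # Keep only edges whose two endpoints both lie inside the sense cluster.
    present = {
        frozenset((u, v))
        for u, v in edges
        if u in nodes and v in nodes and u != v
    }
    possible = len(nodes) * (len(nodes) - 1) // 2
    return len(present) / possible


# Hypothetical candidate "new sense" cluster for some target word.
cluster = ["keyboard", "monitor", "joystick"]

# Toy networks at two time points (assumed data, for illustration only).
edges_t1 = [("keyboard", "monitor"), ("keyboard", "piano")]
edges_t2 = [
    ("keyboard", "monitor"),
    ("monitor", "joystick"),
    ("keyboard", "joystick"),
]

density_t1 = ego_network_density(edges_t1, cluster)
density_t2 = ego_network_density(edges_t2, cluster)

# A large change in the property between the two time points would count
# as one piece of evidence when judging the candidate cluster.
density_change = density_t2 - density_t1
```

In this toy setting the cluster words are only sparsely linked at the first time point and fully linked at the second. How such a change should be thresholded or combined with other properties is left open here.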

Predicting references for Wikipedia pages
Wikipedia is a free encyclopedia, written collaboratively by the people who use it. It consists of millions of articles in more than 270 languages, which makes it a huge knowledge base that is evolving every moment. Researchers have also been working on enriching this knowledge base automatically. Each Wikipedia page usually has several sections, such as introduction, history, references, and external links. Our plan is to enrich the reference section of Wikipedia pages, so that readers can turn to specific documents for more information about the article. Our objective is to predict suitable reference documents and add them to the reference section of the Wikipedia article. To do this, we consider Computer Science related articles (Wikipedia pages) as of a particular timestamp and try to predict Computer Science related papers that could be added to their reference sections in the future.


PUBLICATIONS

Papers


ACADEMICS

Education

Job Experience

Teaching Assistantships


CONTACT ME


Date modified: Feb 15, 2015