Ongoing PhD Students


Haque Arijul

Area of Research : Speech Processing

Research objective: In the evolving field of human–computer interaction (HCI), humans communicate with computers through many modes. The most common are speech, text, and GUI-based interaction using a mouse or touchscreen. Among these, speech is one of the most intuitive and natural. Embedded in speech is a very important aspect of the message one intends to convey: emotion. For more natural HCI through speech, this emotional aspect needs to be incorporated into machines. Computers should be able both to understand the emotion conveyed in human speech and to generate speech in response that corresponds to both the message and the emotion identified. This requires machines to be able to recognize emotions from human speech.
My work focuses on this aspect, i.e., identifying emotions automatically from human speech. There has been much previous work in this area, and it remains an active topic of research. Many common signal processing techniques have been explored to extract features from emotional speech, and many pattern recognition algorithms have also been tried. However, the use of deep neural networks (DNNs), which have the potential to derive features from raw speech itself, has not yet been adequately explored in this area. This is a relatively new paradigm and needs further exploration. Therefore, I am trying to employ different types of DNNs in various configurations to identify emotions from speech. Speech is also not always noise-free, so the robustness of these techniques to noise will be investigated as well. In addition, an analysis of some important emotions (such as happiness, anger, and sadness) will be attempted to find out which features of speech contribute most to these emotions.

Email : rjlhq05@gmail.com


Priya Dharshini G

Area of Research : Speech Processing

Research objective: Real-time applications such as fraud detection and searching for specific topics in large educational collections such as NPTEL and MIT OCW require a technique called “keyword spotting”. Identifying a specific keyword, phoneme, word, or sub-word and predicting its number of occurrences is a highly challenging task. This research explores the problem with a focus on “unsupervised framework based keyword spotting”. The main task is to find keywords in a speech dataset in an unsupervised manner. This approach does not require any text transcription or labelled data; it uses only raw speech signals as input. The following picture illustrates unsupervised keyword spotting from ‘m’ speech samples.

Email : priyagdarshi@gmail.com


Madhu Keerthana Y

Area of Research : Speech Processing

Research objective: Speech pathology is a field of health science that deals with the evaluation of speech, language, and voice disorders. Voice disorders caused by defects in the speech organs or by multiple disabilities reduce quality of life and occupational performance, which results in considerable costs for both the patient and society. Traditionally, voice pathology is diagnosed using invasive methods such as direct inspection and observation of the vocal folds with endoscopic instruments. These techniques are expensive, risky, time-consuming, and uncomfortable for patients, and they require costly resources. To mitigate these problems and lower the barriers, noninvasive screening methods need to be developed to help ENT (Ear, Nose and Throat) clinicians and speech therapists assess and diagnose vocal fold pathologies.
Hence, my research focuses on the automatic detection and classification of different voice disorders using noninvasive speech-based techniques, with the objective of improving patients' quality of life by providing the right care at the right time.

Email : madhu.keerthu@gmail.com

Soumen Paul

Area of Research : Computer Vision

Research objective: My active research interests include developing computer vision models that incorporate domain knowledge in a systematic way. Currently, I work on object detection, recognition, and segmentation in images and videos from the Indian cultural heritage domain. I also use ontologies to model knowledge in the respective domain. Apart from that, I have done some work on speech recognition models and their domain adaptation.

Email : soumenpaul165@gmail.com

Arup Kumar Dutta

Area of Research : Speech Processing

Research objective:

Email : arupdutta1990@gmail.com

Sai Sriharsha Annepu

Area of Research : Speech and Language Processing

Research objective:

Email : annepuharsha@gmail.com