  • Grounded Contextual Semantics
    Sept 2018 - present
  • Multi Document Summarization of Movie Plots
    Sept 2018 - present
  • Textually Enriched Neural Module Networks for VQA
    Jan 2017 - May 2017
  • Hypercube of Neural QA models
    Jan 2018 - present
  • Query-Oriented Biomedical Text Summarization
    Jan 2017 - May 2018
  • Code-Mixed Question Answering
    Dec 2016 - present
  • Multi Task Learning to perform NER and Language Modeling
    Jan 2018 - May 2018
  • Multilingual Text Representation for Speech Synthesis
    Jan 2017 - May 2017
  • Context Maintenance in Conversational Agents (Amazon Alexa Prize Challenge)
  • Clockwork Recurrent Neural Network for Dialog State Tracking
  • Medical abstracts classification for Evidence Based Medicine
  • Refining Unit Sequences in Unit Concatenative TTS for Dravidian Languages
  • Advisors: Prof. Alan Black, Prof. Eric Nyberg

    The idea is to perform cross-video retrieval of transcriptions and images from different YouTube videos to generate a storyboard. We are working on automating the process of creating a new storyboard. The high-level goal is to produce storyboard-like visuals along with smoothed textual descriptions selected from large datasets of video and text.

  • Advisors: Prof. Carolyn Rose, Prof. Eric Nyberg

    Following the paradigm of abstraction after extraction, we have recently started working on explicitly identifying the relevant sentences first (extraction) and then using them to generate a summary. For this we are working with movie reviews, from which we extract all the plot-related sentences and then summarize the plot of the movie from these extracted sentences.

  • Advisor: Prof. Louis-Philippe Morency

    Extended Neural Module Networks (NMN), which dynamically instantiate network layouts based on the dependency parse of the question, by including image captions and attention vectors in end-to-end training on the VQA 1.0 dataset (achieved 57.1% overall accuracy). Question and caption context vectors are obtained by passing each through a single LSTM layer followed by a fully connected layer. When captions were irrelevant, attending to the caption with the information need in the query, after combining it with the respective question and the prediction from the NMNs, helped localize the answer space.

  • Advisor: Prof. Eric Nyberg

    Different neural architectures target specific challenges in a dataset, which hinders generalization across the hypercube of models and datasets for neural QA. We are identifying the overlapping categories of the errors these models make, with the aim of a systematic method for ensembling the models that generalizes across multiple QA domains.

  • Advisor: Prof. Eric Nyberg

    Participated in BioASQ 5B and 6B in ideal answer generation and achieved top ROUGE scores in the final test batches (approximately 0.68). Developed a query-oriented abstractive summarization system with an encoder-attention-decoder paradigm that attends to the query and relevant documents (query focus), uses diversity-based attention (to combat recurring sequences), and incorporates weight sharing and a pointer mechanism (to handle rare terms). Worked on sentence ordering and fusion algorithms to improve the coherence and readability of the generated summary.

  • Advisors: Prof. Alan Black, Prof. Eric Nyberg, Dr. Manoj Chinnakotla

    Developed WebShodh, an end-to-end web-based factoid QA system for code-mixed languages, and hosted it online. Built character-level shallow models such as an SVM to perform language identification. Due to the dearth of annotated data, used lexical-level resources like transliteration and translation to achieve an MRR of 0.37 and 0.32 in Hinglish (Hindi+English) and Tenglish (Telugu+English) respectively. Curated a dataset of around 5k code-switched factoid questions and corresponding English answers based on code-switched articles and images.
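    For readers unfamiliar with the metric, the following is a minimal sketch of how mean reciprocal rank (MRR) is commonly computed for a factoid QA system. The function names, the case-insensitive string match, and the toy data are illustrative assumptions, not the WebShodh evaluation code.

```python
from typing import List, Sequence, Tuple


def reciprocal_rank(ranked_answers: Sequence[str], gold: str) -> float:
    """Return 1/rank of the first answer matching gold, or 0.0 if none match."""
    target = gold.strip().lower()
    for rank, answer in enumerate(ranked_answers, start=1):
        # Assumption: answers are compared as case-insensitive strings.
        if answer.strip().lower() == target:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Tuple[Sequence[str], str]]) -> float:
    """Average the reciprocal rank over (ranked_answers, gold) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ranked, gold) for ranked, gold in runs) / len(runs)


# Hypothetical toy data, not taken from the actual evaluation.
toy_runs = [
    (["Delhi", "Mumbai"], "Delhi"),           # correct at rank 1 -> 1.0
    (["Chennai", "Hyderabad"], "Hyderabad"),  # correct at rank 2 -> 0.5
    (["Pune", "Agra"], "Kolkata"),            # not returned     -> 0.0
]
print(mean_reciprocal_rank(toy_runs))  # (1.0 + 0.5 + 0.0) / 3 = 0.5
```

    Under this definition, an MRR of 0.37 means the correct answer sits, on average, between the second and third returned positions.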

  • Advisors: Prof. Graham Neubig, Prof. Alan Black

    Jointly learning the code-switching points along with the actual task improves NER and language modeling in code-switched text. We have collected a corpus of 60k sentences with a mixing ratio above 0.2 from online blogging sources to train our multi-layered LSTM model, which comprises a word decoder as well as a language decoder.

  • Advisor: Prof. Alan Black

    Text in the navigation domain contains named entities, such as location names, that are not in the language in which the TTS database is recorded. Performed character-level SVM-based language identification to classify native and English words in GPS navigational instructions collected from the Google Maps API (~20k routes for 8 languages). Experimented with an LSTM sequence-to-sequence model to transliterate Romanized spellings into the native language to get the closest graphemes.

  • Advisors: Prof. Alan Black, Prof. Alex Rudnicky

    Developed a skill on Amazon Alexa, similar to the 20 questions game, that localizes on a movie by asking probing questions about the attribute that has minimum KL divergence with the uniform distribution. Performed intent detection using dependency parses and entity detection, and used coreference resolution and heuristics from centering theory to resolve entities and keep the conversation coherent.
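    The attribute choice can be illustrated with a small sketch. It assumes that each attribute's value distribution is estimated from the movies still in play, that the divergence is taken as KL(p || uniform) over the observed values, and that attributes with a single remaining value are skipped because asking about them cannot narrow the set. The names `kl_to_uniform`, `pick_attribute`, and the toy movies are hypothetical and not taken from the skill itself.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def kl_to_uniform(values: Sequence[Hashable]) -> float:
    """KL(p || uniform) for the empirical distribution p of the given values."""
    counts = Counter(values)
    k = len(counts)
    if k < 2:
        # Assumption: a single-valued attribute is useless as a question.
        return math.inf
    total = sum(counts.values())
    # KL(p || u) = sum_i p_i * log(p_i / (1/k)) = sum_i p_i * log(p_i * k)
    return sum((c / total) * math.log((c / total) * k) for c in counts.values())


def pick_attribute(candidates: List[Dict[str, str]], attributes: Sequence[str]) -> str:
    """Pick the attribute whose values split the candidates most evenly."""
    return min(attributes, key=lambda a: kl_to_uniform([m[a] for m in candidates]))


# Hypothetical candidate set for illustration only.
toy_movies = [
    {"genre": "drama", "decade": "1990s", "language": "Hindi"},
    {"genre": "drama", "decade": "2000s", "language": "Hindi"},
    {"genre": "drama", "decade": "1990s", "language": "Hindi"},
    {"genre": "comedy", "decade": "2000s", "language": "Hindi"},
]
print(pick_attribute(toy_movies, ["genre", "decade", "language"]))  # decade
```

    A value distribution close to uniform means every answer to the question removes a similar share of the candidates, which is why the evenly split attribute is preferred here.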

  • Advisors: Prof. Eric Xing, Prof. Matt Gormley

    The hidden layer in a CW-RNN is partitioned into separate modules that carry out update operations at different clock rates, which we use to monitor changing user goals in restaurant-domain search (Dialog State Tracking Challenge 2 (DSTC 2) dataset). The modules with a low clock rate can retain temporally distant, long-term information and output it. These states are preserved in parallel with the high-speed computations in the modules with a high clock rate, which store more recent information. Our system gave an L2 norm score of 0.38, while the provided baseline is 0.18.
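    The clocking scheme can be sketched as follows. This is a simplified illustration rather than the model used in the project: each module has a period, a module is updated only at time steps divisible by its period, and all other modules carry their previous state forward. The periods 1, 2, 4 and the stand-in candidate states are assumptions, and the recurrent weight matrices are omitted.

```python
from typing import List, Sequence


def active_modules(t: int, periods: Sequence[int]) -> List[int]:
    """Indices of the modules whose clock ticks at time step t."""
    return [i for i, period in enumerate(periods) if t % period == 0]


def clockwork_step(
    t: int,
    prev_state: Sequence[float],
    candidate_state: Sequence[float],
    periods: Sequence[int],
) -> List[float]:
    """Take the candidate value for active modules; keep the old value otherwise."""
    active = set(active_modules(t, periods))
    return [
        candidate_state[i] if i in active else prev_state[i]
        for i in range(len(periods))
    ]


# Assumed exponential periods: module 0 is fast, module 2 is slow.
periods = [1, 2, 4]
state = [0.0, 0.0, 0.0]
for t in range(6):
    # Stand-in for a real RNN update: each candidate just records the step index.
    candidate = [float(t)] * len(periods)
    state = clockwork_step(t, state, candidate, periods)
print(state)  # [5.0, 4.0, 4.0]: the slower modules still hold values from step 4
```

    After six steps the fast module reflects the latest input, while the modules with periods 2 and 4 still hold the value written at step 4, which is how slowly clocked modules retain older context.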

  • Advisor: Prof. Ani Nenkova (UPenn)

    Evidence-based medicine is a protocol for coalescing assorted clinical expertise and systematically searching through it for optimal evidence. The US National Library of Medicine (NLM) bibliographic database is used for automatic classification of biomedical text into labels from an exclusive set formed by the union of PICO labels and 5 NLM categories (8 classes in total). A hierarchical classifier is built using SVM and SVR, with Brown clusters used to trigger the feature value for absent words. Unstructured abstracts are used to generate Brown clusters and word embeddings as features, with the idea of using them as triggers for similarity features, respectively, when the feature itself is absent.

  • Advisor: Dr. Kishore Prahallad

    Implemented a 3-part backoff strategy involving atomizing diphthongs, vowel epithesis and anaptyxis to combat missing syllables. Improved prosody by HMM-based pause prediction (preferred in AB testing) and developed a lightweight TTS Android app for the same.





Hi Amma...

My mom is the best cook in the world. No, seriously, you have no idea how great my mom is. I am from Hyderabad and I love even the simplest of her rasams more than the famous Hyderabadi biryani. She is my friend and also the epitome of patience.

That's my advisor Alan Black

Be it a Saturday or a Sunday, he is always there for you! I love how our own Dumbledore of LTI is witty, wise and sarcastic. There is a lot to learn from your dedication and calm nature. Of course I am not qualified enough to brag about your research capabilities here. From the thought process of how to choose an interesting problem, to doing systematic groundwork, to pushing boundaries, I am making a novice apprentice's effort to learn all this from you.


Just like all kids, I have grown up watching Road Runner, Tom and Jerry and The Flintstones. Apart from these, the only other shows I have watched are reality dance competitions. Not having a dance teacher close to my home as a kid did not give me the opportunity to be trained in dance, and I know that's not an excuse. That did not stop me from learning random dance compositions from my seniors in undergrad and performing. Now and then, I still go through numerous free online lessons to learn traditional dance. The feeling you get after learning a sequence of steps and what they mean is amazing. I know I should be more systematic and sincere in learning this art form...

What happens when a meeting is cancelled...

Well, it's raining and my meeting got cancelled. Though the rain had nothing to do with the cancellation, I decided to stay back home. My brain said I didn't want to code today, so I opened YouTube, which had a movie suggestion. I know my amateur drawing does not resemble her but she happens to be Deepika Padukone.

Pride and Prejudice...

Sunday evening chit chats with a friend who did not know the story of Pride and Prejudice... While giving a gist of the story, I wanted to read it again myself. Juggling between other activities, I managed to finish it after around 3 days, and then I was in the mood to sketch it along with an awesome remark. Misaligning the dialog of Mr Darcy with the frame in the drawing, I managed to make a small sketch.

A happy accident :)

I will let you in on a small secret. This is where I am going to admit that I started drawing a face and smudged the corners. Darkening it more and adding a theme turned out to be this. That is not it. When some of my friends saw this, they told me that they love Sachin Tendulkar. Well, who doesn't? But that's not the point. I have not tried to draw that awesome sports star. Turned out to be a happy accident. And then I went overboard and wanted to put a strong punchline and happened to ruin it from the side.

Meet my friends :)

The first time I had to stay away from my home was for my undergraduate studies. I should not call myself a hostelite, as my home was just 12 km away and I would go back to eat my mom's food and sleep in my bed every weekend. Although I still qualify as a pseudo-hostelite, since I spent 5 and a half days a week in the hostel. These people are my family there. I learned the power of adjustments, understanding each other and helping each other out. The numerous night outs with both academic stuff and of course random chattering, and campus walks discussing topics ranging from the life of a tadpole to deep philosophies of existence, are amazing only because of you guys. Complementing all this is the midnight juice we get from our canteen, where I have no idea of the frantic number of chocolate milkshakes that made all this even more blissful.


The term is derived from the Sanskrit name for Lord Krishna. It means 'One who is as clear as a crystal'. I am fascinated by the phonetics of the word. I have heard someone talk about this name in a temple. It also means that babies have no prejudices or hatred. Their personalities reflect from what is written on their crystal clear hearts and minds. It may be true that a soul strengthens by learning from adversities. But that need not be the case always. We are capable of learning through empathy. Placing ourselves in the shoes of another person comes naturally to us. Understanding, respecting and being welcoming of people around us strengthens us and our community that fills our clean canvas we get as babies with a beautiful painting.

One of the early drawings I made was a gift for a friend. There was no occasion, but I know she loves Federer. She knows a lot of technical details of the sport, which I learn from discussing with her. I am one of those who like decorating their rooms with personalized stuff that reminds us of a story. I wanted her room also to have a small amateur sketch of the person she adores. Hence it all began ...