Ramnath Balasubramanyan
Research Interests: statistical machine learning, network analysis, text mining, natural language processing.
Summary of Skills
Strong research background in text analysis, information extraction and machine learning, specifically using topic models and other latent-variable models.
Expertise in implementing large scale machine learning and text analysis systems.
Knowledge of several languages, including C++, Pig, Python, Perl, R, Java and shell scripting.
Education
- PhD student, Aug '09 - present
Language Technologies Institute, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA.
Advisor: Prof. William Cohen.
Relevant Coursework: Machine Learning, Numerical Optimization, Text Driven Forecasting, Language & Statistics - I, Language & Statistics - II, Algorithms for NLP, Advanced NLP Seminar, Active Learning Seminar, Lab in NLP, Grammar Formalisms.
GPA: 3.82 (until Mid-2012)
- M.S. in Natural Language Processing (Computer Science), Aug '07 - May '09
Language Technologies Institute, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA.
GPA: 3.83
- Bachelor of Engineering (Computer Science and Engineering), Jun '99 - Jun '03.
Visvesvaraya Technological University, PESIT, Bangalore, India.
Score: 83.4%, First Class with Distinction
Other Coursework
- Text Retrieval and Web Search, Aug '06 - Dec '06
Non Degree Option program, Stanford University, Stanford, CA
- Bayesian Statistics
Conducted by Prof. David Draper (Univ. California at Santa Cruz) at Yahoo! Inc.
Work Experience
- Software Engineering Intern, Twitter Inc., San Francisco, CA, USA, Jul '12 - Sep '12
Personalization and Recommender Systems team - Worked on chatter detection and tweet categorization.
- Software Engineering Intern, Google Inc., Pittsburgh, PA, USA, Jun '09 - Aug '09
Product Search team - Worked on automatically matching product offers to product catalog entries.
- Senior Research Engineer, Yahoo! Inc, Santa Clara, CA, USA, Jun '05 - Jun '07
Content Analysis Group headed by Byron Dom; Natural Language Processing group headed by Hadar Shemtov
- Built a large scale text categorization application to categorize product offers for Yahoo! Shopping. Product categorization enabled key functionalities such as guided navigation and merchant cost-per-click pricing.
- Developed an information extraction system to extract attributes from merchant product offers using Hidden Markov Models and Conditional Random Fields for Yahoo! Shopping. Attribute extraction allows users to narrow searches for products based on attribute values such as brand and size.
- Worked on enhancing the scope and increasing the accuracy of the automatic question categorization system for Yahoo! Answers.
- Software Engineer, Yahoo! Inc, Bangalore, India, Jun '03 - May '05
- Core member of a team that developed a document categorization platform.
- Won Yahoo! Ratna (Best Employee) award for the year 2004
Research Projects
- Stochastic block models: Regularization and applications in social networks and politics, Aug '11 - now
Studying regularization techniques to make mixed membership stochastic block models more flexible in modeling real world networks.
Open sourced code at https://github.com/rbalasub/jigsaw
- A computational study of political decision making under Prof. Cohen and Prof. David P. Redlawsk (Rutgers University), Jun '10 - now
Developed techniques to construct corpus-specific sentiment word lists and used them to model reader responses to political blogs using a supervised LDA approach.
- Never Ending Language Learner project at CMU under Prof. Cohen, Jun '10 - now
- Querendipity project at CMU under Prof. Cohen, Sep '09 - now
Developing stochastic block models and topic models to model text and entity relations jointly. The model was applied to a corpus of publications about yeast and a protein-protein interaction network of yeast proteins.
- News Sagas project at CMU under Prof. Cohen (in collaboration with Microsoft Live Labs), Sep '08 to May '09
Developed topic models to cluster atomic news stories into "sagas" using parametric clustering models that model time-sensitive text and nonparametric Bayesian models based on hierarchical Dirichlet processes.
- RADAR project at CMU under Prof. Cohen (funded by DARPA), Aug '07 to Aug '08
- Message-Task Linking using lazy graph walks with random restarts on a specialized graph constructed for the task.
- Developed a Mozilla Thunderbird extension for email recipient recommendation and leak detection.
Patents
- Attribute extraction using limited training data, D. Pavlov, R. Balasubramanyan - United States Patent 7689527
- Assigning into one set of categories information that has been assigned to other sets of categories, B. Dom, H. Han, R. Balasubramanyan, D. Pavlov - United States Patent 7885859
- Automatic product categorization, B. Dom, A. Goyal, R. Balasubramanyan, D. Pavlov, B. Suresh - United States Patent 7870039
Publications
- Regularization of Latent Variable Models to Obtain Sparsity
Ramnath Balasubramanyan and William W. Cohen.
SDM 2013, SIAM International Conference on Data Mining, Austin.
- Characterizing User-Subgroups in Flickr Group: A Block LDA Based Approach
Sumit Negi, Ramnath Balasubramanyan and Santanu Chaudhury.
ICPR 2012, International Conference on Pattern Recognition, Tsukuba Science City, Japan.
- Entropic Regularization of Mixed-membership Network Models using Pseudo-observations
Ramnath Balasubramanyan and William W. Cohen.
MLG 2012, Workshop on Mining and Learning with Graphs at ICML, Edinburgh.
- Evaluating Joint Modeling of Yeast Biology Literature and Protein-Protein Interaction Networks
Ramnath Balasubramanyan, Kathryn Rivard, William W. Cohen, Jelena Jakovljevic and John Woolford.
BioNLP 2012, Workshop at NAACL 2012, Montreal.
- Modeling Polarizing Topics: When Do Different Political Communities Respond Differently to the Same News?
Ramnath Balasubramanyan, William W. Cohen, Doug Pierce and David Redlawsk.
ICWSM 2012, International AAAI Conference on Weblogs and Social Media, Dublin, Ireland.
- What pushes their buttons? Predicting comment polarity from the content of political blog posts
Ramnath Balasubramanyan, William W. Cohen, Doug Pierce and David Redlawsk.
Workshop on Language in Social Media (LSM 2011) at ACL (Annual Meeting of the Association for Computational Linguistics: Human Language Technologies), 2011, Portland, OR, USA.
- Combining stochastic block models and topic models
Ramnath Balasubramanyan and William W. Cohen.
SDM 2011, SIAM Conference on Data Mining, Phoenix, AZ, USA.
- Node Clustering in Graphs: An Empirical Study
Ramnath Balasubramanyan, Frank Lin and William W. Cohen.
NIPS 2010, Workshop on Networks Across Disciplines in Theory and Applications, Vancouver, BC, Canada.
- Block-LDA: Jointly modeling entity-annotated text and entity-entity links.
Ramnath Balasubramanyan and William W. Cohen
ICML 2010, International Conference on Machine Learning, Workshop on Topic Modeling, Haifa, Israel.
- From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series.
Brendan O'Connor, Ramnath Balasubramanyan, Bryan R. Routledge, and Noah A. Smith
ICWSM 2010, International AAAI Conference on Weblogs and Social Media, Washington DC, USA.
- From Episodes to Sagas: Understanding the News by Identifying Temporally Related Story Sequences.
Ramnath Balasubramanyan, Frank Lin, William W. Cohen, Matthew Hurst, Noah A. Smith
ICWSM 2009, International AAAI Conference on Weblogs and Social Media, San Jose, USA.
- Information Leaks and Suggestions: A Case Study using Mozilla Thunderbird.
Vitor Carvalho, William W. Cohen and Ramnath Balasubramanyan
CEAS 2009, Conference on Email and Anti-Spam, Mountain View, USA.
- CutOnce - Recipient Recommendation and Leak Detection in Action.
Ramnath Balasubramanyan, Vitor Carvalho and William W. Cohen
AAAI 2008, Conference on Artificial Intelligence, Workshop on Enhanced Messaging, Chicago, USA.
- Activity-centered Search in Email.
Einat Minkov, Ramnath Balasubramanyan and William W. Cohen
CEAS 2008, Conference on Email and Anti-Spam, Mountain View, USA.
- Document Preprocessing for Naïve Bayes Classification and Clustering with Mixture of Multinomials.
Dmitry Pavlov, Ramnath Balasubramanyan, Byron Dom, Shyam Kapur, Jignashu Parikh
KDD 2004, ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Seattle, USA.
Teaching
- Teaching Assistant for Machine Learning (10-601), Carnegie Mellon University - Spring 2010.
- Teaching Assistant for Language & Statistics - I (11-761), Carnegie Mellon University - Spring 2011.
Other
- Webmaster for ICWSM (International AAAI Conference on Weblogs and Social Media) 2009, 2010.
References available upon request.