I'm a graduate student at Carnegie Mellon University's Language Technologies Institute. Broadly speaking, my research interests lie in the fields of natural language processing, machine learning and artificial intelligence. I'm currently working on multilingual named entity recognition and co-reference resolution, under the supervision of Dr. Robert Frederking. I also collaborate closely with Dr. Anatole Gershman and Bo Lin. Before coming to Carnegie Mellon, I was a visiting research scholar at the University of Pennsylvania's Linguistic Data Consortium. My research there, under the guidance of Dr. Lyle Ungar and Dr. Mark Liberman, was on simultaneous word-sense disambiguation and part-of-speech tagging for Arabic text. In what seems like a previous life, I graduated with a B.Tech and an M.Tech from the Indian Institute of Technology, Kharagpur in 2008, with Dr. Arijit Bishnu as my advisor.

2011



  1. Bo Lin, Kevin Dela Rosa, Rushin Shah, Nitin Agarwal. 2011. LADS: Rapid Development of a Learning-To-Rank Based Related Entity Finding System using Open Advancement. Proceedings of the ACM SIGIR 1st International Workshop on Entity Oriented Search (SIGIR - EOS 2011). PDF

  2. Kevin Dela Rosa, Rushin Shah, Bo Lin, Anatole Gershman, Robert Frederking. 2011. Topical Clustering of Tweets. Proceedings of the ACM SIGIR 3rd Workshop on Social Web Search and Mining (SIGIR - SWSM 2011). PDF

  3. Rushin Shah, Bo Lin, Kevin Dela Rosa, Anatole Gershman, Robert Frederking. 2011. Improving Cross-Document Co-Reference with Semi-Supervised Information Extraction Models. Proceedings of the Symposium on Machine Learning in Speech and Language Processing (MLSLP 2011). PDF

2010


  1. Rushin Shah, Paramveer S. Dhillon, Mark Liberman, Dean Foster, Mohamed Maamouri and Lyle Ungar. 2010. A New Approach to Lexical Disambiguation of Arabic Text. Proceedings of EMNLP. PDF

  2. Bo Lin, Rushin Shah, Robert Frederking, Anatole Gershman. 2010. CONE: Metrics for Automatic Evaluation of Named Entity Co-reference Resolution. Proceedings of ACL Workshop on Named Entities. PDF

  3. Rushin Shah, Bo Lin, Anatole Gershman, Robert Frederking. 2010. SYNERGY: A Named Entity Recognition System for Resource-scarce Languages such as Swahili using Online Machine Translation. Proceedings of LREC Workshop on African Language Technology. PDF

  4. Bo Lin, Rushin Shah, Robert Frederking, Anatole Gershman. 2010. ENCORE: Experiments with a Synthetic Entity Co-reference Resolution Tool. Proceedings of LREC Workshop on Resources and Evaluation for Entity Resolution and Entity Management. PDF

2009


  1. Arijit Ghosh, Rushin Shah, Arijit Bishnu, Bhargab Bhattacharya. 2009. Algorithms for Biological Cell Sorting with a Lab-on-a-chip. World Congress on Nature and Biologically Inspired Computing (NaBIC ’09), Coimbatore, India. PDF

  2. Rushin Shah. 2009. The LDC Standard Arabic Morphological Tagger. Talk, LDC Institute, Philadelphia, USA. PDF

2008 and Prior

  1. Rushin Shah. 2008. Pushing Points, and Curves on Grid: Geometric Characterization and Algorithms. M.Tech Thesis, Kharagpur, India. PDF

  2. Rushin Shah. 2007. A Study on Uniform Insertion of Points. B.Tech Thesis, Kharagpur, India. PDF

  3. Rushin Shah. 2006. A Machine Learning Based System to Predict Species Labels of Gene Mentions in Medical Abstracts. Talk, University of Pennsylvania, Philadelphia. PDF



Rushin Shah
6412, Gates-Hillman Center
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA 15213

Email: rnshah AT cs DOT cmu DOT edu