I am an assistant professor at CMU in the Machine Learning and the Computer Science departments. I work in the areas of statistical learning theory and game theory, with a focus on learning from people.
Google Scholar page
nihars [at] cs.cmu.edu
Office: GHC 8211
My research interests lie in the areas of statistics, machine learning, information theory and game theory, with a focus on human-centered systems. I am presently particularly excited about developing principled approaches towards improving peer review. Peer review faces a number of critical challenges related to biases, noise, incentives, subjectivity and others. Addressing these challenges is particularly urgent due to the exploding number of submissions in many fields, which is necessitating an increased automation in the peer review process. My research aims to address these important challenges at scale, in a principled and pragmatic manner.
- Bias: Reviewers are often miscalibrated. As an example, suppose that ratings must be made in the range [0,1]. Then it might be the case that one reviewer is lenient and always provides a score of at least 0.6, whereas another reviewer is stringent and hardly ever provides a score of more than 0.4. Or it might be the case that one reviewer is moderate whereas the other is extreme -- the first reviewer's 0.2 is equivalent to the second reviewer's 0.1 whereas the first reviewer's 0.3 is equivalent to the second reviewer's 0.9. If these biases are a priori unknown, how would one calibrate the reviewers (from, say, just one review obtained per reviewer)? Indeed, popular estimation algorithms (and more generally, all deterministic algorithms) fail to accomplish this task. We design a novel randomized estimator that can handle arbitrary and even adversarial miscalibrations. (link)
- Variance: Reviews are often noisy due to reasons such as imperfect matches between papers and reviewers. The assignment algorithms popularly employed today can be unfair to certain papers by assigning poorer-fit reviewers to interdisciplinary or novel papers in favor of better reviewers to papers pursuing standard topics. We present an algorithm to assign reviewers to papers which guarantees fairness of assignment to all papers. Simultaneously, the algorithm guarantees statistical accuracy of the review procedure. (link)
- Strategic behavior: The presence of various conflicts of interest in peer review can incentivize strategic reviews to influence the final ranking of one's own papers. We present a framework for designing peer review systems that are insulated from strategic manipulations. Specifically, we design assignment and aggregation algorithms which ensure that no reviewer can influence the ranking of any papers with which they have a conflict of interest. We present positive results in terms of an algorithm and an analysis on ICLR 2017 data, as well as negative results which demonstrate the challenges in this problem. (link)
- Subjectivity: It is common to see a handful of reviewers reject a highly novel paper, because they view, say, extensive experiments as far more important than novelty, whereas the community as a whole would have embraced the paper. More generally, the fact that any paper is reviewed by only a handful of reviewers leads to a high influence of subjectivity in terms of the preferences of these few reviewers. We develop a novel method to mitigate this subjectivity -- to first learn the community's preferences from the individual reviews and then judge every paper with the same yardstick. We prove that surprisingly, this is the only method which meets three simple and natural requirements. (link coming soon)
- Empirical analysis: We analyze the peer-review data from NIPS 2016. We make several surprising observations from this data, outline some key takeaways for use in future conferences, and highlight a number of challenging and useful open problems. (link)
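The miscalibration described in the Bias item above can be made concrete with a small sketch. It only illustrates the problem (comparing raw scores across differently calibrated reviewers), not the randomized estimator from the paper. The reviewer functions, the "true quality" values and all names below are hypothetical and chosen purely for illustration.

```python
# Illustrative sketch only: two hypothetical reviewers whose scores are
# monotone distortions of an assumed "true" paper quality in [0, 1].
# None of these functions come from the paper; they are assumptions.

def lenient(quality: float) -> float:
    """Hypothetical lenient reviewer: never gives a score below 0.6."""
    return 0.6 + 0.4 * quality


def stringent(quality: float) -> float:
    """Hypothetical stringent reviewer: never gives a score above 0.4."""
    return 0.4 * quality


def naive_winner(score_a: float, score_b: float) -> str:
    """Pick the paper with the higher raw score, ignoring calibration."""
    return "A" if score_a > score_b else "B"


# Assumed ground truth: paper B is much better than paper A.
true_quality = {"A": 0.2, "B": 0.9}

# Paper A happens to be reviewed by the lenient reviewer,
# paper B by the stringent one (one review per reviewer).
score_a = lenient(true_quality["A"])      # roughly 0.68
score_b = stringent(true_quality["B"])    # roughly 0.36

# Comparing raw scores prefers the weaker paper A.
print("Naive comparison prefers paper", naive_winner(score_a, score_b))
```

In this toy setting the raw-score comparison ranks the weaker paper higher, which is the failure mode that motivates calibration when the reviewers' biases are unknown.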
PAST RESEARCH:
Crowdsourcing (2013-17). Awarded the David J. Sakrison Memorial Prize at UC Berkeley.
Distributed storage (2009-13). Awarded the IEEE Data Storage Best Paper and Best Student Paper Awards for the years 2011 & 2012.
- Your 2 is My 1, Your 3 is My 9: Handling Arbitrary Miscalibrations in Ratings
Jingyan Wang and Nihar B. Shah
- PeerReview4All: Fair and Accurate Reviewer Assignment in Peer Review
Ivan Stelmakh and Nihar B. Shah and Aarti Singh
- On Strategyproof Conference Review
Yichong Xu, Han Zhao, Xiaofei Shi and Nihar B. Shah
- Choosing How to Choose Papers
Ritesh Noothigattu, Nihar B. Shah and Ariel Procaccia
- Design and Analysis of the NIPS 2016 Review Process
Nihar B. Shah, Behzad Tabibian, Krikamol Muandet, Isabelle Guyon and Ulrike von Luxburg
PAST RESEARCH ON CROWDSOURCING
- Low Permutation-rank Matrices: Structural Properties and Noisy Completion
Nihar B. Shah, Sivaraman Balakrishnan and Martin J. Wainwright
ISIT 2018
- A Permutation-based Model for Crowd Labeling: Optimal Estimation and Robustness
Nihar B. Shah, Sivaraman Balakrishnan and Martin J. Wainwright
Code for the WAN and the OBI-WAN estimators
Dataset
- Active Ranking from Pairwise Comparisons and when Parametric Assumptions Don’t Help
Reinhard Heckel, Nihar B. Shah, Kannan Ramchandran and Martin J. Wainwright
Code
- Feeling the Bern: Adaptive Estimators for Bernoulli Probabilities of Pairwise Comparisons
Nihar B. Shah, Sivaraman Balakrishnan and Martin J. Wainwright
Shorter version at ISIT 2016.
Code for the CRL estimator
- Simple, Robust and Optimal Ranking from Pairwise Comparisons
Nihar B. Shah and Martin J. Wainwright
Journal of Machine Learning Research (to appear).
Dataset
- Stochastically Transitive Models for Pairwise Comparisons: Statistical and Computational Issues
Nihar B. Shah, Sivaraman Balakrishnan, Adityanand Guntuboyina and Martin J. Wainwright
IEEE Transactions on Information Theory 2017 (Shorter version at ICML 2016).
- No Oops, You Won't Do It Again: Mechanisms for Self-correction in Crowdsourcing
Nihar B. Shah and Dengyong Zhou
ICML 2016.
- Approval Voting and Incentives for Crowdsourcing
- Estimation from Pairwise Comparisons: Sharp Minimax Bounds with Topology Dependence
Nihar B. Shah, Sivaraman Balakrishnan, Joseph Bradley, Abhay Parekh, Kannan Ramchandran, and Martin J. Wainwright
The Journal of Machine Learning Research, 2016.
Dataset for cardinal vs. ordinal
Dataset for pairwise comparison topologies
- Double or Nothing: Multiplicative Incentive Mechanisms for Crowdsourcing
Nihar B. Shah and Dengyong Zhou
Journal of Machine Learning Research 2016 (shorter version at NIPS 2015).
Dataset
- Parametric Prediction from Parametric Agents
Yuan Luo, Nihar B. Shah, Jianwei Huang, Jean Walrand
Operations Research, 2017.
- Truth Serums for Massively Crowdsourced Evaluation Tasks
Vijay Kamble, Nihar Shah, David Marn, Abhay Parekh, Kannan Ramchandran
SCUGC 2015: The 5th Workshop on Social Computing and User-Generated Content.
- On the Impossibility of Convex Inference in Human Computation
Nihar B. Shah and Dengyong Zhou
AAAI, Austin, Jan. 2015.
- A Case for Ordinal Peer-evaluation in MOOCs
Nihar B. Shah, Joseph Bradley, Abhay Parekh, Martin J. Wainwright, Kannan Ramchandran
Neural Information Processing Systems (NIPS): Workshop on Data Driven Education, Lake Tahoe, Dec. 2013.
- Regularized Minimax Conditional Entropy for Crowdsourcing
Dengyong Zhou, Qiang Liu, John Platt, Christopher Meek, and Nihar B. Shah
Dec. 2014.
PAST RESEARCH ON DISTRIBUTED STORAGE
(* indicates equal contribution)
- The MDS Queue: Analysing Latency Performance of Codes
Nihar B. Shah, Kangwook Lee and Kannan Ramchandran
IEEE Transactions on Information Theory, 2017.
- A Piggybacking Design Framework for Read- and Download-efficient Distributed Storage Codes
K. V. Rashmi, Nihar B. Shah and Kannan Ramchandran
IEEE Transactions on Information Theory, 2017.
Slides from conference (ISIT) presentation
- When Do Redundant Requests Reduce Latency?
Nihar B. Shah, Kangwook Lee and Kannan Ramchandran
IEEE Transactions on Communications, Feb. 2016.
Slides
- Distributed Storage Codes with Repair-by-Transfer and Non-achievability of Interior Points on the Storage-Bandwidth Tradeoff
Nihar B. Shah*, K. V. Rashmi*, P. Vijay Kumar and Kannan Ramchandran
IEEE Transactions on Information Theory, March 2012.
- Optimal Exact-Regenerating Codes for Distributed Storage at the MSR and MBR Points via a Product-Matrix Construction
K. V. Rashmi*, Nihar B. Shah* and P. Vijay Kumar
IEEE Transactions on Information Theory, August 2011.
IEEE Data Storage Best Paper and Best Student Paper Awards for the years 2011 & 2012.
- Interference Alignment in Regenerating Codes for Distributed Storage: Necessity and Code Constructions
Nihar B. Shah*, K. V. Rashmi*, P. Vijay Kumar and Kannan Ramchandran
IEEE Transactions on Information Theory, April 2012.
- On Minimizing Data-read and Download for Storage-Node Recovery
Nihar B. Shah
IEEE Communications Letters, 2013.
Second place in the first ACM University Student Research Competition, 2013.
- Having Your Cake and Eating It Too: Jointly Optimal Codes for I/O, Storage and Network-bandwidth In Distributed Storage Systems
KV Rashmi, Preetum Nakkiran, Jingyan Wang, Nihar B. Shah, and Kannan Ramchandran
USENIX FAST, Santa Clara, Feb. 2015.
Picked as the best paper of USENIX FAST 2015 by StorageMojo.
- Fundamental Limits on Communication for Oblivious Updates in Storage Networks
Preetum Nakkiran, Nihar B. Shah, K. V. Rashmi
IEEE GLOBECOM 2014, Dec. 2014.
- A "Hitchhiker's" Guide to Fast and Efficient Data Reconstruction in Erasure-coded Data Centers
K. V. Rashmi, Nihar B. Shah, Dikang Gu, Hairong Kuang, Dhruba Borthakur, and Kannan Ramchandran
ACM SIGCOMM, Aug 2014.
- One Extra Bit of Download Ensures Perfectly Private Information Retrieval
Nihar B. Shah, K. V. Rashmi and Kannan Ramchandran
ISIT 2014.
- A Solution to the Network Challenges of Data Recovery in Erasure-coded Distributed Storage Systems: A Study on the Facebook Warehouse Cluster
K. V. Rashmi, Nihar B. Shah, Dikang Gu, Hairong Kuang, Dhruba Borthakur, and Kannan Ramchandran
USENIX HotStorage, San Jose, Jun. 2013.
- Secret Sharing Across a Network with Low Communication Cost: Distributed Algorithm and Bounds
Nihar B. Shah, K. V. Rashmi and Kannan Ramchandran
IEEE International Symposium on Information Theory (ISIT), Istanbul, Jul. 2013.
Slides Poster
- Regenerating Codes for Errors and Erasures in Distributed Storage
K. V. Rashmi*, Nihar B. Shah*, Kannan Ramchandran, and P. Vijay Kumar
IEEE International Symposium on Information Theory (ISIT), Cambridge, Jul. 2012.
Slides
- Information-theoretically Secure Regenerating Codes for Distributed Storage
Nihar B. Shah*, K. V. Rashmi*, and P. Vijay Kumar
Globecom 2011.
- Enabling Node Repair in Any Erasure Code for Distributed Storage
K. V. Rashmi*, Nihar B. Shah* and P. Vijay Kumar
IEEE International Symposium on Information Theory (ISIT), St. Petersburg, Jul. 2011.
- A Flexible Class of Regenerating Codes for Distributed Storage
Nihar B. Shah*, K. V. Rashmi*, and P. Vijay Kumar
IEEE International Symposium on Information Theory (ISIT), Austin, Jun. 2010.
- Explicit and Optimal Exact-Regenerating Codes for the Minimum-Bandwidth Point in Distributed Storage
K. V. Rashmi*, Nihar B. Shah*, P. Vijay Kumar, and Kannan Ramchandran
IEEE International Symposium on Information Theory (ISIT), Austin, Jun. 2010.
- Explicit Codes Minimizing Repair Bandwidth for Distributed Storage (the complete version on arXiv)
Nihar B. Shah*, K. V. Rashmi*, P. Vijay Kumar and Kannan Ramchandran
IEEE Information Theory Workshop (ITW), Cairo, Jan. 2010.
- Explicit Construction of Optimal Exact Regenerating Codes for Distributed Storage
K. V. Rashmi*, Nihar B. Shah*, P. Vijay Kumar and Kannan Ramchandran
Allerton Conference on Control, Computing and Communication, Urbana-Champaign, Sep. 2009.
- Regenerating Codes for Distributed Storage Networks (invited)
Nihar B. Shah*, K. V. Rashmi*, P. Vijay Kumar, and Kannan Ramchandran
International Workshop on the Arithmetic of Finite Fields (WAIFI), Istanbul, Jun. 2010.
- Network Coding
K. V. Rashmi*, Nihar B. Shah* and P. Vijay Kumar.
Resonance, vol. 15, no. 7, pp. 604-621, Jul. 2010.
(Resonance is a journal of science education published by the Indian Academy of Sciences)
- Distributed Storage System for Optimal Storage Space and Network Bandwidth Utilization and A Method Thereof
K. V. Rashmi*, Nihar B. Shah* and P. Vijay Kumar
US Patent, Nov 2011.
GROUP
PHD STUDENTS
Jingyan Wang, Robotics Institute, CMU
Ivan Stelmakh, Machine Learning Department, CMU (advised jointly with Aarti Singh)
CURRICULUM VITAE
EDUCATION
- UC Berkeley
PhD in Electrical Engineering and Computer Sciences
Advisors: Prof. Martin J. Wainwright and Prof. Kannan Ramchandran
Other members of thesis committee: Prof. Christos Papadimitriou and Prof. Tom Griffiths
- Indian Institute of Science (IISc), Bangalore
M.E. in Telecommunication
Thesis: Minimizing Repair Bandwidth in Distributed Storage Systems
Advisor: Prof. P. Vijay Kumar
- National Institute of Technology Karnataka, Surathkal
B. Tech. in Electronics and Communication
PUBLICATIONS
- Please visit the publications page.
HONORS
- David J. Sakrison Memorial Prize for a "truly outstanding piece of research" at EECS, UC Berkeley, May 2017
- Outstanding GSI (graduate student instructor) award at UC Berkeley, 2015-16
- Microsoft Research PhD Fellowship, 2014-2016.
- IEEE Data Storage Best Paper and Best Student Paper awards for years 2011 & 2012
- Second place in the first ACM University Student Research Competition, 2013.
- Berkeley Fellowship, 2011-13 (the most prestigious fellowship for incoming graduate students at UC Berkeley).
- Excellence Award for the academic year 2011-2012 at UC Berkeley.
- Prof. SVC Aiya Medal for the best master-of-engineering student in the ECE department at IISc, 2010.
TECHNICAL SERVICES
- Program committee member, HCOMP 2018.
- Co-chair, the NIPS 2014 workshop on Crowdsourcing and Machine-learning and the ICML 2014 workshop on Crowdsourcing and Human-computation.
- Reviewer, Journal of Machine Learning Research, Journal of Artificial Intelligence Research, Proceedings of the IEEE, IEEE Transactions on Information Theory, IEEE Transactions on Signal Processing, IEEE Transactions on Communications, IEEE Transactions on Parallel and Distributed Systems, IEEE Communications Letters, IEEE Journal on Selected Areas in Communications, ACM Transactions on Intelligent Systems, NIPS, ISIT, DISC, Netcod, Infocom, Globecom, ITW.
WORK EXPERIENCE
- Intern, Microsoft Research Redmond, May 2013 to August 2013 and May 2014 to August 2014
  - Crowdsourcing algorithms.
- Project Associate, IISc-Infosys collaborative project, Bangalore, July 2010 to June 2011
  - Algorithms for robust and efficient media content distribution networks.
- Member of Technical Staff, Adobe Systems, Bangalore, July 2007 to July 2008.
  - Worked on Adobe Captivate, an automated e-learning authoring tool.