Office: CBA 6.484, 2110 Speedway Stop B6500, Austin, TX 78712
deepay <at> utexas <dot> edu
deepay <at> cs <dot> cmu <dot> edu
I work on a broad range of problems in Machine Learning and Data Mining, particularly focusing on:
How can we infer user profiles given only partially completed profiles of a few users in a large social network? Instead of a blanket assumption of homophily, we show that a more nuanced notion of partial homophily achieves much greater accuracy, and inference under this model can scale to the billion-node Facebook network.
How can social networks be used to learn users' responses to recommended items? While "friends are similar" is a worthy guiding principle, we show that online learning with the Gaussian Random Field model typically used in such settings runs into severe problems in real-world networks where degrees can be high. We investigate the reasons behind this phenomenon, analyzing the model in a Bayesian setting, and propose new models that are provably better, while also admitting scalable inference.
Why should a simple heuristic such as counting the number of common neighbors work so well in link prediction tasks? Why is the Adamic-Adar measure better? We formulate this as a question of measuring distances in a latent space and prove that, under fairly general conditions, such common heuristics are correct and optimal. Interestingly, we also show why they sometimes deviate from optimality and yet do not suffer in practice.
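For readers unfamiliar with the two link prediction heuristics mentioned above, here is a minimal sketch of their standard textbook definitions. It is not the latent-space analysis itself; the toy graph, the `adj` dictionary, and the function names are hypothetical and chosen only for illustration.

```python
import math

def common_neighbors(adj, u, v):
    """Score a candidate link (u, v) by the number of shared neighbors."""
    return len(adj[u] & adj[v])

def adamic_adar(adj, u, v):
    """Score a candidate link (u, v) by summing 1 / log(degree) over shared neighbors.

    Low-degree shared neighbors contribute more than high-degree ones.
    The degree check guards against log(1) = 0; in an undirected graph a
    shared neighbor of two distinct nodes always has degree at least 2.
    """
    return sum(
        1.0 / math.log(len(adj[z]))
        for z in adj[u] & adj[v]
        if len(adj[z]) > 1
    )

# Hypothetical undirected toy graph, stored as node -> set of neighbors.
adj = {
    "a": {"c", "d"},
    "b": {"c", "d"},
    "c": {"a", "b"},
    "d": {"a", "b", "e", "f"},
    "e": {"d"},
    "f": {"d"},
}

# Nodes a and b share two neighbors: c (degree 2) and d (degree 4).
cn_score = common_neighbors(adj, "a", "b")
aa_score = adamic_adar(adj, "a", "b")
print(cn_score, aa_score)
```

In this toy graph both shared neighbors count equally under common neighbors, whereas Adamic-Adar gives the low-degree node c twice the weight of the higher-degree node d (1/log 2 versus 1/log 4).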