Speaker:
Andrew McCallum
Associate Professor
University of Massachusetts Amherst
Computer Science Department

Title: Latent Variable Models of Social Networks and Text

Abstract: Generative topic models, such as Latent Dirichlet Allocation and its progeny, are increasingly popular tools for summarization and knowledge discovery in text and other discrete data. This talk will present several generative topic models that combine unstructured text with structured data, such as links, relations, time-stamps, and word n-grams. I will demonstrate these methods' capabilities in enabling role and group discovery in social network data, and in enabling new bibliometric impact measures mined from the "citation social network" in over 1 million research papers gathered by our new web portal, Rexa.info. Finally, I will briefly introduce very recent work on Multi-Conditional Mixtures, alternative topic models that have some similarities to conditional random fields. Joint work with colleagues at UMass: Xuerui Wang, Natasha Mohanty, Andres Corada, Chris Pal, Wei Li, David Mimno and Gideon Mann.

Bio: Andrew McCallum is an Associate Professor at the University of Massachusetts Amherst. He was previously Vice President of Research and Development at WhizBang Labs, a company that used machine learning for information extraction from the Web. In the late 1990s, he was a Research Scientist and Coordinator at Justsystem Pittsburgh Research Center, where he led the creation of CORA, an early research paper search engine that used machine learning for spidering, extraction, classification and citation analysis. After receiving his PhD from the University of Rochester in 1995, he was a post-doctoral fellow at Carnegie Mellon University. He is an action editor for the Journal of Machine Learning Research.
For the past ten years, McCallum has been active in research on statistical machine learning applied to text, especially information extraction, document classification, clustering, finite state models, semi-supervised learning, and social network analysis.