Andrew Carlson

Update: The New York Times has published an article on the Never-Ending Language Learner (NELL) project.

Update: I have graduated and am now working at Google Pittsburgh.

From 2005 through 2010, I was a Ph.D. student at Carnegie Mellon University in the Machine Learning Department, within the School of Computer Science. I was advised by Tom Mitchell. My research focused on mining the Web for facts and information. I was particularly interested in building large-scale, minimally supervised information extraction systems that can build large repositories of facts from web text. This work contributed to the Never-Ending Language Learner (NELL) system, part of the Read the Web research project.

I defended a thesis entitled "Coupled Semi-Supervised Learning" in April 2010 and graduated in May 2010. The latest published results from this work are in the paper "Toward an Architecture for Never-Ending Language Learning," which was presented at AAAI 2010 in July 2010. That paper reports on the first several months of running the NELL system. See also the paper "Coupled Semi-Supervised Learning for Information Extraction," which was presented at WSDM 2010 in February 2010.

I also contributed to Professor Mitchell's work on machine learning and fMRI brain imaging data. The results of that work can be found in the Science paper "Predicting Human Brain Activity Associated with the Meanings of Nouns." This work was also featured on "60 Minutes" [Video and Transcript].

I am thankful for support from Yahoo! for my research in 2007-2009 through the PhD Student Fellowship Program. In the summer of 2008, I was an intern with Yahoo!'s Data Mining and Research group, working with Scott Gaffney and Flavian Vasile. Yahoo! also helped greatly by providing CMU with access to its M45 computing cluster. The cluster has enabled web-scale research outside of corporations, and this line of research would have been impossible to pursue without it. I wrote a blog post with Justin Betteridge on Yahoo!'s Hadoop blog describing one way in which we used M45 and Hadoop in our research.

I spent the summer of 2007 on an internship at Google in Pittsburgh working with Charles Schafer. A publication from that work was presented at ECML/PKDD 2008.

In the summer of 2006, I coordinated a reading group on semi-supervised natural language learning research.

Previously, I attended the University of Illinois at Urbana-Champaign, where I received a B.S. in Computer Science. I researched machine learning applied to natural language under Professor Dan Roth, as part of the Cognitive Computation Group. I also released and maintained the SNoW software package.

Press Coverage


Contact Information

Email:   acarlson AT