| Office: Gates Hillman Center (GHC) 8010 |
| Phone: 412-268-2627 |
| jkbradle (yes, without the y) at cs dot cmu dot edu |
I just finished the Ph.D. program in the Machine Learning Department at Carnegie Mellon University. My advisor was Carlos Guestrin, of the Select Lab. I am now doing a short-term postdoc with Carlos at the University of Washington.
I am interested in large-scale machine learning. My current focus is on decomposing learning problems into smaller, simpler subproblems. Such decompositions can permit trade-offs between sample complexity, computational complexity, and potential for parallelization, and we can often optimize these trade-offs in model- or data-specific ways. My approach combines theory and application, focusing on methods which have strong theoretical guarantees and are competitive in practice.
My thesis is on tractable learning methods for large-scale Conditional Random Fields (CRFs) (Lafferty et al., 2001). CRFs are Probabilistic Graphical Models of conditional distributions P(Y|X), where Y and X are sets of random variables. My thesis has three parts: CRF parameter learning, CRF structure learning, and parallel learning for CRFs.
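For reference, the standard CRF factorization (following Lafferty et al., 2001) writes the conditional distribution as a normalized product of potentials over the cliques of a graph; in generic notation:

    P(Y \mid X) = \frac{1}{Z(X)} \prod_{c \in C} \phi_c(Y_c, X),
    \qquad
    Z(X) = \sum_{Y'} \prod_{c \in C} \phi_c(Y'_c, X)

Here C is the set of cliques, the \phi_c are non-negative potential functions, and Z(X) is the input-dependent partition function; the sum over all assignments to Y is what makes exact inference, and hence maximum-likelihood learning, intractable for general structures.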
I am researching tractable methods for learning the parameters of CRFs with arbitrary structures. We use decomposable learning methods (composite likelihood) which avoid intractable inference during learning, yet come with strong theoretical guarantees for finite sample sizes.
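As a rough sketch of the idea (generic notation, not necessarily the exact estimator in my papers): composite likelihood replaces the full conditional log-likelihood with a sum of low-dimensional conditional terms, each of which is tractable to compute and differentiate:

    \hat{\theta} = \arg\max_\theta \sum_{i=1}^{n} \sum_{A \in \mathcal{A}}
        \log P_\theta\big( Y^{(i)}_A \mid Y^{(i)}_{\setminus A}, X^{(i)} \big)

where \mathcal{A} is a collection of small subsets of the output variables; choosing \mathcal{A} to be all singletons recovers pseudolikelihood. Each term conditions on the remaining outputs, so the global partition function Z(X) never needs to be computed.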
I am also researching how to learn tractable (low-treewidth) structures for CRFs. Little work has been done on CRF structure learning so far, but our techniques permit efficient learning of tree structures; they recover ground-truth models from synthetic data and perform well on an fMRI application.
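The tree-learning idea can be sketched in a few lines: score every candidate edge between output variables and take a maximum-weight spanning tree. The sketch below is a generic Chow-Liu-style illustration, not the algorithm from the ICML 2010 paper; edge_score is a hypothetical callback (e.g., an estimate of how useful the pair (Y_i, Y_j) is given X).

    # Hedged sketch: Chow-Liu-style tree selection over output variables.
    # `edge_score(i, j)` is a hypothetical scoring function supplied by the
    # caller (e.g., an estimated conditional mutual information); the actual
    # CRF method uses conditional scores and learned potentials.
    import itertools
    import networkx as nx  # assumed available; any max-spanning-tree routine works

    def learn_tree_structure(n_vars, edge_score):
        """Return the edge list of a maximum-weight spanning tree over
        output variables Y_0, ..., Y_{n_vars-1}."""
        g = nx.Graph()
        g.add_nodes_from(range(n_vars))
        for i, j in itertools.combinations(range(n_vars), 2):
            g.add_edge(i, j, weight=edge_score(i, j))
        tree = nx.maximum_spanning_tree(g, weight="weight")
        return sorted(tree.edges())

Because the learned structure is a tree (treewidth 1), exact inference in the resulting CRF remains linear in the number of output variables.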
My research on CRF parameter and structure learning uses methods which break large problems down into smaller regression problems. I am researching parallel methods for sparse regression which take advantage of sparsity to improve parallel performance. Our ICML paper below uses statistical properties of data to permit parallel optimization for multicore computing, and my current research is examining methods for distributed computing which communicate sparse information.
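To illustrate the flavor of the multicore work, here is a toy sketch in the spirit of the ICML 2011 parallel coordinate descent paper, not its implementation: each round picks a small batch of coordinates of an L1-regularized least-squares problem and updates them from the same residual, as concurrent threads would.

    # Toy sketch of parallel coordinate descent for the Lasso:
    #   min_x 0.5 * ||A x - y||^2 + lam * ||x||_1
    # Batch updates are computed from one shared residual, mimicking
    # concurrent coordinate updates; a real multicore implementation
    # would run them on separate threads.
    import numpy as np

    def soft_threshold(z, t):
        return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

    def parallel_cd_lasso(A, y, lam, n_iters=100, batch_size=4, seed=0):
        rng = np.random.default_rng(seed)
        n, d = A.shape
        col_sq = (A ** 2).sum(axis=0)          # per-coordinate curvature
        x = np.zeros(d)
        resid = y - A @ x
        for _ in range(n_iters):
            batch = rng.choice(d, size=min(batch_size, d), replace=False)
            deltas = {}
            for j in batch:                    # "parallel" proposals from one residual
                rho = A[:, j] @ resid + col_sq[j] * x[j]
                deltas[j] = soft_threshold(rho, lam) / col_sq[j] - x[j]
            for j, dxj in deltas.items():      # apply updates, refresh residual
                x[j] += dxj
                resid -= A[:, j] * dxj
            # (The paper's theory bounds how large the batch can be before
            #  correlated features make such concurrent updates interfere.)
        return x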
Before coming to grad school, I was an undergraduate at Princeton University, where I received a B.S.E. in Computer Science. At Princeton, my main research was with Robert E. Schapire. We researched boosting in the filtering framework, where the learner does not use a fixed training set but rather has access to an example oracle which can produce an unlimited number of examples from the target distribution. This setting is useful for modeling learning with datasets too large to fit on a single computer, learning in memory-limited situations, or learning from an online source of examples (e.g., from a web crawler).
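For intuition about the filtering framework, here is a minimal illustration (not FilterBoost itself): the booster never stores a training set; it draws examples from the oracle and accepts each one with a probability that grows as the current ensemble's confidence on it shrinks. The names example_oracle and ensemble_margin are hypothetical placeholders.

    # Minimal illustration of a boosting "filter" over an example oracle.
    # `example_oracle()` yields (x, y) pairs with y in {-1, +1};
    # `ensemble_margin(x)` is the current ensemble's real-valued score.
    # Both names are hypothetical placeholders for this sketch.
    import math
    import random

    def draw_filtered_example(example_oracle, ensemble_margin):
        """Rejection-sample one example, keeping hard examples more often."""
        while True:
            x, y = example_oracle()
            margin = y * ensemble_margin(x)               # > 0 when the ensemble is correct
            accept_prob = 1.0 / (1.0 + math.exp(margin))  # logistic weighting
            if random.random() < accept_prob:
                return x, y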
Joseph K. Bradley.
Learning Large-Scale Conditional Random Fields.
Ph.D. Thesis, Machine Learning Department, Carnegie Mellon University, 2013.
Thesis (PDF)
Defense Slides (PPT)
Joseph K. Bradley and Carlos Guestrin.
Sample Complexity of Composite Likelihood.
In the 15th International Conference on Artificial Intelligence and Statistics (AISTATS), 2012.
Paper (PDF)
Poster (PPT)
Video of CMU Machine Learning Lunch talk (Vimeo)
Slides from CMU Machine Learning Lunch talk (PPT)
Joseph K. Bradley, Aapo Kyrola, Danny Bickson, and Carlos Guestrin.
Parallel Coordinate Descent for L1-Regularized Loss Minimization.
In the 28th International Conference on Machine Learning (ICML), 2011.
Paper (PDF)
Talk slides (PPT)
Project page (with code, data, and supplementary material)
TechTalks.tv video of ICML talk
Joseph K. Bradley and Carlos Guestrin.
Learning Tree Conditional Random Fields.
In the 27th International Conference on Machine Learning (ICML), 2010.
Paper (PDF)
Talk slides (PPT)
Code available for download (Note: This code is part of a larger lab codebase which we are preparing to release. The new release features many improvements but will not be completely compatible with this previous release.)
Joseph K. Bradley and Robert E. Schapire.
FilterBoost: Regression and Classification on Large Datasets.
In Advances in Neural Information Processing Systems 20 (NIPS), 2008.
Paper (PDF) with Appendix
Slides (PPT) from oral at NIPS
I do competitive Latin, Standard and Smooth ballroom dancing.
It's awesome. You should do it too.
(Check out CMU's Ballroom Dance Club!)
From our Christmas 2011 piece.
I like traveling.