# Learning Tree Structures for Conditional Random Fields (CRFs)

This page provides code from my project on learning tree structures for Conditional Random Fields (CRFs). This code is a somewhat improved version of the code used for this paper:
Joseph K. Bradley and Carlos Guestrin. Learning Tree Conditional Random Fields. International Conference on Machine Learning (ICML), 2010.

Note: This code is part of a larger SELECT Lab codebase which is available but still being improved for actual release. See my SILL Project Page for info on that codebase.

## Project Overview

Main goals:
• Learn structured models of conditional distributions P(Y|X).
• Learn tractable structures (trees).
• Make use of local inputs, i.e., local dependencies between subsets of Y variables and subsets of X variables.
• Use scalable methods.
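
As a rough sketch of what these goals amount to, a tree-structured CRF with local inputs can be written as below. The notation is assumed for illustration and is not taken from the code: T is the set of tree edges over the Y variables, X_ij is the small subset of X used as input for the edge (i, j), and psi_ij is a non-negative factor.

```latex
P(Y \mid X) = \frac{1}{Z(X)} \prod_{(i,j) \in T} \psi_{ij}(Y_i, Y_j, X_{ij}),
\qquad
Z(X) = \sum_{y} \prod_{(i,j) \in T} \psi_{ij}(y_i, y_j, X_{ij})
```

Because T is a tree, the partition function Z(X) can be computed exactly by message passing, which is what makes the structure tractable.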

Approach:
• Do variable selection beforehand; i.e., for each Yi, select a small set of variables in X which Yi directly depends on.
• For all pairs (Yi, Yj), compute an edge weight.
• Choose a maximum spanning tree over the Y variables with respect to these edge weights.
• Using this tree structure for the CRF, do parameter learning.
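
The spanning-tree step above can be sketched as follows. This is a minimal, standalone illustration and not the code from this project: the `Edge` struct, the `DisjointSets` class, and `max_spanning_tree` are hypothetical names, and the edge weights are assumed to have been computed already by whatever score is used for each pair (Yi, Yj). It runs Kruskal's algorithm in descending weight order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical edge record: i and j index the Y variables, and weight is
// the precomputed score for the pair (Yi, Yj).
struct Edge {
    std::size_t i;
    std::size_t j;
    double weight;
};

// Minimal union-find (disjoint sets) with path halving.
class DisjointSets {
public:
    explicit DisjointSets(std::size_t n) : parent_(n) {
        std::iota(parent_.begin(), parent_.end(), std::size_t{0});
    }

    std::size_t find(std::size_t x) {
        while (parent_[x] != x) {
            parent_[x] = parent_[parent_[x]];  // path halving
            x = parent_[x];
        }
        return x;
    }

    // Returns true if x and y were in different components (now merged).
    bool unite(std::size_t x, std::size_t y) {
        const std::size_t rx = find(x);
        const std::size_t ry = find(y);
        if (rx == ry) return false;
        parent_[rx] = ry;
        return true;
    }

private:
    std::vector<std::size_t> parent_;
};

// Kruskal's algorithm with edges visited from heaviest to lightest, which
// yields a maximum spanning tree (or forest, if the graph is disconnected).
std::vector<Edge> max_spanning_tree(std::size_t num_vars,
                                    std::vector<Edge> edges) {
    std::sort(edges.begin(), edges.end(),
              [](const Edge& a, const Edge& b) { return a.weight > b.weight; });

    DisjointSets sets(num_vars);
    std::vector<Edge> tree;
    for (const Edge& e : edges) {
        if (tree.size() + 1 == num_vars) break;  // tree is already complete
        assert(e.i < num_vars && e.j < num_vars);
        // Keep the edge only if it joins two separate components.
        if (sets.unite(e.i, e.j)) tree.push_back(e);
    }
    return tree;
}
```

Calling `max_spanning_tree(n, edges)` with one `Edge` per pair of Y variables returns the n - 1 edges of the tree, which would then fix the CRF structure for parameter learning.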

## CRF Learning Code

### Code Overview

Main parts of code:
• Factors: table factors, Gaussian factors, conditional versions for CRFs
• Models: decomposable (junction tree), Bayes nets, CRFs, synthetic models
• Inference: exact for tractable models, sampling and belief propagation (BP) for intractable models
• Datasets: datasets, synthetic data
• Parameter learning: learning for factors, learning for models via gradient methods
• Structure learning: Chow-Liu for generative models, my maximum spanning tree (MST) based methods for CRFs
• Discriminative learning: regression, decision trees/stumps, boosting

Implementation details:
• The code is written in C++, with a few Matlab scripts to process and plot results.
• We use CMake to build our code.
• Our main dependencies are IT++ (matrix/vector library) and Boost (C++ libraries).
• The code is pretty well-documented, with Doxygen-generated HTML documentation.

### Installation and Getting Started

The code comes with detailed instructions on how to install the necessary dependencies and build it. Once you download the code, look at these files in its top-level directory:
• introduction, installation, getting started
• LICENSE.txt: licensing information
• AUTHORS.txt: list of contributors
• TREECRFS.txt: info on duplicating my experiments
• doc/html/index.html: Doxygen-generated documentation

### Licensing

This code is mostly released under the GNU General Public License (GPL). However, a few files are released under the GNU Lesser General Public License (LGPL). See the LICENSE.txt file for more details. The code is a subset of the SELECT Lab's larger codebase. We are planning to release the entire codebase under a more permissive license before long.

### If You Have Questions

If you have questions, get weird results, etc., please feel free to contact me; my email is listed at the top of my homepage. If you find bugs, definitely contact me! :)