Language Technologies Institute
Student Research Symposium 2006

A New Approximate Inference Algorithm for the Correlated Topic Model

Amr Ahmed

Recently, topic models have been used extensively to manage large collections of unstructured data (such as documents) by providing a low-dimensional representation that captures the latent semantics of the collection. This low-dimensional representation can then be used for tasks like classification and clustering, or simply as a tool for browsing an otherwise unstructured collection in a structured way.

Topic models view each document as a mixture of latent topics, where the mixture proportions form a document-specific multinomial distribution whose parameters are drawn from a prior. Earlier approaches, such as Latent Dirichlet Allocation (LDA) [1], used the Dirichlet distribution as a conjugate prior. While the Dirichlet can capture variations in each topic's intensity independently, it cannot capture the intuition that some topics are highly correlated and tend to rise in intensity together. To accommodate this, Blei and Lafferty [2] introduced the Correlated Topic Model (CTM), which uses the Logistic Normal distribution as the prior instead of the Dirichlet. Unfortunately, the Logistic Normal is not conjugate to the multinomial, so approximate posterior inference is harder than in the LDA case. To deal with this non-conjugacy, the authors of [2] use numerical techniques to solve the variational inference fixed-point equations. In this work we take another approach: a truncated Taylor approximation that makes the variational inference fixed-point equations amenable to an analytical closed-form solution [3, 4]. We compared the two algorithms in terms of inference and learning on simulated data, and the results were promising. We are currently comparing them on real data, using the proceedings of the Neural Information Processing Systems (NIPS) conference from 1988 to 2003.
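To make the contrast between the two priors concrete, the following is a minimal sketch in Python with NumPy. It is an illustration only and not the inference algorithm described in this work. It draws topic proportions from a Logistic Normal prior (a Gaussian vector mapped to the simplex by a softmax) and from a Dirichlet prior, then compares the correlation between two topics. The number of topics K, the mean mu, the covariance sigma, and the helper names are all assumptions chosen for the example.

```python
import numpy as np

def softmax(eta):
    """Map real-valued vectors (one per row) onto the probability simplex."""
    z = eta - eta.max(axis=-1, keepdims=True)  # subtract the row max for stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def sample_logistic_normal(mu, sigma, n_docs, rng):
    """Draw topic proportions theta = softmax(eta) with eta ~ N(mu, sigma)."""
    eta = rng.multivariate_normal(mu, sigma, size=n_docs)
    return softmax(eta)

rng = np.random.default_rng(0)
K = 3                 # hypothetical number of topics
n_docs = 5000         # hypothetical number of simulated documents
mu = np.zeros(K)
# Hypothetical covariance: topics 0 and 1 are strongly positively correlated.
sigma = np.array([[1.0, 0.9, 0.0],
                  [0.9, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])

theta_ln = sample_logistic_normal(mu, sigma, n_docs, rng)
theta_dir = rng.dirichlet(np.ones(K), size=n_docs)

# Empirical correlation between the proportions of topics 0 and 1.
corr_ln = np.corrcoef(theta_ln[:, 0], theta_ln[:, 1])[0, 1]
corr_dir = np.corrcoef(theta_dir[:, 0], theta_dir[:, 1])[0, 1]

print(f"Logistic Normal corr(topic 0, topic 1): {corr_ln:.2f}")
print(f"Dirichlet       corr(topic 0, topic 1): {corr_dir:.2f}")
```

Under these assumed settings the Logistic Normal draws show topics 0 and 1 rising together, whereas components of a Dirichlet draw are always negatively correlated, which is the limitation that motivates the CTM.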

[1] D. Blei, A. Ng, and M. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993-1022, January 2003.

[2] D. Blei and J. Lafferty. Correlated topic models. In Advances in Neural Information Processing Systems 18, 2006.

[3] E. P. Xing. On topic evolution. CMU-CALD Technical Report 05-115.

[4] A. Ahmed and E. P. Xing. A new approximate inference algorithm for the correlated topic model. In preparation.