A New Approximate Inference Algorithm for the Correlated Topic Model
Amr Ahmed
Topic models have recently been used extensively to manage large collections of
unstructured data (such as documents) by providing a low-dimensional representation
that captures the latent semantics of the collection. This low-dimensional
representation can then be used for tasks such as classification and clustering, or
simply as a tool for browsing an otherwise unstructured collection in a structured way.
Topic models view each document as a mixture of latent topics, where the mixture
proportions are the parameters of a document-specific multinomial distribution and
are drawn from a specified prior. Earlier approaches, such as Latent Dirichlet
Allocation (LDA) [1], used the Dirichlet distribution as a conjugate prior. While
the Dirichlet can capture variations in each topic's intensity independently, it
cannot capture the intuition that some topics are highly correlated and can rise
in intensity together. To accommodate this, Blei and Lafferty [2] introduced the
Correlated Topic Model (CTM), which uses the Logistic Normal distribution as the
prior instead of the Dirichlet. Unfortunately, the Logistic Normal is not
conjugate to the multinomial, and hence approximate posterior inference becomes
harder than in the LDA case. To deal with this non-conjugacy, the authors of [2]
use numerical techniques to solve the variational inference fixed-point
equations. In this work we take a different approach: we use a truncated Taylor
approximation that makes the variational inference fixed-point equations
amenable to an analytical closed-form solution [3, 4]. We compared the two
algorithms in terms of inference and learning on simulated data, and the
results were promising. We are currently comparing them on real data,
using the proceedings of the Neural Information Processing Systems (NIPS)
conference from 1988 to 2003.
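To make the non-conjugacy concrete: under the Logistic Normal prior, topic proportions are obtained by applying a softmax to a Gaussian vector eta, and the log-normalizer log(sum_k exp(eta_k)) has no closed-form expectation under a Gaussian. The sketch below is an illustration only and should not be read as the algorithm of [3, 4]. It assumes that the truncated Taylor approximation is a second-order expansion of this log-normalizer, which the abstract does not state. The function names `log_sum_exp`, `softmax`, and `quadratic_lse`, and all numeric values, are hypothetical.

```python
import math


def log_sum_exp(eta):
    """Log-normalizer of the softmax map: log(sum_k exp(eta_k))."""
    m = max(eta)  # subtract the max for numerical stability
    return m + math.log(sum(math.exp(e - m) for e in eta))


def softmax(eta):
    """Topic proportions implied by the Gaussian vector eta."""
    lse = log_sum_exp(eta)
    return [math.exp(e - lse) for e in eta]


def quadratic_lse(eta, eta0):
    """Second-order Taylor expansion of log_sum_exp around eta0.

    The gradient at eta0 is p = softmax(eta0) and the Hessian is
    diag(p) - p p^T. (Assumed form of the truncation, for illustration.)
    """
    p = softmax(eta0)
    d = [e - e0 for e, e0 in zip(eta, eta0)]
    first = sum(pk * dk for pk, dk in zip(p, d))
    # d^T (diag(p) - p p^T) d = sum_k p_k d_k^2 - (sum_k p_k d_k)^2
    second = sum(pk * dk * dk for pk, dk in zip(p, d)) - first ** 2
    return log_sum_exp(eta0) + first + 0.5 * second


eta0 = [0.5, -0.2, 0.1]  # hypothetical expansion point
eta = [0.6, -0.1, 0.0]   # nearby point at which to compare
exact = log_sum_exp(eta)
approx = quadratic_lse(eta, eta0)
print(f"exact={exact:.6f} approx={approx:.6f} error={abs(exact - approx):.2e}")
```

Because the approximation is quadratic in eta, its expectation under a Gaussian is available in closed form, which is the kind of property that allows fixed-point updates to be written analytically rather than solved numerically.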
[1] D. Blei, A. Ng, and M. Jordan. Latent Dirichlet allocation. Journal of
Machine Learning Research, 3:993-1022, January 2003.
[2] D. Blei and J. Lafferty. Correlated topic models. In Advances in Neural
Information Processing Systems 18, 2006.
[3] E. P. Xing. On topic evolution. CMU-CALD Technical Report 05-115.
[4] A. Ahmed and E. P. Xing. A new approximate inference algorithm for the
correlated topic model. In preparation.