Wei Chen
I am a graduate student at the Language Technologies Institute, Carnegie Mellon University. My research advisor is Scott Fahlman. I am currently working on "Mental States Representation and Reasoning", powered by Scone, our knowledge-base system. When I'm bored, I play with statistical natural language processing, especially machine translation. However, I believe that real "translation" requires a sufficient understanding of the "meaning".
My old CV is available here.
Education
M. S. E. in Computer Science
B. S. in Computer Science, Peking University, China, 2001-2005
Publications
Wei Chen and Scott E. Fahlman, "Modelling Mental States and Their Interactions". AAAI 2008 Fall Symposium on Biologically Inspired Cognitive Architectures. Arlington, VA.
Wei Chen, "Dimensions of Subjectivity in Natural Language" (Short Paper). In Proceedings of ACL-HLT'08. Columbus, Ohio.
Technical Reports
"Discriminative Word Alignment with Syntactic Features"
Machine Translation Lab Report. LTI, Carnegie Mellon University, May 2008.
"Building Language Model on Continuous Space Using Gaussian Mixture Models"
Technical Report submitted to Center for Language and Speech Processing, Johns Hopkins University, July 2007.
Abstract:
This work focuses on exploiting word distributions in a continuous space. We aim to find out how a continuous space performs in language modeling compared with its discrete counterpart. We also propose a new method for predicting infrequent words by borrowing information from frequent words that are close to them in our word vector space. We use bigram counts to initialize our word vectors. Matrix factorization methods, such as SVD and NMF, are used to reduce the dimensionality of the vectors. We then model the word vector distribution with a Gaussian Mixture Model (GMM), trained incrementally using the EM algorithm. We also study in detail the performance of various training methods in our model, such as MAP adaptation and tied-mixture systems. Our NMF code is able to work on large sparse matrices of size 100,000 × 100,000.
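To make the first two steps of the abstract concrete, here is a minimal sketch of building a bigram count matrix and reducing it with NMF. It is an illustration under assumptions, not the code from the report: the function names, the toy corpus, the rank, and the choice of Lee-Seung multiplicative updates for the squared-error objective are all hypothetical, and the dense pure-Python matrices would not scale to the sparse 100,000 × 100,000 case. The GMM and EM steps are not shown.

```python
import random

def bigram_counts(tokens):
    """Return (vocab, V), where V[i][j] counts how often word j follows word i."""
    vocab = sorted(set(tokens))
    index = {w: i for i, w in enumerate(vocab)}
    V = [[0.0] * len(vocab) for _ in vocab]
    for left, right in zip(tokens, tokens[1:]):
        V[index[left]][index[right]] += 1.0
    return vocab, V

def matmul(A, B):
    """Dense product of two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frobenius_error(V, W, H):
    """Squared Frobenius norm of V - W H."""
    WH = matmul(W, H)
    return sum((v - a) ** 2 for rv, ra in zip(V, WH) for v, a in zip(rv, ra))

def nmf(V, rank, iterations=200, seed=0, eps=1e-9):
    """Factor V ~ W H with non-negative W (n x rank) and H (rank x m).

    Assumption: multiplicative updates are used here for illustration only;
    the report does not say which NMF algorithm it uses.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H

# Toy usage: each row of W is a reduced (rank-2) vector for one word.
tokens = "the cat sat on the mat the dog sat on the rug".split()
vocab, V = bigram_counts(tokens)
W, H = nmf(V, rank=2)
for word, vec in zip(vocab, W):
    print(word, [round(x, 3) for x in vec])
```

In this sketch the rows of `W` play the role of the reduced word vectors; a mixture model could then be fit to those rows.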
Survey Papers
Vamshi Ambati and Wei Chen, "Cross Lingual Syntax Projection for Resource-Poor Languages". 2007.
Downloads
Non-negative Matrix Factorization for Large Sparse Matrices
Professional Activities
Reviewer for AAAI FSS-08 BICA