Wei Chen    

I am a graduate student at the Language Technologies Institute, Carnegie Mellon University. My research advisor is Jack Mostow. I am working on the design and implementation of self-questioning instruction in our intelligent tutoring system. I used to work on "Mental States Representation and Reasoning" with Scott Fahlman. When I'm bored, I play with statistical natural language processing, especially Machine Translation. However, I believe that real "translation" requires a sufficient understanding of the "meaning" of language.

My CV is available here.

 

Education

M. S. E. in Computer Science, Johns Hopkins University, USA, 2005-2007

B. S. in Computer Science, Peking University, China, 2001-2005

 

 

Publications

 

Wei Chen, Gregory Aist and Jack Mostow, "Generating Questions Automatically from Informational Text". In Proceedings of The 2nd Workshop on Question Generation. Brighton, UK. (8 pages)

 

Jack Mostow and Wei Chen, "Generating Instruction Automatically for the Reading Strategy of Self-Questioning". In Proceedings of AIED2009. Brighton, UK. (8 pages)

 

Wei Chen, "Understanding Mental States in Natural Language". In Proceedings of IWCS-8. Tilburg, The Netherlands. (12 pages)   

 

Wei Chen and Scott E. Fahlman, "Modelling Mental States and Their Interactions". AAAI 2008 Fall Symposium on Biologically Inspired Cognitive Architectures. Arlington, VA. (6 pages)   

 

Wei Chen, "Dimensions of Subjectivity in Natural Language" (Short Paper). In Proceedings of ACL-HLT'08. Columbus, Ohio. (4 pages)

 

 

Publications and Abstracts

 

 

Technical Reports

 

"Discriminative Word Alignment with Syntactic Features" (9 pages)

Machine Translation Lab Report. LTI, Carnegie Mellon University, May 2008. 

 

"Building Language Model on Continuous Space Using Gaussian Mixture Models" (66 pages)

Technical Report submitted to the Center for Language and Speech Processing, Johns Hopkins University, July 2007.

Abstract:

This work focuses on exploiting word distributions in a continuous space. We aim to find out how a continuous space performs in language modeling compared with its discrete counterpart. We also propose a new method for predicting infrequent words by borrowing information from frequent words that are close to them in our word vector space. We use bigram counts to initialize our word vectors. Matrix factorization methods, such as SVD and NMF, are used to reduce the size of the vectors. We then model the word vector distribution with a Gaussian Mixture Model (GMM), trained incrementally using the EM algorithm. The performance of various methods in our model training, such as MAP adaptation and the Tied-Mixture System, is also studied in detail. In this research, our NMF code is able to work on large sparse matrices of size 100,000 × 100,000.
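
As a rough illustration of the first two steps described in the abstract (initializing word vectors from bigram counts, then reducing them with SVD), here is a minimal sketch. It is not the code from the report: the toy corpus, the function names `bigram_count_matrix` and `reduce_with_svd`, and the choice of k = 2 are assumptions made only for this example, and the GMM/EM training step is left out.

```python
import numpy as np

def bigram_count_matrix(tokens):
    """Return (vocab, counts); counts[i, j] is how often vocab[j] follows vocab[i]."""
    vocab = sorted(set(tokens))
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[index[prev], index[nxt]] += 1
    return vocab, counts

def reduce_with_svd(counts, k):
    """Project each row of the count matrix onto its top-k singular directions."""
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    return u[:, :k] * s[:k]

# Toy corpus (hypothetical); a real setting would use a large, sparse matrix.
tokens = "the cat sat on the mat the dog sat on the rug".split()
vocab, counts = bigram_count_matrix(tokens)
vectors = reduce_with_svd(counts, k=2)  # one 2-dimensional vector per word
```

Each row of `vectors` is a low-dimensional representation of one vocabulary word; in a pipeline like the one in the abstract, such vectors would then be modeled with a mixture of Gaussians.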

 

 

Survey Papers

 

Vamshi Ambati and Wei Chen, "Cross Lingual Syntax Projection for Resource-Poor Languages". 2007. (20 pages)

 

 

Downloads


Non-negative Matrix Factorization for Large Sparse Matrices

 

 

 

Professional Activities

 

Reviewer, AAAI FSS-08 BICA


Program Committee Member, AAAI FSS-09 BICA

 


More about Me