The corpus is described in:

TriS: A Statistical Sentence Simplifier with Log-linear Models and Margin-based Discriminative Training
Nguyen Bach, Qin Gao, Stephan Vogel and Alex Waibel
In Proceedings of the 5th International Joint Conference on Natural Language Processing (IJCNLP 2011), November 2011, Chiang Mai, Thailand.

Please cite the above paper in any publication that makes use of this corpus.

The *.ref files contain the factual-based sentence simplification references.
The *.in files contain the original sentences, together with their POS tags, syntactic trees, and dependency trees.


Nguyen Bach (nbach@cs.cmu.edu)