MEGRASP (A syntactic parser for CHILDES transcripts)
Kenji Sagae, Institute for Creative Technologies, University of Southern California
MEGRASP is a dependency parser for identification of grammatical relations in child language transcripts in the CHILDES Database.
MEGRASP requires input files in CHAT format (PDF link), with part-of-speech tags produced by POST (or manually assigned). If this does not sound familiar, consult the main CHILDES web site.
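Because the parser depends on part-of-speech tags being present, it can help to check a transcript before running it. The following is a minimal sketch, not part of MEGRASP: it assumes the usual CHAT layout in which each utterance is a main tier line starting with `*` and its part-of-speech tags sit on a following `%mor:` dependent tier. The function name `utterances_missing_mor` and the sample transcript are hypothetical, and the tags in the sample are illustrative only, not necessarily the tag set the parser models expect.

```python
def utterances_missing_mor(chat_lines):
    """Return main-tier lines (utterances) that have no %mor tier after them."""
    missing = []
    current = None   # main-tier line currently being examined
    has_mor = False  # whether a %mor tier has been seen for `current`
    for raw in chat_lines:
        line = raw.rstrip("\n")
        if line.startswith("*"):
            # A new utterance begins: record the previous one if it lacked %mor.
            if current is not None and not has_mor:
                missing.append(current)
            current, has_mor = line, False
        elif line.startswith("%mor:"):
            has_mor = True
        # Header lines (@...) and other dependent tiers are ignored.
    if current is not None and not has_mor:
        missing.append(current)
    return missing


# Hypothetical transcript fragment; the second utterance has no %mor tier.
sample = [
    "@Begin",
    "*CHI:\tmore cookie .",
    "%mor:\tqn|more n|cookie .",
    "*MOT:\tyou want more ?",
    "@End",
]
print(utterances_missing_mor(sample))
```

Any utterance the function returns would need to be tagged (by POST or by hand) before the file is given to the parser.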
Download
Warning: the part-of-speech tags used in the CHILDES database have changed since the parser was released! This means that MEGRASP will NOT WORK properly until I update the parser models. If you have the latest version of CLAN and the English lexicon for MOR, the parser will produce garbage.
MEGRASP v0.7 (released June 15, 2007): Cygwin · Linux · Mac OSX
Updated source code coming soon (if you want source code now, you can get the code for my CoNLL-style dependency parser).
After downloading and unzipping the archive for your platform, please look at the README file (README.txt in the Windows distribution) for instructions on running the parser.
Please feel free to contact me at sagae+megrasp@cs.cmu.edu with questions, comments, requests and bug reports.
For more information about MEGRASP, see the following paper (please cite it in work based on MEGRASP output).
Sagae, K., Davis, E., Lavie, A., MacWhinney, B. and Wintner, S. 2007. High-accuracy annotation and parsing of CHILDES transcripts. Proceedings of the ACL-2007 Workshop on Cognitive Aspects of Computational Language Acquisition. Prague, Czech Republic.
A (slightly out-of-date) description of the grammatical relations used by MEGRASP is available here.