Global Voices Malagasy-English Parallel Corpus

This page provides a link to a corpus of parallel news articles in Malagasy and English from the Global Voices project. This corpus was collected and aligned at the sentence level by Victor Chahuneau.

Download - release 12.06

Full corpus




More data will be released periodically; roughly 100 articles are published on Global Voices every month.
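Since the corpus is aligned at the sentence level, a typical first step is to load it as a list of (Malagasy, English) sentence pairs. The sketch below is only an illustration: this page does not describe the file layout of the release, so the code assumes two UTF-8 text files with one sentence per line, where line i of one file is the translation of line i of the other. The function name `load_sentence_pairs` and the file names in the usage note are hypothetical.

```python
from typing import List, Tuple


def load_sentence_pairs(mg_path: str, en_path: str) -> List[Tuple[str, str]]:
    """Read two line-aligned files into a list of (Malagasy, English) pairs.

    Assumption (not confirmed by the release): one sentence per line,
    UTF-8 encoded, with line i of each file forming an aligned pair.
    """
    with open(mg_path, encoding="utf-8") as mg_file:
        mg_lines = [line.rstrip("\n") for line in mg_file]
    with open(en_path, encoding="utf-8") as en_file:
        en_lines = [line.rstrip("\n") for line in en_file]

    # A length mismatch means the files are not line-aligned as assumed.
    if len(mg_lines) != len(en_lines):
        raise ValueError(
            f"line count mismatch: {len(mg_lines)} vs {len(en_lines)}"
        )

    return [(mg.strip(), en.strip()) for mg, en in zip(mg_lines, en_lines)]
```

With hypothetical file names, usage would look like `pairs = load_sentence_pairs("corpus.mg", "corpus.en")`. If the release instead ships a single file with both languages on each line, the same idea applies with a split on the separator.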

Sentence aligner used: Gargantua (F. Braune & A. Fraser, "Improved unsupervised sentence alignment for symmetrical and asymmetrical parallel corpora", COLING 2010).


The original content was published under a Creative Commons Attribution-Only license.


This work was supported by the Army Research Office (grant number W911NF-10-1-0533).