GiRaF: Robust, Computational Identification of Influenza Reassortments via Graph Mining

Niranjan Nagarajan (1), Carl Kingsford (2)
(1) Genome Institute of Singapore, (2) University of Maryland, College Park


GiRaF is a computational tool for identifying reassortments in influenza viruses from sequence databases of isolates. Reassortment, a process in which influenza strains exchange gene segments, has been implicated in two of the three pandemics of the 20th century as well as in the 2009 H1N1 outbreak. GiRaF robustly identifies reassortments in a fully automated fashion while accounting for uncertainty in the inferred phylogenies. It relies on a fast consensus-search algorithm to confidently identify incompatible gene-segment phylogenies, which serve as signatures of reassortment. In experiments with synthetic datasets, GiRaF demonstrates high precision and sensitivity as well as robustness to complex reassortment histories. On human, avian and swine influenza datasets, GiRaF correctly identifies known reassortments as well as novel events and can automatically catalog reassortment architectures based on all pairwise comparisons between gene segments.
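To make the notion of "incompatible phylogenies" concrete, the sketch below checks the textbook compatibility condition for two splits (bipartitions of the taxa induced by tree edges): two splits can coexist in one tree only if at least one of the four pairwise intersections of their sides is empty. This is an illustration of the general concept only, not GiRaF's consensus-search algorithm; the function name and the strain labels are hypothetical.

```python
def splits_compatible(side_a, side_b, taxa):
    """Return True if the splits side_a|rest and side_b|rest can share a tree.

    Textbook condition: at least one of the four intersections between
    the sides of the two splits must be empty.
    """
    a, b, all_taxa = frozenset(side_a), frozenset(side_b), frozenset(taxa)
    a_rest, b_rest = all_taxa - a, all_taxa - b
    return any(not (x & y) for x in (a, a_rest) for y in (b, b_rest))


# Hypothetical strains: one segment's tree groups s1 with s2,
# while another segment's tree groups s1 with s3.
taxa = {"s1", "s2", "s3", "s4"}
conflict = splits_compatible({"s1", "s2"}, {"s1", "s3"}, taxa)
agreement = splits_compatible({"s1", "s2"}, {"s3", "s4"}, taxa)
print(conflict, agreement)
```

A conflicting pair of well-supported splits from two different segments is the kind of signal that suggests the segments have different evolutionary histories; how GiRaF weighs such signals across sampled trees is described in the publication and README, not here.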

If you use GiRaF, please cite:


Source code and executables for GiRaF are freely available via the links below. Unless you have a reason to do otherwise, use the latest version. The README.txt file in the distribution contains additional instructions on how to run GiRaF.

Older Versions: Results obtained with version 1.0 may differ slightly from those obtained with version 0.9. To make them as similar as possible, use the command-line option --version-0.9-compat. Even with this option, small differences between the two versions can remain.

GiRaF is a command-line program that reads NEXUS files (.nex) containing phylogenetic trees. It does not build these trees itself, so a separate package is needed to generate the tree collections. One package that has been tested with GiRaF is MrBayes.
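Before handing a tree file to GiRaF, it can be useful to confirm that the file is NEXUS-formatted and actually contains a collection of trees. The following is a minimal sketch of such a check, assuming trees appear as `tree NAME = ...;` statements, one per line, as in typical NEXUS tree files. The helper names and the sample text are hypothetical and are not part of GiRaF or MrBayes.

```python
import re

# Matches the start of a NEXUS tree statement such as "tree rep.1 = (...);"
TREE_STATEMENT = re.compile(r"^\s*tree\s+\S+\s*=", re.IGNORECASE | re.MULTILINE)


def looks_like_nexus(text):
    """Return True if the text begins with the #NEXUS header."""
    return text.lstrip().upper().startswith("#NEXUS")


def count_nexus_trees(text):
    """Count 'tree NAME = ...' statements in NEXUS-formatted text."""
    return len(TREE_STATEMENT.findall(text))


# Hypothetical two-tree sample for illustration only.
SAMPLE = """#NEXUS
begin trees;
   tree rep.1 = ((A,B),(C,D));
   tree rep.2 = ((A,C),(B,D));
end;
"""

print(looks_like_nexus(SAMPLE), count_nexus_trees(SAMPLE))
```

To check a real file, read it with `open(path).read()` and pass the contents to the same two functions. A count of zero or one suggests the file does not hold the collection of sampled trees that GiRaF expects.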


Input sequences and corresponding results for the real and synthetic influenza datasets studied using GiRaF can be found here: FTP directory. The result files in this directory were generated with version 0.9 of GiRaF.


For questions and comments write to niranjan at


This work was supported by the National Science Foundation [EF-0849899 and IIS-0812111] and the National Institutes of Health [1R21AI085376].

Last modified: November 7, 2012.