It is now easy to obtain genomic sequence data at massive scale, whether produced by an individual in their own lab or by the consortium projects that are now sequencing hundreds of thousands of individual genomes. However, analyzing these data remains difficult for typical researchers because of challenges that include the need to move large amounts of data, the need for substantial compute infrastructure, the need to provide security and privacy, and a lack of specialized computational training. Here I will discuss our efforts to address these challenges through two projects: Galaxy and the AnVIL.
James Taylor is the Ralph S. O’Connor Associate Professor of Biology and an associate professor of computer science at Johns Hopkins University. Until 2014, he was an associate professor in the Department of Biology and the Department of Mathematics and Computer Science at Emory University. He is one of the original developers of the Galaxy platform for data analysis, and his group continues to extend it. His group also studies the genomic and epigenomic regulation of gene transcription through integrated analysis of functional genomic data. James received a Ph.D. in computer science from Penn State University, where he was involved in several vertebrate genome projects and the ENCODE project.