Emerging Linked-Read technologies (aka Read-Cloud or barcoded short reads) have revived interest in short-read sequencing as a viable way to understand large-scale structure in genomes and metagenomes. Linked-Read technologies, such as the 10x Chromium system, use a microfluidic system and a specialized set of 3′ barcodes (aka UIDs) to tag short DNA reads sourced from the same long fragment of DNA; the tagged reads are then sequenced on standard short-read platforms. This approach involves interesting compromises. Each long fragment of DNA is only sparsely covered by reads, no information about the ordering of reads from the same fragment is preserved, and each 3′ barcode is shared by reads from roughly 2-20 long fragments of DNA. However, compared to long-read technologies, the cost per base is far lower, far less input DNA is required, and the base error rate is that of Illumina short reads. In this talk, we discuss novel algorithms and some of the advantages of Linked-Reads over standard short-read sequencing technologies, with applications to whole-genome re-sequencing and metagenomics.
Iman Hajirasouliha is an Assistant Professor of Computational Genomics at the Institute for Computational Biomedicine at Weill Cornell Medicine of Cornell University and a member of the Englander Institute for Precision Medicine and the Meyer Cancer Center, New York, USA. He completed a Postdoctoral Scholarship at the Computer Science Department, Stanford University, and a Simons Research Fellowship at the University of California, Berkeley. His research focuses on computational genomics and metagenomics, computational digital pathology, large-scale sequence analysis, and characterizing somatic variations and intra-tumor heterogeneity in cancer.
Faculty Host: Hosein Mohimani