Computational Method Speeds Hunt for New Antibiotics

New Algorithm Reduces Search Time From Hundreds of Years to Mere Hours

Assistant Professor of Computational Biology Hosein Mohimani is part of a team of researchers that developed an algorithm to rapidly search massive databases for novel variants of known antibiotics — a potential boon in fighting antibiotic resistance.

A team of American and Russian computer scientists has developed an algorithm that can rapidly search massive databases to discover novel variants of known antibiotics — a potential boon in fighting antibiotic resistance.

In just a few hours, the algorithm, called VarQuest, identified 10 times more variants of peptidic natural products, or PNPs, than all previous PNP discovery efforts combined, the researchers report in the latest issue of the journal Nature Microbiology. Previously, such a search might have taken hundreds of years of computation, said Hosein Mohimani, assistant professor in Carnegie Mellon University's Computational Biology Department.

"Our results show that the antibiotics produced by microbes are much more diverse than had been assumed," Mohimani said. VarQuest found more than a thousand variants of known antibiotics, he noted, providing a big picture perspective that microbiologists could not obtain while studying one antibiotic at a time.

Mohimani and Pavel A. Pevzner, professor of computer science at the University of California, San Diego, designed and directed the effort, which included colleagues at St. Petersburg State University in Russia.

PNPs have an unparalleled track record in pharmacology. Many antimicrobial and anticancer agents are PNPs, including the so-called "antibiotics of last resort," vancomycin and daptomycin. As concerns mount regarding antibiotic drug resistance, finding more effective variants of known antibiotics is a means of preserving the clinical efficacy of antibiotic drugs in general.

The search for these novel variants received a boost in recent years with the advent of high-throughput methods that enable environmental samples to be processed in batches, rather than one at a time. Researchers also recently launched the Global Natural Products Social (GNPS) molecular network, a database of mass spectra of natural products collected by researchers worldwide. The GNPS, based at UC San Diego, already contains more than a billion mass spectra.

The GNPS represents a gold mine for drug discovery, Mohimani said. The VarQuest algorithm, which employs a smarter way of indexing the database to enhance searches, should help GNPS meet its promise, he added.
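
The article does not describe how VarQuest indexes the database, so the following Python sketch is not the published algorithm. It only illustrates the general idea of indexing a collection once so that each later search touches a small slice of it. Everything in it is an assumption made for illustration: the function names, the placeholder compound names, the made-up masses and the `max_shift` tolerance are all hypothetical. The sketch sorts known compounds by mass, then uses binary search to find those whose mass lies within a chosen shift of a spectrum's precursor mass, which is one simple way a variant carrying a single modification might be shortlisted.

```python
from bisect import bisect_left, bisect_right


def build_mass_index(compounds):
    """Sort compounds by mass once so later lookups can use binary search.

    `compounds` maps a (hypothetical) compound name to its mass.
    Returns two parallel lists: sorted masses and the matching names.
    """
    pairs = sorted((mass, name) for name, mass in compounds.items())
    masses = [mass for mass, _ in pairs]
    names = [name for _, name in pairs]
    return masses, names


def candidate_variants(index, precursor_mass, max_shift):
    """Return (name, mass_difference) for compounds within max_shift of the query.

    Only the slice of the index between the two binary-search bounds is
    visited, instead of comparing the query against every compound.
    """
    masses, names = index
    lo = bisect_left(masses, precursor_mass - max_shift)
    hi = bisect_right(masses, precursor_mass + max_shift)
    return [(names[i], precursor_mass - masses[i]) for i in range(lo, hi)]


# Toy data: placeholder names and made-up masses, not real measurements.
toy_compounds = {
    "compound_A": 500.0,
    "compound_B": 1000.0,
    "compound_C": 1450.0,
    "compound_D": 2100.0,
}
toy_index = build_mass_index(toy_compounds)

# A spectrum whose precursor mass is 14 units above compound_B.
hits = candidate_variants(toy_index, 1014.0, 50.0)
print(hits)
```

Building the sorted index is a one-time cost. Each lookup then needs only two binary searches plus a scan of the matching slice, rather than a pass over the whole collection. A real search would still have to score each shortlisted candidate against the full spectrum, which this sketch leaves out.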

"Natural product discovery is turning into a Big Data territory, and the field has to prepare for this transformation in terms of collecting, storing and making sense of Big Data," Mohimani said. "VarQuest is the first step toward digesting the Big Data already collected by the community."

In addition to Pevzner and Mohimani, the research team includes Alexey Gurevich, Alla Mikheenko, Alexander Shlemov and Anton Korobeynikov of St. Petersburg State University. The U.S. National Institutes of Health and the Russian Science Foundation supported this research.

Byron Spice | 412-268-9068 | bspice@cs.cmu.edu