Cell shape represents the balance of physical and chemical forces within a cell. To explore how cell shape influences behaviour, we used high-content image analysis to measure cell morphology together with protein localisation in millions of single cells, and statistical modelling to look for relationships between shape features and nuclear translocation of oncogenic transcription factors (TFs). First, we profiled a panel of morphologically diverse breast tumor and non-tumor cell lines to identify shape sensitivities of the key cancer and inflammatory TF NF-kappaB, then tested model predictions using chemical, genetic, and physical perturbations. Second, we used RNAi screening and multiparametric linear regression models to search for direct regulators of the mechanosensitive TFs YAP/TAZ by normalising for cell shape. Finally, we are currently exploring the relationships between cell morphology and the dynamics of TF localisation in living cells.

The Highly Iterative Palindrome-1 (HIP1) is a highly abundant octamer palindrome motif (5’-GCGATCGC-3’) found in a wide range of cyanobacterial genomes from various habitats. In the most extreme genome, HIP1 frequency is as high as one occurrence per 350 nucleotides. This is striking: at this frequency, every gene will, on average, be associated with more than one HIP1 motif. Such high abundance is particularly intriguing given the important roles other repetitive motifs play in the regulation, maintenance, and evolution of prokaryotic genomes. However, although HIP1 was first identified in the early 1990s, its functional and molecular roles remain a mystery.

Here I present a comparative genomics investigation of the forces that maintain HIP1 abundance in 40 cyanobacterial genomes. My genome-scale survey of HIP1 enrichment, taking into account the background trinucleotide frequency in the genome, shows that HIP1 frequencies are up to 300 times higher than expected. Further analysis reveals that, in alignments of divergent genomes, HIP1 motifs are more conserved than other octamer palindromes with the same GC content, which were used as a control. This conservation is not a byproduct of codon usage, since codons in HIP1 motifs are more conserved than the same codons found outside HIP1 motifs. HIP1 is also conserved on a broader scale. I predicted orthologs using the Notung software platform and compared enrichment of HIP1 motifs with control motifs across orthologous gene pairs. The similarity of HIP1 enrichment in orthologs is significantly higher than that of the control. Taken together, my results provide the first evidence for the mechanism driving HIP1 prevalence. The observed conservation is consistent with selection acting to maintain HIP1 prevalence and rejects the hypothesis that HIP1 abundance is due to a neutral process, such as DNA repair. The evidence of selection thus suggests a functional role for HIP1. My analysis of the genome-wide spatial distribution of HIP1 suggests that the motif lacks periodicity, which argues against a role in supercoiling. The spatial distribution of HIP1 motifs in mRNA transcript data from Synechococcus sp. PCC 7942 reveals a significant 3’ bias, which is suggestive of regulatory functions such as transcription termination and inhibition of exonucleolytic degradation. I conclude by discussing my findings in the context of cyanobacterial evolution and propose testable hypotheses for future work.
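
As an illustration of what an enrichment measure that accounts for background trinucleotide frequency could look like, the sketch below compares the observed HIP1 count with the count expected under a second-order Markov model built from trinucleotide and dinucleotide counts. The abstract does not specify the background model actually used, so the choice of this estimator and the function names (count_kmers, expected_count, enrichment) are assumptions made for illustration only. Because GCGATCGC is its own reverse complement, scanning a single strand is sufficient.

```python
from collections import Counter

HIP1 = "GCGATCGC"  # the octamer palindrome named in the abstract


def count_kmers(seq, k):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def count_occurrences(seq, motif):
    """Count overlapping occurrences of motif in seq."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def expected_count(seq, motif):
    """Expected motif count under a second-order Markov background.

    Assumed estimator (not taken from the abstract): the product of the
    counts of all trinucleotides in the motif, divided by the product of
    the counts of its internal dinucleotides.
    """
    tri = count_kmers(seq, 3)
    di = count_kmers(seq, 2)
    numerator = 1.0
    for i in range(len(motif) - 2):
        numerator *= tri[motif[i:i + 3]]
    denominator = 1.0
    for i in range(1, len(motif) - 2):
        denominator *= di[motif[i:i + 2]]
    return numerator / denominator if denominator else 0.0


def enrichment(seq, motif=HIP1):
    """Observed/expected ratio; NaN when the expectation is zero."""
    expected = expected_count(seq, motif)
    if not expected:
        return float("nan")
    return count_occurrences(seq, motif) / expected
```

Calling enrichment(genome_sequence) on an uppercase genome string would return the fold enrichment under these assumptions; a value far above 1 indicates that the motif occurs much more often than its constituent trinucleotides predict.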

Thesis Committee:
Dannie Durand (Advisor)
N. Luisa Hiller (Biological Sciences/CMU)
Jeffrey Lawrence (Biological Sciences/University of Pittsburgh)
Daniel Baker (School of Biology/St. Andrew's University, Scotland)

Biological networks, social networks, and the dynamic processes over them, such as diffusion, can be better understood by simultaneously analyzing both the network data and the diffusion data. However, data about diffusion, the network, and node attributes are all limited and often wrong. Overcoming this bottleneck of limited and uncertain data is an important challenge in better estimating the network structure, better finding the correlations hidden in the network, and better tracking the diffusion dynamics over the network.

We focus on four different problems regarding the analysis of networks, and diffusion dynamics over them, with limited information. We first improve protein annotation prediction performance by metric labeling and an associated semi-metric embedding of the annotations, which integrate the similarities between annotations with protein network data. Second, we propose methods to accurately reconstruct the unknown network from available diffusion data at both micro and macro scales, in both biological and social domains. Then, we formulate the diffusion history reconstruction problem, which estimates diffusion histories from incomplete snapshots of the diffusion process, and apply our methods accurately to different diffusion types. Lastly, we propose novel methods to deconvolve the biological 3C interaction matrix, which is an ensemble over a cell population, under several assumptions about the underlying structures. All these problems are computational, and we validate the effectiveness of our methods with both computational experiments and theoretical bounds.

The phylum Apicomplexa is composed of parasites that have an enormous impact on human and animal health. These include the causative agents of malaria, cryptosporidiosis, and toxoplasmosis. In my lab, we use the human parasite Toxoplasma gondii to understand the evolution of virulence and host range in eukaryotic pathogens. While most parasites have highly restricted and specialized life cycles involving only a limited number of host species, Toxoplasma gondii boasts an intermediate host range that includes both birds and all mammals studied to date. To understand the genetic basis for these distinct phenotypes, we are using comparative genomics to identify loci that distinguish T. gondii from its nearest relatives. While gene content across our query species is remarkably well conserved overall, we have found that tandemly expanded loci are by far the most distinguishing genomic feature. We are now investigating the role of select Toxoplasma-specific expanded loci in parasite biology, and have determined that one of these loci is responsible for a species-specific cellular phenotype in T. gondii, specifically the manipulation of host mitochondria. We are now using genetic tools to determine how this phenotype uniquely evolved in Toxoplasma gondii.

