Identification of proteins associated with specialized (secretory) organelles in apicomplexan parasites
Targeting of eukaryotic proteins to subcellular organelles typically involves specific sequence motifs. Apicomplexan parasites, including the human pathogens Plasmodium (malaria) and Toxoplasma, harbor a variety of distinctive organelles of interest as therapeutic targets, including the 'apical complex' (micronemes, rhoptries) required for host cell attachment and invasion, and the 'digestive vacuole' (DV) used by Plasmodium to digest hemoglobin. Unfortunately, the signals involved in targeting proteins to these organelles remain poorly defined; identification has been hampered by the multiplicity of subcellular trafficking pathways, the small number of experimentally validated organellar proteins, and the inaccuracy of gene models (particularly in the first exon, which often encodes amino-terminal targeting sequences). This dissertation integrates computational and experimental approaches to facilitate the recognition of specialized organellar proteins in the Apicomplexa. Computational analysis of experimental truncations suggested a novel DV targeting motif near the amino-terminus of the cytoplasmic domain, which was validated by site-directed mutagenesis. Mining the parasite genome yields ∼100 candidate DV proteins, two of which were confirmed experimentally. Microneme proteins lack conserved targeting motifs (beyond the presence of a generic signal peptide) but often harbor adhesive domains. Mining the genomes of twelve parasite species identifies >600 candidate microneme proteins based on these domain patterns, and seven of eight proteins tested localize to the apical complex. Analysis of available interactome datasets also identifies ∼1,500 potential human partner proteins, including most known receptors. Because identification of organellar targeting signals is frequently compromised by inaccurate genome annotation, a gene model extender was developed based on mapping protein-level features to genomic sequence.
This strategy improves gene prediction accuracy for P. falciparum, T. gondii and Homo sapiens, particularly with respect to the specificity of first exon recognition. Further studies indicate that the concordance of signal peptide presence/absence in orthologous protein groups can also be used to improve gene models. In aggregate, this research has helped to identify new proteins associated with the specialized organelles of apicomplexan parasites, and confirms the potential of exploiting protein-level features to develop more sophisticated models for gene finding.
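
The idea of using signal peptide concordance within orthologous groups to flag questionable gene models can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function name, the input layout (ortholog group ID mapped to per-gene signal peptide calls from any external predictor), the gene and group identifiers, and the 0.75 agreement threshold are all assumptions chosen for illustration.

```python
from collections import Counter


def flag_discordant_models(groups, min_agreement=0.75, min_size=3):
    """Return genes whose signal peptide call disagrees with their ortholog group.

    groups: dict mapping a (hypothetical) ortholog group ID to a dict of
            gene ID -> bool, where True means a signal peptide was predicted.
    min_agreement: assumed fraction of the group that must share one state
            before the minority is treated as suspect.
    min_size: assumed minimum group size needed to call a majority at all.
    """
    flagged = {}
    for group_id, calls in groups.items():
        if len(calls) < min_size:
            continue  # too few orthologs to define a meaningful majority
        majority_state, majority_n = Counter(calls.values()).most_common(1)[0]
        if majority_n == len(calls):
            continue  # fully concordant group, nothing to flag
        if majority_n / len(calls) < min_agreement:
            continue  # no clear majority, so the group is uninformative
        # Minority genes are candidates for a mis-annotated first exon.
        flagged[group_id] = sorted(
            gene for gene, has_sp in calls.items() if has_sp != majority_state
        )
    return flagged


# Hypothetical input: identifiers are placeholders, not real annotations.
example_groups = {
    "OG1": {"sp1_g1": True, "sp2_g1": True, "sp3_g1": True, "sp4_g1": False},
    "OG2": {"sp1_g2": False, "sp2_g2": False, "sp3_g2": False},
    "OG3": {"sp1_g3": True, "sp2_g3": False},
    "OG4": {"sp1_g4": True, "sp2_g4": True, "sp3_g4": False, "sp4_g4": False},
}
flagged = flag_discordant_models(example_groups)
```

In this sketch only the minority gene in OG1 is flagged; such a gene would then be a candidate for re-examining its annotated first exon, for example by searching upstream for an alternative start that restores the signal peptide. Unanimous, undersized, and evenly split groups are deliberately skipped.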