Genetic and environmental components of human leukocyte gene expression variation in Morocco
In humans, the study of how environment and genome interact to shape phenotypic variation is particularly relevant to understanding the origins of complex diseases and the increase in their prevalence that has coincided with major shifts from traditional to urbanized lifestyles. Gene expression is the first step in a multi-step process leading to the production of higher-level phenotypes. While the genetic analysis of gene expression variation in humans using cell lines and clinical samples has been the subject of extensive research, little is known about the contributions of environment and geography to transcriptional variation.
To estimate these contributions, I examined gene expression in peripheral blood leukocyte samples from 46 desert nomadic, mountain agrarian and coastal urban Moroccan Amazigh individuals. Strikingly, as much as one third of the leukocyte transcriptome was found to be differentially expressed among lifestyles. Genome-wide polymorphism analysis indicates that genetic differentiation in the total sample is limited and is unlikely to explain the expression divergence. Methylation profiling of 1,505 CpG sites suggests limited contribution of methylation to the observed differences in gene expression.
To estimate the contributions of genetic and environmental factors jointly, I generated gene expression profiles and whole genome genotypic data from 208 and 203 individuals, respectively, from leukocyte samples of two groups of urban dwellers and two groups of rural villagers representing Arab and Amazigh ethnicities in southern Morocco. Again, the analysis revealed strong effects of environmental geography but also suggested that the interplay between non-genetic environmental factors and genes is a major modulator of the transcriptome. Both studies confirm that genetic factors are neither the sole, nor even the major, source of variation affecting the leukocyte transcriptome. The amplitude and functional characteristics of the observed differences in both studies suggest a significant impact on immune response and disease susceptibility.
The analysis was extended by performing a genome-wide association study of gene expression in which each of 516,972 genotypes was tested against each of 22,300 expressed transcripts while accounting for gender, location, genetic ethnicity, relatedness, and interaction effects. This analysis revealed 1,744 genome-wide significant associations involving 380 cis-eSNPs and 16 trans-eSNPs, all robust to population structure and environmental modulation. No evidence for genotype-by-environment interactions modulating transcript abundance was detected for the genome-wide significant associations, suggesting that environmental geography and genotypes act in a largely additive manner. Further, I confirmed numerous previously reported regulatory signals and validated several previous trait and disease GWAS findings as being modulated by variation in gene expression levels. These results further our understanding of gene expression variation in humans and emphasize the biomedical importance of expression divergence within and among human populations.
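To make the scale and shape of the association scan concrete, the following is a minimal sketch of a single SNP-by-transcript test under an additive model, fitted by ordinary least squares with fixed covariates. It is an illustration under stated assumptions, not the model used in the study: the function name, the 0/1/2 genotype coding, and the covariate layout are hypothetical, and relatedness and interaction terms (which the study accounted for) are omitted for brevity.

```python
import numpy as np

# Counts taken from the text: every genotype is tested against every transcript.
N_GENOTYPES = 516_972
N_TRANSCRIPTS = 22_300
N_TESTS = N_GENOTYPES * N_TRANSCRIPTS  # total number of pairwise tests


def snp_transcript_association(expression, genotype, covariates):
    """Hypothetical helper: OLS of one transcript on one SNP plus covariates.

    expression : (n,) transcript abundance per individual
    genotype   : (n,) allele counts coded 0/1/2 (assumed additive coding)
    covariates : (n, k) fixed covariates, e.g. gender, location, ethnicity
    Returns (beta, t_stat) for the genotype term.
    """
    y = np.asarray(expression, dtype=float)
    g = np.asarray(genotype, dtype=float)
    c = np.asarray(covariates, dtype=float)
    n = y.shape[0]
    if c.ndim == 1:
        c = c.reshape(n, 1)

    # Design matrix: intercept, genotype, then covariates.
    X = np.column_stack([np.ones(n), g, c])
    coef, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)
    if rank < X.shape[1]:
        raise ValueError("design matrix is rank deficient")

    # Residual variance and the standard error of the genotype coefficient.
    residuals = y - X @ coef
    dof = n - X.shape[1]
    sigma2 = residuals @ residuals / dof
    cov_beta = sigma2 * np.linalg.inv(X.T @ X)

    beta = coef[1]
    t_stat = beta / np.sqrt(cov_beta[1, 1])
    return beta, t_stat


if __name__ == "__main__":
    # Simulated data only; these values are not from the study.
    rng = np.random.default_rng(0)
    n = 200
    genotype = rng.integers(0, 3, size=n)
    location = rng.integers(0, 2, size=n)  # e.g. urban vs rural indicator
    expression = 0.5 * genotype + 1.0 * location + rng.normal(0, 0.5, size=n)
    beta, t_stat = snp_transcript_association(expression, genotype, location)
    print(f"tests in a full scan: {N_TESTS:,}")
    print(f"beta = {beta:.3f}, t = {t_stat:.2f}")
```

Repeating such a test for every genotype-transcript pair yields on the order of 10^10 tests, which is why a stringent genome-wide significance threshold is needed. A genotype-by-environment interaction could be probed in the same framework by adding a genotype-times-location column to the design matrix and testing its coefficient.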