Genetics of Pigmentation Diversity Aug 29, 2019 18:15:37 GMT
Post by Admin on Aug 29, 2019 18:15:37 GMT
Genomic context of GWAS loci
The SNPs with the strongest association with skin color in Africans were on chromosome 15 at or near the Solute Carrier Family 24 Member 5 (SLC24A5) gene (Figs. 1). A functional non-synonymous mutation within SLC24A5 (rs1426654) (14) was significantly associated with skin color (F-test, p-value = 5.5 × 10^−62) and was identified as potentially causal by CAVIAR (Table 1). The rs1426654 (A) allele is at high frequency in European, Pakistani, and Indian populations (Fig. 1) and is a target of selection in Europeans, Central Asians, and North Indians (15-17). In Africa, this variant is common (28-50% frequency) in populations from Ethiopia and Tanzania with high Afroasiatic ancestry (18, 19) and is at moderate frequency (5-11%) in San and Bantu populations from Botswana with low levels of East African ancestry and recent European admixture (20, 21) (Fig. 1 and figs. S2 and S4). We observe a signature consistent with positive selection at SLC24A5 in Europeans on the basis of extreme values of Tajima’s D statistic (fig. S5).
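For readers unfamiliar with the statistic mentioned above, here is a minimal Python sketch of the standard Tajima's D formula. It is only an illustration: the function name `tajimas_d` and the toy 0/1 haplotype strings are hypothetical, and nothing here reproduces the paper's pipeline or data. Negative values indicate an excess of rare variants, as expected after a selective sweep or population expansion; positive values indicate an excess of intermediate-frequency variants.

```python
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D for equal-length haplotypes coded as strings of '0' and '1'.

    Returns None when the statistic is undefined (fewer than 4 sequences,
    where the variance term is zero, or no segregating sites).
    """
    n = len(haplotypes)
    if n < 4:
        return None
    length = len(haplotypes[0])

    seg_sites = 0   # S: number of segregating (polymorphic) sites
    pi = 0.0        # mean number of pairwise differences
    for j in range(length):
        k = sum(1 for h in haplotypes if h[j] == "1")
        if 0 < k < n:
            seg_sites += 1
            # Fraction of sequence pairs that differ at this site
            pi += 2.0 * k * (n - k) / (n * (n - 1))
    if seg_sites == 0:
        return None

    # Constants from Tajima (1989)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Toy data (hypothetical): every variant is a singleton, so D is negative.
print(tajimas_d(["1000", "0100", "0010", "0001"]))
```

In practice the statistic is computed in sliding windows along a chromosome, and "extreme" is judged against the genome-wide distribution of window values.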
Based on coalescent analysis with sequence data from the Simons Genomic Diversity Project (SGDP) (13), the time to most recent common ancestor (TMRCA) of most Eurasian lineages containing the rs1426654 (A) allele is 29 kya (95% credible interval (CI) 28-31 kya), consistent with prior studies (6, 16) (Fig. 4). Haplotype analysis indicates that the rs1426654 (A) variant in Africans is on the same extended haplotype background as in Europeans (Fig. 5 and fig. S6), likely reflecting gene flow from western Eurasia over at least the past 3-9 ky (22). The rs1426654 (A) variant is at high frequency (28%) in Tanzanian populations, suggesting a lower bound (~5 kya) for introduction of this allele into East Africa, the time of earliest migration from Ethiopia into Tanzania (23). Further, the frequency of the rs1426654 (A) variant in eastern and southern Africans exceeds the inferred proportion of non-African ancestry (figs. S2 and S4). The estimate of genetic differentiation (FST) at the rs1426654 SNP between the West African Yoruba (YRI) and Ethiopian Amhara populations is 0.76, among the top 0.01% of values on chromosome 15 (table S4). These results are consistent with selection for the rs1426654 (A) allele in African populations following introduction, although complex models of demographic history cannot be ruled out.
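To make the FST comparison above more concrete, the following Python sketch computes Wright's FST for a single biallelic SNP from two population allele frequencies, and the fraction of background values that are at least as large as a given value. The estimator used in the paper is not stated in this excerpt, so this is an assumption for illustration only; the function names and all input numbers are hypothetical, not data from the study.

```python
def wright_fst(p1, p2):
    """Wright's FST at one biallelic SNP for two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    FST = (H_T - H_S) / H_T, where H_T is the expected heterozygosity of the
    pooled population and H_S is the mean within-population heterozygosity.
    """
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    if h_t == 0.0:
        return 0.0  # allele is fixed or absent in both populations
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_t - h_s) / h_t


def empirical_tail_fraction(value, background):
    """Fraction of background values that are >= value."""
    return sum(1 for b in background if b >= value) / len(background)


# Hypothetical frequencies: allele absent in one population, 50% in the other.
print(wright_fst(0.0, 0.5))

# Hypothetical background of per-SNP FST values along a chromosome.
print(empirical_tail_fraction(0.76, [0.10, 0.20, 0.80, 0.05]))
```

A SNP whose FST falls in a very small tail fraction of the chromosome-wide distribution is an outlier relative to the differentiation expected from demographic history alone, which is the logic behind the "top 0.01%" statement.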
Coalescent Trees and TMRCA dating
A lysosomal transporter protein associated with skin pigmentation
The region with the second strongest genetic association with skin pigmentation contains the Major Facilitator Superfamily Domain Containing 12 (MFSD12) gene on chromosome 19 (Figs. 1 and 3 and tables S2 and S3). MFSD12 is homologous to other genes containing MFS domains, conserved throughout vertebrates, which function as transmembrane solute transporters (24). MFSD12 mRNA levels are low in de-pigmented skin of vitiligo patients (25), likely due to autoimmune-related destruction of melanocytes.
The MFSD12 locus is in a region with extensive recombination, enabling us to fine-map eight potentially causal SNPs (Table 1 and table S3) that cluster in two regions: one within MFSD12 and the other ~7,600-9,000 bp upstream of MFSD12 (Fig. 3). Many SNPs are in predicted regulatory regions active in melanocytes and/or keratinocytes (Table 1 and Fig. 3) and show enhancer activity in luciferase expression assays in a WM88 melanoma cell line (Table 1, table S5, and fig. S7). Within MFSD12, the two SNPs that CAVIAR identifies as having the highest probability of being causal are rs56203814 (F-test, p-value = 3.6 × 10^−18), a synonymous variant within exon 9, and rs10424065 (F-test, p-value = 5.1 × 10^−20), located within intron 8. They are 130 bp apart, in strong LD, and impact gene expression in luciferase expression assays (1.5-2.7× higher expression than the minimal promoter; fig. S7). The SNPs upstream of MFSD12 with the highest probability of being causal are rs112332856 (F-test, p-value = 3.8 × 10^−16) and rs6510760 (F-test, p-value = 6.5 × 10^−15). They are 346 bp apart, in strong LD, and impact gene expression in luciferase expression assays (4.0-19.7× higher expression than the minimal promoter; fig. S7).
The derived rs56203814 and rs10424065 (T) alleles associated with dark pigmentation are present only in African populations (or those of recent African descent) and are most common in East African populations with Nilo-Saharan ancestry (Fig. 1 and fig. S4). Coalescent analysis of the SGDP dataset indicates that the rs10424065 (T) allele predates the 300 kya origin of modern humans (26) (estimated TMRCA of 612 kya, 95% CI 515-736 kya) (Fig. 4).
At rs6510760 and rs112332856, the ancestral (G) and (T) alleles, respectively, associated with light pigmentation, are nearly fixed in Europeans and East Asians and are common in San as well as Ethiopian and Tanzanian populations with Afroasiatic ancestry (Fig. 1 and fig. S4). The derived rs6510760 (A) and rs112332856 (C) alleles (associated with dark pigmentation) are common in all sub-Saharan Africans except the San, as well as in South Asian and Australo-Melanesian populations (Fig. 1 and fig. S4). Haplotype analysis places the rs6510760 (A) allele (and the linked rs112332856 (C) allele) in Australo-Melanesians on haplotype backgrounds similar to those in central and eastern Africans (Fig. 5 and fig. S6), suggesting they are identical by descent from an ancestral African population. Coalescent analysis of the SGDP dataset indicates that the TMRCA for the derived rs6510760 (A) allele is 996 kya (95% CI 820-1,200 kya; Fig. 4).