Post by Admin on Oct 31, 2018 18:11:32 GMT
Although pigmentation varies globally, it has been more thoroughly studied and is therefore better understood in European populations. This has led to a research gap, especially in East-Asian populations. The OCA2 gene, which is thought to be responsible for maintaining pH levels within melanosomes,1 has been shown to be under positive selection in both European and East-Asian populations.2,3 However, the variants and haplotypes favored by selection are different in each population.2,4,5,6 For example, a variant located within the HERC2 gene is known to affect the expression of the nearby OCA2 gene, and it is strongly associated with blue eyes in European populations.7,8,9 The HERC2 rs12913832 allele associated with blue eyes has a high frequency in Europe but is not present in East-Asian populations.7,8,9 In addition, two non-synonymous polymorphisms, rs1800414 and rs74653330, have been associated with pigmentation in East Asians5,10,11 and are not found at high frequencies in any population outside of East Asia.12 It has been suggested that the phenotype of lighter skin is a result of convergent evolution in Europe and East Asia.2,6,13 Available population data indicate that the rs1800414 and rs74653330 polymorphisms show a distinct geographical distribution. The highest frequencies of the derived rs1800414 G allele are found in Japan, China and Korea, whereas the derived rs74653330 A allele has the highest frequencies in northern East Asia, including Mongolia.12,14 In this report, we provide further data on the global distribution of rs1800414 and rs74653330, with a primary focus on the allelic frequencies observed in East Asia. Briefly, the two polymorphisms were genotyped in the Human Genome Diversity Project–Centre d’Étude du Polymorphisme Humain (HGDP–CEPH) samples (http://www.cephb.fr/en/hgdp_panel.php) by LGC Genomics (Beverly, MA, USA) by using KASP genotyping technology.
The HGDP–CEPH panel includes samples for more than 1,000 individuals from 52 populations around the world. Supplementary Table 1 shows the allelic frequencies of both markers in the HGDP–CEPH panel. In agreement with previous data, both polymorphisms are primarily restricted to East-Asian populations. The derived rs1800414 G allele has a broad distribution in East Asia, with the highest frequencies observed in the Japanese population (79%) and several populations from China (Dai, Miaozu, Han, Hezhen, Tujia and Xibo, with frequencies between 65 and 50%). In contrast, the distribution of the derived rs74653330 A allele is more restricted, with the highest frequencies found in Altaic speaking populations from northern East Asia and Mongolia, such as the Yakut from Siberia (36%), the Daur (33%), the Oroqen (28%), the Hezhen (22%) and the Mongola (20%). Figure 1 shows a map of East Asia with the frequencies of both polymorphisms. The derived rs1800414 G and rs74653330 A alleles are not present in any of the samples from Africa, the Middle East or Oceania. In the Americas, the rs1800414 G allele is also absent, and one Maya individual is heterozygous for rs74653330. Both derived alleles are present at very low frequencies in Central–South Asia (rs1800414 G: 4.4%; rs74653330 A: 2.1%) and Europe (rs1800414 G: 0.3%; rs74653330 A: 1%). Within Central–South Asia, the derived alleles are primarily present in the Hazara (Pakistan) and Uygur (China). Within Europe, the derived alleles are observed only in Russia. The presence of the two derived alleles in some of the populations from Central–South Asia and Europe seems to be the consequence of gene flow from East-Asian groups.
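As a minimal sketch of how allele frequencies like those reported above are derived from genotype calls, the Python snippet below counts derived alleles across diploid genotypes. The function name and the example counts are hypothetical and are not taken from Supplementary Table 1.

```python
def derived_allele_frequency(n_hom_ancestral, n_het, n_hom_derived):
    """Return the derived-allele frequency from diploid genotype counts."""
    n_individuals = n_hom_ancestral + n_het + n_hom_derived
    if n_individuals == 0:
        raise ValueError("no genotyped individuals")
    # Each individual carries two alleles; a heterozygote contributes one
    # derived copy and a derived homozygote contributes two.
    return (n_het + 2 * n_hom_derived) / (2 * n_individuals)


# Hypothetical counts for illustration only.
example = derived_allele_frequency(n_hom_ancestral=10, n_het=8, n_hom_derived=2)
```

With these made-up counts, 12 of the 40 sampled chromosomes carry the derived allele. A single heterozygous individual in a small sample, as with the one Maya carrier of rs74653330 mentioned above, gives a low but non-zero frequency in the same way.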
Post by Admin on Nov 1, 2018 18:17:41 GMT
Figure 1 Distribution of allele frequencies for SNPs rs1800414 (blue) and rs74653330 (orange) in East-Asian populations: (1) Dai; (2) Daur; (3) Han; (4) Hezhen; (5) Japanese; (6) Lahu; (7) Miaozu; (8) Mongola; (9) Naxi; (10) Oroqen; (11) She; (12) Tu; (13) Tujia; (14) Uyghur; (15) Xibo; (16) Yakut; (17) Yizu; and (18) Cambodia.
It is interesting to note that the frequency distribution of the rs74653330 A allele reflects the present genetic structure at a genome-wide level in East Asia. We used the program PLINK15 to perform principal component analysis (PCA) of the East-Asian CEPH–HGDP populations by using genome-wide data (Affymetrix Axiom Human Origins Array) available on the HGDP–CEPH website (http://www.cephb.fr/en/hgdp_panel.php). We pruned SNPs based on linkage disequilibrium (LD) and removed five known areas of long-range LD. Figure 2 shows a visualization of the first two axes of the PCA using the program PAST (http://folk.uio.no/ohammer/past/). There is a clear geographic pattern with the northern populations (Yakut, Oroqen, Mongola, Daur and Hezhen) present on the left side of the plot. As described above, it is precisely in these populations that the highest frequencies of the derived rs74653330 A allele are observed.
Figure 2 PCA (axes 1 and 2) showing population structure of East-Asian populations from the CEPH–HGDP panel.
We explored the haplotype structure of the OCA2 region in East Asia in detail. To do this, we merged the genotype data of the two markers of interest with the Affymetrix Human Origins data set for chromosome 15 plus the Illumina (San Diego, CA, USA) 650K data set for chromosome 15. The OCA2 gene was extracted from this data set by selecting markers from chromosome 15, position 25–26.5 Mb.
On the basis of the north–south geographical gradient observed in the PCA output as well as the geographic distribution of the two polymorphisms, the haplotype analysis of East Asia was carried out separately in northern East Asia and the rest of East Asia. Populations that were included in the northern grouping included the Yakut from Siberia and the Oroqen, Mongola, Daur and Hezhen from northern China. The haplotype analyses were performed with the program Haploview.16
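The LD pruning and LD-block analysis described above both rest on a pairwise LD statistic between SNPs. The sketch below computes the common r² measure for two biallelic SNPs in plain Python. It assumes phased haplotypes coded as 0/1, and the function name is hypothetical. It is not the PLINK or Haploview implementation, and those programs work from genotype data with their own estimation procedures.

```python
def ld_r2(haplotypes):
    """Return r^2 between two biallelic SNPs.

    haplotypes: list of (a, b) tuples, one per phased chromosome, where
    a and b are 0 or 1 for the allele carried at SNP A and SNP B.
    Returns None if either SNP is monomorphic (r^2 is undefined).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None
    d = p_ab - p_a * p_b                             # LD coefficient D
    return d * d / denom


# Hypothetical toy data: two SNPs that always travel together,
# and two SNPs that are statistically independent.
perfect = ld_r2([(1, 1), (0, 0), (1, 1), (0, 0)])
independent = ld_r2([(1, 1), (1, 0), (0, 1), (0, 0)])
```

Pruning removes one SNP from each pair whose r² exceeds a chosen threshold, so that the PCA is not dominated by blocks of correlated markers.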
Post by Admin on Nov 2, 2018 18:36:30 GMT
Figure 3 shows the haplotype structure surrounding the rs1800414 and rs74653330 polymorphisms. The two non-synonymous polymorphisms are located in the same LD block, but they are always found in different haplotypes. The haplotype analysis suggests that the haplotypes carrying the derived alleles for each polymorphism arose independently from the same ancestral haplotype. Using the markers rs7170451–rs1800414–rs728405–rs728404–rs4778214–rs1448488–rs12903382–rs74653330–rs12910433–rs3794609–rs730502 to define the haplotype block (the relevant non-synonymous polymorphisms are labeled in bold), our results indicate that, from the ancestral haplotype ‘AAGAGCAGGTT’, a non-synonymous mutation at rs1800414 originated the haplotype ‘AGGAGCAGGTT’, and another non-synonymous mutation independently originated the haplotype ‘AAGAGCAAGTT’. Both derived haplotypes then increased in frequency in different regions of East Asia. The haplotype ‘AGGAGCAGGTT’ is now the most common haplotype in a broad region of East Asia, whereas the haplotype ‘AAGAGCAAGTT’ has become the most prevalent in northern East Asia. Several lines of evidence indicate that this increase in frequency may have been the result of positive selection favoring light skin in high-latitude regions. Both derived alleles are non-synonymous variants predicted to have a functional effect,11 and both have been associated with lighter skin pigmentation in East-Asian populations.5,10,11 In addition, several studies have identified signatures of positive selection in the OCA2 region in genome-wide scans in East-Asian populations.2,3 The geographic distribution of the variants strongly suggests that these two mutations arose after the separation of European and East-Asian populations. 
This is supported by a recent study that dated the derived G allele of the OCA2 rs1800414 polymorphism to ~10,000 years ago.17
Figure 3 Haplotype block structure and pattern of LD of the OCA2 region including markers rs1800414 (marker 424) and rs74653330 (marker 432). (a) Northern East Asia; (b) in the rest of Asia.
To our knowledge, there has been no attempt to date the polymorphism rs74653330. We used the dense, genome-wide SNP data available for the HGDP–CEPH panel to estimate the ages of the derived alleles at rs1800414 and rs74653330 in East-Asian populations. We used a method18 that relies on the decay of haplotype sharing of the ancestral genomic segment on which the derived mutations occurred. Before the analysis, we removed individuals with pi-hat values exceeding 0.05 to minimize potential problems with cryptic relatedness. To account for the possibility that members of individual populations may have a most recent common ancestor (MRCA) that is more recent than the MRCA of the entire East-Asian sample, we calculated these age estimates assuming a correlated genealogy.18 Under these conditions, and assuming a generation time of 29 years,19 we estimated the age of the derived allele at rs74653330 to be 6,835 years (95% confidence interval (CI): 1,070–12,798). The estimated age of the derived allele at rs1800414 is quite similar at 6,397 years (95% CI: 1,183–11,446 years). This is slightly younger than a previous estimate of the age of the derived allele at rs1800414 using a different method (10,660 years; 95% CI of 8,070–15,780),17 although the CIs of our estimate overlap Chen’s point estimate. The discrepancy in age may be explained by differences in the two methods as well as in differences among the East-Asian populations and the data sets used in each study.
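The haplotype relationships described earlier in this post can be checked directly from the strings quoted in the text. The sketch below lists, for each derived haplotype, the markers at which it differs from the ancestral haplotype 'AAGAGCAGGTT'. The marker order and haplotype strings come from the text; the function name is hypothetical.

```python
# Marker order used in the text to define the haplotype block.
MARKERS = [
    "rs7170451", "rs1800414", "rs728405", "rs728404", "rs4778214", "rs1448488",
    "rs12903382", "rs74653330", "rs12910433", "rs3794609", "rs730502",
]
ANCESTRAL = "AAGAGCAGGTT"


def differing_markers(haplotype, reference=ANCESTRAL, markers=MARKERS):
    """Return (marker, reference_allele, haplotype_allele) for each mismatch."""
    if not (len(haplotype) == len(reference) == len(markers)):
        raise ValueError("haplotype, reference and marker list must have equal length")
    return [
        (marker, ref, alt)
        for marker, ref, alt in zip(markers, reference, haplotype)
        if ref != alt
    ]


broad_east_asia = differing_markers("AGGAGCAGGTT")
northern_east_asia = differing_markers("AAGAGCAAGTT")
```

Each derived haplotype differs from the ancestral one at exactly one position, rs1800414 (A to G) in the first case and rs74653330 (G to A) in the second, which is consistent with two independent mutations on the same ancestral background.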
Recent ancient DNA studies, which have characterized dense genomic data in Eurasian individuals spanning a broad archaeological period (e.g., from hunter gatherers to individuals living in the Bronze Age), have provided important information about the temporal distribution of genetic markers associated with pigmentation variation in Europe and have strengthened the case for selection operating in pigmentation-related genes in this region.20,21,22 Similar studies in East Asia have the potential to clarify the major events that have shaped the interesting distribution of the two non-synonymous variants of the OCA2 gene in this vast area. In this respect, it will be important to consider not only potential selective effects but also the major population movements that have taken place in this region during the past 15,000 years. Human Genome Variation volume 2, Article number: 15058 (2015)
Post by Admin on Nov 6, 2018 18:10:27 GMT
Fig. 1. Probabilities of obtaining FST equal to or greater than that observed (0.00551) between 60 Eneolithic (ca. 6,500–5,000 y ago) and Bronze Age (ca. 5,000–4,000 y ago) samples from the Pontic–Caspian steppe, and a combined sample of 246 homologous modern sequences from the same geographic region, across a range of assumed ancestral population size combinations. Two phases of exponential growth were modeled, the first after the initial colonization of Europe 45,000 y ago, of assumed effective female population size NUP (y axis), and ending when farming began in the region considered 7,000 y ago, when the assumed effective female population size was NN (x axis), and the second leading up to the present, when the assumed effective female population size is 5,444,812. The initial colonizers of Europe were sampled from a constant ancestral African population of 5,000 effective females. Gray shaded areas indicate P values >0.05.
Ancient DNA was retrieved from 63 out of 150 Eneolithic (ca. 6,500–5,000 y ago) and Bronze Age (ca. 5,000–4,000 y ago) samples from the Pontic–Caspian steppe, mainly from modern-day Ukraine. We used multiplex-PCR enrichment and next-generation sequencing to genotype the three pigmentation-associated SNPs (rs12913832, rs16891982, and rs1042602) and mtDNA hypervariable region 1 (HVR1) sequences plus 32 mtDNA coding region SNPs and a 9-bp-indel from these individuals (Tables S1 and S2). Consensus HVR1 sequences were successfully assembled from 60 individuals. Pigmentation gene data were obtained from 48 samples. We also genotyped the three pigmentation-associated SNPs in a sample of 60 modern Ukrainians (28) and observed an increase in frequency of all derived alleles between the ancient and modern samples from the same geographic region (Table 1 and Fig. S1). This implies that the pigmentation of the prehistoric population is likely to have differed from that of modern humans living in the same area.
Modern frequencies of the derived alleles within all of Europe and outside of Europe are provided for comparison (Table 1). Inferring natural selection based on temporal differences in allele frequency requires the assumption of population continuity. To this end we compared the 60 mtDNA HVR1 sequences obtained from our ancient sample to 246 homologous modern sequences (29–31) from the same geographic region and found low genetic differentiation (FST = 0.00551; P = 0.0663) (32). Coalescent simulations based on the mtDNA data, accommodating uncertainty in the ancient sample age, failed to reject population continuity under a wide range of assumed ancestral population size combinations (Fig. 1).
Fig. 2. Two-tailed empirical P values for obtaining the observed (A) SLC45A2 G allele, (B) TYR A allele, and (C) HERC2 G allele frequency increase. P values were obtained by forward simulation of drift and natural selection across a range of assumed ancestral population sizes and selection coefficients, assuming exponential growth to a modern Ne of 4,845,710. The SLC45A2 rs16891982 G allele and the TYR rs1042602 A allele were assumed to be codominant. The HERC2 rs12913832 G allele was assumed to be recessive (values less than 0.01 are shaded gray).
Conversely, continuity between early central European farmers and modern Europeans has been rejected in a previous study (33). However, the Eneolithic and Bronze Age sequences presented here are ∼500–2,000 y younger than the early Neolithic and belong to lineages identified both in early farmers and late hunter–gatherers from central Europe (33).
A plausible explanation for this is that the prehistoric populations sampled in this study are a product of admixture between in situ hunter–gatherers and immigrant early farmers during the centuries after the arrival of farming, and that this admixture was a major process shaping modern patterns of mtDNA variation (34) and possibly also the variability observed in European hair, eye, and skin color. To test whether the observed increases in the three light pigmentation-associated alleles can be explained by genetic drift alone or whether natural selection needs to be invoked, we performed forward computer simulations of drift plus selection, accommodating uncertainty in ancient and modern allele frequency, population size, and ancient sample age. We assumed codominance for both SLC45A2 rs16891982 G and TYR rs1042602 A alleles (22, 35) and that the derived HERC2 rs12913832 G allele is recessive (36). Using these simulations, neutrality (S = 0) was rejected under all assumed ancestral effective population sizes, ranging from 10³ to 10⁵ at the time of the ancient sample (SLC45A2 P < 1 × 10⁻⁵, TYR P < 2 × 10⁻⁵, and HERC2 P < 1 × 10⁻⁵). The values of selection acting on the SLC45A2 rs16891982 G allele, the TYR rs1042602 A allele, and the HERC2 rs12913832 G allele that best explained the observed derived allele frequency changes were 0.030, 0.026, and 0.036, respectively (Fig. 2). Whereas there is strong evidence that the derived HERC2 rs12913832 G allele is recessive (36), it is less clear whether the SLC45A2 and TYR derived alleles are codominant, recessive, or dominant (22, 35).
Under the assumption that both SLC45A2 and TYR derived alleles are recessive, the selection values that best explain the observed changes in frequency are 0.022 and 0.104, respectively, and under the assumption that they are dominant the selection values are 0.088 and 0.016, respectively; again, neutrality was rejected for all three alleles (P < 4 × 10⁻⁵) under all ancestral population sizes modeled and all assumptions of dominance/codominance/recessivity (Fig. S2).
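To illustrate what a forward simulation of drift plus selection involves, here is a minimal Wright–Fisher sketch in Python. It is not the authors' simulation. It assumes a constant effective population size (the study modeled exponential growth and uncertainty in sample ages and frequencies), and the function names, the dominance parameter h and the example values are hypothetical.

```python
import random


def expected_next_freq(p, s, h):
    """Deterministic allele frequency after one generation of selection.

    Relative fitnesses: derived homozygote 1 + s, heterozygote 1 + h * s,
    ancestral homozygote 1. h = 0.5 is codominant, h = 0 recessive,
    h = 1 dominant (for the derived allele).
    """
    q = 1.0 - p
    mean_fitness = p * p * (1 + s) + 2 * p * q * (1 + h * s) + q * q
    return (p * p * (1 + s) + p * q * (1 + h * s)) / mean_fitness


def simulate(p0, s, h, n_e, generations, rng):
    """Simulate one allele-frequency trajectory and return the final frequency."""
    p = p0
    for _ in range(generations):
        p_selected = expected_next_freq(p, s, h)
        # Drift: draw 2 * n_e gametes from the post-selection gene pool.
        copies = sum(1 for _ in range(2 * n_e) if rng.random() < p_selected)
        p = copies / (2 * n_e)
    return p


# Hypothetical run: small population, short time span, fixed seed.
rng = random.Random(1)
final_freq = simulate(p0=0.2, s=0.03, h=0.5, n_e=50, generations=10, rng=rng)
```

Repeating such runs many times with s = 0 gives a null distribution of frequency change under drift alone, against which an observed increase can be compared. The dominance setting matters because a rare recessive allele is mostly hidden in heterozygotes and responds to selection more slowly than a dominant one.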
Post by Admin on Nov 7, 2018 18:00:01 GMT
Discussion Our analysis indicates that positive selection on pigmentation variants associated with depigmented hair, skin, and eyes was still ongoing after the time period represented by our archaeological population, 6,500–4,000 y ago. This finding suggests that either the selection pressures that initiated the selective sweep during the Late Pleistocene or early Holocene were still operative or that a new selective environment had arisen in which depigmentation was favored for a different reason. The high selection coefficients estimated for pigmentation genes HERC2, SLC45A2, and TYR are best understood in the context of estimates obtained for other recently selected loci. Using spatially explicit simulation and approximate Bayesian computation, selection on the LCT -13,910*T allele—which is strongly associated with lactase persistence in Europeans and southern Asians—was inferred to fall in the range 0.0259–0.0795 and to have begun around 7,500 y ago in the region between the Balkans and central Europe (37). However, another simulation-based study incorporating latitudinal effects on selection resulted in a lower estimate of S (0.008–0.018) (38). The selective advantage of the G6PD A− and Med deficiency alleles conferring resistance to malaria have been estimated at 0.019–0.048 and 0.014–0.049, respectively, in regions where malaria is endemic (39). These alleles are estimated to have arisen ∼6,357 y ago (G6PD A−) and 3,330 y ago (G6PD Med) (39). Thus, the estimates of S for the three pigmentation genes examined in this study are comparable to those for the most strongly selected loci in the human genome. Although these estimated selection coefficients are high, they are comparable to previous estimates for genes in the pigmentation complex. 
The selective sweeps favoring the SLC45A2 derived allele, as well as the derived alleles of SNPs in SLC24A5 and TYRP1, which are also implicated in the lightening of skin pigmentation, are estimated to have begun between 11,000 and 19,000 y ago, after the separation of the ancestors of modern Europeans and East Asians (the ages of the selective sweeps affecting HERC2 and TYR have not yet been estimated) (14, 40). Beleza et al. (14) recently estimated the coefficient of selection at the SLC45A2 locus to be 0.05 under a dominant model of inheritance and 0.04 under an additive model. Selection favoring the derived alleles of SNPs in SLC24A5 and TYRP1 was found to be similarly strong. Estimating selection coefficients using the ancient DNA-based simulation approach presented here offers considerable advantages over traditional methods based on allele age and frequency estimates (1): Selection coefficients are estimated over a defined period; selection acting on standing variation can be accommodated; and our approach is insensitive to the frequently unaccounted for uncertainties associated with allele age estimation using molecular or recombination clocks. This latter advantage is likely to result in considerable improvements in precision. However, our approach does require the assumption of population continuity and will not provide direct estimates of when a selective sweep began. Although the strength of the selection coefficients in a certain time window can be estimated with improved precision using our ancient DNA-based simulation approach, the actual nature of the selection pressure remains unknown. However, temporal and geographical information from the prehistoric skeletal population under study can help in formulating reasonable hypotheses. 
Geographic variation in many functional skin pigmentation gene polymorphisms (13), and lighter skin pigmentation more generally, correlate strongly with distance from the equator in long-established populations, suggesting that selective pressure also occurred along a latitudinal gradient. The samples in our study were from between 42°N and 54°N, a latitudinal belt in which yearly average UVR is insufficient for vitamin D3 photosynthesis in highly melanized skin (4, 41). Constraints on the ability to photosynthesize vitamin D3 imposed by low incident UVR intensity may have provided significant selective pressure favoring lighter pigmentation populations in high-latitude regions such as the northern Pontic steppe belt. The need to admit UVB radiation to catalyze the synthesis of vitamin D3, together with the decreased danger of folate photolysis at higher latitudes, may account for the observed skin depigmentation from prehistoric to modern times in this region (5). Dietary change during the Neolithization process may have reinforced selection pressure favoring depigmented skin. The individuals analyzed in this study lived ∼500–2,000 y after the arrival of farming in the region north of the Black Sea (42, 43). In many parts of Europe, the Mesolithic–Neolithic transition is associated with a switch from a vitamin D-rich aquatic or game-based hunter–gatherer diet (44) to a vitamin D-poor agriculturalist diet. In low-UV regimes such as the one prevailing in our study region, it is difficult to meet vitamin D requirements without the consumption of significant quantities of oily fish or animal liver (45, 46). The vitamin D recommended dietary allowance of 800–1,000 IU for adults requires the daily consumption of the equivalent of 100 g of wild salmon (the dietary input with the greatest measured vitamin D concentration). 
Isotopic evidence suggests that the populations sampled in our study continued to access aquatic resources, primarily river fish, in the Neolithic, Eneolithic, and Bronze Age, although there was considerable heterogeneity in fish consumption within the study region (47–50). However, any diminution in fish consumption may have been sufficient to generate additional selective pressure favoring depigmentation at this low-incident-UVR latitude. Although ecological and environmental factors may be sufficient to explain the observed change in European skin pigmentation, these explanations are unlikely to hold for eye and hair color. The geographic distribution of iris and hair pigmentation variation does not conform as well to a latitudinal cline model, with much of the observed phenotypic variation restricted to Europe and closely related neighboring populations (51, 52). The blue iris phenotype characteristic of the HERC2 rs12913832 G allele, for example, is almost completely restricted to western Eurasia and some adjacent regions, its descendant populations, and populations containing European admixture (51, 52). It is possible that depigmented irises or the various human hair color morphs in Europeans are by-products of selection on skin pigmentation. There is evidence for gene–gene interaction within the polygenic system governing complex pigmentation traits; interactions between HERC2, OCA2, and MC1R, in particular, have been found to have a statistically significant effect on hair, iris, and skin color (36). There is also evidence for epistatic interactions between components of the melanin synthesis pathway in other mammalian model systems, including interactions between the products of ASIP, MC1R, and TYR (53). Additionally, many pigmentation genes, including TYR, HERC2, and SLC45A2, have pleiotropic effects on skin, hair, and eye color (11, 36).
Given that intraspecific pigmentation variability in other taxa, particularly avians, has been attributed to signaling and other factors associated with mate choice (54) it is possible that depigmented irises and the various hair colors observed in Europeans arose through sexual selection (7). Frequency-dependent sexual selection in favor of rare variants has been observed in vertebrates (55, 56), and such selection favoring rare pigmentation morphs could have driven alleles associated with lighter hair and eye colors to higher frequency. Once lighter hair and eye pigmentation phenotypes reached appreciable frequencies in European populations, these novel traits may have continued to be preferred as indicators of group membership, facilitating assortative mating. Assortative mating based on coloration is common in vertebrates (57), and skin pigmentation has been observed as a criterion for endogamy in modern human populations (58, 59). In addition, there is some evidence that lighter iris colors, because of their recessive mode of inheritance, may be preferred by males in assortative mating regimes to improve paternity confidence (60). Consistent with positive assortative mating, an exact test of Hardy–Weinberg equilibrium reveals an excess of HERC2 rs12913832 homozygotes in both the modern (P = 0.0543) and ancient (P = 0.0084) East European samples genotyped here (Table S3), despite the relatively small sample sizes. The observed excess of HERC2 rs12913832 homozygotes in the ancient sample might be explained by population stratification in a temporally heterogeneous population sample. Although we do not observe any chronological or spatial patterning of the pigmentation markers in our prehistoric sample, we cannot exclude population stratification in the absence of additional neutral SNPs. 
However, we note that neither the TYR nor the SLC45A2 SNPs investigated here, nor three additional SNPs investigated in the same ancient and modern samples, showed any significant observable excess of homozygotes (Table S3), suggesting that the excess of HERC2 rs12913832 homozygotes is less likely to be due to population stratification. In sum, a combination of selective pressures associated with living in northern latitudes, the adoption of an agriculturalist diet, and assortative mating may sufficiently explain the observed change from a darker phenotype during the Eneolithic/Early Bronze age to a generally lighter one in modern Eastern Europeans, although other selective factors cannot be discounted. The selection coefficients inferred directly from serially sampled data at these pigmentation loci range from 2 to 10% and are among the strongest signals of recent selection in humans. PNAS April 1, 2014. 111 (13) 4832-4837
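The homozygote excess discussed above is judged against Hardy–Weinberg expectations. The sketch below is a descriptive comparison only, not the exact test used in the study. It computes the expected number of heterozygotes from the allele frequency and the resulting heterozygote deficit F = 1 − observed/expected. The function name and the genotype counts are hypothetical and are not taken from Table S3.

```python
def hwe_summary(n_aa, n_ab, n_bb):
    """Return (allele_a_freq, expected_heterozygotes, f) for genotype counts.

    f is 1 - observed/expected heterozygotes: positive values indicate an
    excess of homozygotes, negative values an excess of heterozygotes.
    f is None when the locus is monomorphic.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped individuals")
    p = (2 * n_aa + n_ab) / (2 * n)
    expected_het = 2 * p * (1 - p) * n
    f = None if expected_het == 0 else 1 - n_ab / expected_het
    return p, expected_het, f


# Hypothetical counts: one sample in Hardy-Weinberg proportions,
# one with fewer heterozygotes than expected.
in_equilibrium = hwe_summary(25, 50, 25)
homozygote_excess = hwe_summary(40, 20, 40)
```

A positive F alone does not distinguish assortative mating from population stratification, which is why the text compares the HERC2 result with the other SNPs typed in the same samples.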