Post by Admin on Dec 17, 2018 18:03:54 GMT
Xinjiang, China has been a contact zone of the peoples from Central Asia and East Asia. In particular, the presence of a Tocharian (an extinct Indo-European language)-speaking population during the first millennium, the discovery of mummies with European features dating from 3,000–4,000 YBP (Years Before Present), and the existence of West Eurasian mitochondrial-DNA lineages clearly indicate the influence of populations of European descent in this region, and the signature of admixture between East Asians and those of European descent is also evident.1–8 A full analysis of the genetic structure of the admixed populations in this region would shed light on human migratory history and the admixture of East Asians and those of European descent. Because many human populations settled in Central Asia, which has been a complex assembly of peoples, cultures, and habitats,9 the Uyghur population in Xinjiang demonstrates an array of mixed anthropological features of Europeans and Asians.10 We are interested in both its admixture history and its potential for gene mapping. Admixture of populations often leads to an extended linkage disequilibrium (LD), which could greatly facilitate the mapping of human disease genes.11–14 Gene mapping by admixture linkage disequilibrium (MALD) has been shown to be of special value theoretically11,15–22 and empirically.1,23–36 However, typical admixed populations used for MALD are those formed by recent admixture between groups originating on different continents as a result of European maritime expansion during the past few hundred years.
These include populations formed by two-way and three-way admixture between Europeans, West Africans, and Native Americans in the Americas, as well as populations formed by two-way admixture of Europeans with indigenous populations in Australia, the Pacific Islands, and Polar Regions.12 Because the admixture events happened a few hundred years ago, parental populations and the admixture histories of the aforementioned populations are relatively clear; it is easy to obtain the panels of markers informative for ancestry.28,34,35,37 Although Uyghur is a population presenting a typical admixture of Eastern and Western anthropological traits, its potential utility in MALD has been largely ignored because its history of admixture is uncharacterized and suspected to be longer than that of other populations. It is more difficult to identify ancestry-informative markers (AIMs), and more such markers are required, when admixture occurred beyond the time range that was considered ideal.12 Concerning the Uyghur population, many questions remain unanswered: 1) What are the ancestral origins of Uyghur? 2) Was Uyghur formed by two-way or three-way admixture? 3) How much ancestry did each parental population contribute to Uyghur? 4) How long ago did the admixture occur? 5) What is the LD pattern and magnitude in Uyghur? 6) Is Uyghur amenable to MALD? In this study, we try to answer these questions by dissecting the genetic structure of a Uyghur population sample at the population, individual, and chromosome levels by using a panel of high-density SNP markers on chromosome 21.
Figure 1 Distribution of Marker Information for 83 AIMs
Select Ancestry-Informative Markers
SNPs that have large allele-frequency differences between CHB and CEU were selected as ancestry-informative markers (AIMs). One threshold proposed for declaring a SNP to be highly informative for ancestry inference is δ = 0.5,46 which corresponds to FST ∈ [0.250, 0.333] and In ∈ [0.131, 0.216].48 Although FST, f, and In are all closely related to δ in the case of biallelic markers in two source populations, unlike δ they capture the dependence of information content on the position of allele frequencies in the unit interval. In this study, we selected AIMs according to FST value, with FST ≥ 0.35 (the upper bound of FST corresponding to δ > 0.5), from 20,177 SNPs and obtained 602 AIMs. However, we noted that in many regions, adjacent markers had the same or similar FST values and formed “blocks” of FST. One example of this is shown in Figure S3. We examined these “FST block” regions and found that they were actually haplotype blocks in both CHB and CEU and contained very few haplotypes; therefore, markers within these “FST blocks” would provide redundant information if they were all included. Furthermore, the STRUCTURE program was not designed to model the LD that occurs between nearby markers within populations (so-called “background LD”); i.e., the model is best suited for data on markers that are linked, but not so tightly linked.49 Therefore, we picked “tag AIMs” by controlling the between-marker distance (BMD) and removing redundant AIMs to avoid strong LD within CHB and CEU. At the same time, some long chromosome regions (> 1 cM) were not covered by any AIM. We filled these regions by selecting some AIMs with FST less than 0.35 but larger than 0.3. Finally, 83 AIMs were selected and used for further analysis. The distribution of marker information (FST, f, δ, and In) of the 83 AIMs is shown in Figure 1. The average BMD of these 83 AIMs was 0.82 cM (398 kb), and the median BMD was 0.57 cM (280 kb), with mean δ = 0.52, mean In = 0.20, mean f = 0.33, and mean FST = 0.47. The final length of the chromosomal region that was covered by AIMs was 67.37 cM (32.7 Mb).
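The relationship between δ, FST, and In quoted above can be checked numerically. The sketch below is not the authors' code: it assumes two equally weighted source populations and a biallelic marker, uses Wright's FST in the form δ² / (4·p̄·(1 − p̄)), and takes In to be the informativeness-for-assignment statistic in its usual entropy form (the excerpt cites it but does not print the formula). The function names are illustrative only. Under these assumptions, δ = 0.5 reproduces the intervals given in the text.

```python
import math


def delta(p1, p2):
    """Absolute allele-frequency difference between two populations."""
    return abs(p1 - p2)


def fst(p1, p2):
    """Wright's FST for one biallelic marker, two equally weighted populations."""
    p_bar = (p1 + p2) / 2
    if p_bar <= 0.0 or p_bar >= 1.0:
        return 0.0  # marker is monomorphic in the pooled sample
    return (p1 - p2) ** 2 / (4 * p_bar * (1 - p_bar))


def _xlogx(x):
    """x * ln(x), with the limit value 0 at x = 0."""
    return x * math.log(x) if x > 0 else 0.0


def informativeness(p1, p2):
    """Assumed form of In: pooled allele entropy minus mean within-population entropy."""
    total = 0.0
    for a, b in ((p1, p2), (1 - p1, 1 - p2)):  # the two alleles of the marker
        mean = (a + b) / 2
        total += -_xlogx(mean) + (_xlogx(a) + _xlogx(b)) / 2
    return total


def is_aim(p1, p2, threshold=0.35):
    """Apply the FST >= 0.35 selection rule described in the text."""
    return fst(p1, p2) >= threshold


# delta = 0.5 at the two extremes of allele-frequency position
for p1, p2 in ((0.75, 0.25), (0.0, 0.5)):
    print(delta(p1, p2), round(fst(p1, p2), 3), round(informativeness(p1, p2), 3))
```

With symmetric frequencies (0.75 vs 0.25) the statistics sit at the lower ends of the quoted intervals, and with one allele absent from one population (0 vs 0.5) they sit at the upper ends, which illustrates why FST and In depend on where the frequencies lie in the unit interval while δ does not.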
Post by Admin on Dec 18, 2018 18:16:48 GMT
Figure 2 Principal-Coordinate-Analysis Representation of the Allele-Sharing Distance
Principal-Coordinate Analysis of Individuals
Principal-coordinate analysis (PCO) provides a useful means of revealing relationships among individuals. Figure 2 is a two-dimensional plot displaying the first two PCO axes for all individuals, with allele-sharing distance (ASD) used for all pairwise combinations of individuals. Individuals from one population cluster tightly, to the exclusion of individuals from other populations. The first two axes together explain 25.8% of the total variation, and each of the remaining axes explains less than 1.5% of the total variation. The first PCO axis shows a separation of the African and non-African populations and explains 17.57% of the total variation; the second PCO axis explains 8.27% of the total variation and shows a separation of the European and East Asian populations, with UIG individuals lying between them. This is the expected result for UIG as an admixed population.
Figure 3 Probability Estimations for the Number of Clusters, with Ten Randomly Selected Data Sets
STRUCTURE Analysis and Estimation of Admixture Proportion of Individuals
Evidence of Two-Way Admixture
Given the large number of markers in our dataset, genetic analyses can be performed at the level of the individual, making no presumption of group membership. We applied a model-based clustering algorithm, implemented by the computer program STRUCTURE, to infer the genetic ancestry of individuals. Our approach is based solely on genotype, without incorporation of any information on sampling location or population affiliation of each individual. For each data set, in which markers were randomly selected subject to a between-marker distance (BMD) > 200 kb, we ran STRUCTURE from K = 2 to K = 6. Ten repeats were done for each K and each data set. According to the distribution of Ln(Pr), as shown in Figure 3, the most probable and appropriate number of clusters is three in our dataset.
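The allele-sharing distance behind the PCO plot described above can be illustrated with a short sketch. The excerpt does not spell out the exact ASD formula, so this assumes one common definition: genotypes are coded as 0, 1, or 2 copies of a reference allele, and the distance between two individuals is the mean over loci of |g1 − g2| / 2, skipping loci where either genotype is missing. The function names and the `None` missing-data convention are assumptions for illustration.

```python
def allele_sharing_distance(g1, g2):
    """Mean per-locus allele-sharing distance between two individuals.

    g1, g2: sequences of genotypes coded 0/1/2 (copies of a reference
    allele); None marks a missing genotype (an assumed convention).
    """
    total = 0
    n_loci = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue  # skip loci with missing data in either individual
        total += abs(a - b)
        n_loci += 1
    if n_loci == 0:
        raise ValueError("no loci genotyped in both individuals")
    return total / (2 * n_loci)


def asd_matrix(genotypes):
    """Symmetric matrix of pairwise distances for a list of individuals."""
    n = len(genotypes)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = allele_sharing_distance(genotypes[i], genotypes[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix
```

A matrix built this way is the input that a principal-coordinate (classical multidimensional scaling) routine would decompose to obtain axes like those in Figure 2.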
Table 4 Population Admixture-Proportion Percentages Estimated from Random Markers
Population   Cluster1      Cluster2      Cluster3
JPT          0.6 ± 0.2     1.2 ± 0.3     98.1 ± 0.4
CHB          0.7 ± 0.2     1.4 ± 0.4     97.9 ± 0.5
UIG          2.1 ± 0.8     56.2 ± 3.8    41.7 ± 3.9
CEU          0.8 ± 0.2     97.8 ± 0.3    1.4 ± 0.2
YRI          97.9 ± 0.6    1.1 ± 0.3     1.0 ± 0.3
Cluster1 corresponds to African ancestry. Cluster2 corresponds to European ancestry. Cluster3 corresponds to East Asian ancestry.
The three clusters correspond to African, European, and Asian populations. The middle plots in Figure 4 are the average results of ten data sets at K = 3 (there were some variations in the estimates of the admixture proportion of UIG, as shown in Figures S5 and S6; each cluster, depicted by one color, corresponds to an ethnic group). The results showed that individuals from the same population often shared membership coefficients in the inferred cluster, with UIG individuals displaying strong admixture of both European and Asian clusters. The admixture proportions of the populations are shown in Table 4. The UIG population has an average of 56.2% of admixture from European ancestry and 41.7% of admixture from East Asian ancestry, and the other populations were each dominated by a single East Asian, European, or African cluster. Notably, the distribution of admixture proportions among UIG individuals is relatively even, with European ancestry ranging from 48.7% to 62.2%. The standard deviation is only 3.8%, which is much smaller than the estimate for the African-American (AfA) population,58 suggesting a much longer history of admixture events for the Uyghur population compared with the AfA population.
Figure 4 Phylogenetic Analysis and Structure Analysis of UIG and Four HapMap Population Samples
Post by Admin on Dec 19, 2018 18:00:13 GMT
Admixture Proportion and Time of Admixture
The STRUCTURE results from random markers showed that UIG was an admixed population with contributions from both European and East Asian ancestries. We thus selected AIMs according to allele frequency of CHB and CEU for further estimation of the admixture proportion of UIG. Table 5 shows the admixture proportions estimated from 83 AIMs. The UIG population has 60% of admixture from European ancestry and 40% of admixture from East Asian ancestry. Individual admixture proportion was estimated for each UIG individual, and Figure 5 shows the distribution of admixture proportions of UIG individuals. The proportion of East Asian ancestry in UIG individuals ranges from 15.7% to 59.7%, and the proportion of European ancestry in UIG individuals ranges from 40.3% to 84.3%. We ran a linkage model for 83 AIMs and obtained the estimation of recombination parameter r (breakpoints per cM). The posterior distribution of r is shown in Figure 6. On average, there were 1.26 breakpoints per cM, with a 90% confidence interval of [1.07, 1.46]. Under the assumption of a hybrid-isolation (HI) model, the admixture event of UIG was estimated to have taken place about 126 [107∼146] generations or 2520 [2140∼2920] years ago, assuming 20 years per generation.
Figure 5 Summary Plot of Individual Admixture Proportions
The results of individual admixture proportions estimated from 83 AIMs. Each individual is represented by a single vertical line broken into two colored segments, with lengths proportional to each of the two inferred clusters. Red indicates East Asian ancestry proportion, and blue indicates European ancestry proportion. The predefined population IDs (CHB, UIG, and CEU) are presented on the abscissa. The ordinate indicates the proportion unit.
Figure 6 Posterior Distribution of the Recombination Parameter r, per Centimorgan
Table 5 Population Admixture-Proportion Percentages Estimated from 83 AIMs
Population   Cluster1      Cluster2
CHB          3.1 ± 0.1     96.9 ± 0.1
UIG          60.0 ± 0.1    40.0 ± 0.1
CEU          97.1 ± 0.1    2.9 ± 0.1
Cluster1 corresponds to European ancestry.
Cluster2 corresponds to East Asian ancestry.
Note: estimates from ten repeat runs differ very little.
Inferred Ancestral Origins of Chromosomal Segments in UIG
Using selected AIMs, we further inferred the ancestral origins of chromosomal segments in 40 UIG individuals. We selected a panel of 83 AIMs encompassing an overall region of 67.37 cM on chromosome 21 for estimation of the ancestry of alleles. The STRUCTURE program49 was run under the linkage model with the option of correlated allele frequency. The estimated haplotypes from the 40 UIG individuals were examined together with the phased data from the 60 CEU and 45 CHB subjects under a two-population model (K = 2).
Figure 7 Inferred Ancestral Origins of a 67.37 cM Segment of Chromosome 21 in Three Populations
With the assumption that East Asian and European populations were the only two parental populations, STRUCTURE provided the probability of an allele being derived from either the East Asian cluster or the European cluster. The natural logarithms of the probability ratio (LnPR) that an allele was derived from the East Asian cluster over the European cluster were estimated, and the results are depicted in Figure 7. The results provide information on the ancestry of the chromosome segments for each individual (see Supplemental Data for details). As expected, the UIG haplotypes showed contributions from both parental populations (Figure 7). The contribution from European ancestry was greater than that from East Asian ancestry in UIG. The mean contributions of ancestry were 60% (minimum 40.3% and maximum 84.3%) from European ancestry and 40% (minimum 15.7% and maximum 59.7%) from East Asian ancestry. For some segments the ancestry was uncertain (shown in gray in Figure 7), because it is difficult to precisely define the length of the segments in UIG derived from each population sample. Notably, most ambiguous segments were located in regions with few or even no AIMs (AIM “deserts”).
The cumulative frequencies of segment sizes that were derived from East Asian ancestry and from European ancestry are shown in Figure S7. The first quartile of segment size with East Asian ancestry was 0.55 cM, the second quartile was 1.68 cM, and the third quartile was 3.24 cM. For chromosomal segments with European ancestry, the first quartile of segment size was 0.83 cM, the second quartile was 3.14 cM, and the third quartile was 5.09 cM. The average sizes of chromosomal segments that were derived from East Asian ancestry and European ancestry were 2.43 cM and 4.07 cM, respectively.
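The conversion from the recombination parameter r to an admixture date reported in this post can be reproduced with a few lines. This is a sketch of the arithmetic only, not of the STRUCTURE linkage model: it assumes that, under the hybrid-isolation model, the number of breakpoints per Morgan (100 cM) approximates the number of generations since a single admixture event, which is the reading consistent with the figures quoted above. The function name is illustrative.

```python
def admixture_time(breakpoints_per_cM, years_per_generation=20):
    """Convert breakpoints per cM to (generations, years) since admixture.

    Assumes a single pulse of admixture (hybrid-isolation model), so that
    breakpoints per Morgan (100 cM) approximate generations elapsed.
    """
    generations = breakpoints_per_cM * 100
    years = generations * years_per_generation
    return generations, years


# Point estimate and the two ends of the reported interval for r
for r in (1.26, 1.07, 1.46):
    generations, years = admixture_time(r)
    print(r, round(generations), round(years))
```

The point estimate r = 1.26 gives about 126 generations, or 2520 years at 20 years per generation, and the interval ends 1.07 and 1.46 give roughly 2140 and 2920 years.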
Post by Admin on Dec 20, 2018 18:05:34 GMT
Figure 8 Comparison of the Proportion of Marker Pairs of Different r2 Levels in UIG and its Parental Populations
(A) LD was calculated from markers with MAF ≥ 0.05 in each population. (B) LD was calculated from markers with MAF ≥ 0.15 in each population.
Overall LD in Uyghur and its Parental Populations
The extent of LD was examined across chromosome 21 in UIG, Han Chinese (CHB), and European (CEU) samples for markers with minor-allele frequency (MAF) ≥ 0.05 (Figure 8a; Tables S2 and S3) and for markers with MAF ≥ 0.15 (Figure 8b; Tables S4 and S5). Proportions of marker pairs with LD at different levels of r2 (<0.1, ≥0.1, ≥0.2, ≥1/3, ≥0.5, ≥0.8) were plotted against between-marker distance (denoted as BMD hereafter). Interestingly, the admixed population UIG did not show stronger LD than did CHB and CEU. In fact, for both groups of SNPs, UIG showed weaker LD at each level of r2 ≥ 0.2. For example, when r2 ≥ 0.8 for common alleles and BMD ≤ 300 kb, the proportion of marker pairs in UIG was only 68% of that in CHB and 75% of that in CEU. Furthermore, the extent of LD in marker pairs of UIG is very similar to that of CHB and CEU; i.e., LD levels of r2 ≥ 0.2 extend no more than 300 kb in all three populations, and strong LD levels (r2 ≥ 0.8) extend less than 100 kb.
Magnitude and Extension of LD in UIG with AIMs
Previous studies reported that extended LD in admixed populations such as AfA was concealed by unselected markers and that increased LD in AfA was correlated with increasing allele-frequency differences between the markers of Europeans and Africans.59,60 We showed in a recent study that LD in an admixed population correlates with the allele-frequency difference between parental populations,58 which can be measured by FST. We selected 602 SNPs with large FST (mean FST = 0.48, mean δ = 0.52) between CHB and CEU as AIMs and compared the magnitude and extension of LD in all three populations by using these AIMs.
We calculated r2 for each of the marker pairs (a total of 180,901 pairs) by using the haplotypes inferred by the PHASE program. To investigate the extension of LD, we compared the distributions of r2 in the 180,901 marker pairs in the three populations. The LD in UIG extends a little further than it does in CHB and CEU, especially when 0.2 > r2 ≥ 0.1 (Figure 9; Tables S6 and S7). For example, in UIG, LD extends to 2,000 kb at a level of r2 ≥ 0.1 (corresponding to Kruglyak's useful LD54). In contrast, LD at a level of r2 ≥ 0.1 extends to no more than 300 kb in both CHB and CEU samples. However, the proportion of marker pairs with higher LD, as high as r2 ≥ 0.8, in UIG is even smaller than that of CEU (Figure 9 and Table S7). In fact, at a level of r2 ≥ 0.8 and within 200 kb, the proportion of marker pairs in UIG only slightly exceeds that of CHB (1.12-fold) and is even smaller than that of CEU (0.69-fold). At the other r2 levels, the proportion of marker pairs in UIG is, on average, larger than that of CHB and CEU. For example, at a level of r2 ≥ 0.5, the proportion of marker pairs in UIG is 2.18-fold that in CHB and 1.04-fold that in CEU; at a level of r2 ≥ 1/3 (Ardlie's useful LD56), the proportion of marker pairs in UIG is 2.63-fold that in CHB and 2.09-fold that in CEU; at a level of r2 ≥ 0.2, the proportion of marker pairs in UIG is 1.33-fold that in CHB and 3.26-fold that in CEU; and at a level of r2 ≥ 0.1, the proportion of marker pairs in UIG is 2.65-fold that in CHB and 5.38-fold that in CEU. Therefore, when AIMs were used, elevated LD in UIG was mostly observed in the range of 0.1 ≤ r2 < 0.8 and at BMD < 2,000 kb.
Figure 9 Comparison of the Proportion of Marker Pairs of Different r2 Levels in UIG and its Parental Populations for 602 AIMs
At the population level, STRUCTURE analysis showed that UIG was a typical two-way admixed population with ancestries contributed from both East Asian and European origins.
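The r2 statistic used throughout the LD comparison above can be computed directly from phased haplotypes. The following is a minimal sketch under stated assumptions, not the authors' pipeline: haplotypes are given as sequences of 0/1 alleles (as a phasing program would produce), r2 is the standard D² / (pA(1 − pA)pB(1 − pB)), and the function names are illustrative. It also confirms that 602 markers yield the 180,901 pairs mentioned in the text.

```python
import math


def r_squared(haplotypes, i, j):
    """Pairwise LD (r^2) between markers i and j.

    haplotypes: list of sequences of 0/1 alleles, one per phased chromosome.
    Returns None if either marker is monomorphic in the sample.
    """
    n = len(haplotypes)
    p_a = sum(h[i] for h in haplotypes) / n
    p_b = sum(h[j] for h in haplotypes) / n
    p_ab = sum(1 for h in haplotypes if h[i] == 1 and h[j] == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None  # r^2 is undefined without variation at both markers
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium D
    return d * d / denom


def n_marker_pairs(n_markers):
    """Number of unordered marker pairs."""
    return math.comb(n_markers, 2)


print(n_marker_pairs(602))  # number of pairs among the 602 AIMs
```

Counting the fraction of pairs whose r2 exceeds a threshold (0.1, 0.2, 1/3, 0.5, 0.8) within a given between-marker distance gives proportions of the kind plotted in Figures 8 and 9.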
In this study, we included only population samples from three continental groups; i.e., African, European, and East Asian population samples. We did not test the other possible sources of UIG's ancestral origin, such as South-Asian and Southeast Asian ancestries. However, because STRUCTURE posterior probabilities strongly supported two-way admixture, it is unlikely that there was a third parental population of UIG with genetic components that are substantially different from those of European or East Asian ancestry. The admixture proportion was estimated as 60% from European ancestry and 40% from East Asian ancestry; thus, European ancestry contributed slightly more to Uyghur genomes than did East Asian ancestry. This result is consistent with pairwise FST between populations estimated from entire markers: average FST between UIG and CEU was 0.028, which is much smaller than FST between UIG and CHB (0.037). The Uyghur samples used in this study were collected in Hetian, which is located in Southern Xinjiang, where the Uyghur population was thought to be less affected by the recent migration of Han Chinese than are Uyghur populations in Northern Xinjiang. Therefore, our estimation in this study is expected to be different from that of some previous studies on UIG samples collected in northern Xinjiang,2,8 where more interaction occurred between Han Chinese and Uyghur; for example, the estimation of European admixture proportions in some previous studies on UIG samples collected in northern Xinjiang was 30%. In addition, previous studies investigated only very few loci or even just a single locus.1,2,7,8 However, this discrepancy in admixture estimation should not significantly alter the mapping strategy. At the individual level, the proportion of East Asian ancestry in UIG individuals ranges from 15.7% to 59.7%, and the proportion of European ancestry in UIG individuals ranges from 40.3% to 84.3%. 
The distribution of admixture proportions among UIG individuals is relatively even, and the variation is much smaller than the estimated variation in the AfA population.58 It is unlikely that such results were due to sampling of closely related individuals, because the IBD values within UIG samples were the lowest of all populations (CHB, JPT, CEU, YRI) (Table 3). Furthermore, the ancestry variation among individuals could even be overestimated, given that the result was based on the data of one single chromosome. This result suggests a much longer history of admixture events for the Uyghur population compared with the AfA population, because recombination over many generations has interwoven chromosome segments derived from both ancestries, and drift of ancestries among individuals has become very small.
At the chromosomal level, we inferred the ancestral origins of UIG chromosome segments: the average sizes of chromosome segments that were derived from East Asian and European populations were 2.4 cM and 4.1 cM, respectively. The estimated recombination rate was about 1.07–1.46 breakpoints per cM. Under the assumption of a hybrid-isolation (HI) model, the admixture event of UIG was estimated to have taken place about 107–146 generations, or 2140–2920 years, ago, assuming 20 years per generation.
The word “Uyghur” (alternatively Uygur, Uigur, and Uighur) originates from the Old Turkish word “Uyγur.” On the basis of its Old Turkish phonetics, the word “Uyγur” was rendered differently in Chinese during different periods of China's history. The most ancient translation of the word “Uyγur” in Chinese was “Yuanhe,” which appears in Weishu (History of the Wei Dynasty), which was compiled during the period of Northern Qi (550–577 AD). The ancestors of the Uyghur (Gaoche) can be traced to the Chidi and Dingling in the third century B.C. (See Sima Qian, ‘Shiji’ Vol. 110: Xiongnu). Therefore, the estimated admixture time could be concordant with the historical record.
However, this result could be underestimated due to the assumption of a hybrid-isolation (HI) model. In this model, we assumed that Uyghur was formed by a single event of admixture during a short period of time, which might not be true of the real history of the Uyghur. Considering the geographical location where the Uyghur settled, continuous gene flow from populations of European and Asian descent was very likely. Because the time estimation in this study was based on the information of recombination or linkage disequilibrium (LD), which decays with time of generations, LD could have been maintained to some extent, and recombination information could have been diluted if there had been continuous gene flow; thus, the time of admixture could be underestimated. In addition, the time of admixture could be underestimated because the distribution of the length of chromosome segments might be biased toward large segments due to the large spacing between markers and the uncertainty in the ancestry estimation of some alleles. Furthermore, switch errors are almost inevitable when haplotypes are inferred from genotype data of unrelated individuals. For the inferred haplotype data of 83 AIMs, we estimated the switch error rate as 1 per 22 SNPs in CEU, 1 per 25 SNPs in CHB, and 1 per 19 SNPs in UIG. In other words, on average there would be four potential phasing errors in CEU, three potential phasing errors in CHB, and four potential phasing errors in UIG. 
However, we think the switch errors have limited influence on the downstream analysis, i.e., the estimation of ancestral origin of chromosomal segments, because of the following reasons: (1) the recombination rate (breakpoints per cM) estimated from phased data is consistent with that estimated from unphased data; (2) considering the recombination rate, the frequency of breakpoints is much higher than that of switch errors—for example, the breakpoint rate in UIG is estimated at an average of 1.26 per 1 cM, or 85 breaks per 67.4 cM, whereas the switch error in UIG is only about 4 per 67.4 cM; (3) for many AIMs, we observed several UIG individuals with both alleles derived from the same ancestry; i.e., the phase information is not so important for those markers and individuals. Am J Hum Genet. 2008 Apr 11; 82(4): 883–894.
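The comparison of breakpoints and switch errors in the argument above rests on simple arithmetic, sketched here as a check. This only restates the numbers quoted in the text (83 AIMs, one switch error per 22, 25, and 19 SNPs, 1.26 breakpoints per cM over 67.4 cM); the function names are illustrative and no new data are introduced.

```python
def expected_switch_errors(n_markers, snps_per_error):
    """Expected number of phasing switch errors across a marker panel."""
    return n_markers / snps_per_error


def expected_breakpoints(rate_per_cM, length_cM):
    """Expected number of ancestry breakpoints across a region."""
    return rate_per_cM * length_cM


n_aims = 83
for population, snps_per_error in (("CEU", 22), ("CHB", 25), ("UIG", 19)):
    print(population, round(expected_switch_errors(n_aims, snps_per_error)))

print(round(expected_breakpoints(1.26, 67.4)))  # breakpoints in UIG over the region
```

The rounded values match the text: about four potential switch errors in CEU and UIG and three in CHB, against roughly 85 ancestry breakpoints in UIG over the same region.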
Post by Admin on Feb 21, 2019 1:34:38 GMT
The Tarim Basin in the Xinjiang region of China is situated on the Silk Road, the collection of ancient trade routes that for several millennia linked China to the Mediterranean (Fig. 1). The present-day inhabitants of the Tarim Basin are highly diverse both culturally and biologically as a result of extensive movements of peoples and cultural exchanges between east and west Eurasia [1, 2, 3]. Archaeological and anthropological investigations have helped to formulate two main theories to account for the origin of the populations in the Tarim Basin [4, 5, 6, 7, 8, 9, 10, 11, 12]. The first, so-called “steppe hypothesis”, maintains that the Tarim region experienced at least two population influxes from the Russo-Kazakh steppe. The earliest settlers may have been nomadic herders of the Afanasievo culture (ca. 3300–2000 B.C.), a primarily pastoralist culture derived from the Yamna culture of the Pontic-Caspian region and distributed in the Eastern Kazakhstan, Altai, and Minusinsk regions of the steppe north of the Tarim Basin (Fig. 1) [9, 12, 13, 14, 15]. This view is based on the numerous similarities between the material culture, burial rituals and skeletal traits of the Afanasievo culture and the earliest Bronze Age sites in the Tarim Basin, such as Gumugou (ca. 3800 BP), one of the oldest sites with human burials in Xinjiang [8, 9, 11, 12, 16]. These first settlers were followed by people of the Late Bronze Age Andronovo cultural complex (ca. 2100–900 B.C.), another pastoralist culture derived from the Yamna culture, primarily distributed in the Pamirs, the Ferghana Valley, Kazakhstan, and the Minusinsk/Altai region (Fig. 1) [8, 9, 11, 12, 15, 16]. This is signaled by the introduction of new material culture, clothing styles and burial customs around 1200 B.C. 
The second model, known as the “Bactrian oasis hypothesis”, also postulates a two-step settlement of the Tarim Basin in the Bronze Age, but maintains that the first settlers were farmers of the Bactria–Margiana Archaeological Complex (or BMAC, also known as the Oxus civilization) (ca. 2200–1500 B.C.) west of Xinjiang in Uzbekistan (north Bactria), Afghanistan (south Bactria), and Turkmenistan [17], followed later by the Andronovo people from the northwest (Fig. 1) [5, 7]. This model emphasises the environmental similarities between the Xinjiang and Central Asian desert basins, and suggests that certain features, including the irrigation systems, wheat remains, woolen textiles, bones of sheep and goats, and traces of the medicinal plant Ephedra found in Xinjiang could be evidence of links with the Oxus civilization [5, 7, 16]. These contrasting models can be tested using DNA recovered from archaeological bones. Previous genetic evidence on the origin of the earliest settlers was based on the analysis of mtDNA from burials at the Gumugou cemetery on the eastern edge of the Tarim Basin. In that study, researchers sequenced the first mtDNA hypervariable region (HVRI), but the results were inconclusive [18]. The discovery of another Bronze Age site of a similar age to Gumugou, with many well-preserved mummies, including individuals with European facial features, provided a unique opportunity to obtain genetic evidence about the first settlers of the Tarim Basin [19, 20, 21].
Fig. 1 Map of Eurasia showing the location of the Xiaohe cemetery, the Tarim Basin, the ancient Silk Road routes and the areas occupied by cultures associated with the settlement of the Tarim Basin. This figure is drawn according to the literature.
We describe here the analysis of mtDNA from human remains recovered from the Xiaohe tomb complex, an important Bronze Age site on the eastern edge of the Tarim Basin (40°20′11″N, 88°40′20.3″E) (Fig. 1).
Discovered originally in 1934 by the Swedish archaeologist Folke Bergman, it was subsequently lost, but rediscovered in 2000 by a team from the Xinjiang Archaeological Institute using global positioning equipment. The cemetery was excavated between 2002 and 2005, and consisted of five strata with radiocarbon dates ranging from 4000 to 3500 years before present (14C yBP) [19, 22]. The site has many notable features, including numerous large phallus and vulva posts made of poplar, striking wooden human figures and masks, well-preserved boat coffins, leather hides, wheat and millet grains, and many artifacts (Fig. 2). Importantly, it contains the oldest and best-preserved mummies so far discovered in the Tarim Basin, possibly those of the earliest people to settle the region. Genetic analysis of these mummies can provide data to elucidate the affinities of the earliest inhabitants, and help understand later patterns of human migration in the Eurasian continent.
Fig. 2 a Fourth layer of the Xiaohe cemetery showing a large number of large phallus and vulva posts; b A well-preserved boat coffin; c Female with European features; d Double-layered coffin excavated from the Xiaohe cemetery
The necropolis consisted of five layers of burials spanning half a millennium, offering the opportunity to determine the extent of interactions between the people of Xiaohe and other populations after the original settlement of the Tarim Basin. Did the people remain comparatively isolated or did they intermarry with newcomers? In an earlier study, we analysed DNA recovered from the deepest and oldest layer of burials of the Xiaohe site, the fifth layer, corresponding to the earliest inhabitants. Our results revealed that the first settlers carried both European and central Siberian maternal lineages.
These findings agreed with the archaeological evidence for a close connection to the Afanasievo culture of the steppe north of the Tarim Basin, in other words with the “steppe hypothesis” [23]. We describe here the analysis of the maternal lineages of individuals recovered from the remaining four burial layers, and discuss the results in the context of the contrasting views on the settlement and migration patterns of the Tarim Basin.