Post by Admin on Mar 28, 2020 22:51:16 GMT
The Genomic Formation of Human Populations in East Asia

Chuan-Chao Wang1,2,3,4,*, Hui-Yuan Yeh5,*, Alexander N Popov6,*, Hu-Qin Zhang7,*, Hirofumi Matsumura8, Kendra Sirak2,9, Olivia Cheronet10, Alexey Kovalev11, Nadin Rohland2, Alexander M. Kim2,12, Rebecca Bernardos2

The deep population history of East Asia remains poorly understood due to a lack of ancient DNA data and sparse sampling of present-day people. We report genome-wide data from 191 individuals from Mongolia, northern China, Taiwan, the Amur River Basin and Japan dating to 6000 BCE - 1000 CE, many from contexts never previously analyzed with ancient DNA. We also report 383 present-day individuals from 46 groups mostly from the Tibetan Plateau and southern China. We document how 6000-3600 BCE people of Mongolia and the Amur River Basin were from populations that expanded over Northeast Asia, likely dispersing the ancestors of Mongolic and Tungusic languages. In a time transect of 89 Mongolians, we reveal how Yamnaya steppe pastoralists spread from the west by 3300-2900 BCE in association with the Afanasievo culture, although we also document a boy buried in an Afanasievo barrow with ancestry entirely from local Mongolian hunter-gatherers, representing a unique case of someone of entirely non-Yamnaya ancestry interred in this way. The second spread of Yamnaya-derived ancestry came via groups that harbored about a third of their ancestry from European farmers, which nearly completely displaced unmixed Yamnaya-related lineages in Mongolia in the second millennium BCE, but did not replace Afanasievo lineages in western China where Afanasievo ancestry persisted, plausibly acting as the source of the early-splitting Tocharian branch of Indo-European languages.
Analyzing 20 Yellow River Basin farmers dating to ~3000 BCE, we document a population that was a plausible vector for the spread of Sino-Tibetan languages both to the Tibetan Plateau and to the central plain where they mixed with southern agriculturalists to form the ancestors of Han Chinese. We show that the individuals in a time transect of 52 ancient Taiwan individuals spanning at least 1400 BCE to 600 CE were consistent with being nearly direct descendants of Yangtze Valley first farmers who likely spread Austronesian, Tai-Kadai and Austroasiatic languages across Southeast and South Asia and mixed with the people they encountered, contributing to a four-fold reduction of genetic differentiation during the emergence of complex societies. We finally report data from Jomon hunter-gatherers from Japan who harbored one of the earliest splitting branches of East Eurasian variation, and show an affinity among Jomon, Amur River Basin, ancient Taiwan, and Austronesian-speakers, as expected if they all had ancestry contributions from a Late Pleistocene coastal route migration to East Asia.
Figure 1: Geographical locations of newly reported ancient individuals.

Main text

East Asia, one of the oldest centers of animal and plant domestication, today harbors more than a fifth of the world’s human population, with present-day groups speaking languages representing eleven major families: Sino-Tibetan, Tai-Kadai, Austronesian, Austroasiatic, Hmong-Mien, Indo-European, Altaic (Mongolic, Turkic, and Tungusic), Koreanic, Japonic, Yukaghiric, and Chukotko-Kamchatkan1. The past 10,000 years have been a period of profound economic and cultural change in East Asia, but our current understanding of the genetic diversity, major mixture events, and population movements and turnovers during the transition from foraging to agriculture remains poor due to minimal sampling of the diversity of present-day people on the Tibetan Plateau and southern China2. A particular limitation has been a deficiency in ancient DNA data, which has been a powerful tool for discerning the deep history of populations in Western and Central Eurasia3-8.
We genotyped 383 present-day individuals from 46 populations indigenous to China (n=337) and Nepal (n=46) using the Affymetrix Human Origins array (Table S1 and Supplementary Information section 1). We also report genome-wide data from 191 ancient East Asians, many from cultural contexts for which there is no published ancient DNA data. From Mongolia we report 89 individuals from 52 sites dating between ~6000 BCE and ~1000 CE. From China we report 20 individuals from the ~3000 BCE Neolithic site of Wuzhuangguoliang. From Japan we report 7 Jomon hunter-gatherers from 3500-1500 BCE. From the Russian Far East we report 23 individuals: 18 from the Neolithic Boisman-2 cemetery at ~5000 BCE, 1 from the Iron Age Yankovsky culture at ~1000 BCE, 3 from the Medieval Heishui Mohe and Bohai Mohe culture at ~1000 CE, and 1 historic period hunter-gatherer from Sakhalin Island. From archaeological sites in Eastern Taiwan (the Bilhun site at Hanben on the main island and the Gongguan site on Green Island) we report 52 individuals from the Late Neolithic through the Iron Age spanning at least 1400 BCE - 600 CE.
For all but the Chinese samples we enriched the ancient DNA for a targeted set of about 1.2 million single nucleotide polymorphisms (SNPs)4,9, while for the Wuzhuangguoliang samples from China we used exome capture (18 individuals) or shotgun sequencing (2 individuals) (Figure 1, Supplementary Data files 1 and 2 and Supplementary Information section 1). We performed quality control to test for contamination by other human sequences, assessed by the rate of cytosine to thymine substitution in the terminal nucleotide and polymorphism in mitochondrial DNA as well as X chromosome sequences in males, and restricted analysis to (Online Table 1). We detected close kinship between individuals at the same site, including a Boisman nuclear family with 2 parents and 4 children (Table S2). We merged the new data with previously reported data: 4 Jomon individuals, 8 Amur River Basin Neolithic individuals from the Devil’s Gate site, 72 individuals from the Neolithic to the Iron Age in Southeast Asia, and 8 from Nepal7,12-20. We assembled 123 radiocarbon dates using bone from the individuals, of which 94 are newly reported (Online Table 3), and clustered individuals based on time period and cultural associations, then further by genetic cluster which in the Mongolian samples we designated by number (our group names thus have the format “<Country>_<Time Period>_<Genetic Cluster>_<Cultural Association If Any>”) (Supplementary Note, Table S1 and Online Table 1). We merged the data with previously reported data (Online Table 4).
Post by Admin on Mar 29, 2020 5:22:28 GMT
We carried out Principal Component Analysis (PCA) using smartpca21, projecting the ancient samples onto axes computed using present-day people. The analysis shows that population structure in East Asia is correlated with geographic and linguistic categories, albeit with important exceptions. Groups in Northwest China, Nepal, and Siberia deviate towards West Eurasians in the PCA (Supplementary Information section 2, Figure 2), reflecting multiple episodes of West Eurasian-related admixture that we estimate occurred 5 to 70 generations ago based on the decay of linkage disequilibrium22 (Table S3 and Table S4). East Asians with minimal proportions of West Eurasian-related ancestry fall along a gradient with three clusters at their poles. The “Amur Basin Cluster” correlates geographically with ancient and present-day populations living in the Amur River Basin, and linguistically with present-day indigenous people speaking Tungusic languages and the Nivkh. The “Tibetan Plateau Cluster” is most strongly represented in ancient Chokhopani, Mebrak, and Samdzong individuals from Nepal15 and in present-day people speaking Tibeto-Burman languages and living on the Tibetan Plateau. The “Southeast Asian Cluster” is maximized in ancient Taiwan groups and present-day people in Southeast Asia and southern parts of China speaking Austroasiatic, Tai-Kadai and Austronesian languages (Figure S1, Figure S2). Han are intermediate among these clusters, with northern Han projecting close to the Neolithic Wuzhuangguoliang individuals from northern China (Figure 2). We observe two genetic clusters within Mongolia: one falls closer to ancient individuals from the Amur Basin Cluster (‘East’ based on their geography), and the second clusters toward ancient individuals of the Afanasievo culture (‘West’), while a few individuals take intermediate positions between the two (Supplementary Information section 2).

Figure 2: Principal Component Analysis (PCA). (A) Projection of ancient samples onto PCA dimensions 1 and 2 defined by East Asians, Europeans, Siberians and Native Americans. (B) Projection onto groups with little West Eurasian mixture.

The three most ancient individuals of the Mongolia ‘East’ cluster are from the Kherlen River region of eastern Mongolia (Tamsag-Bulag culture) and date to 6000-4300 BCE (this places them in the Early Neolithic period, which in Northeast Asia is defined by the use of pottery and not by agriculture23). These individuals are genetically similar to previously reported Neolithic individuals from the cis-Baikal region and have minimal evidence of West Eurasian-related admixture as shown in PCA (Figure 2), f4-statistics and qpAdm (Table S5, Online Table 5, labeled as Mongolia_East_N). The other seven Neolithic hunter-gatherers from northern Mongolia (labeled as Mongolia_North_N) can be modeled as having 5.4% ±1.1% ancestry from a source related to previously reported West Siberian hunter-gatherers (WSHG)8 (Online Table 5), consistent with the PCA where they are part of an east-west Neolithic admixture cline in Eurasia with increasing proximity to West Eurasians in groups further west. Because of this ancestry complexity, we use the Mongolia_East_N individuals without significant evidence of West Eurasian-related admixture as reference points for modeling the East Asian-related ancestry in later groups (Online Table 5).
The two oldest individuals from the Mongolia ‘West’ cluster have very different ancestry: they are from the Shatar Chuluu kurgan site associated with the Afanasievo culture, with one directly dated to 3316-2918 calBCE (we quote a 95% confidence interval here and in what follows whenever we mention a direct date), and are indistinguishable in ancestry from previously published ancient Afanasievo individuals from the Altai region of present-day Russia, who in turn are similar to previously reported Yamnaya culture individuals, supporting findings that eastward Yamnaya migration had a major impact on people of the Afanasievo culture5,8. All the later Mongolian individuals in our time transect were mixtures of Mongolian Neolithic groups and more western steppe-related sources, as reflected by statistics of the form f3 (X, Y; Later Mongolian Groups), which resulted in significantly negative Z scores (Z<−3) when Mongolia_East_N was used as X, and when Yamnaya-related Steppe populations, AfontovaGora3, WSHG, or European Middle/Late Neolithic or Bronze Age populations were used as Y (Table S6).
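To illustrate why a negative f3 (X, Y; Target) points to admixture, here is a minimal Python sketch. It assumes the simplest textbook form of the statistic, the average over SNPs of (t − x)(t − y) computed on population allele frequencies, and leaves out the finite-sample correction and the block jackknife that would be needed to obtain Z scores like those reported in the text. The function name and the toy frequencies are invented for illustration and are not taken from the paper.

```python
def f3(target, x, y):
    """Naive f3(X, Y; Target): mean over SNPs of (t - x) * (t - y).

    Inputs are equal-length lists of allele frequencies, one value per SNP.
    No sample-size correction is applied (an assumption of this sketch).
    """
    if not target or not (len(target) == len(x) == len(y)):
        raise ValueError("need non-empty frequency lists of equal length")
    total = sum((t - a) * (t - b) for t, a, b in zip(target, x, y))
    return total / len(target)


# Toy allele frequencies for two hypothetical source populations.
source_x = [0.1, 0.8, 0.3, 0.6]
source_y = [0.9, 0.2, 0.7, 0.4]

# A target built as a 50/50 mixture of the two sources.
mixed = [0.5 * a + 0.5 * b for a, b in zip(source_x, source_y)]

f3_mixed = f3(mixed, source_x, source_y)
print(f3_mixed)  # negative, because the target lies between the two sources
```

At every SNP the mixed target sits between the two sources, so each product is negative. A population that has only drifted away from X and Y without receiving ancestry from both tends to give a positive value instead.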
Post by Admin on Mar 29, 2020 21:05:59 GMT
To quantify the admixture history of the later Mongolians, we again used qpAdm. A large number of groups could be modeled as simple two-way admixtures of Mongolia_East_N as one source (in proportions of 65-100%) and WSHG as the other source (in proportions of 0-35%), with negligible contribution from Yamnaya-related sources as confirmed by including Russia_Afanasievo and Russia_Sintashta groups in the outgroup set (Figure 3). The groups that fit this model were not only the two Neolithic groups (0-5% WSHG), but also the Early Bronze Age people from the Afanasievo Kurgak govi site (15%), the Ulgii group (28%), the main grouping of individuals from the Middle Bronze Age Munkhkhairkhan culture (33%), Late Bronze Age burials of the Ulaanzuukh type (6%), a combined group from the Center-West region (27%), the Mongun Taiga type from Khukh tolgoi (35%), and people of the Iron Age Slab Grave culture (9%). A striking finding in light of previous archaeological and genetic data is that the male child from Kurgak govi (individual I13957, skeletal code AT_629) has no evidence of Yamnaya-related ancestry despite his association with Afanasievo material culture (for example, he was buried in a barrow in the form of a circular platform edged by vertical stone slabs, in a stretched position on his back at the bottom of a deep rectangular pit and with a typical Afanasievo egg-shaped vessel (Supplementary Note); his late Afanasievo chronology is confirmed by a direct radiocarbon date of 2858-2505 BCE24). This is the first known case of an individual buried with Afanasievo cultural traditions who is not overwhelmingly Yamnaya-related, and he also shows genetic continuity with an individual buried at the same site Kurgak govi 2 in a square barrow (individual I6361, skeletal code AT_635, direct radiocarbon date 2618-2487 BCE).
We label this second individual as having an Ulgii cultural association, although a different archaeological assessment associates this individual with the Afanasievo or Chemurchek cultures25, so it is possible that this provides a second example of Afanasievo material culture being adopted by individuals without any Yamnaya ancestry. The legacy of the Yamnaya era spread into Mongolia continued in two individuals from the Chemurchek culture whose ancestry can only be modeled by using Afanasievo as one of the sources (49.0%±2.6%, Online Table 5). This model fits even when ancient European farmers are included in the outgroups, showing that if the long-distance transfer of West European megalithic cultural traditions to people of the Chemurchek culture that has been suggested in the archaeological literature occurred26, it must have been through spread of ideas rather than through movement of people.
Figure 3: qpAdm modeling of ancestry change over time in Mongolia. We use Mongolia_East_N, Afanasievo, WSHG, and Sintashta_MLBA as sources, and for each combined archaeological and genetic grouping identify maximally parsimonious models (fewest numbers of sources) that fit with P>0.05 (Online Table 5). We plot results for groupings that give a unique parsimonious model, and include at least one individual with data that “PASS” at high quality and with a confident chronological assignment (Online Table 1). The bars show proportions of each ancestry source, and we also include time spans for the individuals in the cluster. Groupings that include more eastern individuals (longitude >102.7 degrees) are indicated in green and typically have very little Yamnaya-related admixture even at late dates.

Beginning in the Middle Bronze Age in Mongolia, there is no compelling evidence for a persistence of the Yamnaya-derived lineages originally spread into the region with Afanasievo. Instead, in the Late Bronze Age, the Iron Age and afterward we have data from multiple Mongolian groups whose Yamnaya-related ancestry can only be modeled as deriving not from the initial Afanasievo migration but instead from a later eastward spread into Mongolia related to people of the Middle to Late Bronze Age Sintashta and Andronovo horizons who were themselves a mixture of ~2/3 Yamnaya-related and 1/3 European farmer-related ancestry5,7,8. The Sintashta-related ancestry is detected in proportions of 5% to 57% in individuals from the Mongolia_LBA_6_Khovsgol (a culturally mixed group from the literature14), Mongolia_LBA_3_MongunTaiga, Mongolia_LBA_5_CenterWest, Mongolia_EIA_4_Sagly, Mongolia_EIA_6_Pazyryk, and Mongolia_Mongol groups, with the most substantial proportions of Sintashta-related ancestry always coming from western Mongolia (Figure 3, Online Table 5).
For all these groups, the qpAdm ancestry models pass when Afanasievo is included in the outgroups, while models that treat Afanasievo as the source and place Sintashta among the more distantly related outgroups are all rejected (Figure 3, Online Table 5). Starting from the Early Iron Age, we finally detect evidence of gene flow in Mongolia from groups related to Han Chinese. Specifically, when Han are included in the outgroups, our models of mixtures in different proportions of Mongolia_East_N, Russia_Afanasievo, Russia_Sintashta, and WSHG continue to work for all Bronze Age and Neolithic groups, but fail for an Early Iron Age individual from Tsengel sum (Mongolia_EIA_5), and for Xiongnu and Mongols. When we include Han Chinese as a possible source, we estimate ancestry proportions of 20-40% in Xiongnu and Mongols (Online Table 5).
While the Afanasievo-derived lineages are consistent with having largely disappeared in Mongolia by the Late Bronze Age when our data showed that later groups with Steppe pastoralist ancestry made an impact, we confirm and strengthen previous ancient DNA analysis suggesting that the legacy of this expansion persisted in western China into the Iron Age Shirenzigou culture (410-190 BCE)27. The only parsimonious model for this group that fits according to our criteria is a 3-way mixture of groups related to Mongolia_East_N, Russia_Afanasievo, and WSHG. The only other remotely plausible model (although not formally a good fit) also requires Russia_Afanasievo as a source (Figure 3, Online Table 5). The findings of the original study that reported evidence that the Afanasievo spread was the source of Steppe ancestry in the Iron Age Shirenzigou have been questioned with the proposal of alternative models that use ancient Kazakh Steppe Herders from the site of Botai, Wusun, Saka and ancient Tibetans from the site of Mebrak15 in present-day Nepal as major sources for Steppe and East Asian-related ancestry28. However, when we fit these models with Russia_Afanasievo and Mongolia_East_N added to the outgroups, the proposed models are rejected (P-values between 10^-7 and 10^-2), except in a model involving a single low coverage Saka individual from Kazakhstan as a source (P=0.17, likely reflecting the limited power to reject models with this low coverage). Repeating the modeling using other ancient Nepalese with very similar genetic ancestry to that in Mebrak results in uniformly poor fits (Online Table 5).
Thus, ancestry typical of the Afanasievo culture and Mongolian Neolithic contributed to the Shirenzigou individuals, supporting the theory that the Tocharian languages of the Tarim Basin (from the second-oldest-known branch of the Indo-European language family) spread eastward through the migration of Yamnaya steppe pastoralists to the Altai Mountains and Mongolia in the guise of the Afanasievo culture, from where they spread further to Xinjiang5,7,8,27,29,30. These results are significant for theories of Indo-European language diversification, as they increase the evidence in favor of the hypothesis that the split of the second-oldest branch in the Indo-European language tree occurred at the end of the fourth millennium BCE27,29,3
Post by Admin on Mar 30, 2020 4:50:58 GMT
The individuals from the ~5000 BCE Neolithic Boisman culture and the ~1000 BCE Iron Age Yankovsky culture together with the previously published ~6000 BCE data from Devil’s Gate cave19 are genetically very similar, documenting a continuous presence of this ancestry profile in the Amur River Basin stretching back at least eight thousand years (Figure 2 and Figure S2). The genetic continuity is also evident in the prevailing Y chromosomal haplogroup C2b-F1396 and mitochondrial haplogroups D4 and C5 of the Boisman individuals, which are predominant lineages in present-day Tungusic, Mongolic, and some Turkic-speakers. The Neolithic Boisman individuals shared an affinity with Jomon as suggested by their intermediate positions between Mongolia_East_N and Jomon in the PCA and confirmed by the significantly positive statistic f4 (Mongolia_East_N, Boisman; Mbuti, Jomon). Statistics such as f4 (Native American, Mbuti; Test East Asian, Boisman/Mongolia_East_N) show that Native Americans share more alleles with Boisman and Mongolia_East_N than they do with the great majority of other East Asians in our dataset (Table S5). It is unlikely that these statistics are explained by back-flow from Native Americans since Boisman and other East Asians share alleles at an equal rate with the ~24,000-year-old Ancient North Eurasian MA1 who was from a population that contributed about 1/3 of all Native American ancestry31. A plausible explanation for this observation is that the Boisman/Mongolia Neolithic ancestry was linked (deeply) to the source of the East Asian-related ancestry in Native Americans31.
We can also model published data from Neolithic and Early Bronze Age individuals around Lake Baikal7 as sharing substantial ancestry (77-94%) with the lineage represented by Mongolia_East_N, revealing that this type of ancestry was once spread over a wide region spanning across Lake Baikal, eastern Mongolia, and the Amur River Basin (Table S7). Some present-day populations around the Amur River Basin harbor large fractions of ancestry consistent with deriving from more southern East Asian populations related to Han Chinese (but not necessarily Han themselves) in proportions of 13-50%. We can show that this admixture occurred at least by the Early Medieval period because one Heishui_Mohe individual (I3358, directly dated to 1050-1220 CE) is estimated to have harbored more than 50% ancestry from Han or related groups (Table S8).
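The f4 statistics used throughout this section can be sketched in a few lines of Python. This assumes the basic definition, the mean over SNPs of (a − b)(c − d) on allele frequencies, and is not the ADMIXTOOLS implementation: a real analysis also needs a block jackknife to decide whether a value differs significantly from zero. The function name and the frequencies are made up for illustration.

```python
def f4(a, b, c, d):
    """Naive f4(A, B; C, D): mean over SNPs of (a - b) * (c - d).

    Each argument is a list of allele frequencies, one value per SNP.
    """
    if not a or not (len(a) == len(b) == len(c) == len(d)):
        raise ValueError("need non-empty frequency lists of equal length")
    total = sum((pa - pb) * (pc - pd) for pa, pb, pc, pd in zip(a, b, c, d))
    return total / len(a)


# Toy allele frequencies for four hypothetical populations at three SNPs.
pop_a = [0.2, 0.5, 0.7]
pop_b = [0.4, 0.5, 0.3]
pop_c = [0.1, 0.9, 0.6]
pop_d = [0.3, 0.2, 0.2]

f4_value = f4(pop_a, pop_b, pop_c, pop_d)
print(f4_value)  # positive: differences A-B and C-D point the same way
```

A value near zero is what one expects when (A, B) and (C, D) form two unrelated pairs in the population tree. A positive value means the A−B and C−D frequency differences are correlated, that is, excess allele sharing between A and C or between B and D. Swapping A and B flips the sign.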
The Tibetan Plateau, with an average elevation of more than 4,000 meters, is one of the most extreme environments in which humans live. Archaeological evidence suggests two main phases for modern human peopling of the Tibetan Plateau. The first can be traced back to at least ~160,000 years ago, probably by Denisovans32, and the second to 40,000-30,000 years ago, as reflected in abundant blade tool assemblages. However, it is only in the last ~3,600 years that there is evidence for continuous permanent occupation of this region with the advent of agriculture34. We grouped 17 present-day populations from the highlands into three categories based on genetic clustering patterns (Figure S3): “Core Tibetans” who are closely related to the ancient Nepal individuals such as Chokhopani with a minimal amount of admixture with groups related to West Eurasians and lowland East Asians in the last dozens of generations, “northern Tibetans” who are admixed between lineages related to Core Tibetans and West Eurasians, and “Tibeto-Yi Corridor” populations (the eastern edge of the Tibetan Plateau connecting the highlands to the lowlands) that include not just Tibetan speakers but also Qiang and Lolo-Burmese speakers who we estimate using qpAdm4,35 have 30-70% Southeast Asian Cluster-related ancestry (Table S9). We computed f3 (Mbuti; Core Tibetan, non-Tibetan East Asian) to search for non-Tibetans that share the most genetic drift with Tibetans. Neolithic Wuzhuangguoliang, Han and Qiang appear at the top of the list (Table S10), suggesting that Tibetans harbor ancestry from a population closely related to Wuzhuangguoliang that also contributed more to Qiang and Han than to other present-day East Asian groups. We estimate that the mixture occurred 60-80 generations ago (2240-1680 years ago assuming 28 years per generation36) under a model of a single pulse of admixture (Table S11).
This represents an average date and so only provides a lower bound on when these two populations began to mix; the start of their period of admixture could plausibly be as old as the ~3,600-year-old date for the spread of agriculture onto the Tibetan Plateau. These findings are therefore consistent with archaeological evidence that expansions of farmers from the Upper and Middle Yellow River Basin influenced populations of the Tibetan Plateau from the Neolithic to the Bronze Age as they spread across the Central Plain of China37,38, and with Y chromosome evidence that the shared common haplogroup Oα-F5 between Han and Tibetans coalesced to a common ancestor less than 5,800 years ago39.
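The conversion from generations to years used for the Tibetan admixture date is simple multiplication. The short Python sketch below reproduces the 60-80 generation estimate at the stated 28 years per generation; the function name is ours, and the default generation length is just the value quoted in the text.

```python
def generations_to_years(gen_min, gen_max, years_per_generation=28):
    """Convert an admixture date range in generations to years before present."""
    if gen_min > gen_max:
        raise ValueError("gen_min must not exceed gen_max")
    return gen_min * years_per_generation, gen_max * years_per_generation


# 60-80 generations at 28 years per generation, as quoted in the text.
tibetan_years = generations_to_years(60, 80)
print(tibetan_years)  # (1680, 2240)
```

Because the input is an average admixture date under a single-pulse model, the resulting interval describes when mixing was ongoing on average, not when it began.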
Post by Admin on Mar 30, 2020 23:21:30 GMT
In the south, we find that the ancient Taiwan Hanben and Gongguan culture individuals dating from at least a span of 1400 BCE - 600 CE are genetically most similar to present-day Austronesian speakers and ancient Lapita individuals from Vanuatu as shown in outgroup f3-statistics and significantly positive f4-statistics (Taiwan_Hanben/Gongguan, Mbuti; Ami/Atayal/Lapita, other Asians) (Table S8). The similarity to Austronesian-speakers is also evident in the Iron Age dominant paternal Y chromosome lineage O3a2c2-N6 and maternal mtDNA lineages E1a, B4a1a, F3b1, and F4b, which are widespread lineages among Austronesian-speakers40,41. We compared the present-day Austronesian-speaking Ami and Atayal of Taiwan with diverse Asian populations using statistics like f4 (Taiwan Iron Age/Austronesian, Mbuti; Asian1, Asian2). Ancient Taiwan groups and Austronesian speakers share significantly more alleles with Tai-Kadai speakers in southern mainland China and in Hainan Island42 than they do with other East Asians (Table S8), consistent with the hypothesis that ancient populations related to present-day Tai-Kadai speakers are the source for the spread of agriculture to Taiwan island around 5000 years ago43. The Jomon share alleles at an elevated rate with ancient Taiwan individuals and Ami/Atayal as measured by statistics of the form f4 (Jomon, Mbuti; Ancient Taiwan/Austronesian-speaker, other Asians) compared with other East Asian groups, with the exception of groups in the Amur Basin Cluster (Table S8)44.
Figure 4: qpAdm modeling of Han Chinese cline. We used the ancient Wuzhuangguoliang as a proxy for Yellow River Farmers and Taiwan_Hanben as a proxy for Yangtze River Farmers related ancestry.

The Han Chinese are the world’s largest ethnic group. It has been hypothesized based on the archaeologically documented spread of material culture and farming technology, as well as the linguistic evidence of links among Sino-Tibetan languages, that one of the ancestral populations of the Han might have consisted of early farmers along the Upper and Middle Yellow River in northern China, some of whose descendants also may have spread to the Tibetan Plateau and contributed to present-day Tibeto-Burmans. Archaeological and historical evidence document how during the past two millennia, the Han expanded south into regions inhabited by previously established agriculturalists46. Analysis of genome-wide variation among present-day populations has revealed that the Han Chinese are characterized by a “North-South” cline47,48, which is confirmed by our analysis. The Neolithic Wuzhuangguoliang, present-day Tibetans, and Amur River Basin populations share significantly more alleles with Han Chinese compared with the Southeast Asian Cluster, while the Southeast Asian Cluster groups share significantly more alleles with the majority of Han Chinese groups when compared with the Neolithic Wuzhuangguoliang (Table S12, Table S13). These findings suggest that Han Chinese may be admixed in variable proportions between groups related to Neolithic Wuzhuangguoliang and people related to those of the Southeast Asian Cluster.
To determine the minimum number of source populations needed to explain the ancestry of the Han, we used qpWave4,49 to study the matrix of all possible statistics of the form f4 (Han1, Han2; O1, O2), where “O1” and “O2” are outgroups that are unlikely to have been affected by recent gene flow from Han Chinese. This analysis confirms that two source populations are consistent with all of the ancestry in most Han Chinese groups (with the exception of some West Eurasian-related admixture that affects some northern Han Chinese in proportions of 2-4% among the groups we sampled; Table S14 and Table S15). Specifically, we can model almost all present-day Han Chinese as mixtures of two ancestral populations, in a variety of proportions, with 77-93% related to Neolithic Wuzhuangguoliang from the Yellow River basin, and the remainder from a population related to ancient Taiwan that we hypothesize was closely related to the rice farmers of the Yangtze River Basin. This is also consistent with our inference that the Yangtze River farmer related ancestry contributed nearly all the ancestry of Austronesian speakers and Tai-Kadai speakers and about 2/3 of some Austroasiatic speakers17,20 (Figure 4). A caveat is that there is a modest level of modern contamination in the Wuzhuangguoliang we use as a source population for this analysis (Online Table 1), but this would not bias admixture estimates by more than the contamination estimate of 3-4%. The average dates of West Eurasian-related admixture in northern Han Chinese populations Han_NChina and Han_Shanxi are 32-45 generations ago, suggesting that mixture was continuing at the time of the Tang Dynasty (618-907 CE) and Song Dynasty (960-1279 CE), during which time there are historical records of integration of Han Chinese and western ethnic groups, but this date is an average so the mixture between groups could have begun earlier.
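To give a feel for what "modeling a population as a mixture of two sources in some proportion" means, here is a deliberately simplified Python sketch. It is not qpAdm: qpAdm works on f4 statistics against a set of outgroups and accounts for drift and sampling noise, whereas this toy assumes the target's allele frequencies are an exact linear blend of two known sources and recovers the blend weight by least squares. All names and numbers are invented for illustration.

```python
def mixture_proportion(target, source1, source2):
    """Least-squares weight alpha in target ~ alpha*source1 + (1-alpha)*source2.

    Inputs are equal-length lists of allele frequencies, one value per SNP.
    """
    if not target or not (len(target) == len(source1) == len(source2)):
        raise ValueError("need non-empty frequency lists of equal length")
    num = sum((t - s2) * (s1 - s2) for t, s1, s2 in zip(target, source1, source2))
    den = sum((s1 - s2) ** 2 for s1, s2 in zip(source1, source2))
    if den == 0:
        raise ValueError("sources are identical; proportion is undefined")
    return num / den


# Toy frequencies for two hypothetical source populations.
northern = [0.2, 0.6, 0.9, 0.4]
southern = [0.7, 0.1, 0.3, 0.8]

# A target constructed as 80% northern and 20% southern ancestry.
han_like = [0.8 * n + 0.2 * s for n, s in zip(northern, southern)]

alpha = mixture_proportion(han_like, northern, southern)
print(round(alpha, 6))  # 0.8
```

With real data the sources are only proxies for the true ancestral populations, which is why outgroup-based methods such as qpAdm are used instead of a direct fit like this one.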