Mal'ta Boy (ANE): Origins of Native Americans

Admin
Administrator

Posts: 75,896

Mal'ta Boy (ANE): Origins of Native Americans Feb 8, 2020 21:44:00 GMT

Post by Admin on Feb 8, 2020 21:44:00 GMT

The settlement of the Americas occurred at least 15,000 years ago through Beringia, a land bridge between Asia and America that existed during the ice ages [1,2,3,4,5]. Most analyses of Native American genetic diversity have examined single loci, particularly mitochondrial DNA or the Y chromosome, and some interpretations of these data model the settlement of America as a single migratory wave from Asia [6,7,8]. We assembled native population samples from Canada to the southern tip of South America, genotyped them on single nucleotide polymorphism (SNP) microarrays, and merged our data with six other data sets. The combined data set consists of 364,470 SNPs genotyped in 52 Native American populations (493 samples; Fig. 1a and Supplementary Table 1), 17 Siberian populations (245 samples; Supplementary Fig. 1 and Supplementary Table 2) and 57 other populations (1,613 samples) (Supplementary Notes).

Figure 1: Geographic, linguistic and genetic overview of 52 Native American populations.

A complication in studying Native American genetic history is admixture with European and African immigrants since 1492. Cluster analysis [16] shows that many of the samples we examined have some non-native admixture (an average of 8.5%; Fig. 1b and Supplementary Tables 1 and 3). This admixture is a challenge for learning about the historical relationships among the populations, and to address this complication we used three independent approaches. First, we restricted analyses to 163 Native Americans from 34 populations without evidence of admixture (Supplementary Notes). Second, we subtracted the expected contribution of European and African ancestry to the statistics we used to learn about population relationships (Supplementary Notes). Third, we inferred the probability of non-native ancestry at each genomic segment and ‘masked’ segments with more than a negligible probability of this ancestry (Fig. 1b, Supplementary Notes and Supplementary Fig. 2). Our inferences from these three approaches are concordant (Supplementary Figs 3 and 4).
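The third approach (masking) can be illustrated with a minimal sketch. It assumes that a per-segment probability of non-native ancestry has already been produced by some local-ancestry method; the paper's own procedure is in its Supplementary Notes and is not reproduced here. The names `Segment` and `mask_genotypes` and the 0.01 threshold are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    start: int            # index of the first SNP in the segment (inclusive)
    end: int              # index one past the last SNP (exclusive)
    p_non_native: float   # assumed output of a local-ancestry method


def mask_genotypes(genotypes: List[int], segments: List[Segment],
                   threshold: float = 0.01) -> List[Optional[int]]:
    """Return a copy of `genotypes` with SNPs in suspect segments set to None."""
    masked: List[Optional[int]] = list(genotypes)
    for seg in segments:
        # Hypothetical cutoff standing in for "more than a negligible probability".
        if seg.p_non_native > threshold:
            for i in range(seg.start, seg.end):
                masked[i] = None
    return masked


# Toy data: 8 SNPs coded as 0/1/2 copies of an allele, split into 3 segments.
genotypes = [0, 1, 2, 2, 1, 0, 0, 1]
segments = [Segment(0, 3, 0.0), Segment(3, 6, 0.4), Segment(6, 8, 0.005)]
masked = mask_genotypes(genotypes, segments)
```

Downstream statistics would then simply skip the `None` entries, so only segments with confidently native ancestry contribute.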

We built a tree (Fig. 1c) using Fst distances between pairs of populations, which broadly agrees with geography and linguistic categories [17] (trees based on masked and unmasked data were similar; Supplementary Fig. 3). An early split separates Asians from Native Americans and extreme northeastern Siberians (Chukchi, Naukan, Koryak), which is consistent with studies that have identified pan-American variants shared with some northeastern Siberians [6,7,10,18]. Eskimo–Aleut speakers and far-northeastern Siberians form a cluster that is separated from other Native American populations by a long internal branch. Within America the tree shows a series of splits in an approximate north–south sequence beginning with the Arctic, followed by northern North America, northern/central and southern Mexico and lower Central America/Colombia, and ending in three South American clusters (the Andes, the Chaco region and eastern South America). This pattern of splits is consistent with a north–south population expansion, an inference that is also supported by the negative correlation between heterozygosity and distance from the Bering Strait (r = −0.48, P = 0.007). This correlation becomes stronger if we use ‘least cost distances’ that consider the coasts as facilitators of migration [19,20,21], and persists if we exclude four Native North American populations with ancestry from later streams of Asian gene flow (Supplementary Notes and Supplementary Fig. 5).
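For readers who want to see the two quantities in this paragraph computed, here is a rough sketch on toy numbers. The paper does not state which Fst estimator was used; the Hudson ratio-of-averages estimator below is an assumption chosen for illustration, and all data values are invented.

```python
import math
from typing import Sequence


def hudson_fst(p1: Sequence[float], p2: Sequence[float], n1: int, n2: int) -> float:
    """Ratio-of-averages Hudson estimator of Fst over a set of SNPs.

    p1, p2: per-SNP sample allele frequencies in the two populations.
    n1, n2: number of sampled chromosomes in each population.
    """
    num = 0.0
    den = 0.0
    for a, b in zip(p1, p2):
        # Squared frequency difference, corrected for finite sample size.
        num += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
        den += a * (1 - b) + b * (1 - a)
    return num / den


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain Pearson correlation coefficient."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Toy allele frequencies at three SNPs (invented values).
pop_a = [0.1, 0.5, 0.9]
pop_b = [0.2, 0.5, 0.8]   # similar to pop_a
pop_c = [0.9, 0.5, 0.1]   # very different from pop_a
fst_ab = hudson_fst(pop_a, pop_b, 100, 100)
fst_ac = hudson_fst(pop_a, pop_c, 100, 100)

# Toy heterozygosity falling with distance from the Bering Strait.
distance_km = [1000, 4000, 8000, 12000]
heterozygosity = [0.30, 0.28, 0.25, 0.22]
r = pearson_r(distance_km, heterozygosity)
```

A matrix of such pairwise Fst values is the kind of input a distance-based tree-building method would take; the tree construction itself is not shown.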

Trees provide a simplified model of history that does not accommodate the possibility of gene flow after population separation. Circumstantial evidence that some Native American populations may not fit a simple tree comes from cluster analysis, which infers Siberian-related ancestry in some northern North Americans (Fig. 1b), and from single-locus studies that have identified genetic variants shared between Eurasia and North America that are absent from South America [11,22,23]. The advent of genome-wide data sets has allowed the development of a formal four-population test for whether sets of four populations are consistent with a tree. This test is robust to the ascertainment bias affecting SNP arrays [24]. For each of the 52 Native American populations in turn, we tested the hypothesis that they conform to the tree: ((test population, southern Native American), (outgroup1, outgroup2)) for 45 pairs of ten Asian outgroups. We used a Hotelling T²-test to evaluate whether all four-population test f4 statistics of this form are consistent with the expectation of zero (Supplementary Notes). The test is not significant for 47 populations, which is consistent with their stemming from the same, presumably first, wave of American settlement; we call this ancestry ‘First American’ (Table 1). In contrast, four populations from northern North America show highly significant evidence of ancestry from additional streams of gene flow from Asia, subsequent to the initial peopling of America, which we confirm through the Hotelling T²-test and a complementary test (Supplementary Notes): East Greenland Inuit (P

Examination of the values of the f4 statistics allows us to infer the minimum number of gene flow events from Asia into America consistent with the data. Each stream of gene flow is expected to produce a distinct vector of f4 statistics, constituting a ‘signature’ of how the ancestral migrating population relates to present-day Asian populations. By finding the minimum number of vectors whose linear combinations are necessary to produce the vector observed in each population, we infer that a minimum of three gene flow events from Asia are necessary to explain the data from all Native American populations jointly, including the Saqqaq Palaeo-Eskimo (Supplementary Notes). These three episodes correspond to First American ancestry (distributed throughout the Americas) and to two additional streams of gene flow detected in a subset of northern North Americans (East Greenland Inuit, West Greenland Inuit, Aleutian Islanders, Chipewyan and Saqqaq). Table 1 shows that f4 statistics in the Inuit and Aleutian islanders are consistent with deriving the non-First-American portions of their ancestry from the same later stream of Asian gene flow, providing support for deep shared ancestry between these linguistically linked groups [12,26]. The Na-Dene-speaking Chipewyan have a different pattern of f4 statistics from Eskimo–Aleut speakers, implying that they descend at least in part from a separate stream of Asian gene flow (P

To develop an explicit model for the settlement of the Americas, we used the admixture graph (AG) framework [24]. AGs are generalizations of trees that accommodate the possibility of a limited number of unidirectional gene flow events. They are powerful tools for learning about history because they make predictions about the values of f-statistics (such as f4) that can be used to test the fit of a proposed model [24] (Supplementary Notes). Figure 2 presents an AG relating selected Native American and Old World populations that is a good fit to the data in the sense that none of the f-statistics predicted by the model are more than three standard errors from what is observed. This supports the hypothesis of three deep lineages in Native Americans: the Asian lineage leading to First Americans is the most deeply diverged, whereas the Asian lineages leading to Eskimo–Aleut speakers and the Na-Dene-speaking Chipewyan are more closely related and descend from a putative Siberian ancestral population more closely related to Han (Fig. 2). We also arrive at the finding that Eskimo–Aleut populations and the Chipewyan derive large proportions of their genomes from First American ancestors: an estimated 57% for Eskimo–Aleut speakers, and 90% in the Chipewyan, probably reflecting major admixture events of the two later streams of Asian migration with the First Americans that they encountered after they arrived (Supplementary Notes). The high proportion of First American ancestry explains why Eskimo–Aleut and Chipewyan populations cluster with First Americans in trees like that in Fig. 1c despite having some of their ancestry from later streams of Asian migration, and explains the observation of some genetic variants that are shared by all Native Americans but are absent elsewhere [6,7,10,18].
We also infer back-migration of populations related to the Eskimo–Aleut from America into far-northeastern Siberia: we obtain an excellent fit to the data when we model the Naukan and coastal Chukchi as mixtures of groups related to the Greenland Inuit and Asians (Fig. 2 and Supplementary Notes). This explains previous findings of pan-American alleles also in far-northeastern Siberia [6,7,10,18].
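The four-population test discussed above can be sketched in a few lines. This is only an illustration under stated assumptions: allele frequencies are simulated with a crude Gaussian drift model (not clamped to [0, 1]), the standard error uses a delete-one block jackknife over equal-sized blocks of consecutive SNPs, and the function names are hypothetical. It is not the Hotelling T² procedure the paper applies across many outgroup pairs.

```python
import math
import random


def f4_with_z(a, b, c, d, n_blocks=50):
    """Return (f4, Z) for f4(A, B; C, D) from per-SNP allele frequencies."""
    prods = [(ai - bi) * (ci - di) for ai, bi, ci, di in zip(a, b, c, d)]
    size = len(prods) // n_blocks
    used = prods[: size * n_blocks]            # drop any leftover SNPs
    total = sum(used)
    estimate = total / len(used)
    # Delete-one-block estimates for the jackknife standard error.
    leave_one_out = []
    for k in range(n_blocks):
        block_sum = sum(used[k * size:(k + 1) * size])
        leave_one_out.append((total - block_sum) / (len(used) - size))
    mean_loo = sum(leave_one_out) / n_blocks
    var = (n_blocks - 1) / n_blocks * sum((x - mean_loo) ** 2 for x in leave_one_out)
    return estimate, estimate / math.sqrt(var)


def simulate(n_snps, admix_from_c, rng, drift=0.05):
    """Toy tree ((A, B), (C, D)); B optionally receives gene flow from C."""
    a, b, c, d = [], [], [], []
    for _ in range(n_snps):
        root = rng.uniform(0.3, 0.7)
        anc_ab = root + rng.gauss(0, drift)
        anc_cd = root + rng.gauss(0, drift)
        ai = anc_ab + rng.gauss(0, drift)
        bi = anc_ab + rng.gauss(0, drift)
        ci = anc_cd + rng.gauss(0, drift)
        di = anc_cd + rng.gauss(0, drift)
        bi = (1 - admix_from_c) * bi + admix_from_c * ci
        a.append(ai)
        b.append(bi)
        c.append(ci)
        d.append(di)
    return a, b, c, d


rng = random.Random(1)
f4_tree, z_tree = f4_with_z(*simulate(20000, 0.0, rng))
f4_admixed, z_admixed = f4_with_z(*simulate(20000, 0.3, rng))
```

Under the pure tree the expected f4 is zero, so `z_tree` should be small, whereas gene flow from C into B pushes f4(A, B; C, D) negative and `z_admixed` far from zero. Real analyses would define jackknife blocks by genomic position rather than by SNP count.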

Figure 2: Distinct streams of gene flow from Asia into America.

We next used AGs to develop a model for the history of populations who derive all their ancestry from the First American migration, with no ancestry from subsequent streams of Asian gene flow. Figure 3 presents an AG we built for 16 selected Native American populations and two outgroups, which is a good fit to the data in that the largest |Z|-score for a difference between the observed and predicted f-statistics is 3.2 from among the 11,781 statistics we tested (Supplementary Notes). The AG of Fig. 3 used masked data; however, a consistent set of relationships is inferred for unadmixed samples (Supplementary Fig. 4). This model provides a greatly improved statistical fit to the data compared with the tree of Fig. 1c and leads to several novel inferences. First, a relatively large fraction of South American populations fit the AG without a need for admixture events, which we speculate reflects a history of limited gene flow among these populations since their initial divergence. In contrast, only a small fraction of Meso-American populations fit into the AG, which could reflect either a higher rate of migration among neighbouring groups or our denser sampling in Meso-America allowing us to detect more subtle gene flow events. Second, some Meso-American populations have experienced very little genetic drift since divergence from the common ancestral population with South Americans (adding up the genetic drifts along the relevant edges of Fig. 3, we infer Fst = 0.014 between the Zapotec and a hypothetical population ancestral to all of Central and South America), suggesting that effective population sizes in Meso-America have been relatively large since settlement of the region. Third, the model infers three admixture events consistent with geographic locations and linguistic affiliations (Supplementary Notes).
The Inga have both Amazonian and Andean ancestry, which is consistent with their speaking a Quechuan language but living in the eastern Andean slopes of Colombia and thus interacting with groups in the neighbouring Amazonian lowlands. The Guarani stem from two distinct strands of ancestry within eastern South America. The most striking admixture event is in the Costa Rican Cabecar (Fig. 3) and other Chibchan-speaking populations (Supplementary Notes) from the Isthmo-Colombian area. One of the lineages that we detect in these populations occurs definitively within the radiation of South American populations, and so the presence of these populations in lower Central America suggests that there was reverse gene flow across the Panama isthmus after the initial settlement of South America. There has been controversy about whether Chibchan speakers of lower Central America represent direct descendants of the first settlers in the region or more recent migration across the isthmus, and our results support the view that more recent migration has contributed most of these populations’ ancestry [27].

Figure 3: A model fitting populations of entirely First American ancestry.

This is the most comprehensive survey of genetic diversity in Native Americans so far. Our analyses show that the great majority of Native American populations, from Canada to the southern tip of Chile, derive their ancestry from a homogeneous ‘First American’ ancestral population, presumably the one that crossed the Bering Strait more than 15,000 years ago [6,7,8]. We also document at least two additional streams of Asian gene flow into America, allowing us to reject the view that all present-day Native Americans stem from a single migration wave [6,7,8], and supporting the more complex scenarios proposed by some other studies [9,10,11,12,13,14,15]. In particular, the three distinct Asian lineages we detect (‘First American’, ‘Eskimo–Aleut’ and a separate one in the Na-Dene-speaking Chipewyan) are consistent with a three-wave model proposed [9] mostly on the basis of dental morphology and a controversial interpretation of the linguistic data. However, our analyses also document extensive admixture between First Americans and the subsequent streams of Asian migrants, which was not predicted by that model, such that Eskimo–Aleut speakers and the Chipewyan derive more than half their ancestry from First Americans. Further insights into Native American history will benefit from the application of analyses similar to those performed here to whole-genome sequences and to data from the many admixed populations in the Americas that do not self-identify as native [28,29,30].

Nature volume 488, pages 370–374 (16 August 2012)


Post by Admin on Oct 11, 2021 21:58:35 GMT

Paleolithic to Bronze Age Siberians Reveal Connections with First Americans and across Eurasia

Summary
Modern humans have inhabited the Lake Baikal region since the Upper Paleolithic, though the precise history of its peoples over this long time span is still largely unknown. Here, we report genome-wide data from 19 Upper Paleolithic to Early Bronze Age individuals from this Siberian region. An Upper Paleolithic genome shows a direct link with the First Americans by sharing the admixed ancestry that gave rise to all non-Arctic Native Americans. We also demonstrate the formation of Early Neolithic and Bronze Age Baikal populations as the result of prolonged admixture throughout the eighth to sixth millennium BP. Moreover, we detect genetic interactions with western Eurasian steppe populations and reconstruct Yersinia pestis genomes from two Early Bronze Age individuals without western Eurasian ancestry. Overall, our study demonstrates the most deeply divergent connection between Upper Paleolithic Siberians and the First Americans and reveals human and pathogen mobility across Eurasia during the Bronze Age.

Introduction
The Lake Baikal region in Siberia has been inhabited by modern humans since the Upper Paleolithic and has a rich archaeological record (Katzenberg and Weber, 1999, Weber, 1995). In the past 5 years, ancient genomic studies have revealed multiple genetic turnovers and admixture events in this region. The 24,000-year-old individual (MA1) from the Mal’ta site represents an ancestry referred to as “Ancient North Eurasian (ANE),” which was widespread across Siberia during the Paleolithic (Fu et al., 2016, Raghavan et al., 2014a, Sikora et al., 2019) and that contributed to the genetic profile of a vast number of present-day Eurasian populations as well as Native Americans (Haak et al., 2015, Lazaridis et al., 2014, Lazaridis et al., 2016, Raghavan et al., 2015). ANE ancestry was suggested to have been largely replaced in the Lake Baikal region during the Early Neolithic by a gene pool related to present-day northeast Asians, with a limited resurgence of ANE ancestry by the Early Bronze Age (Damgaard et al., 2018a).

Siberia has also been proposed as a source for multiple waves of dispersals into the Americas, the first of which was shown to be driven by a founding population estimated to have formed around 25,000–20,000 years before the present (BP) (Raghavan et al., 2015). The so-called Ancient Beringian ancestry represented by an 11,500-year-old Alaskan individual (USR1) was shown to be part of this founding population, estimated to have split from other Native Americans around 23,000 BP (Moreno-Mayar et al., 2018). In addition, the recently published 9,800-year-old Kolyma genome from northeastern Siberia was suggested to represent the closest relative to Native American populations outside of the Americas (Sikora et al., 2019). Moreover, the Paleo-Eskimo ancestry represented by a 4,000-year-old Saqqaq individual from Greenland was also estimated to have split from northeastern Siberian groups and migrated to Arctic America around 6,000–5,000 BP (Flegontov et al., 2019, Raghavan et al., 2014b, Rasmussen et al., 2010). Although these waves of migration are generally linked to ancient Siberian populations, their origins in the context of Siberian genetic history remain poorly understood. Further studies of Siberian population history using ancient genomes are, therefore, critical for a better understanding of the formation of Native American populations.

Furthermore, the Neolithic to Bronze Age transition in Eurasia was marked by complex cultural and genetic changes facilitated by extensive population movements, though their impact in the Lake Baikal region is still unclear. Looking to the west, the Early Bronze Age groups from the Pontic-Caspian steppe associated with the Yamnaya complex spread both east and west along with their distinct genetic profile often referred to as “Steppe ancestry” (Allentoft et al., 2015, Haak et al., 2015). The eastward expansion of this group is considered to be associated with the Early Bronze Age Afanasievo culture. However, the later Middle Bronze Age Okunevo-related population from the central steppe as well as the Late Bronze Age Khövsgöl-related population from the eastern steppe harbor only a limited proportion of Steppe ancestry (Jeong et al., 2018, Jeong et al., 2019). Therefore, the effect of steppe migrations in eastern Eurasia, particularly the interactions of Bronze Age Baikal hunter-gatherers with the contemporaneous and geographically proximal Afanasievo population, is still largely unexplored.

In this study, we report 19 newly sequenced ancient hunter-gatherers from Lake Baikal and its surrounding regions, spanning the Upper Paleolithic to the Early Bronze Age. Their analysis alongside published data reveals the most deeply divergent ancestry that links Upper Paleolithic Siberians and the First Peoples of the Americas, and more clearly delineates the complex transition between Early Neolithic and Early Bronze Age populations in the Lake Baikal region. We also provide both human and pathogen genomic evidence demonstrating the influence of western Eurasian steppe populations in this region during the Early Bronze Age and discuss the genetic contribution of Lake Baikal hunter-gatherers to Siberian populations through time.


Post by Admin on Oct 12, 2021 4:45:36 GMT

Results
Ancient DNA Sequencing
We generated genome-wide genotype data from 19 ancient humans, including one Upper Paleolithic individual (dated to 14,050–13,770 BP, see Orlova, 1995, Pavlenok et al., 2019), four Early Neolithic individuals (7,320–6,500 BP), and 14 Late Neolithic to Early Bronze Age (LNBA) individuals (4,830–3,570 BP) from a total of 10 archaeological sites (Figure 1; Table S1). The radiocarbon date offsets caused by the local freshwater reservoir were estimated using the carbon and nitrogen isotopic values as described in previous studies on the same region (see STAR Methods; Schulting et al., 2014, Schulting et al., 2015). We built single- and double-stranded DNA libraries from teeth or petrous portions of the temporal bone for the studied individuals, and shotgun sequencing revealed high levels of DNA preservation with endogenous DNA contents ranging from 0.12% to 50.54% (Table S1). Subsequently, libraries were enriched for human DNA by SNP-capture targeting a set of 1.24 million variable sites (Fu et al., 2015) and sequenced to mean coverage ranging from 0.04X to 2.07X. Pseudo-haploid genotypes were called on the targeted SNPs by randomly sampling a single allele at each position, with 34k to 886k SNPs covered by our samples. Additionally, we performed deep shotgun sequencing on eight individuals with high endogenous DNA levels (12%–51%), to achieve genomic coverage that ranged from 0.1X to 1.9X and refined their diploid genotypes by genotype likelihood-based imputation, resulting in 386k to 518k SNPs overlapping with the Human Origins dataset (Table S1).
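The pseudo-haploid calling step described above (randomly sampling a single allele at each targeted position) can be sketched as follows. The pileup layout, the site names and the function name are hypothetical; a real pipeline would also filter reads by base and mapping quality and handle ancient-DNA damage, which is omitted here.

```python
import random
from typing import Dict, List, Optional


def pseudo_haploid_calls(pileup: Dict[str, List[str]],
                         rng: random.Random) -> Dict[str, Optional[str]]:
    """Pick one observed allele at random per site; None where no read covers it."""
    calls: Dict[str, Optional[str]] = {}
    for site, alleles in pileup.items():
        calls[site] = rng.choice(alleles) if alleles else None
    return calls


# Toy pileup: alleles seen in the reads overlapping each targeted SNP.
pileup = {
    "snp_1": ["A", "A", "G"],   # three reads, possibly a heterozygous site
    "snp_2": ["C"],             # a single read
    "snp_3": [],                # not covered, stays missing
}
calls = pseudo_haploid_calls(pileup, random.Random(0))
```

Sampling one allele avoids calling heterozygotes from very low coverage, at the cost of treating every individual as haploid at each site.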

Figure 1. Geographic Location, Time Period, and Genetic Profile of Studied Individuals

(A) Locations of the 19 newly reported individuals and of the published ancient individuals relevant to this study. Newly reported individuals are shown as outlined squares or circles.

(B) Ages of newly reported individuals from each site, with the x axis showing the median calibrated radiocarbon dates after correction for freshwater reservoir effect.

(C) PCA of Eurasian and Native American populations. The modern individuals are shown as light gray circles, with the names of several representative populations marking the positions of West Eurasians (Sardinian), East Asians (Ami and Uyghur), Siberians (Chukchi, Eskimo Naukan, Koryak, Nganasan, Selkup), and Native Americans (Chipewyan, Mixe, Karitiana). Ancient individuals are shown as colored symbols. The individual KPT005, which showed a significant shift toward west Eurasian populations, and individuals GLZ001 and GLZ002, which also represented outlier genetic profiles, are marked with an arrow.

(D) Population clustering pattern of the studied individuals together with representative modern and ancient populations at K = 16. Most Early Neolithic to Bronze Age Lake Baikal region individuals are modeled as an admixture of northeast Asian (dark red), ANE (dark blue), and Nganasan (purple) components. The BZK002 individual has a genetic profile similar to that of Okunevo, with a significantly larger ANE proportion than the Baikal individuals, while the KPT005 individual shows a large component associated with WHG (light blue).

We determined genetic sex by comparing the coverage on the sex chromosomes with that on the autosomes, which revealed four females and 15 males. All individuals showed low modern human DNA contamination at the mitochondrial level as well as in estimates based on X chromosomal heterozygosity in males, except for KAG001, which showed 9.6% nuclear contamination (Table S1). No kin relationships were found among these individuals. Finally, we intersected our genotypes with SNPs on the Affymetrix Human Origins array (Lazaridis et al., 2014) and combined them with published genotype data from 3,014 present-day worldwide individuals and 453 ancient individuals for population genomic analysis (Table S1).
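The coverage-based sex assignment mentioned above follows a simple logic: relative to the autosomes, a female is expected to show roughly equal coverage on X and almost none on Y, while a male shows roughly half coverage on both. The sketch below encodes that idea; the cutoff values and the function name are hypothetical and are not taken from the paper.

```python
def infer_genetic_sex(autosomal_cov: float, x_cov: float, y_cov: float) -> str:
    """Classify sex from mean coverage on autosomes, X and Y (illustrative cutoffs)."""
    x_ratio = x_cov / autosomal_cov
    y_ratio = y_cov / autosomal_cov
    if x_ratio > 0.8 and y_ratio < 0.1:      # ~1.0 on X, ~0 on Y
        return "female"
    if x_ratio < 0.6 and y_ratio > 0.3:      # ~0.5 on X, ~0.5 on Y
        return "male"
    return "undetermined"                    # e.g. contamination or very low coverage


# Toy coverage values (mean depth per chromosome class).
sample_1 = infer_genetic_sex(1.00, 0.98, 0.01)
sample_2 = infer_genetic_sex(1.00, 0.50, 0.48)
sample_3 = infer_genetic_sex(1.00, 0.70, 0.20)
```

An intermediate result such as `sample_3` is worth flagging rather than forcing into a category, since mixed signals can indicate contamination.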


Post by Admin on Oct 12, 2021 19:53:08 GMT

Population Structure
We first performed principal-component analysis (PCA) to understand the genetic background of the studied individuals against modern Eurasian and Native American populations, and projected selected ancient individuals onto the PCs calculated with modern ones (Figure 1C). Most of the Lake Baikal individuals occupied the space on an “ANE-NEA” cline, first described by Damgaard et al. (2018a), running between “Northeast Asian” (NEA) ancestry represented by Neolithic hunter-gatherers from Devil’s Gate in the Russian Far East (Sikora et al., 2019, Siska et al., 2017) and the ANE ancestry represented by the Upper Paleolithic Siberian individuals MA1, AfontovaGora 2 (AG2), and AfontovaGora 3 (AG3) (Fu et al., 2016, Raghavan et al., 2014a). Our newly sequenced Upper Paleolithic genome from the Ust-Kyakhta-3 site (UKY), just south of Lake Baikal, is placed close to the Mesolithic northeastern Siberian Kolyma individual (Sikora et al., 2019) and is shifted toward Native American populations along PC2 compared to the rest of the ancient Baikal individuals. All four Early Neolithic individuals cluster with published Early Neolithic groups from the same region (Shamanka_EN, Lokomotiv_EN, UstBelaya_EN) (Damgaard et al., 2018a, Flegontov et al., 2019) designated as the “Baikal_EN” population. The LNBA individuals were divided into four groups. The major “Baikal_LNBA” group included 10 individuals and clustered with published Late Neolithic to Bronze Age Baikal populations (Shamanka_EBA, Kurma_EBA, UstIda_EBA, UstIda_LN, UstBelaya_BA). These individuals were positioned in PCA closer to ANE-related individuals than the Early Neolithic individuals from the same region were, as well as closer to the Paleo-Eskimo Saqqaq individual (Rasmussen et al., 2010).
Another two individuals (GLZ001 and GLZ002) from the Glazkovskoe predmestie site, unlike the third individual from the same archaeological site (GLZ003), appeared shifted from the main cluster and showed closer genetic affinity to the Devil’s Gate and Early Neolithic Baikal individuals. One of the six individuals from the Kachug site (KPT005) was substantially displaced from the Baikal_LNBA group toward western Eurasians along PC1, not along the ANE-NEA cline but toward later Bronze Age populations, suggesting a potential introgression of Steppe-related ancestry. Finally, an Early Bronze Age individual (BZK002) from the Bazaikha site in the Yenisei River region, further to the west of Lake Baikal, was significantly displaced toward ANE-related individuals and located close to published Bronze Age individuals associated with the Okunevo culture (Damgaard et al., 2018a).
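Projecting ancient individuals onto axes computed from modern samples, as described above, can be illustrated with a small self-contained sketch. Everything here is an assumption made for illustration: the data are simulated diploid genotypes, only the first PC is computed (by power iteration), and the ancient sample is placed by a least-squares fit over its non-missing SNPs. The authors' actual software and settings are given in their STAR Methods, not here.

```python
import math
import random


def center(rows, means):
    return [[g - m for g, m in zip(row, means)] for row in rows]


def first_pc(centered, rng, n_iter=100):
    """Leading principal axis (unit vector over SNPs) by power iteration."""
    m = len(centered[0])
    v = [rng.gauss(0, 1) for _ in range(m)]
    for _ in range(n_iter):
        scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
        v = [sum(s * row[j] for s, row in zip(scores, centered)) for j in range(m)]
        norm = math.sqrt(sum(w * w for w in v))
        v = [w / norm for w in v]
    return v


def project(sample, means, v):
    """Least-squares coordinate on axis v using only non-missing (not None) SNPs."""
    num = den = 0.0
    for g, m, w in zip(sample, means, v):
        if g is not None:
            num += (g - m) * w
            den += w * w
    return num / den


def draw(freqs, rng):
    """Simulate one diploid individual (0/1/2 allele copies per SNP)."""
    return [(rng.random() < p) + (rng.random() < p) for p in freqs]


rng = random.Random(7)
freq_a = [0.9] * 20 + [0.1] * 20          # toy population A
freq_b = [0.1] * 20 + [0.9] * 20          # toy population B
modern_a = [draw(freq_a, rng) for _ in range(10)]
modern_b = [draw(freq_b, rng) for _ in range(10)]
modern = modern_a + modern_b
means = [sum(col) / len(modern) for col in zip(*modern)]
pc1 = first_pc(center(modern, means), rng)   # axis defined by modern samples only

# An "ancient" sample from population A with half of its SNPs missing.
ancient = [g if j % 2 == 0 else None for j, g in enumerate(draw(freq_a, rng))]
score_ancient = project(ancient, means, pc1)
score_a = sum(project(s, means, pc1) for s in modern_a) / len(modern_a)
score_b = sum(project(s, means, pc1) for s in modern_b) / len(modern_b)
```

Because the axis is computed from the modern samples alone, missing data in the ancient genome cannot distort the PCs themselves; rescaling by the observed SNPs keeps the projected point from shrinking toward the origin.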

Population clustering with ADMIXTURE based on worldwide populations also showed a similar clustering pattern. When selecting a K value of 16 (see STAR Methods), the published and newly sequenced individuals belonging to main Early Neolithic to Bronze Age Baikal groups all showed genetic profiles composed of a mixture of three major components that were mostly enriched in ANE-related individuals, northeast Asians, and central Siberians represented by the Uralic-speaking Nganasan population (Figure 1D). The ANE and central Siberian ancestries were both of higher proportion in most LNBA Baikal individuals than in the Early Neolithic ones, while GLZ001 and GLZ002 showed higher NEA ancestry, similar to the Early Neolithic population. The BZK002 individual presented a profile similar to the published Okunevo group (Damgaard et al., 2018b), with a much larger ANE component compared to other Lake Baikal individuals. The KPT005 individual also displayed a substantial contribution derived from European “Western Hunter-Gatherer” (WHG) ancestry, likely acquired through gene flow from the west.

We estimated the runs of homozygosity (ROH) of selected individuals together with published Baikal individuals (Table S1) and did not identify an inbreeding signal in any individual. The Kolyma individual showed significantly more ROH compared with other individuals, suggesting a smaller population size in Mesolithic northeastern Siberia (Figure S1). The sharing of identity-by-descent (IBD) segments between individuals suggested a close relationship between UKY and Kolyma, supporting our analyses based on genome-wide SNP data, and also revealed that Early Neolithic and LNBA Baikal individuals shared genetic affinity with each other as well as with the older UKY and Kolyma genomes (Figure S1).
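As a rough illustration of what a runs-of-homozygosity summary measures, the sketch below scans ordered diploid calls along one chromosome and adds up the lengths of uninterrupted homozygous stretches above a minimum length. It is a simplification under stated assumptions: a single heterozygous site ends a run, whereas real ROH callers typically tolerate some errors and account for marker density; the function name and the positions are invented.

```python
from typing import Sequence


def total_roh_length(positions: Sequence[int], is_het: Sequence[bool],
                     min_len_bp: int) -> int:
    """Sum the lengths (bp) of homozygous runs that are at least min_len_bp long."""
    total = 0
    run_start = None   # position of the first homozygous site in the current run
    last_hom = None    # position of the most recent homozygous site
    for pos, het in zip(positions, is_het):
        if het:
            # A heterozygous site closes the current run, if any.
            if run_start is not None and last_hom - run_start >= min_len_bp:
                total += last_hom - run_start
            run_start = None
        else:
            if run_start is None:
                run_start = pos
            last_hom = pos
    # Close a run that extends to the end of the chromosome.
    if run_start is not None and last_hom - run_start >= min_len_bp:
        total += last_hom - run_start
    return total


# Toy chromosome: one heterozygous site at 3.5 Mb splits two homozygous runs.
positions = [0, 1_000_000, 2_000_000, 3_000_000, 3_500_000, 4_000_000, 9_000_000]
is_het = [False, False, False, False, True, False, False]
roh_total = total_roh_length(positions, is_het, min_len_bp=2_000_000)
```

Many long runs point to recent inbreeding, while an excess of shorter runs is more typical of a small population, which is the kind of contrast the paragraph above draws for Kolyma.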

Figure S1. Population Size and Relatedness of Lake Baikal Populations Revealed by ROH and IBD Segments, Related to Figure 1 and Table S1

This figure summarizes the cumulative ROH length detected in each individual (row 1), the shared IBD segment length among individuals within each population (row 2), and the shared IBD segment length of UKY, Baikal_EN, and Baikal_LNBA individuals with other populations (rows 3-5). Long segments (> 8 Mb) and short segments (< 2 Mb) are also summarized separately.


Post by Admin on Oct 12, 2021 22:19:01 GMT

Upper Paleolithic Baikal Ancestry Links with Non-Arctic Native Americans
In the population structure analysis, we found the Upper Paleolithic UKY individual to be closely related to the northeastern Siberian Kolyma individual. This is further validated by outgroup f3 statistics, in which UKY, like Kolyma, showed close genetic affinity with Native American and Beringian populations (Figure 2A). f4 statistics of the form f4(Mbuti, X; Kolyma, UKY) revealed that Kolyma is more closely related than UKY to populations from northeastern Siberia and North America (Figure S2). We further applied f4 statistics to explore the relationship of UKY and Kolyma with Native Americans and with USR1, which was described as an outgroup to all non-Arctic Native Americans (Moreno-Mayar et al., 2018). Both UKY and Kolyma were symmetrically related to non-Arctic Native Americans and USR1, while USR1 shared significantly more genetic affinity with Native American populations than UKY and Kolyma did (Figure S2; Table S2).
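The outgroup f3 statistic used above measures how much genetic drift two populations share after splitting from an outgroup: the larger f3(Outgroup; X, Y), the closer X and Y are. A minimal sketch on invented allele frequencies follows; it uses the plain product form and leaves out the finite-sample corrections and standard errors that real implementations include. All population labels and values are hypothetical.

```python
from typing import Dict, List, Sequence


def outgroup_f3(outgroup: Sequence[float], x: Sequence[float],
                y: Sequence[float]) -> float:
    """f3(Outgroup; X, Y): mean over SNPs of (o - x) * (o - y)."""
    return sum((o - a) * (o - b) for o, a, b in zip(outgroup, x, y)) / len(outgroup)


# Toy allele frequencies at four SNPs.
outgroup = [0.5, 0.5, 0.5, 0.5]
target = [0.8, 0.2, 0.7, 0.3]
panel: Dict[str, List[float]] = {
    "close": [0.75, 0.25, 0.70, 0.35],   # drifted in the same direction as target
    "far": [0.55, 0.45, 0.50, 0.50],     # barely shares any drift with target
}

# Rank panel populations by shared drift with the target, highest first.
ranked = sorted(panel, key=lambda name: outgroup_f3(outgroup, panel[name], target),
                reverse=True)
```

Ranking a worldwide panel this way against one ancient genome yields the kind of "top affinity" list that a map such as Figure 2A displays.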

Figure 2. Genetic Affinity between Upper Paleolithic UKY, Kolyma, and Native Americans

(A) Genetic affinity between UKY and worldwide populations assessed by f3(Mbuti; X, UKY). The sampling location of UKY is shown with a green triangle. The 10 test populations with the highest f3 values are shown as diamonds and other populations as circles.

(B) Graph model of the relationships among UKY, Kolyma, and Native American populations. We first found the best-fitting model with only UKY or only Kolyma, as described in Figure S2, then added Kolyma to the selected model with UKY and chose the best model based on the maximum f-statistic Z scores and final scores reported for each model. Lineages related to Native American populations are colored orange, and northeast Asian-related lineages are colored red.

See also Figure S2 and Table S2.

Figure S2. Relationship between UKY, Kolyma, and Modern-Day Populations Based on f4 Statistics and qpGraph Modeling, Related to Figure 2

(A) Differences in the genetic affinities of UKY and Kolyma with worldwide populations, assessed by f4(Mbuti, X; Kolyma, UKY). Test populations with significant f4 values (|Z| > 3) are shown as diamonds and other populations as circles.

(B) Graph modeling of UKY (left) and Kolyma (right) on the skeleton graph including Mbuti, AG3, Onge, Devil’s Gate, USR1, ASO, and ESN, as described in STAR Methods. The best-fitting model for each individual was selected based on the maximum f-statistic Z scores and final scores reported for each model.

We also investigated their genetic composition using qpAdm modeling (Haak et al., 2015) and found that both UKY and Kolyma possessed a similar level of ANE contribution, around 30%, when modeled as a two-way mixture of Devil’s Gate (representing NEA ancestry) and AG3 (representing ANE ancestry) (Table S3). Notably, this model fit poorly for both UKY (p = 1.45E-03) and Kolyma (p = 3.98E-08), as the Native American Karitiana population showed extra affinity with the tested individuals compared to the fitted model (Table S3). This observation suggests that UKY and Kolyma shared a certain degree of genetic drift with Native American populations that occurred after the ancestors of Native Americans diverged from ANE and NEA ancestries.
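The idea behind a two-way mixture model of this kind can be sketched with ordinary least squares: if the target is a mixture of two sources, its f4 statistics against a set of outgroups should be the same weighted average of the sources' f4 statistics. The code below is a simplified stand-in, not qpAdm itself, which uses the covariance of the statistics and a formal rank test; the numbers and the function name are invented for illustration.

```python
from typing import List, Sequence, Tuple


def fit_two_way(target: Sequence[float], src1: Sequence[float],
                src2: Sequence[float]) -> Tuple[float, List[float]]:
    """Least-squares alpha such that target ~ alpha*src1 + (1 - alpha)*src2.

    Each sequence holds f4 statistics of one population against the same
    ordered set of outgroups. Returns (alpha, residuals).
    """
    diff = [a - b for a, b in zip(src1, src2)]
    alpha = (sum((t - b) * d for t, b, d in zip(target, src2, diff))
             / sum(d * d for d in diff))
    residuals = [t - (alpha * a + (1 - alpha) * b)
                 for t, a, b in zip(target, src1, src2)]
    return alpha, residuals


# Toy f4 vectors for two sources against three outgroups.
source_1 = [0.010, -0.004, 0.006]
source_2 = [0.000, 0.002, -0.002]

# A target that is exactly 30% source_1 and 70% source_2.
clean_target = [0.3 * a + 0.7 * b for a, b in zip(source_1, source_2)]
alpha_clean, resid_clean = fit_two_way(clean_target, source_1, source_2)

# The same target with extra affinity to the third outgroup, which a
# two-way model cannot absorb.
extra_target = [clean_target[0], clean_target[1], clean_target[2] + 0.003]
alpha_extra, resid_extra = fit_two_way(extra_target, source_1, source_2)
```

The second case mimics the situation described above: unexplained affinity to one outgroup leaves a large residual and shifts the estimated proportion, which is the kind of signal that leads a two-source model to be rejected.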

We further explored the relationships among UKY, Kolyma, and ancient Native American groups using graph-based qpGraph modeling (Patterson et al., 2012, Reich et al., 2009). We found that both UKY and Kolyma could be modeled as a mixture between a northeast Asian lineage and a sister group of the Native American clade represented by USR1, Ancient Southwestern Ontario (ASO) individuals from Canada, and Early San Nicolas (ESN) individuals from the California Channel Islands in the USA (Scheib et al., 2018; Figure S2). When UKY and Kolyma were included in the same graph, they were consistently modeled as descendants of two independent admixture events, with the ancestral lineages of both deriving from the Native American-related and the northeast Asian-related clades (Figure 2B). These findings confirm the close affinity of UKY and Kolyma to Native Americans but also highlight that both lineages contributing to UKY were ancestral to the groups contributing to Kolyma. In addition, aside from the first wave migrating into the Americas through Beringia, the admixture modeling suggests that the source of the Native American ancestry was more broadly spread across Siberia during the Upper Paleolithic, as UKY is ∼4,000 years older than Kolyma and was found over 3,000 km to its southwest. In fact, our admixture graph indicates that this basal Native American group experienced multiple genetic contacts with northeast Asian populations, giving rise to distinct ancient Siberian populations.