Genetic History of the Near East


Admin
Administrator

Posts: 72,874

Genetic History of the Near East May 31, 2020 7:52:22 GMT


Post by Admin on May 31, 2020 7:52:22 GMT

The Genomic History of the Bronze Age Southern Levant

Summary
We report genome-wide DNA data for 73 individuals from five archaeological sites across the Bronze and Iron Age Southern Levant. These individuals, who share the “Canaanite” material culture, can be modeled as descending from two sources: (1) earlier local Neolithic populations and (2) populations related to the Chalcolithic Zagros or the Bronze Age Caucasus. The non-local contribution increased over time, as evinced by three outliers who can be modeled as descendants of recent migrants. We show evidence that different “Canaanite” groups genetically resemble each other more than they resemble other populations. We find that Levant-related modern populations typically have substantial ancestry coming from populations related to the Chalcolithic Zagros and the Bronze Age Southern Levant. These groups also harbor ancestry from sources we cannot fully model with the available data, highlighting the critical role of post-Bronze-Age migrations into the region over the past 3,000 years.

Introduction
The Bronze Age (ca. 3500–1150 BCE) was a formative period in the Southern Levant, a region that includes present-day Israel, Jordan, Lebanon, the Palestinian Authority, and southwest Syria. This era, which ended in a large-scale civilization collapse across this region (Cline, 2014), shaped later periods both demographically and culturally. The following Iron Age (ca. 1150–586 BCE) saw the rise of territorial kingdoms such as biblical Israel, Judah, Ammon, Moab, and Aram-Damascus, as well as the Phoenician city-states. In much of the Late Bronze Age, the region was ruled by imperial Egypt, although in later phases of the Iron Age it was controlled by the Mesopotamian-centered empires of Assyria and Babylonia. Archaeological and historical research has documented major changes during the Bronze and Iron Ages, such as the cultural influence of the northern (Caucasian) populations related to the Kura-Araxes tradition during the Early Bronze Age (Greenberg and Goren, 2009) and effects from the “Sea Peoples” (such as Philistines) from the west in the beginning of the Iron Age (Yasur-Landau, 2010).

The inhabitants of the Southern Levant in the Bronze Age are commonly described as “Canaanites,” that is, residents of the Land of Canaan. The term appears in several 2nd millennium BCE sources (e.g., Amarna, Alalakh, and Ugarit tablets) and in biblical texts dating from the 8th–7th centuries BCE and later (Bienkowski, 1999, Lemche, 1991, Na’aman, 1994a). In the latter, the Canaanites are referred to as the pre-Israelite inhabitants of the land (Na’aman, 1994a). Canaan of the 2nd millennium BCE was organized in a system of city-states (Goren et al., 2004), where elites ruled from urban hubs over rural (and in some places pastoral) countryside. The material culture of these city-states was relatively uniform (Mazar, 1992), but whether this uniformity extends to their genetic ancestry is unknown. Although genetic ancestry and material culture are unlikely to ever match perfectly, past ancient DNA analyses show that they might sometimes be strongly associated. In other cases, a direct correspondence between genetics and culture cannot be established. We discuss several examples in the Discussion.

Previous ancient DNA studies published genome-scale data for thirteen individuals from four Bronze Age sites in the Southern Levant: three individuals from ‘Ain Ghazal in present-day Jordan, dated to ∼2300 BCE (Intermediate Bronze Age) (Lazaridis et al., 2016); five from Sidon in present-day Lebanon, dated to ∼1750 BCE (Middle Bronze Age) (Haber et al., 2017); two from Tel Shadud in present-day Israel, dated to ∼1250 BCE (Late Bronze Age) (van den Brink et al., 2017); and three from Ashkelon in present-day Israel, dated to ∼1650–1200 BCE (Middle and Late Bronze Age) (Feldman et al., 2019). The ancestry of these individuals could be modeled as a mixture of earlier local groups and groups related to the Chalcolithic people of the Zagros Mountains, located in present-day Iran and designated in previous studies as Iran_ChL (Haber et al., 2017, Lazaridis et al., 2016). The Bronze Age Sidon group could be modeled as a major (93% ± 2%) ancestral source for present-day groups in the region (Haber et al., 2017). A study of Chalcolithic individuals from Peqi’in cave in the Galilee (present-day Israel) showed that the ancestry of this earlier group included an additional component related to earlier Anatolian farmers, which was excluded as a substantial source for later Bronze Age groups from the Southern Levant, with the exception of the coastal groups from Sidon and Ashkelon (Feldman et al., 2019, Harney et al., 2018). These observations point to a degree of population turnover in the Chalcolithic-Bronze Age transition, consistent with archaeological evidence for a disruption between local Chalcolithic and Early Bronze cultures (de Miroschedji, 2014).

Here, we set out to address three issues. First, we sought to determine the extent of genetic homogeneity among the sites associated with Canaanite material culture. Second, we analyzed the data to gain insights into the timing, extent, and origin of gene flow that brought Zagros- and Caucasus-related ancestry to the Bronze Age Southern Levant. Third, we assessed the extent to which additional gene flow events have affected the region since that time.

To address these questions, we generated genome-wide ancient DNA data for 71 Bronze Age and 2 Iron Age individuals, spanning roughly 1,500 years, from the Intermediate Bronze Age to the Early Iron Age. Combined with previously published data on the Bronze and Iron Ages in the Southern Levant, we assembled a dataset of 93 individuals from 9 sites across present-day Israel, Jordan, and Lebanon, all demonstrating Canaanite material culture. We show that the sampled individuals from the different sites are usually genetically similar, albeit with subtle but in some cases significant differences, especially in residents of the coastal regions of Sidon and Ashkelon. Almost all individuals can be modeled as a mixture of local earlier Neolithic populations and populations from the northeastern part of the Near East. However, the mixture proportions change over time, revealing the demographic dynamics of the Southern Levant during the Bronze Age. Finally, we show that the genomes of present-day groups geographically and historically linked to the Bronze Age Levant, including the great majority of present-day Jewish groups and Levantine Arabic-speaking groups, are consistent with having 50% or more of their ancestry from people related to groups who lived in the Bronze Age Levant and the Chalcolithic Zagros. These present-day groups also show ancestries that cannot be modeled by the available ancient DNA data, highlighting the importance of additional major genetic effects on the region since the Bronze Age.


Post by Admin on May 31, 2020 18:59:39 GMT

Results
Dataset
We extracted DNA from the bones of 73 individuals from 5 archaeological sites in the Southern Levant (Table S1; STAR Methods; Figure 1A):
Thirty-five individuals from Tel Megiddo (northern Israel), most of whom date to the Middle-to-Late Bronze Age, except for one dating to the Intermediate Bronze Age and one dating to the Early Iron Age

Twenty-one individuals from the Baq‛ah in central Jordan (northeast of Amman), mostly from the Late Bronze Age

Thirteen individuals from Yehud (central Israel), dating to the Intermediate Bronze Age

Three individuals from Tel Hazor (northern Israel) dating to the Middle-to-Late Bronze Age

One individual from Tel Abel Beth Maacah (northern Israel), dating to the Iron Age

Figure 1. Bronze and Iron Age Individuals Analyzed in This Study

(A) Location of archaeological sites. Shown in blue are sites with individuals first reported in this paper. In green are sites with individuals reported in previous studies.

(B) PCA plot, showing present-day Eurasian individuals in gray (taken from Lazaridis et al., 2014) and ancient individuals in color. Only individuals with at least 30,000 autosomal SNPs were plotted. All Bronze and Iron Age individuals cluster (blue and green marks), except for the three denoted as “outliers” and for some IA1 individuals.

(C) ADMIXTURE plot, showing Bronze and Iron Age individuals, as well as other selected populations. Only individuals with at least 30,000 autosomal SNPs were plotted. The seven families are marked by F1–F7. “A” stands for “Abel.”

See also Figure S1 and Table S1.

For all analyzed samples but one, DNA was extracted from petrous bones. The DNA was converted to double-indexed half Uracil-DNA glycosylase (UDG)-treated libraries that we enriched for about 1.2 million single nucleotide polymorphisms (SNPs) before sequencing (see STAR Methods). The median number of autosomal SNPs covered was 288,863 (range 4,883–945,269). In addition to genetic data, we measured values of strontium isotopes for 12 individuals (and for 8 additional individuals that did not produce DNA) (STAR Methods; Methods S1A), and generated accelerator mass spectrometry radiocarbon dates for 20 individuals (Table S1). We combined our newly generated data with published data for 13 Bronze Age Southern Levant individuals from ‘Ain Ghazal, Sidon, Tel Shadud, and Ashkelon (van den Brink et al., 2017, Feldman et al., 2019, Haber et al., 2017, Lazaridis et al., 2016), and 7 Iron Age Southern Levant individuals from Ashkelon (Feldman et al., 2019).

We projected the autosomal genetic data onto the plane spanned by the first two principal components of 777 present-day West Eurasian individuals genotyped for roughly 600,000 SNPs on the Affymetrix Human Origins SNP array (Lazaridis et al., 2014). We restricted the plot to 68 individuals represented by at least 30,000 autosomal SNPs (Figure 1B), a coverage threshold where the ability to infer ancestry was robust to sampling noise (Methods S1B). All Bronze and Iron Age Levant individuals (blue and green shapes) form a tight cluster, except for three outliers from Megiddo, and previously identified outliers from the Ashkelon population known as Iron Age I (IA1) (Feldman et al., 2019). We also ran ADMIXTURE on a set of 1,663 present-day and ancient individuals (see STAR Methods; Figure S1). The ADMIXTURE results are qualitatively consistent with the principal component analysis (PCA), suggesting that all individuals but the outliers from Megiddo and the Ashkelon IA1 population have similar ancestry (Figure 1C).

Figure S1. Cross-Validation Errors in ADMIXTURE as a Function of K, Related to Figure 1C

(A) Using 1,663 individuals. (Blue) ADMIXTURE was run on 357,334 SNPs according to the ancient samples protocol. (Orange) ADMIXTURE was run on 50,165 SNPs according to the present-day samples protocol.

(B) Using 3,515 individuals. Only the present-day samples protocol was used.

We used the method described in (Olalde et al., 2019) to identify 17 individuals as being first-, second-, or third-degree relatives of other individuals in the dataset. They fall within seven families: five in Tel Megiddo and two in the Baq‛ah. In most families, we used only the member with the highest SNP coverage in subsequent analyses (Table S1). Two of the three Megiddo outliers are a brother and a sister (Family 4, I2189 and I2200), leaving in the final dataset two individuals marked as outliers. After removing low-coverage individuals and closely related family members, 62 individuals were left for further analysis (Table S1).
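The family-pruning step described above can be sketched in a few lines of Python. This is only an illustration of the rule "keep the member with the highest SNP coverage", not the authors' pipeline: the record layout, the function name `keep_best_per_family`, the extra ID `X1`, and all SNP counts are hypothetical; only the IDs I2189 and I2200 and their shared family come from the text.

```python
def keep_best_per_family(individuals):
    """Keep every unrelated individual plus the best-covered member of each family.

    individuals: list of dicts with keys
      "id"     - sample identifier
      "family" - family label, or None for individuals with no close relatives
      "snps"   - number of autosomal SNPs covered
    Returns the sorted list of retained sample identifiers.
    """
    best_in_family = {}
    kept = []
    for ind in individuals:
        family = ind["family"]
        if family is None:
            kept.append(ind["id"])
        elif (family not in best_in_family
              or ind["snps"] > best_in_family[family]["snps"]):
            best_in_family[family] = ind
    kept.extend(member["id"] for member in best_in_family.values())
    return sorted(kept)


# Hypothetical records: SNP counts are invented for illustration.
sample = [
    {"id": "I2189", "family": "F4", "snps": 150_000},
    {"id": "I2200", "family": "F4", "snps": 600_000},
    {"id": "X1", "family": None, "snps": 300_000},
]
result = keep_best_per_family(sample)
print(result)
```

Note that the text says this rule was applied "in most families", so a real pipeline would need a way to override it for specific cases.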


Post by Admin on Jun 1, 2020 6:21:06 GMT

High Degree of Genetic Affinities between Multiple Sites
We divided the 26 high-coverage individuals from Tel Megiddo into the following groups, on the basis of geographic location, archaeological period, and genetic clustering in PCA (Table S1): Intermediate Bronze Age (Megiddo_IBA, a single individual), Middle-to-Late Bronze Age (Megiddo_MLBA, 22 individuals), Iron Age (Megiddo_IA, a single individual), as well as the two outliers, Megiddo_I2200 and Megiddo_I10100, which were each treated as a separate group. We compared these groups and the other populations in our dataset to previously published data from other sites in the broader region and from earlier periods, including the Early Bronze Age Caucasus (Armenia_EBA), the Middle-to-Late Bronze Age Caucasus (Armenia_MLBA), the Chalcolithic Zagros Mountains (Iran_ChL), the Chalcolithic Caucasus (Armenia_ChL), the Neolithic of the Southern Levant (Levant_N), the Neolithic of the Zagros Mountains (Iran_N), and the Neolithic of Anatolia (Anatolia_N) (Lazaridis et al., 2016).

To test for variation in ancestry proportions among the Levant Bronze and Iron Age groups, we used qpWave. qpWave tests whether each possible pair of groups (Testi, Testj) is consistent with descending from a common ancestral population—that is, consistent with being a clade—since separation from the ancestors of a set of outgroup populations. qpWave works by computing symmetry test statistics of the form f4(Testi, Testj; Outgroupk, Outgroupl), which have an expected value of zero if (Testi, Testj) form a clade with respect to the outgroups. qpWave then generates a single p value corrected for the empirically measured correlation among the statistics (Reich et al., 2012). Using a distantly related set of outgroups, we found that with the exception of the outliers from Megiddo, Ashkelon IA1, and Sidon, all Bronze and Iron Age Levant groups are consistent with being pairwise clades with respect to the outgroups (Figure 2).
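To make the symmetry statistic concrete, here is a minimal Python sketch of a single f4 statistic computed from allele frequencies, with a plain delete-one-block jackknife for its standard error. It does not reproduce qpWave, which combines many such statistics into one p value. The function names, the equal-size contiguous blocks, and the toy frequencies are assumptions made for illustration.

```python
import math


def f4(pa, pb, pc, pd):
    """Mean over SNPs of (pA - pB) * (pC - pD), given per-SNP allele frequencies."""
    terms = [(a - b) * (c - d) for a, b, c, d in zip(pa, pb, pc, pd)]
    return sum(terms) / len(terms)


def f4_jackknife(pa, pb, pc, pd, n_blocks):
    """Return (estimate, standard error) from a delete-one-block jackknife.

    SNPs are split into n_blocks contiguous blocks (the last block takes any
    remainder). Inputs must be Python lists of equal length.
    """
    n = len(pa)
    size = n // n_blocks
    estimate = f4(pa, pb, pc, pd)
    leave_one_out = []
    for k in range(n_blocks):
        start = k * size
        stop = n if k == n_blocks - 1 else start + size

        def drop(freqs):
            # Remove the k-th block of SNPs from one frequency list.
            return freqs[:start] + freqs[stop:]

        leave_one_out.append(f4(drop(pa), drop(pb), drop(pc), drop(pd)))
    mean = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum(
        (v - mean) ** 2 for v in leave_one_out
    )
    return estimate, math.sqrt(variance)


# Toy allele frequencies at four SNPs (invented values).
test_i = [0.10, 0.50, 0.30, 0.80]
test_j = [0.30, 0.50, 0.20, 0.70]
out_k = [0.20, 0.90, 0.40, 0.60]
out_l = [0.60, 0.10, 0.50, 0.40]

estimate, stderr = f4_jackknife(test_i, test_j, out_k, out_l, n_blocks=2)
print(estimate, stderr)
```

If Test_i and Test_j have identical frequencies at every SNP, each term vanishes and the statistic is exactly zero, which is the expectation under the clade hypothesis.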

Figure 2. p values of qpWave for Each Pair of Populations

Values greater than 0.05 are shaded in light green, and values lower than 0.001 are shaded in light red. The rectangle shows all Levant populations excluding Sidon, the outliers from Megiddo and Ashkelon IA1. Computations were based on the o9a outgroup set (o9 + Anatolia_N).

We discuss each of qpWave’s findings of significant population substructure in turn. The Megiddo outliers not only fail to form a clade with the other populations, but also with each other. Ashkelon IA1 has previously been reported to harbor European ancestry, and so our finding that it is genetically differentiated from contemporary groups is unsurprising (Feldman et al., 2019). The significant differentiation of the Sidon individuals in qpWave—despite the fact that they roughly cluster with the other Southern Levant Bronze Age groups in PCA and ADMIXTURE—is notable, especially because we find that they are consistent with forming a clade with the two groups from coastal Ashkelon that do not have European-related admixture (the Bronze Age and later Iron Age groups ASH_LBA and ASH_IA2). Speculatively, this observation could be related to the fact that both Sidon and Ashkelon were port towns with connections to other Mediterranean coastal groups outside the Southern Levant, which could have introduced ancestry components that are absent from inland Levantine Bronze Age groups, although it is difficult to test this hypothesis in the absence of high resolution ancient DNA sampling from the eastern Mediterranean rim. The genetic distinctiveness of the Sidon individuals is also compatible with previous findings that Chalcolithic Levantine individuals from Peqi’in Cave are consistent with contributing some ancestry to the Sidon individuals, but not to the ‘Ain Ghazal ones (Harney et al., 2018). We considered the possibility that the significantly different genetic patterns we detect in the Sidon individuals could reflect their different experimental treatment compared with that of the other individuals in this study (shotgun sequencing of non-UDG-treated libraries compared with enrichment of UDG-treated libraries). 
To test this, we repeated the analyses by using only transversion SNPs, which are less prone to characteristic ancient DNA errors, but found no indication of systematic bias (Wang et al., 2015). However, we did find evidence of substructure within the Sidon individuals: some but not all were consistent with forming a clade with inland Southern Levant populations, a finding that could reflect the substantially cosmopolitan nature of this coastal site (Methods S1C, see Discussion).

To reveal subtler population structure, we repeated the qpWave analysis adding outgroups that are genetically closer to the test groups, such as Armenia_MLBA and Natufian (Figure 3). With this more powerful set of outgroups, Baq‛ah and Megiddo_IBA also provide evidence of not being pairwise clades with the remaining groups. Thus, beyond the broad observation of genetic affinities between sites, we also observe subtle ancestry heterogeneity across the region during the Bronze Age (see Discussion).

Figure 3. p values of qpWave for Each Pair of Populations

Values greater than 0.05 are shaded in light green, and values lower than 0.001 are shaded in light red. The rectangle shows all Levant populations excluding Sidon, the outliers from Megiddo and Ashkelon IA1. Computations were based on the extended outgroup set (o9 + Anatolia_N + Armenia_MLBA + Caucasus Hunter Gatherers [CHG] + Natufians).

Gene Flow into the Southern Levant During the Bronze Age
Two previous studies of Bronze Age individuals from ‘Ain Ghazal and Sidon modeled them as derived from a mixture of earlier local groups (Levant_N) and groups related to peoples of the Chalcolithic Zagros mountains (Iran_ChL) (Haber et al., 2017, Lazaridis et al., 2016). These groups were estimated to harbor around 56% ± 3% and 48% ± 4% Neolithic Levant-related ancestry for ‘Ain Ghazal (Lazaridis et al., 2016) and Sidon (Haber et al., 2017), respectively. We used qpAdm to estimate that Bronze and Iron Age Ashkelon (ASH_LBA and ASH_IA2) carry 54% ± 5% and 42% ± 5% Neolithic Levant-related ancestry, respectively. Next, we used qpAdm to test the same model for the data reported here and found that most Middle-to-Late Bronze Age groups fit the model, with point estimates of 48%–57% Levant_N ancestry. These ancestry proportions are statistically indistinguishable (Bonferroni-corrected z test), which corroborates the fact that they are consistent with forming pairwise clades in qpWave (Table S2; Methods S1D). The only group that failed to fit this model was Baq‛ah (p = 0.0003), even when using a wide range of outgroup populations (Table S2). This might be a result of ancestry heterogeneity across the Baq‛ah individuals (see below).
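As a worked illustration of a Bonferroni-corrected z test on two ancestry estimates, the Python sketch below compares the published ‘Ain Ghazal and Sidon values quoted above, read as 56% and 48% with uncertainties of 3 and 4 percentage points. Treating those uncertainties as independent standard errors, and the choice of three comparisons for the correction, are assumptions for the example; the authors' exact procedure is in Methods S1D.

```python
import math


def z_test(p1, se1, p2, se2):
    """Two-sided z test for the difference of two independent estimates."""
    z = (p1 - p2) / math.sqrt(se1 ** 2 + se2 ** 2)
    # Two-sided p value under a standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def bonferroni(p_values):
    """Multiply each p value by the number of tests, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Levant_N fractions quoted in the text for 'Ain Ghazal and Sidon.
z, p = z_test(0.56, 0.03, 0.48, 0.04)
print(round(z, 2), round(p, 3))

# Hypothetical family of three comparisons; the other two p values are invented.
print(bonferroni([p, 0.40, 0.75]))
```

Here the difference of 8 percentage points sits against a combined standard error of 5 points, so z is about 1.6 and the difference is not significant even before correction.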

To obtain insight into the Zagros-related ancestry component, we focused on two questions: what is the likely origin of this ancestry component and what is its likely timing? Although people of the Chalcolithic Zagros are so far the best proxy population for this ancestry component, there is no archaeological evidence for cultural spread directly from the Zagros into the Southern Levant during the Bronze Age. In contrast, there is archaeological support for connections between Bronze Age Southern Levant groups and the Caucasus (Greenberg and Goren, 2009), a term we use to represent both present-day Caucasus, as well as neighboring regions such as eastern Anatolia (see Discussion). With regard to the timing of these events, archaeology points to cultural affinities between the Kura-Araxes (Caucasus) and Khirbet Kerak (Southern Levant) archaeological cultures in the first half of the 3rd millennium BCE (Greenberg and Goren, 2009), and textual evidence documents a number of non-Semitic, Hurrian (from the northeast of the ancient Near East) personal names in the 2nd millennium BCE, for example in the Amarna archive of the 14th century BCE (Na’aman, 1994b). We therefore reasoned that the Chalcolithic Zagros component might have arrived into the Southern Levant through the Caucasus (and even more proximately the northeastern areas of the ancient Near East, although we have no ancient DNA sampling from this region). This movement might not have been limited to a short pulse, and instead could have involved multiple waves throughout the Bronze Age.


Post by Admin on Jun 1, 2020 20:45:49 GMT

To test whether the origin of the gene flow was from the Caucasus, rather than directly from the Zagros region, we ran qpAdm, replacing Iran_ChL with Early Bronze Age Caucasus (Armenia_EBA). We found that the Caucasus model received similar support to that of the Zagros model (Table S2; Methods S1E). Next, we modeled Armenia_EBA as a mixture of an earlier Caucasus population (Chalcolithic Armenia, Armenia_ChL) and Iran_ChL and found that indeed Armenia_EBA is compatible with this model (Table S2). Altogether, we conclude that our data are also compatible with a model in which Zagros-related ancestry in the Southern Levant arrived through the Caucasus, either directly or via intermediates.

To study the timing of the admixture of Zagros-related ancestry in the Southern Levant, we leveraged the large time span of individuals in our dataset, extending across roughly 1,500 years, from the Intermediate Bronze Age to the Early Iron Age. Using qpAdm-based ancestry estimates for each of the individuals, we found that almost all are compatible with being an admixture of groups related to the Neolithic Levant and Chalcolithic Zagros. One exception is an individual in Megiddo_MLBA who is only weakly compatible with the model. The other exceptions are three individuals in the Baq‛ah (Table S2), which suggests that the difficulty in modeling individuals from this site as a mixture of Neolithic Levant and Chalcolithic Zagros might reflect ancestry heterogeneity (Figure 3). These results do not change qualitatively when we use a larger set of outgroup populations (Table S2). We observed that the oldest individuals in our collection, from the Intermediate Bronze Age, already carried significant Zagros-related ancestry, suggesting that gene flow into the region started before ca. 2400 BCE. This is consistent with the hypothesis that people of the Kura-Araxes archaeological complex of the 3rd millennium BCE might have affected the Southern Levant not only culturally, but also through some degree of movement of people. Our data also imply an increase in the proportion of Zagros-related ancestry after the Intermediate Bronze Age, as reflected in a significantly positive slope in a linear regression of the Chalcolithic-Zagros-related ancestry over the calendar year (Jackknife), amounting to an increase of ∼14% per thousand years (Figures 4 and S2A). However, we caution that the number of individuals and their time span are insufficient to determine whether the increase in the Zagros-related ancestry happened continuously during the Middle and Late Bronze Ages, or whether there were multiple distinct migration events.
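The regression with a jackknife error estimate can be sketched as follows. This is a simplified Python illustration, assuming an ordinary least-squares fit and a delete-one-individual jackknife; the dates (BCE written as negative calendar years) and ancestry fractions are invented, and the function names are hypothetical.

```python
import math


def ols_slope(xs, ys):
    """Slope of the ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def jackknife_slope(xs, ys):
    """Return (slope, jackknife standard error), deleting one individual at a time."""
    n = len(xs)
    slope = ols_slope(xs, ys)
    leave_one_out = [
        ols_slope(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]) for i in range(n)
    ]
    mean = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((s - mean) ** 2 for s in leave_one_out)
    return slope, math.sqrt(variance)


# Invented example: calendar year (negative = BCE) and Zagros-related fraction.
years = [-2400, -2100, -1800, -1600, -1400, -1200]
fractions = [0.42, 0.47, 0.49, 0.55, 0.53, 0.60]

slope, stderr = jackknife_slope(years, fractions)
# Slope is per year; multiply by 1000 for change per thousand years.
print(slope * 1000, stderr * 1000, slope / stderr)
```

A slope of 0.00014 per year corresponds to the roughly 14% per thousand years quoted in the text; the ratio of slope to standard error gives a z score from which a p value can be derived.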

Figure 4. Temporal Changes in the Genetic Makeup of Individuals in the Bronze and Iron Age Levant

Fraction of Chalcolithic-Iran-related component in each individual as computed by qpAdm, modeling each individual as a mixture of Neolithic Levant and Chalcolithic Iran and using the o9a outgroup set (o9 + Anatolia_N). Vertical error bars denote one standard error in each direction. Horizontal error bars denote estimated time ranges. Dashed line describes the linear regression. Only individuals whose time range does not exceed 250 years are plotted and used in the regression. Note that the two well-dated ASH_LBA individuals happen to harbor the highest Iran_ChL component.

See also Figures S2 and S3 and Tables S2 and S3.

Figure S2. Fraction of Chalcolithic Iran-Related Component in Each Individual as Computed by qpAdm, Related to Figure 4

Modeling each individual as a mixture of Neolithic Levant and Chalcolithic Iran, and using either (A) the o9aensw outgroup set (o9 + Anatolia_N + EHG + Natufian + Switzerland_HG + WHG) or (B) the o9a outgroup set (o9 + Anatolia_N). Vertical error bars denote one standard error. Horizontal error bars denote estimated time ranges. Dashed line describes the linear regression. Only individuals whose time range does not exceed 250 years are plotted and used in the regression.

The two outliers from Megiddo (three including the sibling pair) provide additional evidence for the timing and origin of gene flow into the region. The three were found in close proximity to each other at Level K-10, which is radiocarbon dated to 1581–1545 BCE (domestic occupation) and 1578–1421 BCE (burials; both ± 1σ) (Martin et al., 2020, Toffolo et al., 2014), whereas the bone of one of the three (I10100) was directly dated (1688–1535 BCE, ± 2σ). The reason these individuals are distinct from the rest is that their Caucasus- or Zagros-related genetic component is much higher, reflecting ongoing gene flow into the region from the northeast (Table S2; Figure S2B). The Neolithic Levant component is 22%–27% in I2200, and 9%–26% in I10100. These individuals are unlikely to be first-generation migrants, as strontium isotope analysis on the two outlier siblings (I2189 and I2200) (Methods S1A) suggests that they were raised locally. This implies that the Megiddo outliers might be descendants of people who arrived in recent generations. Direct support for this hypothesis comes from the fact that in sensitive qpAdm modeling (including closely related sets of outgroups), the only working northeast source population for these two individuals is the contemporaneous Armenia_MLBA, whereas the earlier Iran_ChL and Armenia_EBA do not fit (Table S2). The addition of Iran_ChL to the set of outgroups does not change this result or cause model failure. Finally, no other Levantine group shows a similar admixture pattern (Table S2). This shows that some level of gene flow into the Levant took place during the later phases of the Bronze Age and suggests that the source of this gene flow was the Caucasus.

Altogether, our analyses show that gene flow into the Levant from people related to those in the Caucasus or Zagros was already occurring by the Intermediate Bronze Age, and that it lingered, episodically or continuously, at least in inland sites, during the Middle-to-Late Bronze Age.


Post by Admin on Jun 2, 2020 19:14:01 GMT

Further Change in Levantine Populations Since the Bronze Age
To develop a sense of population changes in the Levant since the Bronze Age, we attempted to model groups that have a tradition of descent from ancient people in the region (Jews) as well as Levantine Arabic-speakers as mixtures of various ancient source populations. qpAdm assumes no admixture between groups related to the outgroups and the source populations, but almost all present-day Levantine and Mediterranean populations have significant sub-Saharan-African-related admixture that the ancient groups did not. This eliminates many key outgroups for qpAdm and reduces the utility of the method in this context. In particular, we were not able to apply qpAdm to get a single working model for the majority of present-day West Eurasian populations. As an alternative, we developed a methodology we call LINADMIX, which relies on the output of ADMIXTURE (Alexander et al., 2009) and uses constrained least-squares to estimate the contribution of given source populations to a target population (see STAR Methods). As a complementary approach, we developed a tool we call pseudo-haplotype ChromoPainter (PHCP), which is an adaptation of the haplotype-based method ChromoPainter (Lawson et al., 2012) to ancient genomes (see STAR Methods; Methods S1F). We first established that these methods provide meaningful estimates of ancestry in the context of this study by using them to re-compute the ancestry proportions that we were able to model with qpAdm. Both LINADMIX and PHCP (Table S3; Figure S3; Methods S1F) produce qualitatively similar estimates as qpAdm (Table S2). To further establish the methods, we performed simulations that were designed to test the methods’ abilities to infer ancestry proportions in present-day populations in a setup similar to the current study (Methods S1H). For this, we generated present-day populations as a mixture of two closely related ancient populations with and without a third, more distant, population. 
Both methods estimated the ancestry proportion of the distant source population with errors of up to 4% and the proportions of the closely related source populations with errors of up to 10%. Thus, although ADMIXTURE, the basis of LINADMIX, is known to have certain pitfalls as a tool for quantifying ancestry proportions (Lawson et al., 2018), in the case of individuals with ancestry sources similar to those we have analyzed here, our results suggest that both LINADMIX and PHCP are highly informative.
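The idea behind LINADMIX, constrained least squares on ADMIXTURE output, can be illustrated with a small Python sketch. This is not the authors' implementation (see STAR Methods for that). It assumes that each population is summarized by its mean vector of ADMIXTURE component fractions, and that the mixing weights are constrained to be non-negative and to sum to one. The solver, the function name `simplex_least_squares`, and the toy vectors are all hypothetical. Because only a handful of sources are involved, the sketch solves the problem exactly by trying every subset of sources.

```python
from itertools import combinations

import numpy as np


def simplex_least_squares(sources, target):
    """Minimize ||sources @ w - target|| subject to w >= 0 and sum(w) == 1.

    sources: (K, S) array, one column of K ADMIXTURE fractions per source.
    target:  length-K vector of ADMIXTURE fractions for the target population.
    Returns (weights, norm_of_residuals).
    """
    sources = np.asarray(sources, dtype=float)
    target = np.asarray(target, dtype=float)
    n_sources = sources.shape[1]
    best_w, best_res = None, np.inf
    for size in range(1, n_sources + 1):
        for subset in combinations(range(n_sources), size):
            sub = sources[:, subset]
            # KKT system for least squares with the sum-to-one constraint.
            kkt = np.zeros((size + 1, size + 1))
            kkt[:size, :size] = 2.0 * sub.T @ sub
            kkt[:size, size] = 1.0
            kkt[size, :size] = 1.0
            rhs = np.append(2.0 * sub.T @ target, 1.0)
            w = np.linalg.lstsq(kkt, rhs, rcond=None)[0][:size]
            # Discard candidates that violate the constraints.
            if np.any(w < -1e-9) or abs(w.sum() - 1.0) > 1e-6:
                continue
            res = np.linalg.norm(sub @ w - target)
            if res < best_res - 1e-12:
                best_w = np.zeros(n_sources)
                best_w[list(subset)] = np.clip(w, 0.0, None)
                best_res = res
    return best_w, best_res


# Toy example with three ADMIXTURE components and three sources (invented).
src = np.array([
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]).T  # columns are sources
tgt = 0.5 * src[:, 0] + 0.3 * src[:, 1] + 0.2 * src[:, 2]

weights, residual = simplex_least_squares(src, tgt)
print(weights, residual)
```

The returned residual plays the role of a goodness-of-fit measure for the model; a large value signals that the chosen sources cannot reproduce the target's ADMIXTURE profile.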

Figure S3. LINADMIX and PHCP Results on Ancient Populations, Related to Figure 4 and to STAR Methods, LINADMIX, PHCP

(A) Comparison between qpAdm and LINADMIX. Bars show the fraction of Neolithic Levant in different populations. Error bars show one standard deviation.

(B) LINADMIX of individual samples. Bars show the fraction of Neolithic Levant in different individuals, when the other source population is either Iran_ChL or Armenia_EBA.

(C) Fraction of Levant_N in different ancient populations (when the other source population is Iran_ChL) as computed by LINADMIX and PHCP.

For the LINADMIX analysis of present-day populations, we used a background dataset of 1,663 present-day and ancient individuals from 239 populations genotyped by using SNP arrays and focused our analysis on 14 Jewish and Levantine present-day populations, along with modern English, Tuscan, and Moroccan populations that were used as controls. We used LINADMIX to model each of the 17 present-day populations as an admixture of four sources: (1) Megiddo_MLBA (the largest group) as a representative of the Middle-to-Late Bronze Age component; (2) Iran_ChL as a representative of the Zagros and the Caucasus; (3) present-day Somalis as representatives of an Eastern African source (in the absence of genetic data on ancient populations from the region); and (4) Europe_LNBA as a representative of ancient Europeans from the Late Neolithic and Bronze Age (Methods S1I; Table S4; Figure S4). We also applied PHCP to these 17 present-day populations (Methods S1G; Table S4; Figure S4). Comparison of PHCP and LINADMIX shows that they agree well with respect to the Somali and Europe_LNBA components, and therefore also for the combined contribution of Iran_ChL and Megiddo_MLBA (Methods S1G; Figure S4). However, they deviate regarding the respective contributions of Iran_ChL and Megiddo_MLBA (Figure S4), likely because Megiddo_MLBA and Iran_ChL are already very similar populations (Table S3). To only consider results that are robust and shared by LINADMIX and PHCP, we have combined Megiddo_MLBA and Iran_ChL into a single source population representing the Middle East for our main results (Figure 5). We further verified these conclusions, as well as the robustness of the estimations, by using a different representative for the Bronze Age Levantine groups as a source (Tables S4 and S5; Methods S1J) and using perturbations to the ADMIXTURE parameters (Table S4; Methods S1K). 
Combined, these results suggest that modern populations related to the Levant are consistent with having a substantial ancestry component from the Bronze Age Southern Levant and the Chalcolithic Zagros. Nonetheless, other potential ancestry sources are possible, and more ancient samples might enable a refined picture (Table S4).

Figure S4. LINADMIX and PHCP models on modern populations, Related to Figure 5

(A) The contribution of each of the source populations to the examined present-day populations, using LINADMIX and PHCP.

(B) LINADMIX and PHCP relative contribution of each of the source populations to the present-day target population listed on the x axis.

Figure 5. Estimated Fractions Contributed by Different Ancient Populations to Present-Day Groups

Seventeen present-day populations were modeled as an admixture of groups related to four source populations. The upper graphic shows the norm of residuals of the models, and the lower graphic shows the relative contribution of each of the source populations to the present-day target population listed on the x axis.

(A) LINADMIX.

(B) PHCP.

See also Figure S4 and Tables S4 and S5.

The results show that since the Bronze Age, an additional East-African-related component was added to the region (on average ∼10.6%, excluding Ethiopian Jews who harbor ∼80% East African component), as well as a European-related component (on average ∼8.7%, excluding Ashkenazi Jews who harbor a ∼41% European-related component). The East-African-related component is highest in Ethiopian Jews and North Africans (Moroccans and Egyptians). It exists in all Arabic-speaking populations (apart from the Druze). The European-related component is highest in the European control populations (English and Tuscan), as well as in Ashkenazi and Moroccan Jews, both having a history in Europe (Atzmon et al., 2010, Carmi et al., 2014, Schroeter, 2008). This component is present, although in smaller amounts, in all other populations except for Bedouin B and Ethiopian Jews. As expected, the English and Tuscan populations have a very low Middle-Eastern-related component. Whereas LINADMIX and PHCP have high uncertainty in estimating the relative contributions of Megiddo_MLBA and Iran_ChL, the results and simulations nevertheless suggest that additional Zagros-related ancestry has penetrated the region since the Bronze Age (Methods S1I). Except for the populations with the highest Zagros-related component, PHCP estimates lower magnitudes of this component (Figure S4A), and therefore detection by PHCP of a Zagros-related ancestry is likely an indication of the presence of this component. Indeed, examining the results of LINADMIX and PHCP on all four source populations (Figure S4), we observe a relatively large Zagros-related component in many Arabic-speaking groups, suggesting that gene flow from populations related to those of the Zagros and Caucasus (although not necessarily from these specific regions) continued even after the Iron Age (Methods S1I).

Altogether, the patterns of the present-day populations reflect demographic processes that occurred after the Bronze Age and are plausibly related to processes known from the historical literature (Methods S1I). These include an Eastern-African-related component that is present in Arabic-speaking groups but is lower in non-Ethiopian Jewish groups, as well as Zagros-related contribution to Levantine populations, which is highest in the northernmost population examined, suggesting a contribution of populations related to the Zagros even after the Bronze and Iron Ages.