Post by Admin on Apr 26, 2022 21:24:54 GMT
Admixture in Roma and South Asian origin of the proto-Roma population
Admixture events that have shaped the genetic composition of the Roma population were inferred with GLOBETROTTER. For all European Roma clusters, a “one-date” type of admixture event (a single admixture date between two sources) was detected involving two sources: a West Eurasian-like major source and a South Asian-like minor source, around 1270–1580 (S3 Table, Fig 2, Table 1). This interval of admixture dates overlaps with the period when the first historical records report the presence of Roma groups in each European country, although these records represent the lower limits for the actual first Roma settlements. In general, Roma from the surroundings of the Balkan Peninsula and Central Europe (RomaMix-1, RomaMix-2, RomaMix-3, RomaUkr) have earlier admixture dates (Table 1), which supports the dispersion into Europe via the Balkans [15].
Fig 2. West Eurasian and South Asian ancestry of the European Roma (Dataset1) from GLOBETROTTER results. Pie charts on the map show the geographic location of the donor populations. Grey diamonds display those samples that do not contribute to the Roma ancestry. For each Roma cluster, the major source (West Eurasian like) and minor source (South Asian like) are shown: the proportion (in percentage) of each source and a horizontal bar with the proportions of the donor populations in each source. Only donor groups that contribute a minimum of 3% to the Roma genomes are shown. 1000G population labels are used in the map for ITU (Indian Telugu from the UK), STU (Sri Lankan Tamil from the UK), BEB (Bengali from Bangladesh), PJL (Punjabi from Lahore, Pakistan).
Regarding the South Asian-like source, it contributes around 35% to the admixture and its most representative cluster is Punjabi-1, from Northwestern India (Fig 2, S3 Table). Although Punjabis have a linguistically uniform identity [23], they are genetically heterogeneous.
In fact, Punjabi samples do not cluster together; instead, they are spread along PC2 (S1 Fig), as well as in the fineSTRUCTURE dendrogram (S3 Fig), with three different Punjabi clusters with increasing levels of ANI component (S5 Fig, S4 and S5A Tables). Thus, most of the South Asian ancestry of the Roma is shared with the group of individuals from Punjab with less West Eurasian component (Punjabi-1, S3 Table). The rest of the South Asian surrogates identified in the minor source correspond to southeastern Dravidian-speaking populations (E-India, Irula clusters) (Fig 2, S3 Table), which also exhibit low levels of West Eurasian ancestry (S5 Fig, S5A Table). Altogether, these findings suggest that the most likely proxy for the South Asian origin of the proto-Roma is the ancestral source here described as a mixture of present-day South Asian groups with a low West Eurasian signature.
Post by Admin on Apr 27, 2022 6:32:28 GMT
Recent West Eurasian admixture
The West Eurasian-like source contributes around 65% to the admixture event. This component captures the recent West Eurasian admixture between the proto-Roma and West Eurasians during their diaspora from India to Europe; in other words, it does not include the AWE component present in South Asian populations (S1 Note, S6 Fig), estimated to be around 15% (S5B Table). This recent West Eurasian ancestry is lower in the Roma groups from the Balkan Peninsula and Central Europe (RomaMix-1 and RomaMix-2), around 60%, and it increases up to 80% (RomaIberia-2) as the distance from the Balkans increases (Fig 2, S3 Table).
The main contribution of this major source is from southeastern European clusters (Balkan-1 and Balkan-2), with this area being the historically reported gateway of the Roma groups into Europe [1]. The component from Middle East and Caucasian clusters was found to be moderate in the Roma groups. Besides these two components, additional distinct European ancestries are detected in the Northwestern Roma groups from the Baltic (Estonia-Lithuania) and Iberia (Spain-Portugal). Specifically, while RomaBalt cluster shows a northeastern European component (NE-Europe1 cluster), RomaIberia-1 and RomaIberia-2 contain a southwestern European component (SW-Europe1 and SW-Europe2) each. This result indicates that, in the Roma groups that migrated to Northern and Southwestern Europe, the Balkan component left a footprint still clearly detectable today, though having been highly reconfigured by admixture in the Baltic region and the Iberian Peninsula, respectively (Fig 2, S3 Table).
Regarding the Iberian Roma, the samples constitute two highly differentiated clusters (RomaIberia-1 and RomaIberia-2) not found elsewhere, which suggests a deep genetic substructure within the Roma settled in Iberia (Figs 1 and 2, S3 Table).
Sex-biased gene flow
As mentioned above, the European Roma ancestry contains two main sources: the West Eurasian (European and MiddleEast-Caucasus) and the South Asian components. However, these ancestry proportions differ significantly when comparing the X chromosome to the autosomes: the South Asian ancestry is significantly higher in the X chromosome, while the MiddleEast-Caucasus proportion is significantly higher in the autosomes (S6 Table, S7 Fig). These results point to a sex-biased admixture during the Roma diaspora, likely characterized by a higher influx of non-Roma males than females from the Middle East and Caucasus. The proportions of European ancestry contained in the autosomes and the X chromosome are similar, although RomaBalt, RomaIberia-1, RomaIberia-2 and RomaMix-4 show higher levels of European ancestry in the autosomes. These findings may also indicate different sex-biased gene flow processes in the European Roma groups, which might be the result of different social patterns among groups. Future studies with mtDNA and Y-chromosome data could add further insights into these results, as well as into sex-specific fertility inheritance processes in the Roma population [24].
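The logic behind comparing the X chromosome with the autosomes can be sketched with the standard single-pulse approximation: with an equal sex ratio, females carry two thirds of the X chromosomes but only half of the autosomes, so an ancestry that arrived mostly through males is under-represented on the X. The snippet below is only an illustration of that reasoning, not the analysis performed in the study; the function name and the input values are hypothetical and are not taken from S6 Table.

```python
def sex_specific_contributions(autosomal, x_chrom):
    """Solve A = (f + m) / 2 and X = (2f + m) / 3 for f and m.

    autosomal, x_chrom: ancestry proportions (0..1) from one source population,
    measured on the autosomes and on the X chromosome.
    Returns (female, male): the share of admixing females and of admixing males
    that came from that source.
    Assumption: a single admixture pulse and an equal sex ratio.
    """
    female = 3 * x_chrom - 2 * autosomal
    male = 4 * autosomal - 3 * x_chrom
    return female, male


# Hypothetical values: a source that is rarer on the X than on the autosomes.
female, male = sex_specific_contributions(autosomal=0.20, x_chrom=0.16)
print(f"female contribution: {female:.2f}, male contribution: {male:.2f}")
```

With these made-up inputs the male contribution comes out larger than the female one, which is the pattern the text describes for the MiddleEast-Caucasus component. Real data would call for a model that accounts for continuous gene flow and drift.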
Roma demographic patterns
To investigate the effective population size (Ne) dynamics, we have estimated the Ne of each Roma group and the ancestry-specific Ne. On the one hand, all Roma groups show a long uninterrupted Ne decrease followed by an increase of Ne (without reaching the levels of the NorthItaly cluster, which we used as a European reference) (S8 Fig). The change of the Ne trend is slightly correlated with the start of the admixture in each Roma group (S9 Fig), which might point to the gradual settlement of the Roma population in Europe. On the other hand, we inferred Ne through time for the three ancestral Roma source populations (European, MiddleEast-Caucasus and SouthAsian), focusing on their Ne before the admixture: 34 generations ago (g = 34), since the earliest lower bound of the confidence intervals (CI) inferred from GLOBETROTTER is found in RomaMix-2 at 1164 CE (S7A Table). The European Ne at g = 34 is 2.12 to 2.64 times higher than the South Asian Ne at g = 34, which is 1.27–1.43 times higher than the MiddleEast-Caucasus Ne at g = 34 (S7B Table). In contrast, the fold-change between the European and South Asian ancestry proportions is lower than 2 in all Roma groups (except RomaIberia-2 and RomaMix-4), and the fold-change between South Asian and MiddleEast-Caucasus ancestry proportions is higher than 1.5 in all Roma groups (S7C Table). These differences between the ancestry proportions and the ancestry-specific Ne could be explained by the fact that a small South Asian proto-Roma group of founders had a continuous gene flow with different non-related groups from the Middle East and Caucasus and different non-Roma European populations during their West Eurasian diaspora (see S4 Note).
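As a rough illustration of the two quantities compared above, the sketch below converts a number of generations into a calendar year and computes a fold-change between two estimates. The generation time (25 years) and the reference year (2014) are assumptions, chosen only because together they reproduce the correspondence between 34 generations and 1164 CE given in the text; the text does not state which values were actually used. The Ne values are placeholders, not figures from S7B Table.

```python
def generations_to_year(generations, years_per_generation=25, reference_year=2014):
    """Convert 'generations before the reference year' into a calendar year (CE).

    Both defaults are assumptions made for this illustration.
    """
    return reference_year - generations * years_per_generation


def fold_change(numerator, denominator):
    """Ratio between two estimates, e.g. two ancestry-specific Ne values."""
    return numerator / denominator


# 34 generations ago under the assumed generation time and reference year.
year = generations_to_year(34)

# Placeholder Ne values (NOT from S7B Table), only to show the comparison.
ne_european, ne_south_asian = 2400.0, 1000.0
print(year, fold_change(ne_european, ne_south_asian))
```

The same `fold_change` helper can be applied to ancestry proportions, which is the comparison the paragraph draws: a fold-change in Ne that exceeds the fold-change in ancestry proportion for the same pair of sources.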
Runs of homozygosity (ROH) were computed to assess the levels of inbreeding and the degree of genetic isolation in the Roma groups. In general, the mean ROH length of the Roma groups is significantly higher than the mean of the non-Roma reference Balkan-2 and Punjabi-1 clusters. For all ROH length categories, Roma groups present values similar to those of the Kalash (S10 Fig, S8A Table), which is known to be a highly inbred population [25], possibly due to genetic isolation, although the degree of their isolation is debated [26,27]. The average ROH lengths of the Roma maintain high values after a first significant decrease between the first and the second categories (1–2 and 2–3 Mb, respectively) (S8B Table), which suggests that the inbreeding signals of Roma are the result of a continuous, although decreasing, level of isolation, from historical to recent times. Furthermore, the Roma groups with more West Eurasian ancestry (RomaIberia-2 and RomaMix-4) are the clusters with the lowest mean ROH length values across all categories (S10 Fig). Thus, these results provide additional evidence of a degree of heterogeneity within Roma from the Iberian Peninsula that needs to be further investigated.
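The ROH comparison above relies on grouping segments into length categories (1–2 Mb, 2–3 Mb, and so on) and summarising each category. A minimal sketch of that bookkeeping follows; the 1 Mb-wide bins, the 1 Mb minimum, the function name and the segment lengths are all assumptions made for illustration, since the exact category definitions are in the supplementary tables rather than in the text.

```python
from collections import defaultdict


def mean_roh_length_by_category(lengths_mb, min_mb=1.0):
    """Group ROH segment lengths (in Mb) into 1 Mb-wide bins and average each bin.

    Segments shorter than min_mb are ignored. The bin width and the minimum
    length are illustrative assumptions, not values taken from the study.
    """
    bins = defaultdict(list)
    for length in lengths_mb:
        if length < min_mb:
            continue
        lower = int(length)  # 1.8 Mb falls in the 1-2 Mb category
        bins[lower].append(length)
    return {
        f"{lower}-{lower + 1} Mb": sum(values) / len(values)
        for lower, values in sorted(bins.items())
    }


# Hypothetical segment lengths for one individual (not real data).
segments = [0.6, 1.2, 1.8, 2.5, 4.1, 4.9]
for category, mean_length in mean_roh_length_by_category(segments).items():
    print(category, round(mean_length, 2))
```

Longer categories reflect more recent shared ancestry between parents, which is why the text reads persistently high values across categories as a sign of continued isolation.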
Post by Admin on Apr 27, 2022 17:09:55 GMT
Iberian Roma genetic characterization
Iberian Roma substructure. To further explore the genetic structure of the Iberian Roma population, we included 34 newly genotyped Roma samples from the Iberian Peninsula (Dataset2, S9 Table, see Material and Methods). These samples fall between European and South Asian populations in the PCA (S11 Fig), the ADMIXTURE analysis (S12 Fig), and the fineSTRUCTURE dendrogram (S13 Fig), in agreement with the above results using European Roma individuals (Dataset1). Iberian Roma samples were classified into four different genetic clusters (S14 Fig). Although the four Iberian Roma groups are only partially clustered by geography, different patterns are discerned: IberianRoma-1 and IberianRoma-2 contain samples from the northeastern region of the Iberian Peninsula, IberianRoma-3 is restricted to the south, and IberianRoma-4 is mainly formed by samples from the northwestern region (S14A and S14B Fig). As shown in the dendrogram (S14B Fig), IberianRoma-4 is the most significantly differentiated group (p < 0.001) (S15 Fig, S10 Table). Analogously to the Dataset1 dendrogram (S3 Fig), the non-Roma reference samples were classified into 83 clusters, which can be summarized in four large super-groups (MiddleEast-Africa, Europe, MiddleEast-Caucasus, and Central-SouthAsia) (S13 Fig).
Recent West Eurasian admixture in Iberian Roma groups. Admixture events in the Iberian Roma clusters were inferred with GLOBETROTTER. As shown above for the general Roma groups (Dataset1), one admixture event between a West Eurasian-like major source and a South Asian-like minor source was detected in each of the four Iberian Roma groups (S11 Table). The date intervals (95% CI) of the inferred admixture event for each Iberian Roma cluster are: 1210–1557 (IberianRoma-1), 1241–1536 (IberianRoma-2), 1279–1583 (IberianRoma-3), and 1532–1730 (IberianRoma-4), the latter having the most recent dates (S16 Fig).
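The four date intervals listed above can be compared with a few lines of code. The sketch below uses the interval midpoint as a crude stand-in for the point estimate (the actual point estimates are in S11 Table and are not reproduced here) and checks which intervals overlap; the helper names are hypothetical.

```python
# 95% CI of the inferred admixture date (years CE), as listed in the text.
intervals = {
    "IberianRoma-1": (1210, 1557),
    "IberianRoma-2": (1241, 1536),
    "IberianRoma-3": (1279, 1583),
    "IberianRoma-4": (1532, 1730),
}


def midpoint(interval):
    """Centre of an interval; a simplification, not the GLOBETROTTER estimate."""
    low, high = interval
    return (low + high) / 2


def overlaps(a, b):
    """True if two closed intervals share at least one year."""
    return a[0] <= b[1] and b[0] <= a[1]


most_recent = max(intervals, key=lambda name: midpoint(intervals[name]))
print("most recent interval:", most_recent)

for name, interval in intervals.items():
    if name != most_recent:
        print(name, "overlaps", most_recent, "->", overlaps(interval, intervals[most_recent]))
```

Read this way, IberianRoma-4 has the latest interval, yet its lower bound (1532) still falls inside the other three intervals, so the later date is a shift rather than a clean separation.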
Regarding the minor source, the most contributing clusters are Punjabi-1, E-India, NE-India and Irula (S11 Table), as observed in Dataset1, which fits the hypothesis that the Roma origin can be placed in a group of South Asian individuals with low West Eurasian ancestry. The West Eurasian-like source mainly consists of Balkan and Southwestern European clusters (SW-Europe2, SW-Europe3, and Basque) and, to a lesser extent, Middle Eastern and Caucasian populations (Egypt-Bedouin, W-Caucasus2, and Georgia) (Fig 3, S14C Fig, S11 Table), which reinforces the evidence of the three main foci of migration of the Iberian Roma: their way out from Northwestern India, the entrance into Europe from the Balkan Peninsula, and the arrival into the Iberian Peninsula. Although the surrogate populations involved in the admixture event of the four Iberian Roma groups are similar, some differences can be appreciated. IberianRoma-4, as mentioned above, is the most differentiated group, and GLOBETROTTER results suggest that this is due to the different source and proportion of European ancestry: first, the contribution of Southwestern European clusters is higher than in the rest of the Iberian Roma clusters; and second, other European clusters (NorthItaly, E-Europe2, and NW-Europe2) are also identified, but they are absent in the rest of the Iberian Roma groups (Fig 3, S14C Fig, S11 Table). The inferred IberianRoma-4 admixture event is the only one that contains Balkan and Middle East surrogates in the minor source, possibly as a result of its high non-Roma European ancestry (S11 Table). Moreover, IberianRoma-3 exhibits some degree of Northwest African admixture (~1%), probably due to its southern location in the Iberian Peninsula (S14C Fig, S11 Table), where historically the North African gene flow into the general Iberian population was more relevant [28,29]. Besides, IberianRoma-3 is also the group with the most NE-Europe2 ancestry (~2%) (S14C Fig, S11 Table).
IberianRoma-2 contains exclusively Roma samples from the Basque country and, accordingly, it shows the highest non-Roma Basque ancestry. Altogether, these results confirm the presence of genetic substructure and differential admixture within the Iberian Roma population, revealing four distinct patterns of spatial distribution (Fig 3), and, furthermore, reject a putative North African origin of the Iberian Roma groups [30].
Fig 3. West Eurasian ancestry of the Iberian Roma (Dataset2). Kriging model of the spatial distribution of the major source donor proportion inferred with GLOBETROTTER, reflecting the West Eurasian ancestry proportions in each Roma group (A-D).
Demographic patterns in Iberian Roma. Overall, Iberian Roma show a significantly higher mean ROH length than the non-Roma reference European populations (Basque and SW-Europe2) and the Punjabi-1 cluster. At larger ROH length categories, Iberian Roma present higher values than the Kalash (S17 Fig, S12A Table). In addition, some specific trends can be recognized in the Iberian Roma groups. Namely, the progressive decline of ROH length in IberianRoma-4 is significantly different from that of the rest of the Roma groups and mirrors that of SW-Europe2, with no significant differences between the two (S12A Table). On the other hand, IberianRoma-2 exhibits a sudden decrease of ROH length at the 4–5 Mb ROH category, although the differences are not significant, probably due to its low sample size (S12B Table), while IberianRoma-1 and IberianRoma-3 show high levels of inbreeding (significant p-values only between the 1–2 Mb category and the rest of the ROH length categories), suggesting different degrees of relatedness in the Iberian Roma groups. The Ne estimations through time in each Iberian Roma group are lower than the ones from SW-Europe2, and a constant Roma Ne reduction is detected from around 750 to 1600 (S18 Fig). This Ne reduction trend is reversed after the admixture event inferred by GLOBETROTTER.
These results agree with the ones obtained for Dataset1, which contains all European Roma groups.
Post by Admin on Apr 27, 2022 18:40:30 GMT
Discussion
The demographic history of the Roma population is characterized by a series of bottlenecks and admixture events that have occurred since the proto-Roma left India, after their arrival in the Balkans and spread throughout Europe, and in the case of Iberian Roma, after their settlement in the Iberian Peninsula. The study of their genetic profile in a worldwide context places them between South Asians and Europeans, which confirms previous findings of admixture [10,15,16]. A fine-scale approach has allowed us to distinguish the recent West Eurasian component, which is the result of the admixture with non-Roma West Eurasian populations. Our estimates of this recent West Eurasian component, around 65%, are lower than the previously reported 80% [16], as it only includes the “post-exodus from India” admixture and not the “pre-exodus from India” AWE component (around 15% based on the f4 ratio estimates). This recent West Eurasian component was acquired between 1270 and 1580. Although GLOBETROTTER infers this admixture as a single pulse event (“one-date”), it would require large datasets to distinguish continuous from single pulse admixture [31].
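The reconciliation between the roughly 65% reported here and the roughly 80% reported previously comes down to simple bookkeeping, sketched below. The snippet treats the AWE share (around 15%) as a fraction of the whole Roma genome, which is an assumption made for illustration; the variable names are hypothetical and the figures are the rounded values quoted in the text.

```python
# Rounded values quoted in the text, expressed as genome-wide fractions.
recent_west_eurasian = 0.65  # "post-exodus from India" admixture
south_asian_source = 0.35    # minor source inferred with GLOBETROTTER
awe_pre_exodus = 0.15        # AWE component; assumed here to be genome-wide

# Total West Eurasian-related ancestry if both layers are counted together.
total_west_eurasian = recent_west_eurasian + awe_pre_exodus

# South Asian source once the older West Eurasian-related layer is set aside.
south_asian_without_awe = south_asian_source - awe_pre_exodus

print(round(total_west_eurasian, 2), round(south_asian_without_awe, 2))
```

Under this reading the two published figures describe the same genomes and differ only in whether the older, pre-exodus West Eurasian layer carried by South Asian populations is counted.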
Regarding the origin of the proto-Roma population, Northwestern India has been previously proposed as the putative source of their South Asian ancestry [4,5]. Although it is a geographically well-defined area, its populations are socially, linguistically, and genetically heterogeneous, with high levels of stratification and substructure: they range from tribal clans to upper-caste groups, and from Dravidian to Indo-European speaking groups [32]. Our analyses show that they are dispersed along the PC with different admixture proportions (S1–S3 and S5 Figs). Within the boundaries of Northwestern India, the Punjab region has been further placed as the ancestral homeland of the proto-Roma, through different approaches: identity by descent (IBD) sharing analyses [16], Approximate Bayesian Computation models [15], and mitochondrial M lineages [10] and tau haplotype [33] comparisons between Roma and South Asians. However, the linguistic identity that characterizes the Punjabi population is independent of their historical origin and social designation [23]. Punjab is a strategic region that has suffered repeated invasions from different sources [32], which explains why it nowadays encompasses heterogeneous populations with differential admixture and ancestral components. We have shown that the Punjabi samples are genetically heterogeneous and mainly differ in the proportion of West Eurasian ancestry, further confirming previous results [7]. Our results add the indication that the original genetic composition of the proto-Roma seems nearest to that of the Punjabi cluster with the least West Eurasian admixture. Assuming that the individuals from this Punjabi cluster were already in Punjab when the rest of the Punjabi clusters admixed with West Eurasians, socio-historical factors might have determined their differential admixture.
In other words, this Punjabi cluster might derive from Punjabis who belonged to a lower caste group, since, in agreement with previous studies, Indian lower caste groups are characterized by less West Eurasian admixture [6,7]. In addition, we have reported that Dravidian-speaking populations with high ASI ancestry (i.e. E-India and Irula clusters) are also involved in the South Asian source of the Roma individuals. These two sources of South Asian ancestry could solve the contradiction between the identification of uniparental Roma lineages with a Northwestern Indian origin [11] and the high Y-STR haplotype sharing among Roma and South Indian populations [34], as these findings could be explained by two overlapping scenarios. The first one, first mentioned by Turner [4], posits a previous migration of nomadic groups into Northwestern India from Central India around 250 BCE, after which, following several centuries in Punjab with little external admixture, a single group of proto-Roma individuals left India. The second scenario refers to the fact that the genomes of present-day North Indians have more West Eurasian ancestry due to subsequent gene flow from West Eurasians after the proto-Roma left India [20], which explains the combination of populations with low West Eurasian ancestry identified in the South Asian Roma component. These two scenarios fit the idea that the Roma people descend from a single initial founder population [15].
After the exodus from India and during the diaspora through West Eurasia, the Roma population admixed with multiple non-Roma European, Middle Eastern and Caucasian groups. First, the European Roma ancestors arrived in Armenia through Persia [1]. Our results agree with a moderate Middle East and Caucasus gene flow during a rapid migration across this territory [15]; specifically, we detect a higher rate of male gene flow, which could be related to the incorporation of Persian nomadic groups into the Roma [1]. Then, historical records suggest that, in Armenia, they followed the same route as the displaced Armenians towards Anatolia, due to the invasions of the Mongols and the Seljuqs (a Persian Muslim dynasty), from where they were pushed to the west until their entrance into Europe through the Thrace region in the Balkan Peninsula [35]. They settled in the Balkans for almost 200 years [35], where the Greek impact on the Romani language was much more extensive than the Persian [1]. Accordingly, we have identified the Balkan admixture footprint in the European Roma genomes with an ancestry gradient correlated with the distance to the Balkans: from 45% in Bulgarian, Greek, and Serbian Roma to 25% in Lithuanian, Estonian, and Iberian Roma, which is further evidence that the dispersion into Europe took place via the Balkans [15]. After subsequent migrations and dispersions across Europe, Roma groups reached Northeastern Europe (e.g. Lithuania and Estonia) and Southwest Europe (e.g. Iberian Peninsula), at the beginning of the 16th and 15th centuries, respectively [1]. Particularly in these groups, we have identified the Baltic and Iberian components besides the common Balkan component.
In relation to the demographic dynamics, we have shown that the Ne reduction of the Roma groups ceased after the start of the admixture event, which points to the settlement of Roma in Europe and the beginning of more intense assimilation policies during the seventeenth century [1]. The Ne estimates (as discussed in S3A Note) might reflect Ne changes in the Roma groups due to a population expansion or the non-Roma West Eurasian admixture. In addition, the levels of inbreeding in the Roma population are higher than in non-Roma Europeans and similar to those of South Asian groups, which could be the result of endogamy practices and/or multiple founder events.
In the Iberian Peninsula, Roma groups were well-accepted at their arrival, but at the end of the fifteenth century, with the unification of Castile and Aragon crowns, the nomad Roma groups were forced to become sedentary and suffered continuous persecutions [1]. As we remark, the present-day Iberian Roma exhibit high levels of non-Roma European ancestry, with an admixture event estimated around 1250–1600. Although GLOBETROTTER did not infer two independent admixture events as might be expected in the Iberian Roma, two different European footprints are identified: the Balkan and the non-Roma Iberian components. The detection of a single signal of admixture could be explained by a rapid expansion from the Balkans to the Iberian Peninsula, with a short time gap between the two events, or due to continuous gene flow between non-Roma Europeans and Roma groups during their migration within Europe. In fact, if the time ranges between two events are close, the ability of GLOBETROTTER to distinguish between two admixture pulses from a single pulse decreases [31].
Besides between-country heterogeneity, the present study further identifies within-country Roma substructure in the Iberian Peninsula, partially correlated with geography: two clusters are restricted to the northwestern and central part of the peninsula (IberianRoma-1 and IberianRoma-2), another cluster mainly represents Roma samples from the south (IberianRoma-3) and the last one contains all the northeastern individuals (IberianRoma-4). These groups differ both in ancestry proportions and inbreeding levels, which can be the result of different demographic patterns, as the different laws concerning the Roma people in the Iberian Peninsula were neither homogeneous nor permanent [1]. As mentioned above, IberianRoma-4 is the most differentiated cluster. It exhibits more non-Roma Iberian ancestry, the inferred date of its admixture event is the most recent one (1532–1730), and it presents the lowest inbreeding levels. Altogether this can be explained by the extensive admixture with the non-Roma Iberian population. In fact, historical records confirm that both nomadic and sedentary Roma groups in the Principality of Catalonia were highly linked and interrelated with the non-Roma society [36]. In addition, their European ancestral source contains groups from North Italy and Northwestern Europe that are absent in the rest of the Iberian Roma samples, which might point to either a later arrival in the Iberian Peninsula after admixing with these European populations or the constant movement of Roma groups between Southeastern France and Northeastern Spain [36]. The Iberian group representing the southernmost location, IberianRoma-3, has a genetic particularity: it has around 1% of Northwest African ancestry, which probably corresponds to the North African admixture found in the southern and western parts of the Iberian Peninsula, during the Arab expansion (711–1248) [28,29].
The fact that the North African component is only found in IberianRoma-3 samples, who also show Balkan ancestry, helps to reject the hypothesis of a Roma migration route to Iberia from North Africa [30]. IberianRoma-1 has more non-Roma Iberian component than IberianRoma-2, although these two clusters contain samples from the same region. These results highlight that, even within Roma groups who live in the same geographic region, distinct social dynamics (i.e. itinerant vs sedentary lifestyles) caused the application of different laws that might have shaped their current genetic landscape. On the contrary, some geographical patterns have probably been diluted due to the continuous movement and admixture among Roma groups, especially after 1749 with the general imprisonment of Spanish Romani, who were captured and relocated, although the effects of this event were not uniform throughout the Roma community, enabling the identification of present-day geographical patterns within Iberian Roma [37].
The present study attempts to characterize the European Roma and describe their South Asian and West Eurasian components using fine-scale methods. On the one hand, we have targeted the putative South Asian ancestry of the Roma in a specific group of Punjabi and Southeastern Indian individuals, representing a small group of proto-Roma founders with low levels of West Eurasian ancestry. Besides, our results show that the recent West Eurasian component (around 65% of the Roma genomes) was acquired between 1270 and 1580, during the Roma diaspora. Specifically, we have detected and characterized the Balkan genetic footprint in all European Roma groups and the Baltic and Iberian components in the Northern and Western Roma groups, respectively, likely due to a continuous non-Roma gene flow during their dispersal through Europe. On the other hand, we have found genetic substructure within the Iberian Roma, with different groups and different levels of non-Roma admixture, as a result of the complex historical events that occurred in the Peninsula. Further studies are needed to fully understand the genetic substructure of the Roma population as well as to provide new insights into the migration routes undertaken by the European Roma shaping their current genetic landscape. The use of migration group data (Balkan, Romungro and Vlax group assignation) would add an additional layer of information in both genome-wide and complete uniparental marker analyses, as it has been suggested that Roma genetic diversity might be primarily structured by migration route [11,12].
Post by Admin on Apr 28, 2022 18:04:42 GMT
Refining the South Asian Origin of the Romani people
Bela I. Melegh, Zsolt Banfai, Kinga Hadzsiev, Attila Miseta & Bela Melegh
BMC Genetics volume 18, Article number: 82 (2017)
Abstract
Background
Recent genetic studies based on genome-wide Single Nucleotide Polymorphism (SNP) data further investigated the history of Roma and suggested that the source of South Asian ancestry in Roma originates most likely from the Northwest region of India.
Methods
In this study, also based on genome-wide SNP data, we attempted to refine these findings using a significantly larger number of European Roma samples and an extended dataset of Indian groups, and by including Pakistani groups in the analyses. Our Roma data contained 179 Roma samples. Our extended Indian data consisted of 51 distinct Indian ethnic groups, which provided us with a higher resolution of the population living on the Indian subcontinent. We used principal component analysis and other ancestry estimating methods for the study of population relationships, several formal tests of admixture, and an improved algorithm for investigating shared IBD segments in order to investigate the main sources of Roma ancestry.
Results
According to our analyses, Roma showed significant IBD sharing of 0.132 Mb with Northwest Indian ethnic groups. The most significant IBD sharing involved ethnic groups of the Punjab, Rajasthan and Gujarat states. However, we also found significant IBD sharing of 0.087 Mb with ethnic groups living in Pakistan, such as Balochi, Brahui, Burusho, Kalash, Makrani, Pashtun and Sindhi.
Conclusion
Our results show that Northwest India could play an important role in the South Asian ancestry of Roma; however, the origin of Romani people might include the area of Pakistan as well.