Post by Admin on Dec 16, 2021 21:46:08 GMT
Contrasting Paternal and Maternal Genetic Histories of Thai and Lao Populations

Wibhu Kutanan, Jatupol Kampuansai, Metawee Srikummool, Andrea Brunelli, Silvia Ghirotto, Leonardo Arias, Enrico Macholdt, Alexander Hübner, Roland Schröder, Mark Stoneking

Molecular Biology and Evolution, Volume 36, Issue 7, July 2019, Pages 1490–1506, doi.org/10.1093/molbev/msz083
Published: 12 April 2019

Abstract

The human demographic history of Mainland Southeast Asia (MSEA) has not been well studied; in particular, there have been very few sequence-based studies of variation in the male-specific portions of the Y chromosome (MSY). Here, we report new MSY sequences of ∼2.3 Mb from 914 males and combine these with previous data for a total of 928 MSY sequences belonging to 59 populations from Thailand and Laos who speak languages belonging to three major Mainland Southeast Asia families: Austroasiatic, Tai-Kadai, and Sino-Tibetan. Among the 92 MSY haplogroups, two main MSY lineages (O1b1a1a* [O-M95*] and O2a* [O-M324*]) contribute substantially to the paternal genetic makeup of Thailand and Laos. We also analyze complete mitochondrial DNA genome sequences published previously from the same groups and find contrasting patterns of male and female genetic variation and demographic expansions, especially for the hill tribes, Mon, and some major Thai groups. In particular, we detect an effect of postmarital residence pattern on genetic diversity in patrilocal versus matrilocal groups. Additionally, both male and female demographic expansions were observed during the early Mesolithic (∼10 ka), with two later major male-specific expansions during the Neolithic period (∼4–5 ka) and the Bronze/Iron Age (∼2.0–2.5 ka). These two later expansions are characteristic of the modern Austroasiatic and Tai-Kadai groups, respectively, consistent with recent ancient DNA studies. We simulate MSY data based on three demographic models (continuous migration, demic diffusion, and cultural diffusion) of major Thai groups and find different results from mitochondrial DNA simulations, supporting contrasting male and female genetic histories.

Introduction

Thailand and Laos occupy a key location in the center of Mainland Southeast Asia (MSEA; fig. 1), which is undoubtedly one of the factors facilitating the extensive ethnolinguistic diversity, as there are 68 recognized groups in Thailand and 82 groups in Laos, belonging to five language families (Simons and Fennig 2018). The prehistoric peopling of the area of present-day Thailand and Laos has been documented by several archaeological studies (Shoocongdej 2006; Demeter et al. 2012; Higham 2014, 2017) and investigated further by recent ancient DNA studies (Lipson et al. 2018; McColl et al. 2018). The earliest presence of modern humans in SEA is dated to ∼50 ka (Higham 2013; Bae et al. 2017), followed by Paleolithic migration to East Asia ∼30 ka, inferred from genetic data (Yan et al. 2014; Hallast et al. 2015). There was also an expansion of Neolithic farmers and Bronze Age migrations from southern China to MSEA, which contributed to the present-day gene pool of modern MSEA people, for example, Thais and Laotians (Higham 2014, 2017; Lipson et al. 2018; McColl et al. 2018). Additional migrations during the historical period from neighboring countries (Penth 2000; Schliesinger 2000) have further enhanced ethnolinguistic diversity.

FIG. 1.
Post by Admin on Dec 17, 2021 2:49:00 GMT
The census size for Thailand was ∼68.41 million in 2017 and for Laos was ∼6.76 million in 2016 (Simons and Fennig 2018). There are five linguistic families distributed in these two countries. Although the Tai-Kadai (TK) language is widely spread in southern China and MSEA, it is concentrated in present-day Thailand and Laos as it is a major language spoken by Thais (90.5%) and Laotians (67.7%). Austroasiatic (AA) speakers are next most frequent, accounting for 4.0% in Thailand and 24.4% in Laos. In addition, this area is also inhabited by historical migrants who speak Sino-Tibetan (ST), Hmong-Mien (HM), and Austronesian languages (frequencies of 3.2%, 0.3%, and 2%, respectively, in Thailand; 3.1%, 4.8%, and 0% in Laos) (Simons and Fennig 2018).
It is generally thought that AA languages were brought to the Thai/Lao region by Neolithic farmers from southern China, whereas TK languages were brought by a later, Bronze Age migration, also from southern China (Bellwood 2018). The Neolithic expansion occurred ∼2–3 ka before the expansion of TK languages; thus, the AA people were thought to be present before the TK expansion. The TK migration during the Bronze Age could have occurred via either demic diffusion (an expansion of TK people that brought both their genes and their language) or cultural diffusion (a language spread with minor movement of people). A genetic study on the origin of TK people supports a southern Chinese origin (Sun et al. 2013), whereas our previous studies of mitochondrial DNA (mtDNA) genome sequences support demic diffusion as the best explanation for the origin of the present-day Thai/Lao TK groups, although there is a strong signal of admixture between TK and AA groups in central Thailand (Kutanan et al. 2017; Kutanan, Kampuansai, Brunelli, et al. 2018).

Although there is extensive ethnolinguistic diversity in the region, Thai/Lao populations can be generally categorized based on geography as either hill tribes or lowlanders. Nine ethnic groups, consisting of ∼700,000 people, are officially identified as hill tribes in Thailand: the AA-speaking Lawa, Htin, and Khmu; the HM-speaking Hmong and IuMien; and the ST-speaking Karen, Lahu, Akha, and Lisu. The Akha, Lisu, Hmong, IuMien, Lawa, and Khmu are strongly patrilocal (i.e., the wife moves to the residence of her husband after marriage), whereas the Lahu, Karen, and Htin are strongly matrilocal. The lowlanders are neither strongly patrilocal nor matrilocal (Schliesinger 2000, 2001; Penth and Forbes 2004).
Previous studies have reported an influence of postmarital residence pattern on genetic variation in northern Thai hill tribes, with lower within-population genetic diversity coupled with greater genetic heterogeneity among populations for patrilocal groups than for matrilocal groups for the male-specific portions of the Y chromosome (MSY), whereas the opposite pattern is observed for mtDNA (Oota et al. 2001; Besaggio et al. 2007). However, these previous studies compared genetic variation between partial mtDNA sequences (hypervariable regions of the control region) and Y chromosomal short tandem repeats (Y-STRs); it would be informative to investigate more complete genetic data from these groups.
The MSY are paternally inherited and exhibit lineages specific to populations/geographic regions, making the MSY an informative tool for reconstructing paternal genetic history and demographic change (Yan et al. 2014; Barbieri et al. 2016). However, to date, there have been few MSY studies of MSEA and almost all of them employed Y-STRs (Cai et al. 2011; Kutanan et al. 2011; Brunelli et al. 2017) and also defined haplogroups by genotyping assays, which are thus biased in terms of the haplogroups detected, and cannot uncover new sublineages. Analyzing partial sequences of the MSY and complete mtDNA genome sequences provides more insight into genetic history, especially sex-biased practices that can influence genetic variation, as well as the role of geography and language (Arias et al. 2018; Bajic et al. 2018; Kutanan, Kampuansai, Changmai, et al. 2018).
We have previously carried out comprehensive studies of the maternal genetic history of the Thai/Lao region, based on 1,823 complete mtDNA genome sequences (Kutanan et al. 2017; Kutanan, Kampuansai, Brunelli, et al. 2018; Kutanan, Kampuansai, Changmai, et al. 2018). To investigate the paternal genetic variation and demographic history, here we analyze ∼2.3 Mb of MSY sequence in a subset of the above individuals, comprising 928 sequences from 59 populations. We compare and contrast the MSY and mtDNA results, with a focus on the patrilocal versus matrilocal hill tribes, the AA-speaking versus TK-speaking groups, and the various geographic regions (northern Thailand, central Thailand, and northeastern Thailand and Laos). We also use demographic modeling to address the role of demic versus cultural diffusion versus admixture in the origins of the major TK groups in each Thai/Lao region and contrast the results based on the MSY to previous results based on mtDNA. Our MSY sequencing results provide new insights into the paternal genetic history of MSEA and indicate contrasting paternal and maternal histories in this region.
Post by Admin on Dec 17, 2021 19:30:25 GMT
Results

We generated 914 sequences of ∼2.3 Mb of the MSY, which, combined with 14 published sequences, brings the total to 928 MSY sequences belonging to 59 populations from Thailand and Laos (fig. 1 and supplementary table 1, Supplementary Material online). There are 816 haplotypes defined by 8,160 polymorphic sites, with mean coverages ranging from 4× to 109× (overall average coverage = 23×). Among the 928 MSY sequences, there are 92 specific haplogroups, belonging mostly to two main MSY lineages (O1b* and O2a*), that contribute substantially to the paternal genetic makeup of Thailand and Laos. There are several subclades of O1b*; the most frequent (50.54%) is O1b1a1a* or O-M95*, which occurs in almost half of the AA groups with a very high frequency (>70%), that is, KH1-KH2, KA, BU, BL, SU, TN1-TN3, MA, and LW3 (fig. 1 and supplementary table 2, Supplementary Material online). The correspondence analysis (based on haplogroup frequencies) also supports the divergence of these AA-speaking groups in agreement with the other results mentioned later, with many O1b* sublineages, for example, O1b1a1a1b1a (O-B426) and O1b1a1a1a1a (O-F2758) (supplementary fig. 1, Supplementary Material online). O2a* or O-M324* is the second most frequent haplogroup (25.86%) and has a relatively high frequency (>40%) in some AA and TK groups, and all ST-speaking Karen. Additional minor non-SEA-specific haplogroups were also observed, for example, haplogroup N*, found in the Lawa groups, and haplogroups R*, H*, and J*, which support associations between India and the Mon, and genetic connections between Mon and TK groups (fig. 1 and supplementary fig. 1, Supplementary Material online). Further details on haplogroup distribution are provided in supplementary table 2 and text, Supplementary Material online.

Genetic Diversity and Structure

Generally, the AA populations show lower genetic diversity values than the TK and ST groups for the MSY, in agreement with the mtDNA results (fig. 2A–C) (Mann–Whitney U tests between AA and TK for MSY: h: Z = 3.37, P < 0.01; mean number of pairwise differences [MPD]: Z = 2.40, P < 0.05; haplogroup diversity: Z = 3.74, P < 0.01 and for mtDNA: h: Z = 4.33, P < 0.01; MPD: Z = 1.47, P > 0.05; haplogroup diversity: Z = 4.37, P < 0.01). After the Maniq (MN), who have no MSY variation, and the Mlabri (MA), who have no mtDNA variation, the Htin (TN1), Lawa (LW3), and Bru (BU) show very low diversity values of MSY, whereas the Htin (TN1–TN3), Khmer (KH2), and Seak (SK) show low mtDNA diversity (fig. 2A–C). In contrast, the Mon (MO1–MO7) show higher levels of both MSY and mtDNA diversity than the other AA groups (Mann–Whitney U tests between AA and Mon for MSY: h: Z = −3.33, P < 0.01; MPD: Z = −3.30, P < 0.01; haplogroup diversity: Z = −3.75, P < 0.01 and for mtDNA: h: Z = −1.94, P > 0.05; MPD: Z = −2.03, P < 0.05; haplogroup diversity: Z = −2.79, P < 0.01). LW3 showed very low MSY haplogroup diversity (fig. 2B) and MPD values (fig. 2C), and a significantly low Tajima’s D value (fig. 2D), suggesting recent paternal expansion in this group, but the converse trend (rather high diversity) for mtDNA. Interestingly, a significantly negative Tajima’s D value was observed more frequently in the TK than the AA groups for both the MSY and mtDNA (MSY, P < 0.05: 10/31 for TK vs. 6/24 for AA; mtDNA, P < 0.05: 20/31 for TK vs. 5/24 for AA) (fig. 2D), suggesting a stronger signal of recent population expansion in TK groups; no significant Tajima’s D values were observed in any of the ST-speaking Karen groups. The Nyahkur (BO), who speak a Mon language, show the highest MPD value for the MSY (fig. 2C), which might indicate paternal gene flow with other populations; this is supported by the BO having the highest number of shared MSY haplotypes (three haplotypes) with other populations (fig. 3A).
MO3 and MO4 share MSY haplotypes with the TK-speaking groups (CT2, CT6, and YU1), reflecting their genetic connection. For mtDNA, apart from the AA-speaking Palaung (PL), the Mon (MO2, MO3, and MO7) also share haplotypes with the central Thai (CT3 and CT6) and Shan (SH) (fig. 3A).

FIG. 2. Genetic diversity values of MSY and mtDNA in the studied populations, excluding the Maniq (MN) and Mlabri (MA): haplotype diversity (A), haplogroup diversity (B), MPD (C), and Tajima’s D values (D). More information and all genetic diversity values are provided in supplementary table 1, Supplementary Material online.
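The diversity statistics discussed above (haplotype diversity h, MPD, and Tajima's D) can be illustrated with a minimal sketch. It uses the standard textbook formulas (Nei's haplotype diversity, Tajima 1989) on a hypothetical toy alignment; the function names and the data are assumptions for illustration and are not the software or the sequences used in the paper.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def haplotype_diversity(seqs):
    """Nei's h = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


def mean_pairwise_differences(seqs):
    """MPD: average number of differing sites over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    diffs = [sum(a != b for a, b in zip(s, t)) for s, t in pairs]
    return sum(diffs) / len(pairs)


def tajimas_d(seqs):
    """Tajima's D; returns None when there are no segregating sites."""
    n = len(seqs)
    s = sum(len(set(column)) > 1 for column in zip(*seqs))
    if s == 0:
        return None
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    pi = mean_pairwise_differences(seqs)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical star-like sample: every sequence carries one private mutation,
# the excess of rare variants expected after a recent expansion.
toy = ["CAAAA", "ACAAA", "AACAA", "AAACA", "AAAAC"]
print(haplotype_diversity(toy), mean_pairwise_differences(toy), tajimas_d(toy))
```

In this toy sample every variant is a singleton, so D comes out negative, which is the direction the text interprets as a signal of recent population expansion.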
Post by Admin on Dec 17, 2021 21:36:52 GMT
FIG. 3. Relative shared haplotypes (A) and heat plot of Φst (B) between studied populations for the MSY and for mtDNA.

The Analysis of Molecular Variance (AMOVA) indicates that the variation among populations (within group) accounts for 11.12% of the total MSY genetic variance (table 1). There is greater genetic heterogeneity within the AA group (20.01%, P < 0.01 and 18.49%, P < 0.01 without MN, the hunter–gatherer group from southern Thailand) than among the TK (4.48%, P < 0.01) and ST-speaking Karen groups (2.29%, P > 0.01). For the AA groups with more than one population sampled, the greatest within-group variation by far was among the three Lawa populations (34.43%, P < 0.01), whereas the seven Mon populations showed very low (albeit still significant) within-group variation (3.92%, P < 0.01) (supplementary fig. 2, Supplementary Material online). Very low within-group variation was also observed for the central Thai groups from central Thailand (1.47%, P > 0.01), Khon Mueang groups from northern Thailand (−1.83%, P > 0.01), and Lao Isan groups from northeastern Thailand (1.84%, P > 0.01), indicating overall genetic homogeneity among these major TK-speaking groups. In agreement with the MSY, larger mtDNA variation is observed in the AA groups (14.03%, P < 0.01) than the ST (6.51%, P < 0.01) and TK groups (4.33%, P < 0.01), but interestingly the largest within-group variation is not among the Lawa (7.78%, P < 0.01) but rather among the Htin populations (25.71%, P < 0.01). In contrast to the MSY, each of the TK groups with more than one population sampled showed significant within-group differences for mtDNA, especially the Khon Mueang (4.20%, P < 0.01) (supplementary fig. 2, Supplementary Material online). In sum, we observed different patterns of MSY versus mtDNA for the different language groups.
The among-population variation within linguistic groups is larger for the MSY (20.01%, P < 0.01) than for mtDNA (14.03%, P < 0.01) for AA groups, but about the same for TK groups (4.48%, P < 0.01 for MSY and 4.33%, P < 0.01 for mtDNA), and the ST groups have larger among-population variation for mtDNA (6.51%, P < 0.01) than for the MSY (2.29%, P > 0.01) (table 1 and supplementary fig. 2, Supplementary Material online). Thus, there are different patterns of MSY versus mtDNA differentiation for these three language families.

Table 1. AMOVA Results (percent variation).

| Groups | Number of Groups | Number of Populations | Within Populations, MSY | Within Populations, mtDNA | Within Groups, MSY | Within Groups, mtDNA | Among Groups, MSY | Among Groups, mtDNA |
|---|---|---|---|---|---|---|---|---|
| Total | 1 | 59 (58) | 88.88 (89.46) | 91.51 | 11.12* (10.54*) | 8.55* | | |
| Language | 3 | 59 (58) | 88.21* (98.05*) | 91.20* | 10.16* (1.96*) | 8.18* | 1.63* (−0.01) | 0.62* |
| Austroasiatic | 1 | 24 (23) | 79.99 (81.51) | 85.97 | 20.01* (18.49*) | 14.03* | | |
| Mon | 1 | 7 | 96.08 | 93.10 | 3.92* | 6.90* | | |
| Htin | 1 | 3 | 88.47 | 74.29 | 11.53* | 25.71* | | |
| Lawa | 1 | 3 | 65.57 | 92.22 | 34.43* | 7.78* | | |
| Sino-Tibetan (Karen) | 1 | 4 | 97.71 | 93.49 | 2.29 | 6.51* | | |
| Tai-Kadai | 1 | 31 | 95.52 | 95.67 | 4.48* | 4.33* | | |
| Central Thai | 1 | 7 | 98.53 | 98.36 | 1.47 | 1.64* | | |
| Khon Mueang | 1 | 4 | 101.83 | 95.80 | −1.83 | 4.20* | | |
| Lao Isan | 1 | 4 | 98.16 | 97.69 | 1.84 | 2.31* | | |
| Geography | 6 (5) | 59 (58) | 88.27* (98.07*) | 91.40* | 9.35* (2.02*) | 8.40* | 2.38* (−0.09) | 0.20* |
| Northern | 1 | 26 | 85.51 | 88.84 | 14.49* | 11.16* | | |
| Northeastern | 1 | 16 | 96 | 91.29 | 8.00* | 8.71* | | |
| Central | 1 | 11 | 94.61 | 95.86 | 5.39* | 4.14* | | |
| Western | 1 | 3 | 93.97 | 99.11 | 6.03* | 0.89 | | |

NOTE: The numbers in parentheses show the percent variation of MSY when the Maniq (MN) are excluded; asterisks indicate significance (P < 0.01).
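As a reading aid for table 1, the following sketch shows how AMOVA variance components translate into the percentages reported there, and why a row such as Khon Mueang can show a negative among-population percentage together with a within-population percentage above 100. The component values are hypothetical, chosen only to reproduce that row; the Φst figure is plain arithmetic on the Language row of table 1, not a value reported by the authors, and the function names are illustrative assumptions.

```python
def percent_variation(components):
    """Express each variance component as a percentage of their sum."""
    total = sum(components.values())
    return {name: 100.0 * value / total for name, value in components.items()}


def phi_st(among_groups, among_pops_within_groups, within_pops):
    """Phi_ST: share of total variance that lies among populations."""
    total = among_groups + among_pops_within_groups + within_pops
    return (among_groups + among_pops_within_groups) / total


# Hypothetical components for a single-group analysis. Covariance-based
# estimates can be slightly negative when populations are not differentiated,
# which is usually read as "effectively zero".
khon_mueang_like = {
    "among populations": -0.0183,
    "within populations": 1.0183,
}
print(percent_variation(khon_mueang_like))

# Arithmetic on the MSY percentages of the Language row in table 1.
print(round(phi_st(1.63, 10.16, 88.21), 4))
```

Because the percentages are forced to sum to 100, a negative among-population estimate has to be balanced by a within-population value above 100.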
Post by Admin on Dec 17, 2021 22:36:38 GMT
Although there is more variation among groups defined by geographic location (2.38%, P < 0.01) than by language family (1.63%, P < 0.01) (table 1), there is much more MSY variation among populations within the same group than among groups defined either by geographic or by linguistic criteria. Moreover, when the divergent MN population of hunter–gatherers from southern Thailand is removed from the analysis, the among-group component is no longer significant for either geographic location or language family (−0.09%, P > 0.01 for geography; −0.01%, P > 0.01 for language), and the total variation among populations within group falls to 10.54%. Thus, neither geography nor language family is a good predictor of the MSY genetic structure of Thai/Lao populations, indicating that these two factors are not important in the broad view (table 1). There are significant correlations between matrices of MSY genetic and geographic distance, estimated by Mantel tests, for all three types of geographic distances, that is, great circle distance (r = 0.3381, P < 0.01), resistance distance (r = 0.5418, P < 0.01), and least-cost path distance (r = 0.3912, P < 0.01). However, the correlations are no longer significant when the MN group is removed from the analysis: great circle distance (r = 0.0125, P > 0.05), resistance distance (r = −0.0446, P > 0.05), and least-cost path distance (r = 0.0139, P > 0.05). In contrast, no significant correlation was detected (P > 0.05) between matrices of mtDNA genetic distance and geographic distance, with or without MN: great circle distance (r = 0.0776 and r = −0.0323), resistance distance (r = 0.1433 and r = −0.1105), and least-cost path distance (r = 0.0997 and r = −0.0253). To identify and describe population clustering based on multivariate analysis, discriminant analysis of principal components (DAPC) was carried out.
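The Mantel tests against great circle distance described above can be sketched as follows. This is a minimal, pure-Python permutation version under stated assumptions: the coordinates and the genetic distance matrix are hypothetical, the function names are illustrative, and the one-sided P value is only one common convention; it is not the authors' pipeline, and it does not cover resistance or least-cost path distances.

```python
import random
from math import asin, cos, radians, sin, sqrt


def great_circle_km(p, q, radius_km=6371.0):
    """Haversine distance between two (latitude, longitude) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (p[0], p[1], q[0], q[1]))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * radius_km * asin(min(1.0, sqrt(a)))


def upper_triangle(matrix):
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mantel(genetic, geographic, permutations=999, seed=0):
    """Return (r, one-sided P) by permuting population labels of one matrix."""
    rng = random.Random(seed)
    n = len(genetic)
    geo = upper_triangle(geographic)
    r_obs = pearson(upper_triangle(genetic), geo)
    idx = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(idx)
        permuted = [[genetic[idx[i]][idx[j]] for j in range(n)] for i in range(n)]
        if pearson(upper_triangle(permuted), geo) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical sampling locations (latitude, longitude); not the study sites.
sites = [(18.8, 99.0), (16.4, 102.8), (13.8, 100.5), (7.0, 100.0), (19.9, 102.1)]
geo = [[great_circle_km(p, q) for q in sites] for p in sites]
# Toy genetic distances that scale exactly with geography, so r is 1.
gen = [[d / 10000 for d in row] for row in geo]
r, p_value = mantel(gen, geo)
print(round(r, 4), p_value)
```

Rows and columns are permuted together so that each permutation relabels whole populations, which preserves the dependence among entries of a distance matrix.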
This analysis attempts to maximize among-group genetic differentiation and minimize within-group genetic variation; the results showed considerable overlap among groups defined by either language family or geographic location in both MSY and mtDNA (supplementary fig. 3, Supplementary Material online). In addition, the groupings by population and ethnicity of MSY data revealed the largest discrimination to be among some AA-speaking groups, that is, all Lawa groups (LW1–LW3), Htin (TN1), and Blang (BL), whereas all Htin groups (TN1, TN2, and TN3), Mlabri (MA), TK-speaking Seak (SK), and ST-speaking Karen (KSK1, KSK2, and KPW) are differentiated from the others for mtDNA, emphasizing contrasting genetic patterns between MSY and mtDNA for Htin, Mlabri, Lawa, Blang, Seak, and Karen. In sum, all results indicate lower genetic diversity of the AA groups than the TK and ST groups, except the Mon and Nyahkur, who exhibit high genetic diversity. The AA groups also show greater genetic heterogeneity than the TK and ST groups.

Postmarital Residence and Genetic Diversity

We studied five highlander groups: four hill tribes (Karen, Htin, Lawa, and Khmu) and the Palaung, another minority group in the mountainous area of northern Thailand but not officially recognized as a hill tribe. The Khmu (KA), Lawa (LW1, LW2, and LW3), and Palaung (PL) groups practice patrilocality, whereas the Htin (TN1, TN2, and TN3) are matrilocal, as are the ST-speaking Karen (KSK1, KSK2, KPA, and KPW). If residence pattern is influencing genetic variation, then lower within-population genetic diversity coupled with greater genetic heterogeneity among populations is expected for patrilocal groups than for matrilocal groups for the MSY, whereas the opposite pattern is expected for mtDNA (Oota et al. 2001).
The MSY h and MPD values are higher for matrilocal groups, but not significantly (Mann–Whitney U tests: h: Z = 1.4616, P > 0.05; MPD: Z = 0.9744, P > 0.05); however, haplogroup diversity is significantly higher for the matrilocal groups (Mann–Whitney U tests: Z = 2.1112, P < 0.05) (supplementary fig. 4, Supplementary Material online). For mtDNA, genetic diversity values are higher for patrilocal than for matrilocal groups, but the differences are not statistically significant (Mann–Whitney U tests: h: Z = −0.9744, P > 0.05; MPD: Z = −0.8120, P > 0.05; haplogroup diversity: Z = −1.864, P > 0.05) (supplementary fig. 4, Supplementary Material online). Notably, TN1 and LW3 exhibit very low within-population diversity for the MSY, for example, MPD = 20.07 and 23.07, compared with the average MPD (121.11), whereas TN1 and TN2 (20.69 and 26.14) show lower MPD than average (35.09) for mtDNA (supplementary table 1, Supplementary Material online). For genetic differences between populations, the patrilocal Khmu, Lawa, and Palaung have significantly higher genetic differentiation for the MSY than for mtDNA (average Φst = 0.3109 for MSY and 0.0774 for mtDNA) (Mann–Whitney U tests: Z = 3.5907, P < 0.01), whereas the matrilocal groups (Htin and Karen) also show higher average Φst for MSY (0.1859) than for mtDNA (0.1553), but these are not significantly different (Mann–Whitney U tests: Z = 0.3270, P > 0.05). Contrasting genetic differences for the MSY versus mtDNA of Lawa, Htin, and Karen are clearly seen in the multidimensional scaling (MDS) and DAPC plots (fig. 4A and B and supplementary fig. 3, Supplementary Material online). Much stronger contrasting between-group variation is seen in the AMOVA results (Lawa: 34.43% for MSY and 7.78% for mtDNA; Htin: 11.53% for MSY and 25.71% for mtDNA; Karen: 2.29% for MSY and 6.51% for mtDNA) (table 1 and supplementary fig. 2, Supplementary Material online).

FIG. 4. The two-dimensional MDS plot and five-dimensional MDS heat plot based on the Φst distance matrix for 57 populations (after removal of Maniq and Mlabri) of MSY (A and C) and mtDNA (B and D).

However, in general, the AA-speaking groups, whether identified as hill tribes or as other minorities, are patrilocal groups. The AMOVA result indicates that the variation among AA populations is higher in MSY (20.01%) than mtDNA (14.03%), in accordance with expectations if residence pattern is influencing genetic variation. Conversely, the TK populations, where neither patrilocal nor matrilocal residence is preferred, exhibit similar among-population variances for the MSY (4.48%) and mtDNA (4.33%) (table 1 and supplementary fig. 2, Supplementary Material online). Overall, there does seem to be some impact of postmarital residence on the patterns of genetic diversity.
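The Z values quoted for the Mann–Whitney U tests above can be understood through the normal approximation sketched below. The diversity values are hypothetical placeholders (the per-population values are in the paper's supplementary table 1 and are not reproduced here), the function names are illustrative, and the sketch omits the tie and continuity corrections that statistical packages may apply, so its output need not match the reported figures. The sign convention (positive when the first sample ranks higher) is also an assumption.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_z(x, y):
    """Z score of U for sample x under the normal approximation."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u1 - mean_u) / sd_u


# Hypothetical MSY haplogroup diversity values, for illustration only.
matrilocal = [0.95, 0.97, 0.99, 0.98]
patrilocal = [0.80, 0.85, 0.90]
print(round(mann_whitney_z(matrilocal, patrilocal), 4))
```

With so few populations per category the normal approximation is rough, which is one reason small Z values in the text are reported as non-significant.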