Genetics of Latin Americans

Admin
Administrator

Posts: 73,555

Genetics of Latin Americans Nov 18, 2023 22:03:38 GMT

Post by Admin on Nov 18, 2023 22:03:38 GMT

Material and methods
Population sample
A total of 134 unrelated individuals from Rio de Janeiro (Southeastern Brazil) with different pigmentation phenotypes of skin, hair and eyes (Table 1) were selected for MC1R genotyping. The study population was selected non-randomly, and classification followed a qualitative analysis based on predetermined parameters of skin, hair and eye color. Collection of the biological material and DNA extraction were described in a previous study [25]. The project was approved by the Ethics in Research Committee of Clementino Fraga Filho Hospital/UFRJ (CEP—MEMO—n.° 536/10). MC1R variant data from genetically unrelated individuals from parental populations, namely Africans (GWD, LWK, MSL and YRI), Asians (CDX, CHB, CHS and JPT) and Europeans (FIN, GBR, IBS and TSI), were obtained from the 1000 Genomes Project Consortium Phase 3 (for population keys see S1 Table) [26].

Table 1
Phenotypic frequency of admixed population from Rio de Janeiro.
Parameter	Individuals (n = 134)
Hair, n (proportion)
Red	14 (0.10)
Blond	25 (0.19)
Light brown	16 (0.12)
Dark brown	33 (0.25)
Black	46 (0.34)
Skin, n (proportion)
Light	75 (0.56)
Intermediate	35 (0.26)
Dark	24 (0.18)
Eye, n (proportion)
Blue	21 (0.16)
Green/Hazel	30 (0.22)
Dark	83 (0.62)
Gender, n (proportion)
Female	63 (0.47)
Male	71 (0.53)
Age (years)
Mean (SD)	30.87 (12.24)

DNA genotyping and sequences analyses
Sanger sequencing was performed on all samples to cover the human MC1R coding region. Amplification was performed with the GeneAmp High Fidelity enzyme from Applied Biosystems™ (30 ng of DNA, 1x buffer, 1.5 mM MgCl2, 10% DMSO, 0.2 mM dNTP, 1.5 U of enzyme, 0.3 μM of each primer; forward: 5’-GGCAGCACCATGAACTAAGCAG-3’ and reverse: 5’-CAGGGTCACACAGGAACCAGAC-3’; final volume of 50 μl; cycling: 94°C 2 min; 30x (94°C 20 s, 63°C 20 s, 72°C 1 min); 72°C 7 min). The amplified product was purified with a PCR Cleanup kit (Axygen™) and sequenced using Big Dye Terminator (Thermo Fisher, CA). The products were run on an Applied Biosystems™ ABI 3130xl, using six primers designed by our group (5’-GAAGAACTGTGGGGACCTGGA-3’, 5’-CAGGAAGCAGAGGCTGGACAG-3’, 5’-ATGTACTGCTTCATCTGCTGC-3’, 5’-CAGGATGGTGAGGGTGACAGC-3’, 5’-TCCTGGCTATGCTGGTGCTCA-3’, 5’-ACACAATATCACCACCTCCCTCT-3’), which covered the intronless MC1R coding region and part of the 3’UTR at least twice. Sequences from both strands were aligned with the SNP-containing GenBank consensus sequence of Homo sapiens chromosome 16, reference assembly, complete sequence (GenBank version NC_000016.8 GI:51511732), using the BLAST tool in Geneious Pro 4.7 software (Biomatters).

The HVI region of mitochondrial DNA was sequenced following the protocol described by another group [10], and haplogroups were classified with the command-line version of Haplogrep v.2.2.6 [27]. The geographic region corresponding to each haplogroup was determined through the MITOMAP database [28]. For the parental populations analysis, all polymorphisms in the MC1R gene (89985667–89986632 bp) were extracted from the 1000 Genomes database (ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/) using VCFtools v.0.1.15 [29], and variant annotations were obtained with the SnpEff v.4.5 package [30].
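As an illustration of the region extraction step, here is a minimal Python sketch that keeps only the VCF records falling inside the MC1R interval quoted above. It is not the VCFtools command the authors ran. The function name `variants_in_region` and the toy records are made up for the example; only the coordinates come from the text.

```python
# MC1R interval quoted in the text (chromosome 16, 1-based, inclusive).
MC1R_CHROM = "16"
MC1R_START = 89985667
MC1R_END = 89986632


def variants_in_region(vcf_lines, chrom=MC1R_CHROM, start=MC1R_START, end=MC1R_END):
    """Return (pos, id, ref, alt) tuples for VCF records inside the region."""
    hits = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip meta/header lines and blank lines
        fields = line.rstrip("\n").split("\t")
        record_chrom = fields[0]
        if record_chrom.startswith("chr"):
            record_chrom = record_chrom[3:]  # accept both "16" and "chr16"
        pos = int(fields[1])
        if record_chrom == chrom and start <= pos <= end:
            hits.append((pos, fields[2], fields[3], fields[4]))
    return hits


# Toy records (hypothetical IDs and alleles, not real 1000 Genomes data).
toy_vcf = [
    "##fileformat=VCFv4.1\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "16\t89985666\ttoy_before\tC\tT\t.\t.\t.\n",
    "16\t89985667\ttoy_first\tG\tA\t.\t.\t.\n",
    "chr16\t89986632\ttoy_last\tC\tT\t.\t.\t.\n",
    "16\t89986633\ttoy_after\tG\tC\t.\t.\t.\n",
    "15\t89986000\ttoy_other_chrom\tA\tG\t.\t.\t.\n",
]

for record in variants_in_region(toy_vcf):
    print(record)
```

Both interval boundaries are treated as inclusive here, which is an assumption about how the published coordinates were meant.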

Statistical analysis
The allele frequencies and the Hardy-Weinberg equilibrium (HWE) were calculated using the Adegenet v.2.1.3 and Pegas v.0.13 packages [31, 32], respectively. The phylogenetic tree was obtained by applying the neighbor-joining method (NJ function) from the ape v.5.3 package [33]. To evaluate the clustering of the RJ skin color groups and the 1000 Genomes populations, nonmetric multidimensional scaling (NMDS) and its respective stress plot were computed with the vegan v.2.5.6 package using two dimensions (k), the Kulczynski distance and a maximum of 25 random starts. The phylogenetic tree and all the multidimensional scaling plots were based on the Fst genetic distance, obtained through Nei’s pairwise Fst calculation with the pairwise.neifst function, with the respective confidence intervals from the boot.ppfst() function, both from the hierfstat v.0.5.7 package [34]. All the analyses above were performed in RStudio v.1.3.1056. The R packages ggplot and ggtree were used to generate the graphs and trees, respectively.
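For readers unfamiliar with the HWE check, the sketch below shows a plain chi-square goodness-of-fit test for one biallelic locus in Python. It is a simplified stand-in, not the Pegas routine the authors used, and the genotype counts in the usage lines are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square HWE test for a biallelic locus from genotype counts.

    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic locus).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 100 homozygous reference, 20 heterozygous, 14 homozygous variant.
chi2, p_value = hwe_chi_square(100, 20, 14)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3g}")
```

With the small counts typical of rare variants, an exact or permutation test is usually preferred over the chi-square approximation.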

MC1R mutational analysis
Mutational analyses were performed using two criteria: i. Based on the protein sequence; ii. Based on the protein three-dimensional (3D) structure. For the sequence-based predictions, the following disease-association predictors were used: PolyPhen [35], PON-P2 [36] and Meta-SNP [37], based on the Uniprot ID Q01726. For the structure-based predictions, the following predictors were used: DUET [38] and DynaMut [39]. DUET includes a consensus score and the individual scores for mCSM and SDM predictors.

MC1R model construction
The MC1R sequence was retrieved from the UniProt database (ID: Q01726) [40]. A template search with BLAST and HHBlits against the Protein Data Bank (PDB) [41] was performed using the SWISS-MODEL server [42]. The best template was the crystal structure of the human melanocortin receptor 4 (MC4R) (PDB ID: 6w25), with a sequence identity of 50.18% and an alignment coverage of 0.87. Since this crystal structure did not cover a small region (residues 29–38) of the target protein containing a mutation site at position 35, we selected an additional template, the muscarinic acetylcholine receptor M2 (PDB ID: 5zk8), with 20.21% sequence identity but a slightly higher coverage value of 0.89. Multiple-template modelling of the wild-type (WT) MC1R was performed using MODELLER [43] version 9.23 with standard parameters. The best model was selected based on the DOPE score [44]. The five selected mutants were also modelled with MODELLER, using the generated WT model as template.

Post by Admin on Nov 19, 2023 21:29:51 GMT

Results
MC1R is highly polymorphic in an admixed population
Assessing the genetic diversity of the intronless MC1R gene, we observed that 75% of the Rio de Janeiro (RJ) sample population presented at least one polymorphism. The 31 variants detected comprise 14 synonymous mutations, 15 non-synonymous mutations, one indel and one SNP in the 3’UTR. Of these, 28 were previously described polymorphisms and three were novel synonymous variants in the heterozygous state (Leu11Leu, Tyr143Tyr and Ala181Ala; Table 2). Interestingly, the novel polymorphism found in humans (c.429C>T, p.Tyr143Tyr) was also observed in two different breeds of sheep [49, 50]. This variation shared between humans and sheep probably arose through independent events.


Table 2
Minor allele frequency (MAF) of MC1R polymorphisms from RJ population.
SNP ID	Nt / Aa change	MAF (n)*	Skin: Light	Skin: Intermed	Skin: Dark	Hair: Red	Hair: Light	Hair: L. brown	Hair: D. brown	Hair: Black	Eyes: Blue	Eyes: Green/Hazel	Eyes: Dark
Novel SNP	c.31C>T / Leu11Leu	0.004 (1)	-	-	0.021	-	-	-	-	0.011	-	-	0.006
rs779504604	c.104G>A / Cys35Tyr	0.007 (2)	0.013	-	-	0.071	-	-	-	-	-	0.017	0.006
rs1805005	c.178G>T / Val60Leu	0.104 (28)	0.133	0.100	0.021	0.107	0.100	0.250	0.091	0.065	0.190	0.150	0.066
rs34474212	c.247T>C / Ser83Pro	0.011 (3)	0.020	-	-	0.071	0.020	-	-	-	-	0.017	0.012
rs1805006	c.252C>A / Asp84Glu	0.015 (4)	0.027	-	-	0.036	0.060	-	-	-	-	0.067	-
rs2228479	c.274G>A / Val92Met	0.019 (5)	0.020	-	0.042	-	-	0.031	0.030	0.022	0.024	-	0.024
rs780284801	c.288C>T / Ala96Ala	0.004 (1)	-	0.014	-	-	-	-	0.015	-	-	-	0.006
rs140650544	c.309C>T / Ala103Ala	0.004 (1)	-	0.014	-	-	-	-	0.015	-	-	-	0.006
rs3212364	c.318G>A / Leu106Leu	0.007 (2)	-	-	0.042	-	-	-	0.015	0.011	-	-	0.012
rs201429598	c.399C>T / Cys133Cys	0.004 (1)	0.007	-	-	-	-	0.031	-	-	-	-	0.006
rs11547464	c.425G>A / Arg142His	0.011 (3)	0.013	0.014	-	0.036	-	-	0.030	-	-	0.033	0.006
Novel SNP	c.429C>T / Tyr143Tyr	0.004 (1)	0.007	-	-	-	-	-	0.015	-	-	-	0.006
rs1805007	c.451C>T / Arg151Cys	0.041 (11)	0.073	-	-	0.214	0.060	-	0.030	-	0.024	0.050	0.042
rs201827012	c.453C>G / Arg151Arg	0.004 (1)	0.007	-	-	-	-	0.031	-	-	-	-	0.006
rs1110400	c.464T>C / Ile155Thr	0.007 (2)	0.013	-	-	-	0.040	-	-	-	-	0.017	0.006
rs3212365	c.466G>C / Val156Leu	0.004 (1)	-	-	0.021	-	-	-	-	0.011	-	-	0.006
rs1805008	c.478C>T / Arg160Trp	0.015 (4)	0.027	-	-	0.036	0.040	-	0.015	-	0.024	-	0.018
rs885479	c.488G>A / Arg163Gln	0.056 (15)	0.047	0.086	0.042	-	0.040	0.125	0.045	0.065	0.071	0.050	0.054
rs34612847	c.504C>T / Ile168Ile	0.004 (1)	-	0.014	-	-	-	-	0.015	-	-	-	0.006
rs555179612	c.537_538insC / Frameshift	0.004 (1)	0.007	-	-	0.036	-	-	-	-	-	-	0.006
Novel SNP	c.543C>T / Ala181Ala	0.007 (2)	0.013	-	-	-	0.040	-	-	-	-	0.033	-
rs370040645	c.546C>T / Tyr182Tyr	0.004 (1)	0.007	-	-	0.036	-	-	-	-	-	0.017	-
rs3212366	c.586T>C / Phe196Leu	0.015 (4)	-	0.029	0.042	-	-	-	-	0.043	-	-	0.024
rs146544450	c.699G>A / Gln233Gln	0.004 (1)	-	-	0.021	-	-	-	-	0.011	-	-	0.006
rs200215218	c.766C>T / Pro256Ser	0.004 (1)	0.007	-	-	0.036	-	-	-	-	-	0.017	-
rs375813196	c.819C>T / Cys273Cys	0.004 (1)	-	-	0.021	-	-	-	-	0.011	-	-	0.006
rs1805009	c.880G>C / Asp294His	0.030 (8)	0.053	-	-	0.286	-	-	-	-	0.024	0.067	0.018
rs3212367	c.900C>T / Phe300Phe	0.019 (5)	-	0.043	0.042	-	-	-	0.015	0.043	-	-	0.030
rs375127718	c.923C>T / Thr308Met	0.004 (1)	0.007	-	-	-	-	0.031	-	-	-	0.017	-
rs2228478	c.942A>G / Thr314Thr	0.164 (44)	0.080	0.143	0.458	-	0.080	0.063	0.167	0.293	0.095	0.083	0.211
rs3212368	c.*12G>A / 3´UTR	0.045 (12)	0.007	0.029	0.188	-	0.020	-	0.015	0.109	-	0.017	0.066
The table reports the minor allele frequency (MAF) of each polymorphism in the total population and in the skin, hair and eye color subgroups. *n: number of minor alleles observed in the population; hyphen: no minor allele was observed.
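The overall MAF values in Table 2 can be reproduced from the allele counts in parentheses, assuming every one of the 134 individuals from Table 1 contributes two alleles at each site (268 chromosomes). The helper name below is made up for the example; the counts are taken from Table 2.

```python
def minor_allele_frequency(minor_allele_count, n_individuals):
    """MAF at a diploid autosomal site: minor alleles / (2 * individuals)."""
    return minor_allele_count / (2 * n_individuals)


N_INDIVIDUALS = 134  # sample size from Table 1

# (variant, minor allele count) pairs copied from Table 2.
table2_counts = [
    ("Val60Leu", 28),
    ("Arg163Gln", 15),
    ("Thr314Thr", 44),
    ("rs3212368 (3'UTR)", 12),
]

for variant, count in table2_counts:
    maf = minor_allele_frequency(count, N_INDIVIDUALS)
    print(f"{variant}: {maf:.3f}")
# Val60Leu: 0.104
# Arg163Gln: 0.056
# Thr314Thr: 0.164
# rs3212368 (3'UTR): 0.045
```

The same division applies within a phenotype subgroup, with the subgroup size in place of 134.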

The analysis of the MC1R variants showed that all the polymorphisms are in HWE with the exception of Val60Leu (rs1805005). Val60Leu, Arg163Gln, Thr314Thr and rs3212368 in the 3’UTR were the SNVs with the highest frequencies in the RJ population, with a minor allele frequency (MAF) close to or above 5% (Table 2). Despite purifying selection, nonsynonymous variants have been detected in African populations. In our study, four individuals with high levels of melanin carry the variant Phe196Leu (rs3212366), which is located in the fifth transmembrane domain and was previously identified in the sub-Saharan African population [51]. In addition, the variant Val156Leu (rs3212365), which has so far been detected only in the African population according to the 1000 Genomes Project (S2 Table), was identified in one black individual in our sample.

Regarding the so-called strong “R” alleles for the red hair color (RHC) phenotype, which have high penetrance for red hair and fair skin, we detected Asp84Glu (rs1805006), Arg142His (rs11547464), Arg151Cys (rs1805007), Arg160Trp (rs1805008) and Asp294His (rs1805009). In the same group, we found an insertion of a cytosine at nucleotide position 537 (rs555179612) of the MC1R ORF, encoding a truncated protein 237 amino acids in length in the transmembrane 4 domain region. In our sample population, two variants previously predicted to be damaging, Ser83Pro (rs34474212) and Pro256Ser (rs200215218), and the loss-of-function SNP Cys35Tyr (rs779504604) had minor allele frequencies of 1.1%, 0.4% and 0.7%, respectively (Table 2). Cys35Tyr was detected in two red-haired individuals in the heterozygous state. One of them also carried a second variant, Ser83Ser/Pro, previously associated with changes in the structure and function of the receptor [14], while in the other individual Cys35Tyr was the only variant detected in the entire MC1R coding sequence.

Post by Admin on Nov 22, 2023 19:42:18 GMT

Intermediate skin color group from RJ is genetically close to European population based on MC1R variation
To understand the relationship among color phenotypes based on the overall MC1R variation in miscegenated individuals, we assessed the distribution pattern of the different phenotypes within our population, from low to high melanin content for skin and hair color, using neighbor-joining phylogenetic trees built from pairwise Fst genetic distances (Fig 1A and 1B and S3 and S4 Tables).
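To make the Fst distance concrete, here is a simplified Python sketch of Nei's Fst between two groups, computed as (Ht - Hs) / Ht summed over biallelic sites. It is an uncorrected textbook version and may differ from the estimator implemented in hierfstat, which can include sample-size corrections. The function name is hypothetical; the input frequencies are the Val60Leu, Thr314Thr and 3’UTR values for the skin color groups in Table 2.

```python
def nei_fst(freqs_pop1, freqs_pop2):
    """Uncorrected Nei's Fst between two populations.

    Each argument is a list of minor allele frequencies, one per biallelic
    site, in the same site order. Fst = sum(Ht - Hs) / sum(Ht).
    """
    sum_ht = 0.0
    sum_hs = 0.0
    for p1, p2 in zip(freqs_pop1, freqs_pop2):
        # Mean expected heterozygosity within the two populations.
        hs = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
        # Expected heterozygosity of the pooled population (equal weights).
        p_bar = (p1 + p2) / 2
        ht = 2 * p_bar * (1 - p_bar)
        sum_hs += hs
        sum_ht += ht
    if sum_ht == 0:
        return 0.0  # no variation at any site
    return (sum_ht - sum_hs) / sum_ht


# Table 2 frequencies at Val60Leu, Thr314Thr and rs3212368 (3'UTR).
light = [0.133, 0.080, 0.007]
intermediate = [0.100, 0.143, 0.029]
dark = [0.021, 0.458, 0.188]

print("light vs intermediate:", nei_fst(light, intermediate))
print("light vs dark:", nei_fst(light, dark))
```

Using only three sites is purely illustrative; the distances in the S3 and S4 Tables are based on the full set of variants.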

Fig 1
Color phenotype profile based on MC1R variation in an admixed population.
Phylogenetic tree based on neighbor-joining method using the Fst distance of different groups of (A) skin, (B) hair and (C) eye colors.

The red-haired group was positioned relatively far from the other hair phenotypes, with a large Fst distance, as were the lighter-colored eye groups from the dark-eyed group (Fig 1C and S4 Table). An interesting finding concerns the skin color variation shown in Fig 1A: the intermediate skin color group is closer to the light skin color group than to the dark one, indicating a small genetic distance between the two groups.

Considering that the variation in Brazilian skin color is due to the miscegenation of black Africans, medium-tone Native Americans and light-skinned Europeans, it is not known whether the small genetic distance between the light and intermediate skin phenotypes results from a high contribution of European background in the intermediate group or whether the light skin group is admixed enough to differ from white Europeans. To address this question, we compared the distribution of the skin color phenotype samples with African, Asian and European populations from the 1000 Genomes Project (S2 Table) by performing a nonmetric multidimensional scaling (NMDS) analysis based on the pairwise Fst genetic distance (S3 Table). Running with k = 2 dimensions (R2: 0.998, stress: 0.0206), we observed three distinct clusters corresponding to the populations of Africa (GWD, LWK, MSL and YRI), Asia (CDX, CHB, CHS and JPT) and Europe (FIN, GBR, IBS and TSI) (Fig 2A).

Post by Admin on Nov 24, 2023 19:27:12 GMT

Fig 2
Distribution of skin color phenotype from RJ compared to parental populations from the 1000 Genomes project.
(A) The nonmetric multidimensional scaling (NMDS) analysis was performed based on Fst analysis of the light, intermediate and dark skin color groups and 12 populations from Africa (GWD, LWK, MSL, YRI), Asia (CDX, CHB, CHS, JPT) and Europe (FIN, GBR, IBS, TSI) obtained from the 1000 Genomes Consortium. (B) Phylogenetic tree based on the Fst genetic distance. Africa (GWD-Gambian in Western Divisions; LWK-Luhya in Webuye, Kenya; MSL-Mende in Sierra Leone; YRI-Yoruba in Ibadan, Nigeria); Asia (CDX-Chinese Dai in Xishuangbanna, China; CHB-Han Chinese in Beijing, China; CHS-Southern Han Chinese; JPT-Japanese in Tokyo, Japan); Europe (GBR-British in England and Scotland; FIN-Finnish in Finland; IBS-Iberian Population in Spain; TSI-Toscani in Italia).

Moreover, dimension 2 of the plot separated the groups by skin melanin content, with pigmentation levels decreasing as the values increased. This interpretation of the NMDS data is strengthened by the distribution of the Asian cluster, which is corroborated by previous reports showing that southern Asians have lower skin reflectance than northern populations from Beijing and Japan [52]. In our population, the admixed light skin individuals grouped with Europeans (Fig 2A and 2B), in particular Mediterraneans, whereas the dark skin individuals grouped with Africans, closer to Kenyans and Nigerians. However, the intermediate skin phenotype was close to the light skin phenotype, since the Fst distance between these two groups is small compared with their distance to the dark skin group (Figs 1 and 2 and S3 Table), suggesting that the MC1R coding sequence more properly discriminates dark from non-dark skin color in an admixed population from RJ.

To support the above data, we analyzed the matrilineal genetic ancestry of the 134 individuals. We observed that the mitochondrial DNA haplogroups are distributed differently among the skin color phenotypes: the light and dark skin groups showed a dominant proportion of European and African ancestry, respectively (Fig 3 and S5 Table).

Post by Admin on Nov 27, 2023 4:13:26 GMT

Fig 3
Distribution of matrilineal lineages among individuals of skin color variation in the population of RJ.
Haplogroups based on the HVS-I region of mitochondrial DNA were classified using Haplogrep software. The numbers correspond to the frequencies of African, Asian, Native American and European haplogroups within the skin color phenotypes: dark, intermediate and light. “All” represents the frequency distribution of matrilineal lineages across all skin color groups in our sample.

Interestingly, the intermediate phenotype displayed a majority contribution of African mtDNA (53%) and similar proportions of Native American and European mtDNA, 26.5% and 20.6%, respectively. Matrilineal genetic ancestry therefore does not explain why the intermediate skin color group clusters near the light color phenotype.

Nonsynonymous polymorphisms are predicted to impact negatively on MC1R function
We evaluated the amino acid changes that potentially interfere with the function of the MC1R protein receptor. Since many of the nonsynonymous polymorphisms found in our sample have been extensively studied regarding their effects on melanin production, we decided to focus on five nonsynonymous mutations: Cys35Tyr, Ile155Thr, Pro256Ser (identified in blond/red hair individuals), Val156Leu and Phe196Leu (in black people) to understand their functional roles in MC1R. First, we estimated their possible phenotypic effects using seven disease-association predictors based on the amino acid sequence of the protein (Table 3, left). Only Val156Leu was predicted to have neutral effects (5 of 7 predictors), while the remaining mutations were predicted to cause alterations in the MC1R protein that may be linked to a suspected malfunction of the protein.

Table 3
Functional and structural impacts of the selected MC1R mutations.
Functional prediction: PolyPhen-2 to Meta-SNP. Structural prediction, expressed as ΔΔG (kcal/mol): mCSM to DynaMut. Each cell gives the predicted effect followed by the score in parentheses.
Mutation	PolyPhen-2(a)	PON-P2(a)	PANTHER(b)	PhD-SNP(b)	SIFT(b)	SNAP(b)	Meta-SNP(c)	mCSM	SDM	DUET(d)	DynaMut
C35Y (rs779504604)	Damaging (0.998)	Pathogenic (0.889)	Disease (0.747)	Disease (0.876)	Neutral (0.20)	Disease (0.54)	Disease (0.693)	Destabilizing (-0.719)	Destabilizing (-0.18)	Destabilizing (-0.651)	Stabilizing (0.726)
I155T (rs1110400)	Damaging (0.986)	Unknown (0.781)	Disease (0.581)	Disease (0.709)	Disease (0.03)	Disease (0.725)	Disease (0.69)	Destabilizing (-1.289)	Destabilizing (-1.27)	Destabilizing (-1.178)	Destabilizing (-0.254)
V156L (rs3212365)	Possibly damaging (0.567)	Unknown (0.519)	Neutral (0.201)	Disease (0.563)	Neutral (0.23)	Disease (0.505)	Neutral (0.465)	Destabilizing (-0.491)	Destabilizing (-0.99)	Destabilizing (-0.4987)	Stabilizing (0.486)
F196L (rs3212366)	Damaging (0.997)	Pathogenic (0.848)	Disease (0.862)	Disease (0.837)	Disease (0.03)	Disease (0.745)	Disease (0.734)	Stabilizing (0.401)	Destabilizing (-1.64)	Stabilizing (0.342)	Stabilizing (0.557)
P256S (rs200215218)	Damaging (1)	Pathogenic (0.895)	Disease (0.948)	Disease (0.907)	Disease (0)	Disease (0.765)	Disease (0.852)	Destabilizing (-2.054)	Stabilizing (0.53)	Destabilizing (-1.571)	Destabilizing (-0.569)
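The structural columns of Table 3 follow a simple sign convention: a negative predicted ΔΔG is labeled destabilizing and a positive one stabilizing. The short Python sketch below applies that rule to the C35Y values from the table. The function name and the "Neutral" label for an exact zero are assumptions made for the example, not something the predictors report.

```python
def classify_ddg(ddg):
    """Label a predicted ΔΔG (kcal/mol) by its sign.

    Negative values are read as destabilizing, positive as stabilizing.
    The "Neutral" label for exactly zero is an assumption of this sketch.
    """
    if ddg < 0:
        return "Destabilizing"
    if ddg > 0:
        return "Stabilizing"
    return "Neutral"


# C35Y structural predictions copied from Table 3.
c35y = {"mCSM": -0.719, "SDM": -0.18, "DUET": -0.651, "DynaMut": 0.726}

for predictor, ddg in c35y.items():
    print(f"{predictor}: {ddg:+.3f} kcal/mol -> {classify_ddg(ddg)}")
# mCSM: -0.719 kcal/mol -> Destabilizing
# SDM: -0.180 kcal/mol -> Destabilizing
# DUET: -0.651 kcal/mol -> Destabilizing
# DynaMut: +0.726 kcal/mol -> Stabilizing
```

As the C35Y row shows, predictors can disagree on the direction of the effect, so the labels are best read per tool rather than as a consensus.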