Post by Admin on May 6, 2021 5:19:38 GMT
Skin pigmentation is highly variable within Africa
To identify genes affecting skin pigmentation in Africa, we used a DSM II ColorMeter to quantify light reflectance from the inner arm as a proxy for melanin levels in 2101 ethnically and genetically diverse Africans living in Ethiopia, Tanzania, and Botswana (table S1 and figs. S1 and S2) (8). Skin pigmentation levels vary extensively among Africans, with darkest pigmentation observed in Nilo-Saharan–speaking pastoralist populations in eastern Africa and lightest pigmentation observed in San hunter-gatherer populations from southern Africa (Fig. 2 and table S1).
Fig. 2 Melanin distributions. Histograms of melanin index computed from under-arm measurements with a DSM II ColorMeter for all individuals in each population as described in (67). Skin tones were visualized by displaying the scaled mean red, green, and blue values from the ColorMeter for individuals binned by melanin index.
A locus associated with light skin color in Europeans is common in East Africa
We genotyped 1570 African individuals with quantified pigmentation levels using the Illumina Infinium Omni5 Genotyping array. After quality control, we retained ~4.2 million biallelic single-nucleotide polymorphisms (SNPs) for analysis. A genome-wide association study (GWAS) analysis with linear mixed models, controlling for age, sex, and genetic relatedness (9), identified four regions with multiple significant associations (P < 5 × 10−8) (Fig. 1, fig. S3, and tables S2 and S3). We then performed fine-mapping using local imputation of high-coverage sequencing data from a subset of 135 individuals and data from the Thousand Genomes Project (TGP) (Fig. 3 and table S3) (10). We ranked potential causal variants within each locus using CAVIAR, a fine-mapping method that accounts for linkage disequilibrium (LD) and effect sizes (Table 1) (11).
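The P < 5 × 10−8 cutoff quoted for the GWAS is the conventional genome-wide significance threshold, which matches a Bonferroni correction of α = 0.05 for roughly one million independent tests. The sketch below only illustrates that arithmetic. The function names and the one-million figure are assumptions made for the example, not details taken from the study's methods.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test P-value cutoff that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def is_genome_wide_significant(p_value, threshold=5e-8):
    """True when a P value passes the conventional GWAS cutoff."""
    return p_value < threshold


# Assumption: about one million effectively independent tests.
threshold = bonferroni_threshold(0.05, 1_000_000)

# P values listed in Table 1 for one SNP at each of three loci.
table1_p = {"rs1426654": 5.5e-62, "rs10424065": 5.1e-20, "rs1800404": 1.6e-8}
significant = {rsid: is_genome_wide_significant(p) for rsid, p in table1_p.items()}
```

All three example SNPs pass the cutoff, whereas a P value such as 1.8 × 10−6 would not.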
We characterized global patterns of variation at these loci using whole-genome sequences from West African, Eurasian, and Australo-Melanesian populations (10, 12, 13).
Fig. 3 Genomic context of GWAS loci. Plot of −log10(P value) versus genomic position for variants near the four regions with most strongly associated SNPs from GWAS, including annotations for genes, MITF ChIP-seq (chromatin immunoprecipitation sequencing) data for melanocytes (45), a CTCF ChIP-seq track for NHEK keratinocytes, and H3K27ac, DNase I hypersensitive sites (DHS), and chromHMM tracks for melanocytes and keratinocytes from the Roadmap Epigenomics data set (30). Genome-wide significant variants are highlighted in red. Circles, squares, and triangles denote noncoding, synonymous, and nonsynonymous variants, respectively. (A) SLC24A5 locus. (B) MFSD12 locus. (C) DDB1/TMEM138 locus. (D) OCA2/HERC2 locus.
Table 1 Annotations of candidate causal SNPs from GWAS. Top candidate causal variants for the four regions identified based on analysis with CAVIAR (11). For each variant, the genomic position (Location), RSID, and Ancestral>Derived alleles are shown, with the allele associated with dark pigmentation in bold. Beta and standard error [Beta(SE)] and the P values from the GWAS (F test, linear mixed model) are given. For functional genomic data, nearest genes are given and variants overlapping DHS sites for melanocytes (E059) (DHS melanocytes) and/or other cell types (DHS other) available from Roadmap Epigenomics are indicated with X (30, 89). Variants intersecting enhancer regions tested by luciferase assays were labeled with Y (significant enhancer activity) or N (no enhancer activity) (fig. S7). Chromatin interactions with nearby genes measured in MCF-7 or K562 cell lines as identified by ChIA-PET are listed with gene names (Chromatin interactions) (43, 44). SNPs that are in strong LD (r2 > 0.7 in East Africans) are numerically labeled in the column titled LD block.
Columns: Location, RSID, Ancestral>Derived, Beta(SE), P, DHS melanocytes, DHS other, Luciferase activity, LD block, Nearest gene, Chromatin interactions
15:48485926 rs2413887 T>C 7.70(0.44) 4.9 × 10−62 1 CTXN2 MYEF2
15:48426484 rs1426654 G>A 7.69(0.44) 5.5 × 10−62 X X N 1 SLC24A5
15:48392165 rs1834640 G>A .56(0.44) 3.2 × 10−61 X 1 SLC24A5
15:48400199 rs2675345 G>A 7.62(0.44) 6.7 × 10−61 1 SLC24A5
15:48460188 rs8028919 G>A −4.95(0.41) 5.0 × 10−32 1 MYEF2
19:3545022 rs10424065 C>T 4.48(0.48) 5.1 × 10−20 X X Y 2 MFSD12 CACTIN, MFSD12
19:3544892 rs56203814 C>T −4.38(0.50) 3.6 × 10−18 X Y 2 MFSD12 CACTIN, MFSD12
19:3566631 rs111317445 C>T 3.51(0.42) 1.7 × 10−16 N 3 HMG20B MFSD12
19:3547955 rs10414812 C>T 4.38(0.53) 3.8 × 10−16 X Y 2 MFSD12 CACTIN, FZR1, MFSD12
19:3565599 rs112332856 T>C 3.52(0.43) 3.8 × 10−16 X X Y 3 HMG20B MFSD12
19:3565253 rs6510760 G>A 3.54(0.45) 6.5 × 10−15 X X Y 3 MFSD12 MFSD12
19:3545150 rs73527942 T>G −3.58(0.47) 4.8 × 10−14 X X Y 2 MFSD12 CACTIN, MFSD12
19:3547685 rs142317543 C>T 6.99(0.92) 5.0 × 10−14 X X Y 2 MFSD12 CACTIN, FZR1, MFSD12
19:3566513 rs7254463 C>T 2.90(0.50) 9.0 × 10−9 X N 3 HMG20B MFSD12
19:3565357 rs7246261 C>T 2.71(0.47) 1.1 × 10−8 X X 3 HMG20B MFSD12
19:3565909 rs6510761 T>C 2.79(0.50) 2.2 × 10−8 X 3 HMG20B MFSD12
11:61137147 rs7948623 A>T −2.94(0.44) 2.2 × 10−11 X X Y 4 TMEM138 TKFC, DDB1, TMEM138
11:61148456 rs397709980 G/GA −2.90(0.43) 2.4 × 10−11 X N 4 TMEM216
11:61152630 rs4453253 C>T −2.85(0.43) 5.4 × 10−11 X X N 4 TMEM216 CYB561A3, TKFC, DDB1, TMEM138, TMEM216
11:61153401 rs4939520 C>T −2.79(0.43) 1.4 × 10−10 X N 4 TMEM216 CYB561A3, TKFC, DDB1, TMEM138, TMEM216
11:61142943 rs4939519 C>T −2.47(0.39) 2.8 × 10−10 X 4 TMEM138 TMEM138
11:61106525 rs2512809 C>T −2.93(0.47) 7.4 × 10−10 N 5 TKFC TKFC, DDB1, TMEM138
11:61046876 rs11230658 T>C 3.01(0.49) 9.4 × 10−10 5 VWCE
11:61084180 rs12289370 G>A 2.99(0.49) 1.3 × 10−9 X 5 DDB1 TKFC, DDB1
11:61144652 rs1377457 C>A −3.01(0.49) 1.5 × 10−9 X 5 TMEM138 CYB561A3, TKFC, DDB1, TMEM138
11:61088140 rs7934735 G>T 2.98(0.49) 1.5 × 10−9 5 DDB1
11:61141476 rs7394502 G>A −2.41(0.40) 1.6 × 10−9 5 TMEM138 TMEM138
11:61141164 rs10897155 C>T −2.41(0.40) 1.6 × 10−9 Y 5 TMEM138 TMEM138
11:61139869 rs11230678 G>A −2.41(0.40) 1.7 × 10−9 5 TMEM138 TMEM138
11:61115821 rs148172827 C/CATCAA −2.95(0.49) 1.8 × 10−9 X X Y 5 TKFC CYB561A3, TKFC, DDB1, TMEM138
11:61144707 rs1377458 C>T −2.40(0.40) 2.1 × 10−9 X 5 TMEM138 CYB561A3, TKFC, DDB1, TMEM138
11:61076372 rs11230664 C>T 2.95(0.49) 2.1 × 10−9 X N 5 DDB1 DDB1
11:61122878 rs7951574 G>A 2.92(0.49) 2.8 × 10−9 5 CYB561A3
11:61054892 rs1108769 A>C 2.82(0.47) 3.0 × 10−9 X 5 VWCE TKFC, DDB1
11:61141259 rs57265008 T>C 2.34(0.39) 3.7 × 10−9 Y 4 TMEM138 TMEM138
11:61222635 rs3017597 G>A −2.77(0.47) 5.4 × 10−9 5 SDHAF2
11:61075524 rs12275843 T>C 2.64(0.45) 5.5 × 10−9 5 DDB1 DDB1
11:61043773 rs73490303 G>C 2.67(0.46) 7.2 × 10−9 5 VWCE VWCE
11:61018855 rs653173 A>G 2.68(0.46) 8.2 × 10−9 X 5 PGA5
11:61063156 rs10897150 G>T 2.79(0.48) 8.8 × 10−9 X 5 VWCE TKFC, DDB1, VWCE
11:61108974 rs2260655 G>A 2.63(0.46) 9.0 × 10−9 5 TKFC CYB561A3, TKFC, DDB1, TMEM138
11:61152028 rs12791961 C>A 2.90(0.50) 9.7 × 10−9 X 5 TMEM216 CYB561A3, TKFC, DDB1, TMEM138, TMEM216
11:61055014 . GACTA/G 2.61(0.45) 1.1 × 10−8 X 5 VWCE TKFC, DDB1
11:61080557 rs7120594 T>C 2.58(0.45) 1.2 × 10−8 5 DDB1
11:61044470 rs9704187 G>C 2.58(0.45) 1.3 × 10−8 5 VWCE VWCE
11:61106892 rs2513329 G>C −2.73(0.48) 1.6 × 10−8 N 5 TKFC CYB561A3, TKFC, DDB1, TMEM138, TMEM216
11:61033525 rs2001746 T>A 2.56(0.45) 1.7 × 10−8 5 VWCE
11:61112802 rs2305465 C>T 2.62(0.46) 1.8 × 10−8 X X 5 TKFC CYB561A3, TKFC, DDB1, TMEM138
11:61037389 . ATT/A 2.51(0.45) 3.5 × 10−8 5 VWCE
15:28514281 rs4932620 C>T −2.85(0.48) 3.2 × 10−9 6 HERC2
15:28532639 rs1667393 C>T −2.82(0.48) 6.3 × 10−9 X N 6 HERC2
15:28535675 rs1635167 C>T −2.88(0.50) 8.9 × 10−9 6 HERC2
15:28545148 rs2905952 A>G −3.16(0.55) 9.0 × 10−9 6 HERC2
15:28396894 rs12915877 T>G −2.76(0.48) 1.1 × 10−8 X 6 HERC2
15:28487069 rs4932618 G>A −2.69(0.47) 1.6 × 10−8 6 HERC2
15:28235773 rs1800404 C>T 2.54(0.45) 1.6 × 10−8 X 7 OCA2
15:28238158 rs1868333 G>A −2.53(0.45) 2.2 × 10−8 7 OCA2
15:28419497 . TA/T −3.73(0.67) 2.6 × 10−8 6 HERC2
15:28238895 rs735066 A>G −2.50(0.45) 3.5 × 10−8 X 7 OCA2
Post by Admin on May 6, 2021 22:48:01 GMT
The SNPs with strongest association with skin color in Africans were on chromosome 15 at or near the solute carrier family 24 member 5 (SLC24A5) gene (Figs. 1 and 3 and tables S2 and S3). A functional nonsynonymous mutation within SLC24A5 (rs1426654) (14) was significantly associated with skin color (F test, P = 5.5 × 10−62) and was identified as potentially causal by CAVIAR (Table 1). The rs1426654 (A) allele is at high frequency in European, Pakistani, and Indian populations (Fig. 1) and is a target of selection in Europeans, Central Asians, and North Indians (15–18). In Africa, this variant is common (28 to 50% frequency) in populations from Ethiopia and Tanzania with high Afro-Asiatic ancestry (19, 20) and is at moderate frequency (5 to 11%) in San and Bantu-speaking populations from Botswana with low levels of East African ancestry and recent European admixture (Fig. 1 and figs. S2 and S4) (21, 22). We observe a signature consistent with positive selection at SLC24A5 in Europeans based on extreme values of Tajima’s D statistic (fig. S5). On the basis of coalescent analysis with sequence data from the Simons Genome Diversity Project (SGDP) (13), the time to most recent common ancestor (TMRCA) of most Eurasian lineages containing the rs1426654 (A) allele is 29 thousand years ago (ka) [95% critical interval (CI), 28 to 31 ka], consistent with previous studies (15, 17) (Fig. 4). Haplotype analysis indicates that the rs1426654 (A) variant in Africans is on the same extended haplotype background as Europeans (Fig. 5 and fig. S6), likely reflecting gene flow from western Eurasia over at least the past 3 to 9 ky (23). The rs1426654 (A) variant is at high frequency (28%) in Tanzanian populations, suggesting a lower bound (~5 ka) for introduction of this allele into East Africa, the time of earliest migration from Ethiopia into Tanzania (24). Furthermore, the frequency of the rs1426654 (A) variant in eastern and southern Africans exceeds the inferred proportion of non-African ancestry (figs. S2 and S4). The estimate of genetic differentiation (FST) at the rs1426654 SNP between the West African Yoruba (YRI) and Ethiopian Amhara populations is 0.76, among the top 0.01% of values on chromosome 15 (table S4). These results are consistent with selection for the rs1426654 (A) allele in African populations following introduction, although complex models of demographic history cannot be ruled out.
Fig. 4 Coalescent trees and TMRCA dating. Inferred genealogies for regions flanking candidate causal loci. Each leaf corresponds to a single sampled chromosome from 1 of 278 individuals in the Simons Genome Diversity Project (13). Leaf nodes are colored by the population of origin of the individual, and sequences carrying the light allele are indicated with an open circle, located next to the leaf node. Node heights and 95% CI are presented for a subset of internal nodes. Gene genealogies are shown for regions flanking (A) SLC24A5, rs1426654 (15:48426484); (B) MFSD12, rs10424065 (19:3545022); (C) MFSD12, rs6510760 (19:3565253); (D) TMEM138, rs7948623 (11:61137147); (E) DDB1, rs11230664 (11:61076372); (F) OCA2, rs1800404 (15:28235773); (G) HERC2, rs4932620 (15:28514281); and (H) HERC2, rs6497271 (15:28365431).
Fig. 5 Haplotype networks at SLC24A5, MFSD12, DDB1/TMEM138, and OCA2/HERC2. Median-joining haplotype networks of regions containing candidate causal variants. Connections between circles indicate genetic relatedness, whereas size is relative to the frequency of haplotypes. Ancestry proportions are displayed as pie charts. Yellow and red subfigures indicate which haplotypes contain the allele associated with dark pigmentation (red) or light pigmentation (yellow). (A) Region (75 kb) flanking the causal variant at SLC24A5. (B and C) Regions (3 kb) flanking rs10424065 in MFSD12 and rs6510760 upstream of MFSD12. (D) Region (195 kb) flanking DDB1 extending from PGA5 to SDHAF2. (E to G) Regions 1, 3, and 2 (50 kb) at OCA2 and HERC2 (ordered based on highest to lowest probability of being causal from CAVIAR analysis).
Post by Admin on May 7, 2021 3:03:08 GMT
A lysosomal transporter protein associated with skin pigmentation
The region with the second strongest genetic association with skin pigmentation contains the major facilitator superfamily domain containing 12 (MFSD12) gene on chromosome 19 (Figs. 1 and 3 and tables S2 and S3). MFSD12 is homologous to other genes containing MFS domains, conserved throughout vertebrates, which function as transmembrane solute transporters (25). MFSD12 mRNA levels are low in depigmented skin of vitiligo patients (26), likely due to autoimmune-related destruction of melanocytes.
The MFSD12 locus is in a region with extensive recombination, enabling us to fine-map eight potentially causal SNPs (Table 1 and table S3) that cluster in two regions: one within MFSD12 and the other ~7600 to 9000 base pairs (bp) upstream of MFSD12 (Fig. 3). Many SNPs are in predicted regulatory regions active in melanocytes and/or keratinocytes (Table 1 and Fig. 3) and show enhancer activity in luciferase expression assays in a WM88 melanoma cell line (Table 1, table S5, and fig. S7). Within MFSD12, the two SNPs that CAVIAR identifies as having the highest probability of being causal are rs56203814 (F test, P = 3.6 × 10−18), a synonymous variant within exon 9, and rs10424065 (F test, P = 5.1 × 10−20), located within intron 8. They are 130 bp apart, are in strong LD, and affect gene expression in luciferase expression assays (1.5 to 2.7× higher expression than the minimal promoter; fig. S7). The SNPs upstream of MFSD12 with highest probability of being causal are rs112332856 (F test, P = 3.8 × 10−16) and rs6510760 (F test, P = 6.5 × 10−15). They are 346 bp apart, are in strong LD, and affect gene expression in luciferase expression assays (4.0 to 19.7× higher expression than the minimal promoter; fig. S7).
The derived rs56203814 (T) and rs10424065 (T) alleles associated with dark pigmentation are present only in African populations (or those of recent African descent) and are most common in East African populations with Nilo-Saharan ancestry (Fig. 1 and fig. S4). Coalescent analysis of the SGDP data set indicates that the rs10424065 (T) allele predates the 300-ka origin of modern humans (estimated TMRCA of 612 ka; 95% CI, 515 to 736 ka) (Fig. 4) (27).
At rs6510760 and rs112332856, the ancestral (G) and (T) alleles, respectively, associated with light pigmentation, are nearly fixed in Europeans and East Asians and are common in San as well as Ethiopian and Tanzanian populations with Afro-Asiatic ancestry (Fig. 1 and fig. S4). The derived rs6510760 (A) and rs112332856 (C) alleles (associated with dark pigmentation) are common in all sub-Saharan Africans except the San, as well as in South Asian and Australo-Melanesian populations (Fig. 1 and fig. S4). Haplotype analysis places the rs6510760 (A) allele [and linked rs112332856 (C) allele] in Australo-Melanesians on similar haplotype backgrounds relative to central and eastern Africans (Fig. 5 and fig. S6), suggesting that they are identical by descent from an ancestral African population. Coalescent analysis of the SGDP data set indicates that the TMRCA for the derived rs6510760 (A) allele is 996 ka [95% CI, 0.82 to 1.2 million years ago (Ma); Fig. 4].
We do not detect evidence for positive selection at MFSD12 using Tajima’s D and iHS statistics [figs. S5 and S8; as expected if selection were ancient (28)]. However, levels of genetic differentiation are elevated when comparing East African Nilo-Saharan and western European (CEU) populations (for example, FST = 0.85 for rs112332856, top 0.05% on chromosome 19), consistent with differential selection at this locus (table S4) (29).
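To make the FST values quoted here concrete, the sketch below computes Wright's FST for a single biallelic SNP from the allele frequencies in two populations, as (HT − HS)/HT. It weights the two populations equally and ignores sample-size corrections. The study's own estimator is not described in this section, so this is an illustration under those assumptions, and the example frequencies are hypothetical rather than taken from table S4.

```python
def wright_fst(p1, p2):
    """Wright's FST for one biallelic SNP given allele frequencies in two populations."""
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity of the pooled population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        return 0.0  # the SNP is monomorphic in both populations
    # Mean expected heterozygosity within each population.
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total


# Hypothetical frequencies of one allele in two strongly differentiated populations.
fst_example = wright_fst(0.95, 0.05)
```

With these made-up frequencies the function returns 0.81, which shows how large the frequency gap between two populations must be to reach values in the range reported for this locus.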
MFSD12 is within a cluster of 10 genes with high expression levels in primary human melanocytes relative to primary human keratinocytes (30), with MFSD12 as the most differentially expressed (90×; table S6). The genomic region (chr19:3541782-3581062) encompassing MFSD12 and neighboring gene HMG20B (a transcription factor common in melanocytes) has numerous deoxyribonuclease (DNase) I hypersensitive sites (DHS) and is enriched for H3K27ac enhancer marks in melanocytes (top 0.1% genome-wide; Fig. 3), suggesting that this region may regulate expression of genes critical to melanocyte function (31).
Analyses of gene expression using RNA sequencing (RNA-seq) data from 106 primary melanocyte cultures (table S7) indicate that African ancestry is significantly correlated with decreased MFSD12 gene expression [Pearson correlation coefficient (PCC), P = 5.0 × 10−2; fig. S9]. We observed significant associations between genotypes at rs6510760 and rs112332856 with expression of HMG20B [Bonferroni-adjusted P (Padj) < 4.9 × 10−3] and MFSD12 (Padj < 3.4 × 10−2) (fig. S9). In each case, the alleles associated with dark pigmentation correlate with decreased gene expression. Allele-specific expression (ASE) analysis indicates that individuals heterozygous for either rs6510760 or rs112332856 show increased allelic imbalance, relative to homozygotes, for MFSD12 (Mann-Whitney-Wilcoxon test, P = 4.9 × 10−3 and 1.3 × 10−2, respectively), consistent with regulation of gene expression in cis. A haplotype containing the rs6510760 (A)/rs112332856 (C) variants associated with dark pigmentation showed 4.9 times lower expression in luciferase assays than the haplotype containing rs6510760 (G)/rs112332856 (T) variants associated with light pigmentation (Kruskal-Wallis rank-sum test, P = 7.7 × 10−7; fig. S7 and table S5). We did not have power to detect an association between expression of MFSD12 and rs56203814 or rs10424065 due to low frequency (~2%) of the alleles associated with dark pigmentation in the primary melanocyte cultures.
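The ancestry and expression comparison above rests on a Pearson correlation coefficient. A minimal pure-Python sketch of that statistic follows. The ancestry proportions and expression values are invented for illustration and are not data from table S7 or fig. S9, and the sketch does not compute the associated P value.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-culture values: African ancestry proportion and
# normalized MFSD12 expression (arbitrary units).
african_ancestry = [0.05, 0.10, 0.40, 0.75, 0.95]
mfsd12_expression = [9.1, 8.8, 8.0, 7.2, 6.9]
r = pearson_r(african_ancestry, mfsd12_expression)
```

A negative `r` here corresponds to the direction reported in the text: higher African ancestry going with lower MFSD12 expression.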
Post by Admin on May 7, 2021 5:50:04 GMT
MFSD12 suppresses eumelanin biogenesis in melanocytes from lysosomes
We silenced expression of the mouse ortholog of MFSD12 (Mfsd12) using short hairpin RNAs (shRNAs) in immortalized melan-Ink4a mouse melanocytes derived from C57BL/6J-Ink4a−/− mice (32), which almost exclusively make eumelanin (Fig. 6). Reduction of Mfsd12 mRNA by ~80% with two distinct lentivirally encoded shRNAs (Fig. 6A) caused a 30 to 50% increase in melanin content compared to control cells (Fig. 6B), and a higher percentage of melanosomes per total cell area in most cells compared to cells transduced with nontarget shRNA (Fig. 6, C and D). A fraction of MFSD12-depleted cells harbored large clumps of melanin in autophagosome-like structures (fig. S10). These data suggest that MFSD12 suppresses eumelanin content in melanocytes and may offset autophagy.
Fig. 6 MFSD12 suppresses eumelanin production but localizes to lysosomes. Immortalized melan-Ink4a melanocytes expressing nontarget (sh NT) shRNA or either of two shRNA plasmid clones (#1 and 2) targeting Mfsd12 were analyzed for (A) Mfsd12 mRNA content by quantitative reverse transcription polymerase chain reaction (qRT-PCR), (B) melanin content by spectrophotometry, or (C) percentage of cell area containing melanin by bright-field microscopy. (D) Quantification. Data in (A) to (C) represent means ± SEM, normalized to sh NT samples, from three separate experiments. In (C), n (sh NT) = 97 cells, n (shMfsd12 #1) = 68 cells, and n (shMfsd12 #2) = 71 cells. Scale bar, 10 μm (C). (E to G) Melan-Ink4a melanocytes transiently expressing MFSD12-HA (E) or not transfected (F and G) were fixed, immunolabeled for HA (E) and for LAMP2 to mark lysosomes (E and G) or for TYRP1 to mark melanosomes (F), and analyzed by immunofluorescence and bright-field microscopy. Bright-field (melanin) images show pigmented melanosomes (pseudocolored red in the merged images). Insets, 4× magnification of boxed regions. Arrows, MFSD12-containing structures that overlap LAMP2 (E) or TYRP1-containing structures that overlap melanosomes (F); arrowheads, structures that do not overlap (G). Scale bars, 10 μm. (H) Quantification of overlap for structures labeled by MFSD12, TYRP1, LAMP2, and pigment. Data represent means ± SEM from three independent experiments; n = 17 cells (MFSD12 overlap with LAMP2 and melanin), 33 cells (TYRP1 overlap with melanin), or 23 cells (LAMP2 and melanin).
We assessed the localization of human MFSD12 isoform c (RefSeq NM_174983.4) tagged at the C terminus with the hemagglutinin (HA) epitope (MFSD12-HA). By immunofluorescence microscopy, MFSD12-HA localized to punctate structures throughout the cell. Surprisingly, these puncta, like those labeled by the endogenous lysosomal membrane protein LAMP2, but not the melanosomal enzyme TYRP1, overlapped only weakly with pigmented melanosomes (Fig. 6, E to G; quantified in Fig. 6H). Instead, MFSD12-HA colocalized with LAMP2 (Fig. 6E; quantified in Fig. 6H), indicating that the MFSD12 protein localizes to late endosomes and/or lysosomes in melanocytes and not to eumelanosomes.
Functional characterization of MFSD12 in mice
CRISPR-Cas9 was used to generate an Mfsd12 null allele in a wild-type mouse background (Fig. 7A and fig. S11). Four founders were observed with a uniformly gray coat color, rather than the expected agouti coat color (fig. S11, A and B). These four gray founders harbored deletions at the targeted site (fig. S11C). Microscopic observation revealed a lack of pheomelanin, resulting in white, rather than yellow, banding of hairs in Mfsd12 mutants (Fig. 7B).
Fig. 7 In vivo mouse model of MFSD12 deficiency. (A) Wild-type agouti mouse (left) with a gray Mfsd12-targeted littermate (right). (B) Hair from the Mfsd12-targeted mouse has grossly normal eumelanin (lower black region of the hair shaft); however, the upper subapical yellow band in wild-type (B, left) appears white in the Mfsd12 mutant (B, right) due to a reduction in pheomelanin.
The Mfsd12 knockout coat color appeared phenotypically similar to that of grizzled (gr) mice, an allele previously mapped to a syntenic ~2-Mb region overlapping Mfsd12 (33). Like our CRISPR-Cas9 Mfsd12 knockout, homozygous gr/gr mice are characterized by a gray coat resulting from dilution of yellow pheomelanin pigment from the subterminal agouti band of the hair shaft. Exome sequencing of an archived gr/gr DNA sample, subsequently confirmed by Sanger sequencing in an independent colony, identified a 9-bp in-frame deletion within exon 2 of Mfsd12 (fig. S12) as the sole mutation affecting a coding sequence in this mapped candidate region. The deleted amino acids for the gr/gr allele, Mfsd12 p.Leu163_Ala165del, are in the cytoplasmic loop between the transmembrane domains TM4 and TM5 within a highly conserved MFS domain (fig. S13). These results indicate that mutation of Mfsd12 is responsible for the gray coat color of gr/gr mutant mice, and that loss of Mfsd12 reduces pheomelanin within the hairs of agouti mice.
Together, these results indicate that MFSD12 plays a conserved role in mammalian pigmentation. Depletion of MFSD12 increases eumelanin content in a cell-autonomous manner in skin melanocytes, consistent with the lower levels of MFSD12 expression observed in melanocytes from individuals with African ancestry. Because MFSD12 localizes to lysosomes and not to eumelanosomes, this may reflect an indirect effect through modified lysosomal function. By contrast, loss of MFSD12 has the opposite effect on pheomelanin production, reflecting a more direct effect on function of pheomelanosomes, which have a distinct morphology (3), gene expression profile (34), and a potentially different intracellular origin from eumelanosomes (35). Although disruption of MFSD12 alone accounts for changes in pigmentation, the role of neighboring loci such as HMG20B on pigmentation remains to be explored.
Post by Admin on May 7, 2021 23:15:28 GMT
Skin pigmentation–associated loci that play a role in UV response are targets of selection
Another genomic region associated with pigmentation encompasses a ~195-kb cluster of genes on chromosome 11 that play a role in UV response and melanoma risk, including the damage-specific DNA binding protein 1 (DDB1) gene (Figs. 1 and 3 and table S3). DDB1 (complexed with DDB2 and XPC) functions in DNA repair (36); levels of DDB1 are regulated by UV exposure and MC1R signaling, a regulatory pathway of pigmentation (37). DDB1 is a component of CUL4-RING E3 ubiquitin ligases that regulate several cellular and developmental processes (38); it is critical for follicle maintenance and female fertility in mammals (39) and for plastid size and fruit pigmentation in tomatoes (40). Knockouts of DDB1 orthologs are lethal in both mouse and fruitfly development (41), and DDB1 only exhibits rare (<1% frequency) nonsynonymous mutations in the TGP data set. Genetic variants near DDB1 were associated with human pigmentation in an African population with high levels of European admixture (7).
Because of extensive LD in this region, CAVIAR identified 33 SNPs predicted to be causal (Table 1). The most strongly associated SNPs are located in a region conserved across vertebrates flanked by TMEM138 and TMEM216 (42) ~36 to 44 kb upstream of DDB1 and are in high LD within this cluster (r2 > 0.7 in East Africans) (Fig. 3, Table 1, and table S3). Among these, the most significantly associated SNP is rs7948623 (F test, P = 2.2 × 10−11), located 172 bp downstream of TMEM138, which shows enhancer activity in WM88 melanoma cells (91.9 to 140.8× higher than the minimal promoter; fig. S7 and table S5) and interacts with the promoters of DDB1 and neighboring genes in MCF-7 cells (Table 1 and Fig. 3) (43, 44).
A second group of tightly linked SNPs (LD r2 > 0.7 in East Africans) with predicted high probability of containing causal variants spans a ~195-kb region encompassing DDB1 and TMEM138 (Table 1 and Fig. 3). Two SNPs that tag this LD block are rs1377457 (F test, P = 1.5 × 10−9), located ~7600 bp downstream of TMEM138, and rs148172827 (F test, P = 1.8 × 10−9), an insertion/deletion polymorphism at TKFC (triokinase and FMN cyclase) located in an enhancer active in WM88 melanoma cells (67.6 to 76.2× higher than the minimal promoter; fig. S7 and table S5), which overlaps an MITF binding site in melanocytes (30, 45); both SNPs interact with the promoters of DDB1 and neighboring genes in MCF-7 cells (Table 1 and Fig. 3) (43, 44). SNPs within introns of DDB1 (rs12289370, rs7934735, rs11230664, rs12275843, and rs7120594) also tag this LD block (Table 1 and Fig. 3).
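The LD blocks above are defined by pairwise r2 > 0.7. As a sketch of what that statistic measures, the function below computes r2 between two biallelic SNPs from phased haplotypes. The input format (a list of 0/1 allele pairs) and the example haplotypes are hypothetical, and the study's own LD pipeline is not described in this section.

```python
def ld_r2(haplotypes):
    """r^2 between two biallelic SNPs.

    haplotypes: list of (allele_at_snp_a, allele_at_snp_b) pairs, coded 0 or 1,
    one pair per phased chromosome.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom


# Hypothetical sample of 10 chromosomes: the two SNPs mostly travel together.
example = [(1, 1)] * 4 + [(1, 0)] * 1 + [(0, 0)] * 5
r2_example = ld_r2(example)
```

In this made-up sample a single discordant chromosome out of ten already brings r2 down to about 0.67, just under the 0.7 cutoff used to label LD blocks in Table 1.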
RNA-seq data from 106 primary melanocyte cultures indicate that African ancestry is significantly correlated with increased DDB1 gene expression (PCC, P = 2.6 × 10−5; fig. S9). Association tests using a permutation approach indicated that, of the 35 protein-coding genes with a transcription start site within 1 Mb of rs7948623, expression of DDB1 is most strongly associated with a SNP in an intron of DDB1, rs7120594, at marginal statistical significance after correction for ancestry and multiple testing (Padj = 0.06; fig. S9). The allele associated with dark pigmentation at rs7120594 correlates with increased DDB1 expression. We did not have the power to detect an association between expression of DDB1 and SNPs in LD with rs7948623 due to low minor allele frequencies (~2%). The role of DDB1 and neighboring loci in human pigmentation remains to be further explored.
The derived rs7948623 (T) allele near TMEM138 (associated with dark pigmentation) is most common in East African Nilo-Saharan populations and is at moderate to high frequency in South Asian and Australo-Melanesian populations (Fig. 1 and fig. S4). At SNP rs11230664, within DDB1, the ancestral (C) allele (associated with dark pigmentation) is common in all sub-Saharan African populations, having the highest frequency in East African Nilo-Saharan, Hadza, and San populations (88 to 96%), and is at moderate to high frequency in South Asian and Australo-Melanesian populations (12 to 66%) (Fig. 1 and fig. S4). The derived (T) allele (associated with light pigmentation) is nearly fixed in European, East Asian, and Native American populations.
In South Asians and Australo-Melanesians, the alleles associated with darker pigmentation reside on haplotypes closely related, or identical, to those observed in Africa (Fig. 5 and fig. S6), suggesting that they are identical by descent. The TMRCAs for the derived dark allele at rs7948623 and the derived light allele at rs11230664 are estimated to be older than 600 and 250 ka, respectively (Fig. 4).
Consistent with a selective sweep, we see an excess of rare alleles (and extreme negative Tajima’s D values) and high levels of homozygosity extending ~350 and ~550 kb in Europeans and Asians, respectively (figs. S5 and S14). We observe extreme negative Tajima’s D values in East African Nilo-Saharans and San over a shorter distance (115 and 100 kb, respectively) (fig. S5). A haplotype extending greater than 195 kb is common in Eurasians and rare in Africans (Fig. 5) and tags the alleles associated with light skin pigmentation. The TMRCA of a large number of haplotypes carrying the rs7948623 (A) allele in non-Africans, associated with light pigmentation, is 60 ka (95% CI, 58 to 62 ka), close to the inferred time of the migration of modern humans out of Africa (Fig. 4) (46). These results, combined with large FST values between Africans and Europeans at SNPs tagging the extended haplotype near DDB1 (for example, FST = 0.98 between Nilo-Saharans and CEU at rs7948623, within the top 0.01% of values on chromosome 11; table S4), are consistent with differential selection of alleles associated with light and dark pigmentation in Africans and non-Africans at this locus.
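Tajima's D, used above as evidence of a sweep, compares the mean number of pairwise differences with the number of segregating sites. It is negative when rare alleles are in excess and positive when intermediate-frequency alleles are. The sketch below implements the standard formula (Tajima 1989) for a small set of 0/1-coded haplotypes. The input format and the toy data are assumptions for illustration, not the windowed genome-wide computation behind fig. S5.

```python
import math


def tajimas_d(haplotypes):
    """Tajima's D from equal-length haplotypes coded as sequences of 0/1."""
    n = len(haplotypes)
    if n < 4:
        raise ValueError("need at least 4 haplotypes")
    n_sites = len(haplotypes[0])
    pairs = n * (n - 1) / 2
    seg_sites = 0
    pi = 0.0  # mean number of pairwise differences
    for j in range(n_sites):
        k = sum(h[j] for h in haplotypes)  # copies of allele 1 at site j
        if 0 < k < n:
            seg_sites += 1
            pi += k * (n - k) / pairs
    if seg_sites == 0:
        raise ValueError("no segregating sites")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / math.sqrt(variance)


# Toy data, 10 haplotypes x 5 sites.
# Every variant is a singleton (excess of rare alleles):
rare = [[1 if i == j else 0 for j in range(5)] for i in range(10)]
# Every variant is carried by half the sample (intermediate frequency):
common = [[1] * 5 if i < 5 else [0] * 5 for i in range(10)]
d_rare = tajimas_d(rare)
d_common = tajimas_d(common)
```

The singleton-only sample gives a negative D and the half-and-half sample a positive D, mirroring the sweep-like and balancing-selection-like signals discussed in the text.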
Identification of variation at OCA2 and HERC2 affecting skin pigmentation
Another region of significantly associated SNPs encompasses the OCA2 and HERC2 loci on chromosome 15 (Fig. 3 and table S3). HERC2 was identified in GWAS for eye, hair, and skin pigmentation traits (5–7, 47–49). The oculocutaneous albinism II gene (OCA2, formerly called the P gene) encodes a 12-transmembrane domain–containing chloride transporter protein and affects pigmentation by modulating melanosomal pH (50). The most common types of albinism in Africans are caused by mutations in OCA2 (51).
Because of extensive LD in the OCA2 and HERC2 region, CAVIAR predicted 10 potentially causal SNPs (Table 1) that cluster within three regions. We order these clusters based on physical distance; region 1 is located within OCA2, and regions 2 and 3 are located within introns of HERC2 (Fig. 3).
The SNP with highest probability of being causal from CAVIAR analysis is rs1800404 (F test, P = 1.6 × 10−8), a synonymous variant located in region 1 within exon 10 of OCA2 (Fig. 3, Table 1, and table S3) associated with eye color in Europeans (52). The ancestral rs1800404 (C) allele, associated with dark pigmentation, is common in most Africans as well as southern and eastern Asians and Australo-Melanesians, whereas the derived (T) allele, associated with light pigmentation, is most common (frequency >70%) in Europeans and San (Fig. 1 and fig. S4), consistent with a previous observation (53). Haplotype (Fig. 5) and coalescent analyses (Fig. 4 and fig. S6) show two divergent clades, one enriched for the rs1800404 (C) allele and the other for the rs1800404 (T) allele. Coalescent analysis indicates that the TMRCA of all lineages is 1.7 Ma (95% CI, 1.5 to 2.0 Ma), and the TMRCA of lineages containing the derived (T) allele is 629 ka (95% CI, 426 to 848 ka) (Fig. 4). The deep coalescence of lineages and the positive Tajima’s D values in this region in both African and non-African populations (fig. S5) are consistent with balancing selection acting at this locus.
The SNP with highest probability of being causal in region 3 is rs4932620 (F test, P = 3.2 × 10−9) located within intron 11 of HERC2 (Fig. 3, Table 1, and table S3). This SNP is 917 bp from rs916977, a SNP associated with blue eye color in Europeans (54, 55), and is in strong LD (r2 = 1.0 in most East African populations) with SNPs extending into region 2 of HERC2 (Table 1). The derived rs4932620 (T) allele associated with dark skin pigmentation is most common in Ethiopian populations with high levels of Nilo-Saharan ancestry and is at moderate frequency in other Ethiopian, Hadza, and Tanzanian Nilo-Saharan populations (Fig. 1 and fig. S4). Haplotype analysis indicates that the rs4932620 (T) allele in South Asians and Australo-Melanesians is on the same or similar haplotype background as in Africans (Fig. 5 and fig. S6), suggesting that it is identical by descent. The TMRCA of haplotypes containing the rs4932620 (T) allele is 247 ka (95% CI, 158 to 345 ka) (Fig. 4).
We also observe an LD block of SNPs within region 2 of HERC2 that are associated with skin pigmentation, although they do not reach genome-wide significance (table S3). These are in a region with enhancer activity in Europeans (47). For example, SNP rs6497271 (F test, P = 1.8 × 10−6), which is located 437 bp from SNP rs12913832, has been associated with skin color in Europeans (47) and is in a consensus SOX2 motif (a transcription factor that modulates levels of MITF in melanocytes) (Fig. 3) (56). The ancestral rs6497271 (A) allele associated with dark pigmentation is on haplotypes in South Asians and Australo-Melanesians similar or identical to those in Africans (Fig. 5 and fig. S6), suggesting that they are identical by descent. The derived (G) allele associated with light skin pigmentation is most common in Europeans and San and dates to 921 ka (95% CI, 703 ka to 1.2 Ma) (Figs. 1 and 4 and figs. S4 and S6). SNPs associated with pigmentation at all three regions show high allelic differentiation when comparing East African Nilo-Saharans and CEU (FST = 0.72 to 0.85, top 0.5% on chromosome 15) (table S4).
Analyses of RNA-seq data from 106 primary melanocyte cultures indicate that African ancestry is significantly correlated with increased OCA2 gene expression (PCC, Padj = 6.1 × 10−7) (fig. S9). A permutation approach identified significant associations between OCA2 expression and SNPs within an LD block tagged by rs4932620 extending across regions 2 and 3 (Padj = 2.2 × 10−2). Alleles in this LD block associated with dark pigmentation correlate with increased OCA2 expression. We did not observe associations between the candidate causal variants in region 1 and OCA2 expression despite a high minor allele frequency (34%). However, we observe a significant association between a haplotype tagged by rs1800404 and alternative splicing resulting in inclusion/exclusion of exon 10 (linear regression t test, P = 9.1 × 10−40). Exon 10 encodes the amino acids encompassing the third transmembrane domain of OCA2 and is the location of several albinism-associated OCA2 mutations (57, 58), raising the possibility that the shorter transcript encodes a nonfunctional channel. Comparing splice junction usage across individuals, we estimate that each additional copy of the light rs1800404 (T) allele reduces inclusion of exon 10 by ~20% (95% CI, 17.9 to 21.5%; fig. S9). Therefore, homozygotes for the light rs1800404 (T) allele are expected to produce ~60% functional OCA2 protein (compared to individuals with albinism who produce no functional OCA2 protein).
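The ~60% figure follows from a simple additive model: each copy of the rs1800404 (T) allele removes about 20 percentage points of exon 10 inclusion. The sketch below spells out that arithmetic. It assumes, as the paragraph's reasoning implies, that C/C homozygotes include exon 10 in essentially all transcripts. The function name and the baseline parameter are illustrative.

```python
def expected_exon10_inclusion(n_t_alleles, reduction_per_allele=0.20, baseline=1.0):
    """Expected fraction of OCA2 transcripts that include exon 10.

    n_t_alleles: copies of the light rs1800404 (T) allele (0, 1, or 2).
    baseline: assumed inclusion fraction for C/C homozygotes.
    """
    if n_t_alleles not in (0, 1, 2):
        raise ValueError("a diploid genotype carries 0, 1, or 2 copies")
    return baseline - reduction_per_allele * n_t_alleles


inclusion_by_genotype = {
    genotype: expected_exon10_inclusion(n_t)
    for genotype, n_t in (("C/C", 0), ("C/T", 1), ("T/T", 2))
}

# Range for T/T homozygotes implied by the reported 95% CI (17.9 to 21.5%).
tt_low = expected_exon10_inclusion(2, reduction_per_allele=0.215)
tt_high = expected_exon10_inclusion(2, reduction_per_allele=0.179)
```

Under these assumptions heterozygotes retain about 80% inclusion and T/T homozygotes about 60%, with the confidence interval on the per-allele effect giving roughly 57 to 64% for T/T.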