Post by Admin on Feb 14, 2018 18:48:19 GMT
Next-generation sequencing (NGS) technologies are revolutionizing the field of ancient DNA (aDNA), and have allowed the sequencing of complete ancient genomes [5,6], such as that of Ötzi, a Neolithic human body found in the Alps [1]. However, very little is known of the genetic composition of earlier hunter-gatherer populations from the Mesolithic period (ca. 10,000–5,000 years before present, BP), which immediately preceded the Neolithic. The Iberian site called La Braña-Arintero was discovered in 2006 when two male skeletons were found in a deep cave system, 1,500 meters above sea level in the Cantabrian mountain range (León, Northwestern Spain) (Fig. 1a). The skeletons were dated to ~7,000 years BP (7,940–7,690 calibrated BP) [7]. Because of the cold environment and stable thermal conditions in the cave, the preservation of these specimens proved to be exceptional (Fig. 1b). We identified a tooth with high human DNA content (48.4%) and sequenced this specimen to a final 3.40X effective genomic depth-of-coverage (Extended Data Fig. 1).

Figure 1. Geographic location and genetic affinities of the La Braña 1 individual

We undertook several tests to assess the authenticity of the genome sequence and to determine the amount of potential modern human contamination. First, we observed that sequence reads from both the mitochondrial DNA (mtDNA) and the nuclear DNA of La Braña 1 showed the typical ancient DNA misincorporation patterns that arise from degradation of DNA over time [8] (Extended Data Fig. 2a, b). Second, we showed that the observed number of human DNA fragments was negatively correlated with the fragment length (R² > 0.92), as expected for ancient degraded DNA, and that the estimated rate of DNA decay was low and in agreement with predicted values [9] (Extended Data Fig. 2c, d).
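The negative correlation between fragment count and fragment length can be illustrated with a log-linear fit. The following is only a minimal sketch, assuming an exponential decline of counts with length; the histogram values are invented for illustration and the function name `fit_log_linear` is hypothetical, not something taken from the paper.

```python
import math

def fit_log_linear(lengths, counts):
    """Least-squares fit of ln(count) = intercept + slope * length.

    Returns (intercept, slope, r_squared). A negative slope means that
    longer fragments are rarer, as expected for degraded DNA.
    """
    ys = [math.log(c) for c in counts]
    n = len(lengths)
    mean_x = sum(lengths) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in lengths)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return intercept, slope, r_squared

# Invented fragment-length histogram (length in bp, number of fragments).
lengths = [50, 60, 70, 80, 90, 100]
counts = [12000, 8100, 5300, 3700, 2400, 1650]

intercept, slope, r2 = fit_log_linear(lengths, counts)
print(f"slope per bp: {slope:.4f}, R^2: {r2:.3f}")
```

In this toy setup the slope plays the role of a per-base decay constant; how the authors actually derived their decay rate is described in their Extended Data, not here.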
We then estimated the contamination rate in the mtDNA genome, assembled to a high depth-of-coverage (91X), by checking for positions differing from the mtDNA genome (haplogroup U5b2c1) that was previously retrieved with a capture method [2]. We obtained an upper contamination limit of 1.69% (0.75%–2.6%, 95% CI) (Supplementary Information). Finally, to generate a direct estimate of nuclear DNA contamination, we screened for heterozygous positions (among reads with >4x coverage) in known polymorphic sites (dbSNP-137) at uniquely mapped sections on the X chromosome [6] (Supplementary Information). We found that the proportion of false heterozygous sites was 0.31%. Together these results suggest low levels of contamination in the La Braña 1 sequence data.
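A contamination estimate of this kind is a proportion of discordant reads with a confidence interval around it. The sketch below uses a Wilson score interval, which is an assumption made for illustration; the text does not say which interval the authors used, and the read counts are hypothetical.

```python
import math

def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

# Hypothetical counts: reads that disagree with the consensus at
# diagnostic positions, out of all reads covering those positions.
discordant_reads = 15
covering_reads = 900

rate = discordant_reads / covering_reads
low, high = wilson_interval(discordant_reads, covering_reads)
print(f"contamination: {rate:.2%} (95% CI {low:.2%} to {high:.2%})")
```

The same calculation applies to the X-chromosome test in a male sample, where any apparent heterozygous site must come from contamination or error.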
Post by Admin on Feb 15, 2018 18:59:29 GMT
To investigate the relationship to extant European samples, we conducted Principal Component Analysis (PCA) [10] and found the ~7,000-year-old Mesolithic sample was divergent from extant European populations (Extended Data Fig. 3a, b), but was placed in proximity to Northern Europeans (e.g. samples from Sweden and Finland) [11–14]. Additional PCAs and allele sharing analyses with ancient Scandinavian specimens [3] supported the genetic similarity of the La Braña 1 genome to Neolithic hunter-gatherers (Ajv70, Ajv52, Ire8) relative to Neolithic farmers (Gok4, Ötzi) (Fig. 1c, Extended Data Fig. 3c and Extended Data Fig. 4). Thus, this Mesolithic individual from Southwestern Europe represents a formerly widespread gene pool that seems to be partially preserved in some modern-day Northern European populations, as suggested previously with limited genetic data [2,3].

We subsequently explored the La Braña affinities to an ancient Upper Palaeolithic genome from the Mal’ta site near Lake Baikal in Siberia [15]. Outgroup f3 and D statistics [16,17], using different modern reference populations, support that Mal’ta is significantly closer to La Braña 1 than to Asians or modern Europeans (Extended Data Fig. 5 and Supplementary Information). These results suggest that despite the vast geographical distance and temporal span, La Braña 1 and Mal’ta share common genetic ancestry, indicating a genetic continuity in ancient Western and Central Eurasia. This observation matches findings of similar cultural artifacts across time and space in Upper Paleolithic Western Eurasia and Siberia, particularly the presence of anthropomorphic “Venus” figurines which have been recovered from several sites in Europe and Russia, including the Mal’ta site [15].

We also compared the genome-wide heterozygosity of the La Braña 1 genome to a dataset of modern humans with similar coverage (3–4X).
The overall genomic heterozygosity was 0.042%, lower than the values observed in present day Asians (0.046–0.047%), Europeans (0.051–0.054%), and Africans (0.066–0.069%) (Extended Data Fig. 6a). The effective population size, estimated from heterozygosity patterns, suggests a global reduction in population size of ~20% relative to extant Europeans (Supplementary Information). Moreover, no evidence of tracts of autozygosity suggestive of inbreeding was observed (Extended Data Fig. 6b).

To systematically investigate the timing of selection events in the recent history of modern Europeans, we compared the La Braña genome to modern populations at loci that have been categorized as of interest for their role in recent adaptive evolution. With respect to two recent well-studied adaptations to changes in diet, we found the ancient genome to carry the ancestral allele for lactose intolerance [4] and ~5 copies of the salivary amylase (AMY1) gene (Extended Data Fig. 7 and Supplementary Information), a copy number compatible with a low-starch diet [18]. These results suggest the La Braña hunter-gatherer was poor at digesting milk and starch, supporting the hypotheses that these abilities were selected for during the later transition to agriculture.

To expand the survey, we analyzed a catalog of candidate signals for recent positive selection based on whole-genome sequence variation from the 1000 Genomes Project [13], which included 35 candidate non-synonymous variants, ten of which were detected uniquely in the CEU sample (Utah residents with Northern and Western European ancestry) [19]. For each variant we assessed whether the Mesolithic genome carried the ancestral or derived (putatively adaptive) allele. Of the ten variants, the Mesolithic genome carried the ancestral and non-selected allele as a homozygote in three regions: C12orf29 (a gene with unknown function), SLC45A2 (rs16891982) and SLC24A5 (rs1426654) (Table 1).
The latter two variants are the two strongest known loci affecting light skin pigmentation in Europeans [20–22] and their ancestral alleles and associated haplotypes are either absent or segregate at very low frequencies in extant Europeans (3% and 0% for SLC45A2 and SLC24A5 respectively) (Fig. 2). We subsequently examined all genes known to be associated with pigmentation in Europeans [22], finding ancestral alleles in MC1R, TYR and KITLG and derived alleles in TYRP1, ASIP and IRF4 (Supplementary Information). Although the precise phenotypic effects cannot currently be ascertained in a European genetic background, results from functional experiments [20] indicate that the allelic combination in this Mesolithic individual is likely to have resulted in dark skin pigmentation and dark/brown hair. Further examination revealed that this individual carried the rs12913832*C single nucleotide polymorphism (SNP) and the associated homozygous haplotype spanning the HERC2/OCA2 locus that is strongly associated with blue eye color [23]. Moreover, a prediction of eye color based on genotypes at additional loci using HIrisPlex [24] yielded a 0.823 maximal and 0.672 minimal probability for being non-brown eyed (Supplementary Information). The genotypic combination leading to a predicted phenotype of dark skin and non-brown eyes is unique and no longer present in contemporary European populations. Our results indicate that the adaptive spread of light skin pigmentation alleles was not complete in some European populations by the Mesolithic, and that the spread of alleles associated with light/blue eye color may have preceded changes in skin pigmentation.
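The per-variant check described in this post (does the ancient genome carry the ancestral or the derived allele?) can be sketched as a simple classification. This is a hedged illustration only: the function name is hypothetical, and the allele letters in the table are assumptions added for the example and should be checked against dbSNP. Only the "homozygous ancestral" outcome for the two pigmentation SNPs comes from the text.

```python
def classify_genotype(genotype, ancestral, derived):
    """Classify a diploid genotype string such as 'CC' or 'AG'."""
    if len(genotype) != 2:
        raise ValueError("expected a two-letter diploid genotype")
    alleles = set(genotype)
    if not alleles <= {ancestral, derived}:
        return "unexpected allele"
    if alleles == {ancestral}:
        return "homozygous ancestral"
    if alleles == {derived}:
        return "homozygous derived"
    return "heterozygous"

# SNP id -> (ancestral allele, derived allele, called genotype).
# Allele letters are assumptions for illustration, not quoted from the paper.
calls = {
    "rs16891982 (SLC45A2)": ("C", "G", "CC"),
    "rs1426654 (SLC24A5)": ("G", "A", "GG"),
}

for snp, (ancestral, derived, genotype) in calls.items():
    print(snp, "->", classify_genotype(genotype, ancestral, derived))
```

With low-coverage ancient data a real pipeline would also need to account for read depth and damage before trusting a homozygous call; that step is left out of this sketch.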
Post by Admin on Feb 17, 2018 19:15:08 GMT
Figure 2. Ancestral variants around the SLC45A2 (rs16891982, above) and SLC24A5 (rs1426654, below) pigmentation genes in the Mesolithic genome (blue: ancestral, red: derived)

For the remaining loci, La Braña 1 displayed the derived, putatively adaptive variants in five cases, including three genes, PTX4, UHRF1BP1, and GPATCH1 [19], involved in the immune system (Table 1 and Extended Data Fig. 8). The latter is associated with the risk of bacterial infection. We subsequently determined the allelic states in 63 SNPs from 40 immunity genes with previous evidence for positive selection and for carrying polymorphisms shown to influence susceptibility to infections in modern Europeans (Supplementary Information). La Braña 1 carries derived alleles in 24 genes (60%) that have a wide range of functions in the immune system: pattern recognition receptors, intracellular adaptor molecules, intracellular modulators, cytokines and cytokine receptors, chemokines and chemokine receptors and effector molecules. Interestingly, four out of six SNPs from the first category lie in intracellular receptors of viral nucleic acids (TLR3, TLR8, IFIH1/MDA5 and LGP2) [25].

Finally, to explore the functional regulation of the genome, we also assessed the La Braña 1 genotype at all expression quantitative trait loci (eQTL) regions associated with positive selection in Europeans (Supplementary Information). The most interesting finding is arguably the predicted overexpression of eight immunity genes (36% of those with described eQTLs), including three Toll-like receptor genes (TLR1, TLR2 and TLR4) involved in pathogen recognition [26]. These observations suggest that the Neolithic transition did not drive all cases of adaptive innovation on immunity genes found in modern Europeans. Several of the derived haplotypes seen at high frequency today in extant Europeans were already present during the Mesolithic, either as neutral standing variation and/or due to selection predating the Neolithic.
De novo mutations that increased in frequency rapidly in response to zoonotic infections during the transition to farming should be identified among those genes where La Braña 1 carries ancestral alleles. To confirm whether the genomic traits seen at La Braña can be generalized to other Mesolithic populations, analyses of additional ancient genomes from Central and Northern Europe will be needed. Nevertheless, this genome sequence provides the first insight as to how these hunter-gatherers are related to contemporary Europeans and other ancient peoples in both Europe and Asia, and suggests how ancient DNA can shed light on the timing and nature of recent positive selection.

Nature. 2014 Mar 13; 507(7491): 225–228.
Post by Admin on Feb 18, 2018 18:46:03 GMT
La Braña 1 and 2 mtDNA HVR1

PCR-amplified, cloned, and sequenced mitochondrial (mtDNA) HVR1 sequences, generated in two independent laboratories, indicate that the La Braña specimens (Figure 1) belong to the U5b haplotype (16192T-16270T) (see Table S1 available online). Although the observation of the same haplotype in both individuals could be explained through matrilineal family relationship, the emerging picture of genetic uniformity within European populations during the Mesolithic suggests this might not be the case.

Figure 1. The Two Mesolithic Skeletons as They Were Discovered. The images show the skeletons as they were accidentally discovered in 2006. Above, La Braña 1; below, La Braña 2.

These two novel sequences were aligned against all previously reported mtDNA HVR-1 sequences from European Paleolithic, Mesolithic, and Neolithic individuals, accounting for a total number of 166 sequences (each 253 bp in length). Serial coalescent simulations showed low support for a population model of genetic continuity from Mesolithic to Neolithic populations, suggesting that mtDNA variation better fitted population models where European Paleolithic/Mesolithic populations were replaced during the Neolithic transition (see Supplemental Experimental Procedures).

La Braña 1 Complete mtDNA Genome

We subsequently captured and sequenced an mtDNA library from La Braña 1 on an Illumina Hi-Seq 2000 platform at the Center for GeoGenetics in Copenhagen, Denmark (see Supplemental Experimental Procedures). The number of raw reads generated was 44,581,347, of which 19,993,417 uniquely mapped to the human mtDNA reference genome (rCRS). Sequences starting and ending in the same nucleotide were collapsed because they could derive from the same template molecules. The clonality of the sample was relatively high, and after the collapse, only 5,488 reads were kept.
Nevertheless, it was possible to retrieve the complete mtDNA with a final 28× coverage and 16,450 sites covered at least once (Figure 2). The La Braña 1 mtDNA haplotype was a U5b2c1 (Tables S2 and S3), according to the standard PhyloTree classification [24] and the HaploGrep online tool for haplogroup attribution [25].

Figure 2. The La Braña 1 Complete Mitochondrial Genome. Mapping coverage of unique DNA reads (in red) and the mtDNA GC content. Coverage is correlated with GC content as shown in [19] and [30].

MtDNA Contamination Estimates

To estimate the potential modern DNA contamination in the generated results, we followed several approaches. First, we estimated the phylogenetic assignment of the nucleotide positions differing from the mtDNA human reference (rCRS) in the light of the known mtDNA tree (Table S2) [25]. We then searched for heterogeneities (that could either be contaminants, heteroplasmic sites, or damage) in those positions (Table S2), finding the U5b consensus sequence in 92% of the reads. Thus, the upper limit for mtDNA contamination is 8% (2%–13%, 95% C.I.). Second, we searched for the identical 16192T-16270T HVR1 mtDNA haplotype (between positions 16022 and 16400) using an in-house database (compiled by F. Calafell) and found that it is residually present at a 0.4% frequency in modern populations from the Iberian Peninsula, as estimated from 2,749 published mtDNA sequences. At a pan-European level, the same haplotype is found in only 40 out of 22,807 (0.18%) published mtDNA individual sequences. Third, we analyzed the ratio of nucleotide residues at the 5′ and 3′ ends of the reads. It has been demonstrated that ancient DNA templates exhibit 5′ and 3′ overhangs, resulting in inflated cytosine deamination rates and changes from cytosine to thymine residues at the 5′ ends and from guanines to adenines at the 3′ ends [22, 26–28].
This particular nucleotide misincorporation pattern has been observed in a number of ancient samples that have been subject to deep sequencing, suggesting that it is a specific trait of ancient DNA sequences [19]. We have analyzed the base composition at the sequence ends, finding the described signal of cytosine deamination (Figure S1), thus suggesting that the La Braña 1 consensus sequence is in fact endogenous. In conclusion, most of the sequences retrieved show a nucleotide misincorporation pattern typical of ancient DNA sequences, derive essentially from a single individual, and show a phylogenetically coherent haplotype that is rare in modern Iberian populations.
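The 5′ cytosine-to-thymine signal described above can be illustrated by counting, position by position from the 5′ end, how often a reference C is read as a T. This is a toy sketch under simplifying assumptions (ungapped, equal-length reference/read pairs given in 5′ to 3′ orientation); the function name and the example sequences are invented, and dedicated damage-profiling tools do considerably more.

```python
def ct_rate_by_position(alignments, max_pos=10):
    """Fraction of reference C bases read as T at each 5' position.

    `alignments` is an iterable of (reference, read) string pairs of
    equal length. Positions with no reference C yield None.
    """
    c_total = [0] * max_pos
    c_to_t = [0] * max_pos
    for ref, read in alignments:
        for i, (r, q) in enumerate(zip(ref[:max_pos], read[:max_pos])):
            if r == "C":
                c_total[i] += 1
                if q == "T":
                    c_to_t[i] += 1
    return [c_to_t[i] / c_total[i] if c_total[i] else None
            for i in range(max_pos)]

# Invented alignments: two reads show a C->T change at the first base.
toy_alignments = [
    ("CATCG", "TATCG"),
    ("CCGTA", "TCGTA"),
    ("CGCAT", "CGCAT"),
    ("ACCGT", "ACCGT"),
]

rates = ct_rate_by_position(toy_alignments, max_pos=3)
print(rates)
```

An elevated rate at the first few positions that falls away further into the read is the pattern interpreted as authentic ancient damage; the mirror-image G to A count at the 3′ end would follow the same logic on reversed strings.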
Post by Admin on Feb 19, 2018 18:59:21 GMT
La Braña 1 and 2 Shotgun Genomic Data

For La Braña 1, 42,396,337 raw sequence reads were obtained, of which 6,113,535 mapped to the human reference genome (Hg18). After collapsing them to remove clonal reads and paralogs, 728,880 uniquely mapped reads remained (Table S4). The number of reads recovered from La Braña 2 was much lower, with 15,670,532 original reads, of which 364,578 could be uniquely mapped (Table S4). This represents a shotgun efficiency of 1.7% and 2.3% for La Braña 1 and 2, respectively, higher than the efficiency figures found in other samples from the Iberian Peninsula [27] but significantly lower than, for instance, those obtained for Vindija Neanderthals [29]. The rather high clonality can be explained by an initial low copy number of DNA template in the ancient extracts. The generated data covered 41,320,020 nucleotide positions for La Braña 1 and 16,876,146 for La Braña 2; thus, about 1.34% and 0.53% of the La Braña 1 and 2 genomes were retrieved, respectively. The average read length was 74.7 and 59.5 nucleotides, respectively (Table S4), shorter than the 85.7 nucleotides observed in the mtDNA reads (Table S3) but similar to the length previously reported for DNA extracted from Neanderthal remains [30]. The ratio of X chromosome versus Y chromosome sequences was close to 9:1 (n = ∼18,000 versus n = ∼2,000 reads, and n = ∼8,000 versus n = ∼1,000 for La Braña 1 and 2, respectively), consistent with the length ratio between both sex chromosomes. This would confirm the previous anthropological identification of La Braña specimens as males.

Figure 3. PCA Analyses of the Two Mesolithic Individuals. Left, La Braña 1; right, La Braña 2. The analyses were generated using 47,742 SNPs for La Braña 1 and 32,339 SNPs for La Braña 2, and five current European populations (Finns, Iberians, Great Britons, Tuscans, and CEU) [31].
A worldwide genomic principal component analysis (PCA) with data from the 1000 Genomes Project [31] places La Braña 1 and 2 near, but not within, the variation of current European populations (Figure S2). However, when compared exclusively to European populations, La Braña 1 and 2 fall closer to Northern European populations such as CEU and Great Britons than to Southern European groups such as Iberians or Tuscans (Figure 3). With 1KGPomni chip [31] data, the PCA generates a similar pattern (Figure S3), although the general geographic structure is less clear because of the limited number of SNPs (see Supplemental Experimental Procedures).
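The shotgun efficiency figures quoted in this post can be reproduced from the read counts given in the text, taking efficiency as uniquely mapped reads divided by raw reads. That definition is inferred from the numbers rather than stated explicitly by the authors, and the function name is hypothetical.

```python
def shotgun_efficiency(unique_mapped, raw_reads):
    """Percentage of raw reads that end up as uniquely mapped reads."""
    return 100.0 * unique_mapped / raw_reads

# Read counts as quoted in the post: (uniquely mapped, raw).
samples = {
    "La Braña 1": (728_880, 42_396_337),
    "La Braña 2": (364_578, 15_670_532),
}

efficiencies = {name: shotgun_efficiency(u, r) for name, (u, r) in samples.items()}
for name, eff in efficiencies.items():
    print(f"{name}: {eff:.1f}%")
```

Rounded to one decimal place, these come out at 1.7% and 2.3%, matching the values in the text.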