Post by Admin on May 3, 2024 23:05:46 GMT
Indigenous Australian genomes show deep structure and rich novel variation
Abstract
The Indigenous peoples of Australia have a rich linguistic and cultural history. How this relates to genetic diversity remains largely unknown because of their limited engagement with genomic studies. Here we analyse the genomes of 159 individuals from four remote Indigenous communities, including people who speak a language (Tiwi) not from the most widespread family (Pama–Nyungan). This large collection of Indigenous Australian genomes was made possible by careful community engagement and consultation. We observe exceptionally strong population structure across Australia, driven by divergence times between communities of 26,000–35,000 years ago and long-term low but stable effective population sizes. This demographic history, including early divergence from Papua New Guinean (47,000 years ago) and Eurasian groups (ref. 1), has generated the highest proportion of previously undescribed genetic variation seen outside Africa and the most extended homozygosity compared with global samples. A substantial proportion of this variation is not observed in global reference panels or clinical datasets, and variation with predicted functional consequence is more likely to be homozygous than in other populations, with consequent implications for medical genomics (ref. 2). Our results show that Indigenous Australians are not a single homogeneous genetic group and their genetic relationship with the peoples of New Guinea is not uniform. These patterns imply that the full breadth of Indigenous Australian genetic diversity remains uncharacterized, potentially limiting genomic medicine and equitable healthcare for Indigenous Australians.
www.nature.com/articles/s41586-023-06831-w
Post by Admin on May 16, 2024 2:26:24 GMT
The Indigenous populations of Australia remain poorly represented in sequencing panels and clinical databases. Their inclusion is warranted on the grounds of equity and their unique demographic history. Indigenous Australians probably descend from an early dispersal of humans across Asia (ref. 3), inheriting substantial ancestry from extinct hominin groups (refs. 1,4,5). Previous DNA studies have identified novel variation (ref. 6) and inferred a long history of geographical regionalism in Australia (ref. 7). An earlier whole-genome sequencing study inferred a sudden separation from Papuans 25–40 thousand years ago (ka) and divergence within Australia occurring 10–32 ka (ref. 1). Importantly, all 83 participants in the study were Pama–Nyungan language speakers, a language family that is widespread across Australia despite its relatively recent origin (estimated at 6 ka; ref. 8), possibly accounting for the lack of strong discernible structure (ref. 1). It is estimated that another 27 language families (ref. 9), largely restricted to the Top End and Kimberley region, are unrepresented in genomic data. Linguistic variation is often correlated with patterns of genetic variation (ref. 10), supporting the inclusion of speakers of these languages in genomics studies.
If limited population structure remains after more representative geographical and language group sampling, a common set of genomic tools and reference panels will be sufficient to inform medical research and clinical practice. Alternatively, previously undocumented structure, due to patterns of migration, isolation and population size change, may indicate the poor suitability of such panels and support wider sampling to capture the full distribution and diversity of common and rare alleles.
Such patterns can be explored by quantifying the levels of novel and shared variation relative to other human populations and by applying population genetic models to determine structure and its causes. Both approaches require adequate sampling within communities and the inclusion of communities that capture the breadth of the underlying genetic diversity.
The NCIG collection
The Australian National University holds more than 7,000 biospecimens collected between the 1960s and 1990s from about 40 Indigenous communities (Supplementary Note 1). A panel of leading Aboriginal and Torres Strait Islander Australians recommended the collection be placed under Indigenous-majority custodianship, leading to the establishment of the National Centre for Indigenous Genomics (NCIG) in 2016 (ref. 11). The primary role of NCIG is to engage with Indigenous communities on the existence and nature of the collection, extend and promote its use for research and ensure that research is done with appropriate personal consent and community engagement (Methods).
During recent community engagement, 159 community members provided new blood or saliva samples under modern consent and ethics protocols. This study analyses genetic data from these Indigenous Australians from four environmentally diverse regions across northern and central Australia, including tropical savannah and rainforest, remote islands and desert. (Clearly these environments will have varied over the many millennia Indigenous Australians have lived on the continent). This is a large and purposefully diverse collection of genomic data from Indigenous Australians.
The cohort includes 59 individuals from the Tiwi Islands. The Tiwi people experienced a long period of isolation from mainland Australia (ref. 12) and speak a linguistic isolate unrelated to the Pama–Nyungan languages spoken by the other three communities involved. Included are 33 people from the community of Wurrumiyanga on Bathurst Island, 20 from Milikapiti and six from Pirlangimpi on Melville Island. This is about 3% of the current population of the islands (around 2,000). The cohort also includes 48 individuals from the community of Yarrabah on the traditional lands of the Gunggandji and Mandingalbay Yidinji. The Yarrabah Aboriginal Mission, established in 1892, was used as a settlement for displaced Indigenous people from across Queensland. In 1938, 43 different tribal groups were represented in Yarrabah (ref. 13). The cohort contains 14 people from the Central Desert community of Titjikala, comprising members of the Southern Arrernte, Yankunytjatjara, Luritja and Pitjantjatjara. Finally, there are 38 individuals from the community of Galiwin’ku on Elcho Island. Established in 1942, the community comprises members of 30 closely related clan groups (Yalu team Galiwin’ku, personal communication).
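The cohort figures quoted above can be cross-checked with a few lines of Python. This is only an arithmetic sketch using the numbers stated in the text; the variable names are illustrative, and the island population of 2,000 is the approximate figure given in the paragraph, not an exact census count.

```python
# Cohort sizes as reported in the text (community -> number of participants).
cohort = {"Tiwi Islands": 59, "Yarrabah": 48, "Titjikala": 14, "Galiwin'ku": 38}

# Breakdown of the Tiwi participants by community, as reported in the text.
tiwi_sites = {"Wurrumiyanga": 33, "Milikapiti": 20, "Pirlangimpi": 6}

# "around 2,000" in the text; treated here as an approximate value.
TIWI_POPULATION = 2000

total = sum(cohort.values())                 # all participants in the study
tiwi_total = sum(tiwi_sites.values())        # should match the Tiwi cohort size
tiwi_fraction = cohort["Tiwi Islands"] / TIWI_POPULATION

print(f"total participants: {total}")
print(f"Tiwi participants:  {tiwi_total}")
print(f"share of Tiwi population sampled: about {round(tiwi_fraction * 100)}%")
```

The four community counts add up to the 159 participants given in the abstract, and the three Tiwi communities add up to the 59 Tiwi participants.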
DNA was extracted from either blood or saliva and Illumina sequenced to high coverage (minimum 30×, median 42×; see Methods and Supplementary Note 2). Variants were called jointly and phased with 60 previously sequenced individuals from geographically adjacent populations: 25 men from the highlands of Papua New Guinea (PNG) drawn from five different language groups (ref. 1) and 35 men from 11 regions of the Bismarck Archipelago of PNG in Island Melanesia (ref. 5).
Post by Admin on May 17, 2024 20:04:30 GMT
Genetic ancestry in the collection
We emphasize that genetic ancestry proportions may or may not align with identity and that all communities worldwide have varying degrees of shared ancestry. Nonetheless, we seek to focus on genetic ancestry that is Indigenous Australian in origin. Thus, our cohort was combined with the 1000 Genomes Project samples (ref. 14; hereafter 1000 Genomes), and we applied standard algorithms to identify genomic regions with ancestry other than Indigenous ancestry (Methods and Supplementary Note 2). We find that 100 of 111 individuals from Titjikala, Galiwin’ku and Tiwi have only Indigenous ancestry (Extended Data Fig. 1a). By contrast, consistent with the history of the community, all Yarrabah individuals have an appreciable degree of European, East Asian and/or putative Melanesian ancestry (mean 41%, range 11–73%). Notably, and consistent with known sex-specific demographic patterns (refs. 1,15), all Australian individuals have a mitochondrial lineage belonging to a previously documented Indigenous Australian haplogroup (see ‘Mitochondrial diversity’ section).
To avoid genomic regions of non-Indigenous ancestry confounding analyses, local ancestry was inferred along each haplotype on the basis of a reference panel of individuals thought to be unadmixed from Australia, PNG, Eurasia and Africa. Genomic regions were masked within an individual if one or both haplotypes were inferred to be of non-Indigenous ancestry: that is, neither Australian nor Papuan (see ‘Ancestry inference’ in Methods and Supplementary Note 2). Ten individuals from Tiwi showed patterns of polymorphism and clustering consistent with having at least one recent ancestor from an Indigenous community other than Tiwi (Supplementary Note 2). Unless otherwise stated, all analyses were performed on this ancestry-masked dataset, filtered to remove these ten Tiwi individuals and first- and second-degree relatives, leaving 89 individuals (34 Tiwi, 31 Yarrabah, 17 Galiwin’ku, 7 Titjikala).
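The masking rule described above (drop a region if either haplotype is inferred to be neither Australian nor Papuan) can be illustrated with a small Python sketch. This is not the authors' pipeline: the `Segment` class, the ancestry label strings, the coordinates and the example data are all hypothetical, chosen only to show the rule. The final dictionary simply restates the filtered sample counts given in the text.

```python
from dataclasses import dataclass

# Labels treated as Indigenous ancestry under the rule in the text.
# The exact label strings are an assumption made for this sketch.
INDIGENOUS = {"Australian", "Papuan"}


@dataclass
class Segment:
    start: int   # segment start in base pairs (hypothetical coordinates)
    end: int     # segment end in base pairs
    hap1: str    # inferred ancestry label on the first haplotype
    hap2: str    # inferred ancestry label on the second haplotype


def keep_segment(seg: Segment) -> bool:
    """Keep a segment only if BOTH haplotypes are Australian or Papuan."""
    return seg.hap1 in INDIGENOUS and seg.hap2 in INDIGENOUS


def mask_individual(segments: list[Segment]) -> list[Segment]:
    """Return the segments that survive masking for one individual."""
    return [s for s in segments if keep_segment(s)]


def masked_fraction(segments: list[Segment]) -> float:
    """Fraction of the covered length that is removed by masking."""
    total = sum(s.end - s.start for s in segments)
    masked = sum(s.end - s.start for s in segments if not keep_segment(s))
    return masked / total if total else 0.0


# Made-up example: the middle segment has one non-Indigenous haplotype,
# so the whole segment is masked for this individual.
example = [
    Segment(0, 1000, "Australian", "Australian"),
    Segment(1000, 1500, "Australian", "European"),
    Segment(1500, 2000, "Papuan", "Australian"),
]
retained = mask_individual(example)

# Sample counts after masking and relatedness filtering, as stated in the text.
filtered_counts = {"Tiwi": 34, "Yarrabah": 31, "Galiwin'ku": 17, "Titjikala": 7}
```

In this toy example two of the three segments are retained and a quarter of the covered length is masked. The filtered counts sum to the 89 individuals reported for the ancestry-masked dataset.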
The size of this collection, its geographical distribution and the limited non-Indigenous ancestry are notable compared with previous studies (refs. 1,16,17). Together, these features allowed for characterization of novel and shared genetic variation at the individual and population levels and inference of the demographic forces that have generated these patterns.