Removing reference bias in ancient DNA

Jovialis

Abstract

During the last decade, the analysis of ancient DNA (aDNA) sequence has become a powerful tool for the study of past human populations. However, the degraded nature of aDNA means that aDNA sequencing reads are short, single-ended and frequently mutated by post-mortem chemical modifications. All these features decrease read mapping accuracy and increase reference bias, in which reads containing non-reference alleles are less likely to be mapped than those containing reference alleles. Recently, alternative approaches for read mapping and genetic variation analysis have been developed that replace the linear reference by a variation graph which includes all the alternative variants at each genetic locus. Here, we evaluate the use of variation graph software vg to avoid reference bias for ancient DNA. We used vg to align multiple previously published aDNA samples to a variation graph containing 1000 Genome Project variants, and compared these with the same data aligned with bwa to the human linear reference genome. We show that use of vg leads to a much more balanced allelic representation at polymorphic sites and better variant detection in comparison with bwa, especially in the presence of post-mortem changes, effectively removing reference bias. A recently published approach that filters bwa alignments using modified reads also removes bias, but has lower sensitivity than vg. Our findings demonstrate that aligning aDNA sequences to variation graphs allows recovering a higher fraction of non-reference variation and effectively mitigates the impact of reference bias in population genetics analyses using aDNA, while retaining mapping sensitivity.
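
To make the idea of "balanced allelic representation" concrete, here is a minimal sketch of one way to summarize reference bias: the fraction of reads carrying the alternate allele, pooled over sites assumed to be truly heterozygous. With no bias this fraction should sit near 0.5, and values below 0.5 mean reads with the non-reference allele are being lost. The `SiteCounts` type, the function name, and the read counts are hypothetical and for illustration only. They are not taken from the preprint and are not necessarily the statistic the authors report.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class SiteCounts:
    """Read counts at one site assumed to be heterozygous (hypothetical type)."""
    ref_reads: int  # mapped reads carrying the reference allele
    alt_reads: int  # mapped reads carrying the alternate allele


def alt_allele_fraction(sites: Iterable[SiteCounts]) -> float:
    """Pooled fraction of reads supporting the alternate allele.

    About 0.5 is expected at heterozygous sites when mapping is unbiased;
    values below 0.5 indicate reference bias.
    """
    ref_total = 0
    alt_total = 0
    for site in sites:
        ref_total += site.ref_reads
        alt_total += site.alt_reads
    total = ref_total + alt_total
    if total == 0:
        raise ValueError("no reads cover the supplied sites")
    return alt_total / total


# Toy counts, invented for illustration (not data from the preprint).
toy_sites = [SiteCounts(ref_reads=6, alt_reads=4),
             SiteCounts(ref_reads=5, alt_reads=5)]
print(alt_allele_fraction(toy_sites))  # 9 alt reads out of 20 -> 0.45
```

Computing this number for the same sample aligned two different ways (for example a linear-reference alignment and a graph alignment) would give a direct side-by-side comparison of how much non-reference signal each alignment retains.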


https://www.biorxiv.org/content/early/2019/09/26/782755.full.pdf
 
From the pre-print:

We also investigated the effect of vg or bwa alignment on Principal Component Analysis (PCA), another widely used analysis technique in the field of aDNA. Restricting this analysis to samples from Europe and West/Central Asia, we projected the ancient samples and the reference genome onto a PCA plot derived from modern samples. We observe modest differences between the positions of vg and bwa aligned samples, but these are not conclusive in terms of the direction of the bias (Supplementary Figures S19 and S20). For example, the bwa processed Botai sample appears to be slightly closer to the reference than its vg aligned equivalent, while the opposite pattern is observed for the Yamnaya sample. Given the variability in our PCA results, it is not possible to make strong conclusions about the effects of removing reference bias on PCA projection.
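
For readers unfamiliar with "projecting" samples onto a PCA, the following is a rough sketch of the general idea, assuming a complete genotype matrix coded 0/1/2 with no missing calls. The principal axes are fitted on the modern samples only, and each ancient sample is then placed on those fixed axes. The function names are hypothetical. This is not the pipeline used in the preprint, and real aDNA analyses also have to cope with missing genotypes and per-SNP normalization, which this sketch ignores.

```python
import numpy as np


def fit_pca(modern: np.ndarray, n_components: int = 2):
    """Fit principal axes on modern samples (rows = samples, columns = SNPs)."""
    mean = modern.mean(axis=0)
    centered = modern - mean
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def project(sample: np.ndarray, mean: np.ndarray, axes: np.ndarray) -> np.ndarray:
    """Place a sample on axes fitted elsewhere; the axes are not re-estimated."""
    return (sample - mean) @ axes.T


# Toy genotype matrix, invented for illustration: 4 modern samples x 3 SNPs.
modern = np.array([[0.0, 1.0, 2.0],
                   [2.0, 1.0, 0.0],
                   [0.0, 2.0, 2.0],
                   [2.0, 0.0, 0.0]])
mean, axes = fit_pca(modern, n_components=2)

# A hypothetical ancient sample genotyped at the same three SNPs.
ancient = np.array([1.0, 1.0, 2.0])
print(project(ancient, mean, axes))
```

Because the ancient sample does not influence the axes, the same individual aligned two ways can be projected twice and the two positions compared, which is the kind of comparison the excerpt describes.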
 
