
AADR v66.0 (Allen Ancient DNA Resource)

I've compiled a quick overview of the new samples in version 66.
They are identified as "New", in red, in the right-hand column.
Thank you for this Excel sheet, it is really useful!

This AADR update is cool and all, but frankly, I've got my eye more on the new ARGMix tool that you posted. That is going to be a lot of fun to explore once they upload the software.
 
So I guess argmix is supposed to be stronger than what you’ve been using @Jovialis?

Pretty sick list of samples in that spreadsheet, it’s got pretty much every ancient sample ever dug up.
 
Thus far, I’ve been using ADMIXTOOLS 2 in RStudio (qpWave/qpAdm, PCA, FST, etc.), but I find this approach interesting because it offers a different way of interpreting ancestry. Rather than relying only on a broad genome-wide view, it attempts to isolate specific ancestral components and analyze them separately, which can give a clearer picture of continuity and admixture within the ancestry layers a population actually inherited.
 
So it’s better if you want something that can predict real actual ancestry by reading smaller clips of your genome then?

Might be a little slow atm, barely slept due to hay fever…
 
I haven't gotten around to it yet (life happens), but I see there's a 2 million SNP panel version, which is the biggest yet. I will start by merging my sample into that one.
 
Good luck, can’t wait to see how this goes. Maybe run some of the ancient samples. I expect some real interesting results from this thing!
 
I successfully added my WGS30X sample to the AADR v66 2M panel. Starting from an already deduplicated BAM plus the v66 2M files, I extracted panel genotypes, which produced roughly an 87.7% call rate, with strong average depth at target sites. The main obstacle was that the public 2M release was in TGENO / transpose-packed format, which older EIGENSOFT-style builds could not read properly, so I had to switch to a newer AdmixTools v8 build with TGENO support, convert the reference panel to PACKEDANCESTRYMAP, and then merge the sample into that converted dataset. A second issue was WSL memory limits: the merge initially died with an out-of-memory kill after successfully reading the packed genotype file, and increasing WSL RAM and swap resolved it. After that, the merge completed cleanly.
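For anyone who wants to follow the same convert-then-merge route, here is a rough Python sketch that only writes the two parameter files; it is not the exact setup used above. The parameter keys are the standard `convertf` and `mergeit` ones, but every file prefix (`v66_2M`, `v66_2M_packed`, `my_sample`, `v66_2M_merged`) is a made-up placeholder, and it assumes your `convertf` build can actually read the TGENO input.

```python
from pathlib import Path


def render_par(params: dict[str, str]) -> str:
    """Render a 'key: value' parameter file as used by convertf and mergeit."""
    return "".join(f"{key}: {value}\n" for key, value in params.items())


def convertf_par(prefix_in: str, prefix_out: str,
                 output_format: str = "PACKEDANCESTRYMAP") -> str:
    """Parameter file for converting a .geno/.snp/.ind triple to another format."""
    return render_par({
        "genotypename": f"{prefix_in}.geno",
        "snpname": f"{prefix_in}.snp",
        "indivname": f"{prefix_in}.ind",
        "outputformat": output_format,
        "genotypeoutname": f"{prefix_out}.geno",
        "snpoutname": f"{prefix_out}.snp",
        "indivoutname": f"{prefix_out}.ind",
    })


def mergeit_par(prefix_ref: str, prefix_sample: str, prefix_out: str) -> str:
    """Parameter file for merging a sample dataset into a reference panel."""
    return render_par({
        "geno1": f"{prefix_ref}.geno",
        "snp1": f"{prefix_ref}.snp",
        "ind1": f"{prefix_ref}.ind",
        "geno2": f"{prefix_sample}.geno",
        "snp2": f"{prefix_sample}.snp",
        "ind2": f"{prefix_sample}.ind",
        "genooutfilename": f"{prefix_out}.geno",
        "snpoutfilename": f"{prefix_out}.snp",
        "indoutfilename": f"{prefix_out}.ind",
        "outputformat": "PACKEDANCESTRYMAP",
    })


if __name__ == "__main__":
    # All prefixes below are hypothetical placeholders, not the real AADR file names.
    Path("convert.par").write_text(convertf_par("v66_2M", "v66_2M_packed"))
    Path("merge.par").write_text(
        mergeit_par("v66_2M_packed", "my_sample", "v66_2M_merged")
    )
```

You would then run `convertf -p convert.par` followed by `mergeit -p merge.par`. If the merge gets killed for memory under WSL, raising the RAM and swap limits as described above is the thing to try first.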
 
This is also the way I am going to be able to resolve the format issue with those Akbari et al. 2026 samples too.
 
memory blew up eh? I knew something like this would need a ton of ram. Way more than my MacBook lol. I’m guessing something like this would take a while to run and figure out your ancestry. I never ran these tools so I wouldn’t know, then again I never had my dna tested.
 
Before I updated my Dell Laptop, you could basically fry an egg on it when I ran models. But I upgraded the ram, hard drives, fans, and did a thermal repaste. I bought it in 2020, but I think I can bring it to 2030 at least.
 
I wouldn’t even have a laptop to fry an egg on. It would explode running something like that!
 
1776521968049.png

???

I think they meant to write "Danubian".
 
1776531475084.png


Perhaps a bit unorthodox, but who knows for sure...
 
Interesting indeed! Not sure what to make of that
 
Yes, it’s a bit TOO broad, doesn’t say where they actually came from. If you have ancestry from Greeks, your Anatolian farmer wouldn’t be entirely from what is now Lazio, obv. And you’re almost half steppe according to this, which doesn’t seem right.
 
With qpAdm, you often do get broad models rather than very proximal combinations. A lot of the so-called precision you see in consumer genomics is pretty much hype based on weak science; it is a business. As for stuff like G25, it is basically junk science based on nearly arbitrary PCA distances. That being said, I'm still exploring different combos. It's tough to separate Anatolian_N from Greek vs Italic, etc.

The high steppe is probably capturing some WHG, along with the EHG/CHG. I think Catacomb is a tiny bit more CHG.
 