AADR v66.0 (Allen Ancient DNA Resource)

Jovialis · Apr 13, 2026

YEAH! They just updated it!

I will merge my WGS30x sample and explore the new additions.

The Allen Ancient DNA Resource (AADR): A curated compendium of ancient human genomes - David Reich Lab Dataverse

davef · Apr 13, 2026

Wow! So glad we have this!

Tautalus · Apr 13, 2026

I've compiled a quick overview of the new samples in version 66.
They are identified as "New", in red, in the right-hand column.

AADRv66.xlsx

docs.google.com

Jovialis · Apr 14, 2026

Tautalus said:
I've compiled a quick overview of the new samples in version 66.
They are identified as "New", in red, in the right-hand column.

Thank you for this Excel sheet, it's really useful!

This AADR update is cool and all, but frankly, I've got my eye more on the new ARGMix tool that you posted. That is going to be a lot of fun to explore once they upload the software.

davef · Apr 14, 2026

So I guess argmix is supposed to be stronger than what you’ve been using @Jovialis?

Pretty sick list of samples in that spreadsheet, it’s got pretty much every ancient sample ever dug up.

Jovialis · Apr 14, 2026

davef said:
So I guess argmix is supposed to be stronger than what you’ve been using @Jovialis?

Pretty sick list of samples in that spreadsheet, it’s got pretty much every ancient sample ever dug up.

Thus far, I’ve been using ADMIXTOOLS 2 in RStudio (qpWave/qpAdm, PCA, FST, etc.), but I find this approach interesting because it offers a different way of interpreting ancestry. Rather than relying only on a broad genome-wide view, it attempts to isolate specific ancestral components and analyze them separately, which can give a clearer picture of continuity and admixture within the ancestry layers a population actually inherited.

davef · Apr 14, 2026

So it’s better if you want something that can predict real actual ancestry by reading smaller clips of your genome then?

Might be a little slow atm, barely slept due to hay fever…

Jovialis · Apr 14, 2026

I haven't gotten around to it yet (life happens), but I see there's a 2 million SNP panel version, which is the biggest yet. I will start by merging my sample into that one.

davef · Apr 15, 2026

Jovialis said:
I haven't gotten around to it yet (life happens), but I see there's a 2 million SNP panel version, which is the biggest yet. I will start by merging my sample into that one.

Good luck, can’t wait to see how this goes. Maybe run some of the ancient samples. I expect some real interesting results from this thing!

Jovialis · Apr 15, 2026

I successfully added my WGS30X sample to the AADR v66 2M panel. Starting from an already deduplicated BAM plus the v66 2M files, I extracted panel genotypes, which produced roughly an 87.7% call rate with strong average depth at target sites. The main obstacle was that the public 2M release was in TGENO / transpose-packed format, which older EIGENSOFT-style builds could not read properly, so I had to switch to a newer AdmixTools v8 build with TGENO support, convert the reference panel to PACKEDANCESTRYMAP, and then merge the sample into that converted dataset. A second issue was WSL memory limits: the merge initially died with an out-of-memory kill after successfully reading the packed genotype file, and increasing WSL RAM and swap resolved it. After that, the merge completed cleanly.
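For anyone who wants to reproduce the convert-then-merge step, here is a rough sketch of how the two parameter files could be generated. It only builds the `key: value` text that `convertf` and `mergeit` read via `-p`; it does not run either tool. The file prefixes (`v66_2M_tgeno`, `v66_2M_packed`, `my_wgs30x`, `v66_2M_merged`) and the `.geno`/`.snp`/`.ind` extensions are placeholders I made up for illustration, not the actual release file names, and the helper function names are hypothetical too.

```python
def par_text(params: dict[str, str]) -> str:
    """Render parameters in the 'key: value' layout read by convertf/mergeit."""
    return "".join(f"{key}: {value}\n" for key, value in params.items())


def convertf_par(in_prefix: str, out_prefix: str) -> str:
    """Parameter file asking convertf to rewrite a dataset as PACKEDANCESTRYMAP.

    Assumes the installed convertf build can read the input format (TGENO here).
    """
    return par_text({
        "genotypename": f"{in_prefix}.geno",
        "snpname": f"{in_prefix}.snp",
        "indivname": f"{in_prefix}.ind",
        "outputformat": "PACKEDANCESTRYMAP",
        "genotypeoutname": f"{out_prefix}.geno",
        "snpoutname": f"{out_prefix}.snp",
        "indivoutname": f"{out_prefix}.ind",
    })


def mergeit_par(ref_prefix: str, sample_prefix: str, out_prefix: str) -> str:
    """Parameter file asking mergeit to combine the reference panel and one sample."""
    return par_text({
        "geno1": f"{ref_prefix}.geno",
        "snp1": f"{ref_prefix}.snp",
        "ind1": f"{ref_prefix}.ind",
        "geno2": f"{sample_prefix}.geno",
        "snp2": f"{sample_prefix}.snp",
        "ind2": f"{sample_prefix}.ind",
        "genooutfilename": f"{out_prefix}.geno",
        "snpoutfilename": f"{out_prefix}.snp",
        "indoutfilename": f"{out_prefix}.ind",
        "outputformat": "PACKEDANCESTRYMAP",
    })


# Placeholder prefixes (hypothetical); substitute the real file names.
print(convertf_par("v66_2M_tgeno", "v66_2M_packed"))
print(mergeit_par("v66_2M_packed", "my_wgs30x", "v66_2M_merged"))
```

Saving the first output as, say, `convert.par` and the second as `merge.par`, the runs would then be `convertf -p convert.par` followed by `mergeit -p merge.par`. If the merge gets killed under WSL, raising the memory and swap limits in `.wslconfig` is worth trying before anything else.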

Jovialis · Apr 15, 2026

Jovialis said:
I successfully added my WGS30X sample to the AADR v66 2M panel. Starting from an already deduplicated BAM plus the v66 2M files, I extracted panel genotypes, which produced roughly an 87.7% call rate with strong average depth at target sites. The main obstacle was that the public 2M release was in TGENO / transpose-packed format, which older EIGENSOFT-style builds could not read properly, so I had to switch to a newer AdmixTools v8 build with TGENO support, convert the reference panel to PACKEDANCESTRYMAP, and then merge the sample into that converted dataset. A second issue was WSL memory limits: the merge initially died with an out-of-memory kill after successfully reading the packed genotype file, and increasing WSL RAM and swap resolved it. After that, the merge completed cleanly.

This is also how I will be able to resolve the format issue with those Akbari et al. 2026 samples.

davef · Apr 15, 2026

memory blew up eh? I knew something like this would need a ton of ram. Way more than my MacBook lol. I’m guessing something like this would take a while to run and figure out your ancestry. I never ran these tools so I wouldn’t know, then again I never had my dna tested.

Jovialis · Apr 15, 2026

davef said:
memory blew up eh? I knew something like this would need a ton of ram. Way more than my MacBook lol. I’m guessing something like this would take a while to run and figure out your ancestry. I never ran these tools so I wouldn’t know, then again I never had my dna tested.

Before I updated my Dell Laptop, you could basically fry an egg on it when I ran models. But I upgraded the ram, hard drives, fans, and did a thermal repaste. I bought it in 2020, but I think I can bring it to 2030 at least.

davef · Apr 17, 2026

I wouldn’t even have a laptop to fry an egg on. It would explode running something like that!

Jovialis · Apr 18, 2026

???

I think they meant to write "Danubian".

Jovialis · Apr 18, 2026

Perhaps a bit unorthodox, but who knows for sure...

davef · Apr 18, 2026

Interesting indeed! Not sure what to make of that

Jovialis · Apr 18, 2026

davef said:
Interesting indeed! Not sure what to make of that

Probably true in a very broad sense, not literal however.

davef · Apr 18, 2026

Yes, it’s a bit TOO broad, doesn’t say where they actually came from. If you have ancestry from Greeks, your Anatolian farmer wouldn’t be entirely from what is now Lazio, obv. And you’re almost half steppe according to this, which doesn’t seem right.

Jovialis · Apr 18, 2026

davef said:
Yes, it’s a bit TOO broad, doesn’t say where they actually came from. If you have ancestry from Greeks, your Anatolian farmer wouldn’t be entirely from what is now Lazio, obv. And you’re almost half steppe according to this, which doesn’t seem right.

With qpAdm, you often do get broad models rather than very proximal combinations. A lot of the so-called precision you see in consumer genomics is pretty much hype based on weak science; it is a business. As for G25, it is basically junk science based on nearly arbitrary PCA distances. That being said, I'm still exploring different combos. It's tough to separate Anatolian_N from Greek vs Italic, etc.

The high steppe is probably capturing some WHG, along with the EHG/CHG. I think Catacomb is a tiny bit more CHG.
