Physical trait PCA Ancient Europeans and World

Doggerland

I filtered the 129 SNPs I use for trait prediction in ancient samples to find those whose alleles differ between the founding populations of modern Europeans. 29 SNPs turned out to be relevant. Based on these I created a PCA and a clustering with PAST:
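For readers who want to try something similar without PAST, here is a minimal sketch of a PCA on population mean allele values, with one extra sample projected onto the same axes. It is not PAST's implementation: it finds the first two components by plain power iteration, and every number in `POPS` and the `modern` sample is invented for illustration.

```python
import math

# Hypothetical mean allele frequencies (rows: populations, columns: SNPs).
# The numbers are invented for illustration, not taken from the thread.
POPS = {
    "Mesolithic": [0.9, 0.1, 0.8],
    "SHG":        [0.8, 0.2, 0.9],
    "Neolithic":  [0.1, 0.9, 0.2],
    "Steppe":     [0.2, 0.8, 0.1],
}

def column_means(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, k = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(k)]
            for i in range(k)]

def power_iteration(mat, iters=200):
    """Return (eigenvalue, unit eigenvector) of the dominant eigenpair."""
    k = len(mat)
    v = [float(i + 1) for i in range(k)]  # arbitrary non-zero start vector
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(mat[i][j] * v[j] for j in range(k)) for i in range(k))
    return lam, v

def fit_pca(rows):
    """Fit the first two principal components of the population matrix."""
    means = column_means(rows)
    centered = [[x - m for x, m in zip(r, means)] for r in rows]
    cov = covariance(centered)
    lam1, pc1 = power_iteration(cov)
    # Deflate: remove the first component, then find the next one.
    k = len(cov)
    deflated = [[cov[i][j] - lam1 * pc1[i] * pc1[j] for j in range(k)]
                for i in range(k)]
    _, pc2 = power_iteration(deflated)
    return means, pc1, pc2

def project(sample, means, pc1, pc2):
    """Coordinates of one sample on the two fitted axes."""
    c = [x - m for x, m in zip(sample, means)]
    return (sum(a * b for a, b in zip(c, pc1)),
            sum(a * b for a, b in zip(c, pc2)))

means, pc1, pc2 = fit_pca(list(POPS.values()))
coords = {name: project(freqs, means, pc1, pc2) for name, freqs in POPS.items()}
# A modern individual: allele dosage divided by 2 for each SNP (invented values).
modern = project([0.5, 0.5, 1.0], means, pc1, pc2)

for name, (x, y) in coords.items():
    print(f"{name:10s} PC1={x:+.3f} PC2={y:+.3f}")
print(f"{'modern':10s} PC1={modern[0]:+.3f} PC2={modern[1]:+.3f}")
```

The axes are fitted on the founding populations only; the modern sample is merely projected, so it cannot pull the axes toward itself.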




Added some modern Europeans:





NorthEuropean is a private donor sample from a Dane. WestEuropeans is Basque + French + Brits. EastEuropean is Ukrainian + Estonian + Russian. SouthEuropean is Greek + Italian + Iberian. Sami is from Kola.

I did the same for the modern populations that are commonly taken to represent the major groups in a global sense, and 114 SNPs turned out to be relevant. I used the average allele values from NCBI.
The groups are Africans, Middle Easterners, Asians and Europeans. Ancestry calculators normally include no non-human reference, so I also added primates as an outgroup: samples from chimpanzee, gorilla, rhesus monkey, orangutan and baboon, merged under the label Simian. Many alleles for physical traits differ between humans and apes/monkeys, and the distance between humans and the primate group is larger than I suspected.

The result:








Added some ancient samples, Hindu Kush (Northern South Asian) and Oceanians:








AncientNative is a merge of Kennewick Man, Clovis and Mesolithic American samples.
 
1. I used the MiddleEast and Qatar average allele values from NCBI. This PCA suggests they are relatively similar in a global sense. The European average allele values from NCBI may include people from the Balkans and Greece too, which puts Europe as a whole nearer to the Middle Easterners than, for example, a Brit or a Swede would be.

2. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5113750/


Let's add a Swede and see what happens:


Ancient Europe:





Modern World:



Keep in mind that these are two different SNP sets. The first differs between ancient European populations, the second between modern world populations. The second is not made to differentiate between modern Europeans.
 

It's a bit counter-intuitive as regards Southern Europeans isn't it? They're closer to Mesolithic people and Swedes than Neolithic farmers?
 
The SNPs were selected for the allele differences between the founding populations. As you can see, the Neolithic cultures are very close to each other for these specific alleles. The Mesolithic populations were further apart and more variable. When a modern person or population does not cluster with a founding group, it means they lack that group's specific alleles and may not be closely related to it in a physiological sense. The alleles that differ among modern Europeans are not all the same as those that differed among ancient populations. Some alleles that were once frequent have become rare; others that used to fall within the shared range of all ancient populations now differ more between populations.
Western Europeans seem to have the highest frequency of Neolithic-specific alleles.

For example, the Steppe populations had very few specific alleles falling outside the inter-population variation range, so if someone has Steppe alleles and the rest of their alleles lie within that shared range, they will automatically plot nearer the Steppe.

You can also compare all alleles against a sample or a population, as I did before I started using this PCA, but that produces an individual match. For example, someone can have none of the Neolithic-specific alleles and still match a Neolithic sample highly, because the alleles that fall within the range shared between populations are very similar. That means the two individuals look similar, but only on the individual level, not in the sense of population/race/ethnicity:

https://1.bp.blogspot.com/-zNC-Uuuf...AADYk/H6kKqKVl1iU/s1600/doppelganger+bush.png

https://buzzworthy.s3.amazonaws.com/wp-content/uploads/2016/07/black-arnold-schwarzenegger.png

So what to measure depends on your question.
This PCA does not measure individual matching on alleles shared between populations, because it only uses the SNPs that differ between the European founding populations.
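To make the difference between the two kinds of comparison concrete, here is a small sketch. All dosages are invented, the choice of which SNPs count as population-specific is an assumption, and the similarity measure (one minus the mean absolute dosage difference) is just one simple option, not necessarily what I used.

```python
# Allele dosages (0, 1 or 2 copies) for ten SNPs; all values are invented.
NEOLITHIC_SAMPLE = [2, 2, 1, 0, 2, 1, 1, 0, 2, 1]
MODERN_PERSON    = [0, 0, 1, 0, 2, 1, 1, 0, 2, 1]

# Assumed: only the first two SNPs differ between the founding populations.
POPULATION_SPECIFIC = [0, 1]

def similarity(a, b, indices=None):
    """1.0 for identical dosages, 0.0 for opposite homozygotes everywhere."""
    indices = list(range(len(a))) if indices is None else list(indices)
    diff = sum(abs(a[i] - b[i]) for i in indices) / (2 * len(indices))
    return 1.0 - diff

# Individual match: every SNP counts, shared ones included.
overall = similarity(NEOLITHIC_SAMPLE, MODERN_PERSON)

# Population-level view: only the SNPs that separate the founding groups.
specific = similarity(NEOLITHIC_SAMPLE, MODERN_PERSON, POPULATION_SPECIFIC)

print(f"all SNPs: {overall:.2f}, population-specific SNPs: {specific:.2f}")
```

The same person scores high on the individual match and zero on the population-specific SNPs, which is the doppelganger situation described above.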

I added a Sardinian, Basque, North Italian, Ancient Greek, Ancient Roman and Hungarian:


 



Evolution shapes our genetic code with regard to fitness for our environment; i.e. weather, sunlight, food, water etc. There is no way, given that fact, that North Italians should cluster with SHG and South Mesolithic.

Sorry, Doggerland, it still doesn't make any sense; we must be missing a lot of important SNPs.

The same problems show up with the trait findings on sites like 23andme. Some very rare but high-impact genes, like BRCA1, the breast cancer gene, are very informative. For some traits, like IQ, for example, there are probably dozens of alleles which are implicated. That's why the predictions can be wrong.

We just need more data.
 
I can explain why the North Italian and the Sardinian sit near SHG/Mesolithic by looking at the samples. Both lack many of the Neolithic-specific alleles in the significant SNPs.

But they have some alleles associated with SHG/Mesolithic that were absent or rarer in Neolithic populations:

rs12570134 G = Northern Mesolithic
This allele causes an eyelid shape that is more common in East Asians and Africans, but generally uncommon in all populations today.

rs263156 GG = Southern Mesolithic + SHG
Larger ear lobe.

rs11638069 CC = Mesolithic
Causes more yellowish eye color.

rs878639 GG = Southern Mesolithic + Eastern Steppe
Associated with body height.

rs756853 TT = SHG
Lesser likelihood of baldness.

rs11636232 T = Mesolithic + SHG
Lighter/blue eye color.

Take the T allele in rs11636232 for light/blue eyes. It is known that Sardinians can have blue eyes: https://qph.fs.quoracdn.net/main-qimg-a7e89ad6c8c7fc69d09a3c00b7a39323
The same goes for North Italians.

These alleles cannot be of Neolithic origin, so they must have been introduced later or be a recurrence of Mesolithic heritage. Maybe they were introduced by Indo-Europeans or Germanic tribes (the Vandals raided Sardinia). Germanic people were also part of the Roman Empire in some cases.

I can post here all alleles that are associated with neolithic populations and you can see if you have any of them:

rs6709347 A = Western Neolithic
This allele is associated with eye socket shape. Rare trait today, most common in Europeans.

rs2058742 T = Neolithic
This allele causes a higher nasal angle (upturned nose), most common in Africans today.

rs9567488 T = Neolithic
Associated with a broad philtrum.

rs3758477 T = Western Neolithic
More inverted forehead.

rs4779685 T = Neolithic
Absence of eye color flecks.

rs10514310 G = Neolithic
More body fat in childhood.

rs2236705 A = Southern Neolithic
Associated with more leg mass. This trait is more common in Asian populations today.

rs4927012 C = Southern Neolithic
Associated with hand morphology, digit length decreased.

rs10756819 AA = Northern Neolithic
Lighter skin color.

rs683 AA = Neolithic + SHG
Associated with hair and iris color.

rs10777129 A = Neolithic
Lighter skin color and hair.

rs7183877 A = Southern Neolithic
Darker eye color.

rs8051733 G = Northern Neolithic + Eastern Steppe
Associated with skin color, probably lighter.

rs2228479 A = Western Neolithic
Red or blonde hair if homozygote, lighter skin.

If someone wants to be on the PCA, send me a PM. I will send the list with all the needed SNPs; you can fill them in and send it back. Say whether you want a private link or whether I should add the result here in the thread.
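If you would rather pull the genotypes out of a raw data download than look each one up by hand, something like the following sketch should work. It assumes the common tab-separated raw data layout used by 23andMe (comment lines starting with `#`, then rsid, chromosome, position, genotype). The demo file contents are made up, including the chromosome and position fields, and `read_genotypes` is a hypothetical helper name.

```python
import io

# A few rsIDs taken from the lists above.
WANTED = ["rs12570134", "rs683", "rs2228479"]

def read_genotypes(handle, wanted):
    """Return {rsid: genotype} for the wanted rsIDs found in a raw data file."""
    wanted = set(wanted)
    found = {}
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # not a data row
        rsid, _chrom, _pos, genotype = fields[:4]
        if rsid in wanted:
            found[rsid] = genotype
    return found

# Made-up stand-in for a real download; all values are dummy data.
DEMO_FILE = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs683\t9\t1\tAC\n"
    "rs2228479\t16\t2\tGG\n"
    "rs0000000\t1\t3\tTT\n"
)

found = read_genotypes(io.StringIO(DEMO_FILE), WANTED)
report = {rsid: found.get(rsid, "not genotyped") for rsid in WANTED}

for rsid, genotype in report.items():
    print(rsid, genotype)
```

With a real file, replace `io.StringIO(DEMO_FILE)` with `open(path)` for your own download. SNPs missing from the chip come back as "not genotyped".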
 
You're basing all this on one sample from Sardinia and one sample from Northern Italy?

What about the various Neolithic groups?
 
I think you don't understand what I did to create the PCA. For this one needs the average allele values of the European founding populations.
I analyzed a lot of samples to get the average values for the alleles of 129 SNPs related to physical traits. SNPs that had the same allele value in all founding populations were excluded from the PCA, because they cannot show ethnic/racial differences, only differences on the individual level.

For the Neolithic Farmers, samples from Cardial Pottery, Wartberg, French Atlantic Neolithic sites, Neolithic sites on the British Isles, Funnel Beaker, Cucuteni/Trypillia, Karanovo, Lengyel, Varna, Vinča and various Linear Pottery groups were used.

For the Mesolithic Hunters, samples from Cheddar Man, Loschbour, La Braña, Spain, France, Italy, Germany, Ireland and Hungary were used.

For the Scandinavian Hunter Gatherers, samples from Ertebølle, Maglemose, Pitted Ware, Motala and Norway were used.

For the Steppe Herders, samples from Yamnaya, Srubnaya, Andronovo, Sintashta, Afanasievo and EHG were used.
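The averaging and filtering step described above can be sketched roughly like this. The SNP names, genotype calls and the `MIN_RANGE` cut-off are all invented (I did not state a threshold), and `--` is treated as a no-call, so take it as an illustration of the idea rather than the exact procedure.

```python
# Hypothetical genotype calls per founding population and SNP.
# SNP names, genotypes and the threshold are invented for illustration.
CALLS = {
    "Mesolithic": {"snp1": ["GG", "GG", "AG"], "snp2": ["CC", "CC"]},
    "Neolithic":  {"snp1": ["AA", "AG", "AA"], "snp2": ["CC", "CC", "CC"]},
    "Steppe":     {"snp1": ["AG", "AA"],       "snp2": ["CC", "--"]},
}
COUNTED_ALLELE = {"snp1": "G", "snp2": "C"}
MIN_RANGE = 0.2  # assumed cut-off; the thread does not state one

def mean_allele_value(genotypes, allele):
    """Frequency of `allele` among called genotypes, or None if none are called."""
    called = [g for g in genotypes if g and "-" not in g]
    if not called:
        return None
    return sum(g.count(allele) for g in called) / (2 * len(called))

def population_means(calls, counted):
    """Mean allele value per population and SNP."""
    return {
        pop: {snp: mean_allele_value(gts, counted[snp]) for snp, gts in snps.items()}
        for pop, snps in calls.items()
    }

def informative_snps(means, min_range):
    """Keep SNPs whose mean allele value differs enough between populations."""
    keep = []
    for snp in next(iter(means.values())):
        values = [means[pop][snp] for pop in means if means[pop][snp] is not None]
        if len(values) >= 2 and max(values) - min(values) >= min_range:
            keep.append(snp)
    return keep

means = population_means(CALLS, COUNTED_ALLELE)
selected = informative_snps(means, MIN_RANGE)
print(selected)
```

Here `snp2` is dropped because every population carries the same allele, so it can only tell individuals apart, not populations.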

If you don't like the current Italian results, you can link a modern Italian sample from a public database like ENA for download and I can add it to the PCA. The prerequisite is that the SNPs are also available in the sample. Therefore it must be of good quality and large size.
 
can you add other modern westeurasians aside from europeans to the "ancient europe" pca?
 
Yes, I understood all that. Where did you get the alleles for the Sardinians and the North Italians? It's not valid to use one sample, mine, for example, not to mention I'm about half Tuscan.

I'll look them up on 23andme just for the hell of it.
 

No joy. Not genotyped. Anyone have a link to a program through which I can run my raw data?
 
can you add other modern westeurasians aside from europeans to the "ancient europe" pca?

Wouldn't make sense, because it is based on the differences between European HGs, Neolithic and Steppe.
For what you asked for I would need another basic grouping for ancient populations worldwide.
I can do that, but I must group them in a way that makes sense.
Europe has done a great deal of research on ancient DNA; other regions have done much less, for economic, cultural and religious reasons. As a result, mean allele values can be calculated for ancient European populations, because there are many samples, but for the Chinese Neolithic or the Mesolithic Caucasus, for example, very few samples are available. The results will be less accurate, because the allele values will be based on only one or very few samples.
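As a rough illustration of why few samples hurt, the textbook binomial standard error of an allele frequency estimated from n diploid individuals is sqrt(p(1-p)/(2n)). This assumes the 2n allele copies are independent draws, which is a simplification for ancient samples (relatedness, low coverage and missing calls all make things worse). The sample sizes below are arbitrary examples.

```python
import math

def allele_freq_standard_error(p, n_individuals):
    """Binomial standard error of an allele frequency from n diploid samples."""
    return math.sqrt(p * (1.0 - p) / (2 * n_individuals))

# Compare one sample, a handful, and a well-sampled population at p = 0.5.
errors = {n: allele_freq_standard_error(0.5, n) for n in (1, 5, 30)}

for n, se in errors.items():
    print(f"n={n:2d}  standard error={se:.3f}")
```

With a single individual the uncertainty is several times larger than with thirty, so a mean allele value built on one sample says little about the population.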

An example of available samples from Paleolithic to early Neolithic:

https://ibb.co/NK6PQ6f


Neolithic and Bronze Age:

https://ibb.co/61YdVfR


Source: https://umap.openstreetmap.fr/en/map/ancient-human-dna_41837#3/48.75/81.39

As you can see, some regions of the world still lack samples, for example Africa, South Asia, Oceania, Japan and Beringia.

I could make a PCA for the Mesolithic, Neolithic and Bronze Age world. CHG, for example, will not be included, because the samples are of too low quality and don't contain enough of the needed SNPs.

A Neolithic + Bronze Age world PCA would be the most feasible to create and the most accurate for adding modern West Eurasians.
 
Wouldn't make sense, because it is based on the differences between European HGs, Neolithic and Steppe.
For what you asked for I would need another basic grouping for ancient populations worldwide.

if it makes sense to add modern europeans why would it be so different to add other modern westeurasians to the graphic? you don't have to add other ancient westeurasians. after all you are just looking at a few selected snps. i'd like to see how other modern westeurasians compare to those ancient european populations you have there in the pca with that specific set of snps.
 
Modern West Asians could have traits from beyond eastern Steppe and Anatolia. There is also a large possibility that they have Early Fertile Crescent Neolithic traits. https://i.pinimg.com/736x/b1/3b/61/b13b6141c5a7192cb7320d2f46633fdb--crescents-maps.jpg

Another problem, I think, is the sparse data from the Natufian and pre-pottery sites in the Levant. There are not enough samples to get mean values for the alleles. I doubt that modern West Asians have inherited nothing from there.

The other way around we can add single ancient West Asians and other ancient samples to the modern world PCA:




AncientLevant is a merge of Natufian+Neolithic+Chalcolithic Levant samples.
AncientAfrican is a merge of ShumLaka + Malawi Mesolithic + Pygmy + Bantu.

Another problem with PCAs is that one may assume samples plotted near each other share the same components, but this is not always the case. They can end up at the same location because of different components, and a 2D view cannot show this. Clustering is a better method for that. For example, TarimMummy and FestFertCrescNeol are near each other on the PCA, but not in the clustering:
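A toy sketch of the effect: three points with invented coordinates on three components, where the nearest neighbour seen on the first two axes is not the nearest neighbour once the third is included. The labels A, B and C are placeholders, not samples from the plots.

```python
import math

# Invented scores on three components; a 2D plot shows only the first two.
POINTS = {
    "A": (0.0, 0.0, 0.0),
    "B": (0.1, 0.0, 3.0),  # looks close to A in 2D, far away on component 3
    "C": (1.0, 0.0, 0.0),
}

def distance(p, q, dims):
    """Euclidean distance using only the first `dims` components."""
    return math.sqrt(sum((p[i] - q[i]) ** 2 for i in range(dims)))

def nearest(name, dims):
    """Label of the closest other point when `dims` components are used."""
    others = [other for other in POINTS if other != name]
    return min(others, key=lambda other: distance(POINTS[name], POINTS[other], dims))

nearest_2d = nearest("A", dims=2)
nearest_full = nearest("A", dims=3)
print(f"nearest to A on the plot: {nearest_2d}, using all components: {nearest_full}")
```

Clustering on the full distances uses all components, which is why it can separate samples that overlap on the 2D plot.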

 
Can you check whether Kumsay or botai clusters around yamna?
Archaeologically kumsay belongs to yamna, but they are separated by their admixture:

"The Kumsay burial site was first discovered in 2009, and is named after the village it was found nearby, near the Ulriver in western Kazakhstan. Roughly this location. This burial site was classified as being part of the Yamnaya horizon based on the burial rites of the people. These burials were "pit graves" covered with an earthen mound. Just like with the Yamnaya. Furthermore, many of the people had a supine position with flexed legs, similar to the positions seen with the Yamnaya. And the people buried here were sprinkled with red ochre, another tradition also prevalent in the Yamnaya horizon.
What is interesting about the people here was that many of them were really big, sturdy people. Aside from the general robust features and all, several of the people here were well over 190 cm, reaching up to and above 2 meters tall! "

https://musaeumscythia.blogspot.com/2021/11/a-look-at-kumsay-graveyard-of-giants.html


botai (3500bc) kumsay (3000bc) genetic admixture:

caucasus-cline-narasimhan.jpg
 
A small update with more modern and ancient populations/single samples. All samples from Asia are colored red, from Europe blue, from the Middle East orange and from Africa brown:





CroMagnon is a merge of Paleolithic European samples that were classified as "Cro Magnon Type" by scientists in the past.


Clustering with Simian (Apes + Monkeys) as root:


 
My result with thanks to Doggerland!



I'm trying to grasp it. Some say the population of my region is basically a combination of Single Grave Culture/Corded Ware and the Funnelbeakers (the ones in my region were closer to Mesolithic HG than to EEF).

My position on the PCA would then make sense!

But maybe I'm over-interpreting. I'm curious where my parents would end up...

Feel free to comment (Doggerland and others).
 