Population structure in Italy using ancient and modern samples

My distance from all the samples is large because of my non-European components (15-18% Native American, 2-3% Sub-Saharan African). What caught my attention is that some calculators, at a significant distance (up to 22%), show my affinities with samples from northern Italy, while others show affinity with samples from southeastern Europe (Romanians, Bulgarians, Thracian Turks), even more than with the Italian samples. My interpretation (which may be very wrong) is that these peoples carry small Eastern Eurasian components, and that some calculators might interpret them as similar to the Native American component.
But "Western" affinities such as Wales or Germany had never appeared for me before...

I think that's a good hunch. It would be EEF like, plus a Slavic component with some eastern affinities or some ancestry from peoples like the Bulgars, or an Ottoman component with some Central Asian affinity.
 
@Pax Augusta, @Salento, @italouruguayan

Code:
UK_Wales:WalesL45,6.74,0.5,2.5,0,30.35,29.65,0,0,7.52,1.13,21.6,0
Ashkenazi_o:ashkenazy10w,7.38,0.4,1.63,0.35,31.14,28.55,1.32,0,7.51,0.05,21.49,0.19
GermanyB:GermanB18,5.58,1.45,1.42,0.42,31.01,32.91,0.26,1.86,5.14,0,19.94,0
Greece_Macedonia:GreeceMaced8,7.86,1.73,1.06,0,26.44,30.87,0,0,6.41,0.64,24.98,0
Ashkenazi_o:ashkenazy9w,6.13,0,2.78,0.25,35.03,28.09,2.19,0,5.96,0,19.38,0.18
Greece_NorthEast:GreeceNE34,6.36,0,2.82,0,27.36,30.25,0,0,7.9,0,25.31,0
Greece_NorthEast:GreeceNE59,6.59,0.83,1.44,0,25.65,26.93,1.27,0,10.96,0,26.24,0.08
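
For anyone who wants to reproduce this kind of list by hand, here is a minimal sketch. It assumes the distance is the plain Euclidean distance between two 12-component K12b rows (that is my assumption about how Vahaduo-style tools rank populations, not something stated in this thread). The function names are made up for illustration; the three rows are copied from the block above.

```python
import math

def parse_row(line):
    """Split 'label,v1,...,vN' into (label, [floats])."""
    label, *values = line.strip().split(",")
    return label, [float(v) for v in values]

def euclidean(a, b):
    """Plain Euclidean distance between two coordinate rows."""
    if len(a) != len(b):
        raise ValueError("rows must have the same number of components")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_by_distance(target, sources):
    """Return (distance, label) pairs, nearest first."""
    return sorted((euclidean(target, coords), label) for label, coords in sources)

rows = [
    "UK_Wales:WalesL45,6.74,0.5,2.5,0,30.35,29.65,0,0,7.52,1.13,21.6,0",
    "Ashkenazi_o:ashkenazy10w,7.38,0.4,1.63,0.35,31.14,28.55,1.32,0,7.51,0.05,21.49,0.19",
    "GermanyB:GermanB18,5.58,1.45,1.42,0.42,31.01,32.91,0.26,1.86,5.14,0,19.94,0",
]
parsed = [parse_row(r) for r in rows]
target_label, target = parsed[0]

print("Distance to:", target_label)
for dist, label in rank_by_distance(target, parsed[1:]):
    print(f"{dist:.8f} {label}")
```

Swapping in a different target row, or a longer list of source rows, gives the same kind of "Distance to:" ranking as the lists in this thread.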

These are the distances for the samples above, each used as a target, with the updated Vahaduo K12b spreadsheet as the source:

Distance to: Greece_NorthEast:GreeceNE59
4.37961185 Turk_Makedonya
4.52077427 Bulgarian_Thrace
5.04378826 Macedonian_South
5.05466122 Greek_Thessaly
5.40490518 Bulgarian_East
5.43984375 Greek_Thrace
5.74054875 Albanian_Kosovo
5.77509307 Moldovan_Gagauz
5.93518323 Greek_Macedonia
6.02970978 Greek_Thessaloniki
6.05070244 Macedonian_Vardar
6.60162859 Macedonian_Northeast&Skopje
6.74244763 Macedonian_East
6.81491012 Macedonian_Polog
7.09058531 Greek_Peloponnese
7.10554713 Turk_Deliorman
7.18034818 Bulgarian_Central
7.54792687 Albanian
7.56649853 Turk_Trakya
7.80544041 Pomak_Bulgaria
7.86221979 Pomak_Greece
8.39933926 Moldovan_South
8.48232869 Bulgarian_West
9.14783581 Romanian
9.37139797 Greek_Central


Distance to: Greece_NorthEast:GreeceNE34
2.89259399 Macedonian_Polog
3.46555912 Bulgarian_East
3.49915704 Macedonian_Vardar
3.70596816 Moldovan_Gagauz
3.77583103 Macedonian_South
3.78284285 Macedonian_East
3.93033077 Bulgarian_Central
3.97266913 Greek_Macedonia
3.98150725 Macedonian_Northeast&Skopje
4.55587533 Moldovan_South
4.67640888 Bulgarian_West
4.84033057 Pomak_Bulgaria
5.18395602 Romanian
5.67920769 Albanian_Kosovo
6.17983819 Pomak_Greece
6.44520752 Bulgarian_Thrace
6.49372774 Turk_Makedonya
6.87858270 Montenegrin
7.02725409 Greek_Thessaly
7.73405456 Greek_Thrace
7.95475330 Turk_Deliorman
8.10506632 Greek_Thessaloniki
8.67857707 Italian_Friuli_VG
9.20416753 Albanian
9.98241454 Turk_Trakya


Distance to: Ashkenazi_o:ashkenazy9w
3.97836650 Italian_Friuli_VG
4.73729881 Italian_Piedmont
4.82879902 Italian_Veneto
5.31954885 Italian_Trentino
6.03938842 Swiss_Italian
6.99577015 Italian_Aosta_Valley
7.05284340 Italian_Lombardy
7.55560057 Italian_Liguria
8.21692156 Italian_Emilia
8.56102214 Austrian_Tyrol
9.46043339 Swiss_French
9.50565095 Italian_Tuscany
9.96080820 Spanish_Baleares
10.03612475 Macedonian_Vardar
10.28896982 Portuguese
10.36751658 Macedonian_Polog
10.52702237 Macedonian_East
10.63366353 Spanish_Canarias
10.80087959 Macedonian_South
11.22993767 Italian_Romagna
11.52451734 Albanian_Kosovo
11.64410581 Greek_Macedonia
11.87263240 Montenegrin
11.90727089 Romanian
12.20054507 Macedonian_Northeast&Skopje


Distance to: Greece_Macedonia:GreeceMaced8
3.05681206 Moldovan_Gagauz
3.54866172 Bulgarian_East
3.67042232 Bulgarian_Central
3.71130705 Macedonian_Polog
3.87640297 Moldovan_South
4.26184232 Bulgarian_West
4.45850872 Romanian
4.49338403 Pomak_Bulgaria
4.70820560 Macedonian_Vardar
4.85993827 Macedonian_East
4.89864267 Macedonian_Northeast&Skopje
4.94085013 Greek_Macedonia
5.55332333 Macedonian_South
5.58341293 Pomak_Greece
6.18373673 Turk_Deliorman
6.41329868 Montenegrin
6.44729401 Turk_Makedonya
7.08849067 Albanian_Kosovo
7.31450614 Bulgarian_Thrace
8.42560384 Greek_Thessaly
8.81152087 Turk_Trakya
9.01696734 Greek_Thrace
9.37526533 Greek_Thessaloniki
9.63255937 Serb
9.68051652 Italian_Friuli_VG


Distance to: GermanyB:GermanB18
5.98632609 Austrian_Tyrol
6.19753177 Montenegrin
6.32678433 Italian_Friuli_VG
6.85178079 Romanian
7.35885181 Moldovan_South
7.64024214 Bulgarian_West
7.65620010 Macedonian_Polog
7.73080850 Serb
8.11041306 Swiss_French
8.21610613 Macedonian_East
8.68888946 Italian_Trentino
8.72001147 Macedonian_Vardar
8.75078282 Bulgarian_Central
8.98854827 Hungarian_Transylvania+Székely
9.07730687 Italian_Veneto
9.30740028 Pomak_Bulgaria
9.43038175 Macedonian_Northeast&Skopje
9.76421016 Greek_Macedonia
9.76480415 Macedonian_South
9.82648462 Moldovan_Central
9.99203183 Moldovan_Gagauz
10.03113154 Bulgarian_East
10.06980635 Bavarian_German
10.19094696 Italian_Piedmont
10.29755371 Swiss_Italian


Distance to: Ashkenazi_o:ashkenazy10w
4.56495345 Italian_Friuli_VG
6.28743191 Macedonian_Vardar
6.51771432 Macedonian_Polog
6.78273544 Italian_Veneto
6.87373261 Macedonian_East
6.95633524 Macedonian_South
7.45501174 Italian_Piedmont
7.74977419 Greek_Macedonia
8.02148989 Albanian_Kosovo
8.32313042 Romanian
8.36157282 Macedonian_Northeast&Skopje
8.36366546 Moldovan_South
8.54350631 Bulgarian_West
8.54554270 Moldovan_Gagauz
8.60482423 Italian_Trentino
8.65557624 Bulgarian_Central
8.70070112 Bulgarian_East
8.99576567 Montenegrin
9.18344707 Italian_Liguria
9.28941333 Italian_Emilia
9.55675154 Pomak_Bulgaria
9.68210269 Swiss_Italian
9.76258163 Italian_Lombardy
9.77690135 Greek_Thessaly
9.86282921 Italian_Tuscany


Distance to: UK_Wales:WalesL45
5.10288154 Italian_Friuli_VG
5.75871513 Macedonian_Polog
5.94568751 Macedonian_Vardar
6.22204147 Macedonian_East
6.63006787 Macedonian_South
7.10584970 Moldovan_South
7.12500526 Romanian
7.31356274 Greek_Macedonia
7.38367795 Bulgarian_West
7.60840982 Macedonian_Northeast&Skopje
7.62621794 Bulgarian_Central
7.63264699 Italian_Veneto
7.77098449 Montenegrin
7.84963056 Moldovan_Gagauz
7.87671251 Bulgarian_East
8.05316708 Albanian_Kosovo
8.42894418 Pomak_Bulgaria
8.50355220 Italian_Piedmont
9.17952068 Italian_Trentino
9.56000523 Austrian_Tyrol
9.87535316 Greek_Thessaly
10.04413262 Turk_Makedonya
10.28497448 Italian_Liguria
10.30603707 Bulgarian_Thrace
10.35402639 Swiss_Italian

Those two samples labeled "Ashkenazi" are clearly not Ashkenazi at all. That Wales sample also looks really "wonky".
 
If we were making a modern calculator we would filter the samples, just like Vahaduo and others.

Most users never see the unfiltered, unsupervised big blocks of Modern samples, … and to a degree, that’s what some of these are.

The “experts” pick and choose the samples, or they make averages out of it, … it’s OK, I guess.
 
With the new coordinates, my results got a bit strange...Wales? Ashkenazi? German?
Distance to: italouruguayan
12.12725855 UK_Wales:WalesL45
12.45939806 Ashkenazi_o:ashkenazy10w
13.30189460 GermanyB:GermanB18
13.55852499 Greece_Macedonia:GreeceMaced8
13.80775145 Ashkenazi_o:ashkenazy9w
14.34106342 Greece_NorthEast:GreeceNE34
14.41980582 Greece_NorthEast:GreeceNE59
14.56452883 Friuli-Venezia-Giulia:ALP346
14.61209088 Friuli-Venezia-Giulia:KF2700960
14.75188801 Piedmont:ItalyPiedmont63
15.11099600 Friuli-Venezia-Giulia:ALP235
15.23157904 Piedmont:piedmont154
15.33053489 Friuli-Venezia-Giulia:KF1800761
15.36169262 Albanian:AL9
15.36592984 Friuli-Venezia-Giulia:ALP081
15.46325968 Friuli-Venezia-Giulia:KF2700922
15.47954780 Trentino-Alto-Adige:ALP200
15.49061006 Lombardy:ALP288


my new ones are fine

Distance to: Torziok12b
1.83711731 Veneto:ALP022
2.08712721 Friuli-Venezia-Giulia:ALP235
2.80196360 Veneto:ALP249
2.82104591 Friuli-Venezia-Giulia:KF2700960
2.84116173 Friuli-Venezia-Giulia:ALP346
2.94732082 Ashkenazi_o:ashkenazy9w
3.05674664 Trentino-Alto-Adige:ALP200
3.09166622 Friuli-Venezia-Giulia:ALP354
3.11242671 Friuli-Venezia-Giulia:ALP280
3.25023076 Veneto:KF1803151
3.28344331 Piedmont:ItalyPiedmont127
3.28800852 Veneto:KF1803105
3.42483576 Friuli-Venezia-Giulia:ALP081
3.51180865 Corsica_o:corsica11908
3.61798286 Friuli-Venezia-Giulia:KF1800761
3.89629311 Veneto:ALP378
3.92330218 Veneto:ALP273
4.13805510 Trentino-Alto-Adige:ALP420
4.16438471 Veneto:KF1803109
4.19916658 Veneto:KF1800751
4.21149617 Lombardy:ALP288
4.21709616 N_Italy_HGDP:HGDP01154
4.22912521 Piedmont:ItalyPiedmont63
4.36495132 Veneto:ALP250
4.37962327 Veneto:Alp401


I do have 15% Welsh and Irish in my MyHeritage admixture (along with 72% Italian).
 
They are all academic samples.

And? That may be, but if so, the academics didn't check the ancestry of those people carefully enough, because there's no way in hell those two samples are 100% Ashkenazi. The Wales sample is also highly suspect. Maybe the researchers were oblivious to the fact that quite a few Italians settled in Wales.

Therefore, they shouldn't be used. Inclusion in a PCA, for example, is going to throw it off.

The Corsican samples should also all be carefully checked, because apparently the researchers didn't give a darn whether some of them were French admixed.

Dienekes always carefully checked each and every sample, academic or not, to make sure it should be included.

No offense to people working on this, of course, but we all want accuracy, I'm sure.
 
I re-extracted the raw-data (Win and Linux) and ran the Dodecad K12b with a variety of apps.

... test subject WalesL45:

Code:
WalesL45_AdmixtureStudio,6.74,0.5,2.5,0,30.35,29.65,0,0,7.52,1.13,21.6,0
WalesL45_Admix_Linux,6.74,0.50,2.49,0,30.36,29.66,0,0,7.52,1.13,21.60,0
WalesL45_DIYDodecadWin,6.58,0.47,2.36,0.05,30.51,29.44,0.25,0,7.87,1.12,21.34,0
WalesL45_DIYDodecadLinux64,6.58,0.47,2.36,0.05,30.51,29.44,0.25,0,7.87,1.12,21.34,0
WalesL45_DIYDodecadLinux64_10-iteration,6.58,0.46,2.36,0,30.51,29.44,0.24,0,7.87,1.18,21.34,0
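
A quick way to put a number on how closely the runs agree is to take the largest per-component gap against one reference run. This is only a sketch: the helper names are made up, the choice of the AdmixtureStudio row as reference is arbitrary, and the rows are copied from the block above.

```python
def parse_row(line):
    """Split 'label,v1,...,vN' into (label, [floats])."""
    label, *values = line.strip().split(",")
    return label, [float(v) for v in values]

def max_component_gap(a, b):
    """Largest absolute difference between matching components of two runs."""
    return max(abs(x - y) for x, y in zip(a, b))

runs = [
    "WalesL45_AdmixtureStudio,6.74,0.5,2.5,0,30.35,29.65,0,0,7.52,1.13,21.6,0",
    "WalesL45_Admix_Linux,6.74,0.50,2.49,0,30.36,29.66,0,0,7.52,1.13,21.60,0",
    "WalesL45_DIYDodecadWin,6.58,0.47,2.36,0.05,30.51,29.44,0.25,0,7.87,1.12,21.34,0",
    "WalesL45_DIYDodecadLinux64,6.58,0.47,2.36,0.05,30.51,29.44,0.25,0,7.87,1.12,21.34,0",
    "WalesL45_DIYDodecadLinux64_10-iteration,6.58,0.46,2.36,0,30.51,29.44,0.24,0,7.87,1.18,21.34,0",
]
parsed = dict(parse_row(r) for r in runs)
reference = parsed["WalesL45_AdmixtureStudio"]

# Print each run's worst-case disagreement with the reference, in percentage points.
for label, coords in parsed.items():
    print(label, round(max_component_gap(reference, coords), 2))
```

Gaps of a fraction of a percentage point between apps point to the sample itself, not the extraction, as the source of the odd result.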

W9n9Atn.gif


… looks OK, I guess.
 
Never thought it was your error, Salento, but something must have gone wrong with the selection of that sample.
 
What are scientific samples? … samples collected from a specific verified area … It doesn’t necessarily mean that those samples must match each other.

The scientific list below has 109 samples collected in Tuscany, … COLLECTED in … Some of the samples on that list do not resemble typical Tuscans.

They made available the block of samples collected in Tuscany, … after that it's up to others which samples to use or not to use or how to label them.


HSX3FMb.gif
 
thanks, Angela, I understand that some samples look strange, I guess we could omit some samples, … carefully avoiding being biased :)
 
I was stating a fact: these are indeed academic samples. They were downloaded from academic sites by Salento, who is not at all responsible for their strange results, because outliers and especially mislabeled individuals (who don't fully belong to the label, for example people born in the place the label refers to, to parents who migrated from elsewhere) do turn up here and there in academic sample sets, for many different reasons. This has been seen frequently in academic studies for more than 10 years. Sometimes they make up a share of the set too small to influence the average much; other times, as in the case of the Corsicans, they represent 25 percent of the individuals, if memory serves me, and influence the average more. There are attentive geneticists who remove them, and other geneticists who use them all.
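
To make the point about averages concrete, here is a toy sketch. The numbers are invented (three components, four individuals, one of them atypical, so 25 percent of the set) and are not taken from any real sample set; `component_means` is just an illustrative helper.

```python
def component_means(rows):
    """Average each admixture component over all individuals."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

# Hypothetical individuals: three that fit the label, one that does not.
typical = [[30.0, 50.0, 20.0]] * 3
atypical = [[50.0, 30.0, 20.0]]

all_mean = component_means(typical + atypical)
clean_mean = component_means(typical)
shift = [a - c for a, c in zip(all_mean, clean_mean)]

print("average with everyone:     ", all_mean)
print("average without the outlier:", clean_mean)
print("shift per component:        ", shift)
```

With one atypical individual in four, the population average moves by a quarter of the gap between that individual and the rest, which is why a 25 percent share matters much more than one or two stray samples in a set of a hundred.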
 
Thanks Salento for that. That's the TSI sample set, it belongs to 1000genomes/HapMap projects. Or rather, TSI is a sample set from the HapMap project shared with 1000genomes.

GDtfxZe.png


In this sample set there are definitely individuals who are not 100 percent Tuscan; the same people who collected this sample set around 2006 or 2007 confirmed this (I also found the date but can't remember it precisely now).

As stated in the project data sheet that I found 4 or 5 years ago, and which I believe I also already posted on Eupedia, TSI is a set of individuals selected on a self-reported basis as having "at least three out of four grandparents who were born in Tuscany."

Aside from the fact that one of the grandparents may be non-Tuscan, even the fact that the other three were simply born in Tuscany (their full ancestry from Tuscany is not proven, only self-reported) may lead to further inaccuracy.

I think it is the only sample set, but I am really going from memory, of 1000genomes/HapMap projects that does not guarantee that all grandparents are native.

In general, I think that since the sample set is really large (there are 117 in all), many really are completely Tuscan, but obviously not all of them are, so the most obvious outliers (those that move away from the main cluster) should be removed, or at least not be considered. Again, in the past some geneticists have removed outliers from this sample set, but that doesn't always happen. The Estonian geneticist Metspalu used in his studies a subset of TSI, which he labeled TSI30 (I think based on 21 or 30 individuals from this set chosen by him). Dienekes used this specific subset created by Metspalu, TSI30, in his calculators.
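
One possible way to flag "samples that move away from the main cluster" is sketched below. To be clear, this is not how Metspalu or Dienekes chose their subsets (I don't know their criteria); it is a generic robust rule assumed here for illustration: measure each sample's distance to the component-wise median and flag anything beyond the median distance plus k times the median absolute deviation. The data and labels are invented.

```python
import math
import statistics

def flag_outliers(samples, k=3.0):
    """samples: dict of label -> list of floats (same length for all).

    Returns the sorted labels whose distance to the component-wise median
    exceeds median distance + k * MAD. If the MAD is zero, anything farther
    than the median distance is flagged.
    """
    columns = list(zip(*samples.values()))
    center = [statistics.median(col) for col in columns]
    dists = {label: math.dist(coords, center) for label, coords in samples.items()}
    med = statistics.median(dists.values())
    mad = statistics.median(abs(d - med) for d in dists.values())
    cutoff = med + k * mad
    return sorted(label for label, d in dists.items() if d > cutoff)

# Hypothetical three-component coordinates, not real TSI data.
samples = {
    "s1": [30.0, 50.0, 20.0],
    "s2": [31.0, 49.0, 20.0],
    "s3": [29.5, 50.5, 20.0],
    "s4": [30.5, 49.5, 20.0],
    "s5": [45.0, 30.0, 25.0],
}
print(flag_outliers(samples))
```

The median is used instead of the mean so that the outliers themselves do not drag the center toward them; the threshold `k` is a judgment call, which is exactly where the risk of bias comes in.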


Official statement from the TSI HapMap project.

DsKtT7f.png
 
I looked at the coordinates of the TSIs big list.
but … the Tuscan and the TSI30 of the original Vahaduo don’t match the updated Vahaduo Tuscan,

… the end result is that the TSIs that match the original don’t match the updated, and the TSIs that match the updated don’t match the original.

Posting the TSIs coordinates would create too much confusion, I think.
 
Indeed, agreed.
 
Samples from most of Italy are missing for the Iron Age. I do not understand why studies on Iron Age Italy presented almost a year ago have not yet come out. However, it is possible that slowly the picture will be clarified.
 
I bet R437 wasn't much of an "outlier" if you traveled to the south in the IA. But I guess we just have to wait and see.

Antonio 2019 has a table listing the full outliers, R437 and R850 are not on the list, … and they are Scientific Samples :)
 
Funny that we have a ton of Greek samples now that show that steppe wasn't a major component. But some people hang their hopes that they will show 50% steppe etc. Thus far there's only one, a woman from the MBA not even from where the Mycenaeans were. But we have only 6 Latin samples, and yet people are confident to call two outliers... why?
 
850 and 437 are outliers relative to the main cluster of Etruscans and Italics; even the leaked samples from the Campanian paper show Samnites were akin to Latins and Etruscans.
Though they were outliers in a genetic sense, it doesn't seem true to me that they must have been foreigners born outside of Italy (I don't understand why they are labelled "Greeks" here): the Daunian samples show a cline from Latins to Sicily_BA (who we know from the abstract of an upcoming study were similar to Sicily_IA) and, surprise surprise, ORD001 is very close to 437.

I suspect that the "east med" gene flow in the later samples from the Campanian study is due to a mixing of the later-arrived Etruscans and Italics with the previous inhabitants, who were maybe more akin to Sicily_BA (and maybe the "east med" affinity was strengthened by the Greek colonies).
 
