The Arrival of Steppe & Iranian Related Ancestry in Islands of West Mediterranean

Here are the graphs I have made so far. They look pretty plausible to me, but I think the Barcin-like vs. Tepecik-like question still needs to be better explained by genetic studies in light of the archaeological and historical evidence. Using Barcin/Boncuklu vs. Tepecik/Kumtepe makes the differences between northernmost and southernmost Italy much larger, because now it's a matter of very different proportions of at least 2 distinct ANF/EEF sources, not just a mostly Barcin-like substrate with minor contributions from additional Levant-related and CHG/Iran-related admixtures over time.

Graph - ITALIANS, GREEKS, BALKANIC AND CENTRAL-EASTERN EUROPEAN : https://imgur.com/a/7ecKxc8
 
@Angela
Average of a few runs for Tuscans (average), with penalty

TUR_Barcin_N,35.1
TUR_Tepecik_Ciftlik_N,32.0
Yamnaya_RUS_Samara,16.0
WHG,6.2
RUS_Khvalynsk_En,5.3
IRN_Ganj_Dareh_N,4.1
GEO_CHG,1.2
MAR_EN,0.1

Total of 67.1 Anatolian, 21.3 Steppe, 6.2 WHG, 5.3 CHG/Iran and 0.1 Morocco EN.
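
For anyone who wants to re-check the totals, here is a minimal Python sketch of the tally. The percentages are copied from the list above; the grouping of sources into labels (`GROUPS`) and the function name are assumptions inferred from the totals line, not part of nMonte3 or Vahaduo.

```python
# Tuscan average (with penalty), values copied from the list above.
tuscan = {
    "TUR_Barcin_N": 35.1,
    "TUR_Tepecik_Ciftlik_N": 32.0,
    "Yamnaya_RUS_Samara": 16.0,
    "WHG": 6.2,
    "RUS_Khvalynsk_En": 5.3,
    "IRN_Ganj_Dareh_N": 4.1,
    "GEO_CHG": 1.2,
    "MAR_EN": 0.1,
}

# Assumed grouping, inferred from the "Total of ..." line.
GROUPS = {
    "TUR_Barcin_N": "Anatolian",
    "TUR_Tepecik_Ciftlik_N": "Anatolian",
    "Yamnaya_RUS_Samara": "Steppe",
    "RUS_Khvalynsk_En": "Steppe",
    "WHG": "WHG",
    "IRN_Ganj_Dareh_N": "CHG/Iran",
    "GEO_CHG": "CHG/Iran",
    "MAR_EN": "Morocco EN",
}


def group_totals(model, groups):
    """Sum the source percentages of one model under broader ancestry labels."""
    totals = {}
    for source, pct in model.items():
        label = groups[source]
        totals[label] = totals.get(label, 0.0) + pct
    # Round to one decimal, like the posted figures.
    return {label: round(value, 1) for label, value in totals.items()}


totals = group_totals(tuscan, GROUPS)
print(totals)
```

Changing `GROUPS` (for example, counting Tepecik separately from Barcin) is the only thing needed to get a different breakdown from the same run.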

@Ygorcs
Great work! Thanks!
 
Are you using nMonte3? I think it also allows the aggregation of individual source samples, so you don't necessarily need to use averages.
 
Since individual samples vary quite a bit, and averages are often a bit misleading if there is genetic structure within the population, I thought the best way to understand the main trends in the genetic makeup of different areas, and to try to correlate them with historic and prehistoric demographic events, was to model individuals and plot them on a graph where you can really visualize the trends. I intend to do it for all European, Near Eastern and North African populations, but so far the graphs are complete for Italy, Greece, the Balkans and Central-Eastern Europe.

These graphs also make very visible what RegioX is pointing out: North Italy is much more Barcin-like (though with significant Tepecik/Kumtepe-like ancestry), while the rest of Italy, and especially Sicily, is much more Tepecik/Kumtepe-like. The downside of using Tepecik/Kumtepe, though, is that the CHG/Iran_N flow pointed out by genetic studies virtually disappears in the majority of individuals almost everywhere in Italy, but of course that flow was always more likely to have arrived there in heavily admixed form, mostly via non-Caucasian and non-Iranian peoples of the East Mediterranean shores.

Also, some actual Levant-related admixture must be "hidden" within the Tepecik/Kumtepe-like admixture, because Italian Jews appear here with just too low Levant_PPNB for my taste, and it's also very doubtful that Morocco_EN (Taforalt-like) arrived in individuals of some parts of South Italy and in some Greek individuals in totally unadmixed form, not packed with a lot of Anatolia_L as well as Levant_N ancestry.

So maybe there ISN'T as much Iran Neo in Italy as the one paper found. Maybe Tepecik does indeed need to be used because there really was a lot of it in Italy, and especially in Southern Italy, since the MN.

There's no "downside" to a result unless we want to make the results support a certain hypothesis.

There should only be a "correct" result, whatever that might turn out to be.

I also have no idea what "too low Levant PPNB" for "your taste" in Italian Jews means. The Roman ones, not the Ashkenazi ones or the Ashkenazi and Spanish Jewish admixed ones of other parts of Italy, have been in Rome for more than 2000 years. Jews have always taken "local" wives. Who says they didn't take them in Rome, and change their genetic signature?

We don't even know what the Jews of the diaspora to the Hellenistic world looked like yet. Who knows how much PPNB they actually had after that first round of admixture.

I think we have to beware of doing too much "pre-judging" of what the results "should be".
 
1) Of course I meant a "downside" of my model in comparison with what has been published by professional geneticists in released papers. I'm just following your own advice: authors of professional genetic studies must know much better than I do, so my own models should reflect theirs as much as possible if they're to be taken seriously. Why should I suddenly consider my model much better than theirs?
2) "For my taste" because that would mean a much, much higher degree of non-Middle Eastern ancestry than more than 2 papers on European Jews have assumed is actually the case, and I don't think it's credible that the discrepancy between this model and what several prior studies found would really be so high. Also, I confess I'm a bit afraid of unwittingly encouraging the xenophobes who say Jews are just Europeans with very minor "real" ancient Jewish ancestry and have no place in Middle Eastern history and demographics...

Also, there is a methodological problem that is IMO hard to solve: if a Barcin-like people got some, but not much, CHG/Iran_N-related and Levant_N-related admixture not too long after the Neolithic, it will be nearly identical to Tepecik-Ciftlik, so it's virtually impossible to tell whether it all came directly via Tepecik-Ciftlik-like people or whether that kind of genetic makeup appeared only gradually, through several successive layers of admixture events. Only with a lot of aDNA samples from the same place, allowing a diachronic analysis of 1 specific region, could we answer that definitively.
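
A toy illustration of that identifiability problem, as a Python sketch. The 2-D coordinates and the mixing proportions are made up for the example (they are not G25 coordinates or published estimates): a "Tepecik-like" source is defined as 80% Barcin-like plus 10% each of the two other sources, and two very different histories then land on exactly the same point.

```python
# Made-up 2-D coordinates, for illustration only (NOT real G25 data).
coords = {
    "Barcin_like": (0.0, 0.0),
    "CHG_Iran_like": (1.0, 0.0),
    "Levant_like": (0.0, 1.0),
}


def mix(weights, coords):
    """Return the weighted average position of a mixture of sources."""
    dims = len(next(iter(coords.values())))
    return tuple(
        round(sum(w * coords[name][d] for name, w in weights.items()), 10)
        for d in range(dims)
    )


# Hypothetical: a Tepecik-like source that is itself a blend of the others.
coords["Tepecik_like"] = mix(
    {"Barcin_like": 0.8, "CHG_Iran_like": 0.1, "Levant_like": 0.1}, coords
)

# History A: half Barcin-like, half Tepecik-like, in one admixture event.
history_a = mix({"Barcin_like": 0.5, "Tepecik_like": 0.5}, coords)

# History B: Barcin-like substrate plus small, separate later inputs.
history_b = mix(
    {"Barcin_like": 0.9, "CHG_Iran_like": 0.05, "Levant_like": 0.05}, coords
)

print(history_a, history_b, history_a == history_b)
```

Because both histories give the same coordinates, no distance-based fit on the target alone can prefer one over the other; only samples from intermediate periods could separate them.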

Could I please suggest that you take what I write with a bit more good will and less suspicion, rather than trying to micro-manage and find any word that could be used against me, even based on what you "divinate" was going through my head (as you said before about Phoenicians/Carthaginians)? I think it should already be clear to everyone here that all I'm interested in is getting closer to the historical truth, and that I have no personal or national agenda at all.
 
I'm sorry if you feel there's anything personal in the questions I'm putting to you.

Perhaps it's that I've cross-examined witnesses for too many years, looking for any weakness, and had to be very critically and usually not very politely questioned by judges for decades in return. Our system of jurisprudence pits one view against the other in controlled debate. All my training is that you have to look for the weak spots, the underlying assumptions which may be coloring the results. Only then can the "truth", or as much of it as we can ever see, be known. That's all.

It need not get personal, and usually wasn't in my experience. I could have a very heated debate in court with my opponent, and then go out to lunch. All very collegial.

Also, if I have given the impression that I think each and every academic paper is perfect, let me correct that idea right now. However, yes, if virtually all of the population geneticists at the major labs see it one way, and someone not in the profession presents a contrary hypothesis, I think the burden of proof is on that person to show why and how the professionals are wrong.
 
No problem then, Angela. I just wanted to clarify some things and make sure you understood that I was not intending to "force" the models into some preconceived and prejudged concept, but just trying to make them align more with what professional geneticists have been saying. (My personal opinion, though, is that they often get correct or close to correct results but still fail quite frequently in interpreting the results and/or in the way they present them to the public, which we could see, surprisingly, in the Ancient Rome and the Sicilian/Sardinian aDNA studies in particular, but also in several others.)

I don't take things personally, either... at least not for long. I've always been very easy to calm down, quick to forgive any misunderstandings and minor altercations and go on as before. But I'm glad I commented as I did, because your last comment clarified your actual position about my posts very much and, I think, made sure you aren't taking my views as any indication of a dishonest agenda or personal bias, and that's a relief to me.
 
Yes, nMonte3. pen = 0.001 doesn't even work with nMonte afaik. As I said in another post, I did use individuals as "sources" in nMonte3. What I could not do is use several coordinates as targets at the same time in the R software, as we do in Vahaduo. That's a limitation of my runs with nMonte3, and the reason why I used averages in "targets" (not sources). However, as I also said in a previous post, using pen = 0.001 produces different results on each run (I mean, even with the same sources and target), so I also average the results returned by a few runs when pen = 0.001 (which is different from using averages in sources). The "bad side" is non-reproducible results, but I believe the averages should be similar for everyone running it.
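
To show why a randomized fit can give slightly different numbers on each run, and why fixing the random seed makes a run repeatable, here is a toy Python sketch. It is NOT nMonte3's code: the swap-based search, the form of the penalty (each slot pays `pen` times the squared distance of its source to the target), the 2-D coordinates and every name in it are assumptions made only for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two coordinate vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def monte_carlo_fit(target, sources, pen=0.001, slots=100, cycles=2000, seed=None):
    """Toy randomized mixture fit: fill slots with sources, keep swaps that help."""
    rng = random.Random(seed)
    names = sorted(sources)
    # Hypothetical penalty: a slot costs pen * squared distance source-to-target.
    slot_cost = {n: pen * sq_dist(sources[n], target) for n in names}

    def objective(batch):
        mean = [sum(sources[n][d] for n in batch) / slots for d in range(len(target))]
        return sq_dist(mean, target) + sum(slot_cost[n] for n in batch) / slots

    batch = [rng.choice(names) for _ in range(slots)]
    best = objective(batch)
    for _ in range(cycles):
        i = rng.randrange(slots)
        previous = batch[i]
        batch[i] = rng.choice(names)  # propose a random replacement
        score = objective(batch)
        if score < best:
            best = score              # keep the improvement
        else:
            batch[i] = previous       # undo the swap
    return {n: 100.0 * batch.count(n) / slots for n in names}


# Made-up 2-D coordinates, for illustration only (NOT real G25 data).
SOURCES = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (0.0, 1.0), "D": (0.4, 0.4)}
TARGET = (0.3, 0.2)

run_1 = monte_carlo_fit(TARGET, SOURCES, seed=1)
run_2 = monte_carlo_fit(TARGET, SOURCES, seed=2)
repeat_1 = monte_carlo_fit(TARGET, SOURCES, seed=1)
print(run_1)
print(run_2)                # may differ from run_1
print(repeat_1 == run_1)    # same seed, same result
```

In R, calling `set.seed()` before a run fixes R's random number generator; whether that is enough to make an nMonte3 run repeatable depends on the script drawing all its randomness from that generator, which is an assumption worth testing rather than a documented guarantee.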
 
@Ygorcs
Here's an example. If I run Remedello BA RISE486 several times in Vahaduo using the same sources, it will return exactly the same results every time.

ITA_Remedello_BA:RISE486 in Vahaduo
TUR_Barcin_N,77.4
WHG,18.2
MAR_EN,4.4

Now see what happens when I use pen = 0.001 in the R software (remember that I haven't changed the sources in any of the runs).

ITA_Remedello_BA:RISE486 --->
1)
TUR_Barcin_N,86.4
WHG,7.6
Yamnaya_RUS_Samara,2.8
RUS_Khvalynsk_En,1.8
MAR_EN,1
Levant_PPNB,0.4

2)
TUR_Barcin_N,86.2
WHG,8.2
Yamnaya_RUS_Samara,2.6
RUS_Khvalynsk_En,1.6
MAR_EN,1.2
Levant_PPNB,0.2

3)
TUR_Barcin_N,86.4
WHG,8.2
Yamnaya_RUS_Samara,2.6
RUS_Khvalynsk_En,1.4
Levant_PPNB,0.8
MAR_EN,0.4
CMR_Shum_Laka_8000BP,0.2

4)
TUR_Barcin_N,86.4
WHG,7.4
Yamnaya_RUS_Samara,3.4
RUS_Khvalynsk_En,1.2
Levant_PPNB,0.8
MAR_EN,0.6
CMR_Shum_Laka_8000BP,0.2

5)
TUR_Barcin_N,86.2
WHG,7.8
Yamnaya_RUS_Samara,3.2
RUS_Khvalynsk_En,1.6
Levant_PPNB,0.6
MAR_EN,0.4
CMR_Shum_Laka_8000BP,0.2

6)
TUR_Barcin_N,86.8
WHG,8.4
Yamnaya_RUS_Samara,2.6
MAR_EN,1.2
RUS_Khvalynsk_En,1

Etc.

We're seeing 6 runs and 6 different results with the same sources.
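
A minimal Python sketch of the averaging step, using the six RISE486 runs listed above. A source missing from a run is counted as 0 for that run (an assumption about how to treat absent components); the spread is the population standard deviation across the six runs.

```python
from statistics import mean, pstdev

# The six pen = 0.001 runs for ITA_Remedello_BA:RISE486, copied from above.
runs = [
    {"TUR_Barcin_N": 86.4, "WHG": 7.6, "Yamnaya_RUS_Samara": 2.8,
     "RUS_Khvalynsk_En": 1.8, "MAR_EN": 1.0, "Levant_PPNB": 0.4},
    {"TUR_Barcin_N": 86.2, "WHG": 8.2, "Yamnaya_RUS_Samara": 2.6,
     "RUS_Khvalynsk_En": 1.6, "MAR_EN": 1.2, "Levant_PPNB": 0.2},
    {"TUR_Barcin_N": 86.4, "WHG": 8.2, "Yamnaya_RUS_Samara": 2.6,
     "RUS_Khvalynsk_En": 1.4, "Levant_PPNB": 0.8, "MAR_EN": 0.4,
     "CMR_Shum_Laka_8000BP": 0.2},
    {"TUR_Barcin_N": 86.4, "WHG": 7.4, "Yamnaya_RUS_Samara": 3.4,
     "RUS_Khvalynsk_En": 1.2, "Levant_PPNB": 0.8, "MAR_EN": 0.6,
     "CMR_Shum_Laka_8000BP": 0.2},
    {"TUR_Barcin_N": 86.2, "WHG": 7.8, "Yamnaya_RUS_Samara": 3.2,
     "RUS_Khvalynsk_En": 1.6, "Levant_PPNB": 0.6, "MAR_EN": 0.4,
     "CMR_Shum_Laka_8000BP": 0.2},
    {"TUR_Barcin_N": 86.8, "WHG": 8.4, "Yamnaya_RUS_Samara": 2.6,
     "MAR_EN": 1.2, "RUS_Khvalynsk_En": 1.0},
]

# Every source that appears in at least one run.
sources = sorted({name for run in runs for name in run})

# Per-source list of values across runs; absent sources count as 0.0.
values = {name: [run.get(name, 0.0) for run in runs] for name in sources}

average = {name: mean(vals) for name, vals in values.items()}
spread = {name: pstdev(vals) for name, vals in values.items()}

# Print from the largest average component to the smallest.
for name in sorted(sources, key=lambda n: -average[n]):
    print(f"{name},{average[name]:.1f} (+/- {spread[name]:.2f})")
```

The per-source spread gives a rough idea of how many runs are worth averaging: components whose spread is as large as their average (here the sub-1% ones) are probably not distinguishable from noise.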

Apparently modern samples may vary a bit more in each run under pen = 0.001 and the same sources.

Notice, also, that when using pen = 0.001, that excess of WHG % in Vahaduo is "drained" in favor of Steppe, which seems correct, since the sample supposedly has Steppe ancestry (if I chose the right one). MAR EN % from Vahaduo was also drained under penalty, which seems to make sense as well. If I'm not missing something, these specific results under penalty look better than those we got in Vahaduo (the first results in this post).

What I did in my previous runs with pen = 0.001 was average the results (nothing to do with averages in sources). And since I was using the R software, I also used averages for "targets" (not sources), given that I can't manage to run it for several targets at the same time, as we do in Vahaduo.
 
@Palermo
That's what I was saying. Researchers seem to have found higher Anatolian in North Italy, while these models I posted seem to indicate higher Anatolian in South Italy. That's why I wondered whether this is explained by the fact that they used Barcin N only.

@Angela
Thanks. I was trying to check the differences with penalty and with no penalty using even older sources, such as Anatolian Hunter-Gatherer, Kotias, EHG, WHG (5 individuals), Iran Meso, Natufian, Taforalt and Shum Laka (ancient SSA).

Here we have Bergamo, for example, scaled with pen = 0.001 (a single run seemed enough for this purpose)
TUR_Pinarbasi_HG,67.6
GEO_CHG,11.6
RUS_Karelia_HG,10.4
WHG,5.4
IRN_HotuIIIb_Meso,5

Oddly, no Natufian.

Now, scaled and pen = 0 (same as Vahaduo)
TUR_Pinarbasi_HG,64.6
RUS_Karelia_HG,17.8
GEO_CHG,16
WHG,1.4
Levant_Natufian,0.2

Too little Natufian as well.

In this last one with no penalty, Iran disappears, and I would not expect such low WHG, since I used AHG rather than ENF, so I believe the first results (with penalty) are somewhat closer to what we'd expect, at least in this specific model. Not perfect though. I think Bergamo should score some Natufian (?). :unsure:
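
To make the shift between the two Bergamo settings explicit, here is a small Python sketch that subtracts the pen = 0 model from the pen = 0.001 model, source by source. The numbers are copied from the two lists above; the helper name `delta` is just illustrative.

```python
# Bergamo, scaled, pen = 0.001 (single run), copied from above.
with_penalty = {
    "TUR_Pinarbasi_HG": 67.6,
    "GEO_CHG": 11.6,
    "RUS_Karelia_HG": 10.4,
    "WHG": 5.4,
    "IRN_HotuIIIb_Meso": 5.0,
}

# Bergamo, scaled, pen = 0 (same setting as Vahaduo), copied from above.
no_penalty = {
    "TUR_Pinarbasi_HG": 64.6,
    "RUS_Karelia_HG": 17.8,
    "GEO_CHG": 16.0,
    "WHG": 1.4,
    "Levant_Natufian": 0.2,
}


def delta(model_a, model_b):
    """Per-source difference (a minus b); a missing source counts as 0."""
    names = sorted(set(model_a) | set(model_b))
    return {n: round(model_a.get(n, 0.0) - model_b.get(n, 0.0), 1) for n in names}


shift = delta(with_penalty, no_penalty)
for name, change in sorted(shift.items(), key=lambda item: item[1]):
    print(f"{name},{change:+.1f}")
```

Read this way, the penalty moves weight away from EHG (Karelia) and CHG and toward WHG, Iran Meso and Pinarbasi in this particular model.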

ED: whatever the actual Steppe source was, it must have included EHG, some CHG and little ANF.

Ok, I see what you are saying. Still, all your models are within the 56% to 72% range, so still a good job, in my view. Ygorcs had some good estimates as well. I just got back from the golf course, so I apologize if this was already discussed, but maybe you and Ygorcs can compare notes, so to speak, and see if you both can parse something out.

Edit to my post, I see you all are comparing notes(y)
 
OK, oh, I understand. I appreciate your efforts. If you do try it, my opinion (and for the record, that is all it is!) is that I would suggest following the admixture graph in the Antonio et al 2019 paper. Start with the local Roman_WHG and other Italian_HG (Villabruna), and then see whether during the Neolithic there are different sources of Anatolian_Neolithic, or just 1 source. The paper documents CHG/Iran_NEO that came in during the Neolithic, so it seems logical that if one of the Neolithic sources used doesn't capture the Iran_NEO/CHG that Antonio et al 2019 found, it might not be the best source of Anatolian_Neolithic for the Neolithic Romans.

Here are the samples between Meso and IA, as promised. I did it little by little, since it's time-consuming when using the R software. :) I used both Barcin and Tepecik, and both Yamnaya and Khvalynsk, with penalty. I performed just a single run for each sample, to make it easier. Notice that different runs may produce slightly different results, and also that some of these results may be pretty different from those generated by Vahaduo (which works without penalty).

ITA_Ardea_Latini_IA:RMPR851 --->
TUR_Barcin_N,69.6
Yamnaya_RUS_Samara,15.4
WHG,8.6
RUS_Khvalynsk_En,6.4

ITA_Ardea_Latini_IA_o:RMPR850 --->
TUR_Tepecik_Ciftlik_N,62
Yamnaya_RUS_Samara,13.4
IRN_Ganj_Dareh_N,8
TUR_Barcin_N,8
Levant_PPNB,3.6
RUS_Khvalynsk_En,2.2
GEO_CHG,1.6
MAR_EN,1
WHG,0.2

ITA_Boville_Ernica_IA:RMPR1021 --->
TUR_Barcin_N,64.6
Yamnaya_RUS_Samara,14.8
WHG,9.6
TUR_Tepecik_Ciftlik_N,5.8
RUS_Khvalynsk_En,4.6
GEO_CHG,0.6

ITA_Etruscan:RMPR473 --->
TUR_Barcin_N,72.4
Yamnaya_RUS_Samara,15.8
WHG,6.4
RUS_Khvalynsk_En,4.4
TUR_Tepecik_Ciftlik_N,1

ITA_Etruscan:RMPR474b --->
TUR_Barcin_N,51.4
Yamnaya_RUS_Samara,21.2
TUR_Tepecik_Ciftlik_N,14.4
WHG,8
RUS_Khvalynsk_En,3.8
GEO_CHG,1.2

ITA_Etruscan_o:RMPR475b --->
TUR_Barcin_N,48
TUR_Tepecik_Ciftlik_N,22.8
Yamnaya_RUS_Samara,9.6
WHG,6.8
Levant_PPNB,4.2
MAR_EN,3.8
CMR_Shum_Laka_8000BP,2.8
RUS_Khvalynsk_En,2

ITA_Grotta_Continenza_CA:RMPR4 --->
TUR_Barcin_N,93
WHG,5.2
Yamnaya_RUS_Samara,1.6
RUS_Khvalynsk_En,0.2

ITA_Grotta_Continenza_CA:RMPR5 --->
TUR_Barcin_N,94.4
WHG,5.4
RUS_Khvalynsk_En,0.2

ITA_Grotta_Continenza_Meso:RMPR7 --->
WHG,100

ITA_Grotta_Continenza_Meso:RMPR11 --->
WHG,100

ITA_Grotta_Continenza_Meso:RMPR15 --->
WHG,100

ITA_Grotta_Continenza_N:RMPR2 --->
TUR_Barcin_N,98.2
WHG,1.8

ITA_Grotta_Continenza_N:RMPR3 --->
TUR_Barcin_N,99.4
WHG,0.4
RUS_Khvalynsk_En,0.2

ITA_Grotta_Continenza_N:RMPR8 --->
TUR_Barcin_N,96.4
WHG,2.2
Yamnaya_RUS_Samara,0.8
RUS_Khvalynsk_En,0.6

ITA_Grotta_Continenza_N:RMPR9 --->
TUR_Barcin_N,98.8
Levant_PPNB,1
Yamnaya_RUS_Samara,0.2

ITA_Grotta_Continenza_N:RMPR10 --->
TUR_Barcin_N,98
WHG,1.4
Yamnaya_RUS_Samara,0.4
RUS_Khvalynsk_En,0.2

ITA_Grotta_Continenza_N_o:RMPR6 --->
TUR_Barcin_N,86.2
WHG,11.4
Yamnaya_RUS_Samara,1.6
RUS_Khvalynsk_En,0.8

ITA_Monte_San_Biagio_CA:RMPR1014 --->
TUR_Barcin_N,91.4
WHG,6.8
RUS_Khvalynsk_En,1
Yamnaya_RUS_Samara,0.8

ITA_Olmo_di_Nogara_MBA:9309_Co --->
TUR_Barcin_N,87.4
Yamnaya_RUS_Samara,5.6
WHG,5.2
RUS_Khvalynsk_En,1.8

ITA_Olmo_di_Nogara_MBA:9323_Oss --->
TUR_Barcin_N,85.2
WHG,6.6
Yamnaya_RUS_Samara,6
RUS_Khvalynsk_En,2.2

ITA_Prenestini_tribe_IA:RMPR435b --->
TUR_Barcin_N,60.4
Yamnaya_RUS_Samara,24.6
WHG,8
RUS_Khvalynsk_En,7

ITA_Prenestini_tribe_IA_o:RMPR437b --->
TUR_Tepecik_Ciftlik_N,47.8
TUR_Barcin_N,29.4
Yamnaya_RUS_Samara,11.6
IRN_Ganj_Dareh_N,4.4
RUS_Khvalynsk_En,3.6
WHG,2
GEO_CHG,1.2

ITA_Proto-Villanovan:RMPR1 --->
TUR_Barcin_N,39.8
TUR_Tepecik_Ciftlik_N,24.8
Yamnaya_RUS_Samara,24.4
RUS_Khvalynsk_En,5.2
WHG,4.6
GEO_CHG,1
IRN_Ganj_Dareh_N,0.2

ITA_Remedello_BA:RISE486 ---> already posted

ITA_Remedello_BA:RISE487 --->
TUR_Barcin_N,88.4
WHG,8
Yamnaya_RUS_Samara,2.6
RUS_Khvalynsk_En,1

ITA_Remedello_BA:RISE489 --->
TUR_Barcin_N,90.8
WHG,7.2
Yamnaya_RUS_Samara,1.2
RUS_Khvalynsk_En,0.8

ITA_Ripabianca_di_Monterado_N:RMPR16 --->
TUR_Barcin_N,94
WHG,4
Yamnaya_RUS_Samara,1.4
RUS_Khvalynsk_En,0.6

ITA_Ripabianca_di_Monterado_N:RMPR17 --->
TUR_Barcin_N,93.2
WHG,4
Yamnaya_RUS_Samara,2.4
RUS_Khvalynsk_En,0.4

ITA_Ripabianca_di_Monterado_N:RMPR18 --->
TUR_Barcin_N,96.4
WHG,2
IRN_Ganj_Dareh_N,0.8
Yamnaya_RUS_Samara,0.8

ITA_Ripabianca_di_Monterado_N:RMPR19 --->
TUR_Barcin_N,93.6
WHG,4
Yamnaya_RUS_Samara,1.4
RUS_Khvalynsk_En,1

ITA_Rome_Latini_IA:RMPR1016 --->
TUR_Barcin_N,72.8
Yamnaya_RUS_Samara,14.4
WHG,7.6
RUS_Khvalynsk_En,4.6
GEO_CHG,0.6

ITA_Sicily_EBA:I3122 --->
TUR_Barcin_N,91.2
WHG,6.2
Yamnaya_RUS_Samara,1.4
RUS_Khvalynsk_En,1.2

ITA_Sicily_EBA:I3123 --->
TUR_Barcin_N,80.6
Yamnaya_RUS_Samara,6.8
WHG,6.6
TUR_Tepecik_Ciftlik_N,3.2
RUS_Khvalynsk_En,1.4
IRN_Ganj_Dareh_N,0.8
MAR_EN,0.4
GEO_CHG,0.2

ITA_Sicily_EBA:I3124 --->
TUR_Barcin_N,80.2
Yamnaya_RUS_Samara,8.4
WHG,6.8
RUS_Khvalynsk_En,3.4
TUR_Tepecik_Ciftlik_N,1.2

ITA_Sicily_EBA:I7807 --->
TUR_Barcin_N,93.2
WHG,2.8
Yamnaya_RUS_Samara,2.6
RUS_Khvalynsk_En,0.8
TUR_Tepecik_Ciftlik_N,0.6

ITA_Sicily_EBA:I8561 --->
TUR_Barcin_N,73.4
Yamnaya_RUS_Samara,13.6
WHG,9.6
RUS_Khvalynsk_En,3.2
GEO_CHG,0.2

ITA_Sicily_EBA:I11442 --->
TUR_Barcin_N,80
Yamnaya_RUS_Samara,6.6
TUR_Tepecik_Ciftlik_N,5.8
WHG,5
RUS_Khvalynsk_En,2
MAR_EN,0.4
GEO_CHG,0.2

ITA_Sicily_EBA:I11443 --->
TUR_Barcin_N,46.2
Yamnaya_RUS_Samara,35.2
WHG,7.8
RUS_Khvalynsk_En,6
TUR_Tepecik_Ciftlik_N,4.8

ITA_Sicily_LBA:I3876 --->
TUR_Barcin_N,69.4
TUR_Tepecik_Ciftlik_N,14.2
Yamnaya_RUS_Samara,6.4
WHG,4.6
RUS_Khvalynsk_En,2.8
IRN_Ganj_Dareh_N,2.2
GEO_CHG,0.2
MAR_EN,0.2

ITA_Sicily_LBA:I3878 --->
TUR_Barcin_N,84
Yamnaya_RUS_Samara,6.2
WHG,4.4
RUS_Khvalynsk_En,1.8
TUR_Tepecik_Ciftlik_N,1.6
IRN_Ganj_Dareh_N,1
MAR_EN,0.6
GEO_CHG,0.4

ITA_Sicily_LBA:I10372 --->
TUR_Barcin_N,84.6
TUR_Tepecik_Ciftlik_N,5
Yamnaya_RUS_Samara,5
WHG,3
RUS_Khvalynsk_En,1.8
IRN_Ganj_Dareh_N,0.6

ITA_Sicily_MBA:I3125 --->
TUR_Barcin_N,73.2
TUR_Tepecik_Ciftlik_N,12.8
Yamnaya_RUS_Samara,5.4
WHG,4.6
IRN_Ganj_Dareh_N,2.4
RUS_Khvalynsk_En,1.6

ITA_Sicily_MBA:I4109 --->
TUR_Barcin_N,87.2
WHG,3.6
TUR_Tepecik_Ciftlik_N,3
Yamnaya_RUS_Samara,2.8
IRN_Ganj_Dareh_N,1.4
RUS_Khvalynsk_En,1.4
MAR_EN,0.6

ITA_Sicily_MN:I4062 --->
TUR_Barcin_N,95.2
WHG,3.4
Yamnaya_RUS_Samara,0.8
RUS_Khvalynsk_En,0.6

ITA_Sicily_MN:I4063 --->
TUR_Barcin_N,95.2
WHG,4.6
Yamnaya_RUS_Samara,0.2

ITA_Sicily_MN:I4064 --->
TUR_Barcin_N,96.2
WHG,2.8
RUS_Khvalynsk_En,0.6
Yamnaya_RUS_Samara,0.4

ITA_Sicily_MN:I4065 --->
TUR_Barcin_N,89.2
WHG,7
Yamnaya_RUS_Samara,2.4
RUS_Khvalynsk_En,1.4

ITA_Villanovan:RMPR1015 --->
TUR_Barcin_N,74.4
Yamnaya_RUS_Samara,13.8
WHG,7
RUS_Khvalynsk_En,4.6
GEO_CHG,0.2

Certain results actually seem a bit odd, such as those showing Steppe in the Neolithic samples, for example, as well as in all the Remedello ones (I'm not sure they should). These are all single runs (rather than averages of several runs); still, they're enough to give an idea of what the different settings generate. This approach probably has some flaws of its own anyway. Another example: maybe it produces too high a WHG % in modern Italians (if I'm not missing something).
It'd be interesting to do tests using unscaled coordinates as well (with penalty), for comparison.
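
For screening a long list like the one above, here is a minimal Python sketch that parses the posted `LABEL --->` / `source,percent` format and flags Neolithic samples that were given any Steppe component. Only three of the posted samples are included to keep it short. The rule for recognising a Neolithic label (`_N:` or `_N_o:` before the sample ID) and the choice of which sources count as Steppe are assumptions based on the labels used in this thread.

```python
# Three of the posted results, in the same text format as above.
RAW = """\
ITA_Grotta_Continenza_N:RMPR2 --->
TUR_Barcin_N,98.2
WHG,1.8

ITA_Grotta_Continenza_N:RMPR8 --->
TUR_Barcin_N,96.4
WHG,2.2
Yamnaya_RUS_Samara,0.8
RUS_Khvalynsk_En,0.6

ITA_Prenestini_tribe_IA:RMPR435b --->
TUR_Barcin_N,60.4
Yamnaya_RUS_Samara,24.6
WHG,8
RUS_Khvalynsk_En,7
"""

# Assumed: these two sources are the "Steppe" ones in these models.
STEPPE = {"Yamnaya_RUS_Samara", "RUS_Khvalynsk_En"}


def parse_models(text):
    """Turn 'LABEL --->' blocks of 'source,percent' lines into nested dicts."""
    models = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.endswith("--->"):
            current = line[:-4].strip()
            models[current] = {}
        elif current is not None and "," in line:
            name, value = line.rsplit(",", 1)
            models[current][name] = float(value)
    return models


def steppe_share(model):
    """Total percentage assigned to the Steppe sources in one model."""
    return round(sum(pct for name, pct in model.items() if name in STEPPE), 1)


def is_neolithic(label):
    """Assumed label convention: period tag '_N' (or '_N_o') before the ID."""
    return "_N:" in label or "_N_o:" in label


models = parse_models(RAW)
flagged = {
    label: steppe_share(model)
    for label, model in models.items()
    if is_neolithic(label) and steppe_share(model) > 0
}
print(flagged)
```

The same loop could be pointed at any other combination that looks suspicious, for instance Steppe in the Remedello samples, by swapping the label test.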

Below, the results from Vahaduo (scaled and no penalty).

[Image: 80AEsov.jpg]
[Image: idFVAe7.jpg]
 
Regio_X: I agree, the Yamnaya showing up in Neolithic Italy is unlikely. Lazaridis et al 2016, "Genomic insights into the origin of farming in the ancient Near East", documented that the Yamnaya Culture of the Steppe Herders had some 43% ancestry from a Near East source, which could be Iran_NEO/Chalcolithic, CHG or Armenian from the same periods. Could the Yamnaya be picking up early signals of those Near East sources that Antonio et al 2019 documented in Neolithic Lazio/Rome? Is there a Yamnaya source that captures the 57% WEHG / 43% Near East (Iran/Armenian/CHG) proportion more accurately?

The Yamnaya showing up in Early Bronze Age Sicily is consistent with the first sources of Steppe-related ancestry arriving there around 2,200 BC from Iberia, per Fernandes et al 2020, and of course the same goes for Lazio/Rome. I have mentioned this before, but there are some Neolithic samples from Sicily in the works, per the pre-print by Van de Loosdrecht et al 2020, "Genomic and dietary transitions during the Mesolithic and Early Neolithic in Sicily". While it is a pre-print (not yet certified by peer review), my reading is that the results from that paper are in line with what Antonio et al 2019 documented in Mesolithic and Neolithic Lazio. I think I posted the summary and the models the authors have in the current pre-print version earlier in this thread.

Still, back to your Neolithic Romans: I think your model is capturing what happened, with WHG ancestry before the Neolithic, then Neolithic ancestry dominant with some residual WHG left over. The Van de Loosdrecht et al 2020 paper documents the exact same pattern in Sicily.
 
Interesting. In the first set of analyses, more Tepecik in Bronze Age Protovillanovan from the Marche than in Sicily. Certainly is in Central Italy by the Iron Age.

Question will be when will it show up in Southern Italy.

What's really strange is that although Vahaduo, like the first analysis, includes Tepecik, it gives R850 nearly 20% Levantine PPNB versus 3.6 in the first analysis. That's way too big a difference.

How does that make sense?
 
Are those your graphs at imgur? Nice work and well presented, easy to read, descriptions and labels clear.
 
Angela: My first guess would be what Regio_X said in post 207, and what I think you yourself have suggested, as have I: that these amateur calculators are very, very sensitive to the source variables included. The fact that the choice of source variables can result in R850 going from 20% Levantine PPNB to 3.6%, or vice versa, supports that basic point, I think.

My point has always been that if the amateur calculator results are way out of line with a large body of published papers, then I will by default lean toward the published research. That doesn't mean a published paper can't have flaws or be confusing. The recent discussion about Morocco_LN and how it is measured admixture-wise, which caused some emotions to run high (I myself was guilty of that), sort of supports that basic point, I think.
 
Pretty different G25 results, indeed. I posted both precisely for this comparison. It's good to understand how these tools work. In the first set of results, I used penalty, which in theory avoids overfitting, but I believe it may do so at some cost as well, so much so that we're seeing traces of Steppe in the Neolithic samples, and in all the Remedello ones. Additionally, perhaps it "concentrates" WHG in modern Italians, for example, maybe to avoid overfitting in Barcin and Steppe? I really don't know. Vahaduo, in turn, works with no penalty. Oh, and these are scaled coordinates. We could still test it with unscaled coordinates (which would generate two more sets of different results in G25). In addition to all of that, we have the issue of the sources selected, which can affect results substantially. For example, if we use Khvalynsk instead of Yamnaya in Vahaduo, as Ygorcs usually does, Steppe drops, and the Levant PPNB also drops a lot. Mind you, I used both Khvalynsk and Yamnaya, but Vahaduo chooses Yamnaya entirely, whereas the model with penalty may use both Steppe sources. The question is: which of all these models is closer to reality?

So, regarding R850, using penalty, part of that Levant PPNB, Barcin and CHG we see in Vahaduo generates the Tepecik-like you see in the first modelling (with penalty). On the other hand, even in Vahaduo, when Ygorcs for example uses only Khvalynsk rather than Yamnaya, Steppe also drops a bit, and we have then an excess of CHG (in comparison), so to speak. This "excess" of CHG makes the algorithm pick up some PPNB-like and Barcin-like to form a Tepecik-like. In other words, the Tepecik demands Barcin-like (mainly) plus some CHG/Iran-like and Levant PPNB-like, at the same time the latter (PPNB) itself demands some AAF-like (AHG plus CHG/Iran).

What's curious in Vahaduo is that it may distinguish between Barcin and Tepecik in many cases, but not between Yamnaya and Khvalynsk: it always prefers Yamnaya-like, completely. The penalty I used in the R software "forced" G25 to pick up some Khvalynsk-like ancestry as well.

Finally, as I said, each run using penalty generates slightly different results, which means they are non-reproducible. It's a problem. I've tried to minimize that by running the same targets (moderns) many times and averaging the results. But I ran the targets in my last post just once, otherwise the process would be too time-consuming.

As I said in another post, these models may be "delicate". :)
It'd be really great if one of us learned how to use other tools such as qpAdm. Getting the same results from different "approaches" could be used to reinforce hypotheses, or the opposite.

Sorry if my post is not so clear. I have problems with English. ;)
 
Pretty different G25 results, indeed. I posted both precisely for this comparison. It's good to understand how these tools work. In the first set of results, I used penalty, which in theory avoid overfittings, but I believe it could do it at some expense as well, so much so that we're seeing traces of Steppe in Neo, and in all Remedello. Additionally, perhaps it "concentrates" WHG in modern Italians, for example, maybe to avoid overfitting in Barcin and Steppe? Really don't know. Vahaduo in turn works with no penalty. Oh!, and these are scaled coordinates. We could still test it with unscaled coordinates (which would generate more two sets of different results in G25). Additionally to all of that, we have the issue of the sources selected, which can affect results substantially. For example, if we use Khvalynsk instead Yamnaya in Vahaduo, as Ygorcs usually do, Steppe drops, and the Levant PPNB also drops a lot. Mind you, I used both Khvalynsk and Yamnaya, but Vahaduo chooses Yamnaya entirely, whereas the model with penalty may use both Steppe sources. Question is: which of all these models is closer to the reality?

So, regarding R850: with penalty, part of the Levant PPNB, Barcin and CHG we see in Vahaduo is absorbed into the Tepecik-like component you see in the first model (with penalty). On the other hand, even in Vahaduo, when Ygorcs, for example, uses only Khvalynsk rather than Yamnaya, Steppe also drops a bit, and we then have an "excess" of CHG in comparison, so to speak. This "excess" of CHG makes the algorithm combine it with some PPNB-like and Barcin-like ancestry to form a Tepecik-like component. In other words, Tepecik requires mainly Barcin-like ancestry plus some CHG/Iran-like and Levant PPNB-like, while PPNB itself requires some AAF-like (AHG plus CHG/Iran).
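
To make the trade-off above concrete, here is a toy sketch in which one source is itself a mixture of the others. It is not how Vahaduo or the R script work internally. The coordinates are hypothetical three-dimensional values, not real G25 data, and the assumption that Tepecik sits exactly at 70% Barcin + 20% CHG + 10% PPNB is made up for illustration. Under that assumption, two very different breakdowns fit the same target equally well.

```python
def mix(weights, sources):
    """Return the weighted average of the source coordinate vectors."""
    dims = len(next(iter(sources.values())))
    return [sum(w * sources[name][i] for name, w in weights.items())
            for i in range(dims)]

def distance(a, b):
    """Euclidean distance between two coordinate vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

# Hypothetical toy coordinates, NOT real G25 values.
sources = {
    "Barcin": [1.0, 0.0, 0.0],
    "CHG":    [0.0, 1.0, 0.0],
    "PPNB":   [0.0, 0.0, 1.0],
}
# Assumption for illustration: Tepecik is an exact mix of the other three.
sources["Tepecik"] = mix({"Barcin": 0.7, "CHG": 0.2, "PPNB": 0.1}, sources)

# Hypothetical target to be modelled.
target = [0.65, 0.30, 0.05]

# Two candidate models whose weights both sum to 1.
model_a = {"Tepecik": 0.5, "Barcin": 0.3, "CHG": 0.2}
model_b = {"Barcin": 0.65, "CHG": 0.30, "PPNB": 0.05}

for label, model in (("with Tepecik", model_a), ("without Tepecik", model_b)):
    fit = distance(mix(model, sources), target)
    print(f"{label}: {model} distance={fit:.6f}")
```

Both models reach a distance of essentially zero, so the fit alone cannot say whether the target has 50% Tepecik or none. In a setup like this, a penalty term or a different choice of sources can tip the result one way or the other.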

No, you're very clear; it's the situation that isn't clear.

Which "results" are closer to reality, and which are just artifacts?

Perhaps when we have really proximate samples to compare it will be clearer than when we try to model moderns using such ancient samples.
 
Indeed. I guess the easiest components to distinguish are the very old ones; however, they may not be that informative.
As for the different models, I really don't know which is closer to reality, but at least they may agree on some things. I believe it's possible to build some hypotheses on these "agreements".
Out of curiosity, this is what unscaled coordinates return for Remedello RISE486 and for R850, using the same sources, with and without penalty. Notice that Tepecik shows up for RISE486, while the scaled models returned only Barcin.

RISE486 unscaled with penalty
Anatolia_Barcin_N,60.2
Anatolia_Tepecik_Ciftlik_N,29
WHG,7.2
MAR_EN,1.4
Levant_PPNB,1.2
RUS_Khvalynsk_En,0.6
Yamnaya_RUS_Samara,0.4

RISE486 unscaled without penalty
Anatolia_Barcin_N,65.6
WHG,20.8
Anatolia_Tepecik_Ciftlik_N,10.8
MAR_EN,2.8

R850 unscaled with penalty
Anatolia_Tepecik_Ciftlik_N,44
Anatolia_Barcin_N,18.2
Yamnaya_RUS_Samara,14.4
IRN_Ganj_Dareh_N,9.6
Levant_PPNB,7.6
GEO_CHG,2.2
RUS_Khvalynsk_En,2
CMR_Shum_Laka_8000BP,1
MAR_EN,1

R850 unscaled without penalty
Anatolia_Barcin_N,45.6
Yamnaya_RUS_Samara,17.6
Levant_PPNB,15.6
IRN_Ganj_Dareh_N,15.2
GEO_CHG,3.8
MAR_EN,1.4
Anatolia_Tepecik_Ciftlik_N,0.8

That's what I was talking about when I mentioned that unscaled coordinates would generate yet more sets of different results.

So, considering the sources chosen, which of the four do you think makes the most sense for R850, for example? Scaled without penalty (the one from Vahaduo I posted), scaled with penalty, unscaled without penalty or unscaled with penalty? Lol
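
One way to look for the "agreements" between models is to collapse each one into broad groups and compare the totals. The sketch below does that for the two unscaled R850 results posted above. The grouping (Barcin + Tepecik as Anatolian, Yamnaya + Khvalynsk as Steppe, Ganj Dareh + CHG as CHG/Iran) follows the totals used earlier in the thread; the function name `group_totals` is just an illustrative choice.

```python
# Broad groups; any source not listed here is kept under its own name.
GROUPS = {
    "Anatolia_Barcin_N": "Anatolian",
    "Anatolia_Tepecik_Ciftlik_N": "Anatolian",
    "Yamnaya_RUS_Samara": "Steppe",
    "RUS_Khvalynsk_En": "Steppe",
    "IRN_Ganj_Dareh_N": "CHG/Iran",
    "GEO_CHG": "CHG/Iran",
}

def group_totals(model):
    """Sum source percentages into broad groups, rounded to one decimal."""
    totals = {}
    for source, pct in model.items():
        group = GROUPS.get(source, source)
        totals[group] = round(totals.get(group, 0.0) + pct, 1)
    return totals

# R850 unscaled, values as posted above.
r850_penalty = {
    "Anatolia_Tepecik_Ciftlik_N": 44, "Anatolia_Barcin_N": 18.2,
    "Yamnaya_RUS_Samara": 14.4, "IRN_Ganj_Dareh_N": 9.6,
    "Levant_PPNB": 7.6, "GEO_CHG": 2.2, "RUS_Khvalynsk_En": 2,
    "CMR_Shum_Laka_8000BP": 1, "MAR_EN": 1,
}
r850_no_penalty = {
    "Anatolia_Barcin_N": 45.6, "Yamnaya_RUS_Samara": 17.6,
    "Levant_PPNB": 15.6, "IRN_Ganj_Dareh_N": 15.2, "GEO_CHG": 3.8,
    "MAR_EN": 1.4, "Anatolia_Tepecik_Ciftlik_N": 0.8,
}

with_pen = group_totals(r850_penalty)
without_pen = group_totals(r850_no_penalty)
for group in sorted(set(with_pen) | set(without_pen)):
    a = with_pen.get(group, 0.0)
    b = without_pen.get(group, 0.0)
    print(f"{group}: penalty={a:.1f} no_penalty={b:.1f} diff={abs(a - b):.1f}")
```

On these two runs the Steppe total barely moves (16.4 vs. 17.6), while the Anatolian total shifts by almost 16 points (62.2 vs. 46.4) and Levant PPNB roughly doubles (7.6 vs. 15.6). That suggests the Steppe share is the part the two models agree on for R850, at least with these sources.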
 
To borrow a phrase from the King of Siam in "The King and I": It's a puzzlement. :)

Seriously, this is why I have always been so skeptical about the validity of this "modeling". I'm not accusing anyone here of deliberately choosing a certain method, not to mention a "judicious" choice of samples, to support a certain agenda. But surely it should be clear that it can be done. I have been saying so for many years, ever since discussions I was following alerted me that it was being done.

So, for us here, let's say I am skeptical of the "precision" of it. Maybe other methods are more accurate. That's a whole other avenue of exploration. Maybe there's a reason the academics rather unanimously don't show results for Levant Neolithic.
 
Modern Sardinian (I don't know from where in Sardinia)


Z436675
Threshold of components set to 1.000
Threshold of method set to 0.25%
Personal data has been read. 20 approximations mode.

Gedmatch.Com
MDLP K11 Modern 4-Ancestors Oracle
This program is based on 4-Ancestors Oracle Version 0.96 by Alexandr Burnashev.
Questions about results should be sent to him at: [email protected]
Original concept proposed by Sergey Kozlov.
Many thanks to Alexandr for helping us get this web version developed.

MDLP K11 2xOracle and OracleX4

Admix Results (sorted):

#  Population  Percent
1  Neolithic   61.44
2  WHG         13.35
3  Basal       12.51
4  EHG         11.43
5  African     1.10


Finished reading population data. 161 populations found.
11 components mode.

--------------------------------

Least-squares method.

Using 1 population approximation:
1 Hungary_CA @ 15.339438
2 GermanStuttgart_LBK @ 15.866030
3 GermanStuttgart_LBK @ 15.866030
4 Hungary_MBA @ 17.711269
5 Irish_LN @ 18.546011
6 Iberian_Chalcolitic @ 21.809052
7 Vatya_MBA @ 22.780289
8 Balkan_LBA @ 23.488144
9 Europe_MN @ 23.728521
10 Germany_BA @ 23.773239
11 Germany_Bronze_Age @ 23.773239
12 Scandinavian_MN @ 23.889975
13 Iberia_Chalcolithic @ 23.990364
14 Anatolia_Chalcolithic @ 24.215815
15 Maros_BA @ 24.369606
16 Baalberge_MN @ 24.497467
17 Spain_MN @ 25.355747
18 Salzmuende_MN @ 26.649120
19 Remedello_BA @ 27.135021
20 Esperstedt_MN @ 27.761374

Using 2 populations approximation:
1 50% GermanStuttgart_LBK +50% Vatya_MBA @ 6.594885


Using 3 populations approximation:
1 50% Europe_EN +25% Irish_BA +25% Levant_N @ 2.692108


Using 4 populations approximation:
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 Bell_Beaker + LBK_EN + Levant_N + Remedello_BA @ 1.912332
2 Bell_Beaker + Iberian_Chalcolitic + Levant_N + Starcevo_EN @ 1.962995
 