The spread of 'Steppe' DNA and autosomal best-fit analysis

The saddest part is that, of 62 samples, they only got mtDNA. It seems to happen a little too often for important contexts... I guess it's just way cheaper to test only for mtDNA SNPs.
Dare I say it, there is also sometimes a culture within academia of withholding or delaying publication of information.
 
Indeed, indeed.
2019 will show by how much.
Two things will be revealed this upcoming year: 1. Neolithic Transcaucasus (6000-5000 BC) and 2. South Balkan Chalcolithic (4500-3800 BC).

And they have had the data for a while now. I bet it will be the final pieces of the puzzle (or of one of them).
 
Please post a link to your statistical best-fit.
Epoch. The part on which I most agree with Pip, and therefore disagree with you I suppose, is that "steppe" will turn out to be "nothing". And did I get a beating for that one...
Suvorovo from the steppe? Why? Because of ochre and stone mace-heads?
 
Indeed, indeed.
2019 will show by how much.
Two things will be revealed this upcoming year: 1. Neolithic Transcaucasus (6000-5000 BC) and 2. South Balkan Chalcolithic (4500-3800 BC).

And they have had the data for a while now. I bet it will be the final pieces of the puzzle (or of one of them).

Did you hear that it was in the works, or do you have an "intuition" that it will happen?
 
North Ukrainian R1a-M417 sample I6561 (dated to approximately 3,960 BC) gives a different picture of R1a-M417 Corded Ware, as Corded Ware seems to have directly descended from its community, without the need for any additional Steppe DNA insertion from later Yamnaya or EEF DNA insertion from Neolithic North European communities.

It perhaps indicates that Corded Ware populations were tightly-knit - largely genetically uninfluenced not only by Yamnaya on the paternal side (unsurprisingly, as they had virtually wholly different yDNA), but also by EEF Neolithic communities on the maternal side. It looks like Corded Ware was both of uniform paternal lineage and largely endogamous - its best fit contribution from the EEF groups that it replaced (Funnel Beaker, GA and Baalberge etc.) looks to be about 2% on average, so I cannot see that its men took in too many outsider women as it expanded.

One possibly interesting aspect is that North Eastern Corded Ware (Latvia and Lithuania) is a little different in that it does appear to have a Yamnayan element - a 95% best-fit contribution in one Latvian sample, and a 17% average contribution in Lithuanian samples generally. It is perhaps striking that these North Eastern Corded Ware (Yamnayan-admixed) R1a populations were the only ones that survived and thrived in Europe after Corded Ware's collapse.
 
Makes perfect sense to me. I've always wondered how academia can be so stupid as to expect 100% Y chromosome replacement from Yamnaya to Corded Ware, yet think Corded Ware comes directly from Yamnaya.
 
What program did you use? What dataset? And could you post the results? With standard errors and such?
 
I used the most extensive dataset I could find (Genetiker's), and selected whichever combination of prior-dated samples yielded the lowest percentage autosomal variance from the mean of the population in question. There were thousands of results for each test, and I've only retained the ones that yielded best fits.

For instance, the orthodox hypothesis (that German Corded Ware = 75% Russian Yamnaya + 25% Funnelbeaker) yields a variance that is almost ten times that of the best fit combination, so it is passed over.
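
To make the kind of search described above concrete, here is a minimal sketch in Python rather than Excel. It is an assumption about the general approach, not the actual spreadsheet: it scores a mixture by squared distance between component vectors (the measure used in the spreadsheet may well differ), it only tries two-way mixtures, and the names `source_A`, `source_B`, `source_C` and all the numbers are invented toy values, not real ADMIXTURE output.

```python
from itertools import combinations

def mix(sources, weights):
    """Weighted average of component vectors (weights are expected to sum to 1)."""
    n = len(sources[0])
    return [sum(w * s[i] for w, s in zip(weights, sources)) for i in range(n)]

def sq_dist(a, b):
    """Squared Euclidean distance between two component vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_two_way_fit(target, candidates, step=0.01):
    """Try every pair of candidate sources and every weight split in `step` increments.

    Returns (distance, name_a, weight_a, name_b, weight_b) for the closest mixture.
    """
    best = None
    steps = round(1 / step)
    for (name_a, vec_a), (name_b, vec_b) in combinations(candidates.items(), 2):
        for k in range(steps + 1):
            w = k / steps
            d = sq_dist(mix([vec_a, vec_b], [w, 1 - w]), target)
            if best is None or d < best[0]:
                best = (d, name_a, w, name_b, 1 - w)
    return best

# Invented toy component proportions (hypothetical, for illustration only).
candidates = {
    "source_A": [0.8, 0.2, 0.0],
    "source_B": [0.1, 0.3, 0.6],
    "source_C": [0.0, 0.9, 0.1],
}
# Target constructed as exactly 75% source_A + 25% source_B.
target = [0.625, 0.225, 0.15]

best = best_two_way_fit(target, candidates)
print(best)
```

Note that a search like this always returns some "best" combination, even when none of the candidates is a real ancestor, and it gives no standard errors, which is why the fit alone cannot settle the question.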
 
Here is another example result - for a German Bell Beaker best-fit: 68% Bulgaria Steppe-like Chalcolithic + 14% Khvalynsk R1a + 18% Globular Amphora. This is quite similar to German Corded Ware, except that its core EEF:EHG ratio is a bit larger and this distinction is accentuated by a best-fit admixture with a Globular Amphora population.

My reading of this is that the ancestors of German Corded Ware look most like a branched-off Eastern Ukraine Suvorovo, and the ancestors of German Bell Beaker look like Prut-Dniester Suvorovo who moved North West towards a GA population before Corded Ware got there. Corded Ware people look self-contained and largely endogamous; the ancestors of Bell Beaker people look to have mixed more with other EEF populations like GA.

Genetiker's dataset does not include RRBP, but an mtDNA best-fit for Bell Beaker suggests a degree of RRBP admixture. Perhaps advancing Corded Ware forced pre-Bell Beaker westwards across Southern Poland, Central Germany and into Northern France, from where it later resurged (as Bell Beaker) to challenge it?
 
I used the most extensive dataset I could find (Genetiker's)

Where can I find that dataset?

and selected whichever combination of prior-dated samples yielded the lowest percentage autosomal variance from the average mean of the population in question. There were thousands of results for each test, and I've only retained the ones that yielded best fits.

Using what methods and tools? How did you calculate this?


For instance, the orthodox hypothesis (that German Corded Ware = 75% Russian Yamnaya + 25% Funnelbeaker) yields a variance that is almost ten times that of the best fit combination, so it is passed over.
 
The dataset is under the heading K=14 admixture analysis, and is in graphical form, so is a little tricky to use. I wrote my own tool in Excel that calculates the percentage autosomal equivalence between the samples under investigation and different combinations of prior-dated samples. (Matching the data to specific samples, so that they can be dated, can also be quite tricky.)
 
This is not a sensible approach at all, to put it mildly.
 
I have also run some calculations for early Steppe DNA appearances in Southern Europe (ATP3 in Northern Spain 3,300 BC and Croatian Vucedol 2,800 BC). These look related to each other, but not directly related to Bell Beaker, Corded Ware or Yamnaya.

ATP3's best fit comes out as 39% Central Anatolian Neolithic, 38% Bulgarian Steppe-like Chalcolithic and 23% Bulgarian other Chalcolithic.

Steppe-like Vucedol's best fit comes out as 33% ATP3, 24% Khvalynsk Q, 23% Central Anatolian Neolithic, 17% Cucuteni-Tripolye and 3% Ukraine early Chalcolithic.

My reading of this is that the origin looks most like East Balkan Suvorovo that branched off early up the Danube, with ATP3 venturing as far as Spain and pre-Vucedol staying in the Northern Balkans and mixing with Cucuteni.
 
Why, in Genetiker's K=16 run for ATP3, is the "Steppe" component the teal/CHG one? Why would CHG outweigh EHG in the Steppe ancestry?
 
Not sure. I can't even find the K=16 data for ATP3 - only that in Genetiker's K=16, the teal seems to be identified as "Northern Middle Eastern". In recent years, Genetiker has worked from a K=14 plot, which has the most extensive dataset, so this is what I have used.
 
Why not? And what is the approach that works better?


What markod says. Also, ADMIXTURE is quite a rough approach to genetic mixture, especially if you mix ancient populations with modern-day ones. To fully appreciate what it says, you need to weigh in other K values as well. ADMIXTURE goes a bit like this: when *forced* to be modeled as a combination of two ancestral components, how would the samples look (that is K=2), and when *forced* to be modeled as a combination of three components (K=3), and so on. Afterwards the statistically best-fitting K is chosen, if run in unsupervised mode. Even in the unsupervised case, a lot of individual samples will simply be a forced bad fit.

The tools that the Reich lab created (f3 statistics, D-statistics and the rest of ADMIXTOOLS) are available but rather complicated to use. Also, you'd probably need the full genotype data for those, which takes up a huge amount of disk space, but if you choose that path and are willing to face a steep learning curve, you can download it here.

Davidski of Eurogenes provides a simpler tool called nMonte, created by Huijbrechts (more explanation here), that allows modeling on the basis of pre-calculated PCA values. Its conclusions are consistently similar to those of qpAdm from the Reich lab tools. You have to install R, though.
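
For anyone who wants a feel for what "modeling on the basis of pre-calculated PCA values" can mean before installing R, here is a rough sketch in Python. To be clear, this is not nMonte's code and I can't vouch for its internals. It is only a hypothetical simplification of the general idea of Monte Carlo distance minimisation: fill a fixed number of slots with source populations, swap one slot at a time at random, and keep a swap only if the average moves closer to the target. The function name `monte_carlo_fit`, the source names and the 2-D coordinates are all invented for illustration.

```python
import random

def monte_carlo_fit(target, sources, slots=100, iterations=20000, seed=0):
    """Greedy random-swap search for source weights whose average is closest to target."""
    rng = random.Random(seed)
    names = list(sources)
    dims = range(len(target))
    # Start with a random source in every slot.
    picks = [rng.choice(names) for _ in range(slots)]
    totals = [sum(sources[p][i] for p in picks) for i in dims]

    def distance(tot):
        # Squared distance between the slot average and the target coordinates.
        return sum((tot[i] / slots - target[i]) ** 2 for i in dims)

    best = distance(totals)
    for _ in range(iterations):
        j = rng.randrange(slots)
        new = rng.choice(names)
        old = picks[j]
        if new == old:
            continue
        trial = [totals[i] - sources[old][i] + sources[new][i] for i in dims]
        d = distance(trial)
        if d < best:  # keep the swap only if it improves the fit
            picks[j], totals, best = new, trial, d
    weights = {n: picks.count(n) / slots for n in names}
    return weights, best

# Invented toy coordinates (hypothetical, not real PCA output).
toy_sources = {"src_A": (0.0, 0.0), "src_B": (1.0, 0.0), "src_C": (0.0, 1.0)}
toy_target = (0.25, 0.0)  # constructed as 75% src_A + 25% src_B

weights, dist = monte_carlo_fit(toy_target, toy_sources)
print(weights, dist)
```

With real data the sources would be population averages on many PCA dimensions, and the result would still need checking against formal statistics.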
 
Because it's based on a supervised ADMIXTURE analysis. These have to be interpreted with some caution.
Yes, I agree that it has to be used with caution.
My data analysis provides only a rough guide to what is the best fit from the limited range of samples that we have.
However, what it does indicate is that the Yamnayan samples that we have (in combination with a variety of other ancient samples) give readings so different from Bell Beaker and Corded Ware that they are clearly not the best explanation as their major genetic contributor. The 'Steppe' components within BB, CW, ATP3 and Vucedol match much more closely with preceding Steppe people (Khvalynsk), especially those that appear to have already been present in Bulgaria by the 5th millennium BC, admixed with Anatolian/EEF.
 
If you can replicate that with D-stats, f3 stats or qpAdm, yes. But it could also simply be an artifact.
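
For reference, a D-statistic itself is simple to compute once you have per-SNP allele frequencies. Below is a minimal Python sketch of the commonly used frequency-based form, D(W, X; Y, Z) = Σ(w−x)(y−z) / Σ(w+x−2wx)(y+z−2yz). Treat it as an illustration under assumptions: the function name `d_statistic` and the frequencies are invented, sign conventions differ between tools, and a real test needs genome-wide SNPs plus a block jackknife for standard errors, which is not included here.

```python
def d_statistic(p_w, p_x, p_y, p_z):
    """D(W, X; Y, Z) from per-SNP allele frequencies (four lists of equal length)."""
    num = 0.0
    den = 0.0
    for w, x, y, z in zip(p_w, p_x, p_y, p_z):
        num += (w - x) * (y - z)
        den += (w + x - 2 * w * x) * (y + z - 2 * y * z)
    if den == 0:
        raise ValueError("no informative SNPs")
    return num / den

# Invented toy frequencies (hypothetical, two SNPs only).
# W and X identical: no asymmetry in allele sharing, so D is 0.
symmetric = d_statistic([0.2, 0.7], [0.2, 0.7], [0.9, 0.1], [0.3, 0.5])
# W shares its allele only with Y at the single informative SNP: D is 1.
skewed = d_statistic([1.0, 0.0], [0.0, 0.0], [1.0, 0.0], [0.0, 0.0])
print(symmetric, skewed)
```

A D value on its own means little; what matters is how many standard errors it sits from zero.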
 
