Steppe DNA South of the Caucasus

Dating - If we take R1a-Z93 as a proxy for steppic DNA South of the Caucasus, and take the expansion in the number of its extant lineages as a proxy for the expansion and dispersal of its populations, we see the following pattern. (I am using yfull's estimates, based on SNP variance. My own estimates based on STR variance show similar results.)

Expansion per 100 years -
3,000-2,850 BC - 0%
2,850-2,700 BC - over 1,000%
2,700-2,250 BC - 4%
2,250-1,800 BC - 89%
1,800-1,500 BC - 8%
1,500-900 BC - 7%
900 BC-0 AD - 3%
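
For anyone wanting to reproduce figures of this kind, here is a minimal sketch of one way a per-century expansion rate could be computed from counts of extant lineages at the start and end of a period. It assumes compound growth, which may not be how the percentages above were derived. The function name and the example counts are hypothetical and are not yfull data.

```python
def growth_per_century(count_start, count_end, years):
    """Compound percentage growth per 100 years between two lineage counts.

    Assumption: growth compounds evenly over the period. This is an
    illustrative choice, not a method stated in the post.
    """
    if count_start <= 0 or years <= 0:
        raise ValueError("count_start and years must be positive")
    ratio = count_end / count_start
    return (ratio ** (100.0 / years) - 1.0) * 100.0


# Hypothetical counts for illustration only (not taken from yfull):
# a lineage count that doubles over 200 years.
print(round(growth_per_century(10, 20, 200), 1))
```

A period with no new lineages gives 0% under this definition, matching how the first row of the list reads.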

It is clear that Z93 was a regular lineage for most of its history, apart from during two huge bursts of expansion/development - around 2,850-2,700 BC (consistent with CWC's initial expansion) and 2,250-1,800 BC. The latter of these expansions is presumably reflective of the expansion and spread of Z93 populations East and South of the Caucasus, and its greatest period of expansion is during the early part (2,250-2,200 BC).

This indicates that the expansions of the Mitanni and Hittites (dated to around 1,500 BC) were not proxies for the expansion of Z93 populations, which had mainly occurred probably 700-800 years earlier. Accordingly, I would propose that significant steppic DNA moved East and South of the Caucasus before 2,000 BC (as autosomal DNA suggests), and that subsequent IE/Aryan expansions during the middle 2nd millennium BC were of admixed populations (in which steppic DNA was already only a minor component, say up to 20%).
 
yDNA data from yfull suggest that two separate waves of steppic people spread South of the Caucasus. Before 2,200 BC, yfull estimates that these populations developed in tandem, with expansions in R1a-Z2124 and R1a-Y3 lineages occurring at the same time. However, after 2,200 BC, these two haplogroups show completely opposite development patterns:

Between 2,200 and 1,850 BC, it is estimated that 23 new extant Y3 lineages emerged, compared to only 4 for Z2124.

In the subsequent period 1,850 to 1,450 BC, the converse is estimated to have occurred, with 41 new lineages emerging for Z2124 and only 1 for Y3.

After this, both haplogroups develop fairly normally at gradual rates, with only Y3 having another strongish growth spell at the end of the second millennium BC.
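
To put the two periods on a common footing, the counts quoted above can be converted into new lineages per century. This is just arithmetic on the figures already given. The dictionary layout and function name are my own, for illustration.

```python
# Counts of new extant lineages as quoted in the post above.
periods = {
    "2200-1850 BC": {"years": 350, "Y3": 23, "Z2124": 4},
    "1850-1450 BC": {"years": 400, "Y3": 1, "Z2124": 41},
}


def per_century(new_lineages, years):
    """Average number of new lineages per 100 years."""
    return new_lineages * 100.0 / years


rates = {
    label: {
        clade: round(per_century(data[clade], data["years"]), 2)
        for clade in ("Y3", "Z2124")
    }
    for label, data in periods.items()
}

for label, clade_rates in rates.items():
    print(label, clade_rates)
```

On these figures Y3 runs at roughly 6.6 new lineages per century in the first period against about 1.1 for Z2124, and the positions reverse in the second period (0.25 against 10.25).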

Questions: might this pattern tie up with
1. Autosomal data, which appears to show two different steppic contributions in Armenia - a more Westerly one (Poltavka/Srubnaya/Eastern Corded Ware-like) in the Middle Bronze Age, replaced by a more easterly one (Potapovka/Sintashta-like) in the Late Bronze Age?
2. Any cultural/historical developments South of the Caucasus?
(Bear in mind, some think yfull's dates are a little conservative, although my own calculations using different methodology yield pretty similar estimates.)

Also, do we have early archaeological Y3 samples?
 
Hi, so what are you all thinking for Armenians? Where did they come from? Catacomb? When did they arrive in Armenia?

Was the MBA sample from the Trialeti Culture? Do you all think the Trialeti Culture were speakers of (proto)Armenian?

I'm seeing the Baltics mentioned a few times. What's interesting is that some linguists have noted some morphological similarities between Armenian and Balto-Slavic languages.

Are we sure that samples found with Steppe ancestry in Armenia are Armenian and not Iranian (Scythian), Cimmerian, or some other group? Perhaps the Iranians or Cimmerians account for the LBA Sintashta-like sample(s)?

From the historical record, in the Bible, a region in the Armenia area is called Ashkenaz. Ashkenaz is usually equated with the Scythians. Also, Ishkugulu and possibly Eriak, mentioned by the Urartians in the 8th century BCE, could possibly be Scythian settlements as well (Ishku could be a version of Ashka, Eriak would be Arias (Aryans?)).

It's also believed that Gyumri, which is located near where Ishkugulu and Eriak were, was settled by/named after Cimmerians, during the 8th century BCE as well.

So it seems that there was a Scythian and Cimmerian presence in what is now modern Armenia. We know that they helped to overthrow the Urartians, but it seems that they were in the region at least a century prior to the fall of Urartu. If these LBA samples were found in the north of modern Armenia, they could very well be from one of these other groups.
 
How is it that there is decent steppe ancestry in the South Caucasus but nothing in Eastern Anatolia?

It is intriguing for me that 1 of the MLBA Anatolian individuals I have data from consistently picks some small amount of steppe ancestry in the models I have tried for several BA samples of that and other regions. The other samples totally lack that component, only 1 of them does. The fits improve a bit when some steppic source is included. What's intriguing, considering the pretty early divergence of the Anatolian IE family, is that that Anatolian sample consistently prefers an early steppe input (Progress_Eneolithic) over any other when I use all kinds of steppe aDNA samples possible (Yamnaya, Eneolithic Steppe, BB, CWC, Andronovo), in addition to a larger number of non-steppic DNA samples of the Neolithic, Chalcolithic and Early BA.

[1] "distance%=2.1477 / distance=0.021477"
Anatolia_MLBA:MA2203


Anatolia_EBA_Ovaoren 46.80
Greece_N 31.05
Sarazm_Eneolithic 7.65
Levant_N 5.75
Armenia_ChL 5.20
RUS_Progress_Eneolithic 3.55
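
For readers unfamiliar with this kind of output, here is a minimal sketch of how a fit distance of this sort could be computed. It assumes the reported distance is the Euclidean distance between the target's coordinates and the weighted average of the source coordinates; the post does not say which tool or metric was used. The two-dimensional coordinates in the toy example are hypothetical. Only the percentages are taken from the model above.

```python
import math

# Percentages as reported in the model above.
weights_pct = {
    "Anatolia_EBA_Ovaoren": 46.80,
    "Greece_N": 31.05,
    "Sarazm_Eneolithic": 7.65,
    "Levant_N": 5.75,
    "Armenia_ChL": 5.20,
    "RUS_Progress_Eneolithic": 3.55,
}

# Sanity check: the reported components should account for the whole sample.
total_pct = round(sum(weights_pct.values()), 2)
print(total_pct)


def mixture_distance(target, sources, weights):
    """Euclidean distance between a target and a weighted mix of sources.

    Assumption: weights are normalised to sum to 1 before mixing.
    """
    total = sum(weights.values())
    mix = [
        sum(weights[name] / total * sources[name][i] for name in weights)
        for i in range(len(target))
    ]
    return math.sqrt(sum((t - m) ** 2 for t, m in zip(target, mix)))


# Toy example with hypothetical coordinates (not real aDNA data).
toy_sources = {"A": [0.0, 0.0], "B": [1.0, 0.0]}
toy_weights = {"A": 50.0, "B": 50.0}
toy_distance = mixture_distance([0.5, 0.1], toy_sources, toy_weights)
print(round(toy_distance, 4))
```

Under this reading, a small component such as the 3.55% above only matters if including it lowers the distance noticeably, which is the sense in which the fit is said to improve a bit when a steppic source is included.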
 
This indicates that the expansions of the Mitanni and Hittites (dated to around 1,500 BC) were not proxies for the expansion of Z93 populations, which had mainly occurred probably 700-800 years earlier. Accordingly, I would propose that significant steppic DNA moved East and South of the Caucasus before 2,000 BC (as autosomal DNA suggests), and that subsequent IE/Aryan expansions during the middle 2nd millennium BC were of admixed populations (in which steppic DNA was already only a minor component, say up to 20%).

Why do you presume that the subsequent IE/Aryan expansions were of admixed populations with little steppic admixture? Couldn't it have been, as the data suggest, a secondary and more demographically (in the long term) impactful expansion coming originally from a region that had preserved its MLBA Steppe ancestry much better, North-Central Asia (Srubnaya and Andronovo horizons)? I agree that when that kind of ancestry, together with some clades of Z93 (and other haplogroups), arrived in areas like the Iranian Plateau and the Armenian Highlands (probably absorbing the steppe-admixed people already living in parts of that broad region), they were already mixed with South-Central Asian people along the way (mainly of Iran_Neolithic descent), but I do not think the expansion began with such people with diluted steppic ancestry. What I think is likely is that a post-Yamnaya type of ancestry with R1a-Z93 went along a southern route and also along an eastern route (maybe in its first expansion, i.e. 2850-2700 B.C.), and much later a secondary southward expansion came mainly from those who had taken that former eastern route, coming via Turan into West Asia.
 
Hi, so what are you all thinking for Armenians? Where did they come from? Catacomb? When did they arrive in Armenia?

Was the MBA sample from the Trialeti Culture? Do you all think the Trialeti Culture were speakers of (proto)Armenian?

If you're talking about modern Armenians, they are probably derived from a mixture of different populations. One MBA sample was identified as Trialeti, although it looks an atypical 'outlier' to the other samples. I don't know about language, and am just looking at the genetics.

Are we sure that samples found with Steppe ancestry in Armenia are Armenian and not Iranian (Scythian), Cimmerian, or some other group? Perhaps the Iranians or Cimmerians account for the LBA Sintashta-like sample(s)?
The samples I looked at pre-date the Scythians. I haven't analysed them in detail, but they don't really look Scythian to me. Yes, perhaps the Scythians had a genetic influence at a later stage?
 
It is intriguing for me that 1 of the MLBA Anatolian individuals I have data from consistently picks some small amount of steppe ancestry in the models I have tried for several BA samples of that and other regions. The other samples totally lack that component, only 1 of them does. The fits improve a bit when some steppic source is included. What's intriguing, considering the pretty early divergence of the Anatolian IE family, is that that Anatolian sample consistently prefers an early steppe input (Progress_Eneolithic) over any other when I use all kinds of steppe aDNA samples possible (Yamnaya, Eneolithic Steppe, BB, CWC, Andronovo), in addition to a larger number of non-steppic DNA samples of the Neolithic, Chalcolithic and Early BA.

[1] "distance%=2.1477 / distance=0.021477"
Anatolia_MLBA:MA2203
Anatolia_EBA_Ovaoren 46.80
Greece_N 31.05
Sarazm_Eneolithic 7.65
Levant_N 5.75
Armenia_ChL 5.20
RUS_Progress_Eneolithic 3.55

Yes, I'm sure some steppic DNA leaked into Anatolia at various points, especially originating from predominantly R1b communities - after all, the Steppe really isn't that far away. As I've pointed out before, most branches of R1b-Z2103 seem to coalesce there, and R1b-Bell Beaker also has a good fit with a Central Anatolian Neolithic component. I imagine that any steppic immigrants were heavily outnumbered, and thus their DNA was for the most part massively diluted through admixture over time.
 
Why do you presume that the subsequent IE/Aryan expansions were of admixed populations with little steppic admixture?
Because (i) the major growth of steppic lineages seems to pre-date the expansion of apparently steppic cultures, and (ii) we can see from Armenian samples that steppic autosomal DNA was already heavily diluted by indigenous Middle Eastern DNA and mixed into Middle Eastern paternal lineages at an earlier stage. I don't see any pure steppic samples anywhere South of the Caucasus. Isn't there also some evidence to suggest that some adopted local languages like Hurrian?

I agree that when that kind of ancestry, together with some clades of Z93 (and other haplogroups), arrived in areas like the Iranian Plateau and the Armenian Highlands (probably absorbing the steppe-admixed people already living in parts of that broad region), they were already mixed with South-Central Asian people along the way (mainly of Iran_Neolithic descent), but I do not think the expansion began with such people with diluted steppic ancestry.
Yes, I don't think the expansion of steppic lineages in the region necessarily began with people of diluted steppic ancestry. I am merely talking about the expansion of the later steppe-like cultures/peoples, such as the Mitanni and the Hittites, which seem to have arisen quite some time after the steppic lineages themselves expanded.

What I think is likely is that a post-Yamnaya type of ancestry with R1a-Z93 went along a southern route and also along an eastern route (maybe in its first expansion i.e. 2850-2700 B.C.), and much later a secondary southward expansion came mainly from those who had taken that former eastern route, coming via Turan into West Asia.
I often see the expansion of steppic people described as coming in waves, with the first wave ignored as largely disappearing, followed by a second wave that also apparently disappears with little trace, and with only the last wave believed to have any significant genetic impact. To me, this is not only simplistic, but also fits poorly with the data.

The data suggest to me that, for the most part, the various steppic peoples arrived and thrived at the Caucasus approximately together. Some (the earlier thrivers) quickly mixed with Southern/Middle Eastern peoples, whereas the later thrivers look to be more heavily mixed with older Armenian-like peoples (rather than mainstream Iranians/Central Asians). I can't see that much difference between them, apart from the later thrivers looking a bit closer to East Volga, rather than West Volga. The abrupt shifts in their lineage growth patterns between one branch and another, and the abrupt changes in best-fit admixed autosomal inheritance, both suggest conflict among the steppic groups themselves, rather than a joint steppic people conflicting with indigenous folk, and I would expect that many of these indigenous folk were co-opted into their internecine warring.

Within a few centuries of arrival, the massive expansion of steppic lineages dies out, presumably as the steppic people became less destructive of indigenous paternal lineages, and more integrated into general Middle Eastern populations.
 
These are the main genetic differences between the earlier (MBA) and the later (LBA) steppic samples in Armenia:
1. The MBA people had a substantial (approximately 1/3) Southern component, derived either from the Levant, Iraq or even possibly Arabia, whereas this Southern component appears absent from the LBA people. This seems unlikely to be a coincidence, and looks to be a sign that some of the first MBA steppic people ventured deep into the Middle East before returning admixed with Middle Easterners to colonise Armenia.
2. There is little trace of EBA Armenians in the MBA people, but they comprise a substantial component within the LBA people. This suggests that the MBA steppic people resurging into Armenia from the South did not integrate with EBA Armenians, but that the LBA steppic people did. My suggestion is that the MBA and LBA steppic people arrived approximately together, but that the MBA people roamed further, whereas the LBA people were descended from steppics who largely remained within their Southern Caucasus refuge.
3. I do not see any best-fit traces of Iranian within either the MBA or LBA Armenian steppic samples. Accordingly, my suggestion is that both most likely derive from people who came directly to the Southern Caucasus without gestating in Central Asia or Iran along the way.

In each of these admixed populations, perhaps unlike in Europe, both autosomal DNA and paternal lineages of the indigenous people seem to have continued to thrive. It looks to me like the elite lineages were shared between the indigenous and steppic people, and I suspect that the steppics largely became a respected caste of mercenaries/enforcers/guards/'people herders' within ethnically mixed populations, supporting the indigenous elites. The polarised developments of the different branches of R1a-Z94 suggest that the steppics' fortunes were dependent on the fluctuating successes and failures of the general populations they supported; one branch probably conflicting with the other.
 
It is intriguing for me that 1 of the MLBA Anatolian individuals I have data from consistently picks some small amount of steppe ancestry in the models I have tried for several BA samples of that and other regions. The other samples totally lack that component, only 1 of them does. The fits improve a bit when some steppic source is included. What's intriguing, considering the pretty early divergence of the Anatolian IE family, is that that Anatolian sample consistently prefers an early steppe input (Progress_Eneolithic) over any other when I use all kinds of steppe aDNA samples possible (Yamnaya, Eneolithic Steppe, BB, CWC, Andronovo), in addition to a larger number of non-steppic DNA samples of the Neolithic, Chalcolithic and Early BA.

[1] "distance%=2.1477 / distance=0.021477"
Anatolia_MLBA:MA2203
Anatolia_EBA_Ovaoren 46.80
Greece_N 31.05
Sarazm_Eneolithic 7.65
Levant_N 5.75
Armenia_ChL 5.20
RUS_Progress_Eneolithic 3.55

My objections:

The location of this sample is pretty good, but it is from 1750–1500 BC. Also, given the widespread distribution of Anatolian languages, I would expect ~15-20% steppe admixture (similar to Mycenaeans) in all the samples.
 
My guess is that the lighter Steppe DNA in the Anatolian sample is mostly from a different (earlier) source.

Mycenaean BA DNA does, however, look to be from a similar source to the two Armenian clusters (MBA and LBA), indicating that Steppe-infused Caucasus DNA branched apart circa 2,000 BC in at least three directions (Middle East, Armenia and Balkans/Greece) and perhaps four (if we include Northern India).
 
Just have a look at the ADMIXTURE runs in the Anatolian Hunter-Gatherer paper (Supplement figure 1). It has both Kumtepe samples, Kumtepe 6 and Kumtepe 4. One of them looks pretty packed with Steppe ancestry. There is no doubt that it is Kumtepe 4, a very low resolution sample, because the paper in which they were published makes it abundantly clear that Kumtepe 6 is a run-of-the-mill Anatolian farmer.

https://www.cell.com/current-biology/pdfExtended/S0960-9822(15)01516-X

You can see PCAs in which Kum4 clusters with Greeks and Iberians, while Kum6 is more inclined to Sardinians. Now Kum4 is dated 5,500 - 4,800 BP. Have a look at what the Kumtepe site actually means:

https://en.wikipedia.org/wiki/Kumtepe

Around 3700 BC new settlers came to Kumtepe. The people of this new culture, Kumtepe B, built relatively large houses with multiple rooms, sometimes a porch. They also practiced animal husbandry and agriculture. The main domestic animals were goats and sheep, bred not only for meat but for milk and wool as well. They knew lead and bronze along with copper. Shortly after 3000 BC Yassıtepe and Hisarlık (Troy) were colonized probably from Kumtepe.

Kum4 is at the right spot and time. It is unfortunate that the sample is of such low resolution that formal stats can't be done with it; otherwise the debate would be over.
 
It does not look very clear to me for Kum4, only for Kum6. The analysis looks pretty limited, but perhaps I am misreading it.

I have seen formal stats run for even lower resolution samples than Kum4. Perhaps they did run the stats but didn't publish, because they didn't like the answer that came out? Indeed, perhaps the Steppe ancestry even looked uncomfortably heavy? It's a curious thing that so many of the most interesting samples are low res.
 
@epoch That's very interesting. I'm not sure what to think about it, though, because I don't know enough about the Balkan-West Anatolian interactions, except that I think there was less plague and population collapse in Anatolia.
I wonder if there are any data on skull shapes of the 3700 BC Kumtepe population? If they are the same as the Balkan steppe ones, that would be important evidence.
 
My objections:

The location of this sample is pretty good, but it is from 1750–1500 BC. Also, given the widespread distribution of Anatolian languages, I would expect ~15-20% steppe admixture (similar to Mycenaeans) in all the samples.

Even Mycenaeans might in fact have had less than ~15-20% (that's the upper reach of the estimates, not the average IIRC), and the Mycenaeans seem to have been relatively late arrivals from the steppe, speaking a LPIE dialect that shared many common lexical and grammatical innovations with Armenian, Indo-Iranian and other proto-languages that seem to correspond to some of the latest IE dialects to migrate very far away from the steppe. I'd certainly expect the Anatolian speakers to have much less steppe ancestry by the MLBA in Anatolia. Their language seems much more archaic, i.e. having split from the PIE dialect continuum area long before any other, maybe even as early as the Eneolithic in the 5th millennium B.C., and it might have split off and started mixing with non-steppic populations more than a millennium before the steppic ancestors of the Mycenaeans, and that's if the Proto-Anatolian speakers of Early PIE were indeed fully "steppic" autosomally, and not something else. Add to that that those Proto-Anatolian speakers then went on to live and develop in some of the most populated areas of the world back then, either first Southeast Europe or Transcaucasia, and later Anatolia, so their steppic ancestry would've been very diluted long before the Proto-Greeks even existed. So, I honestly don't expect Anatolian Indo-Europeans of the MLBA, even if their language really had origins in a steppic population, to have more than 10% of Steppe_Eneolithic ancestry.
 
As a hypothesis this makes sense to me. Steppe IE speakers come to the Balkans and mix with locals, their steppe admixture gets diluted, and then, with minority steppe admixture, they move to Anatolia and mix with locals there. In this case they might have even less than 3-5% steppe ancestry. This would require heavy immigration from the Balkans to Anatolia though. Since Anatolia was quite populated and developed, I don't believe an elite dominance model would work.
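
The two-step dilution described here can be sketched as simple multiplication: if migrants contribute a given share of ancestry at each mixing event, the surviving steppe fraction is the product of those shares. The 20% figures below are hypothetical values chosen only to illustrate the arithmetic; they are not estimates from any sample.

```python
def diluted_fraction(initial_fraction, migrant_shares):
    """Steppe fraction left after successive mixing events.

    migrant_shares: for each event, the share of the resulting population's
    ancestry contributed by the incoming (steppe-carrying) group.
    """
    fraction = initial_fraction
    for share in migrant_shares:
        if not 0.0 <= share <= 1.0:
            raise ValueError("each share must be between 0 and 1")
        fraction *= share
    return fraction


# Hypothetical scenario: fully steppic migrants contribute 20% of ancestry
# in the Balkans, and their mixed descendants contribute 20% in Anatolia.
final = diluted_fraction(1.0, [0.2, 0.2])
print(round(final * 100, 1))
```

With those assumed shares the result is about 4%, which falls in the 3-5% range mentioned, though different shares would of course give different answers.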
 
Even Mycenaeans might in fact have had less than ~15-20% (that's the upper reach of the estimates, not the average IIRC), and the Mycenaeans seem to have been relatively late arrivals from the steppe, speaking a LPIE dialect that shared many common lexical and grammatical innovations with Armenian, Indo-Iranian and other proto-languages that seem to correspond to some of the latest IE dialects to migrate very far away from the steppe.
It depends what is meant by Steppe ancestry. Mycenaeans most likely had a bit of steppic DNA derived from a variety of sources (direct and indirect) at various times - the older sources (via Central Anatolia and the Suvorovo, almost completely diluted away), then some Cernavoda, then some Yamnaya, then some South Caucasus. In total, as you suggest, a steppic component of probably less than 15%. I wouldn't categorise Mycenaeans as arrivals from the Steppe - for the most part they were already there in Greece; but they appear to have been catalysed by incoming semi-steppic people with genetic links to the same South Caucasians that spread across the Middle East from around 2,200 BC.

I'd certainly expect the Anatolian speakers to have much less steppe ancestry by the MLBA in Anatolia. Their language seems much more archaic, i.e. having split from the PIE dialect continuum area long before any other, maybe even as early as the Eneolithic in the 5th millennium B.C., and it might have split off and started mixing with non-steppic populations more than a millennium before the steppic ancestors of the Mycenaeans, and that's if the Proto-Anatolian speakers of Early PIE were indeed fully "steppic" autosomally, and not something else.
Again, Anatolians most likely had some steppic DNA derived from a variety of sources over time - some archaic (heavily diluted), some Eneolithic, and some Bronze Age. Given both their genetics and language, as you indicate, I suspect their main source of steppic ancestry (heavily diluted) was pre-Bronze Age and largely unrelated to the Mycenaeans, and I doubt the Proto-Anatolian speakers of early PIE were fully steppic autosomally.

Add to that that those Proto-Anatolian speakers then went on to live and develop in some of the most populated areas of the world back then, either first Southeast Europe or Transcaucasia, and later Anatolia, so their steppic ancestry would've been very diluted since a long time before the Proto-Greeks even existed. So, I honestly don't expect Anatolian Indo-Europeans of the MLBA, even if their language really had origins in a steppic population, to have more than 10% of Steppe_Eneolithic ancestry.
Yes, although you can see that degrees of dilution varied hugely so it cannot always be assumed. Some examples:
1. Steppic people going South of the Caucasus/Caspian during the Bronze Age - in the Middle East, their DNA appears to have been more heavily diluted (due to integration?); in Northern India, less so (due to separation by caste?).
2. Steppic people going into Europe during the Chalcolithic - in SE Europe/Balkans, heavily diluted; in SW Europe, less diluted; in N Europe, less diluted still. To the extent that the steppic people separated, thrived and then admixed with each other, dilution did not occur to anywhere near the same degree.
 
Returning to what seems to be the most lasting/influential Steppe DNA insertion South of the Caucasus (probably late third millennium BC), it looks like it branched widely in four directions:
1. South towards Arabia and/or the Levant
2. Mainly remaining around the Southern Caucasus (Armenia/N Iran), then later spreading South
3. Back through the Steppe towards Balkans/Greece
4. East into Pakistan and Northern India

Different paternal lineages appear to have dominated each branch, but each seems to have had a similar autosomal mix of Bronze Age Steppe and South Caucasus components.
 
