Autosomal Results from Di Cristofaro et al



Angela
02-11-13, 18:24
I took a look at the Di Cristofaro et al paper (co-authored by Roy King, Underhill, Natalie Myres and the Estonian Biocentre researchers), Afghan Hindu Kush: Where Eurasian Sub-Continent Gene Flows Converge.
http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0076748#pone-0076748-g002

It contains results for Middle Eastern populations, "Jewish" populations, and some European ones.

I hesitate to say any of this is necessarily authoritative, as they strangely sampled in very few places in Europe; however, some interesting things did appear.

The autosomal results can be found in Figure 2 and Figure S2.

As to the light green component, the authors state that it weakly peaks in two places, the Caucasus and the Indus basin, and covers all of western Europe to the western parts of Russia, and even to the extreme west of China and half of India. From the map, it looks to be most dense south of the Caucasus, in the southeast corner of the Black Sea (extending into Armenia?) and on both sides of the Caspian Sea. The graphics are terrible, but as far as Europe is concerned, to me it seems stronger sort of east of the Rhine, and particularly in central Eastern Europe and part of the Balkans. (This affinity between central eastern Europe and the northern Near East also showed up in a recent study of mitochondrial dna.)

The authors further state that the light blue component has its highest frequency in the Levant (Syria and Lebanon) and "is present westward in Europe until the Atlantic Ocean and decreases eastward toward Afghanistan." To me, this component seems from the map to be centered in the heartland of agriculture (eastern Turkey, Syria, the Levant generally) and seems to be strongest in Italy and west of the Rhine.

The blue component is said to be most frequent in northwestern Europe, and decreases with latitude as you go south.

These are not the European components as we have seen them in Dodecad.

In fact, they look to me very much like the Geno 2.0 results, although the "Northwestern" component is more dominant here.
In Geno 2.0 the components are called Northern European, Southern European and Southwest Asian. (And, of course, Underhill is a co-author here.)

The "West Asian" component here, which was called Southwest Asian in Geno 2.0, looks to have affinities to what Dienekes called the Gedrosia component, although it is more frequent than appeared from his analysis.

Is this tracking the elusive "Indo-European" component? After all, one of the stated purposes of the Geno 2.0 project was to find and track them?

To show how far we are from Dodecad components, the Sardinians, unique as ever, have no West Asian, which comports with previous analyses, but here they are 60% Northwestern and only 40% Neolithic farmer? Mediterranean?, Aegean? eastern Mediterranean?

I also took a look at the French and Italian results (labelled North Italian, but samples were taken in Tuscany as well)

The French appear to be 60-70% Northwestern, 10-20% Mediterranean(?), with a median of about 18%, and slightly more than 20% "West Asian".

The Italians seem to be about 55-60% Northwestern, 25% Mediterranean(?) and 20% West Asian.

The West Asian seems to average out to about 20% in all the Europeans.

Any speculations are more than welcome! :)

ElHorsto
02-11-13, 19:25
I took a look at the Di Cristofaro et al paper (co-authored by Roy King, Underhill, Natalie Myres and the Estonian Biocentre researchers), Afghan Hindu Kush: Where Eurasian Sub-Continent Gene Flows Converge.
http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0076748#pone-0076748-g002

It contains results for Middle Eastern populations, "Jewish" populations, and some European ones.

I hesitate to say any of this is necessarily authoritative, as they strangely sampled in very few places in Europe; however, some interesting things did appear.

The autosomal results can be found in Figure 2 and Figure S2.

As to the light green component, the authors state that it weakly peaks in two places, the Caucasus and the Indus basin, and covers all of western Europe to the western parts of Russia, and even to the extreme west of China and half of India. From the map, it looks to be most dense south of the Caucasus, in the southeast corner of the Black Sea (extending into Armenia?) and on both sides of the Caspian Sea. The graphics are terrible, but as far as Europe is concerned, to me it seems stronger sort of east of the Rhine, and particularly in central Eastern Europe and part of the Balkans. (This affinity between central eastern Europe and the northern Near East also showed up in a recent study of mitochondrial dna.)

The authors further state that the light blue component has its highest frequency in the Levant (Syria and Lebanon) and "is present westward in Europe until the Atlantic Ocean and decreases eastward toward Afghanistan." To me, this component seems from the map to be centered in the heartland of agriculture (eastern Turkey, Syria, the Levant generally) and seems to be strongest in Italy and west of the Rhine.

The blue component is said to be most frequent in northwestern Europe, and decreases with latitude as you go south.

These are not the European components as we have seen them in Dodecad.


Which one exactly? The various Dodecad runs already differ considerably from each other. I think it was K7 which returned a "North-West European" component, but in K12b it completely disappeared and got replaced by a mixture of other components (Atlantic_Med + Northern + Gedrosia - Caucasus). I personally try to compare as many different runs as possible at the same time, hoping the differences will reveal hidden relationships. But if I had to decide on a single run, then K12b makes most sense imho.



To show how far we are from Dodecad components, the Sardinians, unique as ever, have no West Asian, which comports with previous analyses, but here they are 60% Northwestern and only 40% Neolithic farmer? Mediterranean?, Aegean? eastern Mediterranean?

I also took a look at the French and Italian results (labelled North Italian, but samples were taken in Tuscany as well)

The French appear to be 60-70% Northwestern, 10-20% Mediterranean(?), with a median of about 18%, and slightly more than 20% "West Asian".

The Italians seem to be about 55-60% Northwestern, 25% Mediterranean(?) and 20% West Asian.


To my eyes this just confirms the same impression from the Dodecad runs that western Europe is very blurred between "Mediterranean" and "North European". This is especially visible when comparing K10 and K12b, where K10 tends to categorize western Europe more to the north (therefore called "Atlantic_Baltic" component), whereas K12b decided to put Western Europe (in particular Iberians, Sardinians, Basques) more into the "Atlantic_med" bin, leaving Britain 40% "Atlantic_med". After all, Atlantic_med is an old neighbour of Northern (geographically, but also in FST distance), so it might be impossible in principle to clearly separate both, especially if even Northern is partially defined by "Atlantic_med" (0% does not imply absolute lack!), maybe by old Paleolithic contacts.

Angela
02-11-13, 22:13
Which one exactly? The various Dodecad runs already differ considerably from each other. I think it was K7 which returned a "North-West European" component, but in K12b it completely disappeared and got replaced by a mixture of other components (Atlantic_Med + Northern + Gedrosia - Caucasus). I personally try to compare as many different runs as possible at the same time, hoping the differences will reveal hidden relationships. But if I had to decide on a single run, then K12b makes most sense imho.



To my eyes this just confirms the same impression from the Dodecad runs that western Europe is very blurred between "Mediterranean" and "North European". This is especially visible when comparing K10 and K12b, where K10 tends to categorize western Europe more to the north (therefore called "Atlantic_Baltic" component), whereas K12b decided to put Western Europe (in particular Iberians, Sardinians, Basques) more into the "Atlantic_med" bin, leaving Britain 40% "Atlantic_med". After all, Atlantic_med is an old neighbour of Northern (geographically, but also in FST distance), so it might be impossible in principle to clearly separate both, especially if even Northern is partially defined by "Atlantic_med" (0% does not imply absolute lack!), maybe by old Paleolithic contacts.

Very insightful comments...thanks for posting.

As precise as I always try to be, I obviously wasn't precise enough...I suppose I should have said that this is pretty far from the way the Dodecad and other autosomal components are *usually* interpreted on these kinds of Boards?

I do agree with pretty much everything you posted. Whenever I think about these Dodecad components, I like to refer back to the post where Dienekes looked at the K12b components in terms of the World 9 components...
http://dienekes.blogspot.com/2012/08/inter-relationships-of-dodecad-k12b-and.html

and this post where he looked at the K12b components in terms of one another, or the "one out" method, and the components from K7 and K12b in terms of each other...
http://dienekes.blogspot.com/2012/09/inter-relationships-between-dodecad-k7b.html

The f(3) statistics posted in the latter are also very interesting...although I'm not quite sure if I'm supposed to take from them that Atlantic Med and Southern are slightly more "pure" than North Euro and West Asian?

One of the things that strike me when looking at K12b components in terms of the World 9 components is that Atlantic Med comes out as about 90% Caucasus and 10% North Euro, and North Euro comes out as a mixture of Atlantic/Med, Gedrosia, and a slice of Siberian. When looking at the K12b components in terms of each other, Atlantic/Med is about 90% Southwest Asian and 10% North Euro. At first I tended to see that S.W.Asian and Caucasus as exclusively "farmer" input, but perhaps not...

When looking at the results in just this study, which seem to me to pretty clearly separate out the "Neolithic farmer" component, do you see the "Northwestern" component as perhaps composed of two Paleolithic components, one northern and one Mediterranean, or perhaps one Paleolithic and one late Mesolithic, coming from the eastern Mediterranean but pre-farming?

And what about the "West Asian" component here?

ElHorsto
05-11-13, 12:22
Very insightful comments...thanks for posting.

As precise as I always try to be, I obviously wasn't precise enough...I suppose I should have said that this is pretty far from the way the Dodecad and other autosomal components are *usually* interpreted on these kinds of Boards?

I do agree with pretty much everything you posted. Whenever I think about these Dodecad components, I like to refer back to the post where Dienekes looked at the K12b components in terms of the World 9 components...
http://dienekes.blogspot.com/2012/08/inter-relationships-of-dodecad-k12b-and.html

and this post where he looked at the K12b components in terms of one another, or the "one out" method, and the components from K7 and K12b in terms of each other...
http://dienekes.blogspot.com/2012/09/inter-relationships-between-dodecad-k7b.html

The f(3) statistics posted in the latter are also very interesting...although I'm not quite sure if I'm supposed to take from them that Atlantic Med and Southern are slightly more "pure" than North Euro and West Asian?


Well, maybe if Caucasus_K12b is so important in West_Asian_K7b and if Caucasus_K12b is so "young" according to these f3 statistics, then I think it is not surprising to find West_Asian_K7b more admixed too.
Regarding the rank of North-Euro I could not find such an explanation. Maybe it is too hard nowadays to find a pure enough population resembling the ancient hunter-gatherers. Yet the Saami always come closest. The La Brana sample showed considerable Atlantic_med_K12b admixture (40%), which was absent in the Ajv samples from Gotland. K10 and lower are not able to show this difference because it was all lumped into Atlantic_Baltic. After all, I don't know how accurate the f3 rank for the North_euro component really is.



When looking at the results in just this study, which seem to me to pretty clearly separate out the "Neolithic farmer" component, do you see the "Northwestern" component as perhaps composed of two Paleolithic components, one northern and one Mediterranean, or perhaps one Paleolithic and one late Mesolithic, coming from the eastern Mediterranean but pre-farming?


What they claim to be "North_west european" AC4 looks to me more like North-West Eurasian. Too bad they don't show Britain and Ireland in the map. If Britain were most modal they would have shown or mentioned it, I think. I believe AC4 is basically the same component as Atlantic_Baltic_K10b, which is not surprising since they selected K=9.



And what about the "West Asian" component here?

Do you mean AC6? It most resembles Caucasus_K12b + Gedrosia_K12b, but it does not fit Greece, Albania, South Italy at all. I don't know, maybe it is the West asian part which came during Bronze-Age and later.

Angela
05-11-13, 19:36
Well, maybe if Caucasus_K12b is so important in West_Asian_K7b and if Caucasus_K12b is so "young" according to these f3 statistics, then I think it is not surprising to find West_Asian_K7b more admixed too.
Regarding the rank of North-Euro I could not find such an explanation. Maybe it is too hard nowadays to find a pure enough population resembling the ancient hunter-gatherers. Yet the Saami always come closest. The La Brana sample showed considerable Atlantic_med_K12b admixture (40%), which was absent in the Ajv samples from Gotland. K10 and lower are not able to show this difference because it was all lumped into Atlantic_Baltic. After all, I don't know how accurate the f3 rank for the North_euro component really is.



What they claim to be "North_west european" AC4 looks to me more like North-West Eurasian. Too bad they don't show Britain and Ireland in the map. If Britain were most modal they would have shown or mentioned it, I think. I believe AC4 is basically the same component as Atlantic_Baltic_K10b, which is not surprising since they selected K=9.



Do you mean AC6? It most resembles Caucasus_K12b + Gedrosia_K12b, but it does not fit Greece, Albania, South Italy at all. I don't know, maybe it is the West asian part which came during Bronze-Age and later.


Yes, that is what I have been thinking might possibly explain AC6. There is indeed a statement by National Geographic about their goals for Geno 2.0, and one of them is to trace the migration of the "Indo-Europeans". In the results from Geno 2.0, some of the Balkan countries do get higher values for what Geno 2.0 calls "Southwest Asian"...20 for Bulgaria, 18 for the Russians, 17 for Finns, Britain, Germany, Tuscany, etc. The only European countries of the ones listed that are very low for it are the Iberian countries, which only get 13. This might fit with the fact that the signal peters out as it goes west. It does look to me like a component that, once it got to Europe (in the Balkans), went overland in a more central European migration path, perhaps hitting Italy slightly later. I think this would fit in with the mtDNA results that just came out in the Brandt paper, which show a close FST between Eastern Europe and the northern Near East for mtDNA. How the H. Pamjav et al paper on the yDNA in Europe, which seems to show something similar for central East Europe, fits in with all of this, I'm not sure. This is all just speculation, mind you! :)
http://en.wikipedia.org/wiki/Genographic_Project
http://www.dienekes.blogspot.com/2013/10/visualizing-y-haplogroup-distributions.html

Your analysis makes sense to me for the ranking of the K7b "West Asian" component. Assuming just for the moment that the ranking for the North Europe component is correct, the reasons are more obscure. The only thing that I can think of is the fact that when Dienekes analyzed the K12b North Euro component, using the "one out" method, North Euro K12b appears to be over 60% Atlanto-Med, over 30% Gedrosia, and a sliver of Siberian. And then, of course, Atlanto Med in another deconstruction, turned out to be mostly "Caucasus".
http://dienekes.blogspot.com/2012/08/inter-relationships-of-dodecad-k12b-and.html

As to AC4, their "North-Western European" component, which is, as you say, probably closer to North West Eurasian, it may be correlated with Atlantic-Baltic from K7b as you suggest. However, that also raises other issues, since the f(3) statistics for Atlantic-Baltic appear to be "younger", and thus it looks more admixed. It shows up as a combination of Southern, Caucasus_Gedrosia, and a sliver of Amerindian. If that is correct, this Caucasus_Gedrosia component may have actually come into Europe far earlier than the Bronze Age, although certainly the Bronze Age could have added to the signal.

ElHorsto
11-11-13, 22:33
Yes, that is what I have been thinking might possibly explain AC6. There is indeed a statement by National Geographic about their goals for Geno 2.0, and one of them is to trace the migration of the "Indo-Europeans". In the results from Geno 2.0, some of the Balkan countries do get higher values for what Geno 2.0 calls "Southwest Asian"...20 for Bulgaria, 18 for the Russians, 17 for Finns, Britain, Germany, Tuscany, etc. The only European countries of the ones listed that are very low for it are the Iberian countries, which only get 13. This might fit with the fact that the signal peters out as it goes west. It does look to me like a component that, once it got to Europe (in the Balkans), went overland in a more central European migration path, perhaps hitting Italy slightly later. I think this would fit in with the mtDNA results that just came out in the Brandt paper, which show a close FST between Eastern Europe and the northern Near East for mtDNA. How the H. Pamjav et al paper on the yDNA in Europe, which seems to show something similar for central East Europe, fits in with all of this, I'm not sure. This is all just speculation, mind you! :)
http://en.wikipedia.org/wiki/Genographic_Project
http://www.dienekes.blogspot.com/2013/10/visualizing-y-haplogroup-distributions.html

Your analysis makes sense to me for the ranking of the K7b "West Asian" component. Assuming just for the moment that the ranking for the North Europe component is correct, the reasons are more obscure. The only thing that I can think of is the fact that when Dienekes analyzed the K12b North Euro component, using the "one out" method, North Euro K12b appears to be over 60% Atlanto-Med, over 30% Gedrosia, and a sliver of Siberian. And then, of course, Atlanto Med in another deconstruction, turned out to be mostly "Caucasus".
http://dienekes.blogspot.com/2012/08/inter-relationships-of-dodecad-k12b-and.html

As to AC4, their "North-Western European" component, which is, as you say, probably closer to North West Eurasian, it may be correlated with Atlantic-Baltic from K7b as you suggest. However, that also raises other issues, since the f(3) statistics for Atlantic-Baltic appear to be "younger", and thus it looks more admixed. It shows up as a combination of Southern, Caucasus_Gedrosia, and a sliver of Amerindian. If that is correct, this Caucasus_Gedrosia component may have actually come into Europe far earlier than the Bronze Age, although certainly the Bronze Age could have added to the signal.

Yes, Atlantic_Baltic is likely very admixed.

The Eurogenes K15 (http://bga101.blogspot.de/2013/10/eurogenes-k15-now-at-gedmatch.html) is also interesting in that the big Atlantic_med(K12b) and Atlantic_baltic(K7b/K10a) components got split (not surprising when K is increased). There is now North_sea_K15, Atlantic_K15, West_med_K15 and East_med_K15 (and others, but not relevant for now). As K increases, most components increasingly seem to approach local geography and more recent history, imho, although there are disagreeing opinions somewhere which I do not understand.
A bit more interesting in K15 are the Fst distances, where the Atlantic_K15 component is closest to North_sea_K15 (0.015; no surprise yet), but then it is next closest to East_Euro_K15 and East_med_K15 (both 0.022; surprise). West_med is further away (0.028; very surprising). Even West_asian(K15) is closer to Atlantic med (0.026). This does not quite fit geography.
So either this is a further hint towards an exceptionally ancient history of Atlantic_Med_K12b/West_med_K15, or it is just a reflection of the isolated insular history of the Sardinians, because this component is modal in Sardinia (Atlantic_med_K12b was too). If the latter is the case, then the Atlantic_K15 component, which is modal in Basques, is probably also just a result of the isolated history of the Basques.
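
For anyone who wants to see the ordering at a glance, here is a minimal sketch that just sorts the Fst figures quoted above by distance from Atlantic_K15. The numbers are only the ones given in this post; the dictionary and function names are made up for illustration, and it assumes the 0.026 figure for West_Asian_K15 is measured against Atlantic_K15.

```python
# Fst distances from Atlantic_K15, as quoted in the post above (Eurogenes K15).
# Assumption: the 0.026 value for West_Asian_K15 is relative to Atlantic_K15.
fst_from_atlantic = {
    "North_Sea_K15": 0.015,
    "East_Euro_K15": 0.022,
    "East_Med_K15": 0.022,
    "West_Asian_K15": 0.026,
    "West_Med_K15": 0.028,
}


def rank_by_fst(distances):
    """Return component names sorted from smallest to largest Fst."""
    return sorted(distances, key=distances.get)


ranking = rank_by_fst(fst_from_atlantic)
for name in ranking:
    print(f"{name}: {fst_from_atlantic[name]:.3f}")
```

Sorted this way, West_Med_K15 lands last, behind even West_Asian_K15, which is the part that does not match geography.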

Angela
12-11-13, 01:00
Yes, Atlantic_Baltic is likely very admixed.

The Eurogenes K15 (http://bga101.blogspot.de/2013/10/eurogenes-k15-now-at-gedmatch.html) is also interesting in that the big Atlantic_med(K12b) and Atlantic_baltic(K7b/K10a) components got split (not surprising when K is increased). There is now North_sea_K15, Atlantic_K15, West_med_K15 and East_med_K15 (and others, but not relevant for now). As K increases, most components increasingly seem to approach local geography and more recent history, imho, although there are disagreeing opinions somewhere which I do not understand.
A bit more interesting in K15 are the Fst distances, where the Atlantic_K15 component is closest to North_sea_K15 (0.015; no surprise yet), but then it is next closest to East_Euro_K15 and East_med_K15 (both 0.022; surprise). West_med is further away (0.028; very surprising). Even West_asian(K15) is closer to Atlantic med (0.026). This does not quite fit geography.
So either this is a further hint towards an exceptionally ancient history of Atlantic_Med_K12b/West_med_K15, or it is just a reflection of the isolated insular history of the Sardinians, because this component is modal in Sardinia (Atlantic_med_K12b was too). If the latter is the case, then the Atlantic_K15 component, which is modal in Basques, is probably also just a result of the isolated history of the Basques.


I don't find that admixture analysis very informative for population genetics purposes. The fact that the fst figures don't make any sense just highlights that fact in my opinion. And, indeed, I think that "Atlantic" component is no more useful than a "Kalash" component would be...I'm not quite sure that the "Atlantic-Med" component is much better, although just knowing the population history of Sardinia, and the fact that this is also modal in Oetzi at least hints that it is pretty old...certainly older than the "Atlantic" component.

I don't think the problem is necessarily that there are a lot of "K" components; it depends on the algorithm, although after a certain point I think it does get less informative.

Have you taken a look at the fst numbers from the Dodecad Globe 13 run? They seem to make sense...also interesting are the fst statistics for the K=11 iteration of that run, particularly in light of the recent ancient find at Mal'ta.
https://docs.google.com/spreadsheet/ccc?key=0ArJDEoCgzRKedGR2ZWRoQ0VaWTc0dlV1cHh4ZUNJRUE&pli=1#gid=26