Analysis of over 2,000 HVR1 profiles From SouthWest Asia

Fire Haired14

I got the HVR1 data from Table S1 of the 2013 study "Y-Chromosome and mtDNA Genetics Reveal Significant Contrasts in Affinities of Modern Middle Eastern Populations with European and African Populations".

I plan on doing this same type of analysis with as many modern pops as possible.

Here's the link to my analysis.


https://docs.google.com/document/d/1BxqemNoVOk1XVP73awfpAL9oG6c8F5w6m9s-hACH3Ws/edit?usp=sharing

I broke down the haplotypes as far as they could possibly be broken down to understand the maternal gene pool of SouthWest Asia. Most of what I learned is self-explanatory if you look at the haplogroup frequencies and the haplotypes, so I don't see a need for a lengthy description. Later I will do more work on this.

In terms of variation in SouthWest Asia: Lebanon, Palestine, Syria, and Jordan are practically indistinguishable from each other. Yemen and Saudi Arabia are united in their differences from those more northwestern countries. So the split in SouthWest Asia seems to be northwestern vs. southeastern.

Yemen and Saudi Arabia have more of the African haplogroup L(xM, N), especially Yemen with 38.3%. Yemen and Saudi Arabia have much less H and HV(xH), and much more R0a (with more diversity in R0), J, and N1a1a. There's also a higher frequency and variety of M in Saudi Arabia and Yemen.
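
The percentages here are plain proportions: samples assigned to a haplogroup divided by all samples from that country. A minimal sketch of that count in Python, assuming each sample has been reduced to a (population, haplogroup) pair. The function name and the toy data are made up for illustration and are not the values from Table S1.

```python
from collections import Counter, defaultdict

def haplogroup_frequencies(samples):
    """Return {population: {haplogroup: percent}} from (population, haplogroup) pairs."""
    counts = defaultdict(Counter)
    for population, haplogroup in samples:
        counts[population][haplogroup] += 1
    return {
        pop: {hg: 100.0 * n / sum(c.values()) for hg, n in c.items()}
        for pop, c in counts.items()
    }

# Hypothetical toy data, NOT the real study values.
toy_samples = [
    ("Yemen", "L"), ("Yemen", "L"), ("Yemen", "R0a"), ("Yemen", "J"),
    ("Lebanon", "H"), ("Lebanon", "H"), ("Lebanon", "J"), ("Lebanon", "L"),
]
freqs = haplogroup_frequencies(toy_samples)
print(freqs["Yemen"]["L"])    # 50.0
print(freqs["Lebanon"]["H"])  # 50.0
```

With small sample sizes these percentages swing a lot, so it is worth keeping the raw counts next to them.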

Because Neolithic Central Europeans were mostly of ancient West Asian descent (50-70% ENF) but are separated by over 8,000 years from modern SouthWest Asians, I compared the two (not in documents or spreadsheets yet).

It is quite obvious the two are not closely related maternally. Neolithic Central Europeans trace most of their maternal lineages to Stone Age West Asian-East Mediterranean women, whose lineages didn't do well in SouthWest Asia.

Here's my comparison of the two. I'll do a more formal comparison with a spreadsheet sometime in the future.

https://docs.google.com/document/d/1L8YOd8ntIAAjUIAfd8aCMBar7mTKi2qIixinnjUA94k/edit

Overall, IMO SouthWest Asians are very native to SouthWest Asia. There aren't signs in autosomal DNA of any significant gene flow from other regions into SouthWest Asia besides ANE. It's probably safe to assume most of their maternal lineages have been evolving in the broad region of the Middle East for tens of thousands of years.

"ENF" ancestry might be very old and very pre-Neolithic. ENF-rich people may have already had a lot of regional (ethnic, etc.) mtDNA diversity 9,000 years ago when Neolithic Central Europeans' ENF ancestors left. This can explain why modern SouthWest Asian mtDNA is so different from EEF, and why they have so much diversity in "West Eurasian" haplogroups.

A 2001 study found R0a1a and R0a2c in Upper Palaeolithic Morocco dating to 10,000 BC. It could very well be contamination, but the result is what I would expect.

The package of "Near Eastern" lineages that arrived in Europe with farming had been evolving in the Middle East for thousands of years before farming existed. They did not expand with farming like they did in Europe. So I expect Middle Eastern hunter-gatherers to display T, J, R0, N1, etc.
 
Yemen and Saudi Arabia have more of the African haplogroup L(xM, N), especially Yemen with 38.3%

Must be founder effect. Yemenites do have a significant percentage of East African admixture (~10-15%) but not ~40%.

Because Neolithic Central Europeans were mostly of ancient West Asian descent (50-70% ENF)

If you mean the Neolithic European farmer component, more like ~75-80%. The study spoke about 50% minimum to 98% maximum.
If you mean Neolithic European farmers as people, then 60-70% fits.
 
Fire Haired, have you read the following paper and taken note of the mtDNA sequences provided in it?
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0054616

"Y-Chromosome and mtDNA Genetics Reveal Significant Contrasts in Affinities of Modern Middle Eastern Populations with European and African Populations."

Some of your conclusions were already noted by them, but they also have a slightly different take on it in terms of the relationship to Europe, although perhaps I am misunderstanding you.

"The Middle East was a funnel of human expansion out of Africa, a staging area for the Neolithic Agricultural Revolution, and the home to some of the earliest world empires. Post LGM expansions into the region and subsequent population movements created a striking genetic mosaic with distinct sex-based genetic differentiation. While prior studies have examined the mtDNA and Y-chromosome contrast in focal populations in the Middle East, none have undertaken a broad-spectrum survey including North and sub-Saharan Africa, Europe, and Middle Eastern populations. In this study 5,174 mtDNA and 4,658 Y-chromosome samples were investigated using PCA, MDS, mean-linkage clustering, AMOVA, and Fisher exact tests of FST's, RST's, and haplogroup frequencies. Geographic differentiation in affinities of Middle Eastern populations with Africa and Europe showed distinct contrasts between mtDNA and Y-chromosome data. Specifically, Lebanon's mtDNA shows a very strong association to Europe, while Yemen shows very strong affinity with Egypt and North and East Africa. Previous Y-chromosome results showed a Levantine coastal-inland contrast marked by J1 and J2, and a very strong North African component was evident throughout the Middle East. Neither of these patterns were observed in the mtDNA. While J2 has penetrated into Europe, the pattern of Y-chromosome diversity in Lebanon does not show the widespread affinities with Europe indicated by the mtDNA data. Lastly, while each population shows evidence of connections with expansions that now define the Middle East, Africa, and Europe, many of the populations in the Middle East show distinctive mtDNA and Y-haplogroup characteristics that indicate long standing settlement with relatively little impact from and movement into other populations."

There is also the following paper and the sequences discussed in it.
http://www.nature.com/ncomms/journal/v4/n5/full/ncomms2871.html
"A European population in Minoan Bronze Age"

"The first advanced Bronze Age civilization of Europe was established by the Minoans about 5,000 years before present. Since Sir Arthur Evans exposed the Minoan civic centre of Knossos, archaeologists have speculated on the origin of the founders of the civilization. Evans proposed a North African origin; Cycladic, Balkan, Anatolian and Middle Eastern origins have also been proposed. Here we address the question of the origin of the Minoans by analysing mitochondrial DNA from Minoan osseous remains from a cave ossuary in the Lassithi plateau of Crete dated 4,400–3,700 years before present. Shared haplotypes, principal component and pairwise distance analyses refute the Evans North African hypothesis. Minoans show the strongest relationships with Neolithic and modern European populations and with the modern inhabitants of the Lassithi plateau. Our data are compatible with the hypothesis of an autochthonous development of the Minoan civilization by the descendants of the Neolithic settlers of the island."

I know I have saved others. I'll try to find them for you.
 
Geographic differentiation in affinities of Middle Eastern populations with Africa and Europe showed distinct contrasts between mtDNA and Y-chromosome data. Specifically, Lebanon's mtDNA shows a very strong association to Europe, while Yemen shows very strong affinity with Egypt and North and East Africa.

Lebanon is not similar to Europe mtDNA-wise. The authors of this study, in my opinion, used unreliable methods to find mtDNA relationships. Studying mtDNA variation is not rocket science; it's very simple. You don't need PCAs, stats, etc. There is no mathematical method that can always tell how closely related people are maternally. What you need to do, in my opinion, is find the most recent subclade for each sample and look at every haplotype individually.

Thanks. I'll look at that study. It'll be neat to see Bronze Age East Mediterranean mtDNA. There's Greek and Cretan mtDNA in the study I got the SouthWest Asian data from, and it'll be fun to compare them to Minoan mtDNA.
 
Must be founder effect. Yemenites do have a significant percentage of East African admixture (~10-15%) but not ~40%.

It's not a founder effect, because they have various different L lineages. 40% African maternal lineages can equal 20% African blood if none of their fathers were African. But I guess it's not that simple. 80% of Irish and over 50% of Chadic people have R1b, but they don't share over 50% blood from an ancient R1b people.
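
To spell out the 40% to 20% arithmetic: autosomal DNA comes half from the mother and half from the father, so in a single admixture generation the expected autosomal share is the average of the two sides. A rough sketch, assuming a one-generation model in which the mtDNA frequency stands in for the mothers' ancestry. That is a big simplification, which is the point of the R1b example, and the function name is just for illustration.

```python
def expected_autosomal_share(maternal_share, paternal_share):
    """Expected autosomal ancestry share of children, given each parental side's share.

    Assumes one generation of mixing and equal autosomal contribution from each parent.
    """
    return 0.5 * maternal_share + 0.5 * paternal_share

# 40% African maternal side, 0% African paternal side.
print(expected_autosomal_share(0.40, 0.0))  # 0.2
```

After many generations of drift and further mixing, uniparental frequencies and autosomal percentages can drift far apart, so this only gives a first-generation expectation.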

If you mean the Neolithic European farmer component, more like ~75-80%. The study spoke about 50% minimum to 98% maximum.
If you mean Neolithic European farmers as people, then 60-70% fits.

That test was based on Bedouin, and we need Neolithic Near Eastern genomes. EEF may have been 100% from an ancient West Asian pop that had more WHG than West Asians today.
 
Lebanon is not similar to Europe mtDNA-wise. The authors of this study, in my opinion, used unreliable methods to find mtDNA relationships. Studying mtDNA variation is not rocket science; it's very simple. You don't need PCAs, stats, etc. There is no mathematical method that can always tell how closely related people are maternally. What you need to do, in my opinion, is find the most recent subclade for each sample and look at every haplotype individually.

I don't think any mathematical model is perfect, but I don't see how the results can be reliable if you don't do one, and a very sophisticated one at that. You have to see the whole pattern.

Perhaps you can explain in more detail what you're doing. Are you looking at terminal SNPs and then going backward to find the geographic trail to the more basal versions? How could that be accurate, however, given that most published mtDNA sequences are not fully resolved? I don't see how anyone can have any idea on many of these published sequences whether they actually would be positive for more downstream markers or not, given that in most cases they didn't test for them. Wouldn't you need FGS scans of all of them to make this kind of determination?

It would be like comparing published R1b sequences where some samples were tested for R1b downstream clades and some were only tested for R1b1.
 
I guess a mathematical model can work, but I don't like to see people put too much faith in them.

Almost all mtDNA studies test somewhere around 16000-16500 in HVR1, so I can compare most samples at the same coverage. If HVR2 and CR are tested, I'll only compare them to other samples that were tested for HVR2 and CR.
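
The same-coverage rule can be sketched in Python. This is only an illustration under assumptions: each sample is a dict with a tested "range" and a set of mutation strings that start with the position (like "16126C"), and two samples count as matching if their mutation sets are identical inside the range both were tested for. The function names, the ranges, and the example mutations are hypothetical, not taken from the study data.

```python
import re

def mutations_in_range(mutations, start, end):
    """Keep only mutations whose leading position number falls in [start, end]."""
    kept = set()
    for mutation in mutations:
        match = re.match(r"\d+", mutation)
        if match and start <= int(match.group()) <= end:
            kept.add(mutation)
    return kept

def compare_at_shared_coverage(sample_a, sample_b):
    """Return True/False for a match over the shared range, or None if the ranges don't overlap."""
    start = max(sample_a["range"][0], sample_b["range"][0])
    end = min(sample_a["range"][1], sample_b["range"][1])
    if start > end:
        return None  # nothing was tested in common
    return (mutations_in_range(sample_a["mutations"], start, end)
            == mutations_in_range(sample_b["mutations"], start, end))

# Hypothetical samples: one tested for a narrower range than the other.
hvr1_only = {"range": (16024, 16383), "mutations": {"16126C", "16294T"}}
wider_scan = {"range": (16024, 16569), "mutations": {"16126C", "16294T", "16519C"}}
print(compare_at_shared_coverage(hvr1_only, wider_scan))  # True
```

The extra mutation in the wider scan is ignored because the first sample was never tested at that position, so it can't count as a difference.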

I think terminal SNPs ("extra mutations") will start to present patterns.
 
