PLOS ONE just released Y-Chromosome Diversity in Modern Bulgarians: New Clues about Their Ancestry by Karachanak et al. Bulgaria had been relatively undersampled until now. This study, sampling 808 lineages in total across 9 Bulgarian provinces, will provide valuable new insight, especially since it is the first research to look into the deep subclades of haplogroups G (10 subclades tested!), J2, Q, and R1b (8 subclades). It is a major improvement over the Bulgarian study by the same team five years ago, which didn't even differentiate the three main branches of I (I1, I2-M223 and I2-M423). Haplogroups not listed in the table (C, H, L, R2) decrease by 0.7%.
Compared to the data I had computed on the Y-DNA tables (which I will update soon), this study shows a slightly higher frequency for haplogroups R1a (+2.5%), E1b1b (+2.1%), R1b (+2%), J1 (+1.5%), T (+1%), I1 (+0.8%), G (+0.8%), and N (+0.5%), but a lower percentage for J2 (-5%) and I2-M423 (-2%). The remaining haplogroups, I2-M223 and Q, have virtually unchanged frequencies.
G-P303 (G2a1c2a in the present ISOGG nomenclature) is the dominant branch of G2a in Bulgaria, amounting to 3.1% of all paternal lineages. Two of its subclades are well represented: L497 (1.9%) and U1 (0.5%). Other forms of G2a only have trace frequencies, but nevertheless show a remarkable diversity. There were even samples of G2a* (P15).
In the Kurgan hypothesis, the scenario I favour for the spread of Indo-European people and languages, I postulated that R1b migrated from Anatolia to the Pontic Steppe in the Neolithic, then invaded the Danube basin from the late Chalcolithic and early Bronze Age, starting roughly 5,500 ybp. To confirm that hypothesis, R1b subclades should show some kind of gradient in time from east to west. In other words, older subclades like L23 should be most common in Southeast Europe, then decrease in frequency towards Western Europe, where newer subclades arose. This is exactly what we observe here, as the main Bulgarian subclade of R1b is L23, which makes up half of all R1b lineages. Most of the other R1b lineages are later Celtic, Roman or Germanic arrivals, or even older subclades (1% of M269).
The Celtic R1b-S116 only makes up 0.7% of the Bulgarian gene pool, while the Celto-Roman R1b-U152 is at 2.2%, a frequency in line with those observed in other parts of the Roman Empire outside the U152 homeland (Italy + Alps), be it Iberia, North Africa or Anatolia.
It's always interesting to try to find traces of Germanic lineages in southern Europe. In this case, it is the Goths who settled in the region in the 3rd century. Typical Germanic haplogroups include I1 (4.3%), I2a2 (formerly I2b, 1.7%) and R1b-U106 (1.2%). This would appear to confirm my previous estimate that the Goths carried far more I1 and I2b lineages than R1b. Unfortunately, no R1a subclades were tested aside from M458, which is a Slavic branch. The Germanic subclade of interest for the Goths is Z284.
I wish I could comment on the regional variations, but only Sofia and Plovdiv have over 100 samples (the very minimum to be relevant), and some provinces have ridiculous sample sizes (e.g. n=15 for Varna and n=21 for Razgrad).
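To put rough numbers on why such small provincial samples say so little, here is a minimal Python sketch that computes an approximate 95% Wilson score interval for a haplogroup frequency at different sample sizes. The counts are hypothetical (a haplogroup observed at about 20% in every sample) and are not taken from the paper; the function name `wilson_interval` is my own.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (z=1.96)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    )
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical counts: the same ~20% observed frequency at each sample size.
for n in (15, 21, 100, 808):
    k = round(0.20 * n)
    low, high = wilson_interval(k, n)
    print(f"n={n:>3}: {k}/{n} observed, 95% CI {low:.1%} to {high:.1%}")
```

With only 15 samples the interval spans several tens of percentage points, so an apparent regional difference of a few percent cannot be distinguished from sampling noise.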
The authors of the paper make some bizarre assumptions, on which I will comment below.
Karachanak et al. said: "We found that the Y-chromosome gene pool in modern Bulgarians is primarily represented by Western Eurasian haplogroups with ~ 40% belonging to haplogroups E-V13 and I-M423, and 20% to R-M17. Haplogroups common in the Middle East (J and G) and in South Western Asia (R-L23*) occur at frequencies of 19% and 5%, respectively."
Why on earth would they classify E-V13 as Western Eurasian and not Middle Eastern or North African? There is now enough evidence (its presence across the Middle East, North Africa and Iberia) to be confident that E-V13 did not originate in the Balkans, but probably in Northeast Africa like all other main subclades of E1b1b.
Note that Karachanak et al. already claimed in their 2008 Bulgarian study that E-V13 reached the Balkans 17,000 years ago and expanded from there in Neolithic times. They haven't learned anything in five years.
Karachanak et al. said: "The lineage analysis provided the following interesting results: (i) R-L23* is present in Eastern Bulgaria since the post glacial period; (ii) haplogroup E-V13 has a Mesolithic age in Bulgaria from where it expanded after the arrival of farming; (iii) haplogroup J-M241 probably reflects the Neolithic westward expansion of farmers from the earliest sites along the Black Sea."
In other words, they are saying that R1b-L23 and E-V13 were already in Europe before the Neolithic. This is highly unlikely. It is a good example of why one shouldn't assume anything based on age estimates derived from STR loci, which have proved unreliable many times before.
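As a rough illustration of how fragile STR dating can be, the sketch below uses the simplest estimator, age = ASD / mutation rate x generation length, where ASD is the average squared distance (in repeat units) from the assumed founder haplotype. Everything numeric here is an assumption for illustration only: the ASD value is invented, the two rates are stand-ins for a slower "evolutionary" rate and a faster "pedigree" rate, and nothing is taken from the paper's own data or method.

```python
def str_age_years(asd, mutation_rate, generation_years=25.0):
    """Simple ASD-based age estimate: generations = ASD / rate per generation."""
    if mutation_rate <= 0:
        raise ValueError("mutation_rate must be positive")
    return asd / mutation_rate * generation_years


# Hypothetical average squared distance for some lineage (not real data).
ASD = 0.30

# Illustrative per-locus, per-generation rates (assumed values).
RATES = {
    "slower 'evolutionary' rate": 6.9e-4,
    "faster 'pedigree' rate": 2.1e-3,
}

for label, mu in RATES.items():
    print(f"{label}: about {str_age_years(ASD, mu):,.0f} years")
```

The same STR variance gives ages that differ by roughly a factor of three depending solely on which rate is assumed, which is enough to move a lineage from the Bronze Age to the Mesolithic or back.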