R1b-L151*

—BoNe— · Apr 13, 2025

L151* is the R1b subclade with the most descendants, with an estimated 300 million males, and is practically the founder of Europe. Its dates go back to between 3500 and 2500 BCE. Does anyone know whether we have comprehensive studies on the actual percentage of L151*?

MOESAN · Apr 15, 2025

A screen capture I have at hand (I didn't note the study references, alas):
Y-R1b-L151/L11*
1.33% (13/974) Central Western Europe (Germany, Austria, Switzerland, France)
1.30% (7/537) Northwestern Europe (Ireland, UK, Netherlands, Scandinavia)
0.67% (4/660) Southern Europe (Italy, Greece)
0.54% (3/554) Southwestern Europe (Iberia)
0.38% (3/780) Central Europe (Poland, Czechia, Slovakia, Hungary)
0.13% (3/2315) Eastern Europe (Ukraine, Belarus, Russia, Estonia)
0.00% (0/1020) Southeastern Europe (Balkans, Romania)
For the upstream Y-R1b-L51*, Iberia ranks best (first) and Northwestern Europe is only fifth; Eastern and Southeastern Europe are again the last ones.
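
With counts this small, the percentages carry wide uncertainty. As a rough sketch (not part of the screen capture), the snippet below recomputes each frequency from the counts listed above and attaches a 95% Wilson score interval. The function names and the data layout are illustrative choices, and the counts are simply those transcribed in this post.

```python
from math import sqrt

def wilson_interval(k, n, z=1.96):
    """Return the (low, high) Wilson score interval for k carriers in n samples."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = k / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

# (carriers, sample size) per region, as transcribed in the post
COUNTS = {
    "Central Western Europe": (13, 974),
    "Northwestern Europe": (7, 537),
    "Southern Europe": (4, 660),
    "Southwestern Europe": (3, 554),
    "Central Europe": (3, 780),
    "Eastern Europe": (3, 2315),
    "Southeastern Europe": (0, 1020),
}

def summarize(counts):
    """Return (region, percent, CI low percent, CI high percent) rows."""
    rows = []
    for region, (k, n) in counts.items():
        low, high = wilson_interval(k, n)
        rows.append((region, 100 * k / n, 100 * low, 100 * high))
    return rows

for region, pct, low, high in summarize(COUNTS):
    print(f"{region:<24} {pct:5.2f}%  (95% CI {low:4.2f}-{high:4.2f}%)")
```

Recomputed from the counts as transcribed, Southern Europe comes out at 0.61% rather than 0.67%, so one of those two figures may have been miscopied. Even with zero carriers in 1,020 samples, the interval for Southeastern Europe still reaches roughly 0.4%.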

—BoNe— · Apr 17, 2025

It seems that when it comes to the oldest subclades like L51*, P310*, and L151*, there’s very little data available, and most of it comes from studies with extremely small sample sizes or is derived secondhand from other research. I’ve seen a map showing L51* reaching frequencies of 8% in Central Europe and 6% in northern Portugal, but honestly, I don’t find it particularly reliable.

If you haven’t noticed, it’s fair to say that 90% of the refinement of R1b-M269+ has been done by independent researchers and enthusiasts on online forums. It’s honestly a bad joke that official academic studies continue to ignore these refinements—some of them still using subclade classifications from 2012 in papers published in 2019. All they seem to do is get lost in mental masturbation over PCA admixture plots.

MOESAN · Apr 17, 2025

It's true that some relatively recent surveys lack resolution on haplogroup subclades. It's a pity. But sometimes the recovered DNA is too degraded or too scarce to do better.

—BoNe— · Apr 17, 2025

There’s a lot of misinformation out there about contamination and degraded samples.

Contamination used to be a problem back when “analog” methods like PCR-based detection were used, where only specific DNA fragments were amplified. In those cases, you had to know exactly which mutations you were looking for, and you couldn’t detect new or unknown variants. But with Next-Generation Sequencing (NGS), the analysis is much more advanced. Not only can it detect known mutations, but it can also identify intermediate variants that would have gone unnoticed with older methods.

In the past, NGS was expensive and usually reserved for complex studies, while PCR was used for simpler tests. But NGS is essentially like performing a million PCRs simultaneously, giving us a far more complete and accurate picture of the DNA—even in degraded or contaminated samples.

Contamination or degradation can be a challenge when analyzing autosomal DNA, but not when determining Y-DNA or mitochondrial DNA haplogroups and their chronological position. For example, each male cell contains a full copy of the Y chromosome—about 57 million base pairs—and a single ancient sample can yield DNA from millions of such cells. Even if the DNA is fragmented, there are millions of opportunities to reconstruct the Y sequence because we’re not dealing with just one copy.

So no, it’s not like you only get one shot at reading the Y chromosome. Every male cell has a copy, and even highly degraded remains can provide enough fragments to determine the haplogroup. This is very different from something like eye color mutations, which are located on specific autosomal chromosomes and harder to detect in ancient DNA.
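
The "many copies" argument can be put in numbers with the standard Lander-Waterman (Poisson) approximation: if fragments land at random, a site with mean depth c is read at least once with probability 1 - exp(-c). The sketch below is illustrative only. The fragment counts and the 50 bp fragment length are assumed values, not measurements, and the model ignores repetitive Y regions that cannot be mapped and uneven preservation.

```python
from math import exp

Y_LENGTH_BP = 57_000_000  # approximate Y chromosome length quoted in the post

def mean_depth(n_fragments, fragment_len_bp, target_len_bp=Y_LENGTH_BP):
    """Mean coverage depth if fragments map uniformly at random to the target."""
    return n_fragments * fragment_len_bp / target_len_bp

def prob_site_covered(depth, min_reads=1):
    """Poisson probability that a given site is covered by at least min_reads reads."""
    # P(X >= m) = 1 - sum over i < m of exp(-c) * c**i / i!
    term = exp(-depth)
    cumulative = 0.0
    for i in range(min_reads):
        cumulative += term
        term *= depth / (i + 1)
    return 1.0 - cumulative

# Hypothetical numbers of Y-derived fragments, each about 50 bp long
for n in (100_000, 1_000_000, 5_000_000):
    c = mean_depth(n, 50)
    print(f"{n:>9} fragments: depth {c:.2f}x, "
          f"P(site read) = {prob_site_covered(c):.3f}, "
          f"P(>=3 reads) = {prob_site_covered(c, 3):.3f}")
```

Under these assumptions a single SNP is more likely than not to be read once depth approaches 1x, but a deep subclade call needs several specific positions covered, ideally more than once.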

One clear example is Tutankhamun. His DNA has been classified up to haplogroup R1b-M269. It’s not that researchers couldn’t go further—it’s that they chose not to publish the deeper subclade. Why? Likely to avoid public speculation or because revealing his precise lineage could allow modern individuals with matching subclades to make bold claims about shared ancestry, especially if that subclade is still alive today.

Another interesting case involves pre-dynastic Egyptian mummies from the Naqada culture. One sample was assigned to U152>L2>L20. Because this mummy predates 2500 BCE, some researchers claimed it must be contamination. But that’s just because they assume U152 didn’t exist yet—based on a consensus estimate, not hard data. If a U152 sample is found in a mummy from 3500–3000 BCE, that means P312, the parent clade, must have existed long before 2500 BCE.
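
The inference in that last sentence is plain tree logic: a securely dated carrier of a derived clade sets a minimum age for every clade upstream of it. A minimal sketch, using the lineage as written in this thread (intermediate nodes omitted) and a hypothetical sample date. Whether the sample assignment itself is authentic is a separate question that the code does not address.

```python
# Child -> parent links, simplified from the clades named in this thread
PARENT = {
    "L20": "L2",
    "L2": "U152",
    "U152": "P312",
    "P312": "L151",
    "L151": "P310",
    "P310": "L51",
    "L51": "M269",
}

def ancestors(clade, parent=PARENT):
    """Return the chain of upstream clades, nearest first."""
    chain = []
    while clade in parent:
        clade = parent[clade]
        chain.append(clade)
    return chain

def minimum_ages(sample_clade, sample_age_bce, parent=PARENT):
    """Every clade on the path must be at least as old as the dated sample."""
    path = [sample_clade] + ancestors(sample_clade, parent)
    return {clade: sample_age_bce for clade in path}

# Hypothetical: an L20 carrier securely dated to 3000 BCE
bounds = minimum_ages("L20", 3000)
for clade, age in bounds.items():
    print(f"{clade}: formed no later than {age} BCE")
```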

It makes no sense for a sample to be considered valid enough to study cardiovascular disease markers, and yet supposedly not reliable for Y-DNA classification—one of the easiest things to analyze in ancient DNA.

Even if a sample is contaminated by DNA from 10, 20, or even 500 modern individuals, deep subclade identification is still possible with NGS. The technology is capable of distinguishing ancient sequences from modern ones. The real obstacle is not technical—it’s interpretative and political. Researchers or institutions often choose not to release certain details, perhaps to avoid misinterpretation by the general public.
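
For context on how ancient reads are usually told apart from modern ones: post-mortem cytosine deamination makes C-to-T mismatches pile up near the 5' ends of authentic ancient fragments, while modern contaminant reads lack that signature. The toy sketch below only illustrates the counting step. The alignments are made up and the function name is hypothetical; real pipelines work from BAM files with dedicated tools such as mapDamage.

```python
def ct_rate_by_position(alignments, n_positions=5):
    """C->T mismatch rate at each of the first n_positions from the 5' end.

    alignments: iterable of (reference, read) strings of equal length,
    gap-free and on the same strand. Positions with no reference C give None.
    """
    c_total = [0] * n_positions
    c_to_t = [0] * n_positions
    for ref, read in alignments:
        for i in range(min(n_positions, len(ref), len(read))):
            if ref[i] == "C":
                c_total[i] += 1
                if read[i] == "T":
                    c_to_t[i] += 1
    return [t / c if c else None for t, c in zip(c_to_t, c_total)]

# Toy, made-up alignments: (reference, read)
TOY = [
    ("CACGT", "TACGT"),
    ("CCAGT", "TCAGT"),
    ("ACCGT", "ACCGT"),
    ("CGCAT", "CGCAT"),
]

rates = ct_rate_by_position(TOY)
print(rates)
```

A rate that is high at position 0 and drops off further into the read is the pattern associated with ancient DNA; a flat, near-zero profile points to modern material.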

To truly resolve the origins of P312 and its closest subclades, we would need two things:

1. Large-scale analysis of at least 10,000 individuals per region in all key areas involved in the Atlantic Bronze Age (4000–2000 BCE).
2. A precise phylogenetic mapping of horses, cattle, and pigs, especially the domesticated lineages associated with those human populations.

If those two steps were carried out, everything would become much clearer. But at the current pace, we might not see it for another 30 years.

Research should focus almost exclusively on refining the Y-chromosome tree. Women can’t generate thousands of lineage copies per generation—men can.

As for PCAs (Principal Component Analyses), they’ve proven unreliable in this context. The obsession with “steppe ancestry” oversimplifies things. Every population has at least 25% of what’s vaguely labeled as “Atlantic-Mediterranean.” If Northern Europe averages 25% R1a, then yes, it’ll show high “steppe ancestry”—but that has nothing to do with the actual Yamnaya people. The Yamnaya were distant cousins of both Bell Beaker and Corded Ware groups, not their direct ancestors.

MOESAN · Apr 17, 2025

OK on the Y-haplogroups; I hadn't thought it through (I hadn't thought at all). Concerning Yamnaya, not everyone thinks it is the direct ancestor of us western Europeans. That said, Y-haplogroups can undergo founder effects, so interpreting today's percentages can sometimes be very misleading. I rely more on ancient finds, if their dating is reliable, than on interpretations based on modern samples, even if we have to make do with what we have at hand.
