Comparing Ancient Greek populations to modern Greeks and Italians

It is a leaked academic PCA, not something from G25.


OK, I looked at it again and there are no Mycenaeans or Anatolia_BA on it, unless I am blind.
 
I included the BGR_IA Southern Arc samples; Alalakh_MLBA is still closer. Check again.

So this is the order, from closest to farthest?

Code:
GRC_Mycenaean_BA TUR_Hatay_Alalakh_MLBA 0.0595 0.000941
GRC_Mycenaean_BA BGR_KapitanAndreevo_IA 0.0888 0.00132
GRC_Mycenaean_BA TUR_E_Arslantepe_EBA   0.123  0.00180
GRC_Mycenaean_BA BGR_TellKran_EBA       0.205  0.00233
GRC_Mycenaean_BA GRC_Logkas_MBA         0.212  0.00262
GRC_Mycenaean_BA BGR_Dzhulyunitsa_EBA   0.215  0.00267
GRC_Mycenaean_BA BGR_Tell_Ezero_EBA     0.314  0.00268
GRC_Mycenaean_BA BGR_Smyadovo_EBA       0.325  0.00288
GRC_Mycenaean_BA BGR_Boyanovo_EBA       0.434  0.00262
GRC_Mycenaean_BA BGR_Beli_Breyag_EBA    0.522  0.00224
GRC_Mycenaean_BA BGR_Diamandievo_IA     0.536  0.00223
GRC_Mycenaean_BA BGR_Merichleri_EBA     0.539  0.00212
GRC_Mycenaean_BA BGR_Merichleri_MLBA    0.543  0.00219
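For anyone who wants to check the ordering mechanically, here is a small sketch that parses the rows of the listing above and sorts them by the third column. It assumes the third column is the FST estimate and the fourth its standard error; the names `parse_rows` and `closest_first` are my own labels for illustration and are not part of any tool used in the thread.

```python
# Rows copied from the listing above.
# Assumption: column 3 is the FST estimate, column 4 its standard error.
ROWS = """\
GRC_Mycenaean_BA TUR_Hatay_Alalakh_MLBA 0.0595 0.000941
GRC_Mycenaean_BA BGR_KapitanAndreevo_IA 0.0888 0.00132
GRC_Mycenaean_BA TUR_E_Arslantepe_EBA   0.123  0.00180
GRC_Mycenaean_BA BGR_TellKran_EBA       0.205  0.00233
GRC_Mycenaean_BA GRC_Logkas_MBA         0.212  0.00262
GRC_Mycenaean_BA BGR_Dzhulyunitsa_EBA   0.215  0.00267
GRC_Mycenaean_BA BGR_Tell_Ezero_EBA     0.314  0.00268
GRC_Mycenaean_BA BGR_Smyadovo_EBA       0.325  0.00288
GRC_Mycenaean_BA BGR_Boyanovo_EBA       0.434  0.00262
GRC_Mycenaean_BA BGR_Beli_Breyag_EBA    0.522  0.00224
GRC_Mycenaean_BA BGR_Diamandievo_IA     0.536  0.00223
GRC_Mycenaean_BA BGR_Merichleri_EBA     0.539  0.00212
GRC_Mycenaean_BA BGR_Merichleri_MLBA    0.543  0.00219
"""


def parse_rows(text):
    """Return a list of (pop1, pop2, fst, se) tuples from whitespace-separated rows."""
    parsed = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        pop1, pop2, fst, se = fields
        parsed.append((pop1, pop2, float(fst), float(se)))
    return parsed


def closest_first(rows):
    """Sort rows by the FST column, smallest (closest) first."""
    return sorted(rows, key=lambda row: row[2])


if __name__ == "__main__":
    for pop1, pop2, fst, se in closest_first(parse_rows(ROWS)):
        print(f"{pop2:<24} {fst:.4f} +/- {se:.5f}")
```

Sorting leaves the rows in the same order as posted, so the listing does already run from the smallest value to the largest.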

If I had said that Mycenaeans were more akin to a Levantine population like Alalakh_MLBA than to Thracians, half the forum would have lynched me and called me a "t-roll" XD

Also, don't take it the wrong way, but please don't post G25 crap as some sort of rebuttal when I present you with FST distances straight from the source provided by the authors.

Your methodology is bs and the conclusions based on it are false.

For heaven's sake though, the tool that produces results that make the most sense should be used, no matter what it is. And Global25 distances make more sense.

Distance to: GRC_Mycenaean_BA
0.02295970 BGR_KapitanAndreevo_IA
0.09308855 TUR_Alalakh_MLBA
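To make clear what a "Distance to:" list of this kind is measuring, here is a minimal sketch. It assumes the figures are plain Euclidean distances between coordinate vectors (25-dimensional in the case of Global25); the coordinates and population names below are made up for illustration and are not real G25 data.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two coordinate vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("coordinate vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_distance(target, references):
    """Return (distance, name) pairs sorted from closest to farthest."""
    return sorted(
        (euclidean_distance(target, coords), name)
        for name, coords in references.items()
    )


# Hypothetical 3-dimensional coordinates, invented for this example only.
target = [0.10, 0.20, 0.30]
references = {
    "pop_A": [0.10, 0.20, 0.32],
    "pop_B": [0.15, 0.10, 0.30],
}

if __name__ == "__main__":
    print("Distance to: target")
    for distance, name in rank_by_distance(target, references):
        print(f"{distance:.8f} {name}")
```

A distance of this kind depends entirely on how the coordinates were produced, which is why it need not agree with an FST estimate computed directly from allele frequencies.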
 
I am not commenting on G25; when it becomes open source and verifiable, I'll look at it again.

Until then, you can believe what you want in your little cult.
 
It is not a cult at all. If Global25 had shown distances like the ones you listed above, I would have abandoned it long ago, precisely because they are too far-fetched.
 
I reproduced the qpAdm run they had for Mycenaeans, with KapitanAndreevo as the target; here is the whole thing if anyone wants to verify:

Code:
> left=c("TUR_Marmara_Barcın_N","ISR_Feldman_PPNB", "SRB_Iron_Gates_HG", "EHG", "CHG")
> right=c("Mbuti.DG","IRN_Ganj_Dareh_N","ISR_Natufian_EpiP","MAR_Taforalt_EpiP","RUS_AfontovaGora3","RUS_MA1_HG","TUR_Pınarbaşı_EpiP", "WHG")
> target=c("BGR_KapitanAndreevo_IA")
> mypops=c("Mbuti.DG","IRN_Ganj_Dareh_N","ISR_Natufian_EpiP","MAR_Taforalt_EpiP","RUS_AfontovaGora3","RUS_MA1_HG","TUR_Pınarbaşı_EpiP", "WHG","BGR_KapitanAndreevo_IA","TUR_Marmara_Barcın_N","TUR_C_Boncuklu_PPN","ISR_Feldman_PPNB", "SRB_Iron_Gates_HG", "EHG", "CHG")
> 
> extract_f2(prefix, my_f2_dir, pops = mypops, overwrite = TRUE, maxmiss = 1)
ℹ Reading allele frequencies from packedancestrymap files...
ℹ SouthernArc_Public.geno has 5940 samples and 1233013 SNPs
ℹ Calculating allele frequencies from 107 samples in 15 populations
ℹ Expected size of allele frequency data: 286 MB
1233k SNPs read...
✔ 1233013 SNPs read in total
! 1150639 SNPs remain after filtering. 1015685 are polymorphic.
ℹ Allele frequency matrix for 1150639 SNPs and 15 populations is 221 MB
ℹ Computing pairwise f2 for all SNPs and population pairs requires 6627 MB RAM without splitting
ℹ Computing without splitting since 6627 < 8000 (maxmem)...
ℹ Data written to C:\Users\eptr\Documents\SouthernArc_Public\my_f2_dir_eptr/
> f2_blocks = f2_from_precomp(my_f2_dir, pops = mypops, afprod = TRUE)
ℹ Reading precomputed data for 15 populations...
ℹ Reading ap data for pair 120 out of 120...
.
.
.


And these are the results:

Code:
> results = qpadm(prefix, left, right, target, allsnps = TRUE)
ℹ Reading metadata...
ℹ Computing block lengths for 1150639 SNPs...
ℹ Computing 35 f4-statistics for block 713 out of 713...
ℹ "allsnps = TRUE" uses different SNPs for each f4-statistic
  Number of SNPs used for each f4-statistic:
.
.
.
> results$weights
# A tibble: 5 × 5
  target                 left                 weight     se     z
  <chr>                  <chr>                 <dbl>  <dbl> <dbl>
1 BGR_KapitanAndreevo_IA TUR_Marmara_Barcın_N 0.621  0.0598 10.4 
2 BGR_KapitanAndreevo_IA ISR_Feldman_PPNB     0.0723 0.0508  1.42
3 BGR_KapitanAndreevo_IA SRB_Iron_Gates_HG    0.0572 0.0110  5.18
4 BGR_KapitanAndreevo_IA EHG                  0.0561 0.0154  3.65
5 BGR_KapitanAndreevo_IA CHG                  0.194  0.0204  9.51
> results$popdrop
# A tibble: 31 × 16
   pat      wt   dof  chisq        p f4rank TUR_Marmara_Barcın_N
   <chr> <dbl> <dbl>  <dbl>    <dbl>  <dbl>                <dbl>
 1 00000     0     3   3.56 3.13e- 1      4                0.621
.
.
.

You can see that even BGR_KapitanAndreevo_IA scores PPNB (7.23%) and high CHG (19.4%) with a robust p-value (0.313), and that's why the FST distances show such proximity in my post above.
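As a quick sanity check on the weights table above, the sketch below recomputes the z column and the sum of the weights. It assumes, as the printed numbers suggest, that z is simply the weight divided by its standard error; the helper names are my own and not part of the qpAdm output.

```python
# (source, weight, standard error, z as printed) copied from results$weights above.
WEIGHTS = [
    ("TUR_Marmara_Barcın_N", 0.621, 0.0598, 10.4),
    ("ISR_Feldman_PPNB", 0.0723, 0.0508, 1.42),
    ("SRB_Iron_Gates_HG", 0.0572, 0.0110, 5.18),
    ("EHG", 0.0561, 0.0154, 3.65),
    ("CHG", 0.194, 0.0204, 9.51),
]


def z_score(weight, se):
    """Assumed definition: estimated weight divided by its standard error."""
    return weight / se


def total_weight(rows):
    """Sum of the admixture weights; should be close to 1 for a full model."""
    return sum(weight for _, weight, _, _ in rows)


if __name__ == "__main__":
    for source, weight, se, printed_z in WEIGHTS:
        print(f"{source:<22} {100 * weight:5.2f}%  z = {z_score(weight, se):5.2f} (printed {printed_z})")
    print(f"sum of weights = {total_weight(WEIGHTS):.4f}")
```

The recomputed values agree with the printed column to within the rounding of the displayed digits, and the weights sum to about 1.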

This is a result your platform fails to capture and reproduce because it's either crap or tweaked this way, or both.

As such it cannot be used to dispute anything presented in the paper that is easily reproduced with the open source tools the authors use themselves following their methodology.

This is my absolutely last post concerning g25 shenanigans.
 
I swear I've never seen so much tunnel vision thinking in my life.

Alalakh is right on the border between Anatolia and Syria; a Neolithic farmer population came precisely from that area.

Even if the analysis were correct, how would we know whether this person was actually from further north in Anatolia rather than further south in the Levant proper?

No wonder the people from Anthrogenica have gotten so many things wrong.
 
OK, I agree G25 has its faults, but why is it that when I compare my results to other Greeks, G25 gives me Messinian as my number one population at a distance of 1.9, while the GEDmatch calculators give me Thessaly, Central Greece or Albania at 3.75? The other issue I have with Dodecad (as an example) is the ancestral components: Gedrosia, SW Asia, Atlantic, North Europe, Caucasus. How are they determined and what do they mean? When I compare my Dodecad results (for example, with new samples) to G25, they're pretty much the same but at greater distances, so maybe when all is said and done there isn't much difference.
 
OK, I looked at it again and there's no Myceneans nor Anatolia_BA on it unless I am blind.
Aegean BA refers to Mycenaeans. And I doubt Anatolia BA will be very different from the Byzantine one, maybe only a little less Western.
 
You can see that even BGR_KapitanAndreevo_IA scores PPNB (7.23%) and high CHG (19.4%) with a robust p-value (0.313), and that's why the FST distances show such proximity in my post above.

When Iran_N is not included, a similar percentage also comes out on Global25 (5.8% Levant_PNB). But that still does not justify the allegedly close proximity between Mycenaean or Kapitan Andreevo and Alalakh.

This is a result your platform fails to capture and reproduce because it's either crap or tweaked this way, or both.

As such it cannot be used to dispute anything presented in the paper that is easily reproduced with the open source tools the authors use themselves following their methodology.

This is my absolutely last post concerning g25 shenanigans.

Luckily, "my" platform would never reproduce shitty results like these, taken directly from the paper.

Etruscan_Tarquinia = CHG 14.45%, EHG 9.87%, Levant_PPN 5.94%, SRB_Iron_Gates_HG 13.57%, TUR_Marmara_Barcın_N 56.19%

Imagine then using these results to prove a very recent connection between the Etruscans and West Asia. The paper did the same with Yamnaya by claiming non-existent Levantine admixture.

Even if the analysis were correct, how would we know whether this person was actually from further north in Anatolia rather than further south in the Levant proper?

If you are referring to Alalakh_MLBA, that average is fully North Levantine and clusters close to EMBA Syrians from Ebla.

Closest averages on G25

Distance to: TUR_Alalakh_MLBA
0.01382046 TUR_SE_Kilis_MBA
0.01407303 SYR_Ebla_EMBA
0.01891936 TUR_SE_Kilis_EBA_A
0.02302061 MKD_Anc_outlier1
0.02370778 Levant_Beirut_IAIII
0.02471069 Levant_LBN_Roman
0.02483178 IRN_DinkhaTepe_BIA_A

Dodecad K12b distances with modern populations

Distance to: Alalakh_MLBA
5.33902045 Lebanese_Christian
5.81571781 Palestinian_Christian
6.23079967 Syrian_Christian
6.70861389 Jordanian_Christian
8.23371704 Nusayri_Turkey
8.27243029 Iraqi_Jew
8.71029762 Kurdish_Jew
9.32366659 Assyrian_West
9.62651803 Assyrian_South
 
The paper did the same with Yamnaya by claiming non-existent Levantine admixture.


:LOL::LOL::LOL:

It's non-existent because you say so.

Never mind the fact that it says explicitly in the supplementary material why Levant_PPN was used, but then again when did your little cult ever bother with mundane things like actual reading?

.
.
.
The CHG/EHG combination is invariant in the fitting models, with the EHG proportion in the ~40-50% range in all of them. We note parenthetically that the model of (17) that includes CHG/EHG/WHG/Anatolian Neolithic ancestry fails in our framework (p<1e-10), and inspection of outlier f4-statistics indicates that it underestimates (Z<-3) shared drift with Levant_PPN (Z=-5.6), Natufians, Azerbaijan Neolithic, and Ganj Dareh outgroups.
.
.
.
To summarize our results in this section, when we consider Neolithic sources, we can model the ancestry of the Yamnaya cluster as a mixture of a southern source from the South Caucasus and a CHG/EHG-admixed source which does not correspond to any of the sampled Eneolithic populations of the steppe either because they might not have the right balance of CHG and EHG ancestry (which presumably existed in other proportions than those in sampled individuals) or because they have extra Siberian affinity. The populations of the South Caucasus can be modeled as having both Anatolian and Levantine-related ancestry using the analysis of the Neolithic continuum (11) and in terms of the 5-source model (Supplementary Text S3, Fig. S 3, Fig. 5) and could thus be useful candidates for contributing this type of ancestry to the Yamnaya along the Caucasus genetic bridge (Fig. 3; Fig. S 3). Our modeling suggests that the contribution of the southern population to the ancestry of the Yamnaya was substantial. When we consider more proximal Chalcolithic/Eneolithic sources, the Yamnaya cluster can be modeled with a Southern Arc source that is not geographically well-localized but includes the Caucasus and SE Anatolia and northern ancestry related to the Eneolithic of the North Caucasus Piedmont but not corresponding to it exactly (having more EHG ancestry than it and no Siberian affinity
.
.
.
 
Amen to that.

The Steppe...Corded Ware...and repeat.
 
I ran my own WGS data with the paper's methodology out of curiosity and these are my observations:

Code:
> results$weights
# A tibble: 5 × 5
  target left      weight     se     z
  <chr>  <chr>      <dbl>  <dbl> <dbl>
1 eptr  Turkey_N  0.302  0.0723  4.18
2 eptr  PPN       0.149  0.0545  2.73
3 eptr  Balkan_HG 0.0693 0.0217  3.19
4 eptr  EHG       0.0962 0.0293  3.29
5 eptr  CHG       0.384  0.0400  9.59
> results$popdrop
# A tibble: 31 × 16
   pat      wt   dof chisq        p f4rank Turkey_N      PPN Balkan_HG
   <chr> <dbl> <dbl> <dbl>    <dbl>  <dbl>    <dbl>    <dbl>     <dbl>
 1 00000     0     3  21.7 7.40e- 5      4    0.302   0.149     0.0693


Model fails with a p-value of 0.0000740.

Swapping CHG and Iran_N (CHG moves from the left list to the right outgroup list, and Iran_N the other way):

Code:
> results$weights
# A tibble: 5 × 5
  target left       weight     se       z
  <chr>  <chr>       <dbl>  <dbl>   <dbl>
1 eptr  Turkey_N  0.416   0.0621  6.69  
2 eptr  PPN       0.00456 0.0506  0.0901
3 eptr  Balkan_HG 0.0961  0.0217  4.43  
4 eptr  EHG       0.0338  0.0300  1.13  
5 eptr  Iran_N    0.450   0.0358 12.5   
> results$popdrop
# A tibble: 31 × 16
   pat      wt   dof  chisq         p f4rank Turkey_N       PPN Balkan_HG
   <chr> <dbl> <dbl>  <dbl>     <dbl>  <dbl>    <dbl>     <dbl>     <dbl>
 1 00000     0     3   1.92 5.89e-  1      4    0.416   4.56e-3    0.0961

The model passes with a robust p-value of 0.589.
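For anyone who wants to check the p-values against the chisq and dof columns in the two popdrop tables, here is a minimal sketch. It assumes p is the upper-tail probability of a chi-square distribution, and it uses the closed form that holds only for 3 degrees of freedom (the dof shown in both tables); the function name is my own.

```python
import math


def chi2_tail_dof3(chisq):
    """Upper-tail probability P(X > chisq) for a chi-square variable with 3 dof.

    Closed form valid only for 3 degrees of freedom:
    erfc(sqrt(x / 2)) + sqrt(2 x / pi) * exp(-x / 2)
    """
    if chisq < 0:
        raise ValueError("chi-square statistic cannot be negative")
    half = chisq / 2.0
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 * chisq / math.pi) * math.exp(-half)


if __name__ == "__main__":
    # chisq values as printed (rounded) in the two popdrop tables above.
    for label, chisq in (("CHG on the left", 21.7), ("Iran_N on the left", 1.92)):
        print(f"{label}: chisq = {chisq}, p = {chi2_tail_dof3(chisq):.3g}")
```

With the rounded chisq values this gives a p-value on the order of 7e-5 for the first model and about 0.59 for the second, in line with the p column of the output.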


Observations:

1) CHG and Iran_N do seem interchangeable and need to be swapped around on a case-by-case basis.

2) My Serbia_Irongates_HG is huge, nearly 10%. That will definitely inflate my Steppe on other PCA-based calculators (my EHG is barely 3.5%), since the Balkan_HG will act as a stand-in for Karelia/Samara_HG. That gives a false picture, because Serbia_Irongates_HG was never a source population for Yamnaya_Samara.

Something to ponder on?

 
So CHG fed into Iran_N prior to the Neolithic, right (I'm guessing a significant chunk), hence the interchangeability? My other question, which was brought up by another poster, concerns the comment by Lazaridis (I may be misrepresenting what he said) that the Balkan IE source that came south into Greece during the Bronze Age was something like 60-70% CHG / 30% EHG. So could an additional source of the CHG found in Mycenaeans be coming from that population?
 
Does E-V13 need re-imagining? It was found in Nicaea, a Byzantine/Greek place, in ca. 600 AD, before the Vlach and Albanian settlements in Greece. Perhaps it was from the late-antiquity Hellenic/Byzantine world that some Albanians and Vlachs picked up this haplogroup, to spread it in other parts of Greece later. A quick perusal of YFull seems to show Greeks in older branches and generally higher up on the tree.
 
Heraclean period is compatible with Vlachs.
 
My other question, which was brought up by another poster, concerns the comment by Lazaridis (I may be misrepresenting what he said) that the Balkan IE source that came south into Greece during the Bronze Age was something like 60-70% CHG / 30% EHG. So could an additional source of the CHG found in Mycenaeans be coming from that population?


I've missed that bit.
 
Does E-V13 need re-imagining? It was found in Nicaea, a Byzantine/Greek place, in ca. 600 AD, before the Vlach and Albanian settlements in Greece. Perhaps it was from the late-antiquity Hellenic/Byzantine world that some Albanians and Vlachs picked up this haplogroup, to spread it in other parts of Greece later. A quick perusal of YFull seems to show Greeks in older branches and generally higher up on the tree.

Do you mean to say that the Greek E-V13 is older than the E-V13 found in Albanians?

How about the Mycenaean-like E-V13 sample found in the EBA in Bulgaria?

How about the E-V13 found further north?

It would seem to me that before drawing all these conclusions, a diagram should be made showing the branches and the ages of the samples. Southern Italian E-V13 should be included as well.
 
I've always maintained that the same sort of thing happens in far-northeastern Europe. All that WHG inflates their "steppe" proportions. The same thing may be true everywhere in Europe where there may have been a WHG "resurgence", which I also always proposed.
 
