eupator
destroyer of delusions
- Ethnic group
- Rhōmaiōs (Rumelia + Anatolia)
Had some time to kill so here goes.
These are my best ‘mixed mode’ results for each period on G25 Illustrative DNA. The raw DNA file used to produce them was extracted from my Nebula Genomics whole genome sequence and has 99.9% SNP coverage against the eurogenes template, so it is the best input that could have been used.
Next, I tried to recreate all these models in qpAdm (admixtools2) to check the validity of my results. The reference samples were exactly the ones used by Illustrative DNA; for “Corded Ware Culture”, for example, I used: N44, N45, pcw040, pcw041, pcw061, pcw070, pcw211, pcw350, pcw361. The same goes for the rest of the references. The raw DNA file merged into the Reich dataset was also the same one used to produce the G25 results.
These are the qpAdm/admixtools2 results, using a fairly ‘easy’ right list to give the model some breathing room.
Code:
> right = c('Mbuti.DG', 'Iran_N', 'Natufian', 'Iberomaurusian', 'Russia_AfontovaGora3', 'Russia_MA1_HG.SG', 'Turkey_Epipaleolithic', 'WHG', 'Iraq_PPN', 'ONG.SG')
> target = c('dosas')
> left = c('Copper_Age_Anatolian','Corded_Ware_Culture')
> results = qpadm(prefix, left, right, target, allsnps = TRUE)
ℹ Reading metadata...
ℹ Computing block lengths for 1150639 SNPs...
ℹ Computing 18 f4-statistics for block 713 out of 713...
ℹ "allsnps = TRUE" uses different SNPs for each f4-statistic
Number of SNPs used for each f4-statistic:
pop1 pop2 pop3 pop4 n
1 dosas Copper_Age_Anatolian Mbuti.DG Iberomaurusian 639471
2 dosas Copper_Age_Anatolian Mbuti.DG Iran_N 613110
3 dosas Copper_Age_Anatolian Mbuti.DG Iraq_PPN 271162
4 dosas Copper_Age_Anatolian Mbuti.DG Natufian 139462
5 dosas Copper_Age_Anatolian Mbuti.DG ONG.SG 665456
6 dosas Copper_Age_Anatolian Mbuti.DG Russia_AfontovaGora3 155549
7 dosas Copper_Age_Anatolian Mbuti.DG Russia_MA1_HG.SG 472363
8 dosas Copper_Age_Anatolian Mbuti.DG Turkey_Epipaleolithic 510849
9 dosas Copper_Age_Anatolian Mbuti.DG WHG 462873
10 dosas Corded_Ware_Culture Mbuti.DG Iberomaurusian 663322
11 dosas Corded_Ware_Culture Mbuti.DG Iran_N 630366
12 dosas Corded_Ware_Culture Mbuti.DG Iraq_PPN 276211
13 dosas Corded_Ware_Culture Mbuti.DG Natufian 140980
14 dosas Corded_Ware_Culture Mbuti.DG ONG.SG 708861
15 dosas Corded_Ware_Culture Mbuti.DG Russia_AfontovaGora3 156781
16 dosas Corded_Ware_Culture Mbuti.DG Russia_MA1_HG.SG 497634
17 dosas Corded_Ware_Culture Mbuti.DG Turkey_Epipaleolithic 524260
18 dosas Corded_Ware_Culture Mbuti.DG WHG 467622
ℹ Computing admixture weights...
ℹ Computing standard errors...
ℹ Computing number of admixture waves...
> results$weights
# A tibble: 2 × 5
target left weight se z
<chr> <chr> <dbl> <dbl> <dbl>
1 dosas Copper_Age_Anatolian 0.622 0.0447 13.9
2 dosas Corded_Ware_Culture 0.378 0.0447 8.45
> results$popdrop
# A tibble: 3 × 13
pat wt dof chisq p f4rank Copper_Age_Anatolian Corded_Ware_Cul…¹ feasi…² best dofdiff chisq…³ p_nes…⁴
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <lgl> <lgl> <dbl> <dbl> <dbl>
1 00 0 8 22.1 4.73e- 3 1 0.622 0.378 TRUE NA NA NA NA
2 01 1 9 219. 3.41e-42 0 1 NA TRUE TRUE 0 -270. 1
3 10 1 9 489. 1.26e-99 0 NA 1 TRUE TRUE NA NA NA
# … with abbreviated variable names ¹Corded_Ware_Culture, ²feasible, ³chisqdiff, ⁴p_nested
The same model gives me 62.2% Copper Age Anatolian (s.e. 4.47%) and 37.8% Corded Ware Culture (s.e. 4.47%). The percentages are similar to G25, but the p-value is very low at 0.00473, so the model fails (a p-value above 0.05 is required).
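For anyone wondering where the p-value comes from, here is a minimal base R sketch. It assumes the usual full-rank count, degrees of freedom = number of right populations minus number of sources, which is consistent with the dof column printed above. The helper names `qpadm_dof` and `tail_p` are made up for illustration and are not part of admixtools.

```r
# Hypothetical helpers for illustration only, not admixtools functions.

# Degrees of freedom of a full-rank qpAdm fit, counted as
# (number of right populations) - (number of sources).
qpadm_dof <- function(n_sources, n_right) {
  n_right - n_sources
}

# Tail probability: chance of seeing a chi-square at least this large
# if the model were an adequate description of the target.
tail_p <- function(chisq, dof) {
  pchisq(chisq, df = dof, lower.tail = FALSE)
}

dof <- qpadm_dof(n_sources = 2, n_right = 10)  # 2 sources, 10 right pops
p   <- tail_p(22.1, dof)                       # chisq from popdrop row 1

print(dof)       # degrees of freedom for the two-way model
print(p)         # close to the p column; chisq is rounded in the printout
print(p < 0.05)  # TRUE means the model is rejected at the 5% level
```

Because the printed chisq is rounded to 22.1, the recomputed value only approximately matches the 4.73e-3 shown in the table.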
Now for the Iron Age, where g25 gives me 49% IA Thracian and 51% Etiuni (Caucasus-Trialeti).
Code:
> results = qpadm(prefix, left, right, target, allsnps = TRUE)
ℹ Reading metadata...
ℹ Computing block lengths for 1150639 SNPs...
ℹ Computing 18 f4-statistics for block 713 out of 713...
ℹ "allsnps = TRUE" uses different SNPs for each f4-statistic
Number of SNPs used for each f4-statistic:
pop1 pop2 pop3 pop4 n
1 dosas Etiuni Mbuti.DG Iberomaurusian 657869
2 dosas Etiuni Mbuti.DG Iran_N 627599
3 dosas Etiuni Mbuti.DG Iraq_PPN 275421
4 dosas Etiuni Mbuti.DG Natufian 140826
5 dosas Etiuni Mbuti.DG ONG.SG 693364
6 dosas Etiuni Mbuti.DG Russia_AfontovaGora3 156721
7 dosas Etiuni Mbuti.DG Russia_MA1_HG.SG 489370
8 dosas Etiuni Mbuti.DG Turkey_Epipaleolithic 521589
9 dosas Etiuni Mbuti.DG WHG 467356
10 dosas Thracian Mbuti.DG Iberomaurusian 641747
11 dosas Thracian Mbuti.DG Iran_N 616376
12 dosas Thracian Mbuti.DG Iraq_PPN 272303
13 dosas Thracian Mbuti.DG Natufian 139943
14 dosas Thracian Mbuti.DG ONG.SG 668717
15 dosas Thracian Mbuti.DG Russia_AfontovaGora3 155919
16 dosas Thracian Mbuti.DG Russia_MA1_HG.SG 474445
17 dosas Thracian Mbuti.DG Turkey_Epipaleolithic 512834
18 dosas Thracian Mbuti.DG WHG 464542
ℹ Computing admixture weights...
ℹ Computing standard errors...
ℹ Computing number of admixture waves...
> results$weights
# A tibble: 2 × 5
target left weight se z
<chr> <chr> <dbl> <dbl> <dbl>
1 dosas Thracian 0.542 0.0764 7.10
2 dosas Etiuni 0.458 0.0764 5.99
> results$popdrop
# A tibble: 3 × 13
pat wt dof chisq p f4rank Thracian Etiuni feasible best dofdiff chisqdiff p_nested
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <lgl> <lgl> <dbl> <dbl> <dbl>
1 00 0 8 41.7 1.56e- 6 1 0.542 0.458 TRUE NA NA NA NA
2 01 1 9 318. 4.12e-63 0 1 NA TRUE TRUE 0 -74.4 1
3 10 1 9 392. 6.08e-79 0 NA 1 TRUE TRUE NA NA NA
qpAdm gives me roughly the same percentages, 54.2% (s.e. 7.64%) for Thracian and 45.8% (s.e. 7.64%) for Etiuni, but the p-value is again abysmally low at 0.00000156, indicating a fail.
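A side note on the z column: it is just weight divided by standard error, so it says how tightly a weight is estimated, not whether the model as a whole fits. A rough base R check, with the numbers copied from the two weights tables above (the data frame layout and model labels are my own):

```r
# Weights and standard errors copied from the two results$weights tables.
fits <- data.frame(
  model  = c("Anatolian + CWC", "Anatolian + CWC",
             "Thracian + Etiuni", "Thracian + Etiuni"),
  left   = c("Copper_Age_Anatolian", "Corded_Ware_Culture",
             "Thracian", "Etiuni"),
  weight = c(0.622, 0.378, 0.542, 0.458),
  se     = c(0.0447, 0.0447, 0.0764, 0.0764)
)

# z-score of each admixture weight: estimate divided by its standard error.
fits$z <- fits$weight / fits$se

print(fits)
```

Every weight here has a large z, yet both models are rejected on their tail probability, so the two numbers have to be read separately.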
I swap some stuff around and run my own model:
Code:
ℹ Reading metadata...
ℹ Computing block lengths for 1150639 SNPs...
ℹ Computing 57 f4-statistics for block 713 out of 713...
ℹ "allsnps = TRUE" uses different SNPs for each f4-statistic
Number of SNPs used for each f4-statistic:
.
.
.
ℹ Computing admixture weights...
ℹ Computing standard errors...
ℹ Computing number of admixture waves...
warning: solve(): system is singular (rcond: 2.0141e-17); attempting approx solution
> results$weights
# A tibble: 3 × 5
target left weight se z
<chr> <chr> <dbl> <dbl> <dbl>
1 dosas Spain_Hellenistic_Emporion 0.390 0.0636 6.12
2 dosas Armenia_LBA.SG 0.527 0.0739 7.13
3 dosas AV2 0.0836 0.0467 1.79
> results$popdrop
# A tibble: 7 × 14
pat wt dof chisq p f4rank Spain_Hellen…¹ Armen…² AV2 feasi…³ best dofdiff
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <lgl> <lgl> <dbl>
1 000 0 17 27.3 5.39e- 2 2 0.390 0.527 0.0836 TRUE NA NA
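39% Empuries2 (Spain_Hellenistic_Emporion) + 52.7% Armenia_LBA + 8.36% Slav AV2. The tail probability is 5.39%, just above the 5% threshold, and the standard errors range from 4.67% to 7.39%.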
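To put the standard errors in perspective, here is a rough sketch of normal-approximation 95% intervals for the three weights. It uses base R only, with the weights and standard errors copied from the table above; the normal approximation is my assumption, and qpadm itself does not report these intervals in the output shown.

```r
# Weights and standard errors copied from results$weights (three-way model).
w  <- c(Spain_Hellenistic_Emporion = 0.390,
        Armenia_LBA.SG             = 0.527,
        AV2                        = 0.0836)
se <- c(0.0636, 0.0739, 0.0467)

# Approximate 95% interval: weight +/- 1.96 standard errors.
half <- qnorm(0.975) * se
ci <- data.frame(weight = w, lower = w - half, upper = w + half)

print(round(ci, 3))
print(sum(w))  # weights should sum to about 1, up to rounding
```

Under this approximation the interval for AV2 reaches slightly below zero, in line with its z of 1.79, so the Slavic component is the least firmly estimated of the three.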
Before someone mentions the obvious, that this is a 3-way model: nowhere on Illustrative DNA do I get a result even remotely close to the above.
TLDR:
Those who put blind faith in their G25 results should keep in mind that these do not necessarily correspond to proper f-stats modelling.