Agreed. We can only examine the samples we have, preferably the best ones (for instance, avoiding samples that are not representative, or that are so mixed among themselves that the results become much less reliable). But I maintain that I see nothing "crazy" in my G25 models of South Italian samples. They may not be representative of the whole population, particularly because there are so few of them, but they still exist and are part of the population.
Now that I'm finally modelling Anatolia_N as what it is, a pool of many Neolithic Anatolian samples (the Fernandes et al. supplement explicitly says so, though it doesn't specify which individual samples were used or where they come from), I'm getting results that do not look that different from what previous studies have asserted. The caveat is that doing so hides a bit of the CHG/Iran and Levant_N that may be there not because it arrived with the ANF farmers, but because of later admixture blended into a more unmixed Barcin-N population.
Levant_N is still there in my models (in lower proportions and in fewer individuals, of course). In all honesty, though, I don't think most geneticists were interested in building more complicated ancestry models out of sources as weakly divergent from each other as Anatolia_N and Levant_N, particularly when they can clearly see that the proportions of the latter are so minor that Anatolia_N can basically absorb them. That lets them focus the model on very divergent population movements, like the Iran_N, Taforalt-related and Pontic-Caspian Steppe ones. There's also the issue of the aim of the study: authors won't care about modelling a more convoluted demographic history involving a few percent of Levantine or North African ancestry if their aim is to track Indo-European migrations or to understand when Iran/CHG arrived in a certain part of the world (just some examples). Ultimately what I want to say is this: the models don't speak for themselves, they are also built according to what the authors want to detect and explain.
Let me show you one example just with Abruzzo and Sicilian Italians:
Target | Distance | GEO_CHG | IRN_Ganj_Dareh_N | ITA_Grotta_Continenza_Meso | Levant_PPNB | MAR_EN | RUS_Karelia_HG | RUS_Khvalynsk_En | TUR_Barcin_N | TUR_Boncuklu_N | TUR_Kumtepe_N | TUR_Tepecik_Ciftlik_N | WHG |
Italian_Abruzzo:Alp090 | 0.02574303 | 3.6 | 10.2 | 0.0 | 9.2 | 0.0 | 0.0 | 18.6 | 50.6 | 0.0 | 0.0 | 5.8 | 2.0 |
Italian_Abruzzo:Alp140 | 0.02782162 | 4.2 | 5.8 | 0.0 | 5.6 | 0.0 | 0.0 | 21.6 | 41.4 | 0.0 | 1.6 | 17.8 | 2.0 |
Italian_Abruzzo:ALP161 | 0.02310929 | 6.0 | 7.4 | 0.0 | 1.2 | 0.0 | 0.0 | 21.2 | 54.6 | 0.0 | 0.0 | 8.6 | 1.0 |
Italian_Abruzzo:Alp162 | 0.01846710 | 0.0 | 10.2 | 0.0 | 0.8 | 0.0 | 0.0 | 21.2 | 32.2 | 0.0 | 0.0 | 32.8 | 2.8 |
Italian_Abruzzo:ALP205 | 0.01634055 | 1.6 | 8.0 | 0.0 | 4.6 | 0.0 | 0.0 | 17.6 | 20.4 | 0.0 | 0.0 | 47.4 | 0.4 |
Italian_Abruzzo:Alp380 | 0.01672006 | 0.0 | 6.8 | 1.2 | 2.6 | 0.0 | 0.0 | 21.0 | 22.0 | 0.0 | 5.0 | 41.4 | 0.0 |
Italian_Abruzzo:Alp503 | 0.02798765 | 5.4 | 6.4 | 0.0 | 5.6 | 0.0 | 0.0 | 21.6 | 53.4 | 0.0 | 0.0 | 7.6 | 0.0 |
Italian_Abruzzo:Alp616 | 0.02406986 | 5.0 | 8.6 | 0.0 | 2.2 | 0.4 | 0.0 | 19.0 | 53.8 | 0.0 | 0.0 | 11.0 | 0.0 |
Italian_Abruzzo:ItalyAbruzzo13 | 0.02815055 | 0.0 | 10.6 | 0.0 | 7.2 | 0.0 | 0.0 | 20.2 | 53.2 | 0.0 | 0.0 | 5.8 | 3.0 |
Italian_Abruzzo:ItalyAbruzzo14 | 0.02084826 | 1.0 | 7.0 | 0.0 | 2.0 | 0.0 | 0.0 | 22.2 | 36.8 | 0.0 | 0.0 | 31.0 | 0.0 |
Italian_Abruzzo:ItalyAbruzzo15 | 0.01890400 | 0.8 | 8.6 | 0.0 | 0.0 | 0.0 | 0.0 | 22.6 | 41.6 | 0.0 | 0.0 | 24.8 | 1.6 |
Italian_Abruzzo:ItalyAbruzzo16 | 0.01895165 | 0.4 | 5.0 | 0.0 | 0.0 | 0.0 | 0.0 | 21.4 | 19.8 | 0.0 | 1.0 | 49.8 | 2.6 |
Italian_Abruzzo:ItalyAbruzzo17 | 0.01732816 | 3.8 | 4.4 | 0.0 | 5.0 | 0.0 | 0.0 | 25.6 | 51.4 | 0.0 | 0.0 | 9.4 | 0.4 |
Italian_Abruzzo:ItalyAbruzzo19 | 0.01991065 | 3.2 | 6.6 | 0.0 | 0.0 | 0.0 | 0.0 | 21.4 | 26.0 | 0.0 | 0.0 | 41.6 | 1.2 |
Italian_Abruzzo:ItalyAbruzzo20 | 0.01974145 | 6.2 | 8.0 | 0.0 | 0.0 | 0.0 | 0.0 | 21.8 | 51.0 | 0.0 | 0.0 | 12.2 | 0.8 |
Italian_Abruzzo:ItalyAbruzzo21 | 0.02732278 | 3.6 | 3.6 | 0.0 | 1.0 | 0.8 | 0.0 | 20.4 | 30.4 | 0.0 | 0.0 | 36.6 | 3.6 |
Italian_Abruzzo:ItalyAbruzzo22 | 0.01732184 | 4.0 | 3.2 | 0.0 | 0.0 | 0.0 | 0.0 | 22.6 | 37.4 | 0.0 | 0.0 | 31.6 | 1.2 |
Italian_Abruzzo:ItalyAbruzzo23 | 0.02636423 | 2.6 | 5.4 | 0.0 | 0.0 | 0.0 | 5.6 | 17.6 | 30.2 | 0.0 | 0.0 | 38.6 | 0.0 |
Italian_Abruzzo:ItalyAbruzzo9 | 0.02822279 | 3.6 | 7.0 | 0.0 | 10.6 | 0.0 | 0.0 | 24.0 | 40.0 | 0.0 | 0.0 | 14.6 | 0.2 |
Sicilian_East:EastSicilian2H | 0.01960721 | 5.8 | 3.8 | 0.0 | 4.0 | 1.0 | 0.0 | 17.4 | 33.2 | 0.0 | 8.4 | 25.2 | 1.2 |
Sicilian_East:EastSicilian5H | 0.02540931 | 0.0 | 8.4 | 0.0 | 11.2 | 0.4 | 0.0 | 21.6 | 36.8 | 0.0 | 0.0 | 20.0 | 1.6 |
Sicilian_East:EastSicilian8H | 0.01839250 | 5.2 | 8.8 | 0.0 | 8.2 | 0.8 | 0.0 | 12.6 | 14.4 | 0.0 | 11.2 | 34.4 | 4.4 |
Sicilian_West:WestSicilian10H | 0.02170191 | 0.0 | 10.6 | 0.0 | 2.8 | 0.0 | 0.0 | 17.2 | 23.4 | 0.0 | 11.8 | 28.6 | 5.6 |
Sicilian_West:WestSicilian4H | 0.01892833 | 0.6 | 10.0 | 0.0 | 0.8 | 4.0 | 0.0 | 13.8 | 13.4 | 2.6 | 4.2 | 47.4 | 3.2 |
Sicilian_West:WestSicilian7H | 0.02291438 | 0.0 | 8.6 | 0.0 | 3.2 | 2.8 | 0.0 | 17.6 | 32.4 | 0.0 | 0.0 | 30.2 | 5.2 |
Average | 0.02201117 | 2.7 | 7.3 | 0.0 | 3.5 | 0.4 | 0.2 | 20.1 | 36.0 | 0.1 | 1.7 | 26.2 | 1.8 |
In other words, taking the Average row and clustering the source samples together as studies often do:
64.0% ANATOLIA_N (Tepecik-Ciftlik + Kumtepe included) +
10.0% CHG/IRAN +
3.5% LEVANT_N +
20.3% STEPPE (Khvalynsk_En + Karelia_HG) +
1.8% WHG +
0.4% MOROCCO_EN (TAFORALT-LIKE)
Does that really look crazy or extremely implausible, considering the entire history of the region? I don't think so.
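For anyone who wants to check the arithmetic, here is a rough Python sketch of how the Average row collapses into those six clustered components. The numbers are copied from the Average row of the table. The grouping itself is my reading of the totals rather than something stated column by column: in particular, putting Karelia_HG under STEPPE and Grotta_Continenza_Meso under WHG are assumptions (the latter is 0.0 on average, so it does not change anything), and the names `GROUPS` and `collapse` are just illustrative.

```python
# Average row copied from the table above (percentages).
average = {
    "GEO_CHG": 2.7,
    "IRN_Ganj_Dareh_N": 7.3,
    "ITA_Grotta_Continenza_Meso": 0.0,
    "Levant_PPNB": 3.5,
    "MAR_EN": 0.4,
    "RUS_Karelia_HG": 0.2,
    "RUS_Khvalynsk_En": 20.1,
    "TUR_Barcin_N": 36.0,
    "TUR_Boncuklu_N": 0.1,
    "TUR_Kumtepe_N": 1.7,
    "TUR_Tepecik_Ciftlik_N": 26.2,
    "WHG": 1.8,
}

# Assumed grouping of source columns into broader components,
# inferred from the clustered totals (not an official scheme).
GROUPS = {
    "ANATOLIA_N": ["TUR_Barcin_N", "TUR_Boncuklu_N",
                   "TUR_Kumtepe_N", "TUR_Tepecik_Ciftlik_N"],
    "CHG/IRAN": ["GEO_CHG", "IRN_Ganj_Dareh_N"],
    "LEVANT_N": ["Levant_PPNB"],
    "STEPPE": ["RUS_Khvalynsk_En", "RUS_Karelia_HG"],
    "WHG": ["WHG", "ITA_Grotta_Continenza_Meso"],
    "MOROCCO_EN": ["MAR_EN"],
}


def collapse(row, groups):
    """Sum the source percentages that belong to each broader component."""
    return {
        name: round(sum(row[source] for source in sources), 1)
        for name, sources in groups.items()
    }


clustered = collapse(average, GROUPS)
for name, pct in clustered.items():
    print(f"{pct:5.1f}% {name}")
```

The same `collapse` call can be run on any individual row of the table to see how much a given person deviates from the clustered average.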