Admixtools Modern Italian 4-way aDNA model

Jovialis · May 7, 2025

Code:

library(admixtools)
library(tidyverse)

# Genotype file prefix and output directory for the precomputed f2 blocks
prefix   <- "C:/Users/jovialis/Documents/Bioinformatics/Jovialis_HO_merge/merged_HO"
f2_dir   <- "C:/Users/jovialis/Documents/Bioinformatics/Jovialis_HO_merge/f2_blocks"

# Target population and the four distal sources ("left" populations)
target   <- "Italian_North.HO"
left     <- c(
    "Russia_Samara_EBA_Yamnaya.AG",
    "Luxembourg_Mesolithic.AG",
    "Turkey_Marmara_Barcin_N.AG",
    "Iran_GanjDareh_N.AG"
)

# Reference ("right") populations
outgroups <- c(
    "Ethiopia_4500BP.AG",
    "Russia_UstIshim_IUP.DG",
    "Italy_Epigravettian.AG.BY.AA",
    "Russia_YuzhniyOleniyOstrov_Mesolithic.AG",
    "Russia_MA1_UP.SG",
    "Georgia_Satsurblia_LateUP.SG",
    "Jordan_PPNB.AG"
)

mypops <- c(target, left, outgroups)

# Precompute f2 statistics, autosomes only, in blocks of 0.05 Morgan (5 cM)
extract_f2(
    prefix, f2_dir,
    pops      = mypops,
    overwrite = TRUE,
    auto_only = TRUE,
    blgsize   = 0.05
)

# Load the precomputed f2 blocks back into memory
f2_blocks <- f2_from_precomp(
    f2_dir,
    pops   = mypops,
    afprod = TRUE
)

# qpWave rank test of the sources against the reference populations
qpwave_results <- qpwave(
    f2_blocks,
    left    = left,
    right   = outgroups,
    verbose = TRUE
)
print(qpwave_results$rankdrop)

min_p <- min(qpwave_results$rankdrop$p, na.rm = TRUE)

# Genotype-based qpAdm (allsnps = TRUE); fitted whatever the qpWave p-values are
res_geno <- qpadm(
    prefix,
    left      = left,
    right     = outgroups,
    target    = target,
    allsnps   = TRUE,
    auto_only = TRUE,
    verbose   = TRUE,
    return_f4 = TRUE
)
print(res_geno$weights)
print(res_geno$popdrop)

# If any qpWave rank-drop p-value is <= 0.05, also fit the model from the
# precomputed f2 blocks for comparison
if (is.na(min_p) || min_p <= 0.05) {
    res_f2 <- qpadm(
        f2_blocks,
        left      = left,
        right     = outgroups,
        target    = target,
        verbose   = TRUE
    )
    print(res_f2$weights)
    print(res_f2$popdrop)
}

Jovialis · May 7, 2025

Fitting a single individual's genome (here, Jovialis) against four distal sources naturally inflates the standard errors (SEs), so Z-scores of ~2 are expected. As long as the overall qpAdm p-value is >0.05 and every ancestry proportion stays between 0 and 1 within ±1 SE, the model is reliable, even if a few Z-scores fall below 3.
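To make those criteria concrete, here is a minimal sketch in base R. It assumes a weights table with the columns left, weight and se, like the one printed by the script above. The source names, proportions, SEs and the model p-value are all invented for illustration, and reading "between 0 and 1 within ±1 SE" as "weight ± SE lies inside [0, 1]" is one possible interpretation, not a standard definition.

```r
# Hypothetical weights table; all numbers are invented for illustration
weights <- data.frame(
  left   = c("Source_A", "Source_B", "Source_C", "Source_D"),
  weight = c(0.28, 0.07, 0.55, 0.10),
  se     = c(0.05, 0.03, 0.06, 0.05)
)
model_p <- 0.21  # invented overall model p-value

# Z-score of each proportion: estimate divided by its standard error
weights$z <- weights$weight / weights$se

# Assumed reading of the rule: the interval weight +/- 1 SE must lie in [0, 1]
weights$within_bounds <- (weights$weight - weights$se >= 0) &
  (weights$weight + weights$se <= 1)

# The model passes if the p-value exceeds 0.05 and every source is in bounds
passes <- model_p > 0.05 && all(weights$within_bounds)

print(weights)
print(passes)
```

In this toy table two of the four sources have Z-scores below 3, yet the check still passes, which is the situation described above.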

baeticvs · May 7, 2025

So it's outputting two different results for the target pop: one is based on directly computed f4 statistics, and the other computes f4 statistics indirectly from precomputed f2. Which one is "better"? The direct one, I'm guessing.

Jovialis · May 7, 2025

baeticvs said:
So it's outputting two different results for the target pop: one is based on directly computed f4 statistics, and the other computes f4 statistics indirectly from precomputed f2. Which one is "better"? The direct one, I'm guessing.

The first one, run with allsnps = TRUE, is the better one, and it tends to be the one used in aDNA studies.

The other is more for comparison. It is based on precomputed f2 with maxmiss = 1 and a block size of 0.05 Morgan (5 cM), and it will always yield the same results.

Results from allsnps = TRUE sometimes differ between runs, but this model is very stable and consistent. Otherwise, it is best to run the model a few times until the same result comes up at least twice.
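For the comparison itself, a rough sketch of lining up the two sets of weights side by side might look like this. The two tables stand in for the weights from the genotype-based and f2-based fits; the source names and numbers are invented. Scaling the difference by the combined SE treats the two estimates as independent, which they are not (both come from the same data), so it is only a loose guide.

```r
# Invented stand-ins for the two weights tables
w_geno <- data.frame(
  left   = c("Source_A", "Source_B", "Source_C", "Source_D"),
  weight = c(0.28, 0.07, 0.55, 0.10),
  se     = c(0.05, 0.03, 0.06, 0.05)
)
w_f2 <- data.frame(
  left   = c("Source_A", "Source_B", "Source_C", "Source_D"),
  weight = c(0.30, 0.06, 0.53, 0.11),
  se     = c(0.04, 0.03, 0.05, 0.04)
)

# Join on the source name so each row holds both estimates
cmp <- merge(w_geno, w_f2, by = "left", suffixes = c("_geno", "_f2"))

# Raw difference and difference in units of the combined SE
# (assumes independence, so treat it as a rough indicator only)
cmp$diff       <- cmp$weight_geno - cmp$weight_f2
cmp$diff_in_se <- cmp$diff / sqrt(cmp$se_geno^2 + cmp$se_f2^2)

print(cmp)
```

If every diff_in_se is well under 1, as in this toy example, the two approaches tell the same story.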

Jalisciense · May 8, 2025

How did you make the graph bro? What program did you use?

baeticvs · May 9, 2025

Jalisciense said:
How did you make the graph bro? What program did you use?

Looks like matplotlib (a Python library). Just ask ChatGPT to write Python code to display vertical bars like that.
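Since the rest of the thread is in R, a similar vertical bar chart can also be sketched in base R without leaving the session. This is not how the original graph was made; the weights, SEs and the output file name qpadm_weights.pdf are invented for illustration.

```r
# Hypothetical ancestry proportions and standard errors (invented numbers)
weights <- data.frame(
  left   = c("Source_A", "Source_B", "Source_C", "Source_D"),
  weight = c(0.28, 0.07, 0.55, 0.10),
  se     = c(0.05, 0.03, 0.06, 0.05)
)

# Write the plot to a PDF file (hypothetical file name)
pdf("qpadm_weights.pdf", width = 6, height = 4)

# barplot() returns the x positions of the bar midpoints
mids <- barplot(
  weights$weight,
  names.arg = weights$left,
  ylim      = c(0, 1),
  ylab      = "Ancestry proportion",
  col       = "steelblue"
)

# Error bars of +/- 1 SE drawn as double-headed flat arrows
arrows(
  mids, weights$weight - weights$se,
  mids, weights$weight + weights$se,
  angle = 90, code = 3, length = 0.05
)

dev.off()
```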

qh777 · May 10, 2025

Jovialis said:
Fitting a single individual's genome (here, Jovialis) against four distal sources naturally inflates the standard errors (SEs), so Z-scores of ~2 are expected. As long as the overall qpAdm p-value is >0.05 and every ancestry proportion stays between 0 and 1 within ±1 SE, the model is reliable, even if a few Z-scores fall below 3.

I have seen one study that published qpAdm models where the Z-scores were 1.3-1.5.

I also asked ChatGPT to compile a list of studies that publish qpAdm models with Z-scores below 3:

1. Lazaridis et al. (2022) – “The genetic history of the Southern Arc”

[Source: https://www.science.org/doi/10.1126/science.abm4247]

This massive study includes dozens of qpAdm models.
On page S79 of the supplementary info (Table S2.2), you’ll find many models with Z-scores between 1.3 and 2.9, especially for complex three-way or four-way models.
For instance, models for populations like Mycenaeans, Anatolia_MLBA, or Armenia_MBA sometimes include Z-scores in the 1.5–2.8 range.
Authors note that p-values and archaeological plausibility were key, and Z ≥ 3 was not required in all cases.

2. Narasimhan et al. (2019) – “The formation of human populations in South and Central Asia”

[Source: https://www.science.org/doi/10.1126/science.aat7487]

This study also used many qpAdm models to estimate admixture in Central/South Asia.
In Extended Data Table 3, there are models with Z-scores below 3.
For example, some models for Indus_Periphery or Swat Valley samples used Z-scores around 1.4–2.6, with acceptable p-values.

3. Harney et al. (2021) – “Ancient DNA from Chalcolithic Israel”

[Source: https://www.cell.com/cell/fulltext/S0092-8674(21)00096-7]

This study models Bronze and Chalcolithic Levantines.
In Table S5 (Supplement), some Z-scores for model components are ~1.7 or lower, yet the models are accepted based on good p-values and overall historical sense.
