
Admixtools Modern Italian 4-way aDNA model

Jovialis

[Attached image: 1746633660181.png]

Code:
library(admixtools)
library(tidyverse)

prefix   <- "C:/Users/jovialis/Documents/Bioinformatics/Jovialis_HO_merge/merged_HO"
f2_dir   <- "C:/Users/jovialis/Documents/Bioinformatics/Jovialis_HO_merge/f2_blocks"

target   <- "Italian_North.HO"
left     <- c(
    "Russia_Samara_EBA_Yamnaya.AG",
    "Luxembourg_Mesolithic.AG",
    "Turkey_Marmara_Barcin_N.AG",
    "Iran_GanjDareh_N.AG"
)

outgroups <- c(
    "Ethiopia_4500BP.AG", "Russia_UstIshim_IUP.DG", "Italy_Epigravettian.AG.BY.AA", "Russia_YuzhniyOleniyOstrov_Mesolithic.AG", "Russia_MA1_UP.SG", "Georgia_Satsurblia_LateUP.SG", "Jordan_PPNB.AG")

mypops <- c(target, left, outgroups)

extract_f2(
    prefix, f2_dir,
    pops      = mypops,
    overwrite = TRUE,
    auto_only = TRUE,
    blgsize   = 0.05
)

f2_blocks <- f2_from_precomp(
    f2_dir,
    pops   = mypops,
    afprod = TRUE
)

qpwave_results <- qpwave(
    f2_blocks,
    left    = left,
    right   = outgroups,
    verbose = TRUE
)
print(qpwave_results$rankdrop)

# Smallest p-value across the qpWave rank tests
min_p <- min(qpwave_results$rankdrop$p, na.rm = TRUE)

# Direct run: f4-statistics are computed from the genotype files (allsnps = TRUE).
# This run happens regardless of the qpWave result.
res_geno <- qpadm(
    prefix,
    left      = left,
    right     = outgroups,
    target    = target,
    allsnps   = TRUE,
    auto_only = TRUE,
    verbose   = TRUE,
    return_f4 = TRUE
)
print(res_geno$weights)
print(res_geno$popdrop)

# If any qpWave rank test has p <= 0.05 (or no p-value is available),
# also fit the model from the precomputed f2 blocks for comparison.
if (is.na(min_p) || min_p <= 0.05) {
    res_f2 <- qpadm(
        f2_blocks,
        left      = left,
        right     = outgroups,
        target    = target,
        verbose   = TRUE
    )
    print(res_f2$weights)
    print(res_f2$popdrop)
}
 
Fitting one individual’s genome (i.e. Jovialis) against four distal sources naturally inflates the SEs, so Z-scores of ~2 are expected. As long as the overall qpAdm p-value is >0.05 and all ancestry proportions stay between 0 and 1 within ±1 SE, the model is reliable, even if a few Z-scores fall below 3.
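
For anyone who wants to apply that rule of thumb to their own output, here is a minimal sketch in R. It is not part of admixtools: `check_qpadm_fit` is a hypothetical helper, and it assumes a weights table with `left`, `weight` and `se` columns (the shape I would expect from `res_geno$weights`) plus the overall model p-value passed in by hand. It also reads "between 0 and 1 within ±1 SE" as "inside [0, 1] with one SE of slack", which is my interpretation. The numbers in the example are made up for illustration and are not results from this thread.

```r
# Hypothetical helper (not an admixtools function): apply the rule of thumb
# "p > 0.05 and every proportion inside [0, 1] with 1 SE of slack".
check_qpadm_fit <- function(weights, p_value, alpha = 0.05) {
  # Z-score of each ancestry proportion: estimate divided by its standard error
  z <- weights$weight / weights$se
  # Assumed reading of the rule: weight may undershoot 0 or overshoot 1 by at most 1 SE
  within_bounds <- (weights$weight + weights$se >= 0) &
                   (weights$weight - weights$se <= 1)
  list(
    z         = setNames(round(z, 2), weights$left),
    p_ok      = p_value > alpha,
    bounds_ok = all(within_bounds),
    passes    = p_value > alpha && all(within_bounds)
  )
}

# Made-up numbers for illustration only (NOT output from the model above)
example_weights <- data.frame(
  left   = c("Yamnaya", "WHG", "Anatolia_N", "Iran_N"),
  weight = c(0.28, 0.04, 0.58, 0.10),
  se     = c(0.05, 0.03, 0.06, 0.05)
)
fit <- check_qpadm_fit(example_weights, p_value = 0.21)
print(fit)
```

In this toy table two of the sources have Z-scores below 3, yet the check still passes because it only looks at the p-value and the bounds, which is the point being made above.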
 
So it's outputting two different results for the target pop: one is based on directly computed f4-statistics and the other computes f4-statistics indirectly from precomputed f2. Which one is "better"? The direct one, I'm guessing.
 
The first one is the better one. It is the run with "ALLSNPS = TRUE", which tends to be the setting used in aDNA studies.

The other is more for comparison. It is based on pre-computed F2 with MAXMISS = 1 and a block size of 0.05 (blgsize = 0.05 in the code, which admixtools reads as Morgans, i.e. 5 cM). That one will always yield the same results.

Sometimes ALLSNPS = TRUE yields somewhat different results, but this model is very stable and consistent. Otherwise, it is best to run the model a few times and confirm that the same result comes up at least twice.
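
To see how far the two runs actually differ, something like the following could be used. This is only a rough sketch: `compare_weights` is a hypothetical helper, not an admixtools function, and it assumes both weight tables have `left`, `weight` and `se` columns. The two estimates come from the same data, so dividing by the combined SE is just a yardstick, not a formal test. The example values are invented.

```r
# Hypothetical helper: line up two qpAdm weight tables by source population
compare_weights <- function(w_direct, w_f2) {
  m <- merge(w_direct, w_f2, by = "left", suffixes = c("_direct", "_f2"))
  m$diff <- m$weight_direct - m$weight_f2
  # Gap relative to the combined uncertainty of both estimates (rough yardstick only)
  m$diff_in_se <- m$diff / sqrt(m$se_direct^2 + m$se_f2^2)
  m[, c("left", "weight_direct", "weight_f2", "diff", "diff_in_se")]
}

# Invented two-source example (NOT results from this thread)
w_direct <- data.frame(left = c("A", "B"), weight = c(0.60, 0.40), se = c(0.04, 0.04))
w_f2     <- data.frame(left = c("A", "B"), weight = c(0.57, 0.43), se = c(0.03, 0.03))
cmp <- compare_weights(w_direct, w_f2)
print(cmp)

# With real output the call would presumably look like:
# compare_weights(res_geno$weights, res_f2$weights)
```

The same function can be pointed at two repeated ALLSNPS = TRUE runs to check that they agree.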
 
How did you make the graph bro? What program did you use?
 
Fitting one individual’s genome (i.e. Jovialis) against four distal sources naturally inflates the SEs, so Z-scores of ~2 are expected. As long as the overall qpAdm p-value is >0.05 and all ancestry proportions stay between 0 and 1 within ±1 SE, the model is reliable, even if a few Z-scores fall below 3.
I have seen one study that published qpAdm models where the Z-scores were 1.3–1.5.

I also asked ChatGPT to compile a list of studies that publish qpAdm models with Z-scores below 3:

1. Lazaridis et al. (2022) – “The genetic history of the Southern Arc”
[Source: https://www.science.org/doi/10.1126/science.abm4247]

  • This massive study includes dozens of qpAdm models.
  • On page S79 of the supplementary info (Table S2.2), you’ll find many models with Z-scores between 1.3 and 2.9, especially for complex three-way or four-way models.
  • For instance, models for populations like Mycenaeans, Anatolia_MLBA, or Armenia_MBA sometimes include Z-scores in the 1.5–2.8 range.
  • Authors note that p-values and archaeological plausibility were key, and Z ≥ 3 was not required in all cases.



2. Narasimhan et al. (2019) – “The formation of human populations in South and Central Asia”
[Source: https://www.science.org/doi/10.1126/science.aat7487]

  • This study also used many qpAdm models to estimate admixture in Central/South Asia.
  • In Extended Data Table 3, there are models with Z-scores below 3.
  • For example, some models for Indus_Periphery or Swat Valley samples used Z-scores around 1.4–2.6, with acceptable p-values.



3. Harney et al. (2021) – “Ancient DNA from Chalcolithic Israel”
[Source: https://www.cell.com/cell/fulltext/S0092-8674(21)00096-7]

  • This study models Bronze and Chalcolithic Levantines.
  • In Table S5 (Supplement), some Z-scores for model components are ~1.7 or lower, yet the models are accepted based on good p-values and overall historical sense.
 