Admixtools admixtools2 TUTORIAL for WINDOWS.

AIexSuvorov · Jul 22, 2023

eupator said:
So far so good,

eupator said:
The 1240K files are the high-SNP set; the HO files have fewer SNPs but include the modern samples from the Human Origins database.

At this point I should be getting an f2_dir file, but I don't understand how. Could you please explain step by step what I have to do? Thanks for your patience.

eupator · Jul 23, 2023

AIexSuvorov said:
At this point I should be getting an f2_dir file, but I don't understand how. Could you please explain step by step what I have to do? Thanks for your patience.

These are directories generated automatically by the extract_f2 command to hold the f2 statistics.

You can skip these commands

extract_f2(prefix, my_f2_dir, pops = mypops, overwrite = TRUE, maxmiss = 1)
f2_blocks = f2_from_precomp(my_f2_dir, pops = mypops, afprod = TRUE)

if you want and go straight to f4 with

results = qpadm(prefix, left, right, target, allsnps = TRUE)
results$weights
results$popdrop

after setting your left, right and target.
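
"Setting your left, right and target" just means assigning three character vectors of population labels before calling qpadm. A minimal base-R sketch of that step, with a sanity check that no label sits in more than one set. The labels are examples from this thread, and the right list is shortened purely for illustration:

```r
# Target population to be modelled (one label).
target <- c("Greek_1.DG")

# Candidate source populations.
left <- c("Spain_Greek_oAegean", "Polish.DG", "Armenian.DG")

# Outgroup/reference populations (shortened here; a real list is longer).
right <- c("Mbuti.DG", "Han.DG", "Papuan.DG")

# Collect any label that appears in more than one of the three sets.
overlap <- c(intersect(target, left),
             intersect(target, right),
             intersect(left, right))

# The combined list, built from the parts instead of typed out by hand.
mypops <- c(right, target, left)

# Stop early if the sets overlap or a label is repeated.
stopifnot(length(overlap) == 0, !anyDuplicated(mypops))
```

Building mypops from the three vectors avoids the copy-and-paste drift that creeps in when the same long list is typed twice.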

AIexSuvorov · Jul 23, 2023

eupator said:
These are directories generated automatically by the extract_f2 command to hold the f2 statistics.

You can skip these commands

extract_f2(prefix, my_f2_dir, pops = mypops, overwrite = TRUE, maxmiss = 1)
f2_blocks = f2_from_precomp(my_f2_dir, pops = mypops, afprod = TRUE)

if you want and go straight to f4 with

results = qpadm(prefix, left, right, target, allsnps = TRUE)
results$weights
results$popdrop

after setting your left, right and target.

Can I just ignore this "Error: object 'my_f2_dir' not found" and go to the next step?

eupator · Jul 24, 2023

You haven't set your left, right, target.

You also get that error because you haven't set an f2 dir and a mypops list.

Skip the extract f2 step and go straight to f4 like I wrote above.
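
The "object ... not found" message is ordinary R behaviour: a name was used before anything was assigned to it. A small base-R sketch, assuming only the variable names used in this thread, that reports which inputs are still undefined. The helper `missing_inputs` and the placeholder path are hypothetical, and the demo runs in its own environment so it does not depend on what is in the session:

```r
# Hypothetical helper: return the names in `needed` that are not yet defined
# in the given environment.
missing_inputs <- function(needed, env = globalenv()) {
  defined <- vapply(needed, exists, logical(1), envir = env, inherits = FALSE)
  needed[!defined]
}

# Demo environment where only prefix and target have been assigned.
demo <- new.env()
assign("prefix", "C:/path/to/v54.1.p1_HO_public", envir = demo)  # placeholder
assign("target", c("Greek_1.DG"), envir = demo)

still_missing <- missing_inputs(c("prefix", "left", "right", "target"),
                                env = demo)
print(still_missing)
```

In a live session, calling `missing_inputs(c("prefix", "left", "right", "target"))` with the default environment would list whatever still needs to be assigned before the qpadm line can run.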

AIexSuvorov · Aug 3, 2023

prefix = "C:/Users/1l16m/Downloads/qpAdm_files/1240K+HO/v54.1.p1_HO_public"
library(admixtools)
library(tidyverse)

target = c('Greek_1.DG')
left= c('Spain_Greek_oAegean','Polish.DG','Armenian.DG')
right = c('Mbuti.DG', 'Ethiopia_4500BP.SG', 'Russia_Ust_Ishim.DG', 'Czech_Vestonice16', 'Belgium_UP_GoyetQ116_1', 'Russia_Kostenki14.SG', 'Russia_AfontovaGora3', 'Italy_North_Villabruna_HG', 'Han.DG', 'Papuan.DG', 'Karitiana.DG', 'Georgia_Satsurblia.SG', 'Iran_GanjDareh_N', 'Turkey_Epipaleolithic', 'Morocco_Iberomaurusian', 'Jordan_PPNB', 'Russia_Karelia_HG.SG', 'Russia_Samara_EBA_Yamnaya', 'Czech_CordedWare', 'Armenia_LBA.SG', 'ONG.SG')
mypops = c('Mbuti.DG', 'Ethiopia_4500BP.SG', 'Russia_Ust_Ishim.DG', 'Czech_Vestonice16', 'Belgium_UP_GoyetQ116_1', 'Russia_Kostenki14.SG', 'Russia_AfontovaGora3', 'Italy_North_Villabruna_HG', 'Han.DG', 'Papuan.DG', 'Karitiana.DG', 'Georgia_Satsurblia.SG', 'Iran_GanjDareh_N', 'Turkey_Epipaleolithic', 'Morocco_Iberomaurusian', 'Jordan_PPNB', 'Russia_Karelia_HG.SG', 'Russia_Samara_EBA_Yamnaya', 'Czech_CordedWare', 'Armenia_LBA.SG', 'ONG.SG','Greek_1.DG','Spain_Greek_oAegean','Polish.DG','Armenian.DG')
results = qpadm(prefix, left, right, target, allsnps = TRUE)
results$weights
results$popdrop

Greek_1.DG = 37.4% Spain_Greek_oAegean + 53.9% Polish.DG + 8.71% Armenian.DG

This looks a lot more Polish than your result. Maybe I picked a different Greek sample, one from near the Bulgarian border?

I will add the other two Emporion samples - I8208 and I8205 - which are both under label 'Spain_Hellenistic_oAegean' in the latest v54.1.p1_HO_public

target = c('Greek_1.DG')
left= c('Spain_Greek_oAegean','Spain_Hellenistic_oAegean','Polish.DG','Armenian.DG')
right = c('Mbuti.DG', 'Ethiopia_4500BP.SG', 'Russia_Ust_Ishim.DG', 'Czech_Vestonice16', 'Belgium_UP_GoyetQ116_1', 'Russia_Kostenki14.SG', 'Russia_AfontovaGora3', 'Italy_North_Villabruna_HG', 'Han.DG', 'Papuan.DG', 'Karitiana.DG', 'Georgia_Satsurblia.SG', 'Iran_GanjDareh_N', 'Turkey_Epipaleolithic', 'Morocco_Iberomaurusian', 'Jordan_PPNB', 'Russia_Karelia_HG.SG', 'Russia_Samara_EBA_Yamnaya', 'Czech_CordedWare', 'Armenia_LBA.SG', 'ONG.SG')
mypops = c('Mbuti.DG', 'Ethiopia_4500BP.SG', 'Russia_Ust_Ishim.DG', 'Czech_Vestonice16', 'Belgium_UP_GoyetQ116_1', 'Russia_Kostenki14.SG', 'Russia_AfontovaGora3', 'Italy_North_Villabruna_HG', 'Han.DG', 'Papuan.DG', 'Karitiana.DG', 'Georgia_Satsurblia.SG', 'Iran_GanjDareh_N', 'Turkey_Epipaleolithic', 'Morocco_Iberomaurusian', 'Jordan_PPNB', 'Russia_Karelia_HG.SG', 'Russia_Samara_EBA_Yamnaya', 'Czech_CordedWare', 'Armenia_LBA.SG', 'ONG.SG','Greek_1.DG','Spain_Greek_oAegean','Spain_Hellenistic_oAegean','Polish.DG','Armenian.DG')
results = qpadm(prefix, left, right, target, allsnps = TRUE)
results$weights
results$popdrop

Greek_1.DG = -27.7% Spain_Greek_oAegean + 65% Spain_Hellenistic_oAegean + 61.9% Polish.DG + 0.781% Armenian.DG

Is it possible to forbid modeling with negative values?
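
A negative weight is usually read as a sign that the model is infeasible with those sources, not as a real proportion. The sketch below is base R only and simply flags whether every weight from the four-source run above lies in [0, 1]. As far as I know, qpadm in ADMIXTOOLS 2 also documents a `constrained` argument for non-negative weights, but treat that as an assumption and check `?qpadm` before relying on it.

```r
# Weights copied from the four-source model above, written as proportions.
weights <- data.frame(
  left = c("Spain_Greek_oAegean", "Spain_Hellenistic_oAegean",
           "Polish.DG", "Armenian.DG"),
  weight = c(-0.277, 0.65, 0.619, 0.00781)
)

# Flag each weight that lies inside the interval [0, 1].
weights$in_range <- weights$weight >= 0 & weights$weight <= 1

# The model counts as feasible only if every weight is in range.
feasible <- all(weights$in_range)

print(weights)
print(feasible)
print(sum(weights$weight))  # the weights should add up to roughly 1
```

Dropping the source with the negative weight and rerunning is the usual next step. Here that would mean trying the model without Spain_Greek_oAegean.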

Jovialis · Aug 10, 2023

I posted this question in another thread, but it is also applicable here:

Jovialis said:
I've downloaded R and R studio, I "think" I installed Admixtools2 properly with the prompt. I also physically downloaded the Admixtools zip from github to my desktop. When I start up the Rstudio again, I run the Admixtools RMD file in vignettes in that folder.

I also downloaded the tar files from the Reich lab... now I'm stuck. How does one prompt R to replicate the model you made? What kind of file structure should there be?

Jovialis · Aug 10, 2023

Decided to delete everything and start over fresh, because I'm clearly doing something wrong.

I took screen caps for clarity:

Step 1:
Download and install R-Studio for Windows. https://www.rstudio.com/products/rstudio/

Did it:

Step 2:
Download and install R-tools 4.2 for Windows. https://cran.r-project.org/bin/windows/Rtools/
Did it:

Step 3:

After you complete the installation, run it (R-Studio)...
I'm stuck:

Shouldn't it automatically detect Rtools 4.2? The installation said it was successful. I tried looking up where I could find the right executable, but ChatGPT is just giving me BS.

Where can I find the right executable?

Jovialis · Aug 11, 2023

So there is certainly something wrong, but it is not something I am doing. After I install R 4.2.0 with the installer it is supposed to have an icon that will bring me to R... This is not happening for me, for some reason. Instead I just have this folder in my C drive.

Basically, when the Rtools 4.2.0 installer is "finished" in the setup wizard, there is no shortcut for R.

Jovialis · Aug 11, 2023

Perhaps there are compatibility issues with Windows 11 and Rtools 4.2.0?

I downloaded a newer version, and it worked just fine... However, ADMIXTOOLS2 does not work with newer versions of R...

Jovialis · Aug 11, 2023

Eureka! I now see the error I am making,

I need to download R 4.2 first before Rtools

https://cran.utstat.utoronto.ca/bin/windows/base/

Please excuse my ignorance, I've never worked with this stuff before.

Nevertheless, I would recommend modifying the OP to include the link, instructing people to install R 4.2 before Rtools: https://cran.utstat.utoronto.ca/bin/windows/base/

Jovialis · Aug 11, 2023

Warning message:
package ‘Rcpp’ was built under R version 4.2.3

I installed R 4.2.1. Hope this warning isn't a big deal.
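
The warning only says that the Rcpp binary was built with a later patch release (4.2.3) than the R that is running (4.2.1). Within the same 4.2 series this is typically harmless. A base-R sketch of the comparison, with both version strings taken from the post; on a real system `getRversion()` would supply the running version:

```r
running <- "4.2.1"   # R version installed in the post
built   <- "4.2.3"   # version Rcpp was built under, from the warning text

# compareVersion() returns -1, 0 or 1 for older, equal or newer.
cmp <- compareVersion(running, built)

# Compare only the major and minor parts (here "4" and "2").
same_series <- identical(
  unlist(strsplit(running, ".", fixed = TRUE))[1:2],
  unlist(strsplit(built, ".", fixed = TRUE))[1:2]
)

if (cmp < 0 && same_series) {
  message("Package was built under a newer patch release of the same R series.")
}
```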

Jovialis · Aug 11, 2023

So I was able to get pretty far in the tutorial. I was up to running models, using the one eupator provided:

Code:

# Load necessary libraries
library(admixtools)
library(tidyverse)


# Define paths (backslashes must be doubled inside R string literals)
prefix = "C:\\Users\\Jovialis\\Downloads\\v54.1.p1_HO_public\\v54.1.p1_HO_public"
my_f2_dir = "C:\\Users\\Jovialis\\Downloads\\my_f2_dir_Jovialis"


# Define populations
target = c('Greek')
left = c('Greek_Emporion','Polish.DG','Armenian.DG')
right = c('Mbuti.DG', 'Ethiopia_4500BP_published.SG', 'Russia_Ust_Ishim.DG', 'Czech_Vestonice16', 'Belgium_UP_GoyetQ116_1_published', 'Russia_Kostenki14.SG', 'Russia_AfontovaGora3', 'Italy_North_Villabruna_HG', 'Han.DG', 'Papuan.DG', 'Karitiana.DG', 'Georgia_Satsurblia.SG', 'Iran_GanjDareh_N', 'Turkey_Epipaleolithic', 'Morocco_Iberomaurusian', 'Jordan_PPNB', 'Russia_HG_Karelia.SG', 'Russia_Samara_EBA_Yamnaya', 'Czech_Bohemia_CordedWare', 'Armenia_LBA.SG', 'ONG.SG')


# Generate f2 stats
mypops = c(right, target, left)
extract_f2(prefix, my_f2_dir, pops = mypops, overwrite = TRUE, maxmiss = 1)
f2_blocks = f2_from_precomp(my_f2_dir, pops = mypops, afprod = TRUE)


# Run qpAdm model
results = qpadm(prefix, left, right, target, allsnps = TRUE)


# Display results
print(results$weights)
print(results$popdrop)
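
A note on the paths: in an R string literal the backslash is an escape character, so a Windows path needs either doubled backslashes or forward slashes. A minimal sketch showing that the two spellings are the same path. It uses the path from the post and reads nothing from disk:

```r
# Doubled backslashes: each "\\" in the source is one backslash in the string.
win_path <- "C:\\Users\\Jovialis\\Downloads\\v54.1.p1_HO_public\\v54.1.p1_HO_public"

# Forward slashes also work on Windows and need no escaping.
fwd_path <- "C:/Users/Jovialis/Downloads/v54.1.p1_HO_public/v54.1.p1_HO_public"

# Replace every backslash with a forward slash; fixed = TRUE means the
# pattern is a literal backslash, not a regular expression.
converted <- gsub("\\", "/", win_path, fixed = TRUE)

print(converted == fwd_path)
```

Forward slashes are the lower-maintenance choice when pasting a path out of Windows Explorer, since only the separators need changing.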

But it looks like I should have installed R 4.2.3 rather than R 4.2.1

I don't know what this means:

Reading allele frequencies from packedancestrymap files...
Error in match_samples(indfile$X1, indfile$X3, inds, pops) :
Populations missing in indfile:
Ethiopia_4500BP_published.SG, Belgium_UP_GoyetQ116_1_published, Russia_HG_Karelia.SG, Czech_Bohemia_CordedWare, Greek, Greek_Emporion
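
The error lists labels that do not occur in the .ind file, whose third column holds the population label for each sample. Labels can differ between dataset releases, so checking them first saves a failed run. A hedged base-R sketch: it writes a tiny stand-in .ind file to a temporary location so it runs anywhere, and the sample IDs in it are invented for illustration.

```r
# Tiny stand-in for a real .ind file: sample ID, sex, population label.
ind_file <- tempfile(fileext = ".ind")
writeLines(c(
  "S_001 M Mbuti.DG",
  "S_002 F Ethiopia_4500BP.SG",
  "S_003 M Russia_Karelia_HG.SG",
  "S_004 F Polish.DG"
), ind_file)

# Read the whitespace-separated columns.
ind <- read.table(ind_file, col.names = c("id", "sex", "pop"),
                  stringsAsFactors = FALSE)

# Labels we intend to use in the model.
wanted <- c("Mbuti.DG", "Ethiopia_4500BP_published.SG",
            "Russia_HG_Karelia.SG", "Polish.DG")

# Labels that are requested but absent from the file.
missing_pops <- setdiff(wanted, unique(ind$pop))
print(missing_pops)

# A partial match helps find the spelling the dataset actually uses.
print(grep("Karelia", unique(ind$pop), value = TRUE))
```

With the real dataset, `ind_file` would be the prefix with `.ind` appended, and `wanted` would be `mypops`.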

Jovialis · Aug 12, 2023

I figured it out! Thanks again to eupator for putting together this tutorial.

Code:

> results$weights
# A tibble: 3 × 5
  target           left                      weight      se     z
  <chr>            <chr>                      <dbl>   <dbl> <dbl>
1 Italian_South.HO Greece_Minoan_Lassithi    0.638  0.0217   29.4
2 Italian_South.HO Russia_Samara_EBA_Yamnaya 0.265  0.0181   14.7
3 Italian_South.HO Morocco_Iberomaurusian    0.0963 0.00931  10.3
> results$popdrop
# A tibble: 7 × 14
  pat      wt   dof  chisq         p f4rank Greece_Minoan_Lassithi Russia_Samara_EBA_Ya…¹ Morocco_Iberomaurusian feasible
  <chr> <dbl> <dbl>  <dbl>     <dbl>  <dbl>                  <dbl>                  <dbl>                  <dbl> <lgl>   
1 000       0    16   79.6 1.96e- 10      2                  0.638                  0.265                 0.0963 TRUE    
2 001       1    17  193.  6.85e- 32      1                  0.781                  0.219                NA      TRUE    
3 010       1    17  308.  2.43e- 55      1                  0.940                 NA                     0.0603 TRUE    
4 100       1    17  946.  2.57e-190      1                 NA                      0.755                 0.245  TRUE    
5 011       2    18  351.  1.35e- 63      0                  1                     NA                    NA      TRUE    
6 101       2    18 1719.  0              0                 NA                      1                    NA      TRUE    
7 110       2    18 3487.  0              0                 NA                     NA                     1      TRUE    
# ℹ abbreviated name: ¹Russia_Samara_EBA_Yamnaya
# ℹ 4 more variables: best <lgl>, dofdiff <dbl>, chisqdiff <dbl>, p_nested <dbl>
>

ChatGPT interpretation of the results:

Let's break down the results:

1. **Weights**:
- The `results$weights` table provides the estimated admixture proportions for the target population `Italian_South.HO` from the source populations.
- `Greece_Minoan_Lassithi`: 63.8%
- `Russia_Samara_EBA_Yamnaya`: 26.5%
- `Morocco_Iberomaurusian`: 9.63%

This suggests that the `Italian_South.HO` population can be modeled as being approximately 63.8% `Greece_Minoan_Lassithi`, 26.5% `Russia_Samara_EBA_Yamnaya`, and 9.63% `Morocco_Iberomaurusian`.

2. **Model Fit**:
- The `results$popdrop` table provides statistics on the fit of various models.
- The model with all three source populations (`pat` = `000`) has a chi-squared value of 79.6 and a p-value of `1.96e-10`. This p-value is extremely low, suggesting that the model doesn't perfectly fit the data.
- The models with only two source populations have even lower p-values and higher chi-squared values, indicating worse fits.

The model with all three source populations (`pat` = `000`) is the best fit among the ones tested, but its low p-value suggests it's still not a perfect fit.

In summary, while the three-population model provides a reasonable estimate of the admixture proportions, the low p-value and the warning message indicate potential issues with the model's fit. It might be worth exploring other combinations of source populations or considering other factors that could improve the model's fit.
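
The headline numbers in those tables can be re-derived by hand, which makes the interpretation easier to check. The sketch assumes the usual conventions: z is the weight divided by its standard error, and p is the upper tail of a chi-squared distribution with the listed degrees of freedom. The inputs are the rounded values printed above, so small differences from the table are expected, and the 0.05 cutoff is a common convention, not something the output itself states.

```r
# Weights and standard errors from results$weights, as printed.
weight <- c(0.638, 0.265, 0.0963)
se     <- c(0.0217, 0.0181, 0.00931)

# z-score: how many standard errors each weight is away from zero.
z <- weight / se
print(round(z, 1))

# Chi-squared and degrees of freedom of the full model (pat = 000).
chisq <- 79.6
dof   <- 16

# Upper-tail probability of the chi-squared distribution.
p <- pchisq(chisq, df = dof, lower.tail = FALSE)
print(p)

# Conventional reading: the model is not rejected only if p exceeds 0.05.
threshold <- 0.05
print(p > threshold)
```

On that reading, the three-source model is rejected outright even though it fits better than the nested two-source alternatives.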

Jovialis · Aug 28, 2023

From now on, I am exclusively going to use Admixtools and Eigensoft for analysis.

Thanks to AI, these tools are no longer esoteric, but you do have to wrestle with it to get it right. But with trial and error, as well as persistence, it can be achieved.

It no longer makes any sense for me to use Dodecad and G25, when I can use sophisticated professional tools to replicate academic studies.

I look forward to novel insights that will be yielded from this.

It also makes no sense to ever test with another company again, since I can model my WGS30X sample with Admixtools.

I have to say, thank you eupator, once again, for sparking this interest in me to pursue this.

I recommend everyone that is using Vahaduo to stop, and start learning how to model with Admixtools.

Norrsken · Sep 5, 2023

Hello, I keep getting this issue. What is wrong?

Jovialis · Sep 15, 2023

Norrsken said:
View attachment 14151

Hello, I keep getting this issue. What is wrong?

It would be easier if you could post the text of the output rather than an image.

ChatGPT 4.0 has been indispensable for helping to resolve errors. Try copying and pasting the output into it and see what it tells you.

It is a lot of trial and error, but it is worth it.

Jovialis · Sep 29, 2023

What does this mean, if it says I'm closest to Italian_North.HO, something is wrong here:

TSI I can understand; it is also the only one without a negative value:

Jovialis · Sep 29, 2023

Iulius said:
What does this mean, if it says I'm closest to Italian_North.HO, something is wrong here:

TSI I can understand; it is also the only one without a negative value:

Now this makes absolute sense:

eupator · Oct 1, 2023

AIexSuvorov said:
This looks a lot more Polish than your result. Maybe I picked a different Greek sample, one from near the Bulgarian border?

Greek-1.DG and Greek-2.DG are single DG samples in the 1240K database. They are from Thessaloniki, if I remember correctly, and one of them is probably mixed Pontic, because it scores high Armenia_BA.

The samples I used in my example can be found in the HO database, they are numerous.

djdarko4 · Oct 7, 2023

eupator said:
Greek-1.DG and Greek-2.DG are single DG samples in the 1240K database. They are from Thessaloniki, if I remember correctly, and one of them is probably mixed Pontic, because it scores high Armenia_BA.

The samples I used in my example can be found in the HO database, they are numerous.

I need help with running outgroup f3 statistics. Could you please guide me through that?
