Admixtools / admixtools2 tutorial for Windows

** R
** data
*** moving datasets to lazyload DB
** inst
** byte-compile and prepare package for lazy loading
Note: break used in wrong context: no loop is visible
** help
*** installing help indices
*** copying figures
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (admixtools)

Got it to work in the R Console. Within RStudio itself I was getting errors.
 
Check out this thread:


In addition to @eupator, there are a few other members who could help you out.

There are times when G25 does replicate qpAdm, and times when it doesn't. Frankly, you should endeavor to use admixtools for optimal analysis. It is the suite of tools academics use for peer-reviewed studies. Once you get it set up, it becomes second nature. I use AI to help me figure it out.
For several months I've spent countless hours experimenting with G25 to see how well it can replicate admixture results from formal studies that utilise qpAdm, ADMIXTURE, etc. I agree that G25 can be hit and miss. The problem is that there is no way to validate the results except by having pre-existing knowledge.

Anyway, I've decided to learn ADMIXTURE for the time being, and once I've got the hang of it I might move on to qpAdm. BTW, I've noticed a few mentions in this thread of converting Eigenstrat format files to PLINK format. Is there any benefit to this, given that qpAdm supports Eigenstrat files?
 
With Eigenstrat you can take advantage of other tools, notably smartpca, which is the same PCA method used in many studies.

However, I have tried converting back from PLINK myself to see where I plot, and I run into an issue where the file becomes massive for some reason. I haven't been able to figure it out, and I've asked around but haven't found the answer. Like you, I am self-taught and still learning.

Welcome to the forum btw!
 
Thanks! Yeah, I've downloaded the Reich dataset (the HO one), which of course is in Eigenstrat format. Since ADMIXTURE also supports Eigenstrat, I don't see much reason to convert the dataset to PLINK. I just need to learn how to create subsets of the Reich dataset. I notice, however, that the ADMIXTURE tutorials I've found tend to use PLINK files, possibly because they used an older version of ADMIXTURE that didn't support Eigenstrat, and/or because some of the operations in the tutorials rely on PLINK.

Btw, when you converted back from the PLINK file, did you convert to Eigenstrat or VCF? (I read that VCF files are considerably larger than the other formats.)
 
There is an easy way to avoid this.

You have to filter the sample being merged so that it contains only SNPs that already exist in the large dataset. Otherwise, every extra SNP in the sample that doesn't occur in the large dataset forces PLINK to add a no-call for that SNP to every sample already in it, which swells the dataset.
# Step 1: write out the list of SNP IDs present in the primary (large) dataset.
# Step 2: use that list to filter the sample being merged (in this case, my sample).
# Step 3: merge the filtered sample with the primary dataset as usual.
#
# The same steps as plain plink commands, starting from a raw 23andMe-format file:
# plink --allow-no-sex --bfile v52.2_1240K_public --write-snplist --out v52.2_1240K_clean
# plink --23file PLg.txt --extract v52.2_1240K_clean.snplist --make-bed --out PLg_v54p1_genome

getwd()  # check that R is running in the folder that holds the dataset files
# Run this once to create v52.2_1240K_clean.snplist:
# system("plink --allow-no-sex --bfile v52.2_1240K_public --write-snplist --out v52.2_1240K_clean")
system("plink --bfile S2949 --extract v52.2_1240K_clean.snplist --make-bed --out S2949_filtered")

system("plink --bfile S2949_filtered --bmerge v52.2_1240K_public.bed v52.2_1240K_public.bim v52.2_1240K_public.fam --out Ho_Out")


--write-snplist is the option that creates the clean SNP list. Only the SNPs in this list, which are the SNPs in the large dataset, will be used for the merge. This matters because the new file you are merging may contain other SNPs, or may name the same SNPs differently, which also causes issues. I've noticed that SNP names can differ between files depending on the format.
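
To make the swelling effect concrete, here is a minimal Python sketch of the bookkeeping. It does not call plink; it only models a union-style merge on lists of SNP IDs. The function names, the SNP IDs, and the sample count of 1000 are all made up for illustration.

```python
def merged_missing_calls(primary_snps, n_primary_samples, sample_snps):
    """Count the no-call genotypes a union-style merge would introduce."""
    primary = set(primary_snps)
    sample = set(sample_snps)
    extra = sample - primary    # SNPs found only in the new sample
    absent = primary - sample   # primary SNPs the new sample lacks
    return {
        "shared_snps": len(primary & sample),
        "extra_snps": len(extra),
        "merged_snps": len(primary | sample),
        # each existing sample receives a no-call at every extra SNP
        "no_calls_added_to_primary": len(extra) * n_primary_samples,
        "no_calls_in_new_sample": len(absent),
    }


def filter_to_primary(primary_snps, sample_snps):
    """Keep only the sample SNPs that also exist in the primary dataset."""
    primary = set(primary_snps)
    return [snp for snp in sample_snps if snp in primary]


# Hypothetical toy data: 4 SNPs in the primary dataset, 1000 samples.
primary = ["rs1", "rs2", "rs3", "rs4"]
sample = ["rs2", "rs3", "rs9", "rs10"]

before = merged_missing_calls(primary, 1000, sample)
after = merged_missing_calls(primary, 1000, filter_to_primary(primary, sample))
print("unfiltered merge:", before)
print("filtered merge:  ", after)
```

In this toy case the unfiltered merge grows the SNP set from 4 to 6 and adds 2000 no-calls to the existing samples, while the filtered merge leaves the primary dataset's SNP set unchanged.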
 
Thanks! I did this when merging previously.

What I meant is that once it is in PLINK format the size is fine and comparable to the previous Eigenstrat files. Converting back to Eigenstrat from PLINK, it becomes something like 10x bigger than the original Eigenstrat file, despite only having 1 sample added (with no extra SNPs, just the ones native to the original Eigenstrat dataset).

At any rate, thanks for that information, because it will be useful when I re-merge my sample once the new updates for AADR come out.
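
Nobody in the thread confirmed the cause, so treat this as a guess: one possible contributor is that the conversion wrote plain-text Eigenstrat, while the original download was in a packed binary format (PACKEDANCESTRYMAP). A PLINK .bed file stores each genotype in 2 bits, whereas a text .geno file stores one ASCII character per genotype. The sketch below just does that arithmetic; the function names and the sample and SNP counts are made-up round numbers, not the real dataset dimensions.

```python
import math


def plink_bed_bytes(n_samples, n_snps):
    """Approximate size of a SNP-major PLINK .bed file."""
    # 3 header bytes, then every SNP packs 4 genotypes into each byte
    return 3 + n_snps * math.ceil(n_samples / 4)


def eigenstrat_text_geno_bytes(n_samples, n_snps):
    """Approximate size of an unpacked (text) Eigenstrat .geno file."""
    # one character per genotype plus a newline at the end of each SNP row
    return n_snps * (n_samples + 1)


# Hypothetical dimensions, chosen only for illustration.
n_samples, n_snps = 10_000, 600_000
bed = plink_bed_bytes(n_samples, n_snps)
geno = eigenstrat_text_geno_bytes(n_samples, n_snps)
print(f".bed  ~ {bed / 1e9:.2f} GB")
print(f".geno ~ {geno / 1e9:.2f} GB")
print(f"ratio ~ {geno / bed:.1f}x")
```

Under these assumptions the text file comes out roughly four times larger, which would account for part of a 10x increase but not necessarily all of it.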
 
I think I have seen that also.
 
What's the most accurate thing for mobile users? I use G25, but I'm reading it's not reliable at all.
 
For mobile you are limited to PCA-based admixture calculators and oracles such as G25. To run anything like ADMIXTURE or qpAdm, which are the academic standard, you need a Linux/Unix-based environment on a desktop or laptop.
 
