Admixtools Using Admixtools2 to model admixture

Jalisciense · Nov 5, 2024

Tautalus said:
As I told you, I didn't do it iteratively, that is, through a program that would rotate the populations automatically. I chose and changed the populations manually.
But there are functions that allow you to automate this population rotation, like qpAdm_rotation().
You can also test multiple models this way:
https://www.eupedia.com/forum/threads/multi-qpadm-automation-script.45272/

Thanks, I'll check it out; but Idk if my laptop will hold up though...

Jalisciense · Nov 27, 2024

Tautalus said:
The trickiest part of all this is the merging of our own data with the data from the Reich Lab.

There are several ways of doing it, here is one of the fastest and simplest.

It’s the conversion of your raw data in 23andme format to Bed format, then to Geno format, then merging it with the Reich data.

All this instructions assume that all the programs and packages needed for the execution in DOS or in Wsl are already installed.
The names of the files can be whatever you want.

1) Convert raw data in 23andme format to bed file (DOS session)
plink --allow-no-sex --alleleACGT --23file 23andMe.txt --make-bed --out outfile

This will produce 3 essential files, a bed, a bim and a fam file. In the fam file you could replace the -9 for 1.

2) Convert the bed file to geno (Eigenstrat format) (Wsl session)
You need to have a parameter file, with whatever name you want. I name it par.BED2GENO.par, its content are :

genotypename: outfile.bed
snpname: outfile.bim
indivname: outfile.fam
outputformat: EIGENSTRAT
genotypeoutname: outfile.geno
snpoutname: outfile.snp
indivoutname: outfile.ind

After the parameter file is done execute the command : convertf -p par.BED2GENO.par
This will produce 3 files, a geno, a ind and a snp file. In the ind file you can replace “Control” by your own name or alias.

3) Merge your data with the Reich data (In Wsl)
You need to have a parameter file. I name it par.MERGEGENO.par, its content :

geno1: outfile.geno
snp1: outfile.snp
ind1: outfile.ind
geno2: v54.1.p1_1240K_public.geno
snp2: v54.1.p1_1240K_public.snp
ind2: v54.1.p1_1240K_public.ind
genooutfilename: merged.geno
snpoutfilename: merged.snp
indoutfilename: merged.ind
outputformat: EIGENSTRAT
docheck: YES
hashcheck: YES
strandcheck: YES

Then execute the command : mergeit -p par.MERGEGENO.pa

Mergeit, according to the documentation, merges two data sets into a third, which has the union of the individuals and the intersection of the SNPs in the first two, which means that the final merged file will only have the SNPs that exist in both files and all the remaining SNPs will be discarded. This will produce a merged file smaller in number of SNPs, not in size, than the original Reich data, with all the info you need to model your admixture.

I compared the test results of this file with the test results of a merged file with all of Reich's data plus my data and they are identical.
This merge is the process that takes the longest, between half hour and an hour depending on the computer.

And that's it, after this process the merged files are ready to be used by qpadm. Now you have two datasets, one is the original Reich data, to test all the populations and the other is your merged files, to test your admixture.

This is useful, thanks.

But why do you need the DOS session? And the Wsl that are you using is the default Ubuntu distribution? Or did you change to another Linux?

Btw what is the characteristics of your computer or laptop (How much RAM you have and free, how many computer processors, what processor you are using, system type and Windows (C

/C drive? For example:

I have a laptop, my RAM is 8 GB/7.83 GB usable (But like 5-6 GB free), 8 computer processors, processor Intel ® Core ™ i5-10300H, 64-bit operating system, x64-based processor and 92.7 GB free of 215 GB.

I'm interested to merge my data to AADR (Reich Lab), so that's why I am asking those questions

Thucydides enjoyer · Jun 20, 2025

Tautalus said:
The trickiest part of all this is the merging of our own data with the data from the Reich Lab.

There are several ways of doing it, here is one of the fastest and simplest.

It’s the conversion of your raw data in 23andme format to Bed format, then to Geno format, then merging it with the Reich data.

All this instructions assume that all the programs and packages needed for the execution in DOS or in Wsl are already installed.
The names of the files can be whatever you want.

1) Convert raw data in 23andme format to bed file (DOS session)
plink --allow-no-sex --alleleACGT --23file 23andMe.txt --make-bed --out outfile

This will produce 3 essential files, a bed, a bim and a fam file. In the fam file you could replace the -9 for 1.

2) Convert the bed file to geno (Eigenstrat format) (Wsl session)
You need to have a parameter file, with whatever name you want. I name it par.BED2GENO.par, its content are :

genotypename: outfile.bed
snpname: outfile.bim
indivname: outfile.fam
outputformat: EIGENSTRAT
genotypeoutname: outfile.geno
snpoutname: outfile.snp
indivoutname: outfile.ind

After the parameter file is done execute the command : convertf -p par.BED2GENO.par
This will produce 3 files, a geno, a ind and a snp file. In the ind file you can replace “Control” by your own name or alias.

3) Merge your data with the Reich data (In Wsl)
You need to have a parameter file. I name it par.MERGEGENO.par, its content :

geno1: outfile.geno
snp1: outfile.snp
ind1: outfile.ind
geno2: v54.1.p1_1240K_public.geno
snp2: v54.1.p1_1240K_public.snp
ind2: v54.1.p1_1240K_public.ind
genooutfilename: merged.geno
snpoutfilename: merged.snp
indoutfilename: merged.ind
outputformat: EIGENSTRAT
docheck: YES
hashcheck: YES
strandcheck: YES

Then execute the command : mergeit -p par.MERGEGENO.par

Mergeit, according to the documentation, merges two data sets into a third, which has the union of the individuals and the intersection of the SNPs in the first two, which means that the final merged file will only have the SNPs that exist in both files and all the remaining SNPs will be discarded. This will produce a merged file smaller in number of SNPs, not in size, than the original Reich data, with all the info you need to model your admixture.

I compared the test results of this file with the test results of a merged file with all of Reich's data plus my data and they are identical.
This merge is the process that takes the longest, between half hour and an hour depending on the computer.

And that's it, after this process the merged files are ready to be used by qpadm. Now you have two datasets, one is the original Reich data, to test all the populations and the other is your merged files, to test your admixture.

may I ask tautalus,how do i install convertf?I have already completed the first step,i have my dna in .BED .BIM .FAM files.Although i am having trouble in completing the second step,since i am not familiar with wsl either.

Admixtools Using Admixtools2 to model admixture

Jalisciense

Regular Member

Jalisciense

Regular Member

Thucydides enjoyer

Regular Member