Jovialis
Advisor
Just in case anyone wants to know, you can convert EIGENSTRAT-format files to PLINK format by writing a parfile for convertf (part of EIGENSOFT), like so. (This is also a note for me so I can remember; re-learning this stuff after a year can take a while.)
1. Open Ubuntu and navigate to the mounted folder where you keep your files. Mine is on my D drive, in "UbuntuJovialisHome":
Code:cd /mnt/d/UbuntuJovialisHome/
2. Create the parfile. I use nano:
Code:nano parfile
3. In the nano editor, type this (one setting per line), then save and exit:
Code:
genotypename: /mnt/d/UbuntuJovialisHome/v62.0_HO_public.geno
snpname: /mnt/d/UbuntuJovialisHome/v62.0_HO_public.snp
indivname: /mnt/d/UbuntuJovialisHome/v62.0_HO_public.ind
outputformat: PACKEDPED
genotypeoutname: /mnt/d/UbuntuJovialisHome/data.bed
snpoutname: /mnt/d/UbuntuJovialisHome/data.bim
indivoutname: /mnt/d/UbuntuJovialisHome/data.fam
familynames: NO
4. Finally, run convertf with the parfile to produce the BED, BIM, and FAM files:
Code:convertf -p /mnt/d/UbuntuJovialisHome/parfile
After a while it will finish processing and write the output files to the destination set in the parfile.
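If you'd rather not type the parfile by hand in nano, the same file can be written straight from the shell. This is only a sketch: the function name `write_parfile` and its arguments (input prefix, output prefix, optional parfile path) are made up for illustration and are not part of EIGENSOFT. The keys are the same ones shown in step 3.

```shell
# Hypothetical helper (not part of EIGENSOFT): write a convertf parfile
# for an EIGENSTRAT -> PLINK binary (PACKEDPED) conversion.
# Usage: write_parfile <input_prefix> <output_prefix> [parfile_path]
write_parfile() {
    local in_prefix="$1" out_prefix="$2" parfile="${3:-parfile}"
    cat > "$parfile" <<EOF
genotypename: ${in_prefix}.geno
snpname: ${in_prefix}.snp
indivname: ${in_prefix}.ind
outputformat: PACKEDPED
genotypeoutname: ${out_prefix}.bed
snpoutname: ${out_prefix}.bim
indivoutname: ${out_prefix}.fam
familynames: NO
EOF
}

# Example with the paths from this post; uncomment to use:
# write_parfile /mnt/d/UbuntuJovialisHome/v62.0_HO_public /mnt/d/UbuntuJovialisHome/data
# convertf -p parfile
```

Passing prefixes rather than full file names keeps the three input names and the three output names in step with each other, which is the easiest thing to get wrong when editing the file by hand.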
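Optional: here's a quick way to check from Ubuntu that the output files exist once it has finished (or you can just navigate to the folder). With the parfile above, the outputs are named data.bed, data.bim, and data.fam:
Code:ls -lh /mnt/d/UbuntuJovialisHome/data.*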
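A slightly stricter check than eyeballing the listing is to confirm that all three files exist, are non-empty, and to count what they contain. A minimal sketch, assuming the usual PLINK layout of one SNP per line in the .bim and one individual per line in the .fam; `check_plink_set` is a hypothetical helper name, not a PLINK or EIGENSOFT command.

```shell
# Hypothetical helper: confirm that <prefix>.bed/.bim/.fam exist and are
# non-empty, then report how many SNPs (.bim lines) and individuals
# (.fam lines) were written.
# Usage: check_plink_set <prefix>
check_plink_set() {
    local prefix="$1" ext
    for ext in bed bim fam; do
        if [ ! -s "${prefix}.${ext}" ]; then
            echo "missing or empty: ${prefix}.${ext}" >&2
            return 1
        fi
    done
    echo "SNPs: $(wc -l < "${prefix}.bim" | tr -d ' ')"
    echo "individuals: $(wc -l < "${prefix}.fam" | tr -d ' ')"
}

# Example; the prefix is whatever you set in genotypeoutname, minus ".bed".
# Uncomment to use:
# check_plink_set /mnt/d/UbuntuJovialisHome/data
```

The individual count should match the number of lines in the original .ind file, and the SNP count should match the .snp file, so this doubles as a quick sanity check on the conversion.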
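I finally have the process documented for merging my hg19-aligned 23andMe-format txt file (produced last year when I processed it from FASTQ) with the AADR dataset. Note that the commands below read the AADR set as v62.0_HO_public.bed/.bim/.fam, so if your convertf outputs are called data.* as in the parfile above, rename them or change --bfile accordingly.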
Code:
# Step 1: Convert the 23andMe file to PLINK binary format.
plink --23file /mnt/d/UbuntuJovialisHome/Jovialis_sorted_marked_23andMe_V3.txt --make-bed --out /mnt/d/UbuntuJovialisHome/Jovialis_sorted_marked
# Step 2a: Extract SNPs from the Jovialis dataset.
plink --bfile /mnt/d/UbuntuJovialisHome/Jovialis_sorted_marked --write-snplist --out /mnt/d/UbuntuJovialisHome/jovialis_snp_list
# Step 2b: Extract SNPs from the AADR (v62.0_HO_public) dataset.
plink --bfile /mnt/d/UbuntuJovialisHome/v62.0_HO_public --write-snplist --allow-no-sex --out /mnt/d/UbuntuJovialisHome/v62_snp_list
# Step 3: Find common SNPs between the two datasets (Jovialis and AADR).
comm -12 <(sort /mnt/d/UbuntuJovialisHome/jovialis_snp_list.snplist) <(sort /mnt/d/UbuntuJovialisHome/v62_snp_list.snplist) > /mnt/d/UbuntuJovialisHome/common_snps.txt
# Step 4: Filter the Jovialis dataset to keep only SNPs present in both datasets (common SNPs).
plink --allow-no-sex --bfile /mnt/d/UbuntuJovialisHome/Jovialis_sorted_marked --extract /mnt/d/UbuntuJovialisHome/common_snps.txt --make-bed --out /mnt/d/UbuntuJovialisHome/Jovialis_common_snps
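# Step 4b: Apply the same common-SNP filter to the AADR dataset (this creates the v62_common_snps files that Step 5 reads).
plink --allow-no-sex --bfile /mnt/d/UbuntuJovialisHome/v62.0_HO_public --extract /mnt/d/UbuntuJovialisHome/common_snps.txt --make-bed --out /mnt/d/UbuntuJovialisHome/v62_common_snps
# Step 5: Attempt an initial merge and identify problematic SNPs (multiallelic or inconsistent strand). If the merge fails, PLINK lists them in v62_Jovialis_merged-merge.missnp.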
plink --allow-no-sex --bfile /mnt/d/UbuntuJovialisHome/v62_common_snps --bmerge /mnt/d/UbuntuJovialisHome/Jovialis_common_snps --make-bed --out /mnt/d/UbuntuJovialisHome/v62_Jovialis_merged
# Step 6: Flip problematic SNPs in the Jovialis dataset (fix strand inconsistencies).
plink --allow-no-sex --bfile /mnt/d/UbuntuJovialisHome/Jovialis_common_snps --flip /mnt/d/UbuntuJovialisHome/v62_Jovialis_merged-merge.missnp --make-bed --out /mnt/d/UbuntuJovialisHome/Jovialis_flipped
# Step 7: Exclude every SNP flagged in Step 5 from the flipped dataset (a conservative choice: this also drops the SNPs that were just flipped in Step 6).
plink --allow-no-sex --bfile /mnt/d/UbuntuJovialisHome/Jovialis_flipped --exclude /mnt/d/UbuntuJovialisHome/v62_Jovialis_merged-merge.missnp --make-bed --out /mnt/d/UbuntuJovialisHome/Jovialis_filtered
# Step 8: Filter the Jovialis dataset from Step 7 to keep only SNPs present in AADR (using the ID list written in Step 2b).
plink --bfile /mnt/d/UbuntuJovialisHome/Jovialis_filtered --extract /mnt/d/UbuntuJovialisHome/v62_snp_list.snplist --make-bed --out /mnt/d/UbuntuJovialisHome/Jovialis_filtered_for_AADR
# Step 9: Perform the final merge, ensuring all SNPs from AADR are kept and only SNPs matching AADR from Jovialis are merged.
plink --allow-no-sex --bfile /mnt/d/UbuntuJovialisHome/v62.0_HO_public --bmerge /mnt/d/UbuntuJovialisHome/Jovialis_filtered_for_AADR --make-bed --out /mnt/d/UbuntuJovialisHome/v62_Jovialis_corrected_final
# Step 10a: Check SNP frequency in the final merged dataset to verify all SNPs from AADR were retained.
plink --bfile /mnt/d/UbuntuJovialisHome/v62_Jovialis_corrected_final --freq --out snp_check
# Step 10b: Check SNP frequency in the original AADR dataset for comparison.
plink --bfile /mnt/d/UbuntuJovialisHome/v62.0_HO_public --freq --out aadr_check
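Running --freq on both datasets gives you two reports to compare, but it doesn't say by itself whether anything went missing. One way to make the Step 10 check explicit is to compare the SNP IDs in the two .bim files directly. A rough sketch: `missing_snps` is a made-up helper name, and it assumes the SNP ID is the second whitespace-separated column of each .bim file, which is the standard PLINK layout.

```shell
# Hypothetical helper: print SNP IDs that are in the reference .bim
# but absent from the merged .bim (column 2 of a .bim is the variant ID).
# Usage: missing_snps <reference.bim> <merged.bim>
missing_snps() {
    # awk reads the merged file first and remembers its IDs, then walks
    # the reference file and prints every ID it has not seen.
    awk 'NR == FNR { seen[$2] = 1; next }
         !($2 in seen) { print $2 }' "$2" "$1"
}

# Example with the file names from the steps above; uncomment to use:
# missing_snps /mnt/d/UbuntuJovialisHome/v62.0_HO_public.bim \
#              /mnt/d/UbuntuJovialisHome/v62_Jovialis_corrected_final.bim | wc -l
```

A count of 0 would mean every AADR SNP ID is still present in the merged set. Splitting on whitespace with awk, rather than on tabs with cut, is deliberate, since .bim files written by different tools do not always use the same delimiter.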