Admixtools admixtools2 TUTORIAL for WINDOWS.

baeticvs · Nov 20, 2024

Jovialis said:
It seems best to produce PCAs in smartpca with 1240K, because of its higher resolution and because it is consistent with what aDNA studies use.

Kind of unrelated, but what do you think about v54.1 1240k vs v62 1240k? v62 is of course more up to date, but I see weird samples and labels in it, and v54 seems more curated or "refined". Which one do you use? I get slightly different results between them.

Jovialis · Nov 20, 2024

baeticvs said:
Kind of unrelated, but what do you think about v54.1 1240k vs v62 1240k? v62 is of course more up to date, but I see weird samples and labels in it, and v54 seems more curated or "refined". Which one do you use? I get slightly different results between them.

Not sure to be honest.

Jovialis · Nov 20, 2024

Celtion said:
Which pops did you use to set the eigenvectors?

They are listed in the Sample_List.txt

Jovialis · Nov 20, 2024

Here's a refined version of the PCA along with the R script used:

Code:

# Load necessary libraries
library(ggplot2)
library(dplyr)

# Set working directory
setwd("D:/Bioinformatics/01_Admixtools_Dataset/V62.0_HO_Eigenstrat_Merged_Jovialis/W_Eurasia_Mod_aDNA")

# Read the eigenvalues
evals <- scan("projected.eval.txt", quiet = TRUE)

# Read the eigenvectors
evecs <- read.table("projected.evec.txt", header = FALSE, stringsAsFactors = FALSE)

# Extract individual IDs and population labels
individuals <- as.character(evecs$V1)
populations <- as.character(evecs[[ncol(evecs)]])  # Last column holds the population label (V12 when numoutevec is 10)

# Extract the first two principal components and flip both axes
pc1 <- -as.numeric(evecs$V2)  # Horizontal flip
pc2 <- -as.numeric(evecs$V3)  # Vertical flip

# Create a data frame for plotting
pca_data <- data.frame(Individual = individuals, Population = populations, PC1 = pc1, PC2 = pc2, stringsAsFactors = FALSE)

# Remove rows with NA values (if any)
pca_data <- na.omit(pca_data)

# Define populations to highlight
highlighted_pops <- c(
    "Jovialis", "Armenian.HO", "Iranian.HO", "Turkish.HO", "Albanian.HO", "Italian_North.HO",
    "Bulgarian.HO", "Cypriot.HO", "Greek.HO", "Italian_South.HO", "Maltese.HO", "Sicilian.HO",
    "Italian_Central.HO", "English.HO", "French.HO", "Icelandic.HO", "Norwegian.HO", "Orcadian.HO",
    "Scottish.HO", "BedouinA.HO", "BedouinB.HO", "Jordanian.HO", "Palestinian.HO", "Saudi.HO",
    "Syrian.HO", "Abkhasian.HO", "Adygei.HO", "Balkar.HO", "Chechen.HO", "Georgian.HO", "Kumyk.HO",
    "Lezgin.HO", "Russia_NorthOssetian.HO", "Jew_Ashkenazi.HO", "Jew_Georgian.HO", "Jew_Iranian.HO",
    "Jew_Iraqi.HO", "Jew_Libyan.HO", "Jew_Moroccan.HO", "Jew_Tunisian.HO", "Jew_Turkish.HO",
    "Jew_Yemenite.HO", "Basque.HO", "Spanish.HO", "Spanish_North.HO", "Druze.HO", "Lebanese.HO",
    "Belarusian.HO", "Croatian.HO", "Czech.HO", "Estonian.HO", "Hungarian.HO", "Lithuanian.HO",
    "Ukrainian.HO", "IBS_CanaryIslands.DG", "Sardinian.HO", "Finnish.HO", "Mordovian.HO", "Russian.HO"
)

# Filter data to include only highlighted populations
pca_data <- pca_data %>% filter(Population %in% highlighted_pops)

# Assign groups for coloring and filling: strip the dataset suffix (.HO or .DG)
# from each label, then rename the two groups whose display names differ
pca_data <- pca_data %>%
    mutate(
        Group = sub("\\.(HO|DG)$", "", Population),
        Group = recode(
            Group,
            "Russia_NorthOssetian" = "North_Ossetian",
            "IBS_CanaryIslands" = "Canary_Islands"
        )
    )

# Assign colors with a focus on darker shades and valid color names
custom_colors <- c(
    "Jovialis" = "darkgoldenrod", "Armenian" = "darkblue", "Iranian" = "darkgreen",
    "Turkish" = "orange", "Albanian" = "green", "Italian_North" = "darkorange",
    "Bulgarian" = "steelblue", "Cypriot" = "darkmagenta", "Greek" = "saddlebrown",
    "Italian_South" = "darkorchid3", "Maltese" = "blue", "Sicilian" = "darkolivegreen",
    "Italian_Central" = "midnightblue", "English" = "firebrick", "French" = "chocolate4",
    "Icelandic" = "darkslategray", "Norwegian" = "mediumblue", "Orcadian" = "darkslateblue",
    "Scottish" = "darkseagreen", "BedouinA" = "darkcyan", "BedouinB" = "deepskyblue4",
    "Jordanian" = "darkred", "Palestinian" = "darkgreen", "Saudi" = "darkgoldenrod4",
    "Syrian" = "mediumvioletred", "Abkhasian" = "brown4", "Adygei" = "khaki4",
    "Balkar" = "purple4", "Chechen" = "royalblue4", "Georgian" = "brown3",
    "Kumyk" = "forestgreen", "Lezgin" = "springgreen4", "North_Ossetian" = "lightpink4",
    "Jew_Ashkenazi" = "chocolate", "Jew_Georgian" = "darkturquoise",
    "Jew_Iranian" = "dodgerblue4", "Jew_Iraqi" = "slateblue", "Jew_Libyan" = "cornflowerblue",
    "Jew_Moroccan" = "limegreen", "Jew_Tunisian" = "darkred", "Jew_Turkish" = "seagreen4",
    "Jew_Yemenite" = "navyblue", "Basque" = "darkorchid4", "Spanish" = "darkorchid",
    "Spanish_North" = "mediumseagreen", "Druze" = "slateblue4", "Lebanese" = "springgreen3",
    "Belarusian" = "darkturquoise", "Croatian" = "blue", "Czech" = "darkslateblue",
    "Estonian" = "darkslategray4", "Hungarian" = "darkorange3", "Lithuanian" = "tan4",
    "Ukrainian" = "tan", "Canary_Islands" = "navy", "Sardinian" = "darkseagreen4",
    "Finnish" = "olivedrab", "Mordovian" = "darkorange1", "Russian" = "red"
)

# Assign unique filled shapes to each group (cycling through the available filled shapes)
filled_shapes <- c(21, 22, 23, 24, 25)  # Circle, square, diamond, up-triangle, down-triangle
shape_values <- rep(filled_shapes, length.out = length(unique(pca_data$Group)))

# Plot the PCA with dark colors, different filled shapes to distinguish samples, and a black border around the PCA
ggplot(pca_data, aes(x = PC1, y = PC2, color = Group, fill = Group, shape = Group)) +
    geom_point(size = 3) +
    scale_color_manual(values = custom_colors) +
    scale_fill_manual(values = custom_colors) +
    scale_shape_manual(values = shape_values) +
    labs(
        title = "PCA Projection of Modern West Eurasia (AADR_HO v62.0 merged with Jovialis WGS 30x)",
        x = paste0("PC1 (", round(evals[1] / sum(evals) * 100, 2), "% variance)"),
        y = paste0("PC2 (", round(evals[2] / sum(evals) * 100, 2), "% variance)")
    ) +
    theme_minimal() +
    theme(
        legend.position = "bottom",
        legend.title = element_blank(),
        legend.text = element_text(size = 8),  # Decrease legend text size
        legend.key.size = unit(0.4, "cm"),  # Decrease size of legend keys
        legend.spacing.x = unit(0.2, "cm"),  # Decrease horizontal spacing in legend
        legend.box = "horizontal",  # Arrange legend items horizontally
        legend.direction = "horizontal",
        plot.title = element_text(hjust = 0.5),
        panel.border = element_rect(color = "black", fill = NA, linewidth = 1)  # Add black border around the PCA plot
    ) +
    guides(
        color = guide_legend(ncol = 8),
        shape = guide_legend(ncol = 8),  # Make sure the shape legend is also compact
        fill = guide_legend(ncol = 8)  # Make sure the fill legend is also compact
    )
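
The population label sits in the last column of the .evec file, which is column 12 only when smartpca wrote 10 eigenvectors. A rough way to check this before plotting is sketched below. The toy file and its values are made up for illustration; point `evec` at your own projected.evec.txt instead.

```shell
#!/bin/sh
# Toy .evec file: header line, then ID, 10 eigenvector coordinates, label.
evec=$(mktemp)
cat > "$evec" <<'EOF'
 #eigvals: 5.1 3.2 2.0 1.9 1.8 1.7 1.6 1.5 1.4 1.3
 S1 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 Greek.HO
 S2 0.02 0.01 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 Jovialis
EOF

# Count whitespace-separated fields on the first non-header line.
ncols=$(awk '$1 !~ /^#/ { print NF; exit }' "$evec")
echo "columns: $ncols (label column is V$ncols in R)"

# Tally individuals per label (last field) to spot unexpected population names.
awk '$1 !~ /^#/ { n[$NF]++ } END { for (p in n) print p, n[p] }' "$evec" | sort
```

If the column count is not 12, the label column in the R script has to change with it.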

Celtion · Nov 20, 2024

baeticvs said:
Kind of unrelated, but what do you think about v54.1 1240k vs v62 1240k? v62 is of course more up to date, but I see weird samples and labels in it, and v54 seems more curated or "refined". Which one do you use? I get slightly different results between them.

The readme file that came with the v62.0 set implies that this is the most refined version:

• "Twist" sequencing is a significant update in our protocol for capture and data for all newer samples are captured this way, as described in [RohlandMallickGenomeResearch2022]. These samples are indicated in the genetic ids with "TW", in contrast to our older Agilent sequencing, marked "AG". One significant advantage is that twist capture reduces bias when co-analysed with shotgun data,
• the pseudo-haploid calling procedure has been updated to objectively determine thresholding parameters based on error rates (technical note to follow),
• 13442 poor performing SNPs were dropped from the Human Origins array, compared with previous releases.

Jalisciense · Nov 22, 2024

I need your help, boys. I was trying to merge my own data into the dataset so I can model myself in qpAdm. I converted my AncestryDNA file to 23andMe format, then I used this command:

plink --23file AncestryCombined.txt --make-bed --out mydata

And I got files .bed, .bim, .fam, .hh and .log.

Now I used this command:

plink --allow-no-sex --bfile v62.0_1240k_public --bmerge mydata --out v62.0_1240k_public_mydata

And I got this:

PLINK v1.9.0-b.7.7 64-bit (22 Oct 2024) cog-genomics.org/plink/1.9/
(C) 2005-2024 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to v62.0_1240k_public_mydata.log.
Options in effect:
--allow-no-sex
--bfile v62.0_1240k_public
--bmerge mydata
--out v62.0_1240k_public_mydata

5886 MB RAM detected; reserving 2943 MB for main workspace.
Error: Failed to open v62.0_1240k_public.fam.
Jalisciense@vbox:~/Downloads>

I received a file named v62.0_1240k_public_mydata.log, but why did it say that "Failed to open v62.0_1240k_public.fam"? What is wrong?

When I open the v62.0_1240k_public_mydata.log file, it looks like this:

Jalisciense · Nov 22, 2024

I tried switching to the bin folder, but still no luck.

All my .bed, .bim, .fam, .hh and .log files are in the bin folder:

This is my bin folder:

Celtion · Nov 22, 2024

Jalisciense said:
I need your help, boys. I was trying to merge my own data into the dataset so I can model myself in qpAdm. I converted my AncestryDNA file to 23andMe format, then I used this command:

plink --23file AncestryCombined.txt --make-bed --out mydata

And I got files .bed, .bim, .fam, .hh and .log.

Now I used this command:

plink --allow-no-sex --bfile v62.0_1240k_public --bmerge mydata --out v62.0_1240k_public_mydata

And I got this:

PLINK v1.9.0-b.7.7 64-bit (22 Oct 2024) cog-genomics.org/plink/1.9/
(C) 2005-2024 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to v62.0_1240k_public_mydata.log.
Options in effect:
--allow-no-sex
--bfile v62.0_1240k_public
--bmerge mydata
--out v62.0_1240k_public_mydata

5886 MB RAM detected; reserving 2943 MB for main workspace.
Error: Failed to open v62.0_1240k_public.fam.
Jalisciense@vbox:~/Downloads>

I received a file named v62.0_1240k_public_mydata.log, but why did it say that "Failed to open v62.0_1240k_public.fam"? What is wrong?

When I open the v62.0_1240k_public_mydata.log file, it looks like this:

When merging you need the --make-bed flag before the --out flag to create the merged file.

Jalisciense · Nov 22, 2024

Celtion said:
When merging you need the --make-bed flag before the --out flag to create the merged file.

I don't think I understand. Is this right?

plink --23file AncestryCombined.txt --make-bed --out mydata

Celtion · Nov 22, 2024

Yeah but put it in your merging command as well so it should be like:
plink --allow-no-sex --bfile v62.0_1240k_public --bmerge mydata --make-bed --out v62.0_1240k_public_mydata

Jalisciense · Nov 22, 2024

Celtion said:
Yeah but put it in your merging command as well so it should be like:
plink --allow-no-sex --bfile v62.0_1240k_public --bmerge mydata --make-bed --out v62.0_1240k_public_mydata

Now I am getting this error:

Error: Failed to open v62.0_1240k_public.bed.

Is this the cause, or what? (Type: Unknown)

Celtion · Nov 22, 2024

Jalisciense said:
Now I am getting this error:

Error: Failed to open v62.0_1240k_public.bed.

Is this the cause, or what? (Type: Unknown)

Okay, I see the problem now. In your working directory it looks like you only have v62.0_1240k_public in PACKEDANCESTRYMAP format (similar to EIGENSTRAT). You need to convert it to PACKEDPED (PLINK binary format) using convertf so that you have the .bed, .bim and .fam files.
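
A quick way to see which format a dataset prefix is actually in is sketched below. `check_fileset` is a hypothetical helper written for this post, not part of PLINK or EIGENSOFT, and it only looks at file extensions.

```shell
#!/bin/sh
# Report which on-disk format exists for a dataset prefix.
# check_fileset is a hypothetical helper, not a PLINK or EIGENSOFT command.
check_fileset() {
    prefix=$1
    if [ -f "$prefix.bed" ] && [ -f "$prefix.bim" ] && [ -f "$prefix.fam" ]; then
        echo "plink"            # ready for --bfile / --bmerge
    elif [ -f "$prefix.geno" ] && [ -f "$prefix.snp" ] && [ -f "$prefix.ind" ]; then
        echo "eigenstrat-style" # needs convertf before PLINK can read it
    else
        echo "missing"
    fi
}

# Prefixes taken from the commands in this thread.
check_fileset v62.0_1240k_public
check_fileset mydata
```

If the AADR prefix reports "eigenstrat-style", PLINK's "Failed to open ... .fam" and "Failed to open ... .bed" errors are expected, because those files do not exist yet.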

Jalisciense · Nov 22, 2024

Celtion said:
Okay, I see the problem now. In your working directory it looks like you only have v62.0_1240k_public in PACKEDANCESTRYMAP format (similar to EIGENSTRAT). You need to convert it to PACKEDPED (PLINK binary format) using convertf so that you have the .bed, .bim and .fam files.

Then I used this command:

./plink -p par.EIGENSTRAT.PED

But now I am getting this error:

jalisciense@vbox:~/bin> ./plink -p par.EIGENSTRAT.PED
PLINK v1.9.0-b.7.7 64-bit (22 Oct 2024) cog-genomics.org/plink/1.9/
© 2005-2024 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to plink.log.
Options in effect:
--p par.EIGENSTRAT.PED

Error: Unrecognized flag ('-p').
For more information, try "plink --help <flag name>" or "plink --help | more".
jalisciense@vbox:~/bin>

What's going on?

Celtion · Nov 22, 2024

Jalisciense said:
Then I used this command:

./plink -p par.EIGENSTRAT.PED

But now I am getting this error:

jalisciense@vbox:~/bin> ./plink -p par.EIGENSTRAT.PED
PLINK v1.9.0-b.7.7 64-bit (22 Oct 2024) cog-genomics.org/plink/1.9/
© 2005-2024 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to plink.log.
Options in effect:
--p par.EIGENSTRAT.PED

Error: Unrecognized flag ('-p').
For more information, try "plink --help <flag name>" or "plink --help | more".
jalisciense@vbox:~/bin>

What's going on?

No, the convertf command is a part of the Eigensoft suite so you need to have that installed.

Jalisciense · Nov 22, 2024

Celtion said:
No, the convertf command is a part of the Eigensoft suite so you need to have that installed.

Do you mean this?

GitHub - DReichLab/EIG: Eigen tools by Nick Patterson and Alkes Price lab
https://github.com/DReichLab/EIG

Because the first time I downloaded this:

GitHub - DReichLab/AdmixTools: Tools test whether admixture occurred and more
https://github.com/DReichLab/AdmixTools

Celtion · Nov 22, 2024

Jalisciense said:
Do you mean this?

GitHub - DReichLab/EIG: Eigen tools by Nick Patterson and Alkes Price lab
https://github.com/DReichLab/EIG

Yes. Version 8.0 is the latest so best to download that one from your link and follow the instructions to compile. If you run into trouble you can try some pre-compiled older versions: https://github.com/chrchang/eigensoft
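
Once convertf is built, the conversion is driven by a parameter file passed with -p. A minimal sketch follows, assuming the AADR files sit in the current directory under the v62.0_1240k_public prefix; the parameter file name is my own choice.

```shell
#!/bin/sh
# Write a convertf parameter file: PACKEDANCESTRYMAP (.geno/.snp/.ind) -> PACKEDPED (.bed/.bim/.fam).
# The prefix and the parameter file name are assumptions based on this thread.
prefix=v62.0_1240k_public
parfile=par.PACKEDANCESTRYMAP.PACKEDPED

cat > "$parfile" <<EOF
genotypename:    $prefix.geno
snpname:         $prefix.snp
indivname:       $prefix.ind
outputformat:    PACKEDPED
genotypeoutname: $prefix.bed
snpoutname:      $prefix.bim
indivoutname:    $prefix.fam
EOF

# Run the conversion only if convertf is on the PATH and the input exists.
if command -v convertf >/dev/null 2>&1 && [ -f "$prefix.geno" ]; then
    convertf -p "$parfile"
else
    echo "convertf or $prefix.geno not found; wrote $parfile only"
fi
```

Note that the command is convertf -p, not plink -p, which is why PLINK rejected the flag earlier.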

baeticvs · Nov 22, 2024

@Jovialis did you delete a post about the numoutevec setting having to be 15? What exactly does it do?

Jovialis · Nov 22, 2024

baeticvs said:
@Jovialis did you delete a post about the numoutevec setting having to be 15? What exactly does it do?

Yeah, I tested it out, and it is not optimal. Frankly, for projecting aDNA, I believe 1240K (more SNPs) with numoutevec: 10 is optimal. I haven't had a chance to do it yet, but that is what most studies do. The aDNA samples were not plotting correctly in HO: they were too "Western" on the PCA relative to modern populations, even the Imperial Roman and Anatolian ChL and BA samples, which are certainly not western. I thought changing numoutevec would fix this, but it only made the modern samples plot less optimally and didn't do much to fix the aDNA samples.

The only downside is that 1240K has a less than comprehensive modern population set. I think academic studies supplement it with modern populations from other studies, which require approval from their sources and so are not truly "public". I do think some studies merge HO with 1240K, but I haven't had time to verify that or to figure out how it is done.
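
For reference, numoutevec only sets how many eigenvectors smartpca writes to the .evec file. A rough sketch of a projection parameter file with that setting is below. The `merged` prefix is hypothetical, and using Sample_List.txt as the poplistname file (the populations that define the axes, with everything else projected) is my assumption about how the pieces fit together.

```shell
#!/bin/sh
# Write a smartpca parameter file that projects samples onto axes built from listed populations.
# "merged" is a hypothetical dataset prefix; the output names match the R script in this thread.
prefix=merged
parfile=par.smartpca

cat > "$parfile" <<EOF
genotypename: $prefix.geno
snpname:      $prefix.snp
indivname:    $prefix.ind
evecoutname:  projected.evec.txt
evaloutname:  projected.eval.txt
poplistname:  Sample_List.txt
lsqproject:   YES
numoutevec:   10
EOF

# Run smartpca only if it is on the PATH and the input exists.
if command -v smartpca >/dev/null 2>&1 && [ -f "$prefix.geno" ]; then
    smartpca -p "$parfile" > smartpca.log
else
    echo "smartpca or $prefix.geno not found; wrote $parfile only"
fi
```

lsqproject: YES asks for least-squares projection, which is the usual choice when the projected samples have a lot of missing genotypes, as aDNA does.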

baeticvs · Nov 22, 2024

Jovialis said:
Yeah, I tested it out, and it is not optimal. Frankly, for projecting aDNA, I believe 1240K (more SNPs) with numoutevec: 10 is optimal. I haven't had a chance to do it yet, but that is what most studies do. The aDNA samples were not plotting correctly in HO: they were too "Western" on the PCA relative to modern populations, even the Imperial Roman and Anatolian ChL and BA samples, which are certainly not western. I thought changing numoutevec would fix this, but it only made the modern samples plot less optimally and didn't do much to fix the aDNA samples.

uh, now I understand something...

Jovialis said:
The only downside is that 1240K has a less than comprehensive modern population set. I think academic studies supplement it with modern populations from other studies, which require approval from their sources and so are not truly "public".

Yeah, that's what they do. Here you have a lot of modern (and ancient) samples from other studies:

Datasets | David Reich Lab (reich.hms.harvard.edu)

Check the supplementary data from studies, for example:

(from here:https://www.cell.com/cms/10.1016/j....b751f689-49c6-47bb-beb0-ab140c34922e/mmc2.pdf)
There are also reference panels you can pick samples from, like the 1000 Genomes Phase 3 panel shown in the image above.
By the way, what do you think about imputation and phasing in the context of admixtools or any ancestry-related analysis?

Jovialis · Nov 22, 2024

Jovialis said:

I finally have the process documented, using my hg19-aligned 23andMe txt file, which I produced last year when I processed it from FASTQ.

Code:

# Step 1: Convert the 23andMe file to PLINK binary format.
plink --23file /mnt/d/UbuntuJovialisHome/Jovialis_sorted_marked_23andMe_V3.txt --make-bed --out /mnt/d/UbuntuJovialisHome/Jovialis_sorted_marked

# Step 2a: Extract SNPs from the Jovialis dataset.
plink --bfile /mnt/d/UbuntuJovialisHome/Jovialis_sorted_marked --write-snplist --out /mnt/d/UbuntuJovialisHome/jovialis_snp_list

# Step 2b: Extract SNPs from the AADR (v62.0_HO_public) dataset.
plink --bfile /mnt/d/UbuntuJovialisHome/v62.0_HO_public --write-snplist --allow-no-sex --out /mnt/d/UbuntuJovialisHome/v62_snp_list

# Step 3: Find common SNPs between the two datasets (Jovialis and AADR).
comm -12 <(sort /mnt/d/UbuntuJovialisHome/jovialis_snp_list.snplist) <(sort /mnt/d/UbuntuJovialisHome/v62_snp_list.snplist) > /mnt/d/UbuntuJovialisHome/common_snps.txt

# Step 4: Filter the Jovialis dataset to keep only SNPs present in both datasets (common SNPs).
plink --allow-no-sex --bfile /mnt/d/UbuntuJovialisHome/Jovialis_sorted_marked --extract /mnt/d/UbuntuJovialisHome/common_snps.txt --make-bed --out /mnt/d/UbuntuJovialisHome/Jovialis_common_snps

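# Step 5: Attempt an initial merge and identify problematic SNPs (multiallelic or inconsistent strand).
# The AADR side is the unfiltered v62.0_HO_public, since AADR is no longer filtered to common SNPs.
# If mismatches exist, PLINK stops with an error here and writes v62_Jovialis_merged-merge.missnp.
plink --allow-no-sex --bfile /mnt/d/UbuntuJovialisHome/v62.0_HO_public --bmerge /mnt/d/UbuntuJovialisHome/Jovialis_common_snps --make-bed --out /mnt/d/UbuntuJovialisHome/v62_Jovialis_merged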

# Step 6: Flip problematic SNPs in the Jovialis dataset (fix strand inconsistencies).
plink --allow-no-sex --bfile /mnt/d/UbuntuJovialisHome/Jovialis_common_snps --flip /mnt/d/UbuntuJovialisHome/v62_Jovialis_merged-merge.missnp --make-bed --out /mnt/d/UbuntuJovialisHome/Jovialis_flipped

# Step 7: Exclude remaining problematic SNPs from the Jovialis dataset.
plink --allow-no-sex --bfile /mnt/d/UbuntuJovialisHome/Jovialis_flipped --exclude /mnt/d/UbuntuJovialisHome/v62_Jovialis_merged-merge.missnp --make-bed --out /mnt/d/UbuntuJovialisHome/Jovialis_filtered

# Step 8: Filter the Jovialis dataset to keep only SNPs present in AADR.
# Input is assumed to be the Step 7 output; the SNP ID list is the one written in Step 2b.
plink --allow-no-sex --bfile /mnt/d/UbuntuJovialisHome/Jovialis_filtered --extract /mnt/d/UbuntuJovialisHome/v62_snp_list.snplist --make-bed --out /mnt/d/UbuntuJovialisHome/Jovialis_filtered_for_AADR

# Step 9: Perform the final merge, ensuring all SNPs from AADR are kept and only SNPs matching AADR from Jovialis are merged.
# The Jovialis input is assumed to be the Step 8 output.
plink --allow-no-sex --bfile /mnt/d/UbuntuJovialisHome/v62.0_HO_public --bmerge /mnt/d/UbuntuJovialisHome/Jovialis_filtered_for_AADR --make-bed --out /mnt/d/UbuntuJovialisHome/v62_Jovialis_corrected_final

# Step 10a: Check SNP frequency in the final merged dataset to verify all SNPs from AADR were retained.
plink --bfile /mnt/d/UbuntuJovialisHome/v62_Jovialis_corrected_final --freq --out snp_check

# Step 10b: Check SNP frequency in the original AADR dataset for comparison.
plink --bfile /mnt/d/UbuntuJovialisHome/v62.0_HO_public --freq --out aadr_check
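
A lighter check than comparing two --freq outputs is to compare the variant IDs in the two .bim files directly. The sketch below uses made-up toy .bim files so it runs anywhere; swap in the real reference and merged .bim paths.

```shell
#!/bin/sh
# Toy .bim files (chr, ID, cM, bp, A1, A2) standing in for the reference and the merged output.
workdir=$(mktemp -d)
printf '1\trs1\t0\t100\tA\tG\n1\trs2\t0\t200\tC\tT\n2\trs3\t0\t300\tG\tA\n' > "$workdir/reference.bim"
printf '1\trs1\t0\t100\tA\tG\n1\trs2\t0\t200\tC\tT\n2\trs3\t0\t300\tG\tA\n' > "$workdir/merged.bim"

# Variant counts are simply the line counts of each .bim file.
ref_n=$(awk 'END { print NR }' "$workdir/reference.bim")
merged_n=$(awk 'END { print NR }' "$workdir/merged.bim")

# Reference variant IDs (column 2) that are absent from the merged file.
missing=$(awk 'NR == FNR { seen[$2] = 1; next } !($2 in seen)' \
    "$workdir/merged.bim" "$workdir/reference.bim" | awk 'END { print NR }')

echo "reference: $ref_n variants, merged: $merged_n variants, missing from merged: $missing"
```

A missing count of zero means every reference SNP survived the merge.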

VERY IMPORTANT UPDATE:

I was re-following my guide to merge my WGS 30x sample with AADR 1240K, and I found a couple of mistakes introduced by the stupid AI, which hallucinated part of the process.

--allow-no-sex has been added to step 2b. (You can add it to 2a as well, or you can instead edit the .fam file to record your actual sex.)

Also, and this is crucial, I eliminated step 4b: you absolutely do NOT want to filter the AADR down to the SNPs it shares with your sample. Your sample must defer to AADR!

I have modified the original post accordingly. Apologies for any inconvenience this may have caused people.
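
To make the "sample defers to AADR" point concrete, here is a toy sketch of the overlap logic from steps 2 to 4. The SNP IDs are invented for illustration and the file names are stand-ins for the real .snplist files.

```shell
#!/bin/sh
# Toy SNP lists standing in for the .snplist files from steps 2a and 2b.
workdir=$(mktemp -d)
printf 'rs3\nrs1\nrs9\nrs7\n' > "$workdir/sample.snplist"
printf 'rs1\nrs2\nrs3\nrs4\nrs5\n' > "$workdir/reference.snplist"

# comm needs both inputs sorted in the same collation order.
LC_ALL=C sort "$workdir/sample.snplist" > "$workdir/sample.sorted"
LC_ALL=C sort "$workdir/reference.snplist" > "$workdir/reference.sorted"

# SNPs present in both lists: the only ones kept from the personal sample.
LC_ALL=C comm -12 "$workdir/sample.sorted" "$workdir/reference.sorted" > "$workdir/common_snps.txt"

# SNPs only in the sample: dropped. The reference list itself is never filtered.
LC_ALL=C comm -23 "$workdir/sample.sorted" "$workdir/reference.sorted" > "$workdir/sample_only.txt"

echo "kept from sample: $(paste -sd, "$workdir/common_snps.txt")"
echo "dropped from sample: $(paste -sd, "$workdir/sample_only.txt")"
echo "reference SNPs untouched: $(awk 'END { print NR }' "$workdir/reference.snplist")"
```

Only the sample side is ever reduced; the reference keeps every SNP, including the ones the sample lacks.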
