admixtools2 TUTORIAL for WINDOWS.

AFAIK, negative FST values should be essentially considered as 0 for most intents and purposes, meaning there is no notable genetic subdivision between the populations considered.

Edit:

Greek_2.DG is probably mixed. Don't forget to take the standard error (s.e.) into account as well.
 
FST run comparison (HO dataset) of neighboring modern populations to Greece_Minoan_Lassithi and Greece_BA_Mycenaean.

Code:
   pop1                   pop2                 est       se
   <chr>                  <chr>              <dbl>    <dbl>
 1 Greece_Minoan_Lassithi Albanian          0.0168 0.00102 
 2 Greece_Minoan_Lassithi Armenian          0.0189 0.000951
 3 Greece_Minoan_Lassithi Armenian_Hemsheni 0.0233 0.00100 
 4 Greece_Minoan_Lassithi Egyptian          0.0239 0.000836
 5 Greece_Minoan_Lassithi Greek             0.0174 0.000853
 6 Greece_Minoan_Lassithi Italian_Sardinian 0.0206 0.00165 
 7 Greece_Minoan_Lassithi Italian_South     0.0138 0.00111 
 8 Greece_Minoan_Lassithi Lebanese          0.0203 0.000945
 9 Greece_Minoan_Lassithi Syrian            0.0224 0.000937
10 Greece_Minoan_Lassithi Turkish           0.0193 0.000833

Code:
   pop1                pop2                  est      se
   <chr>               <chr>               <dbl>   <dbl>
 1 Greece_BA_Mycenaean Albanian          0.00675 0.00141
 2 Greece_BA_Mycenaean Armenian          0.00810 0.00132
 3 Greece_BA_Mycenaean Armenian_Hemsheni 0.0118  0.00137
 4 Greece_BA_Mycenaean Egyptian          0.0131  0.00119
 5 Greece_BA_Mycenaean Greek             0.00701 0.00123
 6 Greece_BA_Mycenaean Italian_Sardinian 0.0105  0.00199
 7 Greece_BA_Mycenaean Italian_South     0.00702 0.00148
 8 Greece_BA_Mycenaean Lebanese          0.00970 0.00130
 9 Greece_BA_Mycenaean Syrian            0.0126  0.00134
10 Greece_BA_Mycenaean Turkish           0.00861 0.00120
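
A rough way to take the s.e. into account when comparing rows of the Greece_BA_Mycenaean table is sketched below. This is only a minimal base-R sketch, not something admixtools2 does for you: the values are copied from the table above, and the combined standard error treats the two estimates as independent, which is an assumption (both share Greece_BA_Mycenaean, so the real uncertainty of the difference may differ).

```r
# FST estimates and standard errors copied from the Greece_BA_Mycenaean table
myc <- data.frame(
  pop2 = c("Albanian", "Greek", "Italian_South"),
  est  = c(0.00675, 0.00701, 0.00702),
  se   = c(0.00141, 0.00123, 0.00148)
)

# Approximate 95% intervals (est +/- 1.96 * se)
myc$lower <- myc$est - 1.96 * myc$se
myc$upper <- myc$est + 1.96 * myc$se

# Difference between Italian_South and Albanian, in units of its standard error.
# ASSUMPTION: the two estimates are treated as independent.
pair     <- myc[myc$pop2 %in% c("Albanian", "Italian_South"), ]
diff_est <- pair$est[pair$pop2 == "Italian_South"] - pair$est[pair$pop2 == "Albanian"]
diff_se  <- sqrt(sum(pair$se^2))
z        <- diff_est / diff_se

print(myc)
print(z)
```

Under these assumptions the Albanian, Greek and Italian_South intervals overlap almost completely, so this run does not really resolve the ordering among those three.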
 
So, genetic-distance-wise, the Albanians and the Greeks are closer to the Mycenaeans than South Italians are.

Hmm, this is new to me because on various Eurogenes or Dodecad calculators the opposite was true.
 

If I remember correctly, a lot of the Greek samples (like a number of the GREEKGRALPOP/Greek samples that were used in the Lazaridis et al. (2014) Nature paper) have West Asian admixture; from what source, I don't know (maybe Anatolian/peninsular profile mixes like Greek-2.DG).

I am not commenting on the validity of stuff like g25, because I am tired of this conversation.
 
What about the Albanians, though? Isn't it more likely that a Southern Italian is more akin to the Mycenaeans than an Albanian?
 
If you believe that the Epirotes were part Mycenaean, then OK.

There were 14 Epirote tribes noted by historians.
 
This thread is (mostly) a tutorial on the use of admixtools2.

I'm afraid I cannot help you with your question; there are plenty of other threads online discussing g25 functionality and validity.

It's up to you to figure out the nuances and, if kind enough, to inform us on what is being done wrong.

Sorry I can't be of more help.
 
FST run comparison (HO dataset) of neighboring moderns to ponticgreek (Gumusxane).

Code:
# A tibble: 8 × 4
  pop1        pop2              est         se
  <chr>       <chr>             <dbl>       <dbl>
1 ponticgreek Armenian          -0.000694   0.00286
2 ponticgreek Armenian_Hemsheni  0.00282    0.00293
3 ponticgreek Azeri              0.00353    0.00280
4 ponticgreek Georgian          -0.00000605 0.00276
5 ponticgreek Greek              0.00338    0.00281
6 ponticgreek Kurd               0.0100     0.00288
7 ponticgreek Ossetian           0.00428    0.00283
8 ponticgreek Turkish            0.00102    0.00274
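
One way to read this table with the standard errors in mind is to divide each estimate by its s.e. and to floor the negative estimates at 0, as suggested at the top of the thread. The following is a minimal base-R sketch with the values copied from the table above; the cutoff of 2 standard errors is just a conventional choice, not something prescribed by admixtools2.

```r
# FST estimates and standard errors copied from the ponticgreek table
pg <- data.frame(
  pop2 = c("Armenian", "Armenian_Hemsheni", "Azeri", "Georgian",
           "Greek", "Kurd", "Ossetian", "Turkish"),
  est  = c(-0.000694, 0.00282, 0.00353, -0.00000605,
           0.00338, 0.0100, 0.00428, 0.00102),
  se   = c(0.00286, 0.00293, 0.00280, 0.00276,
           0.00281, 0.00288, 0.00283, 0.00274)
)

pg$z              <- pg$est / pg$se       # estimate in units of its standard error
pg$est_floored    <- pmax(pg$est, 0)      # treat negative FST as 0
pg$differs_from_0 <- abs(pg$z) > 2        # conventional cutoff (assumption)

print(pg[order(pg$est_floored), ])
```

In this run only the Kurd estimate is more than two standard errors above zero; the other estimates are not clearly distinguishable from 0.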
 
Here's my 2 cents: Fst is sensitive to different sample sizes and data types. Try out f3 with Mbuti or Neanderthal as an outgroup instead; it should be far less sensitive to this.
 
Ok, thank you anyway.
 
Yes, but there's no such mention of switching to f3 in the tutorial:

ADMIXTOOLS 2 Tutorial • admixtools (uqrmaie1.github.io)

FST

FST is closely related to f2, but unlike f2, it doesn’t function as a building block for other tools in ADMIXTOOLS 2. However, it is the most widely used metric to estimate the genetic distance between populations. Running extract_f2() will create files which don’t only contain f2 estimates for each population pair, but also separate FST estimates. The function fst() can either read these pre-computed estimates, or compute them directly from genotype files:

Code:
fst(my_f2_dir)
fst(prefix, pop1 = "Altai_Neanderthal.DG", pop2 = c("Denisova.DG", "Vindija.DG"))

To estimate FST without bias, we need at least two independent observations in each population. With pseudohaploid data, we only get one independent observation per sample, and so for populations consisting of only one pseudohaploid sample, FST cannot be estimated without bias. If we want to ignore that bias and get estimates anyway, we can pretend the pseudohaploid samples are actually diploid using the option adjust_pseudohaploid = FALSE.

Code:
fst(prefix, pop1 = "Altai_Neanderthal.DG", pop2 = c("Denisova.DG", "Vindija.DG"),
    adjust_pseudohaploid = FALSE)

 
Maybe it's because Albanians and Mycenaeans both have a lot of Balkan EEF (GRC_N, ROU_N). In South Italians it's predominantly western EEF (Iberia_N). It can be seen in G25.
 
I think the sampling of moderns in the Reich dataset is not as strict as in the PCA calculators, where only clearly defined clusters of sub-groups exist as references, with nothing in between.

You can see that in the case of the Greek samples, there's only a couple (?) labelled as "outlier" but I haven't inquired too much into it.
 
The tutorial does mention that f3 is used for measuring shared drift relative to an outgroup.
In practice this is sometimes used for generating similarity lists, in the same way that f2 and FST are used. For example, image B here:
https://www.researchgate.net/figure...etic-history-with-Austronesian_fig3_303355340 shows the similarity of various Asian populations to Ma'anyan.
 
OK, but this is admixtools2 from October 2021, mate (your linked paper is from 2016 and refers to the previous iteration).

This app uses f2; you can read further as to why/how in the GitHub tutorial I keep linking.

...

All of this is based on f-statistics (f2, f3, and f4), and all f-statistics can be derived from f2 statistics.
Because of this, ADMIXTOOLS 2 divides the computations into two steps:

  1. Computing f2-statistics and storing them on disk. This can be slow since it accesses the genotype data.
  2. Using f2-statistics to fit models. This is fast because f2-statistics are very compact compared to genotype data.

This page shows how standard ADMIXTOOLS analyses can be conducted in ADMIXTOOLS 2. In addition to that, ADMIXTOOLS 2 introduces a range of new methods, mostly focused on admixture graphs, which are intended to make analyses simpler, faster, and most importantly, more robust. These methods focus on quantifying variability by resampling SNPs, automated exploration of graph topologies, and simulating data under admixture graphs

...
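
The two steps described in the quoted passage could look roughly like this in R. This is only a sketch: the paths and the population list are hypothetical placeholders, the `.geno` check assumes EIGENSTRAT/PACKEDANCESTRYMAP-style files, and the block simply does nothing if admixtools or the data are not available.

```r
# Hypothetical paths and labels, adjust to your own setup
prefix    <- "C:/data/HO_dataset"   # genotype file prefix without extension (assumption)
my_f2_dir <- "C:/data/f2_HO"        # directory where the f2 blocks will be stored (assumption)
pops <- c("Mbuti.DG", "Greece_BA_Mycenaean", "Albanian", "Greek", "Italian_South")

if (requireNamespace("admixtools", quietly = TRUE) &&
    file.exists(paste0(prefix, ".geno"))) {
  # Step 1 (slow): read the genotype data once and store f2 blocks on disk
  admixtools::extract_f2(prefix, my_f2_dir, pops = pops)

  # Step 2 (fast): load the compact f2 blocks for later f3/f4/qpAdm-style runs
  f2_blocks <- admixtools::f2_from_precomp(my_f2_dir)

  # The same directory also holds the pre-computed FST estimates
  print(admixtools::fst(my_f2_dir))
}
```

Restricting extract_f2() to the populations you actually need keeps step 1 shorter; everything after that reads from my_f2_dir rather than from the genotype files.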
 
Admixtools 2 is still pretty much the same as 1. It just runs in R instead of cmd, and has some options for faster running. The output of an f3 run should be the same in versions 1 and 2.

That quote says all other calculations (f3, f4, qpadm...) are based on f2; it doesn't mean you should use only f2.
 
Okay, so we should not use admixtools2 and FST, is that what you're saying?

They just put it in there to troll people.

Shall we just use the enthusiast/highbrow-approved g25 and be done with it?
 
These are the main uses of f3 with admixtools:

f3 and qp3Pop

There are three main uses of f3-statistics:

  1. Testing whether a population is admixed: If f3(A;B,C) is negative, this suggests that A is admixed between a population related to B and one related to C.
  2. Estimating the relative divergence time for pairs of populations (outgroup f3-statistics): Pairwise FST and f2 are simpler estimates of genetic distance or divergence time, but they are affected by differences in population size. If O is an outgroup relative to all populations i and j, then f3(O;i,j) will estimate the genetic distance between O and the points of separation between i and j without being affected by drift that is specific to any population i or j.
  3. Fitting admixture graphs: f3-statistics of the form f3(O;i,j) for an arbitrary population O, and all pairs of i and j are used in qpGraph, which is described below

How do you reconcile these with the function of FST in admixtools2?
 
No, you got it wrong.
I'm away from my PC. I found your thread interesting, and I just recommended that you try out f3 as well and compare it to FST.

No need to get admixtools1 involved; it's pretty much the same as admixtools2, just less user-friendly.
 
Use no. 2; it says it's similar to f2/FST, but less sensitive to population size.
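
For anyone who wants to try this, an outgroup f3 run of the form f3(O; i, j) could be sketched as follows. The path and the population labels are assumptions (in particular, that the outgroup is labelled Mbuti.DG in your copy of the HO dataset and that the files are in EIGENSTRAT/PACKEDANCESTRYMAP format); the block does nothing if admixtools or the data are missing.

```r
# Hypothetical path and labels, adjust to your own setup
prefix   <- "C:/data/HO_dataset"    # genotype file prefix without extension (assumption)
outgroup <- "Mbuti.DG"              # O in f3(O; i, j) (assumed label)
target   <- "Greece_BA_Mycenaean"   # i
moderns  <- c("Albanian", "Greek", "Italian_South")  # candidate j populations

if (requireNamespace("admixtools", quietly = TRUE) &&
    file.exists(paste0(prefix, ".geno"))) {
  # One f3(outgroup; target, modern) statistic per modern population
  res <- admixtools::qp3pop(prefix, pop1 = outgroup, pop2 = target, pop3 = moderns)

  # Larger outgroup f3 means more shared drift with the target, i.e. closer
  print(res[order(-res$est), c("pop2", "pop3", "est", "se")])
}
```

Note that the ranking runs the other way round from FST: here a higher value means closer to the target, and the s.e. column should be read in the same way as for the FST tables.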
 
