How to misinterpret ADMIXTURE and STRUCTURE

Angela

To my knowledge this is the second paper on this topic.

The first one was by Graham Coop and is discussed here:
https://www.eupedia.com/forum/threa...Calculators-Have-To-Be-Interpreted-Cautiously


This is the second one:
Daniel J. Lawson et al:
https://www.nature.com/articles/s41...3Mt0291wGkL_3UaqHYXHr5X3y-53ptl7hbBb4m2eqpA==

"Genetic clustering algorithms, implemented in programs such as STRUCTURE and ADMIXTURE, have been used extensively in the characterisation of individuals and populations based on genetic data. A successful example is the reconstruction of the genetic history of African Americans as a product of recent admixture between highly differentiated populations. Histories can also be reconstructed using the same procedure for groups that do not have admixture in their recent history, where recent genetic drift is strong or that deviate in other ways from the underlying inference model. Unfortunately, such histories can be misleading. We have implemented an approach, badMIXTURE, to assess the goodness of fit of the model using the ancestry palettes estimated by CHROMOPAINTER and apply it to both simulated data and real case studies. Combining these complementary analyses with additional methods that are designed to test specific hypotheses allows a richer and more robust analysis of recent demographic history.

Model-based clustering has become a popular approach to visualise the genetic ancestry of humans and other organisms. Pritchard et al. [1] introduced a Bayesian algorithm STRUCTURE for defining populations and assigning individuals to them. FRAPPE and ADMIXTURE were later implemented based on a similar underlying inference model but with algorithmic refinements that allow them to be run on datasets with hundreds of thousands of genetic markers [2,3]. Following many successful examples of inference [4–6], the STRUCTURE barplot has become a de-facto standard used as a non-parametric description of genetic data [7] alongside a Principal Components Analysis [8]. However, some experienced researchers feel that STRUCTURE has become “a victim of its own success” due to frequent over-interpretation of the results [7].

Experienced researchers, particularly those interested in population structure and historical inference, typically present STRUCTURE results alongside other methods that make different modelling assumptions. These include TreeMix [9], ADMIXTUREGRAPH [10], fineSTRUCTURE [11], GLOBETROTTER [12], f3 and D statistics [13], amongst many others. These models can be used both to probe whether assumptions of the model are likely to hold and to validate specific features of the results. Each also comes with its own pitfalls and difficulties of interpretation. It is not obvious that any single approach represents a direct replacement as a data summary tool. Here we build more directly on the results of STRUCTURE/ADMIXTURE by developing a new approach, badMIXTURE, to examine which features of the data are poorly fit by the model. Rather than intending to replace more specific or sophisticated analyses, we hope to encourage their use by making the limitations of the initial analysis clearer."

"Results
The default interpretation protocol. Most researchers are cautious but literal in their interpretation of STRUCTURE and ADMIXTURE results, as caricatured in Fig. 1, as it is difficult to interpret the results at all without making several of these assumptions."

[Image: xoZO7bQ.png]
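
For anyone who wants to see the model behind the barplots in concrete terms, here is a minimal Python sketch. It assumes the standard STRUCTURE/ADMIXTURE setup, in which the expected allele frequency at each marker is a weighted mix of K cluster frequencies, with the weights being the ancestry proportions drawn in the barplot. All names and numbers are made up for illustration. The residual check is only a crude stand-in for the idea of testing goodness of fit. It is not badMIXTURE, which works on CHROMOPAINTER ancestry palettes rather than raw allele frequencies.

```python
# Illustrative sketch only: toy numbers and hypothetical names, not badMIXTURE.


def mixture_freqs(q, cluster_freqs):
    """Allele frequencies predicted by the mixture model.

    q: ancestry proportions over K clusters (should sum to 1).
    cluster_freqs: K lists of per-marker allele frequencies, one per cluster.
    """
    n_markers = len(cluster_freqs[0])
    return [
        sum(qk * freqs[j] for qk, freqs in zip(q, cluster_freqs))
        for j in range(n_markers)
    ]


def rms_residual(observed, q, cluster_freqs):
    """Root-mean-square gap between observed and model-predicted frequencies."""
    predicted = mixture_freqs(q, cluster_freqs)
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return (sum(squared) / len(squared)) ** 0.5


# Two hypothetical clusters, four markers (made-up frequencies).
CLUSTER_FREQS = [
    [0.9, 0.1, 0.8, 0.2],  # cluster 0
    [0.1, 0.9, 0.2, 0.8],  # cluster 1
]

# Suppose both populations were assigned the same 50/50 ancestry proportions.
q_both = [0.5, 0.5]

# Population A: frequencies sit exactly where a real 50/50 mix would put them.
observed_admixed = [0.5, 0.5, 0.5, 0.5]

# Population B: frequencies wander away from the mix, as they might after drift.
observed_drifted = [0.7, 0.3, 0.2, 0.9]

resid_admixed = rms_residual(observed_admixed, q_both, CLUSTER_FREQS)
resid_drifted = rms_residual(observed_drifted, q_both, CLUSTER_FREQS)

print(f"admixed population residual: {resid_admixed:.4f}")
print(f"drifted population residual: {resid_drifted:.4f}")
```

Both toy populations would be drawn as the same 50/50 bar, but only the first is well described by the mixture. The second has a much larger residual, which is the kind of poor fit that a barplot alone does not reveal.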
 
