Eupedia Forums

Thread: Principal Component Analyses (PCA)-based findings are highly biased.

  1. #1
    researcher eupator's Avatar
    Join Date
    19-07-22
    Posts
    285

    Y-DNA haplogroup
    R-A12332*
    MtDNA haplogroup
    W6*

    Country: Greece


  2. #2
    Advisor Angela's Avatar
    Join Date
    02-01-11
    Posts
    21,578


    Ethnic group
    Italian
    Country: USA - New York



    1 members found this post helpful.
    Yes, I saw it. Did you notice the author?

    Anyone who knows anything about population genetics knows that PCAs give results based solely on the samples you choose, so it's relatively easy to manipulate the tool to show what you want it to show.

    That's why I ignore some amateur-produced PCAs; I know the creator(s) aren't above "fiddling" with the samples they choose to include.

    Plus, it's two dimensions.

    Most "hobbyists" don't understand its inherent limitations.
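
    To make the sample-selection point concrete, here is a minimal sketch in Python with NumPy. Everything in it is made up for illustration: the four "populations" A, B, C and D, their two features, and the sample counts are hypothetical, not real genotype data. A and B differ only on feature 0, C and D only on feature 1, and the only thing that changes between the two runs is how many samples of each group go into the panel.

```python
import numpy as np

def first_pc(samples):
    """Return the unit vector of PC1 for an (n_samples, n_features) array."""
    centered = samples - samples.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues come back in ascending order
    return eigvecs[:, -1]                   # eigenvector of the largest eigenvalue

# Hypothetical group centroids: A/B are split on feature 0, C/D on feature 1.
centroids = {"A": (-2.0, 0.0), "B": (2.0, 0.0), "C": (0.0, -1.5), "D": (0.0, 1.5)}

def build_panel(counts):
    """Stack `counts[pop]` copies of each group's centroid into one sample matrix."""
    rows = [centroids[pop] for pop, n in counts.items() for _ in range(n)]
    return np.array(rows)

balanced = build_panel({"A": 10, "B": 10, "C": 10, "D": 10})
skewed = build_panel({"A": 5, "B": 5, "C": 20, "D": 20})  # C and D oversampled

pc1_balanced = first_pc(balanced)
pc1_skewed = first_pc(skewed)

# Absolute values, because the sign of a principal component is arbitrary.
print("PC1 loadings, balanced panel:", np.round(np.abs(pc1_balanced), 3))
print("PC1 loadings, skewed panel:  ", np.round(np.abs(pc1_skewed), 3))
```

    In the balanced panel PC1 lines up with feature 0 (the A/B split); once C and D are oversampled, PC1 lines up with feature 1 instead. Same populations, different picture, and nothing changed except who was included.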


    One does not do one's duty so that someone will say thank you; one does it on principle, for oneself, for one's own dignity. Oriana Fallaci

  3. #3
    Banned
    Join Date
    12-10-16
    Posts
    1,262


    Country: Albania



    That's why I always use Y-Dna to verify my claims.

  4. #4
    Junior Member
    Join Date
    07-05-19
    Location
    Kansas City
    Posts
    8

    Y-DNA haplogroup
    DF27, z195, FT372222
    MtDNA haplogroup
    T2b

    Country: USA - Missouri



    Thank the powers that be. I never understood how PCA programs could be used effectively in genetics. You need a way to 'moor' the findings to a stable structure, such as a helix for DNA bits or maybe even a map that identifies gravesites and maybe migration routes, so that the values don't stretch to show you what you want.

    The values twist, and sometimes completely flip into the data you were looking for, when you are dealing with a fourth dimension and not incorporating time as a factor.
    But I haven't kept up with new findings. Reading the article, I hope the community moves away from PCA statistical modeling for mixed populations and focuses more on where certain bits of DNA are first found, but that's just my hope. I'll keep reading and watch what happens in the genomic community.
    I'm just a hobbyist with biases against PCA modelling.

  5. #5
    Advisor Angela's Avatar
    Join Date
    02-01-11
    Posts
    21,578


    Ethnic group
    Italian
    Country: USA - New York



    Originally posted by ihype02:
    That's why I always use Y-Dna to verify my claims.

    Y-DNA can wind up telling you little, and sometimes nothing, about who you really are.

    My father, and practically every male in his corner of the world, was U-152. My mother bequeathed me a U2e lineage straight from the steppe.

    Even my father was probably not more than 30% steppe, and my mother less. Both of them, and me, are mostly Anatolian Neolithic with some extra CHG/Iran Neo.

    I don't deny or ignore the steppe ancestry, but it's only part of who I am and who they were.

    Yes, uniparentals can help us track migrations, but they're not an identity.

    Identity is composed of all of you, all your genome, your language, history and culture.

    To fill in the pre-history or even the history before the medieval period, we need to use all the tools at our disposal: uniparentals, PCAs, Admixture, qpAdm, all of them, while understanding the limitations of each one.

    You can't be a linear thinker in this discipline; it leads to tunnel vision and error.

  6. #6
    researcher eupator's Avatar
    Join Date
    19-07-22
    Posts
    285

    Y-DNA haplogroup
    R-A12332*
    MtDNA haplogroup
    W6*

    Country: Greece



    1 members found this post helpful.
    In case I am misunderstood: I am not 100% opposed to the use of PCAs. On the contrary.

    PCAs can give you a rough idea of where to tread more often than not.

    What I am against, and this is why I might come across as a zealot, is using them as some sort of gospel, especially when it comes to fine-tuning the minor or extremely minor components in order to re-write history.

    For example, I've seen on the web that one of the arguments used against Lazaridis' paper (2022), to counter the observation that EHG does not seem to exist in Anatolia, rests on the fact that the closed-source PCA of choice shows such admixture at 1-3%.

    I don't find this line of argument serious.

  7. #7
    Advisor Angela's Avatar
    Join Date
    02-01-11
    Posts
    21,578


    Ethnic group
    Italian
    Country: USA - New York



    ^^It isn't serious science, but it's very serious politics.

  8. #8
    Moderator Pax Augusta's Avatar
    Join Date
    23-06-14
    Location
    Ara Pacis
    Posts
    1,808


    Ethnic group
    Italian
    Country: Italy



    1 members found this post helpful.

    All the criticisms he makes can be made of any multivariate analysis tool. He criticises PCAs because they are the most widely used.

    For over 30 years genetics has been full of studies with contradictory results and absurd conclusions (reductio ad absurdum), cherry-picking and circular reasoning. And certainly not only where PCAs are concerned.

    Many genetic studies of the last 30 years have turned out to be highly biased in hindsight. It is really a problem of genetics (and geneticists) regardless. At amateur levels it becomes absolute anarchy.

  9. #9
    Regular Member
    Join Date
    13-12-19
    Posts
    39


    Country: Bulgaria



    1 members found this post helpful.
    It is good that Eran Elhaik has brought this problem to the attention of the scientific community. The issue is real, and I have tried to explain it myself, but it is a lot of work just to explain the issue, give some examples, show the limitations, etc.
    Elhaik did some work on this. My point of view is quite different: I also see many advantages in PCA that neither he nor anybody else is talking about.

    There is also an issue with EHG, because they were identified on the basis of PCA, yet the PCA projection was not read properly. EHG should not be considered a separate cluster; they are closely connected to WHG, with some Central Asian admixture.

    CHG are a different story. There is no big connection between EHG and CHG.

  10. #10
    Regular Member
    Join Date
    16-07-22
    Posts
    38


    Country: United States



    "CAN BE highly biased" might be more accurate. Check this out, PC1 and PC2 map pretty close to latitude and longitude:


  11. #11
    Moderator Pax Augusta's Avatar
    Join Date
    23-06-14
    Location
    Ara Pacis
    Posts
    1,808


    Ethnic group
    Italian
    Country: Italy



    1 members found this post helpful.
    Originally posted by AnthrogenicaMember:
    "CAN BE highly biased" might be more accurate. Check this out, PC1 and PC2 map pretty close to latitude and longitude:



    This PCA uses a very old set, has been in various papers for years, and is perhaps one of the worst around. So much so that SK (Slovakia) ends up behind Cyprus.

  12. #12
    Regular Member
    Join Date
    25-06-18
    Posts
    1,681

    Y-DNA haplogroup
    R1b-M269 (LDNA)
    MtDNA haplogroup
    U5a1b

    Ethnic group
    Thracian
    Country: Greece



    2 members found this post helpful.
    Originally posted by eupator:
    In case I am misunderstood: I am not 100% opposed to the use of PCAs. On the contrary.

    PCAs can give you a rough idea of where to tread more often than not.

    What I am against, and this is why I might come across as a zealot, is using them as some sort of gospel, especially when it comes to fine-tuning the minor or extremely minor components in order to re-write history.

    For example, I've seen on the web that one of the arguments used against Lazaridis' paper (2022), to counter the observation that EHG does not seem to exist in Anatolia, rests on the fact that the closed-source PCA of choice shows such admixture at 1-3%.

    I don't find this line of argument serious.
    PCA studies should lay all their cards on the table: the dataset, the assumptions, and how much of the variance is accounted for by each PC. Anybody should be able to replicate their experiments/studies; otherwise they are useless. The same criticism should apply to all studies and tools.
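
    As a rough illustration of what reporting the variance per PC could look like, here is a minimal sketch in Python with NumPy. The genotype matrix is random filler (50 hypothetical samples by 6 hypothetical markers coded 0/1/2), and `explained_variance_ratio` is just a helper name chosen for this example, so the printed percentages mean nothing beyond showing the bookkeeping.

```python
import numpy as np

def explained_variance_ratio(samples):
    """Fraction of total variance carried by each PC, largest first."""
    centered = samples - samples.mean(axis=0)
    # Singular values of the centered matrix are returned in descending order.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2 / (len(samples) - 1)
    return variances / variances.sum()

# Made-up data: fixed seed so anyone rerunning this gets the same matrix.
rng = np.random.default_rng(seed=0)
genotypes = rng.integers(0, 3, size=(50, 6)).astype(float)

ratios = explained_variance_ratio(genotypes)
for i, ratio in enumerate(ratios, start=1):
    print(f"PC{i}: {ratio:.1%} of variance")
print(f"PC1 + PC2 together: {ratios[:2].sum():.1%}")
```

    A 2D plot only ever shows the PC1 + PC2 share, so publishing these numbers next to the plot (along with the sample list and the seed or settings used) tells the reader how much of the data the picture leaves out.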
