DNA testing companies under attack

Maciamo

I don’t have time right now to get involved in a lengthy discussion so I’m only responding to a few comments very briefly.

As regards cousin matching, everyone has different reasons for testing. I am more interested in using DNA testing to verify my genealogical research.

I’m somewhat confused about your reference to “historical population geneticists”. Historians specialise in history and population geneticists specialise in population genetics. Ideally we should leave the historians to study history and let the population geneticists analyse the genetic data. There certainly have been problems with papers published by geneticists without any input from historians. Fortunately, research is now very multidisciplinary. Population genetics is a very specialised and highly mathematical discipline. Models are a fundamental part of population genetics. See for example Servedio et al 2015 “Not just a theory—the utility of mathematical models in evolutionary biology”. If you go against the consensus and reject the use of models what methodology are you using to test your hypotheses about the origins of haplogroups?

The peer review process is important because it does at least have the effect of filtering out most of the pseudoscience. It also ensures that published papers conform to acceptable standards in terms of referencing and access to raw data, computer code, etc. The fact that something has gone through the peer review process means that it’s more likely to attract the attention of serious researchers, and also gives the research more credibility. However, the process is not perfect and there are still bad papers that scrape through.

I’m glad you were able to have a look at our website and that you can now appreciate the problems that BritainsDNA caused in the UK with their misleading marketing claims.

I agree that it is possible to produce very fine-scale phylogenies now that we have access to advanced Y-DNA testing (e.g. the BigY from Family Tree DNA and the Full Genomes YElite test). However, the phylogenies are only telling us about the structure of the tree. The phylogenies do not tell us anything about where these SNPs occurred in the past. There are many different ways of estimating TMRCAs and there are very wide confidence intervals, so even if we can get a reasonably accurate TMRCA for a specific SNP that is still only telling us about the present-day distribution of that SNP and not its past distribution. There is a growing collection of ancient samples that can be used to provide a temporal fix but the number of samples is still very small and comprehensive SNP testing is often not done or is not possible.

As you say, someone could take a Y-DNA test and learn that they are I1-Y1835, and that Y18385 is found today at high prevalence in Norway. With more data the confidence intervals on the TMRCAs are reduced but this still doesn’t tell us where Y18385 originated 1850 years ago plus or minus several hundred years. A lot can happen in 2000 years. Why do you think a male line would have stayed in the same location for all those years? There are always many different stories that will explain the data. That’s why it’s important to do the hypothesis testing to determine the most likely scenario.
 
Thanks for your reply, Debbie.

I agree that it is possible to produce very fine-scale phylogenies now that we have access to advanced Y-DNA testing (e.g. the BigY from Family Tree DNA and the Full Genomes YElite test). However, the phylogenies are only telling us about the structure of the tree. The phylogenies do not tell us anything about where these SNPs occurred in the past. There are many different ways of estimating TMRCAs and there are very wide confidence intervals, so even if we can get a reasonably accurate TMRCA for a specific SNP that is still only telling us about the present-day distribution of that SNP and not its past distribution. There is a growing collection of ancient samples that can be used to provide a temporal fix but the number of samples is still very small and comprehensive SNP testing is often not done or is not possible.

As you say, someone could take a Y-DNA test and learn that they are I1-Y1835, and that Y18385 is found today at high prevalence in Norway. With more data the confidence intervals on the TMRCAs are reduced but this still doesn’t tell us where Y18385 originated 1850 years ago plus or minus several hundred years. A lot can happen in 2000 years. Why do you think a male line would have stayed in the same location for all those years? There are always many different stories that will explain the data. That’s why it’s important to do the hypothesis testing to determine the most likely scenario.

I don't quite agree. It's part of the job of a population geneticist to investigate where subclades are found and trace back their progressive migration clade after clade. In the case of haplogroup I1, if you notice that a branch is found in Scandinavia and Britain today, you can dig deeper into the phylogeny until you find the split where one subclade is only found in Britain and not in Scandinavia or elsewhere (allowing possibly for minor back migrations, as there are always people who emigrate). At that point, based on the TMRCA and the distribution within Britain, you can determine the likelihood (which will need to be confirmed by ancient DNA tests later) that a subclade is either Anglo-Saxon or Viking. There is nothing extraordinary or magical about that process. That's a well accepted methodology in genetic genealogy.

And that is exactly the methodology I used to determine that R1b originated in Siberia or Central Asia during the late Palaeolithic period, then ended up in various places between the northern Fertile Crescent and Russia during the Mesolithic and Neolithic, and that eventually subclades downstream of R1b-L23 moved westward, reaching central Europe by 2500 BCE, then expanding across western Europe between 2200 and 1200 BCE. I made this migration map in 2013, two years before R1b was found in the Yamna culture.

[Image: R1b-migration-map.jpg]


I made another similar map (in Flash format so I can't copy it here) in 2009, three years before any ancient Y-DNA had been tested. Of course those predictions wouldn't have been possible, especially regarding the dates, without knowledge of archaeology, comparing morphological types (e.g. skull shape), burial types, and above all following the diffusion of bronze technology from the Steppe to Western Europe. I used the same methodology for haplogroup R1a and others. I have never gone wrong using the combination of phylogeny, TMRCA, and archaeological/anthropological evidence. For R1a and R1b, there was the additional evidence from linguistics, as their diffusion matched quite well the estimated divergence ages of the various branches of the Indo-European languages.

In science, a methodology works if it leads to verifiable results, and mine were verified by ancient DNA tests. I know it can be hard for people who have studied exact sciences to accept some of the methodologies of social sciences, which aren't based on mathematics. But that doesn't mean they are necessarily wrong or can't be verified. My academic background is rather unusual since I studied economics, history, then biomedical sciences (immunology, bacteriology, virology, genetics, neuroscience, and so on). Very few people follow such different paths. People readily cross from one natural science to another, or from one social science to another, but there seems to be some kind of enmity between the two groups. I don't care, because I never studied with the aim of getting a job like most people, but to satisfy my thirst for knowledge and understanding. I always knew I would be self-employed as I am too independent and individualistic. I could study all these subjects, plus learn seven languages (self-taught, including English) because I realised that I learn at least twice as fast as an average university student, thanks to a quasi-eidetic memory. There are several exceptionally gifted people in my family, but none quite like me. I always joke that the less gifted ones became medical doctors (four in total) as they clearly do not have the same curiosity, thirst for knowledge, and the same aptitude to absorb knowledge and detect patterns that nobody else can see.


Going back to the topic of the BritainsDNA controversy, what your team at UCL said could easily be misconstrued by lay people with no knowledge of population genetics. A title like "To claim someone has 'Viking ancestors' is no better than astrology" or "Expensive tests claiming to trace person's ancestry are as dubious as astrology, warn scientists" will be perceived by the public as saying that DNA tests cannot determine ancestry at all, which is nonsense. What is the purpose of genetic genealogy and phylogeography if not to be able to tell where our ancestors came from? The only issue was that naive customers could believe that they were 100% Viking genetically when their test results showed that their patrilineal haplogroup 'probably' came to Britain with the Vikings. That's a problem of human stupidity, not a problem with the DNA test itself, nor with the ability of genetic genealogy to estimate the likelihood of one's Y-DNA haplogroup being of Viking origin. It is far more hazardous to give a reliable percentage of autosomal DNA of Viking origin at present, and it will always be less reliable in terms of probability than the 'simple' Y-DNA line because of recombination and very similar shared ancestry between Germanic tribes.

As for mtDNA, I have explained for years that it is practically useless for tracing ancestry over at least the last 4000 or 5000 years. At best mtDNA can distinguish between Mesolithic Europeans, Neolithic Near Eastern farmers and Steppe Indo-Europeans, but nothing more recent since these populations mixed, as mtDNA evolves very slowly and, in a sequence only 16,569 bases long, any mutation can quickly become hazardous to health. I would have understood if you had described as 'genetic astrology' a company that claimed to be able to tell whether someone is of Viking descent based solely on mtDNA. But that is absolutely not the case, and as far as I can tell from the result samples you posted on your blog, they are not really making any outlandish claims. Your criticism was essentially based on the lack of information regarding their sources, or the inability to download the raw mtDNA data. But for most ordinary customers who aren't geneticists and don't care so much about what their test results mean, it's probably not a big issue. I am pretty sure that these were details that could be solved by sending an email to BritainsDNA to request more information. As you said yourself, this test was heavily marketed in the UK and was therefore intended for ordinary customers, not researchers or professionals. People who are interested enough to want to know more about their test results, including more accurate distribution maps and phylogeography, usually come to this website. ;)

The bottom line is that when lay people read on the BBC website or in a top national newspaper that a professor of genetics from a reputable university is saying that DNA tests can't tell you your ancestry, that is what most people will remember, and that will negatively affect all DNA testing companies, not just BritainsDNA. People will just lose confidence in these tests because of the polemic you created. That's not good for researchers either when we know that there are at the very least 10 times more DNA test results from commercial companies than from research labs. Companies like 23andMe have been very clever to gather all this data for medical research while making a profit from it. But even genetic genealogists can benefit from commercial tests through the numerous regional projects at FTDNA. So I am not sure it's a good idea to give such bad press to ancestry tests, especially when it sounds more like a personal vendetta.
 

Thanks for your reply, Debbie.



I don't quite agree. It's part of the job of a population geneticist to investigate where subclades are found and trace back their progressive migration clade after clade. In the case of haplogroup I1, if you notice that a branch is found in Scandinavia and Britain today, you can dig deeper into the phylogeny until you find the split where one subclade is only found in Britain and not in Scandinavia or elsewhere (allowing possibly for minor back migrations, as there are always people who emigrate). At that point, based on the TMRCA and the distribution within Britain, you can determine the likelihood (which will need to be confirmed by ancient DNA tests later) that a subclade is either Anglo-Saxon or Viking. There is nothing extraordinary or magical about that process. That's a well accepted methodology in genetic genealogy.

Are you able to calculate the likelihood of both the tree and the spatial distribution of its branches?
 
Are you able to calculate the likelihood of both the tree and the spatial distribution of its branches?

I am not sure I understand your question. Could you rephrase it and maybe provide an example? Are you still talking about the specific case of Anglo-Saxon vs Viking, or something more general? The methodology and accuracy obviously vary depending on whether we are talking about a well-documented migration in historical times or an undocumented one in prehistoric times.
 
I am not sure I understand your question. Could you rephrase it and maybe provide an example? Are you still talking about the specific case of Anglo-Saxon vs Viking, or something more general? The methodology and accuracy obviously vary depending on whether we are talking about a well-documented migration in historical times or an undocumented one in prehistoric times.

You used the term 'likelihood', a concept which is frequently used in phylogenetic inference. Can you explain what you mean by 'likelihood'?
 
I don't quite agree. It's part of the job of a population geneticist to investigate where subclades are found and trace back their progressive migration clade after clade.
I am extremely confident that the vast majority of population geneticists would disagree.
In the case of haplogroup I1, if you notice that a branch is found in Scandinavia and Britain today, you can dig deeper into the phylogeny until you find the split where one subclade is only found in Britain and not in Scandinavia or elsewhere (allowing possibly for minor back migrations, as there are always people who emigrate). At that point, based on the TMRCA and the distribution within Britain, you can determine the likelihood (which will need to be confirmed by ancient DNA tests later) that a subclade is either Anglo-Saxon or Viking. There is nothing extraordinary or magical about that process. That's a well accepted methodology in genetic genealogy.
It's not a well-accepted methodology in population genetics. See, for example:
Nielsen & Beaumont (2009) Statistical inferences in phylogeography. Molecular Ecology 18, 1034–1047.

Indeed, such a ‘methodology’ is fraught with problems.
In statistics, likelihood is an assessment of the weight of support for a model / model parameter values (θ), given some data. The probability of the data (D) given θ, i.e. P(D|θ), is the likelihood of θ given D, i.e. L(θ|D). This is used widely in science, including population genetics, forensics, etc. It is very useful in phylogenetic tree inference as it provides a means of assessing which is the best of a set of proposed trees. However, to do this you need a likelihood function: a mathematical formulation to calculate P(D|θ). This is possible for trees, but I am not aware of a likelihood function to calculate P(D|θ) where θ is both the tree itself and the spatial / geographic distribution of lineages on that tree through time. If you have such a function, please do let me know.
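
To make the definition concrete, here is a minimal sketch of a likelihood function for a deliberately simple model. It is not a phylogeographic likelihood (the kind asked for above) and is only meant to show what "calculating P(D|θ)" looks like in practice. The data are hypothetical: k carriers of some haplogroup in a sample of n men, with θ taken to be the population frequency. The function name and the numbers are illustrative assumptions.

```python
from math import comb

def binomial_likelihood(theta, k, n):
    """L(theta | D) = P(D | theta) for k carriers observed in a sample of n."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)

# Hypothetical data: 30 carriers in a sample of 100 men.
k, n = 30, 100

# Evaluate the likelihood over a grid of candidate frequencies.
grid = [i / 100 for i in range(1, 100)]
likelihoods = {theta: binomial_likelihood(theta, k, n) for theta in grid}
best_theta = max(likelihoods, key=likelihoods.get)

# Likelihood ratio: how much better one hypothesis explains the data than another.
ratio = binomial_likelihood(0.3, k, n) / binomial_likelihood(0.5, k, n)

print(f"maximum-likelihood theta on the grid: {best_theta}")
print(f"theta = 0.3 is supported {ratio:.0f} times more strongly than theta = 0.5")
```

The same logic applies to trees: each candidate tree is a value of θ, and the one that makes the observed sequences most probable is preferred. The difficulty raised above is that no comparable function is available when θ also has to include where each lineage was located through time.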


And that is exactly the methodology I used to determine that R1b originated in Siberia or Central Asia during the late Palaeolithic period, then ended up in various places between the northern Fertile Crescent and Russia during the Mesolithic and Neolithic, and that eventually subclades downstream of R1b-L23 moved westward, reaching central Europe by 2500 BCE, then expanding across western Europe between 2200 and 1200 BCE. I made this migration map in 2013, two years before R1b was found in the Yamna culture.

Have you tested your methodology in cases where you know exactly what the true history is (for example, by applying it to simulated data)?

I made another similar map (in Flash format so I can't copy it here) in 2009, three years before any ancient Y-DNA had been tested. Of course those predictions wouldn't have been possible, especially regarding the dates, without knowledge of archaeology, comparing morphological types (e.g. skull shape), burial types, and above all following the diffusion of bronze technology from the Steppe to Western Europe. I used the same methodology for haplogroup R1a and others. I have never gone wrong using the combination of phylogeny, TMRCA, and archaeological/anthropological evidence. For R1a and R1b, there was the additional evidence from linguistics, as their diffusion matched quite well the estimated divergence ages of the various branches of the Indo-European languages.
So why would you expect the TMRCA of a lineage to correspond to the timing of the demographic processes by which that lineage was spread?

Going back to the topic of the BritainsDNA controversy, what your team at UCL said could easily be misconstrued by lay people with no knowledge of population genetics. A title like "To claim someone has 'Viking ancestors' is no better than astrology" or "Expensive tests claiming to trace person's ancestry are as dubious as astrology, warn scientists" will be perceived by the public as saying that DNA tests cannot determine ancestry at all, which is nonsense. What is the purpose of genetic genealogy and phylogeography if not to be able to tell where our ancestors came from?

Did you not read this bit: “There are some situations where Y chromosome or mitochondrial DNA information can be useful. It is, for example, reasonable to use large samples of these DNA types to say something about the histories of populations, if analyses are performed carefully and at the population level. Also, if genealogical research (parish records, surnames, etc.) suggests that two men share a common male line ancestor in the 16th century, the Y chromosome could be used to support or reject this claim. But individual Y chromosome or mitochondrial DNA types provide no more than the vaguest hint about where their ancestors lived hundreds, or thousands of years ago.”

The only issue was that naive customers could believe that they were 100% Vikings genetically when their test results showed that their patrilineal haplogroup 'probably' came to Britain with the Vikings. That's a problem of human stupidity, not a problem with the DNA test itself, nor with the ability of genetic genealogy to estimate the likelihood of one's Y-DNA haplogroup being of Viking origin. It is far more hazardous to give a reliable percentage of autosomal DNA of Viking origin at present, and it will always be less reliable in terms of probability than the 'simple' Y-DNA line because of recombinations and very similar shared ancestry between Germanic tribes.
I wrote the article because I was concerned that customers were being duped. The company was telling stories that mtDNA and Y chromosome did not, and in most cases, could not support. Is it such a bad thing to point out these legitimate concerns?
 
I don't quite agree. It's part of the job of a population geneticist to investigate where subclades are found and trace back their progressive migration clade after clade. In the case of haplogroup I1, if you notice that a branch is found in Scandinavia and Britain today, you can dig deeper into the phylogeny until you find the split where one subclade is only found in Britain and not in Scandinavia or elsewhere (allowing possibly for minor back migrations, as there are always people who emigrate). At that point, based on the TMRCA and the distribution within Britain, you can determine the likelihood (which will need to be confirmed by ancient DNA tests later) that a subclade is either Anglo-Saxon or Viking. There is nothing extraordinary or magical about that process. That's a well accepted methodology in genetic genealogy.

Genetic genealogy is the combination of genetics with genealogical research. Population genetics requires a different expertise altogether. There is an acquired knowledge base that has been built up over the last 100 years and an extensive literature. To answer questions about historical populations, population genetics techniques must be deployed. An additional problem is the ad hoc nature of the genetic genealogy databases. They are heavily skewed in favour of people of European and British origin. Americans constitute about 70% of the customer database at Family Tree DNA, and the American results are often the subject of founder effects. How are you taking these biases into account? What evidence do you have of Y-chromosome continuity in Europe in the last 1000 years? It is rare to find a male line that has stayed put in the same place. My Cruwys family is a rare example of a surname that has stayed in a single location (Cruwys Morchard in North Devon, England) for over 800 years. However, the surname originates in Normandy or Flanders and not in Britain.

In science, a methodology works if it leads to verifiable results, and mine were verified by ancient DNA tests. I know it can be hard for people who have studied exact sciences to accept some of the methodologies of social sciences, which aren't based on mathematics. But that doesn't mean they are necessarily wrong or can't be verified.

Scientists use the scientific method which is a process of formulating hypotheses and testing them. I am still not clear how you have tested your hypotheses. You seem to be doing a post hoc interpretation of the data. It is very easy to detect patterns in data and attribute significance where none exists. This is where the problems of confirmation bias can come into play. Ancient DNA has only provided evidence that a haplotype is in a particular location at a specific point of time. Why do you think this should correlate with the place of origin of the haplogroup?

Going back to the topic of the BritainsDNA controversy, what your team at UCL said could easily be misconstrued by lay people with no knowledge of population genetics. A title like "To claim someone has 'Viking ancestors' is no better than astrology" or "Expensive tests claiming to trace person's ancestry are as dubious as astrology" will be perceived by the public as saying that DNA tests cannot determine ancestry at all, which is nonsense. What is the purpose of genetic genealogy and phylogeography if not to be able to tell where our ancestors came from? The only issue was that naive customers could believe that they were 100% Viking genetically when their test results showed that their patrilineal haplogroup 'probably' came to Britain with the Vikings. That's a problem of human stupidity, not a problem with the DNA test itself, nor with the ability of genetic genealogy to estimate the likelihood of one's Y-DNA haplogroup being of Viking origin. It is far more hazardous to give a reliable percentage of autosomal DNA of Viking origin at present, and it will always be less reliable in terms of probability than the 'simple' Y-DNA line because of recombination and very similar shared ancestry between Germanic tribes.

A DNA test on its own can't tell us where our ancestors come from. That's why we use genealogical records with the genetic evidence to make inferences within a genealogical time frame. Phylogeography is the process of adding the geographical locations of modern populations to the phylogeny. Phylogeography only tells us about the present-day distribution of populations and not where those populations came from. The problem is that there are often many plausible stories that will fit the data and it is very easy for subjective biases to come into play. The simplest explanation is not necessarily the right one. See: Haber et al 2016. Ancient DNA and the rewriting of human history: be sparing with Occam’s razor. Genome Biol 17: 1. See also the blog post I wrote for Sense About Science on Sense About Genealogical Testing which clarifies the legitimate uses of genetic ancestry for genealogical purposes.

But that is absolutely not the case, and as far as I can tell from the result samples you posted on your blog, they are not really making any outlandish claim. Your criticism was essentially based on the lack of information regarding their sources, or the inability to download the raw mtDNA data. But for most ordinary customers who aren't geneticists and don't care so much about what their test results mean, it's probably not a big issue. I am pretty sure that these were details that could be solved by sending an email to BritainsDNA to request more information. As you said it yourself, this test was heavily marketed in the UK and was therefore intended for ordinary customers, not researchers or professionals. People who are interested enough to want to know more about their test results, including more accurate distribution maps and phylogeography, usually come to this website....

The bottom line is that when lay people read on the BBC website or in a top national newspaper that a professor of genetics from a reputable university is saying that DNA tests can't tell you your ancestry, that is what most people will remember, and that will negatively affect all DNA testing companies, not just BritainsDNA. People will just lose confidence in these tests because of the polemic you created. That's not good for researchers either when we know that there are at the very least 10 times more DNA test results from commercial companies than from research labs. Companies like 23andMe have been very clever to gather all this data for medical research while making a profit from it. But even genetic genealogists can benefit from commercial tests through the numerous regional projects at FTDNA. So I am not sure it's a good idea to give such bad press to ancestry tests, especially when it sounds more like a personal vendetta.

The BritainsDNA Chromo2 test was quite a reasonable test. The problems related to their earlier test, which was very expensive and only tested around 300 Y-SNPs and 300 mtDNA SNPs. They were also getting the mtDNA haplogroup assignments wrong. This was particularly disappointing for female testers, especially as they could have had their full mtDNA sequenced at FTDNA (16,569 bases) for much less than BritainsDNA were charging. However, our criticisms of the company are based not so much on the quality of the tests and the just-so reports as on the extensive misleading media coverage they have generated. In particular there was concern that the tests were being promoted as part of a "massively subsidised" project, with the implication being that the test-takers were contributing to legitimate academic research. If you look at the timeline on the Debunking Genetic Astrology website you'll see that Mark Thomas and David Balding corresponded privately with BritainsDNA but did not get a satisfactory response and instead received a threatening legal letter. The bad science promoted by the company had the effect of bringing population genetics and genetic genealogy into disrepute because the claims made were so outlandish. Some people didn't want to test because they thought it was a scam.

I don't know if you've had any personal contact with customers of BritainsDNA but I've had quite a few of their customers writing to me because they are confused about their results. Some of them have been falsely sold the BritainsDNA test as a genealogy test, and are very disappointed when they discover that the test generally has no application for genealogy. Customers have difficulty locating other sources of information because the company uses its own proprietary SNP naming system (S series SNPs).
 
Empty_Genes and DebbieK, I'm curious about your opinion of the work of Stephen Oppenheimer. I bring him up, because when I think of misleading interpretations of Y-DNA data that have dispersed into discussions by laymen, I think of Oppenheimer's work. It got play in Prospect Magazine, The Telegraph, The New York Times, WalesOnline, etc. The amount of people who have read those articles dwarfs the amount of people who have put any serious thought into population genetics, so even though they were published 10 years ago, many still think that Britain is basically full of direct descendants of Stone Age "Basques."


The flaws in Oppenheimer's methodology, I think you'd agree, are obvious. He basically noticed that Basques are R1b dominant, and Britain is R1b dominant, and voila! Of course, the more we've gotten ancient DNA results, the more his hypotheses have been discounted.


The thing is, the same hasn't been true of most of Maciamo's hypotheses. Haplogroups C-V20 and I as Paleolithic/Mesolithic seem to be holding up, as is Haplogroup G2a as Neolithic, and R1b postdating G2a too. What do you think is different about how they developed their interpretations of the data? Something must be better, at least. I've participated in some of this "post hoc interpretation," and although I don't really consider it an exact science, I've been impressed with the predictive power, in particular of predicting what Y-DNA haplogroups ancient samples will belong to. I don't think it's a problem to also give people best-guess stories about the history of their own haplogroups, at least if you add a lot of probably's. Like, "As Haplogroup I, your lineage has probably been in Europe since the Stone Age, and probably predates most other modern European lines in Europe." It gets tougher when trying to distinguish between Celts and Vikings and the like, but there are similar patterns at that scale as well.


Also, whether you think it's a good idea or not, population geneticists have participated in speculative interpretation of data, much like hobbyists do. They've also often gotten it wrong, usually due to drawing conclusions with insufficient data. For example, Rosser 2000 gave some possible explanations that turned out to be wrong, in particular putting R1b as Paleolithic, and Balaresque 2010 predicted an early Neolithic entry of R1b, which hasn't been holding up.
 
Dear Sparkey

You make some really interesting points; responses below:
Empty_Genes and DebbieK, I'm curious about your opinion of the work of Stephen Oppenheimer. I bring him up, because when I think of misleading interpretations of Y-DNA data that have dispersed into discussions by laymen, I think of Oppenheimer's work. It got play in Prospect Magazine, The Telegraph, The New York Times, WalesOnline, etc. The amount of people who have read those articles dwarfs the amount of people who have put any serious thought into population genetics, so even though they were published 10 years ago, many still think that Britain is basically full of direct descendants of Stone Age "Basques."
I know Stephen Oppenheimer personally, and think he is an extremely decent chap. That said, yes, I agree with you, it's nonsense. But the reason I think that is because he is using untested inference methodologies. The speculation of a Paleolithic origin for shared Y chromosomes between Basques and Britain wasn’t even Oppenheimer’s; it was first suggested in Wilson et al (2001: PNAS 98(9):5078-5083).


The flaws in Oppenheimer's methodology, I think you'd agree, are obvious. He basically noticed that Basques are R1b dominant, and Britain is R1b dominant, and voila! Of course, the more we've gotten ancient DNA results, the more his hypotheses have been discounted.
I agree. But as you can probably guess, my concerns about his and others’ use of this inference methodology (interpretative phylogeography) are more general. It is of course possible that, based on an interpretative phylogeography, somebody could make predictions that later find support from ancient DNA data. One clear-cut example is that Richards et al (2000: Am. J. Hum. Genet. 67:1251–1276) used this sort of approach to predict that mtDNA U lineages would be predominant in pre-Neolithic Europe, and Bramanti et al (2009: Science 326: 137-140) and others later showed this to be the case.

The thing is, the same hasn't been true of most of Maciamo's hypotheses. Haplogroups C-V20 and I as Paleolithic/Mesolithic seems to be holding up, as is Haplogroup G2a as Neolithic, and R1b postdating G2a too.
It may seem a trivial point, but everybody prior to ~12,000 years ago was Paleolithic/Mesolithic.
What do you think is different about how they developed their interpretations of the data? Something must be better, at least. I've participated in some of this "post hoc interpretation," and although I don't really consider it an exact science, I've been impressed with the predictive power, in particular of predicting what Y-DNA haplogroups ancient samples will belong to. I don't think it's a problem to also give people best-guess stories about the history of their own haplogroups, at least if you add a lot of probably's. Like, "As Haplogroup I, your lineage has probably been in Europe since the Stone Age, and probably predates most other modern European lines in Europe." It gets tougher when trying to distinguish between Celts and Vikings and the like, but there are similar patterns at that scale as well.
Again, I agree that it is OK to “give people best-guess stories about the history of their own haplogroups”. My concerns come with presenting such best-guess stories as science. If you use interpretative phylogeography you can make countless predictions, so obviously some of those will hold up, and those concerning more recent populations are more likely to hold up. But where is the systematic testing of these inference methodologies? There is too much scope for cherry-picking the predictions / confirmation bias, as there is for servicing any of a very large number of mutually exclusive population histories. If I managed to pick some winners in the Epsom races, would you then assume that I have a fool-proof system?
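To put rough numbers on the Epsom point, here is a minimal Python sketch. The figures are hypothetical (20 predictions, each with a 1-in-4 chance of holding up by luck alone) and the guesses are assumed to be independent, which real haplogroup predictions are not. It only illustrates that a run of "winners" is expected even from a method with no skill.

```python
from math import comb

def prob_at_least(hits, predictions, chance):
    """Probability of at least `hits` successes among independent guesses,
    each of which is right with probability `chance` (binomial tail)."""
    return sum(
        comb(predictions, k) * chance**k * (1 - chance) ** (predictions - k)
        for k in range(hits, predictions + 1)
    )

# Hypothetical inputs, chosen only for illustration.
predictions = 20   # number of guesses made
chance = 0.25      # chance that any single guess is right by luck

expected_hits = predictions * chance
p_five_or_more = prob_at_least(5, predictions, chance)
print(f"expected lucky hits: {expected_hits}")
print(f"P(at least 5 lucky hits): {p_five_or_more:.3f}")
```

Counting only the hits, without also counting the misses, says nothing about whether the method works.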
Added to that, just because a particular haplogroup is observed in some ancient DNA sample, that doesn’t mean that is where that haplogroup originated. We recently showed that G2a was present in Aegean early Neolithics (Hofmanova et al 2016: PNAS 113(25):6886-6891). We also showed that Aegean early Neolithic individuals were genomically very similar to early farmers from across Europe, in whom G2a is frequently found. So is the presence of G2a in early Aegean farmers really that surprising? But that doesn’t mean the Aegean is the G2a homeland.
The main point is that for an inference methodology to be taken seriously as a scientific method, it should be systematic (i.e. formulated so that it is free from steering by subjective biases, and can be automated) and then be tested on data where the population histories are fully known. To me, that means tested on data that has been simulated under a known model of population history.
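As a rough illustration of what "tested on simulated data" could look like, here is a minimal Python sketch. Everything in it is an assumption made for the example and is not taken from any of the papers cited: a toy model with a line of demes, nearest-neighbour migration and Wright-Fisher drift, and a deliberately naive rule that says "the haplogroup originated wherever it is most frequent today". Because the true origin is known in every simulated run, the rule can be scored automatically.

```python
import random

def simulate_frequencies(n_demes, deme_size, origin, start_freq,
                         migration, generations, rng):
    """Toy model: haplogroup frequency in a line of demes, with
    nearest-neighbour migration followed by Wright-Fisher drift."""
    freqs = [0.0] * n_demes
    freqs[origin] = start_freq
    for _ in range(generations):
        mixed = []
        for i in range(n_demes):
            neighbours = [j for j in (i - 1, i + 1) if 0 <= j < n_demes]
            stay = 1.0 - migration * len(neighbours)
            inflow = migration * sum(freqs[j] for j in neighbours)
            mixed.append(stay * freqs[i] + inflow)
        # Drift: each deme is resampled from its post-migration frequency.
        freqs = [
            sum(rng.random() < p for _ in range(deme_size)) / deme_size
            for p in mixed
        ]
    return freqs

def score_highest_frequency_rule(replicates, seed, n_demes, deme_size,
                                 start_freq, migration, generations):
    """Return (hits, usable): how often the naive rule recovers the known
    origin, among runs in which the haplogroup survived."""
    rng = random.Random(seed)
    hits = usable = 0
    for _ in range(replicates):
        origin = rng.randrange(n_demes)
        freqs = simulate_frequencies(n_demes, deme_size, origin, start_freq,
                                     migration, generations, rng)
        if max(freqs) == 0:
            continue  # haplogroup lost everywhere; nothing to infer
        usable += 1
        guess = freqs.index(max(freqs))  # the naive inference rule
        hits += guess == origin
    return hits, usable

# Hypothetical parameter values, chosen only to keep the run short.
hits, usable = score_highest_frequency_rule(
    replicates=100, seed=1, n_demes=5, deme_size=50,
    start_freq=0.5, migration=0.05, generations=100)
print(f"naive rule recovered the true origin in {hits} of {usable} usable runs")
```

The same harness could score any other rule that can be written down as a function of the simulated data, which is the sense in which a method needs to be systematic before it can be tested.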

Also, whether you think it's a good idea or not, population geneticists have participated in speculative interpretation of data, much like hobbyists do. They've also often gotten it wrong, usually due to drawing conclusions with insufficient data. For example, Rosser 2000 gave some possible explanations that turned out to be wrong, in particular putting R1b as Paleolithic, and Balaresque 2010 predicted an early Neolithic entry of R1b, which hasn't been holding up.
Indeed, and I could list a whole lot more. But I’d like to make it clear that I do not see hobbyists and professional (i.e. paid) scientists as distinct. I don’t care if you are a professor of some august institution, or working as an assistant examiner in the Swiss Federal Patent Office, or collecting shopping trolleys in a supermarket car park: if you follow the scientific method then you are doing science! In my view there have been many such flawed studies, and they are flawed primarily because they don’t follow the scientific method, but rather make inferences using ad hoc and untested methodologies.
Interpretative phylogeography isn’t the only untested (and possibly untestable) inference methodology used in the genetic history literature; there are many papers that fall into this category, and most population geneticists consider them problematic at the very least.
 

I think what Sparkey means by Mesolithic is the Mesolithic Europeans who lived 12,000 ybp. As far as GEDmatch is concerned, I usually check the ancient DNA samples to see whether the test is biased and compare the results with what we currently know; if the test is way off from our present, up-to-date knowledge and doesn't guess my countries of origin correctly, then the test is most likely biased (see the link below). Maciamo has also written a little disclaimer (again, see the link below). Ultimately, if you want to understand your results, I recommend studying the countries your ancestors came from and their history. On a side note, G2a was already spread out during the Neolithic, so no, of course the Aegean is not the original G2a homeland.


http://www.y-str.org/p/ancient-dna.html

http://www.eupedia.com/europe/origins_haplogroups_europe.shtml

http://www.eupedia.com/europe/neolithic_europe_map.shtml
 
From the first link:

"If you go back 3,000 years, your ancestors are almost everybody’s ancestors."

This is utter nonsense.

The number of ancestors does not grow exponentially with generations due to a thing called "pedigree collapse":

https://en.wikipedia.org/wiki/Pedigree_collapse

Yes, but it is generally true, at least mathematically, and mathematics is the final verification of anything.

It is enough to go back 1,000 years to have 1,000,000,000,000+ ascendants, at a time when 500,000,000 people lived and when, for example, any given country had no more than 100,000-3,000,000 inhabitants, who mostly bred with each other. It only shows that any orphanage leftist babbling about equality of ascendants is double Dutch, which is simply a delusion and idiocy.

And an ancestor is not the same as an ascendant.
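The arithmetic behind both posts can be checked in a few lines of Python. This is only a sketch: the 25-year generation length is an assumption, and the 500,000,000 population figure is simply the one quoted above. It shows that the number of slots in a pedigree doubles every generation, while the number of distinct people available to fill them cannot exceed the population alive at the time, which is what pedigree collapse means.

```python
def generations_back(years, years_per_generation=25):
    """Whole generations in a span of years (generation length is assumed)."""
    return years // years_per_generation

def ancestor_slots(generations):
    """Positions in a pedigree at a given depth: doubles every generation."""
    return 2 ** generations

gens = generations_back(1000)
slots = ancestor_slots(gens)
population = 500_000_000  # figure quoted in the post, not independently checked

# Slots per living person if every person alive were an ancestor: a lower
# bound on how often, on average, the same individuals must be repeated.
min_average_repeats = slots / population

print(f"{gens} generations -> {slots:,} pedigree slots")
print(f"at most {population:,} distinct ancestors can fill them")
print(f"so each would appear at least ~{min_average_repeats:,.0f} times on average")
```

So the slot count does grow exponentially, but the count of distinct ancestors is capped by the population size.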
 
Top academics, even those researching specifically haplogroups, were not immune from such fancies, and were often the first culprits in spreading them. Everyone remembers how until 2008 the academic establishment would tell the world that haplogroup R1b descended from Cro-Magnons, because its modern geographic distribution looked like it expanded from the Franco-Cantabrian refugium after the LGM.

It is a good example of how much idiocy is spread by "scholars" who, based on wrong assumptions, draw wrong conclusions. I have been interested in this stuff since the beginning of 2006 and I didn't buy this absurd theory at all from the start, because it made no sense. People are still creating other, often even more idiotic theories, which are falling one after another and of which there are plenty on this forum. I still stick to the one that is becoming more proven and true with every year, because only that one is obvious from the scriptural, historical, archaeological and genetic records, and any other is simply impossible.
 
It is a good example of how much idiocy is spread by "scholars" who, based on wrong assumptions, draw wrong conclusions. I have been interested in this stuff since the beginning of 2006 and I didn't buy this absurd theory at all from the start, because it made no sense. People are still creating other, often even more idiotic theories, which are falling one after another and of which there are plenty on this forum. I still stick to the one that is becoming more proven and true with every year, because only that one is obvious from the scriptural, historical, archaeological and genetic records, and any other is simply impossible.

What theories do you have in mind? And what exactly do you mean by "the one"? Just curious.
 
Yes, but it is generally true, at least mathematically, and mathematics is the final verification of anything.

It is enough to go back 1,000 years to have 1,000,000,000,000+ ascendants, at a time when 500,000,000 people lived and when, for example, any given country had no more than 100,000-3,000,000 inhabitants, who mostly bred with each other. It only shows that any orphanage leftist babbling about equality of ascendants is double Dutch, which is simply a delusion and idiocy.

And an ancestor is not the same as an ascendant.

that is true

it is impossible to sustain all human life created on earth
human life procreates itself too fast

humans, just like anything else on earth, should be subject to natural selection
the descendants of 2 humans can easily populate the whole of planet earth in about 600-700 years
 
Empty_Genes and DebbieK, I'm curious about your opinion of the work of Stephen Oppenheimer. I bring him up, because when I think of misleading interpretations of Y-DNA data that have dispersed into discussions by laymen, I think of Oppenheimer's work. It got play in Prospect Magazine, The Telegraph, The New York Times, WalesOnline, etc. The amount of people who have read those articles dwarfs the amount of people who have put any serious thought into population genetics, so even though they were published 10 years ago, many still think that Britain is basically full of direct descendants of Stone Age "Basques."

The flaws in Oppenheimer's methodology, I think you'd agree, are obvious. He basically noticed that Basques are R1b dominant, and Britain is R1b dominant, and voila! Of course, the more we've gotten ancient DNA results, the more his hypotheses have been discounted.

I've not read Oppenheimer's book. I did buy it but haven't managed to get beyond the first chapter on the Celts. I agree with Mark's comments. In addition, Oppenheimer was wrong to place so much emphasis on the Y-chromosome, which only represents half of the human population and only a small percentage of the total genome. Fortunately we now have really good data from the ground-breaking People of the British Isles Project. There should be Y-DNA and mtDNA data published from the POBI Project too, hopefully some time next year. I don't think anyone takes Oppenheimer's Basque hypothesis very seriously these days.
 
I apologise for the delayed reply. I was typing a detailed response yesterday, but my PC restarted following a Windows update while I was on the phone and I lost everything. I don't feel like rewriting everything, so I will be more succinct.

I am extremely confident that the vast majority of population geneticists would disagree.

Then what are they researching? Where is their work leading them? What questions are they trying to answer?

N.B.: I was of course referring to population geneticists working on human populations, and specialising in haplogroups and other tools relating to human evolution and history - not all population geneticists. I am sure that those specialising on horizontal gene transfers between bacteria couldn't care less. But I think that what I meant was clear from the context of haplogroup subclades.

It's not a well-accepted methodology in population genetics. See, for example:

Perhaps that is why the majority of population geneticists specialising in human (pre)history and migrations either cannot come up with anything useful to say about the data they collect, or when they do try to interpret the data, the result is nonsensical as they ignore many of the factors I mentioned above (geographic barriers, archeological evidence, etc.).

One of the first things I learned at university, as part of a philosophy course, was that it is vital to analyse data with the right tools. The scale of things isn't the same in fields like astronomy and molecular biology, and it would be absurd to use the laws of astrophysics to try to comprehend, say, the ATP synthesis process within a mitochondrion. It's not just a matter of using scientific tools appropriate to the right scale, but also of looking at a problem from every angle, using different tools appropriate to each relevant level. That is why it would be wrong, for example, to think one could make sense of human psychology solely on the basis of the underlying biochemical processes between neurons, without taking into consideration other levels such as neural networks, genetic factors (e.g. variants in genes like COMT, DRD2, SERT, MAOA), nutrition, human language and culture, interpersonal interactions, and the physical environment of the person. One thing that gets on my nerves is scientists who are so focused on their specialised field of expertise that they forget or ignore other relevant fields.

Historical population genetics is by its very nature transdisciplinary. The goal is to understand the history of human populations and all the tools available should be combined to achieve that goal. Mathematics has some use, for example to determine the TMRCA of Y-haplogroups or to calculate autosomal admixtures, but it would be irresponsible to think that one can understand human population history using only mathematical models and with complete disregard for variations in human mating behaviours and practices (e.g. exogamy, polygamy, patrilocality, droit du seigneur), causes and motivations for migrations (famine, war, natural disaster, greed, new technological development, new lifestyle, overpopulation, climate change), and without using tools like archaeology or climatology that can shed light on these practices and causes. It is as ridiculous as trying to understand human psychology using only mathematical models. The work of a historical population geneticist like me is more akin to a detective's forensic investigation, gathering all the evidence and using logic and deduction to solve the mysteries of unrecorded human prehistory. No two cases are exactly the same, and we have to work using the evidence and tools that are available to us in each case.

There is no use for theoretical models in this field as there won't be any other application beyond human population history. You can't use the same models for animal populations with completely different behaviours, animals that are not organised in societies like us, and that do not cultivate plants or domesticate other animals. There is only one human history and once we have reached a satisfying understanding of it, there won't be any use for predictive statistical models any more. Furthermore, a model that works in Bronze Age western Eurasia won't necessarily work in Neolithic Austronesia or Copper Age Mesoamerica because these populations lived in completely different environments and had very divergent cultures and lifestyles. Once again, a scientific model based on mathematics cannot take these factors into consideration. It's just not the appropriate tool.

Besides, a piece of evidence is not more accurate just because it is based on a mathematical model. TMRCA estimates vary widely between researchers using different methodologies, and none to my knowledge take regional historical population densities into consideration. (I explained this in a post on this forum nearly seven years ago.) It would be very hard to work without TMRCA estimates, but they are only estimates and are often even less accurate than radiocarbon dating (which is itself only approximate).
 
One methodology I have used to trace back the migration of haplogroups, and that to my knowledge nobody else has used, is to look at isolated populations around the world and see which Y-DNA and mtDNA haplogroups they share. The goal is to estimate which Y-DNA and mtDNA haplogroups spread together, with the help of known prehistoric migrations such as the spread of agriculture or bronze technology. I started doing it in January 2010 with Indo-European migrations, which allowed me to isolate some mtDNA lineages for the R1a branch and the R1b branch. This is one of several methods I used to sort out the separate migrations of each branch many years before ancient Y-DNA tests were available, and it turned out to be correct in every region tested so far. I could even tell if an ancient population was mixed R1a+R1b and which of the two would be dominant.

In 2013, I did the same for the migration of Neolithic farmers to North Africa and Iberia linked to haplogroups J1 and T1a. And soon after I analysed the mtDNA lineages of African populations that possessed Y-haplogroup R1b-V88. It is thanks to this method of linking mtDNA to Y-DNA and predicting the haplogroup make-up of the source population that I was able to predict the presence of Y-DNA haplogroups in ancient samples based on the mtDNA samples that had been tested many years earlier, such as the presence of R1b-V88 in Early Neolithic Spain. I also managed to deduce from the accumulated evidence that Y-haplogroup T1a originated with Early Neolithic farmers in the eastern Fertile Crescent, and not in East Africa (as modern frequencies suggest), nor among the Natufians or Anatolian farmers, as many people had suggested. So far all the ancient data has confirmed this.

No need for mathematical models here. Just data analysis, deduction from evidence and logic. No other population geneticist tried this method because it didn't exist. I had to invent it. The problem with most career academics is that they prefer to use tried and tested methods, and preferably, as you said (Mark), methods that can be quantified and tested with mathematical formulae. I am not afraid to try new methods and I don't care what others think of them as long as I get results. Look at Craig Venter, who went against the scientific establishment and managed to sequence the human genome faster and at a fraction of the cost of the official US government-financed Human Genome Project. Why can't there be independent-minded scientists like him? Why are so many people concerned about following the rules, protocols and approved methods?

One of the reasons some scientists still believe in God is that they claim its existence cannot be disproved by the scientific method or by mathematics. That's true, but it can be disproved by logic, as Richard Dawkins brilliantly did in The God Delusion, and as I have done too (e.g. here, although I didn't post much on this subject online). Very often, in complex problem solving, logic is more powerful than mathematics because logic is based on our neural networks' ability to process vast amounts of data from different sources that cannot be easily rendered into mathematical data. That's why binary computers, despite their tremendous calculating power and speed, which vastly outstrip the human brain for mathematics, cannot easily do things that humans do almost without thinking, like reading emotions on someone's face or recognising someone's handwriting. That's why artificial neural networks were developed. But they can't be programmed just with formulae. They need to learn by themselves, just like us. Neural networks gather all available evidence, analyse it, and use logic to process it and find a solution. Most of the scientific method (laws, formulae, algorithms) is binary and linear. It is perfect for things like physics, chemistry and technological applications such as mechanics and electronics. But it sucks at making useful predictions in social sciences, because human behaviour and society are based on neural networks (our brains). When it comes to analysing the history of populations, logic using transdisciplinary evidence is the way to go.
 
I was of course referring to population geneticists working on human populations, and specialising in haplogroups and other tools relating to human evolution and history - not all population geneticists.

Population geneticists do not study haplogroups per se. The Y-chromosome represents only two per cent of our DNA and, because only males have a Y-chromosome, it only represents half of the human population. mtDNA is a tiny molecule of just 16569 base pairs, and provides a very limited view of our ancestry because it only follows the all-female line. While Y-DNA and mtDNA are very useful for genetic genealogy and can be used very effectively within a genealogical timeframe they are less useful for exploring the history of populations. That is why research now focuses on autosomal DNA which is so much more informative.

There is no use for theoretical models in this field as there won't be any other application beyond human population history… There is only one human history and once we have reached a satisfying understanding of it, there won't be any use for predictive statistical models any more. Furthermore, a model that works in Bronze Age western Eurasia won't necessarily work in Neolithic Austronesia or Copper Age Mesoamerica because these populations live in completely different environments and have very diverging cultures and lifestyles. Once again, a scientific model based on mathematics cannot take these factors into consideration. It's just not the appropriate tool... Besides, it's not because a piece of evidence is based on a mathematical model that it is more accurate.

Computer modelling is a basic tool of population genetics that is used by all the best researchers in the field. A good model will take into account different factors. A mathematical model is not a piece of evidence. It provides an unbiased means of testing hypotheses and determining which hypothesis is the most plausible. What methodology do you propose to use for testing hypotheses if you don’t wish to use computer models?

One methodology I have used to trace back the migration of haplogroups, and that to my knowledge nobody else has used, is to look at isolated populations around the world and see which Y-DNA and mtDNA haplogroups they share.

It’s always encouraging to see people coming up with new ideas but any new methods have to be tested to see if they work. You have formulated a hypothesis that Y-DNA and mtDNA haplogroups travel in tandem but how have you tested this hypothesis?
 
@DebbieK

From a population history standpoint I have to disagree. We have seen Y-DNA patterns from ancient cultures, and much less on the female side. While it may not always be exactly black or white, there have been some patterns, e.g. R1a showing up in Corded Ware, H2 and G showing up in Anatolian-descended farmers, R1b in Yamnaya, etc.
 
From a population history standpoint I have to disagree. We have seen Y-DNA patterns from ancient cultures, and much less on the female side. While it may not always be exactly black or white, there have been some patterns, e.g. R1a showing up in Corded Ware, H2 and G showing up in Anatolian-descended farmers, R1b in Yamnaya, etc.

Humans are very good at seeing patterns and attributing meaning to those patterns even when the patterns have no significance. That's why we have the scientific method which is a way of testing hypotheses.
 
