23andMe Ancestry Comp Results of northern Italians and Tuscans

New Paper: 4cM Matches are usually wrong in unphased data (which 23andme uses)

An important paper just came out regarding false segments seen when using non-phased data and segments of 4cM or smaller.

I have analyzed the paper and attempted to translate its findings into plain English below.

BOTTOM LINE:

Unphased data, which 23andme and FamilyTreeDNA (everyone, really) use, is more often inaccurate for smaller segments. Specifically:

4cM segment matches with strangers are only real (IBD) 33% of the time. 67% of the time they are false-positives (pseudo-segments) caused by using unphased data.
6cM segment matches with strangers are usually real.

So DNA Relatives matches, which use a 7cM threshold (as well as a threshold for the number of SNPs in the segment), are very likely real IBD, and those tiny 2cM, 3cM, 4cM segments that people are seeing in Gedmatch etc. are likely (67% of the time) *not* real (IBD).

ABSTRACT URL:
http://arxiv.org/abs/1311.1120

FULL TEXT URL:
http://arxiv.org/pdf/1311.1120v1

PAPER TITLE: "Reducing pervasive false positive identical-by-descent segments detected by large-scale pedigree analysis"

MY TRANSLATION INTO ENGLISH FROM TECHNO-SPEAK:

Study looked at segments on Chromosome 21.

A total of 25,432 individuals of European ancestry were used in the study.

Of this total, they used 2,952 Mother/Father/Child "Trios" in order to be able to phase the child's results. The rest of the people in the study were (presumed) unrelated people (what I will call "strangers") used to identify possible segment matches with the child for further analysis.

In the 2,952 Mother/Father/Child Trios they found 13,307,562 2cM-4cM segments that the child matched with one or more of the "strangers".

They then analyzed how many of these segments were also shared by the child's parents (meaning that they are real IBD (Identical By Descent) segments) and found the following:

14% of the segments that the child matched with the stranger were found in a parent. (Could be identified as real IBD segments)
25% of these segments were partially found in a parent (a shorter "truncated" segment was found in a parent)
61% of these segments were not found at all in the parent.

They checked all of the segments not completely found in the parent to see if testing errors (mis-calls, etc.) or other "false positives" could account for it. They found that testing errors could only account for 3% of the differences (97% were real differences in the segments).

They decided to allow that some of the 25% of segments seen only partially (truncated) in the parent were probably real, and a result of difficulty in determining start and end points for segments (possibly due to using SNP data and to microdeletions, insertions, etc.?). Specifically, they said that 80% of these were likely real, so 20% were likely not real. (20% of 25%) + 61% = 66%, which they rounded up to 67%.
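For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The percentages are the rounded figures from the summary above; the variable and function names are my own and do not come from the paper.

```python
# Shares of 2cM-4cM child/stranger segments, as given in the summary above.
FULLY_IN_PARENT = 0.14      # whole segment also found in a parent
TRUNCATED_IN_PARENT = 0.25  # only a shorter piece found in a parent
ABSENT_FROM_PARENT = 0.61   # not found in either parent

# Share of truncated segments the summary says are likely real.
TRUNCATED_REAL_SHARE = 0.80


def false_positive_rate(truncated, absent, truncated_real_share):
    """Fraction of segments treated as false positives (hypothetical helper)."""
    return truncated * (1 - truncated_real_share) + absent


rate = false_positive_rate(
    TRUNCATED_IN_PARENT, ABSENT_FROM_PARENT, TRUNCATED_REAL_SHARE
)
print(f"likely false: {rate:.0%}")
print(f"likely real:  {1 - rate:.0%}")
```

With these rounded inputs the result is 66% false and 34% real. The 67%/33% split quoted at the top is the figure after the rounding described above.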

They put the reason for the likely false segments between parents and children as being due to testing services using unphased data, such that the child has a segment which is a combination of segments from both parents and, in the child, appears that it *could* be a real segment but, in reality, is not. (What I call a "pseudo-segment.")
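To make the pseudo-segment idea concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the eight SNPs, the A/B alleles, and the people. The unphased rule used here (two people "match" at a SNP if they share at least one allele) is a simplified stand-in for how half-identical matching works, not the paper's or any company's actual algorithm.

```python
# Each person is a pair of haplotypes (one from each parent),
# written as a string of alleles over the same eight toy SNPs.
child = ("AAAABBBB", "BBBBAAAA")     # maternal copy, paternal copy
stranger = ("AAAAAAAA", "AAAAAAAA")  # hypothetical unrelated person


def genotypes(person):
    """Unphased view: the two alleles at each SNP, with no parental origin."""
    hap1, hap2 = person
    return [a + b for a, b in zip(hap1, hap2)]


def unphased_match(person1, person2):
    """True if the two people share at least one allele at every SNP."""
    return all(
        set(g1) & set(g2)
        for g1, g2 in zip(genotypes(person1), genotypes(person2))
    )


def phased_match(person1, person2):
    """True if one whole haplotype of person1 equals one of person2."""
    return any(h1 == h2 for h1 in person1 for h2 in person2)


print("unphased match:", unphased_match(child, stranger))
print("phased match:  ", phased_match(child, stranger))
```

The unphased check reports a match because the child carries an A at every SNP, but those A's come from the maternal copy for the first four SNPs and the paternal copy for the last four. Neither parental copy on its own matches the stranger, which is what the phased check shows.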

Other Data Found:

Of all segments found between the child and a stranger, over 98% were shorter than 4cM. So fewer than 2% of all matches between the child and a stranger were 4cM or more.

Most segments longer than 6cM are real segments, but this correspondence "drops rapidly as segment length is reduced."

..............................
I will have to re-evaluate my matches now (the above was not written by me)
 
There are numerous threads on the 23andme site where knowledgeable posters made these points years ago. People who aren't long-time members would benefit from doing site searches.

I personally haven't looked at any matches below 5cM for years now. It's just a waste of time. Actually, I don't usually bother with anything below the 23andme threshold. At that level it seems to be pretty accurate.
 
Your results are quite shocking. It appears you have a bit of Middle Eastern/North African, East Asian/Native American, and Ashkenazi Jewish ancestry. I wonder how that could be? (especially the East Asian/Native American ancestry.)

0.1% and 0.5% are irrelevant results.
 
