Tip: You can now access this page by typing haplogroups.eu directly into your browser.
If you are new to population genetics...
This page aims to provide detailed descriptions of each haplogroup and its history. If you are unfamiliar with haplogroups or population genetics, we recommend that you first familiarise yourself with the basics by viewing the Video Tutorials about genetics and reading our Frequently Asked Questions about DNA tests. Each haplogroup corresponds to a distinct ancestral lineage. Haplogroups are divided into numerous levels of subclades that form a phylogenetic tree, which is just a fancy term for a genealogical tree of genetic ancestry. You may also find it useful to visualise the modern geographic distribution of Y-DNA haplogroups to get a sense of what they represent.
The information about the origin and ethnic association of haplogroups on this website should not be read as hard facts, but, as is often the case in science, as a model in constant evolution based on the present knowledge and understanding (of the author). Whenever the advancement of genetics couldn't provide irrefutable answers, we have attempted to provide the most likely and logical hypothesis based on archeological, historical and linguistic evidence. This page is being updated regularly to keep up with recent studies giving additional insights or rectifying possibly erroneous theories. Feel free to add comments or share your opinion on the forum.
Nucleobases are the alphabet of DNA. There are four of them: adenine (A), thymine (T), guanine (G) and cytosine (C). They always pair up, A with T and G with C. Such pairs are called "base pairs".
The human genome comprises about 3,000 million base pairs per set of 23 chromosomes, so a full complement of 46 chromosomes holds roughly twice that number.
The Y chromosome possesses 60 million nucleobases, against 153 million for the X chromosome.
Mitochondrial DNA is found outside the cell's nucleus, and therefore outside of the chromosomes. It consists of only 16,569 bases.
A SNP (single nucleotide polymorphism) is a mutation in a single base pair. At present, only a few hundred SNPs define all the human haplogroups for mtDNA or Y-DNA.
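As a minimal illustration of the pairing rule and of what a SNP is, the sketch below builds the complementary strand of a short sequence and lists the positions where two aligned sequences differ by a single base. The sequences and function names are invented for the example; real SNP calling works on aligned sequencing reads and is far more involved.

```python
# Base pairing: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(PAIRS[base] for base in strand)


def snp_positions(reference: str, sample: str) -> list[int]:
    """Return 0-based positions where two equal-length sequences differ."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (r, s) in enumerate(zip(reference, sample)) if r != s]


# Hypothetical 12-base fragments, invented for illustration only.
reference = "ATGGCATTCGAA"
sample = "ATGGCGTTCGAA"

print(complement(reference))             # TACCGTAAGCTT
print(snp_positions(reference, sample))  # [5]
```

Here the two fragments differ only at position 5 (A in the reference, G in the sample), which is the kind of single-base difference a SNP test looks for.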
DNA studies have made it possible to categorise all humans on Earth into genealogical groups sharing one common ancestor at a given point in prehistory. These groups are called haplogroups. There are two kinds of haplogroups: the paternally inherited Y-chromosome DNA (Y-DNA) haplogroups, and the maternally inherited mitochondrial DNA (mtDNA) haplogroups. They indicate the agnatic (patrilineal) and enatic (matrilineal) ancestry respectively.
Y-DNA haplogroups are useful to determine whether two apparently unrelated individuals sharing the same surname do indeed descend from a common ancestor in the not too distant past (3 to 20 generations). This is achieved by comparing their haplotypes through the STR markers. Deep SNP testing allows one to go back much farther in time, and to identify the ancient ethnic group to which one's ancestors belonged (e.g. Celtic, Germanic, Slavic, Greco-Roman, Basque, Iberian, Phoenician, Jewish, etc.).
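To illustrate how two haplotypes can be compared through their STR markers, here is a rough sketch that adds up the repeat-count differences between two men. The marker names are real STR loci, but the repeat values are invented, and summing absolute differences is only one simple distance measure chosen for this example, not necessarily the method used by any testing company.

```python
def genetic_distance(haplotype_a: dict[str, int], haplotype_b: dict[str, int]) -> int:
    """Sum of absolute repeat-count differences over the markers both men share."""
    shared = haplotype_a.keys() & haplotype_b.keys()
    return sum(abs(haplotype_a[m] - haplotype_b[m]) for m in shared)


# Invented repeat counts for two hypothetical men with the same surname.
man_1 = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11, "DYS385a": 11}
man_2 = {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 10, "DYS385a": 11}

print(genetic_distance(man_1, man_2))  # 2
```

A small distance over many markers suggests a recent common ancestor. How small is small enough depends on the number of markers tested and on their mutation rates, which this sketch does not model.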
In Europe, mtDNA haplogroups are quite evenly spread over the continent, and therefore cannot be associated easily with ancient ethnicities. However, they can sometimes reveal some potential medical conditions (see diseases associated with mtDNA mutations). Some mtDNA subclades are associated with Jewish ancestry, notably K1a1b1a, K1a9, K2a2a and N1b.
The study of Y-chromosomes is far more interesting than that of mitochondrial DNA for two reasons.
Firstly, the Y chromosome is a sequence of 60 million "characters" (nucleobases), against only 16,569 for mtDNA. The Y chromosome therefore offers a much greater resolution as mutations are more common, and indeed happen every generation. In contrast, mtDNA mutations happen much more infrequently. Since the time of the Mitochondrial Eve, approximately 200,000 years ago, modern humans have acquired on average 20 mtDNA mutations in each lineage - about one every ten thousand years. Even though the number of mutations has accelerated with the soaring of the human population over the last 10,000 years, the dating of lineages based on mtDNA alone remains very approximate, and practically useless for historical times. By sequencing the full Y chromosome, it is theoretically possible to map the entire patrilineal genealogy of humanity (or any other species) within a few generations (or even within one generation). This is a colossal task, and an expensive one too, since full chromosome sequencing (reading every nucleobase one by one) remains very expensive compared to SNP genotyping (checking only for mutations already discovered in other individuals). The arrival of full Y-chromosome sequencing (or even whole genome sequencing) on the market has made it possible to achieve an optimal resolution, but the price of these tests remains well above that of standard commercial tests. This restricts their overall reach, and the most common haplogroups in rich countries are at present much better studied than the others.
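The arithmetic behind the "one mutation every ten thousand years" figure can be checked directly from the two numbers quoted above. The snippet is only a back-of-the-envelope calculation, not a molecular clock model, and the generation length is an assumed value that does not come from the text.

```python
years_since_mitochondrial_eve = 200_000  # approximate figure quoted in the text
mutations_per_lineage = 20               # average figure quoted in the text

years_per_mutation = years_since_mitochondrial_eve / mutations_per_lineage
print(years_per_mutation)  # 10000.0

# Assuming roughly 25 years per generation (an assumption, not from the text),
# that is about one mtDNA mutation every 400 generations.
generation_length = 25
generations_per_mutation = years_per_mutation / generation_length
print(generations_per_mutation)  # 400.0
```

Compared with Y-chromosome mutations, which the text describes as happening every generation, this shows why mtDNA alone gives such a coarse time resolution.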
The second advantage of Y-DNA over mtDNA is that men have traditionally been less mobile than women (except during military invasions, like those of the Indo-Europeans, the Vikings or the Arabs). In almost every settled, agricultural society, men are the ones who inherit their parents' property, and therefore remain in the same location generation after generation. Women, on the other hand, were often sent away to marry in another village or town, so that their lineages spread more evenly over time, thus progressively erasing the traces of ancient settlement patterns.
Paternal and maternal haplogroups in prehistoric Europe
Following the end of the last Ice Age approximately 12,000 years ago, European hunter-gatherers recolonised the continent from the Ice Age refugia in southern Europe. The vast majority of Mesolithic Europeans would have belonged to Y-haplogroup I. This included I* (the * means that no further subclade was identified), pre-I1, I1, I2*, I2a*, I2a2, I2c, but the most widespread appears to have been I2a1, which was found in most parts of Europe. Northeast Europeans would have belonged mostly to haplogroup R1a, and to a lesser extent also I2a2 and R1b. Other minor male lineages were certainly also present in parts of Europe, notably haplogroups A1a, C-V20, and possibly even Q1a.
The maternal lineages of Mesolithic Europeans appear to have been predominantly U4 and U5, but also included several H subclades (H1, H3, H17), T, U2 (U2d and U2e) and V. The presence of mt-haplogroups I and W in Eastern Europe or the North Caucasus is possible but hasn't been confirmed yet.
Based on their modern distributions, mtDNA haplogroups H10 and H11 might well have Mesolithic/Palaeolithic European origins.
There seem to have been several Palaeolithic and/or Mesolithic migrations from Northwest Africa to Iberia. The oldest might have brought West African paternal haplogroup A1a to Western and Northern Europe during the Palaeolithic. A1a has been found in modern populations as far north as Ireland, Scotland, Scandinavia and Finland. The presence of African maternal lineages (L2, L3 and possibly L1b1) has been attested in Neolithic Iberia. Northwest Africans would also have brought U6 and possibly HV0/V lineages to Europe.
A small percentage of sub-Saharan African admixture has been identified in Late Mesolithic Swedes from the Pitted Ware culture (2800-2000 BCE), which would imply that A1a was already present in northern Europe at the time. Another Mesolithic sample from Loschbour in Luxembourg had dark hair and considerably darker skin than modern Europeans.
Distribution map of Y-DNA and mtDNA haplogroup in and around Europe circa 8000 BCE
Neolithic and Chalcolithic Europe
Agriculture first developed in the Levant, then spread to Anatolia, Greece, the Balkans, Italy, Central and Eastern Europe. These Neolithic farmers were confirmed to have belonged primarily to Y-DNA haplogroup G2a, but also included minorities of C1a2, E1b1b, H2 (formerly F3), J1, J2 and T1a lineages, which could have been assimilated in Anatolia before entering Europe. As they advanced across Europe, Neolithic farmers also increasingly assimilated European lineages, notably I2a1 in Southeast Europe, I1 and I2a1 in Central Europe, I2a1 and I2a2a in Western Europe, and E-M78, I2a1 and I2a2a in Southwest Europe.
Hundreds of Neolithic samples from all over Europe (but especially Central Europe and Iberia) have been tested. The new lineages brought by these Near Eastern immigrants included mt-haplogroups HV, J1, J2, K1, K2, N*, N1, T1a, T2b, T2c, T2e, T2f, U3, W, X1, X2, and many subclades of H (including H2, H5, H7, H13 and H20). H4, H8 and H9 seem to have originated in the Near East as well, although no Neolithic sample has been identified in Europe yet.
However, due to the proximity of the Caucasus to the Indo-European homeland, many of these mt-haplogroups were almost certainly also transported by the Indo-Europeans themselves. This would notably be the case of H5, K1a, T2b, U3, W and X2.
The Bronze Age and the Indo-European migrations
The origin of the Indo-European peoples is a subject that has caused much ink to flow among archaeologists and historians. Their Urheimat (original homeland) has been speculated to lie in Anatolia, around the Caucasus, in Iran, in India, in Central Asia, in Russia, or even in Scandinavia. Thanks to paleogenetics we now know that these people expanded during the Late Copper and Early Bronze Age from the Pontic Steppe, north of the Black Sea and the Caucasus. There seem to have been two distinct, though closely related, groups of tribes speaking the Proto-Indo-European language, from which almost all European languages today descend (apart from Basque, Hungarian, Estonian, Finnish and Sami), as well as Armenian, Kurdish, Persian and most North Indian languages. Tribes belonging mainly to the paternal haplogroup R1a reportedly occupied the north of the steppe (forest-steppe and tundra), while in the south (open steppe) were nomadic cow herders belonging mainly to haplogroup R1b.
Their migration both westward to Europe and eastward to Central and South Asia makes it easy to infer which mtDNA haplogroups they carried (=> see also Identifying the original Indo-European mtDNA from isolated settlements). The best matches for R1a are C4a, H1b, H1c, H2a1, H6, H11, K1b1b, K1c, K2b, T1a1a1, T2a1b1, T2b2, T2b4, U2e, U4, U5a1a, W, and several I subclades.
The R1b branch would have originated in eastern Anatolia and/or northern Mesopotamia/Syria during the Early Neolithic period, where its members probably domesticated cattle and became primarily cattle herders. They would then have migrated to the western part of the Iranian plateau and crossed the Caucasus to the Pontic Steppe in search of pasture for their cattle, where they mixed to some extent with the I2a2 and R1a tribes that inhabited those lands. The maternal lineages of these Near Eastern R1b people would have included haplogroups H5a, H6, H8, H15, I1a1, J1b1a, K1a3, K2a6, U5, and some V subclades (like V15).
MtDNA haplogroup H4 has not been found in Europe before the Late Chalcolithic (Corded Ware culture) and the Early Bronze Age (Unetice culture) and might have been brought by the Indo-Europeans. Likewise, H6 is absent from all Mesolithic or Neolithic samples, and its strong presence in the North Caucasus and Central Asia supports an Indo-European connection.
Chronological development of Y-DNA haplogroups
C => 66,000 years ago (in East Africa)
E => 62,500 years ago (in Africa)
G => 48,000 years ago (in the Middle East)
K => 46,000 years ago (between the Caucasus and India)
I => 43,000 years ago (around the Black Sea)
J => 43,000 years ago (in the Middle East or the Caucasus)
T => 42,000 years ago (around the Iranian Plateau)
C1a2 => 41,500 years ago (in the Middle East)
E1b1b => 35,000 years ago (in Northeast Africa)
Q & R => 32,000 years ago (in Central Asia or Siberia)
J1 => 31,000 years ago (in the Caucasus or Zagros mountains)
J2 => 31,000 years ago (in northern Mesopotamia or the Caucasus)
I1 & I2 => 27,500 years ago (in Europe)
T1a => 27,000 years ago (around the Iranian Plateau)
R1b => 23,000 years ago (around the Caspian Sea or in Russia)
R1a => 23,000 years ago (in Russia)
E-M78 => 20,000 years ago (in north-eastern Africa)
G2a => 20,000 years ago (in the Middle East)
I2a1a (M26) & I2a1b (M423) => 18,500 years ago (in southern Europe)
J2a1 => 18,500 years ago (in northern Mesopotamia or in the Caucasus)
E-M123 => 18,000 years ago (around the Red Sea or in the Levant)
I2a2a (M223) => 17,500 years ago (in southern Europe)
J2b1 & J2b2 => 16,000 years ago (around the Iranian Plateau or the Caucasus)
N1c1 => 15,500 years ago (in northern China)
E-M81 => 14,000 years ago (in North Africa)
R1b-M269 => 13,500 years ago (around the Caspian Sea)
I2a2b (L38) => 12,500 years ago (in central Europe)
J1-P58 => 11,500 years ago (in the Middle East or the Caucasus)
I2a2a-L801 => 9,500 years ago (in central or northern Europe)
R1a1a1 (M417) => 8,500 years ago (in Northeast Europe)
T1a-CTS2214 => 8,500 years ago (in the Middle East)
E-V13 => 7,500 years ago (in Central or Southeast Europe)
N1c1-L1026 => 6,500 years ago (in Northeast Europe)
Q1b1a-L245 => 6,500 years ago (in central Asia or in the Middle East)
R1b-L23 => 6,500 years ago (around the Caucasus)
J1-L858 => 5,500 years ago (in the Middle East)
R1b-U106 & R1b-P312 => 5,000 years ago (in Central Europe)
I1a (DF29) => 4,500 years ago (in Scandinavia)
Q1a2-Y4827 => 3,000 years ago (in Scandinavia)
Map of early Bronze Age cultures in Europe around 4,500 to 5,000 years ago
Haplogroup C is an extremely old lineage thought to have appeared before or soon after the first migration of Homo sapiens outside Africa, some 70,000 years ago. Men belonging to haplogroup C would have departed from East Africa during the Ice Age and followed the coasts of the Indian Ocean, settling in the Arabian peninsula, the Indian subcontinent, south-east Asia, north-east Asia and Oceania.
The first group to split away was C-Z1426, which colonised the Middle East and South Asia. One branch (CTS11043) might have moved north to Central Asia, then split into two: one tribe moving west to Europe (haplogroup C-V20) while the other migrated to East Asia and survives only in Japan today (haplogroup C-M8). Haplogroup C-V20 probably represents the first migration of Homo sapiens to Europe 45,000 years ago, and would therefore have been the first to come into contact with European Neanderthals, although Homo sapiens are likely to have interbred with Neanderthals in the Middle East before that.
The second branch of C-Z1426 spread around South Asia, Southwest Asia, and Central Asia, where it is found at low frequencies nowadays (haplogroup C-M356).
During that time, other C tribes continued their eastward migration to south-east Asia, where they split into four main regional clusters. The first branch colonised Indonesia, Melanesia, Micronesia, and Polynesia (haplogroup C2-M38). A second branch would have gone south to Australia, where they became the Aborigines (haplogroup C4-M347). Another settled in the highlands of New Guinea (haplogroup C-P55). The fourth branch went all the way up to north-east Asia (haplogroup C3-M217) and is found nowadays chiefly among the Mongols and tribes descended from them (Kalmyks, Hazaras), Turkic peoples (Kazakhs, Kyrgyz, Uyghurs, Uzbeks, Tuvans, Yakuts), East Siberian tribes (Buryats, Chukchi, Itelmens, Nivkh, Tungusic peoples), Chinese (Han, Hui, Manchus, Oroqens, Tujia), Koreans and Japanese (especially the Ainus), but also among several indigenous peoples of North America, including some Na-Dené-, Algonquian-, or Siouan-speaking populations.
Haplogroup C is a very rare lineage in Europe. The few Europeans who belong to C belong either to the European C-V20, the Middle Eastern C-M358, or the Mongolian C3-M217. Haplogroup C3 has also been identified in one Hunnic skeleton from the Iron Age in present-day Mongolia. Its presence in Europe can therefore be linked to the Hunnic and Mongolian invasions, like that of haplogroup Q1a.
Haplogroup L (Y-DNA)
Haplogroup L is found mostly in West Asia and South Asia. Its overall frequency ranges between 5 and 15% in Pakistan and western India, with a peak of 23% among the Kalash of northwest Pakistan, and from 1 to 10% in central Asia (mostly in Uzbekistan, Tajikistan and Afghanistan). It is also found in the Middle East (5% in Lebanon, 4.5% in Turkish Kurdistan, 4% in Iran, 3% in Syria), in parts of the Caucasus (7% in Azerbaijan and Chechnya, 3% in Armenia and Ingushetia), and in isolated parts of Europe (3.5% in north-east Italy, from 0.2% to 1% in the Balkans and Greece, 0.5% in Flanders).
Haplogroup L is divided into four main subclades:
L1a (M27) is mostly found in India and Sri Lanka, with frequencies decreasing towards Pakistan, southern Iran and the Arabian peninsula. It has also been found in Piedmont (Italy), Rhineland (Germany) and Flanders (Belgium).
L1b (M317) is found chiefly in the South Caucasus, eastern Anatolia and Lebanon. It has also been found in South Tyrol, Russia and Central Asia. Its main subclade L1b1 (M349) has been found in Italy, Switzerland, Austria, Germany, Belgium, England, northern Ireland, and scattered around most of central and eastern Europe and the eastern Mediterranean. The presence of L1b and L1b1 in Europe probably dates back to the Neolithic period.
L1c (M357) is an essentially Gedrosian subclade, found among the Burushos, Kalashs (L1c1-PK3 subclade), and Pashtuns of Pakistan and Afghanistan, but also among the Chechens in the north-east Caucasus. It is also found at low frequencies in other populations of Pakistan, in India, northern Iran, Georgia and Ingushetia. In Europe it has been found in Sicily.
At present L2 (L595) has been found exclusively in Europe (Greece, Italy, southern Germany, Russia) and in the South Caucasus.
Haplogroup H (Y-DNA)
Haplogroup H is typically found among Dravidian populations in the Indian subcontinent, especially in South India and Sri Lanka. In Europe it is found almost exclusively among the Gypsies (Romani), who belong predominantly (between 15% and 50%) to the H1a (M82) subclade of Indian origin. The highest frequencies of haplogroup H among non-Romani Europeans are found in regions with large Romani populations, such as Romania, Slovakia, the southern Balkans, and Andalusia, suggesting that these lineages are also of Romani origin.
Haplogroup H2 (P96, known as F3 until 2013) was a minor lineage of early Neolithic farmers in the Levant, Anatolia and Europe. It is still found at very low frequencies in western Europe, Armenia, Iran and India.
Haplogroup A (Y-DNA)
A is the oldest of all Y-DNA haplogroups. It originated in sub-Saharan Africa over 140,000 years ago, and possibly as much as 340,000 years ago if we include haplogroup A00. Modern populations with the highest percentages of haplogroup A are the Khoisan (such as the Bushmen) and the southern Sudanese.
There are only rare and isolated cases of European men belonging to haplogroup A. Commercial tests have identified a few Scottish and Irish families (surnames Boyd, Logan and Taylor) all belonging to the same A1b1b2 (M13) subclade. This subclade is normally found in East Africa (Ethiopia, Sudan), but has also been found in Egypt, the Arabian peninsula, Palestine, Jordan, Turkey, Sicily, Sardinia and Algeria. It was certainly brought to Europe by Levantine people, be it during the Neolithic or later (Phoenicians, Jews, immigration within the Roman Empire).
Haplogroup A1a* (M31) has been found in Finland, Norway and eastern England. This subclade is normally found along the west coast of Africa (Guinea-Bissau, Cape Verde, Mali, Morocco) and could have come to Europe during the Paleolithic. Indeed a few percent of sub-Saharan admixture was found among ancient DNA samples from Mesolithic Scandinavia tested by Skoglund et al. (2012).
All mtDNA haplogroups found in Europe descend from the N group, which is thought to represent one of the two initial migrations by modern humans out of Africa, some 60,000 to 80,000 years ago. Nowadays haplogroup N is only found at extremely low frequencies in various parts of Eurasia.
Unfortunately, the tiny size of mitochondrial DNA (approximately 16,500 base pairs as opposed to 60 million for Y-DNA) does not allow a very accurate tracing of ancestry. Basal mitochondrial haplogroups all arose during the Ice Age, a period when humans were nomadic hunter-gatherers, well before the establishment of cities and civilizations. Even deep subclades generally point to a common Neolithic or Bronze Age ancestry, but rarely later than that, and do not necessarily match any recognisable historical ethnic and linguistic groups. One likely reason is that women, through whom mtDNA is passed, tended to marry outside their ethnic group more often than men (e.g. to secure an alliance between two tribes or kingdoms). Haplogroups associated with European or Middle Eastern descent are H, I, J, K, T, U, V, W and X (except the branch X2a, which is found among Native Americans).
Chronological development of mtDNA haplogroups
Note that the age of mitochondrial haplogroups is much more difficult to estimate than that of Y-DNA haplogroups, due to the short mtDNA sequence and the small number of mutations available. The error margin for the dates below is typically ±5,000 years, but could even exceed that for older haplogroups.
N => 75,000 years ago (arose in North-East Africa)
R => 70,000 years ago (in South-West Asia)
U => 60,000 years ago (in North-East Africa or South-West Asia)
pre-JT => 55,000 years ago (in the Middle East)
JT => 50,000 years ago (in the Middle East)
U5 => 50,000 years ago (in Western Asia)
U6 => 50,000 years ago (in North Africa)
U8 => 50,000 years ago (in Western Asia)
pre-HV => 50,000 years ago (in the Near East)
J => 45,000 years ago (in the Near East or Caucasus)
HV => 40,000 years ago (in the Near East)
H => over 35,000 years ago (in the Near East or Southern Europe)
X => over 30,000 years ago (in north-east Europe)
U5a1 => 30,000 years ago (in Europe)
I => 30,000 years ago (Caucasus or north-east Europe)
J1a => 27,000 years ago (in the Near East)
W => 25,000 years ago (in north-east Europe or north-west Asia)
U4 => 25,000 years ago (in Central Asia)
J1b => 23,000 years ago (in the Near East)
T => 17,000 years ago (in Mesopotamia)
K => 16,000 years ago (in the Near East)
V => 15,000 years ago (arose in Iberia and moved to Scandinavia)
H1b => 13,000 years ago (in Europe)
K1 => 12,000 years ago (in the Near East)
H3 => 10,000 years ago (in Western Europe)
Mitochondrial DNA of prehistoric Europeans
The testing of ancient DNA helps us understand how long each haplogroup has been in Europe. Dozens of samples from the Paleolithic and Mesolithic, and hundreds from the Neolithic, Chalcolithic and Bronze Age have already been tested. You can check this non-exhaustive list of Prehistoric European mtDNA by period and culture.
European mtDNA haplogroups and their subclades
Haplogroups H & V (mtDNA)
Haplogroup H is by far the most common all over Europe, amounting to about 40% of the European population. It is also found (though in lower frequencies) in North Africa, the Middle East, Central Asia, Northern Asia, as well as along the East coast of Africa as far as Madagascar.
H1, H3 and V are the most common subclades of HV in Western Europe. H1 peaks in Norway (30% of the population) and Iberia (18 to 25%), and is also high among the Sardinians, Finns and Estonians (16%), as well as Western and Central Europeans in general (10 to 12%) and North-West Africans (10 to 20%). H3 is commonest in Portugal (12%), Sardinia (11%), Galicia (10%), the Basque country (10%), Ireland (6%), Norway (6%), Hungary (6%) and southwestern France (5%). Haplogroup V reaches its highest frequency in northern Scandinavia (40% of the Sami), northern Spain, the Netherlands (8%), Sardinia, the Croatian islands and the Maghreb. It is likely that H1, H3 and V, along with haplogroup U5, were the main haplogroups of Western European hunter-gatherers living in the Franco-Cantabrian refuge during the last Ice Age, and repopulated much of Central and Northern Europe from 15,000 years ago.
Haplogroup H13 is most common in Sardinia and around the Caucasus. Its distribution is reminiscent of Y-DNA haplogroup G2a. The same is true of H2 to a lesser extent. This would suggest a Caucasian or Anatolian origin.
H5 and H7 are also common in the Caucasus, but their lower incidence around the Mediterranean, and higher frequency from Anatolia to the Alps via the Danube suggest a possible link with the spread of agriculture (Y-DNA G2a, etc.) or of the Indo-Europeans (R1b-L23).
Haplogroup U is extremely old. It originated some 60,000 years ago on the border between North-East Africa and the Middle East, soon after the first Homo sapiens ventured out of Africa. This is why each of its top-level subclades (U1, U2, U3...) can be seen as a haplogroup in its own right. The main European subclades are U3, U4, U5 and U8/K. U1 is mostly found in the Middle East, U6 in North Africa, U7 from the Near East to India, and the rare U9 from Ethiopia and the Arabian peninsula to Pakistan.
Haplogroup U2 is found primarily in South Asia, but probably is of Indo-European origin as it is found at low frequencies throughout the Pontic-Caspian steppe and has been identified in a 30,000 year-old Cro-Magnon from the middle Don valley in Russia. It might have been the dominant haplogroup of the northern forest-steppe foragers who later became the Proto-Indo-Iranian speakers (see R1a above) and moved massively to Central and South Asia.
Haplogroup U4 is strongly associated with Y-haplogroup R1a. It is found in most of Europe, especially in Balto-Slavic countries, and also in Siberia, Central Asia, Afghanistan and northern Pakistan. U4 was already present in many parts of Europe (Russia, Sweden, Germany, Portugal) during the Mesolithic period, but seems to have almost disappeared from central Europe during the Neolithic, before being re-introduced by the Proto-Indo-European speakers from Russia and Ukraine during the Bronze Age.
Haplogroup U5 is the most common in Western and Northern Europe. DNA tests on ancient skeletons have shown that U5 was the principal mitochondrial haplogroup of Paleolithic and Mesolithic hunter-gatherers in Northern Europe. Ancient DNA tests conducted in Britain, Germany and Scandinavia indicate that the frequency of U5 has progressively declined over time through the Neolithic, Bronze Age, Iron Age and Middle Ages. Nowadays it remains most common in the far north of Europe, where the Mesolithic population has been least affected by subsequent migrations. For instance, 30 to 50% of the Sami people of northern Scandinavia belong to haplogroup U5b (and about 40% to haplogroup V, which is also of pre-Neolithic European origin).
Haplogroup K is the main subclade of U8. It is found throughout Europe and Western Asia, as far away as India. Its highest concentration is in North-West and Central Europe, Anatolia and the southern Arabian peninsula. It is believed to have first arisen somewhere between Egypt and Anatolia approximately 16,000 years ago (estimates range from 22,000 years to as little as 10,000 years before present). It has the largest number of subclades of any haplogroup in spite of its fairly recent age. K1a is the largest subclade. The relatively important presence of K1a in the Near East suggests that it predates the Neolithic migration to Europe. This has been supported by the ancient mtDNA from Neolithic sites. Haplogroup K was never found in Europe prior to the Neolithic, then suddenly appears at a frequency (17%) much higher than in modern Europeans and similar to that of the present-day Levant. Most of the Neolithic K belongs to the K1a subclade.
Most K1a4, K1a10, K1b, K1c and K2 subclades are typically European. K1a4 is also common in Anatolia and Greece, and could indeed have spread to the rest of Europe from there during the Neolithic period, along with haplogroups J and T (and Y-DNA haplogroups E1b1b, J2 and T). The Indo-Europeans from Anatolia could also have contributed to the propagation of K. K1a1b1a and K1a9 are found primarily among Ashkenazi Jews.
Haplogroup J originated in the Middle East 45,000 years ago, making it one of the oldest mitochondrial haplogroups in Europe and the Middle East. Haplogroups J1c and J2a1 might have been present in Southeast Europe since the Epipaleolithic, then were probably diffused by Neolithic farmers across the rest of Europe. J2b1a, a mostly Near Eastern subclade, has been found in Neolithic samples in Europe alongside J1c.
Haplogroup J1b is found across the Near East, particularly between the Caucasus, Iran and Arabia. J1b1a is the only J1b subclade typically found among Europeans. It is present all over Europe as well as around the Caucasus, in Central Asia and the Altai, and was almost certainly spread by the R1b branch of the Indo-Europeans.
J1d, J2a2, J2b2 are essentially confined to the Middle East (+ North Africa for J2a2).
Haplogroup T is thought to have originated in the Middle East about 30,000 years ago. It is found throughout Europe, the northern half of Africa through the Near East to Central Asia and Siberia, with pockets in India and North-West China (Xinjiang). Some T1 and T2 subclades are thought to have entered Europe during the Late Glacial and the immediate postglacial periods, but to have been dispersed around Europe mostly by later population movements, first with agriculturalists during the Neolithic, then with the Proto-Indo-European speakers from the Pontic Steppe during the Bronze Age.
Present at low frequencies in most of Europe, in Anatolia, around the Caspian Sea, and from the Indo-Pakistani border to Xinjiang, haplogroup W is one of the best maternal markers of Indo-European ancestry (mtDNA equivalent of R1a and R1b). Its highest frequency is in Ukraine, European Russia, Baltic countries and Finland (3 to 5% overall), as well as in northern Pakistan (15%), Punjab (9%) and Gujarat (12%). In India, it is considerably more common among the upper castes and among Indo-European speakers.
Like haplogroup W, haplogroup I is found at low frequency over most of Europe, especially in northern and eastern Europe, and across Central Asia as far as Pakistan and North-West India, with a characteristic presence in the North Caucasus. Haplogroup I first appears in Europe with the arrival of Proto-Indo-European cultures, notably the Unetice culture associated with Y-haplogroup R1b. The absence of haplogroup I from Paleolithic, Mesolithic and Neolithic sites, and from modern non-Indo-European speaking populations such as the Saami, the Basques and the Maghrebians, argues in favour of an Indo-European origin.
Haplogroup X is a very old and scattered haplogroup found all over Eurasia and North Africa, as well as among Native North Americans. Its frequency rarely exceeds 5% of the population in any ethnic group, and is more often restricted to 1 or 2%. X1 is found almost exclusively in North Africa, while X2a is the only lineage present among Amerindians. X2d, X2e, X2n and X4 are found in Europe and Central Asia, and could therefore have been spread at least partially by the Proto-Indo-Europeans.
The strong presence of X2 around the Caucasus, progressively fading towards the Near East and Mediterranean, hints that it could be related to the spread of Y-DNA haplogroup G2a. Since R1b1b and G2a both have origins around the Caucasus, it is unsurprising to find X2 alongside these two Y-DNA haplogroups.
Haplogroup R is the main subclade of N, the one that was to generate the 6 most common European haplogroups (H, V, J, T, U, K). At the time of writing, R subclades were numbered from R0 (a.k.a. pre-HV) to R31. Most of them are found in South Asia (R5, R6, R7, R8, R30, R31), Southeast Asia (R9, R21, R22, R24), East Asia (R9/F, R11/B), and even among Papuans (R14) and Australian Aborigines (R12). R0a peaks in the southern Arabian peninsula and is common among Arabs and Middle-Easterners. R1a (not to be confused with the homonymous Y-chromosome haplogroup) is found among the Adygei people from the North Caucasus (related to the Maykop culture; see the R1b section), Brahmins from northern India, northwestern Russians and Poles - basically all people closely related to the Indo-European expansion. R2 is found from northwest India and Pakistan to Iran, Georgia and Turkey. It could be connected to the Indo-Iranians.
Finno-Uralic people have an overall mtDNA admixture similar to other Europeans, with a higher percentage of W and U5b, and a small percentage of Siberian haplogroups such as N or A. The Sami are characterised by a high percentage of haplogroups U5b1 and V.
The Berbers are the indigenous population of north-west Africa. Although their Y-DNA is almost perfectly homogeneous, belonging to haplogroup E-M81, Berber maternal lineages show a much greater diversity, as well as regional disparity. At least half (and up to 90% in some regions) of the Berbers belong to Eurasian lineages, such as H, HV, R0, J, T, U, K, N1, N2, and X2, mostly of Middle or Near Eastern origin. Between 5 and 45% of the Berbers have sub-Saharan mtDNA (L0, L1, L2, L3, L4, L5). There are only three native North African lineages, U6, X1 and M1, representing 0 to 35% of the people depending on the region.
Haplogroup U6 has been observed from Iberia and the Canary Islands to Senegal in the West, and from Syria to Ethiopia and Kenya in the East. It is also found at low density in Europe, though mostly limited to Iberia. Approximately 10% of all North Africans belong to this lineage.
The Gypsies (Romani people) originated in the Indian subcontinent and mixed with local populations in the Middle East and Eastern Europe over the centuries. About half of the Gypsy population belongs to haplogroup M, and more specifically M5 (reflected by Y-haplogroup H1a), which is otherwise exclusive to South Asia. The other mtDNA haplogroups found among the Gypsy community are mostly of Eastern European, Caucasian or Middle Eastern origin, such as H (H1, H2, H5, H9, H11, H20, among others), J (J1b, J1d, J2b), T, U3, U5b, I, W and X (X1b1, X2a1, X2f). The same diversity exists on the Y-DNA side (45% of H1a, followed by I1, I2a, J2a4b, E1b1b, R1b1b, R1a1a).
The list below is non-exhaustive and includes many of the numerous references linked on these websites. Some studies and databases not published on the Web were also used.