Eupedia Genetics

Haplogroup G2a (Y-DNA)

Author: Maciamo.
Last update September 2016 (updated famous people)

Geographic distribution

Nowadays haplogroup G is found all the way from Western Europe and Northwest Africa to Central Asia, India and East Africa, although everywhere at low frequencies (generally between 1 and 10% of the population). The only exceptions are the Caucasus region, central and southern Italy and Sardinia, where frequencies typically range from 15% to 30% of male lineages.

Most Europeans belong to the G2a subclade, and most northern and Western Europeans more specifically to G2a-L141.1 (or to a lesser extent G2a-M406). Nearly all European carriers of G2b (L72+, formerly G2c) are Ashkenazi Jews. G2b has also been found around Afghanistan, probably as an offshoot of Neolithic farmers from the Levant.

Haplogroup G1 is found predominantly in Iran, but is also found in the Levant, among Ashkenazi Jews, and Central Asia (notably in Kazakhstan).

G2a makes up 5 to 10% of the population of Mediterranean Europe, but is fairly rare in Northern Europe. The only places where haplogroup G2 exceeds 10% of the population in Europe are Cantabria, central and southern Italy (esp. in the Apennines), Sardinia, northern Greece (Thessaly) and Crete - all mountainous and relatively isolated regions. Other regions with frequencies approaching 10% include Asturias in northern Spain, Auvergne in central France, Switzerland, Sicily, the Aegean Islands, and Cyprus.

Distribution of haplogroup G in Europe, North Africa and the Middle East


Phylogenetic tree of haplogroup G2a (Y-DNA) - Eupedia


Haplogroup G descends from macro-haplogroup F, which is thought to represent the second major migration of Homo sapiens out of Africa, at least 60,000 years ago. While the earlier migration of haplogroups C and D had followed the coasts of South Asia as far as Oceania and the Far East, haplogroup F penetrated through the Arabian peninsula and settled in the Middle East. Its main branch, macro-haplogroup IJK, would become the ancestor of 80% of modern Eurasian people. Haplogroup G had a slow start, evolving in apparent isolation for tens of thousands of years, possibly in Southwest Asia, cut off from the wave of colonisation of Eurasia.

As of late 2015, there were 301 mutations (SNPs) defining haplogroup G, confirming that this paternal lineage experienced a severe bottleneck before splitting into haplogroups G1 and G2. Haplogroup G itself formed around 48,000 years ago. G1 might have originated around modern Iran at the start of the Last Glacial Maximum (LGM), some 26,000 years ago. G2 would have developed around the same time in Southwest Asia. At that time humans would all have been hunter-gatherers, in most cases living in small nomadic or semi-nomadic tribes. Members of haplogroup G2 appear to have been closely linked to the development of early agriculture in the Levant part of the Fertile Crescent, starting 11,500 years before present. The G2a branch expanded to Anatolia, the Caucasus and Europe, while G2b ended up secluded in the southern Levant and is now found mostly among Jewish people.

Ancient Y-DNA has so far been analysed from Early Neolithic Anatolia as well as from most Neolithic cultures in Europe (Thessalian Neolithic in Greece, Starčevo culture in Hungary/Croatia, LBK culture in Germany, Remedello in Italy, and Cardium Pottery in south-west France and Spain). All sites yielded a majority of G2a individuals, which is the strongest evidence at present that farming originated with and was disseminated by members of haplogroup G. From the start, however, G2a men were accompanied by a diversity of lineages assimilated from local hunter-gatherer populations along the way, such as haplogroups C1a2, H2, E-M78, I*, I1, I2 and J2. Occasionally other Near Eastern lineages showed up, like one T1a sample in the LBK culture, and one R1b-V88 in northwest Spain. T1a tribes are thought to have domesticated ovicaprids in the Zagros mountains, while R1b tribes would have domesticated cattle in the north of the Fertile Crescent.

The highest genetic diversity within haplogroup G is found in the Fertile Crescent, between the Levant and the Caucasus, which is a good indicator of its region of origin. It is thought that early Neolithic farmers expanded from the Levant and Mesopotamia westwards to Anatolia and Europe, eastwards to South Asia, and southwards to the Arabian peninsula and North and East Africa. So far, the only G2a people negative for subclades downstream of P15 or L149.1 were found exclusively in the South Caucasus region.

History of haplogroup G

Several historical migrations brought different subclades of haplogroup G to Europe or redistributed them geographically.

Neolithic farmers and mountain herders

The testing of Neolithic remains in various parts of Europe has confirmed that haplogroup G2a was the dominant lineage of the Neolithic farmers and herders who migrated from Anatolia to Europe between 9,000 and 6,000 years ago.

Cereal and legume farming first developed 11,500 years ago in the Fertile Crescent, in what is now Israel/Palestine, Jordan, Lebanon, Syria and Iraq, but did not expand much beyond this region for the first two and a half millennia. The reason for this delay was that early agriculture was too rudimentary to allow an independent subsistence and was merely a way of supplementing the diet of hunter-gatherers. Cultivation started with wheat, figs and legumes. The domestication of wheat and barley was a lengthy process that necessitated the selection of cultivars possessing mutations for larger, less brittle and nonshattering spikes. The flood plains of Mesopotamia were ideal for primitive cereal farming as they did not require irrigation.

Pottery first appears in the Near East approximately 9,000 years ago in northern Mesopotamia. The development of pottery seems to coincide with the sudden expansion of G2a agriculturalists toward western Anatolia and Europe. Pottery allowed easy storing of cereals and legumes and could have facilitated trade with neighbouring ovicaprid and cattle herders, and pig farmers. Goats and sheep had first been domesticated some 11,000 years ago in the Zagros and Taurus mountains on the northern edge of the Fertile Crescent, but were not introduced to the Levant until approximately 8,500 years ago (see The development of goat and sheep herding during the Levantine Neolithic, A. Wasse, pp. 26-27), just after the appearance of pottery.

The Neolithic settlement of Çatalhöyük in south-central Anatolia was founded by cereal and pulse farmers who also brought domesticated goats and sheep. Only a few centuries later (c. 6500 BCE) were cattle introduced to Çatalhöyük and other sites in Central Anatolia, presumably by trading with their eastern neighbours. Also around 8,500 years ago, G2a Neolithic farmers arrived in northwest Anatolia and Thessaly in central Greece, as attested by the ancient genomes sequenced by Mathieson et al. (2015) and Hofmanová et al. (2015). G2a farmers from the Thessalian Neolithic quickly expanded across the Balkans and the Danubian basin, reaching Serbia, Hungary and Romania by 5800 BCE, Germany by 5500 BCE, and Belgium and northern France by 5200 BCE. Ancient skeletons from the Starčevo–Kőrös–Criș culture (6000-4500 BCE) in Hungary and Croatia, and the Linear Pottery culture (5500-4500 BCE) in Hungary and Germany, all confirmed that G2a remained the principal paternal lineage even after farmers intermingled with indigenous populations as they advanced.

By 7,800 years ago, farmers making cardial pottery arrived at the Marmara coast in northwest Anatolia with ovicaprids and pigs. These people crossed the Aegean by boat and colonized the Italian peninsula, the Illyrian coast, southern France and Iberia, where they established the Cardium Pottery culture (5000-1500 BCE). Once again, ancient DNA yielded a majority of G2a samples in the Cardium Pottery culture, with G2a frequencies above 80% (against 50% in Central and Southeast Europe).

Nevertheless, substantial minorities of other haplogroups have been found at different Neolithic sites next to a G2a majority, including C1a2, I2 and H2 in Anatolia and Southeast Europe, E-M78, I*, I1, I2a, I2a1 in Central and Southeast Europe (LBK, Starčevo) and E-V13 and I2a in the West Mediterranean (Cardium Pottery). These probably represent assimilated hunter-gatherers descended from Mesolithic and Paleolithic Europeans. It is interesting to note that many of those assimilated lineages, like C1a2, H2 and I*, are virtually extinct nowadays. Could it be that these people were not assimilated after all, but rather enslaved by G2a tribes?

Expansion of agriculture from the Middle East to Europe (9500-3800 BCE)

Ötzi the Iceman (see famous individuals below), who lived in the Italian Alps during the Chalcolithic, belonged to haplogroup G2a2a2 (L91), a relatively rare subclade found nowadays in the Middle East, southern Europe (especially Sicily, Sardinia and Corsica) and North Africa. G2a2 (PF3146) is otherwise found at low frequencies all the way from the Levant to Western Europe. In conclusion, Neolithic farmers in Europe would have belonged to G2a, G2a2 (+ subclades) and G2a3 (and at least the M406 subclade).

Nowadays G2a is found mostly in mountainous regions of Europe, for example, in the Apennine mountains (15 to 25%) and Sardinia (12%) in Italy, Cantabria (10%) and Asturias (8%) in northern Spain, Austria (8%), Auvergne (8%) and Provence (7%) in south-east France, Switzerland (7.5%), the mountainous parts of Bohemia (5 to 10%), Romania (6.5%) and Greece (6.5%). The hilly terrain of southern Europe indeed makes it ideally suited for herding goats, which G2a men brought with them during the Early Neolithic period. But the most likely explanation is that mountains provided refuge for G2a tribes after the Proto-Indo-European speakers invaded Europe from the steppes of Russia and Ukraine during the Copper and Bronze Age (see history of R1a and R1b).

Steppe people were almost exclusively cattle and horse pastoralists and first settled in flat regions like the Hungarian Plain, the North European Plain and the Baltic region. Even after reaching Western Europe, they favoured relatively low-lying regions like the Low Countries, western France and the British Isles, where R1b lineages now exceed 60%, and in some places 80%, of the population. In fact, the highest percentages of G2a today are found in the regions last invaded by R1a and R1b people. Indo-Europeans didn't penetrate into Iberia until 1800 BCE and did not cover the whole peninsula until 1200 BCE, and pockets of G2a survive in particularly isolated areas like the Pyrenees, the Cantabrian and Asturian mountains, northern Portugal, or the arid highlands of La Mancha. The Proto-Italics only crossed the Alps into Italy from 1300 BCE and settled more densely in the north, explaining the north-south gradient in R1b in modern Italy, which is practically the mirror image of the distribution of Neolithic haplogroups like G2a, J1 and T1a. Sardinians spoke a non-Indo-European language until the Roman conquest 2,000 years ago.

The distribution map of all G2a subclades does not convey just how thoroughly Proto-Indo-Europeans eliminated G2a lineages in the northern half of Europe, because Proto-Indo-Europeans also carried one type of G2a from the Pontic Steppe, the G2a3b (L141.1) subclade (see below). Nowadays, the Neolithic G2a3a (M406) is found especially in Anatolia, the southern Balkans, the Apennines, the Alps, central France, and the Iberian peninsula. It makes up only a tiny fraction of the G2a in the northern half of Europe, which is chiefly the Indo-European G2a3b variety.

G2a people may have been among the first humans to have acquired the alleles for fair skin. A hunter-gatherer from northern Spain tested by Olalde et al. 2014 still had dark skin as recently as 7,000 years ago. In contrast, an Early Neolithic farmer from Germany possessed the alleles for fair skin found in modern Europeans. The Neolithic individual was female, but Neolithic men from the same LBK culture were predominantly G2a (but also included around 50% of assimilated Paleolithic/Mesolithic lineages such as C1a2, F, I1 and I2a). It is still unclear exactly when and among which haplogroup fair skin arose, but it has been suggested that the new diet brought by cereal agriculture would have caused deficiencies in vitamin D, which was traditionally absorbed from fish and meat among foragers. Mutations for light skin would have been positively selected among Neolithic agriculturalists to stimulate the production of vitamin D from sunlight in order to compensate for the scarcity of meat.

G2a-L141.1, the Indo-European branch of G2a

Unlike other branches of G2a, which are more prevalent in mountainous areas, G2a3b (L141.1), and particularly the G2a3b1 (P303) subclade, is found uniformly throughout Europe, even in Scandinavia and Russia, where Neolithic farmers had only a minor influence. More importantly, G2a3b and its subclades are also found in eastern Anatolia, the Caucasus, Central Asia and throughout India, especially among the upper castes, who represent the descendants of the Bronze Age Indo-European invaders. The combined presence of G2a3b1 across Europe and India is a very strong argument in favour of an Indo-European origin. The coalescence age of G2a3b1 also matches the time of the Indo-European expansion during the Bronze Age.

The homeland of R1b1a (P297) and Pre-Proto-Indo-European speakers is presumed to have been situated in eastern Anatolia and/or the North Caucasus. The Caucasus itself is a hotspot of haplogroup G. Therefore, it is entirely conceivable that a minority of Caucasian men belonging to haplogroup G (and perhaps also J2b) joined the R1b community that crossed the Caucasus and established itself on the northern and eastern shores of the Black Sea sometime between 7,000 and 4,500 BCE.

An alternative theory is that G2a3 (L30) came from Anatolia to eastern and Central Europe during the Neolithic (a fact proven by ancient DNA tests). Once in Southeast Europe it split into two branches: G2a3a, which followed the Danube to Central Europe (LBK), and G2a3b, which migrated east to the Pontic Steppe and brought agriculture to the region. G2a3b would have mixed with the indigenous R1a people, then with R1b newcomers during the Chalcolithic and Bronze Age. By the time the Proto-Indo-Europeans started their massive expansion, G2a3b men (who apparently belonged overwhelmingly to G2a3b1 and its subclades) would have joined R1b-M269/L23 in the invasion of Old Europe from 4200 BCE (=> see R1b history). G2a3a would have been among the conquered populations of Old Europe, seeking refuge in mountainous areas.

Roman redistribution

By the Iron Age, the G2a population in most of Europe had been decimated by the Indo-European invasions, followed by Celtic warfare. G2a men sought refuge from the invaders in the mountains and, as is the case today, reached maximum frequencies in Italy (Apennines, Sardinia) and in the Alps.

The ancient Latins and Romans descend from the Italic tribes who invaded the Italian peninsula from 1200 BCE. They seem to have belonged primarily to haplogroup R1b-U152 (=> see Genetics of the Italian people), but to have carried a substantial minority of G2a3b (L141.1) lineages, especially the U1 and L497 subclades. The Latin homeland in central Italy is one of the hotspots for haplogroup G2a in Europe today. The high level of G2a in Latium might be due to the dual presence of Indo-European G2a3b and of earlier Neolithic lineages who descended from the Apennines to live in Rome after being absorbed by the Roman civilisation.

If the ancient Romans and other Romanised peoples from the Italian peninsula had any genetic impact on other parts of the Roman Empire (as they should have), they certainly contributed to a moderate increase of G2a lineages (in addition to R1b-U152 and J2) within the borders of the empire. Indeed, the frequency of haplogroup G decreases with the distance from the boundaries of the empire. Haplogroup G is extremely rare in Nordic and Baltic countries nowadays, despite the fact that agriculture reached those regions around the same time as Britain or Ireland. Another reason could be that the forested lowlands of northern Germany, Poland and the Baltic were too poor in metals and did not attract as many Bronze Age workers from the Caucasus (=> see Metal-mining and stockbreeding explain R1b dominance in Atlantic fringe). Northeast Europe also has a relatively low percentage of haplogroup R1b, which further reinforces the hypothesis that the two haplogroups spread together during the Bronze Age.

Scythian G1

Haplogroup G1 is the South and Central Asian branch of haplogroup G. While G2a men migrated west to Anatolia and Europe in the Neolithic, their G1 cousins migrated east to Persia and India. Only very rare cases of G1 have been found in Europe, including in Britain and Germany, as well as in most of southern, central and eastern Europe. How did these G1 lineages get there?

Central Asia became a merging zone for southern G1 and J2 lineages with northern R1a lineages during the Bronze and Iron Ages. New hybrid peoples were formed, like the Scythians, who once controlled an empire ranging from northern Pakistan to Xinjiang and to Ukraine. The Romans were known to recruit Scythian or Sarmatian horsemen in their legions. According to C. Scott Littleton in his book From Scythia to Camelot, several Knights of the Round Table were of Scythian origin, and the legend of the Holy Grail itself originated in ancient Scythia. This hypothesis was also taken up in the 2004 movie King Arthur, which opens with the arrival of Scytho-Roman cavalry in Britain. However, Scythians were steppe people more likely to belong to haplogroup R1a. If any of them did belong to G, they presumably were G1, not G2a. This would nonetheless explain the scattered cases of G1 in north-western Europe.

The maternal lineages (mtDNA) corresponding to haplogroup G

Two of the main maternal lineages thought to have evolved conjointly with Y-haplogroup G2a are mt-haplogroups N1a1a and X, both minor lineages with roots in the Middle East. Interestingly, N1a and X are directly descended from N*, rather than from macro-haplogroup R - the ancestor of HV, JT and UK, representing 90% of European mtDNA lineages. The long bottleneck evolution of N1a and X mirrors that of Y-haplogroup G. Furthermore, both N1a and X were found at exceptionally high frequencies in Neolithic Europe compared with the present, N1a being found in 13% of samples from the Linear Pottery culture (LBK) and X2 generally ranging between 5% and 10% in various Neolithic cultures. Today, X is found in 1.5% of the European population and N1a in under 0.5%. The only modern population with high frequencies of mtDNA X is the Druze, who also have over 10% of Y-haplogroup G. Nevertheless, N1a and X cannot be seen as being exclusively linked to Y-haplogroup G for the simple reason that Neolithic farmers were a compound of several male (and female) lineages, which also included Y-haplogroups E1b1b (at least M34) and T1a (and perhaps minorities of J1 and J2).

Famous individuals

Ötzi the Iceman, Europe's oldest natural human mummy, dating from 5,300 years ago, had his full genome sequenced (the oldest European genome ever tested) and was found to belong to haplogroup G2a-L91 (G2a2a2, formerly known as G2a4).

On 12 September 2012, archaeologists from the University of Leicester announced that they had discovered what they believed were the remains of King Richard III of England (1452-1485) within the former Greyfriars Friary Church in the city of Leicester (see Exhumation of Richard III). The skeleton's DNA matched exactly the mitochondrial haplogroup (J1c2c) of modern matrilineal descendants of Anne of York, Richard's elder sister, confirming the identity of the medieval king. Further tests published in December 2014 revealed that his Y-chromosomal haplogroup was G2 (not tested for downstream mutations, but statistically very likely to be G2a3 as a northern European). This, however, did not match the Y-DNA of three modern relatives (who were all R1b-U152 xL2) descended from Edward III, Richard III's great-great-grandfather. Richard descends from the House of York, while the modern relatives descend from the House of Lancaster via John of Gaunt. Therefore it cannot be determined at present when the non-paternity event occurred in the Plantagenet lineage, and whether most of the Plantagenet monarchs belonged to haplogroup G2 or R1b-U152. Both haplogroups are considerably more common in France than in Britain, however, which is consistent with the French roots of the House of Plantagenet.

Joseph Stalin, who was of Georgian origin, belonged to haplogroup G2a1a. This was determined by testing his grandson, Alexander Burdonsky (his son Vasily's son).

Other famous members of haplogroup G2a

  • Najeeb Halaby (1915-2003) : American businessman, government official, celebrated aviator, and the father of Queen Noor of Jordan. The Queen was a guest in the PBS television series Faces of America, and one of her paternal cousins had his Y-DNA tested for the purpose of the series. Her paternal lineage was revealed to be haplogroup G.
  • Jake Gyllenhaal : American actor and son of director Stephen Gyllenhaal and screenwriter Naomi Foner. His Y-DNA was revealed by the PBS television series Finding Your Roots.


Copyright © 2004-2016 All Rights Reserved.