Eupedia Genetics

Genetic history of the Spaniards and the Portuguese

Author: Maciamo Hay (originally published in July 2013; last updated in June 2015)

The Iberian peninsula has a varied and mountainous landscape that has promoted regional division and the isolation of human settlement throughout prehistory and during most of history, until the development of modern transportation. This has created ample opportunities for stark regional variations to develop, be it in culture, language, or genetics. On the other hand, Spain and Portugal are two of the oldest countries in continuous existence in Europe. This long political unity has favoured intermarriages within each country for much longer than in, say, Italy or Germany, which had a moderate uniformising effect on each country's gene pool.

A wide range of peoples have settled in Iberia since the end of the last Ice Age: Phoenicians, Celts, Greeks, Jews, Romans, Goths, Suebi, Franks, Arabs and Berbers. All have left their genetic imprint on the populations of the regions where they settled. This page attempts to identify their genetic markers through the use of Y-chromosomal (Y-DNA) haplogroups, which are passed on nearly unaltered from father to son, mitochondrial DNA (mtDNA), which is inherited only from one's mother, and genome-wide studies.

The modern Iberian gene pool is overwhelmingly Mediterranean, and yet the sequencing of a 7,000-year-old hunter-gatherer from La Braña in Asturias revealed that Mesolithic Iberians shared much closer genetic affinities with modern Northeast Europeans (apart from having dark skin). This shows just how much the genetic landscape of the peninsula has changed in the course of a few eventful millennia. Yet a single Mesolithic genome is not enough to get an unbiased picture of what all Iberian people were like at the time. It cannot yet be excluded that North African hunter-gatherers crossed the Strait of Gibraltar on boats and colonised the Iberian peninsula from the south, while northern and central European foragers occupied northern Spain.

Even Neolithic farmers appear to have come from two different sources, each bringing their own set of haplogroups and autosomal admixtures. A first Mediterranean route brought farmers of the Cardium Pottery culture from the Balkans and Italy. Soon afterwards, La Almagra Pottery culture developed in Andalusia, apparently emerging from the present-day Maghreb. This event would explain the presence of both Northwest African and Red Sea DNA, such as Y-haplogroups E-M81, J1 and T, across most of southern and western Iberia.

So, from a very early time Spain was divided genetically between north and south, as well as between east and west. It could be argued that Iberia started to homogenise from the time of the Reconquista, when northerners recolonised the south, and in the ensuing centuries, when intermarriage slowly but steadily brought Spaniards together, especially in cities. Nevertheless, the regional genetic landscape is still very fragmented, on both the maternal and paternal sides.

History of the peoples and tribes of Iberia

Paleolithic to Early Neolithic

Iberia was one of the last regions of Europe reached by anatomically modern humans, and therefore also one of the last strongholds of the Neanderthals. Modern humans are thought to have reached Iberia from France approximately 28,000 years ago. The last pure Neanderthals may have survived until 24,000 years ago around Gibraltar. The skull of a 4-year-old Neanderthal boy displayed signs of hybridisation between Neanderthals and Homo sapiens. It is now known that all modern Europeans and Asians carry a few percent of Neanderthal DNA due to such hybridisation.

During the Last Glacial Maximum (LGM), which lasted approximately from 26,500 to 19,000 years ago, most of northern and central Europe was covered by ice sheets and was virtually uninhabitable for humans. Coastal Iberia, as well as Catalonia, Aragon and south-west France, were among the temperate refugia for Cro-Magnons. It is thought that Cro-Magnons belonged chiefly to Y-DNA haplogroups C, F and I, but also maybe E-M81 in Iberia as well as E-V13 and J2b in the Balkans, and R1a in eastern Europe. The 7,000-year-old hunter-gatherer tested by Olalde et al. (see above) belonged to Y-haplogroup C1a2 and mtDNA U5b2c1. Almost all other Mesolithic remains from northern and central Europe tested to date yielded I2 lineages.

Whereas Y-haplogroups C1 and F are almost extinct in Iberia today, there are a few surviving Cro-Magnon paternal lineages, mostly haplogroup I2a1a (M26), which is found at low frequency all over the peninsula and peaks among the Basques (5% of all male lineages).

On the maternal side, several haplogroups could have their roots in Mesolithic Iberia. Ancient DNA tests have confirmed haplogroup U5 as the dominant female European lineage during the Mesolithic, although a few U2e and U4 samples were also found in Northeast Europe. Nowadays U5 is found in 8% of the Spanish population, although not all of it descends from Mesolithic Iberians (if any). The highest frequencies of U5 are observed among the Basques (12%) and the Cantabrians (11%), who belong to U5b1c1 and U5b1f (Basques) and to U5b2a1 (Cantabrians). Interestingly, the Mesolithic Asturians belonged to U5b2c1, a subclade that hasn't been found in modern Spain, but has been found in France, Britain and Germany.

Some controversy still surrounds the origin of mt-haplogroups H1, H3 and V. None of them have been identified in Mesolithic samples yet, but that may simply be due to the strong bias of present Mesolithic samples toward central and northern Europe, and the paucity of Mediterranean remains tested. The presence of all three haplogroups alongside Mesolithic U5 in North Africa, Iberia and Northeast Europe points to a common Mesolithic origin. Besides, all four haplogroups are equally rare in the Middle East and follow a north-south gradient indicating an introgression from Europe in historical times.

Neolithic newcomers

Neolithic farmers would have brought a whole new set of haplogroups, while incorporating a minority of Mesolithic lineages, especially in north-eastern Spain. The Y-chromosomal DNA of Neolithic farmers has been tested in various sites around south-east, central and western Europe, and all of them included a majority of haplogroup G2a.

Remains from a Neolithic site of the Cardium Pottery culture in Catalonia also yielded predominantly haplogroup G2a, but also comprised a single E-V13 sample. This at least confirms that E-V13 was already present in Spain 7,000 years ago. Since E-V13 has not been found in any other Neolithic site to date, it is still unclear whether it came from the Near East or Greece with Neolithic farmers, or if, as appears more likely, it was indigenous to southern Europe during the Mesolithic. The high frequency of E-V13 in Southeast Europe today and its very low frequency in the Levant suggest that it was native to the Balkans or southern Italy during the Mesolithic and that a minority of lineages migrated westward with the advancing G2a farmers. Other minority lineages, notably J1, J2 and T, may have accompanied G2a farmers ever since they left the Middle East. Nonetheless, this hasn't been confirmed by ancient DNA tests yet.

Maternal lineages brought by Neolithic farmers from the Balkans and Anatolia can be safely ascertained by the large number of ancient mtDNA already tested. They included haplogroups HV, H2a2, H5a, H13, H20, J1c, K1a, N1a, T2 and X.

There is overwhelming evidence that Neolithic farmers intermingled with some of the Mesolithic foragers they encountered. Haplogroups F, I1 and I2 were all found next to G2a in various sites tested. Additionally, the Sardinians are the modern European population most closely related to Neolithic farmers, and 37.5% of the Sardinian male population belongs to Y-haplogroup I2a1a (M26). I2a lineages were found alongside G2a among early Neolithic farmers in Serbia and in Languedoc (southern France), close to the Spanish border. The close phylogenetic relationship between Basque I2a1a and Sardinian I2a1a suggests that they share a common Neolithic origin. In other words, modern I2a1a in Iberia may not actually represent the direct descendants of the I2a1 people who lived in Iberia during the Mesolithic period, but perhaps the descendants of other Mesolithic Europeans, from the Balkans or Italy, who became integrated into the expanding community of Neolithic farmers early on, then spread alongside G2a, the dominant Neolithic male lineage from the Near East.

Likewise, other branches of haplogroup I found in Iberia today, namely I1 and I2a2a (M223), originated in other parts of Europe and arrived in Iberia much later, brought by Germanic tribes in the fifth century CE (see below).

The presence of G2a and other Near Eastern lineages like E1b1b, J1, J2a and T among both the Basques and Sardinians confirms the mixed Mesolithic and Neolithic origin of both populations, and further corroborates the hypothesis of an early assimilation of indigenous Europeans by Near Eastern farmers and herders.

North African route

Another Neolithic expansion originating from the Middle East appears to have diffused across North Africa when the climate was wetter and greener than today. These Neolithic tribes might have been essentially goat herders from the Fertile Crescent who migrated south to the Arabian peninsula, across the Red Sea to the Horn of Africa (Ethiopia, Somalia), the Sudan, Egypt, then westward to the Maghreb, eventually reaching Andalusia some 7,000 years ago, where they established the La Almagra Pottery culture. Based on the correlations between Y-DNA and mtDNA in the Middle East and Northeast Africa, these people would have belonged to Y-haplogroups J1-P58 and T1a and to mt-haplogroups HV, N1, J, K, T and U3.

Just like their G2a counterparts in Europe, the original J1 and T herders would have mixed with the indigenous populations they encountered on the way. In North Africa these would have been almost exclusively people belonging to Y-haplogroup E-M81 and mt-haplogroups L, M1 and U6. They may also have crossed the path of other Middle Eastern herders, namely the R1b-V88 cattle herders, who are thought to have travelled from eastern Anatolia through the Levant to Egypt, and then across the Sahara and the Sahel. R1b-V88 represents about 3% of male lineages in the Maghreb today and is also found at trace frequencies in Iberia (although some of it may have come from the Phoenicians).

Late Neolithic to Bronze Age

The Late Neolithic period and Copper Age (two periods that overlap, depending on the region) were very propitious for Iberia. Around 2,800 BCE, a new archaeological culture emerged in the Tagus estuary in central Portugal, the so-called Bell Beaker phenomenon. Often referred to as a culture, it was almost certainly not a unified entity, be it politically, linguistically or ethnically, but rather a vast multicultural trade network. For the next 500 years it would spread on land and through maritime routes to various isolated regions in western Europe, including Galicia, Andalusia, Old Castile and Catalonia in Spain, but also to Brittany, the British Isles, the Low Countries, Jutland, southern Germany, the Rhône valley, the Alps, northern Italy, Sardinia, and as far east as Bohemia. Most of these regions (except central Europe) were already somewhat linked to each other as members of the Megalithic culture, which evolved from the Early Neolithic cultures. Although no Megalithic Y-DNA has been tested yet, Megalithic mtDNA from Brittany is a typical blend of Mesolithic (U5b) and Neolithic (K1a, N1a, X2) lineages, in direct continuity with the Cardium Pottery and Linear Pottery cultures. Consequently, Megalithic people would have been predominantly G2a people, with minorities of I2a1a, E1b1b and perhaps also J or T.

The Bell Beaker cultural phenomenon did not in fact replace the Megalithic culture in western Europe, but coincided with it. The Beaker people continued to use common Megalithic burials (e.g. passage graves) like their Neolithic ancestors. In central Europe, where no Megalithic culture existed, bell beaker artefacts nevertheless appear due to the presence of western European merchants.

It is perhaps the wealth of Megalithic people that attracted, through the Beaker network, the Indo-European speakers from central Europe, and caused them to invade western Europe and destroy the Megalithic cultures that had lasted for several millennia. Equipped with bronze weapons and horses, these Indo-Europeans were not cereal farmers but cattle herders from the Pontic Steppe, north of the Black Sea. They had already conquered the Balkans, the Carpathians, Poland, Germany, Scandinavia and the Baltic countries between 4,000 and 2,800 BCE, causing the collapse of all the Chalcolithic and Neolithic cultures in those areas. The southern R1b branch had advanced from the Hungarian plain to Bohemia and Germany by 2,500 BCE (presence of R1b confirmed by Lee et al. 2012), and continued its migration to the Atlantic coast, reaching Britain and western France by 2,200 BCE and Ireland by 2,000 BCE. These R1b men were the Proto-Celts and their Y-DNA is now found in over half of Spanish and Portuguese men.

Map of early to middle Bronze Age cultures from c. 2,500 to 2,000 BCE

The Pyrenees slowed the progression of the Proto-Celts toward Iberia, but eventually, around 1800 BCE, the first foreign Bronze Age cultures made their appearance in El Argar and Los Millares in south-east Spain, with sporadic sites showing up in Castile by 1700 BCE and in Extremadura and southern Portugal by 1500 BCE.

These Early Bronze Age sites typically did not have more than some bronze daggers or axes and cannot be considered proper Bronze Age societies, but rather Copper Age societies with occasional bronze artefacts (perhaps imported). These cultures might have been founded by small groups of R1b adventurers looking for easy conquests in parts of Europe that did not yet have bronze weapons. They would have become a small ruling elite, would have had children with local women, and within a few generations their Indo-European language would have been lost, absorbed by the indigenous languages (see How did the Basques become R1b).

Iberia did not become a fully-fledged Bronze Age society until the 13th century BCE, when the Urnfield culture (1300-1200 BCE) expanded from Germany to Catalonia via southern France, then the ensuing Hallstatt culture (1200-750 BCE) spread throughout most of the peninsula (especially the western half). This period belongs to the wider Atlantic Bronze Age (1300-700 BCE), when Iberia was connected to the rest of Western Europe through a complex trade network. It is during this Bronze Age period, between 1800 and 1200 BCE, that R1b-DF27, the main Iberian branch of R1b, probably propagated.

Distribution of haplogroup R1b-DF27 (SRY2627 + M153) in Europe

The last Celtic migration to Iberia, and perhaps the most significant in terms of cultural impact, happened around 500 BCE, when Central European Celts from the Hallstatt culture expanded over a large swathe of western Europe. Travelling with their families on wagons transporting their belongings, the Celts colonised all of central and northwest Iberia, a region that would remain Celtic-speaking until the Roman conquest over 400 years later. It is still uncertain what the exact haplogroup composition of the Hallstatt Celts would have been, except that they surely possessed a large percentage of R1b-U152. They might also have carried G2a3b1 and J2b2 lineages, among others. Oddly enough, while R1b-U152 is found everywhere in the Iberian peninsula, its frequency never exceeds 5%.

Map of the Hallstatt culture

The Proto-Celtic and Hallstatt Celtic migrations to Iberia had a considerable impact on the modern gene pool. A bit over half of Portuguese paternal lineages and two thirds of Spanish ones can be traced back to this period in the form of R1b (excluding the Germanic S21 branch and some of the U152 that may be Roman), as well as G2a3b1 and J2b2 (again, excluding the portion that came with the Romans). Celtic maternal lineages are harder to identify, but they indubitably represent a much smaller portion of the Iberian gene pool. For instance, the Basques, who have the highest percentage of R1b, might not have more than 5% of Indo-European mtDNA, which explains why their mother tongue remained non-Indo-European.

No proper study of deep mtDNA subclades exists for Spain or Portugal, but a rough estimate is that between 15% and 30% of maternal lineages can be traced back to Indo-European invaders, be they Celtic, Roman or Germanic. The disparity between paternal and maternal Indo-European lineages is not surprising considering that Proto-Indo-European speakers advanced across Europe, from the Black Sea shores, as military conquerors, and progressively blended with the conquered populations by taking local wives or concubines. Consequently, whereas their paternal lineages spread like wildfire in all conquered regions, we observe a slowly declining gradient from east to west for maternal haplogroups. In that respect Spanish and Portuguese people may not have less Indo-European mtDNA than the French, and probably have higher levels than South Italians, who weren't much affected by the Indo-European migrations.

Late Bronze Age to Iron Age

Phoenicians & Greeks

Between 1200 and 539 BCE the Phoenicians built a vast commercial empire from their Levantine homeland along the southern Mediterranean as far as Andalusia. The oldest city in Iberia is Cadiz, which was founded by the Phoenicians as Gadir or Agadir in 1104 BCE. The Phoenicians also founded Almuñécar, Malaga, Cartaya and Huelva, and settled in other existing cities such as Tartessos and Carmona.

Based on the haplogroups found in modern Lebanon and in their former colonies, the Phoenicians seem to have carried a mixture of haplogroups J2a, J1, E1b1b, G, R1b-M269/L23, T, L, R1b-V88, R2 and Q1b, roughly in that order of frequency. It is not easy to assess the percentage of modern Iberian lineages of Phoenician origin because many other peoples brought similar haplogroups. The most uniquely Phoenician lineages, which were normally not found among the ancient Greeks and Romans, are Q1b, R1b-V88 and R2. And indeed all of them have been found, mostly in Portugal and south-west Andalusia, but only at trace frequencies (under 0.5%).

The island of Ibiza was another major Phoenician colony, which has the particularity of having been left in isolation for most of its subsequent history. It is therefore likely to have more Phoenician lineages than average. That is probably the case as Adams et al. (2008) found 17% of haplogroup T on Ibiza, by far the highest percentage in Europe for the Middle Eastern lineage, but also 13% of haplogroup G (more than anywhere else in Iberia) and 4% of E-M123, the Levantine variety of E1b1b.

Not surprisingly, the second highest percentage of haplogroup T identified in Iberia is in Cadiz (10%). Like haplogroup T, E-M123 is mostly found in Murcia, Andalusia, Extremadura and Portugal, suggesting that this is where the Phoenicians had the largest genetic impact. Haplogroups J1 and J2a also peak in these regions.

The ancient Greeks had a relatively small impact on the Spanish gene pool, having only a few minor colonies in Catalonia and near Alicante. Modern Catalans have only 2% of haplogroup J2 and 3% of haplogroup E1b1b, the two main Greek lineages. Yet if we account for the contribution of the Romans and other invaders, and allow for the possibility that some E1b1b might be of Mesolithic, Neolithic or even Bronze Age origin, it is doubtful that Greek Y-DNA exceeds 3% of the male population in Catalonia, the region with the highest potential Greek ancestry.

Romans & Jews

The Romans did not establish as many population colonies in Iberia as they did in Gaul. There were only four Roman cities in Hispania: Tarraco (Tarragona), Emerita Augusta (Mérida), Italica (Santiponce, near Seville), and Carthago Nova (Cartagena), re-built on the ruins of the Carthaginian city.

The Romans would have brought very similar lineages to the Hallstatt Celts (R1b-U152, E-V13, G2a3b1 and J2b2), being themselves descended from an earlier migration (c. 1200 BCE) of Hallstatt Italo-Celts. But the Romans also assimilated many neighbouring tribes in Italy, including the Etruscans and the Greeks, who would both have carried E-V13, E-M34, G2a, J2a, R1b-L23 and T lineages. The genetic impact of the Romans is the most difficult to gauge, as their haplogroups look essentially like a blend of Hallstatt Celts and Greeks. Comparing the frequencies of R1b-U152 and R1b-L23, and deducting the part attributable to other ethnic groups, there could be anywhere between 1% and 15% of Roman Y-DNA in various regions of Iberia. The highest level is probably found along the Mediterranean coast, in western Andalusia and in Extremadura, because this is where R1b-L23, J2 and E-V13 are the highest, but also because this is where the main Roman population centres were found.

The Jews established communities throughout Spain and Portugal in the early centuries CE, during the Roman period, and became known as Sephardic Jews. Spanish Jews once constituted one of the largest and most prosperous Jewish communities under Muslim and Christian rule, before they, together with resident Muslims, were forced to convert to Catholicism, be expelled, or be killed when Spain became united under the Catholic Monarchs Ferdinand and Isabella in 1492.

The Golden Age of Jewish culture in Spain started with the Umayyad conquest of Iberia in 711 and lasted until the end of the Caliphate of Cordoba and the Almoravid invasion in the 11th century. In the 14th century, approximately 8% of the Spanish population was Jewish. It is not known how many Jews converted to Catholicism to escape persecution, but historians estimate the number to have been very large.

The Sephardic Jews who fled the Inquisition and sought refuge in other European countries or Turkey have remained a distinct ethnic group to this day, and it is therefore easy to assess their haplogroup composition. This was in fact done by Adams et al. (2008) in their survey of Spanish paternal lineages (see table below) - although no comparable study exists for mtDNA (only for Ashkenazi Jews). Unsurprisingly, Sephardic Jews have Y-DNA haplogroups very similar to those of the Lebanese, and must also have been close to the ancient Phoenicians. Unfortunately this makes it almost impossible to distinguish which lineages are of Jewish or Phoenician origin in Iberia, a task made all the more difficult by the interference of similar haplogroups brought by the Greeks and Romans (J2a, R1b-L23, T) or the Arabs (J1, J2a, T). A rough estimate is that Jewish or Phoenician Y-DNA accounts for 25-30% of lineages in Extremadura and southern Portugal, 15-20% in central Portugal, Andalusia and possibly also Castile-La Mancha, and less than 10% in most other regions. The only caveat is that these figures do not take into account the genetic contribution of Neolithic herders who may have come from Southwest Asia via North Africa. Only a deeper analysis of the subclades of haplogroups J1 and T could confirm exactly the proportion of Neolithic, Phoenician, Jewish and Arabic paternal ancestry in each region of Spain and Portugal.

Germanic migrations

In the 4th and 5th centuries the cooling of the climate prompted Germanic and Slavic tribes to migrate south and west and to invade the Roman Empire in search of more fertile lands.

In 406, the Alans (who were not Germanic but of Iranian origin), the Suebi and the Vandals crossed the Rhine together, invading Gaul; three years later, they crossed the Pyrenees into Roman Hispania. The Suebi migrated to the western half of Iberia, where they established the Kingdom of Gallaecia (409–585). The Vandals and the Alans went south to Andalusia, then crossed over to North Africa in 429, where they founded a kingdom that also comprised Sicily, Sardinia and Corsica.

The Suebi were the only one of the three tribes that actually settled in Iberia. As a Germanic tribe, they would have brought haplogroups I1, I2a2a (M223, formerly known as I2b1), R1b-U106 and R1a (L664, Z282 and Z283 subclades) to the Iberian peninsula, and indeed all of them except R1a are found essentially in the western half of the Iberian peninsula, especially in Portugal and Galicia. R1a is found in northern Castile, Asturias and Cantabria, and could either have been brought there by the Visigoths, or descend from Mesolithic hunter-gatherers (as is the case of the Pasiegos).

The Goths were the first to penetrate into the Roman Empire, at the beginning of the 4th century, first settling in the Balkans, and eventually split into two factions, the Ostrogoths and the Visigoths. The latter, under the command of Alaric I, sacked Rome in 410, then went on to establish a Visigothic Kingdom in south-western Gaul in 418. Quickly expanding over all Aquitania, the Visigoths then looked to expand south, and by the middle of the 5th century they had conquered most of central and southern Iberia. In the 580s they annexed the Suebi Kingdom, as well as the land of the Cantabrians and the Basques. The Visigothic Kingdom lasted until the Muslim conquest of Iberia in 711.

The Visigothic Kingdom was larger and longer-lived than the Suebi Kingdom, and yet the Goths do not seem to have had any significant genetic impact on the Iberian population - at least not in terms of Germanic Y-DNA. The reason might simply be that they were no longer a predominantly Germanic tribe. After all, the Goths had lived for many centuries in Eastern Europe and nearly two more centuries in the Balkans before invading Italy, Gaul and Iberia. They could have assimilated a lot of non-Germanic people on the way, notably R1a and I2a1b Slavs and predominantly E1b1b, I2a1b and J2 Balkanic people. It would be pretty complicated at the moment to untangle the Balkanic E1b1b and J2 from all the others (Neolithic, Phoenician, Greek, Roman, Jewish, Arabic) found in Iberia. But it is remarkably easy to check for the Eastern European I2a1b (M423) and R1a (M458 and Z280). No historical migration could account for Slavic haplogroups in Iberia apart from the East European populations assimilated by the Goths before the 4th century. The I2a Project at FTDNA has three M423-Dinaric-N and one M423-Isles-B2 from Spain, while the R1a1a and Subclades Y-DNA Project has four Spanish Z280 (CTS1211+) members.

Overall, the Germanic migrations did not leave a lot of Germanic DNA in the Iberian peninsula. That is not surprising considering that there were only 40,000 Suebi who settled there permanently, and they were the biggest contingent if we exclude the heavily hybridised Visigoths. Galicia, northern and central Portugal, and Catalonia are the regions with the highest ratios of Germanic Y-DNA today (approx. 5 to 10% of the male lineages), which is consistent with the historical settlements of the Suebi, and the Frankish influence in Catalonia's case. Paternal lineages of the ruling classes, however, generally lead to an overestimation of the true genetic contribution, since foreign invaders turned monarchs and nobles tend to procreate more by having multiple sexual partners (if not multiple wives, at least mistresses or concubines). Unfortunately it is impossible at present to determine the amount of Germanic mtDNA, as this would require testing full mitochondrial sequences (which very few studies have done to date), and even then it may prove elusive due to the limitations of the extremely short mtDNA sequence. A reasonable estimate is that Germanic genes represent no more than 1% of the Iberian gene pool, with maximums of perhaps 3% or 4% in Galicia and northern Portugal.

Moors & Franks

In the 7th century, early Muslims from the Arabian peninsula started spreading their new faith and conquered a good part of the Middle East and the whole of North Africa under the Umayyad Caliphate. In 711, they crossed the Strait of Gibraltar and invaded Iberia, which they called Al-Andalus. This was the commencement of the Moorish period in the peninsula, which would last for nearly eight centuries, until the fall of the Emirate of Granada to the Catholic Monarchs in 1492.

The Inquisition killed or expelled a lot of Muslims, but, as was the case with the Jews, many converted to Christianity and remained in Spain and Portugal. As many as 275,000 of these Moriscos, as the converts were known, were expelled from Castile and Valencia in the early 17th century, but many more lingered in other regions, notably Aragon, Andalusia, Extremadura and Portugal. At one point, Moriscos accounted for 20% of the population of Aragon. It is probably not a coincidence that haplogroups E1b1b, J and T make up 20% of modern Aragonese male lineages, despite the fact that the region was never under Phoenician or Greek influence.

The Moors would have been a hybrid population composed of Arabs, belonging chiefly to Y-haplogroups J1-P58 and T, with small amounts of J2a, R1a-Z93 and R1b-V88, and Berbers, who were then almost exclusively E-M81. It is now possible to distinguish Arabic J1-P58 from Jewish J1-L816 and Phoenician J1-YSC234 or J1-YSC76, but none of the studies on Iberian Y chromosomes have tested deep J1 subclades to date. All that is known is that all of these subclades have been found both in Portugal and Spain in commercial DNA tests, but data is insufficient to determine regional proportions of each subclade. Regarding E-M81, many subclades exist, but none have been found to be exclusively European or North African, so it is not yet possible to tell apart M81 that came to Iberia during prehistory from the more recent contribution of the Moors.

The Franks were the ones who stopped the Muslim progression in western Europe by defeating the Moorish armies at the Battle of Tours in 732. Subsequently, under the rule of Charlemagne, the Spanish March was created as a buffer against the Umayyad Caliphate on the Spanish side of the Pyrenees (from Navarre to Catalonia). The March quickly evolved into the independent Kingdom of Navarre (824–1620) and the Frankish County of Barcelona (801–1162), later to become the independent Kingdom of Aragon (1035–1706). The Franks did not, however, colonise the region, and their genetic legacy would only have passed through the (potentially prolific) nobility.

Genome-wide analysis

Looking at autosomal DNA (i.e. the whole genome except the X and Y chromosomes and mitochondrial DNA), Iberian people are remarkably homogeneous - in a way that couldn't be guessed by looking at the distribution of Y-DNA and mtDNA haplogroups only. This is because genes spread fast in a population linked by a common language and a unified political entity. Paternal lineages often maintain regional and local patterns inherited over the centuries and millennia because in patriarchal societies, as Europe has been at least since the Bronze Age, it has consistently been men who inherited their parents' land, and women who married in the next village or town. This kept male lineages more fixed geographically than female lineages or overall genes. Only major geographic or linguistic obstacles, like crossing the snow-capped Cantabrian Mountains, or intermarrying with speakers of an utterly different language like Basque, would have seriously hindered the propagation of autosomal DNA in the long term.
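
The reasoning above can be illustrated with a toy model. The Python sketch below is not taken from any study cited on this page; it simply assumes two villages, A and B, that start out genetically distinct, in which sons always stay in their father's village and a fixed share of brides (the hypothetical `bride_exchange` parameter) marries into the other village each generation. Under those assumptions the autosomal ancestry of the two villages converges quickly, while the Y-DNA of each village never changes.

```python
def simulate(generations, bride_exchange=0.2):
    """Toy patrilocal model: share of 'village A' ancestry in villages A and B.

    All numbers are illustrative assumptions, not measured data.
    Returns a list of (autosomal_A, autosomal_B, y_dna_A, y_dna_B) per generation.
    """
    auto_a, auto_b = 1.0, 0.0  # autosomal share of A-ancestry in each village
    y_a, y_b = 1.0, 0.0        # share of A-origin Y lineages in each village
    history = [(auto_a, auto_b, y_a, y_b)]
    for _ in range(generations):
        # Mothers: a fraction of the brides comes from the other village.
        mothers_a = (1 - bride_exchange) * auto_a + bride_exchange * auto_b
        mothers_b = (1 - bride_exchange) * auto_b + bride_exchange * auto_a
        # Children take half of their autosomes from each parent; fathers are local.
        auto_a, auto_b = (0.5 * auto_a + 0.5 * mothers_a,
                          0.5 * auto_b + 0.5 * mothers_b)
        # Y-DNA passes from father to son and sons stay put, so y_a and y_b are unchanged.
        history.append((auto_a, auto_b, y_a, y_b))
    return history


if __name__ == "__main__":
    for gen, row in enumerate(simulate(20)):
        if gen % 5 == 0:
            print(gen, [round(x, 3) for x in row])
```

In this simplified model the autosomal gap between the two villages shrinks by a factor of (1 - `bride_exchange`) every generation, whereas the paternal lineages stay exactly where they started. Real populations are of course far messier (male migration, conquest, drift), so this is only a sketch of the mechanism, not an estimate for Iberia.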

The Basque and Catalan exceptions

The Basques are indeed somewhat different genetically from other Spaniards. They have a bit more Northwest European ancestry (inherited from Mesolithic hunter-gatherers), and completely lack Red Sea, Southwest Asian and Caucasian admixtures (see autosomal maps). The absence of Red Sea and Southwest Asian admixture indicates that the Basques do not have any Phoenician, Jewish, Greek, Roman or Arabic ancestry. Looking at maternal lineages, the Basques also stand out from the rest of the peninsula, lacking many haplogroups, be it those associated with African or Southwest Asian ancestry (HV, L, M1, U3, U6) or those linked to the original Indo-European homeland in Eastern Europe (H2a1, H4, H7, H8, H11, H15, I, T1a1a1, U2, U4, W). They make up for it with higher frequencies of Mesolithic and Neolithic lineages (H1, H2a2a, H3, H5a3, J2a1a, J1c, K1a, T2, U5, V and X). This is in perfect agreement with the fact that the Basque language is non-Indo-European. What generally comes as a surprise is that 85% of Basque paternal lineages belong to the Proto-Celtic R1b-P312. This can be explained by the fast replacement of male lineages due to warfare with neighbouring Proto-Celts and the establishment of a Celtic ruling class who quickly spread their Y-DNA through polygamy.

Interestingly, the Catalans also lack the Southwest Asian ancestry, but do have some Red Sea and Caucasian genes. The Southwest Asian admixture is slightly more common in southern Portugal and Andalusia, which is consistent with the higher historical presence of Phoenician, Roman and Arabic people in that region. The Basques and the Catalans are the only Western Europeans completely lacking genetic contribution from Southwest Asia. This also translates into an extreme scarcity of Y-haplogroups J1, E-M34 and T, which are all typically Southwest Asian lineages.

Middle Eastern and North African DNA in Iberia

The Caucasian, Red Sea and African admixtures are the least homogeneous autosomal ancestral markers. They are considerably more pronounced in western Iberia, from Galicia and Asturias to Portugal and Andalusia, via north-west Castile and Extremadura. This corresponds almost exactly to the higher frequency of Y-haplogroups E1b1b (both North African E-M81 and Southwest Asian E-M34) and T (Middle East and Red Sea), and mt-haplogroups L (Africa) and U3 (Southwest Asia). Although this makes sense for the south-west of the peninsula due to the historical presence of the Phoenicians, Jews, Arabs and Berbers, it is still unclear why the north-west (North Portugal, Galicia, Leon, Asturias) follows the exact same pattern on all levels (autosomal, Y-DNA and mtDNA).

One possibility is that western Iberia was settled by Neolithic farmers from Southwest Asia who arrived via North Africa and picked up E-M81 and mtDNA L, M1 and U6 along the way. The presence of Caucasian and Red Sea admixtures without any substantial Southwest Asian admixture generally points to a Neolithic origin. The Southwest Asian admixture is only substantial in south-west Iberia. A Neolithic origin would make sense were it not for the presence of E-M34 (aka E-M123), which is believed to be the original Proto-Semitic lineage and which would only have arrived in Southwest Asia during the Copper Age or Early Bronze Age. In Europe it is mostly the Romans who spread E-M34, and this lineage is in fact almost completely absent from regions that lay outside the borders of the Roman Empire.

The second hypothesis is that the north-west was historically re-populated by people from the south-west. During the Reconquista the opposite is known to have happened. It also could not have happened during the Moorish period since the Moors never managed to conquer Galicia, Asturias and Cantabria. So that leaves the Phoenician and Roman periods (roughly from 1000 BCE to 500 CE) as a possible timeframe for this northward population movement. The two hypotheses are not mutually exclusive, and indeed the north-west has less E-M34 than the south, owing to a smaller presence of Phoenician, Roman, Jewish or Arabic lineages. In fact, the most likely explanation is that the bulk of Southwest Asian and North African DNA in north-west Iberia is of Neolithic origin, and lineages like E-M34 came from Roman colonists and Jews who converted to Christianity.

The case of Cantabria is the most telling since Cantabrians have the highest percentages of E-M81, G2a, J1 and T in northern Spain, but almost completely lack E-M34, E-V13 and J2a. The only J2 in the region is J2b, which isn't found in Southwest Asia or North Africa and would most likely have come with Celts from central Europe. All this is entirely consistent with an exclusively Neolithic origin of those lineages and a negligible Roman or Jewish heritage.

Estimating Phoenician and Arabic DNA from the Haak 2015 data

The autosomal data provided by Haak et al 2015 (extended data figure) shows that the Basques and other North Spaniards differ from other Spaniards by the absence of Bedouin-like (purple), Caucaso-Gedrosian (greyish green), and East African (pink) admixtures. These three components are found among the Southwest Asians (Arabs) and North Africans. These undeniably represent the genetic contributions of the Arabs and Berbers from the Moorish period, but also probably to a considerable extent that of the ancient Phoenicians.

The Bedouin-like admixture is the dominant component and accounts for approximately 10% of the Central and South Spanish DNA. This admixture peaks in Saudi Arabia and Yemen and some of it could indicate medieval Arabic ancestry. Then comes the Caucaso-Gedrosian (5%), which is found mostly in the Middle East, but is absent from Morocco and most of Algeria. This admixture is found in Tunisia (8%) and Sardinia (3%) though, which strongly suggests that the Phoenicians brought it to the West Mediterranean. The East African admixture only makes up 1% of Spanish genomes, the same percentage as in Sicilians and North African Jews. Berbers and Egyptians have about 10% of this admixture. Neolithic farmers would have contributed most of the 50% of the orange admixture, which represents the Early European Farmer admixture taken from actual Neolithic samples. Some Neolithic admixture would have come from the Phoenicians and the Moors. Comparing the admixtures found in Lebanon, Sardinia and Tunisia, it seems that the ancient Phoenicians had about one third of Bedouin-like (purple), one third of Caucaso-Gedrosian (greyish green) and one third of Neolithic Farmer (orange).

Since the Caucaso-Gedrosian was probably brought to Spain mostly by Phoenicians (being nearly absent from the Algerian, Mozabite and BedouinB samples), it can be inferred that the Phoenicians contributed approximately 12% of the DNA in an average South Spanish genome (4% for each of the three admixtures). The other 6% of the Bedouin-like admixture (the 10% total minus the 4% attributed to the Phoenicians) would be medieval Arabic in origin. Using the proportions of modern Saudi Arabs as a proxy, we can estimate that the Bedouin-like admixture made up 75% of medieval Arabs' genomes. That would give a total of about 8% (6% divided by 0.75) of Arabic DNA in a South Spanish genome today.
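The back-of-the-envelope estimate above can be retraced in a few lines of Python. This is only a sketch of the arithmetic, using the rounded percentages quoted in the text. The constant and function names are made up for illustration, and the result is no more precise than the assumptions that go into it (in particular the 4% Phoenician share per admixture and the 75% Saudi proxy).

```python
# Rounded figures quoted in the text, in percent of an average South Spanish genome.
BEDOUIN_LIKE_TOTAL = 10.0        # Bedouin-like admixture in Central and South Spaniards
PHOENICIAN_PER_ADMIXTURE = 4.0   # share attributed to Phoenicians in each admixture
PHOENICIAN_ADMIXTURES = 3        # Bedouin-like, Caucaso-Gedrosian, Neolithic Farmer

# Assumed fraction of Bedouin-like admixture in medieval Arabs (modern Saudi proxy).
BEDOUIN_FRACTION_IN_ARABS = 0.75


def estimate_contributions():
    """Return (Phoenician %, leftover Bedouin-like %, medieval Arabic %)."""
    phoenician_total = PHOENICIAN_PER_ADMIXTURE * PHOENICIAN_ADMIXTURES
    # Bedouin-like admixture not explained by the Phoenician contribution.
    bedouin_leftover = BEDOUIN_LIKE_TOTAL - PHOENICIAN_PER_ADMIXTURE
    # Scale up: the leftover is assumed to be only 75% of what the Arabs brought.
    arabic_total = bedouin_leftover / BEDOUIN_FRACTION_IN_ARABS
    return phoenician_total, bedouin_leftover, arabic_total


phoenician, leftover, arabic = estimate_contributions()
print(f"Phoenician: {phoenician:.0f}%  leftover Bedouin-like: {leftover:.0f}%  Arabic: {arabic:.0f}%")
```

Changing the constants shows how sensitive the estimate is: a lower Bedouin-like fraction in medieval Arabs would raise the Arabic figure, and a higher one would lower it.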


The majority of Iberian paternal lineages are of Indo-European origin (R1b, G2a3b1, J2b2 and a small amount of R1a), which can be attributed to the Proto-Celtic and Hallstatt Celtic invaders, and to a lesser extent to later Roman and Germanic settlers. In total, these amount to 50-85% of Spanish Y-DNA and 60% of Portuguese Y-DNA. Maternal lineages, on the other hand, appear to have a mostly Neolithic and Mesolithic origin, notably haplogroups H1, H3, HV0, K1a, J1c, J2a1, J2b1a, T2, U5b, V and X, which make up over 80% of the mtDNA in regions like the Basque country or Asturias, and always over 50% of the population of any region.

Western Iberia, from Galicia and Asturias to southern Portugal and western Andalusia, has relatively high percentages of Southwest Asian Y-chromosomal haplogroups (E-M34, J1, J2a, T). Their historical origin is diverse, being the cumulative contributions of Levantine Neolithic herders, Phoenicians, Jews and Arabs, although their exact proportions remain difficult to assess and may vary a lot between regions. What can be ascertained is that northern regions such as Cantabria, Asturias and even Galicia have negligible medieval Arabic, Jewish and Phoenician ancestry, and therefore the presence of Southwest Asian haplogroups should be attributed to Neolithic herders. Maternal Southwest Asian lineages include especially HV, J1d, J2a2, U3, X1 as well as some K, T and X2 subclades. Autosomal data shows a maximum of 12% of Southwest Asian and Red Sea DNA in southern Portugal and western Andalusia, and a minimum of 0% in the Basque country.

Southwest Asian lineages are usually found side by side with North African lineages, like Y-haplogroup E-M81 and mt-haplogroups L, M1 and U6. The most likely explanation for their presence in Iberia is that they "hitchhiked" with Neolithic herders and medieval Arabic invaders passing through the Maghreb. Some North African lineages may even have come during the late Glacial period. The origin of mtDNA H1, H3 or HV0/V is unclear. They may have been present in Iberia and/or the Maghreb in the Mesolithic period, since these three lineages are also found all over North Africa. Yet it can't be excluded that they were integrated into the Neolithic agricultural community in the Maghreb and moved into Iberia at that time. Autosomal data shows an average of 5% of North African DNA in the western half of Iberia, and 1 or 2% in the eastern half.

North-eastern Spain, from the Basque country to Catalonia, was colonised by Neolithic farmers from Italy and France, and consequently has the lowest incidence of Southwest Asian or North African DNA in the peninsula today.

Migrations and settlements in historical times had a smaller impact on the genetic structure of Iberians than Neolithic and Bronze Age events. Only Y-DNA can be used today to measure the contributions of other European populations in Iberia, and even Y-DNA can't yield accurate estimates without large quantities of high-resolution data. The Romans left perhaps between 1% and 15% of Y chromosomes behind them, with a higher proportion along the Mediterranean coast, in Andalusia and in Extremadura. Germanic male lineages now make up about 4% of the overall population, with the highest frequencies (6-10%) observed in the north-west and Catalonia.

Y-DNA frequencies by region

Distribution of Y-DNA haplogroups in Spain and Portugal

Total samples: Spain = 1798; Portugal = 1458; Sephardic Jews = 174. The Y-DNA frequencies for Lebanon are also indicated for the sake of comparison with the historical Phoenician homeland.

Region/Haplogroup           I1 I2*/I2a     I2b     R1a     R1b       G      J2   J*/J1   E1b1b       T       Q       N
Portugal                     2     1.5       3     1.5      56     6.5     9.5       3      14     2.5     0.5       0
Spain                      1.5     4.5       1       2      69       3       8     1.5       7     2.5       0       0
  • Andalusia                0     9.5       0     3.5    58.5       3    10.5       2      10       3       0       0
  • Aragon                   2    14.5       1       2    60.5       1    10.5       0       5       4       0       0
  • Asturias                 2       2       0     2.5    58.5       8       8       2      14       3       0       0
  • Basque country         0.5       5       0       0      85     1.5     2.5     0.5     2.5       0     0.5       0
  • Cantabria                1       3       2     8.5      55    10.5       3     2.5      11     2.5       0       0
  • Castile & Leon         0.5       2     0.5       3      64       5       6       1      16       2       0       0
  • Castile-La-Mancha      1.5     1.5     0.5     1.5      66       8      10       4       5       2       0       0
  • Catalonia                2     3.5     1.5     1.5    66.5     4.5     7.5     1.5     8.5       1       0       0
  • Extremadura            3.5       5       1       0      50       5    11.5       0    18.5       5       0       0
  • Galicia                  3     2.5     1.5       0      63       3     3.5       1      22     0.5       0       0
  • Valencia                 3     5.5       1       3    63.5       1       6       2    13.5     1.5       0       0
Sephardic Jews               0       1       0       5      13      15      25      22       9       6       2       0
(Lebanon)                    2     1.5     1.5     2.5       8     6.5      26      20    17.5       5       2       0
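
The claim made earlier on this page that E1b1b is more frequent in western Iberia can be checked against the table by ranking the Spanish regions on that column. The snippet below is a minimal sketch: the percentages are copied from the E1b1b column above, while the dictionary and function names are only illustrative. Note that the table gives E1b1b as a whole, so it cannot separate E-M81 from E-M34.

```python
# E1b1b frequencies (%) for the Spanish regions, copied from the Y-DNA table.
E1B1B_BY_REGION = {
    "Andalusia": 10,
    "Aragon": 5,
    "Asturias": 14,
    "Basque country": 2.5,
    "Cantabria": 11,
    "Castile & Leon": 16,
    "Castile-La-Mancha": 5,
    "Catalonia": 8.5,
    "Extremadura": 18.5,
    "Galicia": 22,
    "Valencia": 13.5,
}


def rank_regions(frequencies):
    """Return (region, frequency) pairs sorted from highest to lowest frequency."""
    return sorted(frequencies.items(), key=lambda item: item[1], reverse=True)


ranking = rank_regions(E1B1B_BY_REGION)
for region, percent in ranking:
    print(f"{region:<18} {percent:>5.1f}%")
```

Galicia, Extremadura, Castile & Leon and Asturias come out on top, and the Basque country comes last, which matches the west-to-east gradient described in the text.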


Sources of Y-DNA frequencies

MtDNA frequencies by region

Region/Haplogroup       L     HV      H  H1+H3     H5  HV0+V      J     T1     T2     U2     U3     U4     U5      U      K      I      W      X  Other   Size
Portugal              6.4    0.1   43.9   (26)  (2.1)    4.8    6.8    3.3    6.3    1.2    0.9    1.7    6.5      3    6.1    2.2    1.8      2    2.9   1448
Spain                 2.4    0.7   44.1   (28)  (2.6)    7.5    6.6    2.1    6.4    1.1    1.4    1.9    8.1    1.8    6.3    1.1    1.4    1.7    5.5   2506
  • Andalusia         7.4    0.8   44.3 (29.5)      -    4.8    8.9    2.3    2.6      1      1    1.6    5.7    1.2    6.8    1.3    1.6    3.2    3.1    310
  • Aragon            1.2      1   39.3      -      -      5   15.8      0   10.9    0.8      0    1.7    9.6      0    4.2    1.7    0.8    0.8    7.2    119
  • Asturias            0      1   54.1      -      -    5.6      9      0    1.1      0    2.2      0   12.3    2.2    7.9    1.1    1.1      0    2.4     89
  • Basques           0.3    0.8     49   (44)  (2.8)    7.9    7.6    1.5      6      1    0.3    0.8   11.7    1.9    5.3    0.6    1.1    2.3    1.8    618
  • Cantabria         1.6    2.5   37.6   (27)      -     19    3.7    0.4    2.5    0.8    1.2    2.9   10.7    2.5    3.7    2.9      0      0    0.4    242
  • Catalonia         3.1    0.5   29.5      -      -    7.5      7    1.3    7.6    1.3    2.5    3.8   10.1    3.9     10    1.3      5    2.5    3.1     80
  • Galicia           3.7      1   58.5   (34)      -    3.8    8.6    1.1    3.7    1.6      0    0.5    5.4    1.5    4.9    0.5    1.6    1.1    0.5    185

(Figures in parentheses are already counted in the H column. A dash marks a cell for which no value is given.)
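
The figures in parentheses in the H1+H3 and H5 columns appear to be subsets of the H column rather than additional frequencies. This reading is an inference from the numbers, not something stated in the table, and it can be tested with a quick sum on one row. The sketch below uses the Portugal row; the variable names are illustrative.

```python
# Portugal row of the mtDNA table (%), without the parenthesised figures.
PORTUGAL_MAIN = {
    "L": 6.4, "HV": 0.1, "H": 43.9, "HV0+V": 4.8, "J": 6.8, "T1": 3.3,
    "T2": 6.3, "U2": 1.2, "U3": 0.9, "U4": 1.7, "U5": 6.5, "U": 3,
    "K": 6.1, "I": 2.2, "W": 1.8, "X": 2, "Other": 2.9,
}

# Parenthesised figures from the same row, assumed here to be subsets of H.
PORTUGAL_PARENTHESISED = {"H1+H3": 26, "H5": 2.1}

# Round to one decimal to hide floating-point noise in the sums.
total_without = round(sum(PORTUGAL_MAIN.values()), 1)
total_with = round(total_without + sum(PORTUGAL_PARENTHESISED.values()), 1)

print(f"Sum without parenthesised figures: {total_without}%")
print(f"Sum with parenthesised figures:    {total_with}%")
```

The row only adds up to roughly 100% when the parenthesised figures are left out, which supports reading them as a breakdown of H.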

