The Iron Age genomic dataset from France

A total of 145 individuals were targeted for palaeogenomic analyses (Table S1). DNA
was extracted, and DNA libraries were built with a partial uracil-DNA glycosylase treatment,
allowing for the assessment of postmortem deamination patterns (2% to 29%) expected for
ancient DNA data. Initial screening via shotgun sequencing of 1 to 2 million reads was used
to select libraries with an amount of endogenous DNA above 15%, leading to the exclusion
of 92 individuals. For the remaining individuals who passed these quality criteria, we
sequenced the libraries to an average depth of 0.178X (Table S2). We found an overall
negligible level of contamination in our dataset by testing for heterozygosity at polymorphic
sites on the X chromosome in males (Table S3). The dataset resulting from these successive
quality selections encompasses low-coverage genomes for 49 individuals originating from 27
sites, dating from the Bronze Age (N = 2) and the Iron Age periods (N = 47). We compiled
the IA data with 18 low-coverage genomes already published for IA groups from France
(Brunel et al., 2020), leading to a total of 65 low-coverage genomes distributed in 6
geographical areas: Alsace (N = 20), Champagne (N = 5), Normandy (N = 3), North (N =
10), South (N = 18) and Paris Basin (N = 9) (see Figure 1A, STAR Methods and Tables S1
and S3). The IA dataset is unbalanced in terms of the chronological distribution of the
individuals, with 11 individuals dated to the Early Iron Age and 54 dated to the Late Iron Age
period (Figure 1B). This can be partly explained by the funerary treatment and the use of
cremation (see, for example, Dedet, 2004 for southern France). The few human remains
(from southern or north-western France) available for genomic analyses represent deceased
individuals who escaped cremation and were subject to non-ordinary funerary practices. Therefore, the
corpus available for genomic analysis may not be representative of the entire population
living at the time. For instance, for southern France, genetically analysed individuals
correspond to severed heads (see STAR Methods, site of Le Cailar) or to neonates buried in
settlements (see STAR Methods, site of Le Plan de la Tour). The dataset is also unbalanced in
terms of regional representativeness, with the Normandy region providing the lowest number
of genomes owing to poor DNA preservation in the targeted coastal necropolis of
Urville-Nacqueville (Table S1). Finally, although the 65 individuals comprised 33 males and 32 females,
the sex ratio within each region was unbalanced, with notably more females in Alsace and
more males in the South (Table S2). With this frame in mind, we analysed our data together with
published ancient individuals (n = 5225) genotyped on the 1240k panel (Mathieson et al.,
2015) as well as with modern individuals (n = 6461) from a panel of present-day worldwide
populations genotyped on the Affymetrix Human Origins (HO) panel. From the present
study’s dataset, 65 individuals with more than 20,000 SNPs on the 1240k panel were used for
the downstream genome-wide analyses (see STAR Methods and Table S2). We found no
first-degree relatives among IA individuals from present-day France, allowing us to keep the
full dataset for downstream analyses (see STAR Methods, Table S3 and Figure S4).
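
As a rough illustration of the two quality thresholds described above (endogenous DNA above 15% at screening, more than 20,000 SNPs covered on the 1240k panel for downstream analyses), the following Python sketch applies them to a few made-up records and checks that the regional and chronological counts reported in the text each sum to 65. The record fields, the toy sample identifiers and the helper name `passes_filters` are hypothetical and are not taken from the authors' pipeline.

```python
from dataclasses import dataclass

MIN_ENDOGENOUS = 0.15     # screening threshold stated in the text (15%)
MIN_SNPS_1240K = 20_000   # downstream threshold stated in the text


@dataclass
class Library:
    sample_id: str              # hypothetical identifier
    endogenous_fraction: float  # fraction of endogenous reads at screening
    snps_1240k: int             # SNPs covered on the 1240k panel


def passes_filters(lib: Library) -> bool:
    """Return True when a library clears both thresholds."""
    return (lib.endogenous_fraction > MIN_ENDOGENOUS
            and lib.snps_1240k > MIN_SNPS_1240K)


# Made-up records for illustration only (not values from Tables S1 or S2).
libraries = [
    Library("toy_A", 0.32, 185_000),
    Library("toy_B", 0.04, 3_000),
    Library("toy_C", 0.21, 12_500),
]
kept = [lib.sample_id for lib in libraries if passes_filters(lib)]

# Counts as reported in the text.
REGION_COUNTS = {"Alsace": 20, "Champagne": 5, "Normandy": 3,
                 "North": 10, "South": 18, "Paris Basin": 9}
PERIOD_COUNTS = {"Early Iron Age": 11, "Late Iron Age": 54}

print(kept, sum(REGION_COUNTS.values()), sum(PERIOD_COUNTS.values()))
```

In this sketch the two criteria are combined in a single predicate for brevity; in the workflow described above they apply at different stages (library screening, then selection for genome-wide analyses).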

We first explored our data qualitatively using principal component analysis (PCA) by
projecting the ancient genomes onto the genetic variation of an HO set of west Eurasians
(Figures 1C and S1). French IA individuals fall within the genomic variability of the modern-day
French population. IA samples from Spain and Great Britain also fall within modern-day
populations from the same regions, highlighting a certain degree of continuity from the Iron
Age to modern-day populations in western Europe and confirming previous results based on
mitochondrial DNA (Fischer et al., 2018). The PCA also shows a clinal distribution of our IA
French samples according to latitude: the northern samples are closer to the
extant population of Great Britain, and the southern samples are closer to the Spanish population
(Figure S1). These observations are fully consistent with genomic studies conducted on
modern Europeans and highlight the geographically and genomically intermediate position of the
French groups between north-western and south-western European populations (Novembre et
al., 2008).
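
To make the idea of projecting ancient genomes onto modern variation concrete, the sketch below computes principal axes from a matrix of modern genotypes only, then places a sparsely genotyped individual on those axes by least squares over its observed SNPs. This is a simplified NumPy illustration under our own assumptions (toy random genotypes, no SNP normalisation, hypothetical function names `fit_pcs` and `project_with_missing`); it is not the authors' pipeline, and dedicated tools such as smartpca (with its `lsqproject` option) implement the projection more carefully.

```python
import numpy as np


def fit_pcs(modern: np.ndarray, k: int = 2):
    """Compute SNP means and the top-k principal axes from modern genotypes.

    modern: array of shape (n_individuals, n_snps) with 0/1/2 genotypes.
    """
    mean = modern.mean(axis=0)
    centered = modern - mean
    # Rows of vt are orthonormal SNP loadings, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]


def project_with_missing(genotypes: np.ndarray, mean: np.ndarray,
                         loadings: np.ndarray) -> np.ndarray:
    """Least-squares projection using only SNPs that are not NaN."""
    observed = ~np.isnan(genotypes)
    design = loadings[:, observed].T           # shape (n_observed, k)
    target = genotypes[observed] - mean[observed]
    coords, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coords


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    modern = rng.integers(0, 3, size=(30, 200)).astype(float)  # toy data
    mean, loadings = fit_pcs(modern, k=2)

    ancient = modern[0].copy()
    ancient[rng.random(200) < 0.8] = np.nan    # mimic low coverage
    print(project_with_missing(ancient, mean, loadings))
```

Because the axes are defined by the modern individuals alone, the ancient samples do not influence the principal components, which is the reason projection is preferred for low-coverage data.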

To further test the genomic variability of the new IA genomes, we grouped the
individuals into chrono-cultural groups, i.e., according to their region of origin
and, when possible, their dating (Early vs. Late Iron Age): EIA_Alsace (from 800 BC to
450 BC), LIA_Alsace (from 450 BC to 50 BC), IA_Champagne, IA_Normandy, IA_North,
IA_Paris_Basin and IA_South. We then carried out a qpWave analysis iterated over all
individuals in the pool, testing each for significant evidence of heterogeneity relative to the
rest of its chrono-cultural group (see STAR Methods and Figure 2). Individuals were
considered genomic outliers from the chrono-cultural group from which they originate
when the qpWave p value was < 0.05 (Fernandes et al., 2020). This resulted in the
identification of six individuals as outliers: BES1248, PECH3 and PEY163 are outliers
from the IA_South group, CROI11 from the EIA_Alsace group, COL239 from the
LIA_Alsace group and GDF1341 from the IA_Paris_Basin group. The analyses at the
regional level were consequently conducted separately on these individuals and their
chrono-cultural groups. The outlier status of these individuals will be discussed further.
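
The leave-one-out logic of the outlier screen can be sketched as follows. The code only shows the iteration and the p < 0.05 decision rule stated above; the p value itself would come from qpWave, which is represented here by a caller-supplied function. The names `flag_outliers` and `pvalue_fn`, the toy group labels and the toy p values are all hypothetical and do not reproduce the authors' scripts or results.

```python
from typing import Callable, Dict, List

P_THRESHOLD = 0.05  # cutoff stated in the text


def flag_outliers(
    groups: Dict[str, List[str]],
    pvalue_fn: Callable[[str, List[str]], float],
    threshold: float = P_THRESHOLD,
) -> Dict[str, List[str]]:
    """Test each individual against the rest of its own group.

    pvalue_fn(individual, rest) stands in for a qpWave run and must
    return the p value for that individual versus the remaining members.
    """
    outliers: Dict[str, List[str]] = {}
    for group, members in groups.items():
        flagged = []
        for individual in members:
            rest = [m for m in members if m != individual]
            if not rest:
                continue  # a single-member group has nothing to compare with
            if pvalue_fn(individual, rest) < threshold:
                flagged.append(individual)
        outliers[group] = flagged
    return outliers


if __name__ == "__main__":
    # Toy inputs for illustration only.
    toy_groups = {"toy_group_1": ["ind_a", "ind_b", "ind_c"]}
    toy_pvalues = {"ind_a": 0.40, "ind_b": 0.01, "ind_c": 0.20}
    print(flag_outliers(toy_groups, lambda ind, rest: toy_pvalues[ind]))
```

Individuals flagged this way would then be set aside and analysed separately from their group, as described above.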