The genetic origin of Daunians

Yes, but that J2b2-L283 was probably brought by the Illyrian component of the Daunians. It's interesting that they give an indirect hint that the J2b2-L283 carriers might have been richer in CHG.


We do not know which Italic people around Foggia the Daunians merged with or absorbed, but their next-door neighbours to the west were always the Samnites until the 3rd Samnite War. The Romans moved in after that war.

IIRC Venosa was also Samnite in origin
 
I think the authors are pretty circumspect about all of this.

"It is not clear whether these connections indicate a movement of people or a sharing of cultural ideas and a conclusive answer to the origin of the Daunians remains elusive. From a parsimony perspective, the genetic results point to an autochthonous origin (e.g. a genetic continuity of Daunians with the population that inhabited the area prior to the examined historical period), here mainly marked by the presence of WHG signature, although we cannot exclude additional influences from Croatia (ancient Illyria), as described by available historical sources and by the material remains (De Juliis 1988; Norman 2016)."

In another part of the paper they say that some contribution from the Balkans is plausible.

"Three of them, which clustered close to modern Italians in the PCA (ORD001, ORD014 and SGR003, Fig. 1C), show higher affinity with the Iron Age Croatian sample (ORD004 followed this pattern too, but with lower f3 values). However, the remaining majority are closest to the Roman Republicans, which can be interpreted as representative of local Iron Age peninsular Italy ancestry, as also indicated by our MDS results."

I think it's interesting, going over the text of this paper once again, to think of the work Jovialis has done with the Balkan samples, as well as the Roman Republican samples.

The archeological/linguistic data is pretty clear on this. The Albanian language is undeniably linked with Messapian. There are countless comparisons between the two.

Messapians were always a mixture of Western Balkan migrants and native Italians. I just wish we had a deep subclade analysis of those J2b2-L283 samples, to see which part of the Western Balkans they came from.

"The Iapygians most likely left the eastern coasts of the Adriatic for Italy from the 11th century BC onwards,[25] merging with pre-existing Italic and Mycenean cultures and providing a decisive cultural and linguistic imprint"

That was written in 2005^.
 
Yes, well, "linguists" were wrong about the Etruscans. As for the Indo-European linguists, put five of them in a room and you'll get five different opinions. That's why I stay away from such discussions. It's like being on a merry-go-round.

The majority of these samples are closest to the Iron Age Central Italians, i.e. the Etruscans and the Latins. That's why the authors come down on the side of saying it was an autochthonous culture. Does that mean there wasn't some genetic impact from across the Adriatic? No, it doesn't.

It's not implausible, although the fact that a minority of the samples show a similarity to Iron Age Croatians is not dispositive in and of itself for me. Northern Italians are very close to ancient Illyrians. Does that mean that Illyrians came and settled in Italy, or does it mean that similar people settled both Northern Italy and "Illyria"? I think the latter is much more "plausible".

As for the Messapian people, let's wait and see their genetic profile. They were different from the Daunians in that they accepted foreign influences much more readily. Perhaps that translated to admixture as well.
 
How were linguists wrong about Etruscans? Etruscans and Romans are genetically linked, but the Etruscan language is clearly non-Indo-European. That's not an uncommon occurrence. The Frankish royalty spoke a Germanic language, while the common people spoke French. The Normans spoke French, while the lower class spoke English. The Bulgars were Turkic, but the population spoke Slavic. Sometimes the royalty imposes its language on the population, sometimes it doesn't. And the Etruscans seem to have initially adopted the language of their rulers, but later switched to Latin like their kin.

Messapian and Albanian are undeniably linked. They show you an entire section of cognates from whatever inscriptions we have. Not to mention all the grammar similarities too.

Messapic lexical item | English translation | Proto-Messapic form | Paleo-Balkan languages | Other Indo-European cognates | Sources
ana | mother | *annā (a nursery word) | Proto-Albanian: *na(n)nā, *amma; Albanian: nënë/nana, ëmë/âmë ('mother') | Hittite: annaš ('mother'); Latin: amma ('mother'); Greek: ámma ('mother, nurse') | [61]
anda | and, as well | - | Proto-Albanian: *edhō/êndō; Albanian: edhe/ênde ('and', 'yet', 'therefore') | Latin: ante ("opposite, in front of"); Hittite: anda; Greek: endha/ΕΝΘΑ ('and', 'as well') | [62]
apa | from | *apo | Proto-Albanian: *apo; Albanian: (për-)apë ('from'); Albanian (Gheg): pi (PI < apa) ('from') or pa (PA < *apa) ('without') | Greek: apó; Sanskrit: ápa | [63]
atabulus | sirocco | - | Proto-Albanian: *abula; Albanian: avull ('steam, vapor') | Proto-Germanic: *nebulaz ('fog') | [64]
aran | field | *h₂r°h₃ā- | Proto-Albanian: *arā; Albanian: arë, ara ('field') | Hittite: arba- ('border, area'); Latvian: ara ('field') | [65]
bàrka | belly | - | Proto-Albanian: *baruka; Albanian: bark ('belly') | - | [66]
Barzidihi | (personal name) | - | Illyrian: Bardyl(l)is; Proto-Albanian: *bardza; Albanian: bardhë/bardhi, Bardha ('white', found also in anthroponyms, e.g., Bardhyl) | - | [a][68]
bennan | (a sort of vehicle) | *benna | - | Gaulish: benna (a kind of 'carriage') | [69]
biles/bilihi | son | - | Proto-Albanian: *bira; Albanian: bir, pl. bilj - bij ('son') | Latin: fīlius ('son') | [70]
biliā/bilina | daughter | *bhu-lyā | Proto-Albanian: *birilā; Albanian: bijë - bija ('daughter'); older dialect bilë - bila ('daughter') | Latin: fīlia ('daughter') | [70]
bréndon; bréntion | stag; stag's head | - | Proto-Albanian: *brina; Albanian: bri, brî ('horn'; 'antler') | Lithuanian: briedis ('elk'); Swedish: brinde ('elk') | The Messapic word is at the origin of the toponym Brendésion (Βρενδέσιον), Brentḗsion (Βρεντήσιον), modern Brindisi [71]
Damatura | Mother Earth (goddess) | *dʰǵʰ(e)m- matura | Proto-Albanian: *dzō; Albanian: dhe ('earth') | Latvian: Zemes Māte ('Mother Earth') | Whether the (pre-)Illyrian form is at the origin of the Greek goddess Demeter or the contrary is unclear. [72][73]
deiva; dīva | god; goddess | - | - | Sanskrit: devá ('heavenly, divine'); Lithuanian Diēvas; Old Norse: Týr | [74]
den | voice | *ghen | Proto-Albanian: *džana; Albanian: zë/zâ, zër/zân ('voice') | - | [75]
hazavaθi | to offer (sacral) | - | - | ha- is a prefix, zav- is the same root as in Greek: χεών, Sanskrit ju-hô-ti and Avestan: zaotar- ('sacrificer') | [76]
hipades | he/she/it offers, dedicates, sets up | *supo dhē-s-t | Proto-Albanian: *skūpa; Albanian: hip ('go up') and dha/dhash ('he gave/I gave') | - | [77]
hipakaθi | offer, set up | - | Albanian: hip ('go up') and ka/kam ('he has/I have') > hip-ka- | - | [78]
klaohi/klohi | hear, listen (invocative) | *kleu-s- | Albanian: kluoj/kluaj/kluhem ('call, hear') | Greek: klythí ('hear'); Sanskrit: śrudhí ('hear'); Slavic: slušati ('hear'); Lithuanian: klausyti ('hear') | [79]
kos | someone | *qwo | Proto-Albanian: *kuša; Albanian: kush ('who') | Tocharian A: Kus ('who') | [80]
ma | not | *meh₁ | Albanian: ma, me, mos | Greek: ; Sanskrit: | [81]
menza | foal | *mendyo | Proto-Albanian: *mandja; Albanian: mëz - maz ('foal'); mend ('to suckle'); Romanian (< Dacian) mînz ('foal') | Gaulish: mandus ('foal') | [82]
ner | man | *ner- | Proto-Albanian: *nera; Albanian: njeri ('man') | Greek: ανηρ ('man'); Sanskrit: nar- ('man') | [83]
penkaheh | five | - | Proto-Albanian: *pentše; Albanian: pesë ('five') | Lithuanian: penki ('five') | [84]
rhīnós | fog, mist, cloud | - | Proto-Albanian: *rina; Albanian: re, rê, rên ('cloud') | - | [85]
tabarā; tabaras | priestess; priest (lit. 'offerer') | *to-bhorā; *to-bhoros | Albanian: të bie/të bar, bjer/bar ('bring', 'carry') | Greek: ϕορός ('bring'); Latin: ferō ('bring') | [86]
teutā; Taotor | community, people; (name of a god) | *Toutor | Illyrian: Teuta(na) ('mistress of the people', 'queen') | Oscan: touto ('community'); Old Irish: túath ('tribe, people'); Lithuanian: tautà ('people'); Gothic þiuda ('folk') | [87]
veinan | his; one's | - | Albanian: vetë ('himself, oneself') | Sanskrit: svayàm ('himself') | [88]
Venas | desire (name of a goddess) | *wenos | - | Latin: Venus; Old Indic: vánas ('desire') | [89]
Zis | sky-god | *dyēs | Illyrian: dei- or -dí ('heaven, god', as a prefix or suffix); Albanian Zojz ('sky-god') | Hittite: šīuš ('god'); Sanskrit: Dyáuṣ; Greek: Zeus; Latin: Jove ('sky-god') | [90]
 
Just stop. Messapians weren't Daunians, for one thing. We have no inscriptions for the Daunian language.

For another, I've occasionally read some posts in the Albanian language thread, where everybody has their favorite linguist. It doesn't inspire confidence.

As to the Etruscan language, there were "linguists" here and at anthrogenica and theapricity and on and on who argued vociferously that the Etruscan language couldn't be a European language; it had to come from somewhere in the east. None of them, including people who post here, wanted to know that inscriptions found further east were probably left by Etruscan traders. Btw, I have yet to see ANYONE man up and say "I WAS WRONG." Don't presume to tell me the history of this or what the arguments were; I've been at this for 12 years. Btw, you're still wrong about the Etruscans. The steppe admixed people who brought R1b lineages adopted the language of the native Italian Chalcolithic people already here, unlike the "Latins"; unless, that is, someone can explain how steppe admixed people spoke a non-Indo-European language. Think Basques.

Now go play this game on your Albanian threads. This is a paper about THE GENETIC ORIGIN of the Daunians, and we're talking about genetics. If you have something interesting to add in that regard which hasn't yet been addressed, by all means post it.
 
Granted that the Daunians were different from the Messapians, but since we are also talking about the Messapians here, I consider the recent discovery (08.01.2022) in the Lower Salento to be relevant.

New light on the Messapians: research in the necropolis of Alezio (Lower Salento). An excavation campaign has unearthed tombs and the ceremonial square of the site in Puglia, including the grave of a child and olives left as an offering to propitiate the afterlife.

The Messapian necropolis of Monte d'Elia, in Alezio, offers new evidence of an ancient civilization thanks to an initial research campaign conducted by the Archaeology Laboratory of the University of Salento. The findings include a ceremonial square, numerous tombs (among them that of a child), an ossuary, and the remains of olives. With the excavation concluded in recent days, work now continues with the analysis of the finds.

Over a few weeks of research, fundamental new data emerged for the understanding of Messapian civilization, first of all through the topographical reconstruction of the Monte d'Elia area and the recognition of the funerary rituals practiced there in ancient times. Of particular importance is the identification of a large ceremonial square around which, within enclosures built of large boulders, groups of tombs belonging to families or clans were concentrated. This was the arrival point of the processions that accompanied the deceased on the last journey from the house to the place of burial.

More detailed elements come from the excavation of burials that were not intercepted during the investigations of the 1980s by the Archaeological Superintendence of Puglia. A pit was identified, with a floor of limestone blocks and a frame in carparo, inside which the remains of at least 12 individuals had accumulated: in short, an ossuary linked to the functioning of the necropolis and to the practice of reusing funerary structures for multiple depositions. Some grave goods were also found: a lamp, a plate, a trozzella (a typical Messapian vase), two loom weights, and a javelin point.
 
Mind being a little more respectful? All I said is that Etruscan is not an Indo-European language. That's an established fact. I didn't say it's not a European language, because not all European languages are Indo-European in nature. Look at Basques.

As for the Daunians, they are all classified as Iapygians along with Messapians, and they spoke the Messapian language. We have inscriptions from their civilizations. I don't understand the controversy here. Pardon me for being a "biased Albanian" though.

The Iapygians were a "relatively homogeneous linguistic community" speaking a non-Italic, Indo-European language, commonly called 'Messapic'. The language, written in variants of the Greek alphabet, is attested from the mid-6th to the late-2nd century BC.[6]
 
On the term Iapygians:
The area includes all of modern Apulia and part of northeastern Basilicata (mainly the Melfese). This area corresponds roughly to ancient Iapygia (or Apulia), which, according to Greek and Roman literary tradition as well as the archaeological record, was occupied by three Iapygian peoples: the Messapians to the south (roughly the Salentine Peninsula), the Peucetians in central Apulia, and the Daunians to the north, including the Melfese, now in Basilicata.

The Iapodes/Japodes are the Illyrian tribes of what is now Croatia that the Daunians are linked to.
 
Mind not posting under yet another sock?

You asked how linguists were wrong about the Etruscan language; I told you. You seemed confused about who adopted whose language, so I tried to clear up the confusion.

Maybe you think it's disrespectful to say that linguists always disagree with one another, and consensus is hard to come by, or that the thread on the Albanian language is full of people spitting out their favorite linguist's opinion and how the linguist of someone else is completely wrong? I don't see how you can deny that, and I don't think it's disrespectful.

Now I "will" be disrespectful, but not to you, unless you happen to be a linguist. I don't think linguistics is a "science" at all, so someone's linguistic theory is never going to be a determining factor for me. It's like psychology and sociology, which are called social sciences but are also not sciences, and whose conclusions, quoted by everyone and his mother, can rarely be replicated. I'm entitled to my opinion about all of this, whether or not it bothers linguists and social scientists.

My point is that there is nothing in this paper to suggest that the Daunians are transplanted people from the Balkans. How could they be, when the majority of them are closest to CENTRAL ITALIAN Iron Age people? Could there have been some influx from the Balkans? It's certainly possible, even plausible. Somehow that's not good enough for you?

Instead of taking offense at the littlest thing, maybe people should just get on with the actual work of analyzing genetic data as logically and reasonably and with as little, yes, bias, as possible.
 
Sock? It's amusing that you'd think I'd waste my time with making sock accounts on a random forum. I haven't posted here since like 2013 or something.

There is a difference when it comes to analyzing unattested languages like "Illyrian". That's the source of the argument in that other thread: there are no first-hand accounts to analyze extensively.

You're denying basic facts of attested languages. That's a whole other debate. Messapian and Albanian are both attested languages (as is Etruscan), and I don't know of any linguist who denies a link between the two. If there is one, please show me. Messapian is studied through Albanian, because there are very few cognates outside of it.
 
might be of use

http://www.asciatopo.altervista.org/illyria.html

The Roman province of Illyricum was bounded by the Raša river (toward Venetia), the river Drin (toward Macedonia), and the Adriatic sea. Toward Pannonia and Moesia in the interior the boundaries are less clear but they should have followed the mountain ranges of Velebit, Bosnia, and Montenegro.

From the river Raša to the river Zrmanja and then to the Krka, the land was inhabited by Iapydes and Liburni, from the Krka to the river Neretva, it was called Dalmatia, and from the Neretva to the Drin, Illyria proper (Barbara or Romana).



Illyria proper is Pliny's term for the southern Illyrians, who were not Celticized.

Illyricum in Pliny's time ran from southern Slovenia to southern Montenegro.

The Italian terms Barbara and Romana need to be looked at.
 
Here I disagree: linguistics, despite not being a "hard" science, is a science when practised by people who stick to facts and distinguish between established certainties and hypotheses: serious linguists, in other words. What is true is that we see a lot of new theories on languages popping up today, which I modestly find to be science fiction. That isn't to say we can put all linguists in the same bag...
 
Concerning the ties between some peoples of ancient Italy and the northwestern Balkans, we ought to be careful, because it seems Italy received a lot of people from this region at least between the Chalcolithic and the Iron Age, and not only across the sea but also through the northeastern lands.
It's of little value, being a risky hypothesis, but I wonder whether the principal factor of differentiation between Celts and Italics (which I place between the Bronze Age and the early Iron Age), geographic distance aside, was not this proximity of the Italics to the N-W Balkans and the exchanges it made possible.
 
Most of the migration into Italy north of Rome, at least, came, imo, through the flat lands in the northeast which skirt the Alps. From France into Italy and vice versa in, say, Liguria, it's easier to travel by boat. Yes, there are mountain passes from the central Alps into Italy; I was born and lived along one of the major routes, the Via Francigena. It's much easier to skirt them.

That doesn't mean that I think the gene flow necessarily came from the Balkans around Croatia, for example. I think that it is just as plausible that the similarities between Balkan people and Italians are due in some part to the fact that the same groups migrated to both places.

That's as true for the steppe admixed groups in central Europe as it is for the eastern admixed groups which influenced Italy on a cline from south to north.

I see all of this in my own results using Jovialis' calculator. My results are a combination of Italian Chalcolithic, La Tene, Bulgarian Bronze Age, and Minoan.
 
My aim was to say that the possible (relatively important) ties could be manifold. But Croatia (to use today's political name) could have been an important point of departure for Italy at different times, even if of course things are a bit more complicated.
 
I tested ORD001 … the paper says that they aligned the fastq files with bwa mem … disabled seeding, … though I get the same results with the default settings, and I get the same combined raw-data output as the .bam on ENA.

I decided to run the fastq of ORD001 with bwa aln (short reads) with seeding disabled, the standard for ancient samples, … though it is much slower than bwa mem.

--- as reference:

... linux based:

ORD001 mapping as per the paper (bwa mem modified settings for Ancient samples):
Code:
bwa mem -k 19 -r 2.5 ref.fa ORD001.fastq.gz | samtools view -bS - > ORD001_bwa_mem_K19-2_5.bam

samtools sort ORD001_bwa_mem_K19-2_5.bam -o ORD001_bwa_mem_K19-2_5.sort.bam

samtools index ORD001_bwa_mem_K19-2_5.sort.bam

# coordinates
ORD001_bwa_mem_K19-2-5_Dod_K12b,2.13,0,3.17,1.09,29.50,31.57,0,0.75,9.39,1.76,20.64,0

ORD001 mapping with default setting:
Code:
bwa mem ref.fa ORD001.fastq.gz | samtools view -bS - > ORD001_bwa_mem_default.bam

samtools sort ORD001_bwa_mem_default.bam -o ORD001_bwa_mem_default.sort.bam

samtools index ORD001_bwa_mem_default.sort.bam

# coordinates
ORD001_bwa_mem_default_Dod_K12b,2.13,0,3.17,1.09,29.50,31.57,0,0.75,9.39,1.76,20.64,0

ENA .bam coordinates:
Code:
ORD001_ENA_bam_Dod_K12b,2.13,0,3.17,1.09,29.50,31.57,0,0.75,9.39,1.76,20.64,0

ORD001 mapping with bwa aln with seeding disabled, ... modified settings for ancient samples:

Code:
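# options go before the positional arguments; -l 1024 makes the seed longer than any read, which disables seeding
bwa aln -n 0.01 -o 2 -l 1024 ref.fa ORD001.fastq.gz > ORD001.sai

bwa samse -f ORD001.sam ref.fa ORD001.sai ORD001.fastq.gz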

samtools view -Sb ORD001.sam > ORD001.bam

samtools sort ORD001.bam -o ORD001.sort.bam

samtools index ORD001.sort.bam

 ####### to Add Read Group tags and index bam files:

java -jar build/libs/picard.jar AddOrReplaceReadGroups INPUT=ORD001.sort.bam OUTPUT=ORD001.RG.bam RGID=rg_id RGLB=lib_id RGPL=platform RGPU=plat_unit RGSM=sam_id VALIDATION_STRINGENCY=LENIENT

samtools index ORD001.RG.bam

####### to Mark and remove duplicates:

java -jar build/libs/picard.jar MarkDuplicates I=ORD001.RG.bam O=ORD001.DR.bam M=output_metrics.txt REMOVE_DUPLICATES=True VALIDATION_STRINGENCY=LENIENT &> logFile.log

samtools index ORD001.DR.bam

####### Local reads realignment
java -jar GenomeAnalysisTK.jar -T RealignerTargetCreator -R ref.fa -I ORD001.DR.bam -o target.intervals

java -jar GenomeAnalysisTK.jar -T IndelRealigner -R ref.fa -I ORD001.DR.bam -targetIntervals target.intervals -o ORD001.final.bam --filter_bases_not_stored

samtools sort ORD001.final.bam -o ORD001.final.sort.bam

samtools index ORD001.final.sort.bam

#######  flagstat file (some info about the .bam)
samtools flagstat ORD001.final.sort.bam > ORD001.final.sort.txt

coordinates:
ORD001_bwa-aln_Dod_K12b,2.45,0,2.89,1.61,30.29,31.24,0,0.86,8.77,1.24,20.63,0
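
To put a number on how far the bwa aln row moves away from the bwa mem row, the two coordinate lines can be compared directly. This is only a minimal sketch: it assumes plain Euclidean distance over the 12 K12b components, and the function name k12b_dist is made up for illustration (it is not Vahaduo's own code, and Vahaduo may scale its distances differently).

```shell
#!/bin/sh
# k12b_dist ROW_A ROW_B (hypothetical helper)
# Euclidean distance between two comma-separated coordinate rows.
# Field 1 is the sample label and is ignored; fields 2..N are percentages.
k12b_dist() {
    printf '%s\n%s\n' "$1" "$2" | awk -F',' '
        NR == 1 { n = NF; for (i = 2; i <= NF; i++) a[i] = $i }
        NR == 2 {
            if (NF != n) { print "field count mismatch" > "/dev/stderr"; exit 1 }
            for (i = 2; i <= NF; i++) { d = a[i] - $i; s += d * d }
            printf "%.4f\n", sqrt(s)
        }'
}

# The two rows posted above (bwa mem default vs. bwa aln).
mem='ORD001_bwa_mem_default_Dod_K12b,2.13,0,3.17,1.09,29.50,31.57,0,0.75,9.39,1.76,20.64,0'
aln='ORD001_bwa-aln_Dod_K12b,2.45,0,2.89,1.61,30.29,31.24,0,0.86,8.77,1.24,20.63,0'
k12b_dist "$mem" "$aln"
```

For these two rows the distance comes out at about 1.36 percentage points, so the aligner choice shifts the K12b profile only slightly.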

more info:
https://paleogenomics-course.readthedocs.io/en/latest/4_ReadsMapping_v2.html
 
thanks


I ran the ENA results

Target: ORD001_ENA_bam_Dod_K12b
Distance: 0.8054% / 0.80540055 | ADC: 0.25x RC
33.6 C7-Villa_Magna_MA
16.6 I3592
13.5 I1311
11.6 VEN022
8.8 NE_Iberia_c6CE_PL
8.4 ETR007
7.5 NE_Iberia_c8-12CE


I3592: 2457–2203 calBCE (3844±33 BP, BRAMS-1218), Alburg-Lerchenhaid, Spedition Häring, Stkr. Straubing, Bavaria, Germany

I cannot find I1311
 
It's in the Dodecad K12b SOURCE list (Vahaduo), by Jovialis, I think:
Code:
I1311:Olalde_2019,0,1.99,0,0,29.92,32.2,0,2.84,17.19,0.03,15.82,0
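
If the SOURCE list is saved locally as a text file, a sample can be looked up by its ID instead of scrolling. A rough sketch, assuming one comma-separated row per sample with the label in the first field, optionally followed by ":study"; the function name find_sample and the temp file are made up for the example.

```shell
#!/bin/sh
# find_sample FILE ID (hypothetical helper)
# Print every row whose label (field 1, up to an optional ":") equals ID.
find_sample() {
    awk -F',' -v id="$2" '{ split($1, p, ":"); if (p[1] == id) print }' "$1"
}

# Demo with the two rows quoted in this thread, written to a temp file.
src=$(mktemp)
cat > "$src" <<'EOF'
I1311:Olalde_2019,0,1.99,0,0,29.92,32.2,0,2.84,17.19,0.03,15.82,0
ORD001_ENA_bam_Dod_K12b,2.13,0,3.17,1.09,29.50,31.57,0,0.75,9.39,1.76,20.64,0
EOF
find_sample "$src" I1311
rm -f "$src"
```

Matching on the exact label rather than a substring keeps I1311 from also pulling up IDs that merely start with the same characters.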
 
seems odd ...............

Target: I1311:Olalde_2019
Distance: 0.0141% / 0.01414175 | ADC: 0.25x RC
99.7 I1311
0.1 Beaker_Central_Europe
0.1 I1982
0.1 NE_Iberia_Hel_(Empúries2)


Closest to it is LBK_EN I5755, Central Europe (Olalde paper, 2018).
 
Guys, keep in mind that apart from ORD009, the rest of the samples have ~5% coverage/quality.
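
For anyone who wants to check coverage on their own BAM, here is a rough sketch. It assumes the per-position output of samtools depth -a (chromosome, position, depth, tab-separated) and reports mean depth and the share of positions with at least one read; depth_summary is a made-up helper name, and which of these two numbers the "~5%" above refers to is not something I can tell from the post.

```shell
#!/bin/sh
# depth_summary (hypothetical helper)
# Reads "chrom<TAB>pos<TAB>depth" lines on stdin and prints the mean depth
# and the percentage of listed positions covered by at least one read.
depth_summary() {
    awk -F'\t' '
        { sum += $3; if ($3 > 0) cov++ }
        END {
            if (NR == 0) { print "no input" > "/dev/stderr"; exit 1 }
            printf "mean_depth=%.3f breadth_pct=%.2f\n", sum / NR, 100 * cov / NR
        }'
}

# Real use (needs samtools and a sorted BAM such as the one built above):
#   samtools depth -a ORD001.final.sort.bam | depth_summary

# Tiny synthetic input so the function can be tried without a BAM.
printf 'chr1\t1\t0\nchr1\t2\t2\nchr1\t3\t0\nchr1\t4\t2\n' | depth_summary
```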
 
