7 Comments
Ted Albert Torrey's avatar

Great analysis as always! I really appreciate your ability to discern caveats and alternative interpretations.

One striking thing in the Reilly et al. paper was that their findings increased the volume of Denisovan variants introgressed into living populations to almost 70% of the Denisovan genome. It sure speaks to Denisovan genomic closeness to us on a gene-functional level, despite the long time of separation from our common ancestor.

John Hawks's avatar

Thanks, I really did come to my endpoint as a surprise to myself. There is so much noise in these datasets!

I concur on the fraction of Denisovan variants. In the next post I'll look at the other 2026 paper by Hsieh and coworkers, in which they characterize 11 centromeres of archaic origin, 8 Denisovan, in only two whole genomes from Papua. That's close to a fourth of the centromeric variation in these two people!

Kirill Pankratov's avatar

> introgressed into living populations to almost 70% of the Denisovan genome

I didn't quite understand this number from the paper. Maybe somebody can explain it better. Surely it doesn't mean that these Oceanian people are 70% Denisovans. But what does it actually mean?

John Hawks's avatar

I believe this is the percentage of the overall human genome that has some Denisovan haplotype surviving in some people. For Neanderthals this has lately been said to be in the neighborhood of 40%. If you had only sampled one modern human and done this calculation, it would be the percent of introgression for that individual, but as you sample more individuals the "all surviving Denisovan" percentage keeps monotonically increasing, while the "average Denisovan content" percentage is well sampled with a few individuals and stays around the same as you sample more individuals.
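To make the difference between those two percentages concrete, here is a minimal toy sketch in Python. It is not the method of any of the papers: the function name `simulate_coverage`, the bin count, and the per-bin probability `p` are all illustrative assumptions, and it unrealistically treats every genome bin in every person as an independent coin flip, whereas real introgressed haplotypes are shared by descent. It only shows why the union over people keeps growing while the per-person average settles quickly.

```python
import random


def simulate_coverage(n_individuals, n_bins=10_000, p=0.02, seed=0):
    """Toy model: each person carries an archaic haplotype in each genome
    bin independently with probability p (an assumption, not real data).

    Returns two lists indexed by sample size:
      union_curve[k] = fraction of bins archaic in at least one of the
                       first k+1 people ("all surviving" percentage)
      mean_curve[k]  = average archaic fraction per person among the
                       first k+1 people ("average content" percentage)
    """
    rng = random.Random(seed)
    covered = [False] * n_bins      # bins seen as archaic in anyone so far
    per_individual = []             # archaic fraction for each person
    union_curve, mean_curve = [], []

    for _ in range(n_individuals):
        carried = [rng.random() < p for _ in range(n_bins)]
        per_individual.append(sum(carried) / n_bins)
        covered = [c or k for c, k in zip(covered, carried)]
        union_curve.append(sum(covered) / n_bins)
        mean_curve.append(sum(per_individual) / len(per_individual))

    return union_curve, mean_curve


union_curve, mean_curve = simulate_coverage(50)
for k in (1, 5, 10, 25, 50):
    print(f"n={k:2d}  union={union_curve[k - 1]:.3f}  mean={mean_curve[k - 1]:.3f}")
```

With one person the two numbers are identical; after that the union can only go up, while the mean hovers near `p`. How fast the real union saturates depends on how much archaic ancestry is shared among individuals, which this sketch deliberately ignores.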

Ted Albert Torrey's avatar

With respect to the difference in total human genomic introgression by Denisovans versus that by Neanderthals, 2 potential factors come to mind:

Time - the various Denisovan introgressions came later (say, roughly 45 to 30 kya) than did the Neanderthal introgressions (say, roughly 75 to 60 kya). So there could have been up to twice as long for Neanderthal-introgressed variants to be selected against.

Source population variation - multiple and likely diversified Denisovan populations contributed variants for selection to operate on, while the Neanderthal-introgressed variants probably came from a less-diverse population. Until we have actual Neanderthal genomes from SW Eurasia, we won't know what their diversity was. Perhaps some of the "ambiguous" introgressed variants are from those Neanderthals distinct from the NW Eurasian ones for which we have sequence now.

Kirill Pankratov's avatar

Very nice explanation, I now understand the Reilly paper (which is very difficult to read) considerably better!

One interesting thing standing out in the other paper (last figure of this post) is how divergent various Denisovans are from modern humans compared to Neanderthals. Human genomes are >99.9% similar to each other, and the similarity to Vindija Neanderthals seems to peak around 99.6-99.7%, which is consistent with previous estimates.

But with Denisovans there is a significant percentage at much lower similarity - 97% and even 96%. If I understand it correctly, this is a huge divergence, must be of erectus-like scale (though we don't have genomes of the latter) or even bigger, almost all the way to chimps. Should it be surprising or is it expected?

John Hawks's avatar

A challenge interpreting some of these papers, particularly the Antoine-Derouet one but I've seen a similar issue in others, is that they are giving an unusual scale for reporting these percentages.

The 97% or 96% in some of those figures refers not to the site homozygosity (the "humans are 98% chimpanzee" number) but instead to the fraction of SNP alleles that are shared. The "match rate" values in the Reilly et al. study are the fraction of *archaic-derived* SNP alleles that are shared, which is lower yet.

Neither of those is the percent DNA sequence homology. These are sort of artificial measures that don't correspond in a straightforward way to textbook mathematical population genetics. They could in principle be derived mathematically but in practice they are used as estimators based on how they vary in simulation studies, so it's kind of design-your-own measurement.
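As a rough illustration of why the scale matters, here is a toy Python sketch contrasting identity over all sites with the fraction of matching alleles at variable sites only. The sequences, the number of SNP sites, the number of mismatches, and the helper `identity` are made up for the example, and this simple definition is an assumption, not the statistic any of these papers actually computes.

```python
def identity(a, b, sites=None):
    """Fraction of positions where sequences a and b carry the same base.
    If sites is given, only those positions are compared."""
    positions = list(range(len(a))) if sites is None else list(sites)
    return sum(a[i] == b[i] for i in positions) / len(positions)


length = 100_000
human = ["A"] * length                      # hypothetical reference sequence
archaic = list(human)                       # hypothetical archaic sequence

snp_sites = list(range(0, length, 100))     # 1,000 sites assumed to be SNPs
for i in snp_sites[:30]:                    # archaic differs at 30 of them
    archaic[i] = "G"

whole_sequence = identity(human, archaic)            # all 100,000 sites
snp_only = identity(human, archaic, snp_sites)       # SNP sites only

print(f"whole-sequence identity: {whole_sequence:.2%}")   # 99.97%
print(f"shared SNP alleles:      {snp_only:.2%}")         # 97.00%
```

The same 30 differences read as 99.97% or 97% depending on the denominator, so a 96-97% value on a SNP-based scale is not comparable to a whole-sequence figure like the human-chimpanzee number.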