john hawks weblog

paleoanthropology, genetics and evolution

biogeography

  • Quote: Osborn on biogeography

    Thu, 2009-08-06 20:01 -- John Hawks

    Henry Fairfield Osborn, "Hunting the ancestral elephant in the Fayûm desert":

    One of the most fascinating problems of paleontology, therefore, is to ascertain the birthplace of each of the great animal groups -- the fertile or arid nursery wherein they first took on their peculiar and characteristic form, and, like the races of man, felt the power and strength to go forth and invade the lands of other races.

    References:

    Osborn HF. 1907. Hunting the ancestral elephant in the Fayûm desert: Discoveries of the recent African expedition of the American Museum of Natural History. The Century Magazine 74(6):815-835.

  • Bigfoot biogeography

    Fri, 2009-07-03 13:52 -- John Hawks

    So a couple of weeks ago, the Journal of Biogeography published a paper arguing that humans and orangutans are sister taxa.

    This week, the journal has published a paper on the biogeography of Sasquatch.

    Yes, that's correct. The same journal that published the updated "red ape" paper has now published a Bigfoot paper.

    OK, Journal of Biogeography, I'll be your straight man. I mean, I bow to no one in my ability to snark on human evolution, but this is like some sort of karmic singularity.

    Besides, this Bigfoot paper is really good. The authors intend it as a "tongue-in-cheek" example, and it works on that basis, as an illustration of the "garbage in, garbage out" principle. Biogeographers have been using increasingly sophisticated computer algorithms to predict the "ecological niche" of species. The algorithms take information about sightings or recorded incidences of a species, find commonalities among those sightings against maps of other ecological data (rainfall, forest type, presence of other species, etc.), and spit out an ecogeographic distribution for the target species.
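
    To make the "sightings in, map out" idea concrete, here is a minimal sketch of the simplest kind of niche model, a climate envelope that marks a map cell as suitable when every variable falls inside the range seen at the sightings. This is not the method Lozier and colleagues used; the variable names, the sighting records and the two map cells are all made up for illustration.

```python
from typing import Dict, List, Tuple

Cell = Dict[str, float]
Envelope = Dict[str, Tuple[float, float]]


def fit_envelope(sightings: List[Cell]) -> Envelope:
    """Record the min and max of each environmental variable across sightings."""
    variables = sightings[0].keys()
    return {
        v: (min(c[v] for c in sightings), max(c[v] for c in sightings))
        for v in variables
    }


def is_suitable(envelope: Envelope, cell: Cell) -> bool:
    """A cell is 'suitable' if every variable lies inside the fitted range."""
    return all(lo <= cell[v] <= hi for v, (lo, hi) in envelope.items())


# Hypothetical sighting records (invented values, not real data).
sightings = [
    {"rainfall_mm": 1800.0, "mean_temp_c": 9.0},
    {"rainfall_mm": 2400.0, "mean_temp_c": 7.5},
    {"rainfall_mm": 1500.0, "mean_temp_c": 11.0},
]

# Hypothetical map cells to classify.
grid = {
    "coastal_forest": {"rainfall_mm": 2000.0, "mean_temp_c": 8.0},
    "desert_basin": {"rainfall_mm": 200.0, "mean_temp_c": 18.0},
}

envelope = fit_envelope(sightings)
predicted = {name: is_suitable(envelope, cell) for name, cell in grid.items()}
print(predicted)
```

    Nothing in the code asks whether the sightings are real. Feed it reports of any "species" and it returns a tidy map.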

    Of course, the algorithm will come to some result, regardless of how well each piece of data in the system is known. And the algorithms are complex enough that the creeping effects of errors may be hard to evaluate:

    While the value of publicly available sample locality data is not questioned, the consequent introduction of errors in the accuracy of specimen identity and georeferencing could be problematic for developing ENMs [that is, "ecological niche models"] from public data sources (Graham et al., 2004; Soberón & Peterson, 2004). Although georeferencing inaccuracies can be identified in databases from qualitative or quantitative accuracy thresholds (e.g. http://manisnet.org/GeorefGuide.html), poor taxonomy and/or misidentification may be less detectable. This issue may be particularly problematic, for example, with cryptic species or subspecies that are morphologically similar but may have very distinct ecological requirements and geographic distributions, or for those data sources that contain indirect observations rather than references only to physical specimens.

    What better way to illustrate the problem than by applying the analytical methods straight to a "species" whose existence is, shall we say, "questionable"?

    The authors additionally raise one particular element of confusion that may enter into ecological niche modeling -- sightings of similar species may confound each other. They consider the example of black bears -- another large North American mammal that occurs in many of the areas where Bigfoot sightings are reported. They run the same analysis on black bears as they did on Sasquatch, finding a large overlap in distributions. But interestingly, they have fewer observations for bears than they do for Bigfoot -- leading them to an underestimate of the actual range of black bears. They reflect on the possible interpretations of the overlap:

    Thus, the two 'species' do not demonstrate significant niche differentiation with respect to the selected bioclimatic variables. Although it is possible that Sasquatch and U. americanus share such remarkably similar bioclimatic requirements, we nonetheless suspect that many Bigfoot sightings are, in fact, of black bears.

    From my perspective, this paper is important for two reasons, neither of them really having to do with large North American primates. First is a social and legal angle. Increasingly, the habitat distributions of endangered or threatened species are evaluated on the basis of similar computer models of ecological niche. In particular, the changes in species distributions under scenarios of climate change are modeled in this way. This computer modeling has the appearance of objectivity, and it certainly allows reams of data to be statistically simplified into human-readable maps. That makes the results of such analyses really valuable to cases where political and legal units need to make decisions about how to comply with threatened species regulations.

    But if the data going into the model aren't correct, then the predictions of the models won't reflect reality. The question is, how much will they be wrong?

    In this case, the limited black bear dataset leads to a substantial underestimation of the black bear habitat niche. And possible confusion between Sasquatch and black bear sightings raises the possibility that any rare species will be significantly overrepresented in ecological niche modeling when such confusions are possible. Neither of these outcomes tells us how bad such errors are likely to be, but they point to real weaknesses in the maps generated by these computer algorithms.
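
    Both effects can be seen in a toy calculation on a single variable. This is only a sketch under invented numbers: the rainfall values are hypothetical, and the "niche" is just the range of rainfall at recorded localities, which is far cruder than a real ecological niche model.

```python
from typing import List, Tuple


def rainfall_range(records: List[float]) -> Tuple[float, float]:
    """Inferred tolerance: min and max rainfall across recorded localities."""
    return min(records), max(records)


def width(rng: Tuple[float, float]) -> float:
    """Breadth of an inferred range."""
    return rng[1] - rng[0]


# Hypothetical annual rainfall (mm) at localities; all values are invented.
all_bear_localities = [400.0, 700.0, 1000.0, 1500.0, 2000.0, 2600.0]
recorded_bear_localities = [1000.0, 1500.0]   # only a few made it into the database
rare_species_localities = [1400.0, 1600.0]    # genuine records of a rare species
bears_logged_as_rare = [400.0, 2600.0]        # misidentified bear sightings

full_bear = rainfall_range(all_bear_localities)
sparse_bear = rainfall_range(recorded_bear_localities)
rare_clean = rainfall_range(rare_species_localities)
rare_contaminated = rainfall_range(rare_species_localities + bears_logged_as_rare)

print("bear, all localities:     ", full_bear)
print("bear, sparse records:     ", sparse_bear)
print("rare species, clean:      ", rare_clean)
print("rare species, with errors:", rare_contaminated)
```

    The sparse bear records give a narrower range than the full set, and two misidentified records are enough to stretch the rare species to the full breadth of the bear's range.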

    I expect that smart lawyers will be finding ways to use this Bigfoot paper a lot.

    Second, I think the paper is important at this moment in paleoanthropology. Late last year, I wrote about a paper that evaluated ecological niche models for Neandertals and people who made the Aurignacian ("'Competitive exclusion' and the extinction of Neandertals: should we believe it?"). The paper, by William Banks and colleagues, had used the observed distribution of archaeological sites between certain radiocarbon date intervals to estimate an ecological niche model for the two hominid groups.

    I think that paper was very good work, but it obviously is subject to uncertainties in the initial observations -- just as the Sasquatch data are. The archaeological observations add further uncertainties about dating and about the association between biology and archaeology. And the Sasquatch data set is much, much larger in terms of sighting numbers, meaning that the archaeological cases ought to have more error when it comes to evaluating niche flexibility.

    For the purposes of the Banks et al. (2008) paper, I think their conclusions are pretty secure. They tested the hypothesis that the reduction of Neandertal occurrences over time could be explained by climate change; they were able to reject that hypothesis by showing the ecological niche breadth of earlier Neandertals included paleoenvironments that were very widespread across Europe throughout the period when they were declining. It's a nice demonstration.

    But the Sasquatch example shows that we have to evaluate the ability of ecological niche modeling to test each hypothesis, on the basis of the data that are likely to be available. Can the model show that the spread of Aurignacian people caused the Neandertals to decline? That depends on our confidence about the dating and biological associations at early Aurignacian sites. The computer algorithm gives a structured way to reduce information that is already manifest in the data.

    References:

    Lozier JD, Aniello P, Hickerson MJ. 2009. Predicting the distribution of Sasquatch in western North America: anything goes with ecological niche modelling. J Biogeogr (early online) doi:10.1111/j.1365-2699.2009.02152.x

    Banks WE, d’Errico F, Peterson AT, Kageyama M, Sima A, et al. 2008. Neanderthal extinction by competitive exclusion. PLoS ONE 3(12):e3972. doi:10.1371/journal.pone.0003972

  • Overstating the obvious

    Tue, 2009-03-24 01:41 -- John Hawks

    I'm reading this interesting paper by Joseph Pickrell and colleagues, titled "Signals of recent positive selection in a worldwide sample of human populations". The paper recounts the results of a selection scan in the Human Genome Diversity Panel (HGDP), which was reported in two publications last year. This is an interesting sample because it includes individuals from 53 population samples around the world.

    I was waiting to present any observations about selection from the HGDP set until Pritchard's lab had published on them, since the initial publications had mentioned that this analysis was forthcoming. Now that it's appeared, I'll be pointing to a lot of these data in upcoming posts.

    So I was reading with great interest. Then I found this statement:

    Reports of ubiquitous strong (s = 1-5%) positive selection in the human genome (Hawks et al. 2007) may be considerably overstated (8).

    I'm a little concerned that someone reading that might think that Pickrell and colleagues had actually tested our hypothesis about the number of recent strongly selected alleles. I'm also uncertain about the word, "ubiquitous", which means "everywhere." I mean, does that really sound like the kind of word I would use? It's just begging for trouble. It's like saying there's "ubiquitous" evidence of Neandertal contribution to the later European gene pool. Even if I thought it was true, I wouldn't put it in a paper!

    We reported that roughly seven percent of genes appeared to be selected. Pickrell and colleagues list a rather large number of candidate loci for selection, and don't give any estimate or test of the number genome-wide. I think one might be able to count the regions listed in the data supplement for an estimate of what they thought was important enough to list, but I can't get the supplement yet. Since these candidate loci require 16 supplementary figures to list, maybe there are a lot of them. They do list a subset of more than 110 in the paper itself.

    So what's the basis for saying we overstated anything? They suggest one reason for caution about the interpretation of candidate loci for selection:

    We find that putatively selected haplotypes tend to be shared among geographically close populations. In principle, this could be due to issues of statistical power: broad geographical groupings share a demographic history and thus have similar power profiles. However, strongly selected loci are expected to show geographical patterns largely independent of demography—depending on the relevant selection pressures, they can be highly geographically restricted despite moderate levels of migration, or spread rapidly throughout a species even in the presence of little migration (Nagylaki 1975; Morjan and Rieseberg 2004) (8).

    But wait a minute! If a gene were selected strongly and still polymorphic in human populations, it shouldn't be very old. So it can't have spread rapidly throughout the human species even in the presence of little migration. There hasn't been any time for this kind of spread.

    To give a little mathematical perspective, one common way of modeling the dispersal of an advantageous gene is the Fisher diffusion wave model. In a Fisher wave, the gene grows logistically at any single point in space, and the allele frequencies form a wave of constant shape that travels through space at a constant velocity. That velocity in a population uniform across 2-dimensional space is σ times the square root of s, where s is the selection coefficient and σ the root mean square dispersal distance -- basically, the average distance a person moves between his birth and the birth of his children.

    If we want to know about dispersal of selected genes in early agriculturalists, we will need to know how far they move -- that's generally less than 10 km on average. So a gene selected strongly with a 5 percent advantage should move around 2.2 km/generation. Over the 400 generations since the beginning of agriculture, we'd expect a new allele to have dispersed across an area with a radius of less than 1000 km.
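
    The arithmetic is easy to check. The sketch below uses the velocity formula as stated in this post (σ times the square root of s) with the post's own numbers: σ = 10 km, s = 0.05 and 400 generations. The function names are only illustrative. Because 10 km is an upper limit on dispersal, the radius it gives is an upper limit too.

```python
import math


def fisher_wave_speed(sigma_km: float, s: float) -> float:
    """Wave speed in km per generation, using v = sigma * sqrt(s) as in the post."""
    return sigma_km * math.sqrt(s)


def spread_radius(sigma_km: float, s: float, generations: int) -> float:
    """Distance the wave front travels in the given number of generations."""
    return fisher_wave_speed(sigma_km, s) * generations


speed = fisher_wave_speed(sigma_km=10.0, s=0.05)
radius = spread_radius(sigma_km=10.0, s=0.05, generations=400)

print(f"{speed:.2f} km/generation")  # 2.24 km/generation
print(f"{radius:.0f} km")            # 894 km
```

    Halving σ to 5 km halves both numbers, so a more typical dispersal distance makes the expected spread smaller still.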

    So in other words, it's just implausible that a selected allele would have a geographic distribution very different from one produced by drift, at least under the Fisher wave model. But obviously, some alleles have gone a lot farther than 1000 km in the last 10,000 years. Humans don't disperse strictly according to a Gaussian distribution, as assumed by the Fisher model; they sometimes disperse long distances. This can have a large impact on the spread of an advantageous allele. But it is an irregular phenomenon -- a stochastic event.

    Let's consider the results a bit further. Here's a passage from page 1:

    We find extensive sharing of putative selection signals between genetically similar populations, and limited sharing between genetically distant ones. In particular, Europe, the Middle East, and Central Asia show strikingly similar patterns of putative selection signals.

    Which is exactly what we would predict from the history of these populations. Most signals of selection in Europe are Neolithic in date. The Neolithic was not only a time of massive population growth, but also the time of greatest mismatch between the human population and its novel agricultural environment. The dispersal of Neolithic lifeways from West Asia into Europe, and the recurrent incursions of Central Asian languages westward across the steppe into Europe and southward into the Indian subcontinent, are the major features of the last 10,000 years of history in those regions. Don't we expect them to share a lot of selection? And if it took the massive migrations and interactions in those regions to generate this shared pattern of selection, shouldn't we expect other regions of the world, which lacked such extensive long-distance movements, to share fewer signals?

    In this case, the critical information for evaluating the evidence is historical and archaeological. We can't just say that the candidate loci for selection have a similar geographic distribution to those that aren't selected. We need to evaluate the likelihood that they would have some other distribution. That likelihood is very low for most instances of selection, but may be high for a fraction of cases, or for some regions where long-distance dispersal was a more important aspect of population history.

    So if we have a locus that is inconsistent with drift on the basis of linkage, we can reject drift. What if the geographic distribution is still consistent with drift? Should we doubt the linkage analysis? I don't see why -- basic biogeography says that most recently selected genes should have geographic distributions similar to those produced by drift.

    References:

    Pickrell JK, Coop G, Novembre J, Kudaravalli S, Li JZ, Absher D, Srinivasan BS, Barsh GS, Myers RM, Feldman MW, Pritchard JK. 2009. Signals of recent positive selection in a worldwide sample of human populations. Genome Res (early online) doi:10.1101/gr.087577.108
