john hawks weblog

paleoanthropology, genetics and evolution

selection

  • Mailbag: Exaptation and standing variation

    Tue, 2012-01-24 12:11 -- John Hawks
    This may sound like a dumb question, but I am trying to understand the difference between “selection on standing variation” and the concept of “exaptation”. They seem to mean the same thing? Am I missing something?

    Thanks for any help you can provide.

    No problem. Exaptation almost always refers to a phenotypic trait, and specifically the case where it used to do one thing, and has changed because of natural selection for some other function.

    Selection on standing variation is usually just a contrast with selection on a new mutation. A new mutation that comes under positive selection will rapidly increase in frequency and thereby generate lots of signs we can recognize, for example genetic hitchhiking.

    Selection on an old mutation that has already existed in the population for a long time (and is therefore "standing" variation) also can cause the mutation to increase in frequency, but this will not necessarily cause hitchhiking or other easily recognizable patterns, because copies of the mutation that have existed in the population for a long time probably are not all linked to the same set of mutations at other loci.

    Practical example: Lactase persistence. We know that lactase persistence in Europeans is selection on a new mutation. If people carrying the key lactase persistence mutation did not all share a near-identical region of chromosome 2 around that mutation, we would suspect it was selection on standing variation (when we learned about lactase persistence more than 10 years ago, this was not resolved yet and many geneticists thought it would turn out to be standing variation). Lactase persistence is *arguably* an exaptation, because it uses the mechanism that evolved for one purpose (babies digesting mothers' milk) and changed it under selection for another purpose (adults digesting cow milk).

  • Mailbag: Neandertal derived SNP alleles

    Tue, 2011-12-13 09:48 -- John Hawks

    Re: Neandertal introgression, 1000 Genomes style:

    Long-time reader of your blog, non-paleo/anthro/genetics person, here. But please read on:

    Just a couple of brief questions.

    (i) It seems that it would make sense to look at pairwise comparisons (of shared derived Neanderthal SNP alleles) both within a population (e.g., Asians, or CEU) and between them, and build a histogram of how often they overlap.

    (ii) Then one could remove from the data set all such African shared SNPs - assuming that most of them are incomplete lineage sorting but that Africa had the initial superset of alleles before ooA (I know some are likely West Asian or European admixture, reducing the data set slightly more than necessary), and repeat (i) and similar diagnostics. Is the typical unmodified genome chunk length around such sites much longer than in (i) - can one date this? Can one now better quantify the actual admixture percentage outside of Africa?

    Wouldn't such a procedure give more insight about how Neanderthal introgression is distributed, when it occurred, and perhaps where it occurred?

    I am sure you are already working on similar ideas - just wanted to know if you agree that these may be low-hanging fruit to pursue.

    Thanks!

    Hi -- thanks for writing!

    I started with exactly the approach you describe, when we were working exclusively with SNP data in the spring. For example:

    http://johnhawks.net/weblog/reviews/neandertals/neandertal_dna/europe-ch...

    We were using linked haplotypes rather than single SNPs but the filtering process was the same.
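
    The two steps in the question, pairwise sharing within and between populations followed by dropping anything also seen in Africa, can be sketched in a few lines of Python. This is only a toy illustration: the individuals, SNP IDs, and the CHB and YRI labels are hypothetical stand-ins, and real work would use linked haplotypes and far larger samples.

```python
from itertools import combinations

# Hypothetical toy data, not real genotypes: for each individual, the set of
# SNP IDs at which they carry a derived allele shared with the Neandertal genome.
carriers = {
    "CEU_1": {"snp01", "snp02", "snp03"},
    "CEU_2": {"snp02", "snp03", "snp04"},
    "CHB_1": {"snp03", "snp05"},
    "CHB_2": {"snp05", "snp06"},
    "YRI_1": {"snp03"},
}
# Population label is the part of the name before the underscore.
population = {name: name.split("_")[0] for name in carriers}


def pairwise_sharing(carriers, population, exclude=frozenset()):
    """Count shared alleles for every pair of individuals, split into
    within-population and between-population comparisons."""
    counts = {"within": [], "between": []}
    for a, b in combinations(sorted(carriers), 2):
        shared = (carriers[a] & carriers[b]) - exclude
        key = "within" if population[a] == population[b] else "between"
        counts[key].append(len(shared))
    return counts


# Step (i): sharing across all individuals, no filtering.
unfiltered = pairwise_sharing(carriers, population)

# Step (ii): drop every SNP seen in any African individual, then repeat
# the comparison among the non-African individuals only.
african = {name for name, pop in population.items() if pop == "YRI"}
seen_in_africa = set().union(*(carriers[name] for name in african))
non_african = {n: snps for n, snps in carriers.items() if n not in african}
filtered = pairwise_sharing(non_african, population, exclude=seen_in_africa)

print("unfiltered:", unfiltered)
print("filtered:  ", filtered)
```

    The lists of counts are the raw material for the histograms the question describes. In this toy data, the only allele shared between populations is the one also present in Africa, so the between-population sharing drops to zero after filtering.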

    Now I am hopeful that we will have decent age estimates for the introgressing SNPs from a different technique. I would rather find these ages independently of filtering by geographic location, because having this information will greatly simplify testing models of ancient population dynamics. If we succeed at this, we will also have a test of selection based on the same allele ages.

    I am continuing to update and you'll see these results not long after we get them!

  • HLA class-I loci in Neandertals and Denisova

    Thu, 2011-08-25 21:08 -- John Hawks

    With draft sequences of genomes from several Neandertals and from Denisova, we can begin to investigate known human variations that affect phenotypes. In practice, this is a very simple approach -- take alleles that we know exist in recent human populations, and see if they are in the DNA sequences of these ancient people. My lab has been following this line of research, trying to get information about aspects of biology that are not evident from the skeleton. The immune system is one of the most fascinating, both because of its extensive variation in living people, and because we might be able to test hypotheses about the diseases and parasites that ancient humans faced.

    Today Science has released an early manuscript edition of a paper by Laurent Abi-Rached and colleagues (bibliographic information not yet available), which identifies the HLA class-I alleles present in the three highest-coverage Neandertal genomes from Vindija (Vi 33.16, 33.25, and 33.26) and the Denisova pinky genome. The paper is very brief and fairly straightforward, providing provisional HLA class-I allele types for these individuals, discussing possible haplotype associations among these alleles that may have been in the ancient genomes, and providing the frequency of those alleles in present-day human populations.

    These archaic individuals carried HLA types that are presently rare in Africa and more common outside of Africa, supporting the hypothesis that these alleles in living people originated in those archaic populations. The linkage between alleles at different HLA class-I genes also supports that hypothesis. The present immune system biology of humans was strongly shaped by the interaction of different regional populations of archaic humans.

    The title of the paper calls this "multiregional admixture", and the word "introgression" appears 8 times. Good for us!

    (This is the point where I grumble about the lack of citations in this paper....OK, done grumbling.)

    Selected genes may have a very different pattern from neutral genes

    This paper is the first demonstration that gene variants of functional importance were not only inherited from Neandertals and Denisovans but were valuable and selected in later populations.

    We already knew that humans today have gene variants from these archaic humans. Neandertal genes presently account for around 3 percent of the genomes of people outside Subsaharan Africa. My lab has been studying the pattern of frequency of these genes ("Europe and China have different Neandertal genes"). Most of the genes shared between the Neandertal genome and living people outside Africa are presently very rare -- most occur only in a single individual in our sample of Europeans and Chinese people, for example.

    These HLA class-I alleles are different. Some of them are quite common today. If they came from the Neandertals and Denisovans -- that is, if they were not present in the African people who make up most of our ancestry genome-wide -- then these alleles must have increased quite a lot during the recent evolution of people outside Africa.

    The best explanation for the large increase in frequency of these genes in modern human populations is selection. If readers want to get an introduction to the scientific literature on the topic of functional genes, I can suggest a detailed review paper I wrote with Greg Cochran on the dynamics of introgression and selection as applied to Neandertals [1], and a review paper we wrote in Trends in Genetics about identifying genes in living humans that may have come from archaic populations [2]. In both papers, we discuss the dynamics of functional genes that may be affected by selection in modern human populations and how they differ from the predictions for neutral loci affected only by genetic drift. The new paper by Abi-Rached and colleagues follows on that line of inquiry.
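
    To get a feel for how quickly selection can move an allele from rare to common, here is a minimal Python sketch. It assumes the simplest deterministic model, in which the odds p/(1-p) of the favored allele are multiplied by (1+s) every generation. The starting frequency, ending frequency, and selection coefficient are all hypothetical values chosen for illustration; none of them comes from the paper.

```python
import math


def generations_to_reach(p_start, p_end, s):
    """Generations needed for a favored allele to rise from p_start to p_end
    when its odds p/(1-p) grow by a factor (1+s) each generation
    (deterministic haploid selection, no drift)."""
    odds_start = p_start / (1 - p_start)
    odds_end = p_end / (1 - p_end)
    return math.log(odds_end / odds_start) / math.log(1 + s)


def frequency_after(p_start, s, generations):
    """Allele frequency after a given number of generations under the same model."""
    odds = p_start / (1 - p_start) * (1 + s) ** generations
    return odds / (1 + odds)


# Hypothetical illustration: an allele starting at 3 percent, ending at
# 30 percent, with a 1 percent selective advantage.
P_START, P_END, S = 0.03, 0.30, 0.01
needed = generations_to_reach(P_START, P_END, S)
print(f"about {needed:.0f} generations")
```

    Under these assumed values the rise takes a few hundred generations. Under drift alone the expected frequency would not change at all, which is why a large increase points toward selection.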

    I think the hypothesis of adaptive introgression is very likely, and that we shouldn't be at all surprised that the immune system might house many good examples of it.

    A look at the most extreme examples, involving the Denisova genome, shows the extent that these functional genes might reflect introgression well beyond that indicated by most of the genome. The HLA class-I alleles present in the Denisova genome are most common today in South Asia (HLA-C*12:02, HLA-C*15 which is also common in Australia) and Southeast Asia (HLA-A*11). These regions of the Old World have no substantial evidence of Denisova inheritance across their genomes. Yet they may very well have substantial frequencies (up to 48 percent for HLA-A*11) of HLA class-I alleles from the archaic Denisovan population.

    Reasons to be cautious

    This is the point where I have to make a note of caution. Even though I personally think it is likely that these HLA alleles really did introgress into the modern population from Neandertals and Denisovans, their geographic pattern really isn't enough to demonstrate this without question.

    Reports earlier this summer described some of the work this group was doing on HLA class-I loci, including a public lecture by PI Peter Parham. I noted at the time that the geographic distribution of the alleles mentioned in that lecture seemed a mismatch for the hypothesis of a Denisovan origin for the alleles ("The immune systems of archaic humans"). For example:

    HLA-A*11 is very common in Papua New Guinea, but it is also very common in north India and in China. These two areas otherwise show no significant evidence of Denisova ancestry. We might conclude that the HLA-A gene just has an unusually high level of introgression into Asian populations, not typical of the genome as a whole. That's certainly possible. But without finding any substantial number of derived mutations in the HLA-A*11 variant in the Denisova genome and in living Asians, it is hard to rule out that the sharing of HLA-A*11 in all these populations is just coincidence.

    Of course, if the allele were absent in Africa, that would weigh in favor of the idea it is shared by Late Pleistocene interbreeding outside Africa. But HLA-A*11 is in Africa, just very rare. And it's in Europe. This is the kind of locus that is difficult to interpret: if it has any tiny disadvantage against malaria, for instance, its rarity in Africa is easily explained as a function of recent evolution, while its presence almost everywhere outside Africa would be no surprise even if there were never any interbreeding.

    The story of HLA-C*12:02 is similar. It's common in PNG, but also broadly across South Asia and into Iran, areas where no substantial evidence of Denisovan ancestry has been demonstrated.

    Introgression under selection is a good hypothesis for why these alleles should be so much more broadly distributed than the evidence from the rest of the genome would suggest. But introgression isn't the only explanation, because the alleles might have been retained by balancing selection, with recombinant haplotypes suppressed by purifying selection. We might use haplotype age to test the hypothesis. If the alleles were retained by incomplete lineage sorting (ILS), they would look much older than if they came in from an archaic population by introgression. But as I'll describe below, in this case we actually have the opposite problem: these haplotypes look too young to have come in by introgression, likely a consequence of selection long after the Neandertals and Denisovans had contributed their genes to us.

    The curious case of HLA-B*73

    If I agree that the results of this paper are pretty likely, why am I still cautious? Well, the most confusing thing in this paper is an allele described in great detail that they didn't find in the archaic genomes. And I know from experience that not finding things is a pretty common occurrence when we go looking for odd things that might have come from Neandertals.

    There's a detective story here, that probably explains the initial interest of this group in the Neandertal genome, but that just didn't pan out in their search through the archaic genomes. The allele is HLA-B*73.

    Parham and colleagues [3] first characterized this allele, which is remarkably different from other HLA-B alleles. Homologs of HLA-B*73 are present in living apes, suggesting that the different human alleles originated before we diverged from gorillas. The retention of such an ancient allele in humans isn't a surprise in the HLA system, because many very divergent alleles have been kept in the population across evolutionary time by balancing selection. What's a bit surprising about HLA-B*73 is its limited diversity in living people. It appears to have persisted in humans throughout our evolution, but people today who carry the allele have very similar sequences, and it is nearly always linked to one single allele at the nearby gene, HLA-C (HLA-C*15). Also, the allele is very rare inside Africa and reaches its highest frequency in West Asia, where it occurs in only 4.5 percent of people. Because of this strange pattern, Parham and colleagues suggested that the allele may have been inherited from Neandertals.

    When I was in graduate school working on modern human origins, I took a special interest in genes that had this pattern of variation. HLA-B*73 was not the only one; there are others.

    The variation of the HLA-B*73 allele and its association with HLA-C*15 correspond very well to the predictions we presented in our paper on identifying introgression from archaic humans [2]. It's a highly divergent allele in humans compared to others, and it appears not to have recombined much with nearby genes, suggesting it was sequestered in another population through much of the diversification of present-day HLA alleles. But the HLA system is actually a rotten place to look for this kind of evidence, because there are many, many instances where ancient alleles have been retained in human populations by balancing selection. As we pointed out in 2008, a deep root to the gene tree and a rarity of recombination can be good evidence of introgression, but balancing selection and inhibitions to recombination are alternatives to introgression for explaining this pattern of variation.

    There's no necessary contradiction between the two processes, and ancient DNA in this case could establish that the allele was both under selection and came from archaic humans. The problem: they didn't find the allele in the archaic genomes.

    So why did they spend so much time in this paper discussing this allele? My guess is that they were surprised not to find it. But they did find HLA-C*15 in the Denisova genome, which is often linked to HLA-B*73 in living people who carry it. That makes for an indirect argument:

    C*12:02 and C*15 were formed before the Out-of-Africa migration (Fig. 2H and fig. S15) and exhibit much higher haplotype diversity in Asia than in Africa (fig. S16), contrasting with the usually higher African genetic diversity (20). These properties fit with C*12:02 and C*15 having been introduced to modern humans through admixture with Denisovans in west Asia, with later spreading to Africa (21, 22) (Fig. 1F and fig. S11 for C*15). Given our minimal sampling of the Denisovan population it is remarkable that C*15:05 and C*12:02 are the two modern HLA-C alleles in strongest LD with B*73 (Fig. 1E). Although B*73 was not carried by the Denisovan individual studied, the presence of these two associated HLA-C alleles provide strong circumstantial evidence that B*73 was passed from Denisovans to modern humans.

    I would go one simpler: Given that HLA-B*73 is most common today in West Asia, I suspect it came from West Asian Neandertals. There's no reason why the HLA genes of European Neandertals should have been identical to West Asian Neandertals. Today's Europeans are different from today's West Asians in the frequencies of these alleles, so why not in the past as well? For that matter, we really only have two alleles from European Neandertals for HLA-B (since the paper finds that

    Why do the Vindija Neandertals all have the same HLA types?

    It's a pretty good question. The paper cannot distinguish the genotypes from these three individuals. That's not the same as saying they're exactly the same type, since the sequences are very low coverage, but probably they were. Here's what the paper says:

    Genome-wide analysis showing three Vindija Neandertals exhibited limited genetic diversity (3) is reflected in our HLA analysis: each individual has the same HLA class I alleles (fig. S17). Because these HLA identities could not be the consequence of modern human DNA contamination of Neandertal samples, which is <1% (3), they indicate these individuals likely belonged to a small and isolated population (fig. S18).

    Still, I think this indicates a pretty high degree of inbreeding among these individuals. I wonder what the organ registry for Neandertals would have looked like.

    (Not so) final words

    I have more to write on the topic of linkage disequilibrium among these genes. The rate of recombination between HLA-B and HLA-C is high enough that a haplotype between these genes should have mostly decayed in the time since our mixture with archaic humans. HLA-C and HLA-A are an order of magnitude further apart, so linkage between alleles of these genes should have been totally erased in the time since any archaic admixture.
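
    The decay argument can be made concrete with the textbook result that linkage disequilibrium between two loci shrinks by a factor of (1 - r) each generation, where r is the recombination fraction between them. The Python sketch below uses hypothetical numbers throughout: the recombination fraction for HLA-B and HLA-C and the number of generations since admixture are placeholders, and the only thing taken from the text is the tenfold ratio between the two distances.

```python
def ld_remaining(r, generations):
    """Fraction of the initial linkage disequilibrium left after the given
    number of generations of random mating: D_t / D_0 = (1 - r) ** t."""
    return (1.0 - r) ** generations


GENERATIONS = 2000      # hypothetical time since admixture, in generations
R_B_C = 0.001           # hypothetical recombination fraction, HLA-B to HLA-C
R_C_A = 10 * R_B_C      # "an order of magnitude further apart"

b_c = ld_remaining(R_B_C, GENERATIONS)
c_a = ld_remaining(R_C_A, GENERATIONS)

print(f"HLA-B/HLA-C: {b_c:.3f} of the initial LD remains")
print(f"HLA-C/HLA-A: {c_a:.2e} of the initial LD remains")
```

    With these assumed values, a little over a tenth of the original association survives between the closer pair of genes, while essentially nothing survives between the more distant pair. Different assumed values shift the numbers, but the tenfold difference in distance always compounds over the generations.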

    That means that the extended haplotypes reported in this study must reflect selection in the period since the population mixture and introgression. The story isn't a simple case of inheritance from archaic humans; it is rather more complex. But more on that later.

    I think this paper confirms that it will be really productive to look at archaic genomes for variants present in living humans. Identifying modern human alleles in a Neandertal isn't really very exciting science, though. I've been doing this on my blog for a year now. It's a tricky job to type these HLA alleles, compared to genotyping many other genes, as we discovered. Still, I never really expected that reporting on genotypes in the public domain would be sufficient to get printed in Science.

    Still, this set of three genes is particularly interesting. And the paper does add evidence from one additional locus, KIR3DS1, which also has the pattern where an allele rare in Africa but common in Asia is present in the Denisova genome.

    If it turns out that we have widespread adaptive introgression in Asia today from Denisovans, that will change the game of studying the origins of these populations. Based on the genome-wide comparison, it looks like the genetic interaction that led to the habitation of Asia did not involve Denisovans, who contributed only to populations at the most eastern extreme of habitation in island Southeast Asia. But the only Denisovans we know about lived near the geographic center of the Asian landmass, not at the southeastern extreme.

    The HLA pattern may suggest a more widespread pattern of mixture across Asia, which was later overwritten by population movements of people who didn't have Denisovan ancestry. That means that the habitation of Asia was a process of successive migrations and replacements, which imperfectly covered up the evidence of archaic intermixture. The genes that remain as signs of this intermixture are those that had selective advantages in later populations.


    Synopsis: 
    Abi-Rached and colleagues report that the human immune system owes much to the Neandertals and Denisovans.
  • Mailbag: Could autism genes be adaptive?

    Wed, 2011-08-17 22:53 -- John Hawks
    I have always wondered if autism could be an adaptive mutation. However, since I myself have autism, and specifically one of the more fortunate types of autism, I've figured it would make me a monumental bleep to take such a notion seriously. But when I saw your article, I figured why not go out on a limb and run this fleck of curiosity by an expert. So could it be?

    P.S. Love your Faq #6! An induction schema that compliments the contributor once, but insults him an unlimited number of times. LOL. Unfortunately, I highly doubt those types of people would get the irony.

    Thanks!

    It's hard to say without knowing many of the genes that increase the probability of being on the spectrum. If you read in genetics now about the "hidden heritability", this is one of the cases -- we know that the trait has a strong genetic influence, but in large samples we don't find strong evidence for any single gene.

    It's likely that the heritability is explained by many different genes, each of which is rare in the population. That pattern would make it less likely that the genes that influence autism are adaptive -- many (but not all) adaptive traits are cases where a relatively small number of common genes influence the trait. But we won't really know until we have a better account of the genes involved.

  • Positive selection on killer whale mtDNA

    Wed, 2010-09-08 00:05 -- John Hawks

    I've written about the study of selection on human mtDNA many times, and discussed the signs that Neandertal mtDNA may have disappeared because of selection.

    I love how larger samples are starting to get zoologists to test the neutral hypothesis much more widely. This week, a new paper in Biology Letters by Andrew Foote and colleagues [1] does this for different populations of killer whales. They find possible evidence for positive selection on amino acid-coding variants in cytochrome b in two Antarctic populations.

    Here's the last paragraph of the paper. This isn't totally clear without the context (describing the whale populations) but it gives the best short summary of the complexity that was found.

    Based on morphological differences [21] and reciprocal monophyly of the mitogenome sequences [12], it has been suggested that type B and type C are distinct species. Positive selection on the cytochrome b could therefore be caused by adaptive divergence relating to a combination of variables that influence metabolic requirements, such as body size or diet; type C is a fish-eating dwarf form of killer whale, whereas type B is one of the largest forms of killer whale and primarily feeds upon seals [21,22] (J. W. Durban & R. L. Pitman 2010, unpublished data). However, the amino acid changes in both ecotypes could be the result of parallel evolution owing to environmental conditions such as oxygen concentration or sea temperature. Both type B and type C at least seasonally inhabit Antarctic pack ice, and both have been sighted over-wintering in the pack ice [21]. The third Antarctic ecotype, for which we found no evidence of positive selection, inhabits the offshore ice-free waters during the austral summer and over-winters at lower latitudes [21]. However, the mutations are in the opposite direction for each ecotype, suggesting that divergent evolution may be more likely. The two changes were private alleles within type B and type C, respectively, and neither substitution was found in the reconstructed ancestral sequence (electronic supplementary material), suggesting that each mutation has occurred and become fixed and almost fixed, respectively, since type B and type C diverged from their most recent common ancestor, approximately 0.15 Ma [12]. Therefore, the ancestral form may not have been subject to the same selective pressures.

    Some thoughts:

    1. We know how well "reciprocal monophyly" has turned out for human and Neandertal mtDNA genomes...

    2. It's interesting how much play there seems to be in the mitochondrial genome. Lots of ways to change and have small phenotypic effects that may be adaptive in one or another ecology. The system as a whole is relatively robust to many mtDNA changes.

    3. Many years ago, whale mtDNA was being explained in very similar ways to humans -- a matter of small effective size, in this case exacerbated by matrilineal pod structure. Might well be for many kinds of whales, but selection makes the story more complex.


  • Time to revise the mtDNA timescale?

    Wed, 2010-08-18 23:35 -- John Hawks

    Krzysztof Cyran and Marek Kimmel (2010) have presented a revised set of estimates of the human mtDNA most recent common ancestor (MRCA). It's an interesting theoretical paper, written for the purpose of developing a method that doesn't rely on the same assumptions as the usual coalescent models.

    Their new method gives an estimate of 174,000 years ago for the human MRCA. They report an upper/lower range as 96,000 to 449,000 years ago. That range does not represent a confidence interval on the estimate; it's an upper/lower based on extreme assumptions about human/Neandertal genetic distance and the human/Neandertal MRCA.

    The Neandertal mtDNA has really affected the way we estimate human MRCA, at least for the mitochondrial genome. Chimpanzees are just too distant. When we compare human and chimpanzee mtDNA genomes, there has been a lot of parallelism and reversal on both lineages, because mutations have hit the same place multiple times. Multiple hits and purifying selection make a mess out of rate estimation -- generally, they make the human MRCA seem a lot older than it truly was. The Neandertals are closer, and are therefore less of a problem.

    But the Neandertal-human MRCA itself was poorly known, as long as we had only chimpanzees to calibrate the mutation rate....

    That's what we discovered earlier this year with the mtDNA genome of the Denisova specimen [1] ("The Denisova mtDNA sequence: The X-Woman"). Denisova is an outgroup to the human-Neandertal mtDNA clade, which diverged from our mtDNA ancestors around a million years ago. Sliding in that branch redated the human-Neandertal MRCA down to 460,000 years ago. Unfortunately, that paper came too late for Cyran and Kimmel [2] to use the revised human-Neandertal MRCA in their calculations. They assumed a date of 511,000 years ago for the human-Neandertal MRCA.

    Still, the paper gives enough detail to work out the effect of a lower human-Neandertal MRCA on their estimate. They obtained their lower bound (96,000 years) by assuming a human-Neandertal MRCA of 389,000 years. If we substitute in the Denisova-informed human-Neandertal MRCA, we can figure that the human MRCA will be around 130,000 years ago or so.
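
    One rough way to bracket a figure like that is to assume the human MRCA estimate scales in direct proportion to the assumed human-Neandertal MRCA. That proportionality is an assumption made here for illustration, not the method Cyran and Kimmel actually use, and the function name is hypothetical. The dates themselves are the ones quoted above.

```python
def rescale_mrca(human_mrca, assumed_hn_mrca, revised_hn_mrca):
    """Rescale a human MRCA estimate, assuming it is directly proportional
    to the human-Neandertal MRCA used as the calibration point."""
    return human_mrca * revised_hn_mrca / assumed_hn_mrca


REVISED_HN_MRCA = 460_000  # Denisova-informed human-Neandertal MRCA (years)

# Central estimate: 174,000 years, obtained with a 511,000-year calibration.
from_central = rescale_mrca(174_000, 511_000, REVISED_HN_MRCA)

# Lower bound: 96,000 years, obtained with a 389,000-year calibration.
from_lower = rescale_mrca(96_000, 389_000, REVISED_HN_MRCA)

print(f"rescaled from central estimate: {from_central:,.0f} years")
print(f"rescaled from lower bound:      {from_lower:,.0f} years")
```

    The two rescaled values come out near 157,000 and 114,000 years, so a figure of around 130,000 years sits between them. The lower bound also depends on extreme assumptions about genetic distance, so this is a bracket rather than a derivation.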

    That's awfully recent.

    I don't want to go too far with these numbers. My first objection is that they all assume the total absence of selection, when we have long known that some human mtDNA clades have been selected in some parts of the world. It's entirely possible that the human MRCA is recent because of natural selection on some mitochondrial-linked phenotype ("Complete Neandertal mitochondrial sequence, and selection on human (not Neandertal) mtDNA", "Has the dam broken on mtDNA selection?", "Selection, nuclear genetic variation, and mtDNA").

    And even if we assume no selection at all, there's not a lot to be gained by increased precision of these estimates. Branch lengths of an mtDNA genealogy give only extremely wide estimates of ancient events. If we say that something happened "around 50,000 years ago, plus or minus 35,000", it hardly matters whether we change that to "around 43,200 years ago, plus or minus 35,000." I would even argue that the round estimate is better, because it doesn't communicate a misleading impression of precision.

    Still, it does a lot of good to know whether estimates are systematically biased in one direction. And this work, combined with what we know about the Neandertal and Denisova complete mtDNA genomes, suggests that our mtDNA branch lengths may have been biased too high.

    It remains to be seen how much of the human mtDNA tree will be affected by this logic. The most recent branches can in many cases be calibrated against historical events, and ultimately parent-offspring comparisons. So those aren't likely to change much. What worries me is that critical period around 30,000--80,000 years ago, when human mtDNA lineages were diversifying worldwide. The timescale of mtDNA divergence is already out of whack with the rest of the genome. Pushing these divergences more recent will make the fit between mtDNA and autosomal estimates worse. But given the wide variance on coalescence times, Cyran and Kimmel's estimates are consistent with the hypothesis that these might be substantially higher -- so it's hard to guess whether the apparent mismatch is real or not.

    I might have missed this paper if it weren't for the press release about it from Rice University. But what a misleading release! It's headlined, "Mother of all humans lived 200,000 years ago" -- which the paper doesn't conclude. If that were the conclusion, it wouldn't be news, because it's confirming a widely-used estimate that's more than 20 years old.

    But there are actually several interesting angles to the story that the press release fails to mention. Their estimation method may prove useful for many species for which we have no good demographic model -- a problem that the release alludes to, but doesn't feature. The method they develop came from a similar process, which had formerly led to a much, much higher estimate of human MRCA. Their estimate is a lot lower -- in large part because they can exploit the Neandertal genetic information. And then there's the likely possibility that the actual MRCA may be much lower, which would truly be unexpected compared to most earlier work.

    At the end of their paper, Cyran and Kimmel give a short discussion of the history of the Out of Africa mtDNA story. They mention the idea that some people favoring the multiregional hypothesis had suggested older dates for the human mtDNA MRCA. Aside from O'Connell [3], however, they didn't cite this literature. The conclusion of a short timescale, with a MRCA around 200,000 years ago, was challenged by a number of geneticists [4],[5]. The most common point was that the upper confidence limit on the MRCA estimate must be very high -- potentially 800,000 years ago or more, because of the great uncertainty about rates, coming from the chimpanzee-human branch length. This remains a problem, although the availability of a Neandertal outgroup helps to clarify which changes on the human lineage are actually recent.

    It's sort of interesting that even in the current paper, we still have an upper estimate of the human MRCA that's nearly 450,000 years ago! I don't think that the assumptions going into that value are realistic, but there's no real upper confidence bound on the central estimate. It might well go as high as 450,000 years, given the huge uncertainty in the depth of the deepest branches of that African mtDNA genealogy.

    So I guess I'm not really sure we've advanced very far in 20 years!


    Synopsis: 
    A study of human variation adds precision to the human mtDNA mutation rate; I compare to results from archaic humans.
  • Selection incidental to laboratory life

    Wed, 2010-04-14 17:00 -- John Hawks

    Olivia Judson's column is a very useful essay on selection incidental to laboratory life for model organisms ("Laboratory Life"). She discusses fruit flies and wasps, but I'll give you a passage about mice:

    Mice show a host of changes, too. Compared to their wild relations, laboratory mice are typically bigger, more docile, reach sexual maturity earlier and die younger. Some of these changes can appear quickly: one study found that the ability to reproduce later in life declined within 10 generations of the mice being bred in the laboratory.

    Intriguingly, laboratory mice also have longer telomeres than wild mice. (Telomeres are the segments of DNA at the ends of chromosomes; they are thought to play a role in aging and cancer.) Since no one is deliberately breeding mice for extra-long telomeres, this must arise as some consequence of laboratory life. But what?

    Take away predators and foraging requirements, and select for fecundity and docility. Lots of things can happen, not all good.

  • R. A. Fisher's model of adaptation

    Mon, 2009-10-26 01:25 -- John Hawks

    Chapter 2 of R. A. Fisher's Genetical Theory of Natural Selection is remarkable for many reasons. In it, he presents a model of selection in an age-structured population, the concept of reproductive value, and the Fundamental Theorem. Toward the end of the chapter, he discusses "The Nature of Adaptation," presenting a geometric model to justify the assertion that the probability of favorable genetic changes declines as the effect size of those changes increases.

    In order to consider in outline the consequences to the organic world of the progressive increase of fitness of each species of organism, it is necessary to consider the abstract nature of the relationship which we term 'adaptation.' This is the more necessary since any simple example of adaptation, such as the lengthened neck and legs of the giraffe as an adaptation to browsing on high levels of foliage, or the conformity in average tint of an animal to its natural background, lose, by the very simplicity of statement, a great part of the meaning which the word really conveys. For the more complex the adaptation, the more numerous the different features of conformity, the more essentially adaptive the situation is recognized to be. An organism is regarded as adapted to a particular situation, or to the totality of situations which constitute its environment, only in so far as we can imagine an assemblage of slightly different situations, or environments, to which the animal would on the whole be less well adapted; and equally only in so far as we can imagine an assemblage of slightly different organic forms, which would be less well adapted to that environment (38).

    I've highlighted that last sentence, which is saying that organisms fit their environments in possibly many different ways, so that their fitness is not actually tied to any single feature, such as "tint," but is instead an optimum of many features. With respect to any single factor, the organism may be more or less well adapted. The rest of that paragraph continues on to make the same point.

    Then:

    The statistical requirements of the situation, in which one thing [the organism] is made to conform to another [the environment] in a large number of different respects, may be illustrated geometrically. The degree of conformity may be represented by the closeness with which a point A approaches a fixed point O. In space of three dimensions we can only represent conformity in three different respects, but even with only these the general character of the situation may be represented. The possible positions representing adaptations superior to that represented by A will be enclosed by a sphere passing through A and centered at O.

    This is really very simple, and the geometric model reveals an interesting switch. Suppose that we imagine an organism as a set of a traits, each of which lies at some distance d1, d2, d3, ..., da from the optimum value for that trait, O. We could imagine adaptation as a stepwise process, in which any one of the a traits may change, and only those changes that reduce that trait's distance from its optimum will potentially be selected.

    But there's no reason at all why we should consider every trait as an independent entity. Suppose that a single genetic change could improve two traits, or three, or even all the traits at the same time.

    Or, more interesting, a change might greatly improve trait 1, while making all the rest of the traits marginally worse. Without a word, Fisher has moved the issue of adaptation from the fit of many separate variables to a single distance in multidimensional space -- a change from Cartesian to polar coordinates, as it were.
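    A small numerical illustration of that switch may help. This is only a sketch: the trait values are made up for the example, and the name `distance_to_optimum` is hypothetical, not anything from Fisher. It shows a change that makes three of four traits slightly worse and still brings the organism closer to O when adaptation is measured as a single distance.

```python
import math

def distance_to_optimum(deviations):
    """Euclidean distance from the optimum O, given per-trait deviations."""
    return math.sqrt(sum(d * d for d in deviations))

# Hypothetical organism: four traits, each at some distance from its optimum.
before = [3.0, 1.0, 1.0, 1.0]
# Hypothetical change: trait 1 improves a lot, the other three get slightly worse.
after = [1.0, 1.2, 1.2, 1.2]

# Count traits whose individual deviation from the optimum grew.
worse_traits = sum(abs(a) > abs(b) for a, b in zip(after, before))

# Judge the change by the single overall distance instead of trait by trait.
improved = distance_to_optimum(after) < distance_to_optimum(before)

print("traits made worse:", worse_traits)
print("distance before:", round(distance_to_optimum(before), 3))
print("distance after: ", round(distance_to_optimum(after), 3))
print("net improvement:", improved)
```

    Judged trait by trait, most of this change is a step backward; judged by the one distance that the geometric model cares about, it is an improvement.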

    If A is shifted through a fixed distance, r, in any direction its translation will improve the adaptation if it is carried to a point within this sphere, but will impair it if the new position is outside.

    The geometric model really assumes very little. It tells us nothing at all about the relationship between fitness and any particular phenotype, aside from assuming that (1) the relation is continuous within the sphere centered on O that passes through A, and (2) there are no "holes" of low fitness within that sphere. This is not a fitness landscape. Fisher's view, as will become clear, was that species are well-adapted in nature. He assumes that the distance between A and O is always rather small in comparison to the kinds of phenotypic effects that mutations might cause. So in assuming that the fitness function is continuous, and that the area is small, he more or less automatically arrived at the assumption that there's only one peak at O.

    The next part is the famous one that people remember:

    If r is very small it may be perceived that the chances of these two events are approximately equal, and the chance of an improvement tends to the limit 1/2 as r tends to zero; but if r is as great as the diameter of the sphere or greater, there is no longer any chance whatever of improvement, for all points within the sphere are less than this distance from A.

    In this model, small changes are roughly 50-50 beneficial versus deleterious, but since their effect is very small, it hardly matters. Big changes are much less likely to be beneficial -- and if they exceed twice the distance from A to O, they can never be beneficial.
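    The three-dimensional case is easy to check by simulation. The following is a minimal sketch, not Fisher's own procedure: it places O at the origin and A at distance d/2 from it (so d is the diameter of the sphere), draws steps of length r in uniformly random directions, and counts how often the step lands inside the sphere. The function names, the trial count, and the seed are all assumptions made for the example. The simulated fraction is compared with the three-dimensional expression (1/2)(1 - r/d).

```python
import math
import random

def random_direction(n, rng):
    """Uniform random unit vector in n dimensions (normalized Gaussian draw)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def chance_of_improvement(r, d, n=3, trials=50_000, seed=1):
    """Fraction of random steps of length r that bring A closer to O.

    O is the origin; A sits at distance d/2 from O, so d is the diameter
    of the sphere through A centered at O.
    """
    rng = random.Random(seed)
    a = [d / 2.0] + [0.0] * (n - 1)
    better = 0
    for _ in range(trials):
        u = random_direction(n, rng)
        new_dist_sq = sum((ai + r * ui) ** 2 for ai, ui in zip(a, u))
        if new_dist_sq < (d / 2.0) ** 2:
            better += 1
    return better / trials

d = 2.0
for r in (0.02, 0.5, 1.0, 1.5, 2.0):
    simulated = chance_of_improvement(r, d)
    exact_3d = 0.5 * (1.0 - r / d)
    print(f"r/d = {r / d:.2f}  simulated = {simulated:.3f}  "
          f"(1/2)(1 - r/d) = {exact_3d:.3f}")
```

    The simulated values should track the closed form to within sampling error, falling from about one half for tiny steps to zero once r reaches the diameter.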

    After this quote, he gives an exact expression for the probability that a given change r is beneficial, (1/2)(1 - r/d), where d is the diameter of the sphere, but this is limited to the three-dimensional model. Then, there's another interesting one-sentence switch:

    The chance of improvement thus decreases steadily from its limiting value 1/2 when r is near zero, to zero when r equals d. Since A in our representation may signify either the organism or its environment, we should conclude that a change on either side has, when this change is extremely minute, an almost equal chance of effective improvement or the reverse; while for greater changes the chance of improvement diminishes progressively, becoming zero, or at least negligible, for changes of a sufficiently pronounced character.

    Remember that A represents the current "degree of conformity" of the organism to its "particular situation." This degree of conformity might be changed either by changing the organism or its environment -- the implication of the last confusing sentence from the first paragraph, above.

    The point O is not an objective location (as it would be if we assumed it is the optimum within the current environment). Instead, it is a geometric abstraction that exists only in so far as it bears a relation to A in the present multidimensional space of "adaptation". We may change the distance from A to O either by changing the organism or by changing its environment. Fisher refers explicitly to this "assemblage of slightly different situations" with respect to both -- and again, the concept of "slightly different" underlies the assumption that the fitness function within the sphere is continuous.

    It seems to me that the idea of "niche construction" falls very easily within Fisher's model. An organism that systematically alters its environment may thereby change its level of adaptation. So even though mutation is an obvious candidate for a process described by the model, I've continued to refer to "change" as a nonspecific term for the unit of adaptation.

    The representation in three dimensions is evidently inadequate; for even a single organ, in cases in which we know enough to appreciate the relation between structure and function, as is, broadly speaking, the case with the eye in vertebrates, often shows this conformity in many more than three respects. It is of interest therefore, that if in our geometrical problem the number of dimensions be increased, the form of the relationship between the magnitude of the change r and the probability of improvement, tends to a limit which is represented in Fig. 3. The primary facts of the three dimensional problem are conserved in that the chance of improvement, for very small displacements tends to the limiting value 1/2, while it falls off rapidly for increasing displacements, attaining exceedingly small values, however, when the number of dimensions is large, even while r is still small compared to d.

    Here we see the problem of universal pleiotropy emerging. As the number of dimensions of adaptability increases, the probability that one change will be a net increase in fitness, considering all dimensions, must decrease. This probability remains larger when the distance between A and O is larger, but declines as the number of dimensions increases.

    However, what we would call "universal pleiotropy" is intrinsic to Fisher's assumption that the direction of changes in this multidimensional space is random. If changes may occur in any direction, this is the same as asserting that a mutation may induce correlated changes between any set of phenotypes. With thousands of genes, and therefore thousands of dimensions of genetic "adaptedness", we might guess that the dimensionality is high enough to make this assumption useful.

    If on the other hand, the direction of changes is constrained in some way, then the dimensionality of the space accordingly is smaller by some degree. This is the rough equivalent of modularity in Fisher's model: if we say that some genetic correlations cannot be changed, we are saying that the phenotypic structure of the organism is modularized.

    The remainder of the section discusses the general case in spaces with higher than three dimensions. The basic point is that the probability that a change of a given size will be adaptive increases with the distance from A to O, decreases with the effect size r, and also decreases with the number of dimensions.
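    The effect of dimensionality can be sketched the same way. The code below is an illustration under stated assumptions, not a reproduction of Fisher's figure: it estimates by simulation the chance that a random step of fixed length r improves adaptation in n dimensions, and sets it beside the usual large-n form of Fisher's limit, the upper tail of a standard normal distribution at x = r*sqrt(n)/d. That formula is not quoted in this post; it is supplied here as the commonly cited statement of the limiting curve. Function names, trial counts, and the seed are hypothetical choices.

```python
import math
import random

def chance_of_improvement(r, d, n, trials=20_000, seed=1):
    """Monte Carlo fraction of random steps of length r that move A closer to O."""
    rng = random.Random(seed)
    z = d / 2.0                      # distance from A to O; d is the diameter
    better = 0
    for _ in range(trials):
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        u1 = v[0] / norm             # component of the unit step along the A-O axis
        # With A = (z, 0, ..., 0): |A + r*u|^2 < z^2 reduces to 2*z*r*u1 + r^2 < 0
        if 2.0 * z * r * u1 + r * r < 0.0:
            better += 1
    return better / trials

def fisher_limit(r, d, n):
    """Large-n approximation: upper normal tail at x = r*sqrt(n)/d (assumed form)."""
    x = r * math.sqrt(n) / d
    return 0.5 * math.erfc(x / math.sqrt(2.0))

d, r = 2.0, 0.1                      # same step size throughout; only n changes
for n in (3, 10, 30, 100):
    sim = chance_of_improvement(r, d, n)
    print(f"n = {n:3d}  simulated = {sim:.3f}  normal limit = {fisher_limit(r, d, n):.3f}")
```

    With r held at one twentieth of the diameter, the chance of improvement should start close to one half in three dimensions and fall steadily as n grows, which is the pattern the paragraph above describes.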

    Fisher finishes with a paragraph that, had he begun the section with it, might have made everything much clearer:

    The conformity of these statistical requirements with common experience will be perceived by comparison with the mechanical adaptation of an instrument, such as the microscope, when adjusted for distinct vision. If we imagine a derangement of the system by moving a little each of the lenses, either longitudinally or transversely, or by twisting through an angle, by altering the refractive index and transparency of the different components, or the curvature, or the polish of the interfaces, it is sufficiently obvious that any large derangement will have a very small probability of improving the adjustment, while in the case of alterations much less than the smallest of those intentionally effected by the maker or the operator, the chance of improvement should be almost exactly half.

    But there is an obvious objection: if you twist a knob on a microscope, it may move one of the lenses, but it isn't going to polish it. It's not possible to effect simultaneous changes on distinct elements of the microscope, because a microscope is in fact modular in its construction and controls. That doesn't disprove Fisher's model, but at least in the microscope case, the possible changes are not random in direction, but are constrained to "Cartesian"-like independent axes.

    Fisher's general idea is different from the fitness landscape, in the sense used by Sewall Wright. Fisher assumes a single adaptive optimum; Wright espoused the possibility of a "rugged" landscape in which large genetic changes might diverge from the nearest local optimum yet place the population near an alternate (possibly higher) local optimum. But the geometric model shares many properties with the fitness landscape, including its assumptions about hill-climbing toward the local optimum.

    I pointed to Sergey Gavrilets' work a couple of weeks ago. Fisher's geometric model is most similar to the second meaning of fitness landscape, which pertains to the mean fitness of a population given a discrete genotype -- or in this case, a discrete phenotype-environment combination. Fisher's model is entirely about the relationship of two equilibria: the population mean fitness before and after a given change. It does not deal with the dynamics of the transition from one state to another.

    References:

    Fisher RA. 1930. The Genetical Theory of Natural Selection. Clarendon Press, Oxford.

  • Gene regulation and its evolution

    Mon, 2009-09-21 13:20 -- John Hawks

    I wrote earlier this week about Hopi Hoekstra's work on pigmentation evolution in mice ("The color of mice"). The linked article focusing on this empirical work didn't mention her interesting involvement in the debate over the nature and importance of gene regulation as a target of selection.

    I wanted to point out some articles on the topic, by Hoekstra and others, because more and more, paleoanthropological hypotheses are being found to involve the evolution of gene regulation. I'll just note a couple of examples (diet and pigmentation), and hint that there are more coming in the next year.

    If the topic of gene regulatory evolution, or cis-regulation in particular, is obscure, let me recommend a 2008 primer article by Wisconsin geneticist Sean B. Carroll ("Evo-Devo and an Expanding Evolutionary Synthesis: A Genetic Theory of Morphological Evolution"). Carroll is probably the best-known advocate of an "evo-devo" perspective on morphological evolution, and in particular the hypothesis that most morphological evolution may be explained by changes to cis-regulation -- "self-acting" sequence elements that affect gene transcription, such as promoter or enhancer elements.

    In a 2007 commentary, Hoekstra and Jerry Coyne presented a critique of the idea that cis-regulation is a central mechanism of adaptation. Here's a quote from their conclusion (Hoekstra and Coyne 2007:1006):

    While the study of cis-regulatory evolution is an important endeavor, justifiably championed by [evo-devotee Sean B.] Carroll and others, our survey of the theory and empirical data shows that the widespread enthusiasm for the importance of cis-regulatory change in evolution is at best premature. Analyzing the verbal theory, one finds no compelling reason to draw a distinction between the genetic basis of anatomical versus physiological evolution. Nor is there good reason to accept the a priori argument that—for either anatomy or physiology—changes in cis-regulatory genes are more likely to be fixed in evolution than are changes in the coding region of genes.

    Everyone agrees that changes in cis-regulation, trans-regulation and good old-fashioned changes to protein sequences may all be selected, and there are examples of each being involved in the evolution of new adaptive phenotypes. So at that level, the theoretical disagreement is relatively sterile -- all of them are possible and cases are known for each.

    The question is whether any of them account for a preponderance of adaptive evolution. Is there anything special about cis-regulation, or any other kind of change? Are they coequal, do they occur in proportion to the number of regulatory elements, amino acid-coding positions, gene duplications? Do any of them release constraints on adaptive changes, allowing more rapid evolution?

    Why does anybody care? Well, there is a mercenary answer: They all have their own empirical research agendas to look out for, and some of them work mainly with experimental models and techniques effective for studying cis-regulation, others on trans-regulation and still others on classical polymorphisms. To me, these are totally boring topics, since I'm not hoping for any funding to do molecular work on gene regulation. Hopefully, the funding conflict will become less important as genomic methods get cheaper. Of course, when it's no longer difficult to find out the answers, we'll have a decent survey of empirical cases!

    Search strategies. A second explanation for why we should care is a practical one. Now that we are able to get genomes from any species we like, the question arises: what is a sensible strategy for forming hypotheses about adaptive (and non-adaptive) change? What should we be looking for?

    Coyne and Hoekstra wrote a 2007 perspective on an article about amylase adaptive evolution in human populations. They returned to the issue of whether we should expect a predominant target of adaptive change: cis-regulatory or so-called "structural" mutations to coding sequences.

    The amylase results [showing adaptive change in gene regulation by duplication] follow a related study on the genetics of human dietary differences. In 2006, Tishkoff and colleagues [11] identified a mutation in the upstream regulatory region of the gene for lactase, an enzyme important for digesting milk, in pastoral African populations. Using an in vitro system, they showed that this mutation could increase gene expression. The relevant mutation, however, is not a duplication, but probably a change in cis-regulation. (An independent cis-regulatory mutation at this locus, also conferring lactose tolerance, was identified earlier in European populations [12].)

    Even in the simplest cases of adaptation, then — increased enzyme production to handle new diets — evolution works in multiple ways. Obviously, no amount of a priori speculation will tell us which sorts of mutations will be important; the answer, unfortunately, requires meticulous, case-by-case analysis of putative adaptations.

    Humans may be a poor model organism for considering this question. For one thing, we have a large store of loss-of-function mutations that have been selected for resistance to disease. The same thing probably occurs in other species, but the exceptional number of new diseases in humans may tilt the scales in favor of "structural" mutations.

    Still, these diet-related examples show pretty clearly that multiple mechanisms of gene regulation may be targets of recent selection. That's also evident when we consider human pigmentation variation, a system that is relatively well-understood now from a genetic perspective, for the same reason that Hoekstra's deer mice pigment variations are tractable. In a genome-wide context, it now looks like cis-regulation has been a frequent target of recent adaptive evolution (Kudaravalli et al. 2009), but most of the well-studied examples of recent adaptive change are amino-acid coding.

    The breadth of pleiotropy. Pleiotropy ought to impede adaptation. If genes are solving multiple problems -- by interacting with distinct functional networks -- then changes that make one function better may often make others worse. If genes interact widely enough, then optimization becomes extremely difficult or impossible -- what Stuart Kauffman (1993) called a "complexity catastrophe." Make a system complicated enough, and the probability falls to nil that a random change might improve it.

    If evolution were generally mutation-limited in this way, you might well expect to see a highly modularized system of gene regulation evolve, and that's precisely the argument for the cis-regulatory evo-devo model. I don't have an opinion on the general question of how often adaptations may make use of this modular system of regulation as opposed to trans-regulation, duplication, or straight-on coding substitutions. It seems like toolkit genes, which are both highly conserved and strongly pleiotropic, may have evolved by altering cis-regulation more often than other means. Those genes have highly modularized "cassettes" of cis-regulatory elements that control their expression in different contexts. One regulatory element can change without necessarily impeding the function of the gene in other contexts.

    Empirical pigmentation research helps to illuminate the dispersal (and limits to dispersal) of recently selected mutations. That makes it a very relevant model system for understanding recent human evolution. Many human (and Neandertal) mutations to MC1R are trans-regulatory -- by altering the sequence of the hormone receptor, these mutations downregulate the pathway that converts pheomelanin to eumelanin. The nature of this regulatory change is structural -- it's an actual change in the gene product that affects pigmentation.

    From the deeper perspective of paleoanthropology, the evolution of form is the central topic. How did the distinctive conformation of the bipedal pelvis evolve? Some paleoanthropologists have already laid out scenarios in which morphological evolution took a small number of very broad changes -- pelvis, spine, and femora as integrated units that may have had correlated effects on arms and other morphological structures. Such hypotheses make the background assumption of strong pleiotropy on a hard-to-explore adaptive landscape. If human evolution was a product of a few hard-to-get mutations, which might not have happened at all, then our emergence was contingent on a series of unlikely events.

    Pleiotropic constraints may help to explain why we see a pulse of rapid adaptive evolution in humans along with population growth. Adaptive mutations rarely appeared during the earlier Pleistocene, meaning that many possible changes were mutation-limited. If certain kinds of regulatory changes were very easily evolvable, then we might expect to see a different pattern of recent evolution.

    Still, most folks don't think very much about pleiotropy as a constraint on human evolutionary change. The "one gene, one trait" model is really universal out there. Certainly, if you ask people, they'll give you the textbook answer -- genes have more than one function; phenotypes are influenced by many genes. But I can't tell you how many times I've heard people refer to "skin color genes" as if they did nothing else.

    References:

    Carroll SB. 2008. Evo-Devo and an Expanding Evolutionary Synthesis: A Genetic Theory of Morphological Evolution. Cell 134:25-36. doi:10.1016/j.cell.2008.06.030

    Coyne JA, Hoekstra HE. 2007. Evolution of protein expression: New genes for a new diet. Curr Biol 17:R1014-R1016. doi:10.1016/j.cub.2007.10.009

    Hoekstra HE, Coyne JA. 2007. The locus of evolution: Evo devo and the genetics of adaptation. Evolution 61:995-1016. doi:10.1111/j.1558-5646.2007.00105.x

    Kauffman SA. 1993. The Origins of Order: Self-Organization and Selection in Evolution. Oxford University Press, New York.

    Kudaravalli S, Veyrieras J-B, Stranger BE, Dermitzakis ET, Pritchard JK. 2009. Gene expression levels are a target of recent natural selection in the human genome. Mol Biol Evol 26:649-658. doi:10.1093/molbev/msn289

  • Quote: Fisher on the limits of diffusion

    Thu, 2009-09-17 22:16 -- John Hawks

    R. A. Fisher and Sewall Wright introduced diffusion approximation methods into genetics; Fisher (1937) was the first to consider spatial dispersal using a reaction-diffusion model. I found this quote a useful expression of his acknowledgment of the limits of the model:

    The use of the analogy of physical diffusion will only be satisfactory when the distances of dispersion in a single generation are small compared with the length of the wave. In reality diffusion is a complex process, compounded often of the diffusion of gametes, and that of larvae, in addition to adult forms; a more exact treatment than that supplied by a simple coefficient would involve the interaction of these components, and the stages at which the selective advantage was enjoyed. So far as it is applicable, the analogy of physical diffusion, therefore, greatly simplifies the problem (355-356).

    The paper has no references.
