Rare Variants of the Almost Entirely Universal Genetic Code are Evidence of Evolution, Not Design

by Ardea Skybreak

Revolutionary Worker #1216, October 19, 2003, posted at rwor.org

Nearly 100% of all the living plant and animal species--from complex mammals to the simplest bacteria--use the exact same genetic code (the exact same set of chemical instructions) to direct the assembly of the many different kinds of protein molecules that living bodies need in order to function. The fact that this genetic code--the chemical "rule book" for making proteins--is exactly the same in essentially all organisms (with only a few very minor exceptions) is itself very strong evidence that all living species are related to each other and evolved out of a long series of common (shared) ancestors. But Creationists like to slide over and dismiss the fact that the genetic code is so nearly universal and instead make a big fuss out of the fact that there are a small number of minor exceptions to this general rule (a few primitive organisms have been found to use a very slightly different genetic code to assemble proteins), as if these few exceptions could somehow prove that the different species aren't all related through lines of common descent, and that a divine power, rather than natural processes of evolution, must have made organisms the way they are. The truth is there is absolutely no scientific basis to support this Creationist reasoning. Let's look at this a bit:

Many people know that genes found in DNA are responsible for the inheritable features that can be passed on from generation to generation. But many people don't realize that in the course of an organism's life, the only thing genes really do is provide a set of chemical instructions which serve as a blueprint for the assembly of the many different kinds of protein molecules found in living bodies. You can think of a gene as a section of a DNA molecule which "codes" (provides chemical instructions) for the assembly of a particular kind of protein molecule; different genes "code" for different kinds of proteins. Each protein molecule is made up of chemical components known as amino acids. These amino acids are arranged in different sequences (they are lined up differently) in different kinds of proteins. There are only 20 different kinds of amino acids, but that's enough to produce an enormous number of different kinds of proteins. Think of how different colored beads can be strung on a necklace in many different ways (in a different order) to produce different kinds of necklaces. In a similar way the specific order in which amino acids are strung together produces many different kinds of proteins, which then go on to perform different kinds of functions.
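To get a feel for how only 20 kinds of amino acids can yield "an enormous number" of proteins, here is a small arithmetic sketch. The function name and the chain lengths used are illustrative assumptions, not figures from this article; the only number taken from the text is the 20 kinds of amino acids.

```python
# Number of amino acid kinds, as stated in the article.
AMINO_ACID_KINDS = 20


def possible_chains(length: int) -> int:
    """Count the distinct orderings for a chain of the given length.

    Each position can hold any of the 20 amino acids, so the count
    is 20 multiplied by itself `length` times.
    """
    return AMINO_ACID_KINDS ** length


if __name__ == "__main__":
    # Chain lengths below are arbitrary examples chosen for illustration.
    for length in (2, 3, 10):
        print(length, possible_chains(length))
```

Even a chain only ten amino acids long can be arranged in more than ten trillion ways, and real proteins are generally far longer than that.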

This basic process of protein assembly takes place inside the cells of every living organism. But the big question is: what is it that makes it possible for the cells to assemble the different amino acids in just the right order to produce the different kinds of proteins? Why don't the different amino acids just get strung together in any old random jumble? That's where the genetic code comes in.

The genetic code is the mechanism through which chemical information contained in a gene (a section of a DNA molecule) is "read" and "translated" into the set of instructions which cells use to string the amino acids together into one particular order rather than another. Think of the genetic code as the "rule book" for protein assembly. And, once again, this genetic code rule book is exactly the same in bacteria, or a rose, or a human being. If, as the Creationists believe, all the different species of plants and animals were actually unrelated to each other, and had come into being completely separately (instead of evolving out of long lines of shared ancestors) there would be no reason for them to be using the exact same chemical rule book for assembling proteins from the genetic blueprints found in DNA. The fact that all living species (including humans) do use that same exact rule book (with only those few very minor variations found in some micro-organisms) is itself amazingly strong evidence that all living organisms are, in fact, related to each other and descended from a long series of common ancestors, going all the way back to the first simple bacteria-like forms of life which emerged on this planet more than 3 billion years ago.

And what about those few exceptions? In recent decades biologists have discovered that a few simple species of organisms actually do use a very slightly different genetic code to translate the genetic instructions for assembling amino acids in the right order. These very small variations have so far been found only in a few free-living micro-organisms, and also in mitochondria and chloroplasts.*

The minor exceptions to the general rule that all organisms use exactly the same genetic code are certainly interesting, and studying these rare exceptions will no doubt increase our understanding of evolutionary processes and mechanisms. But none of this changes the basic fact that almost 100% of all living plant and animal species--from complex mammals to the simplest bacteria--still use exactly the same genetic code rule book for protein assembly.

And even those few slight variations in the genetic code make sense in terms of what we understand about evolution. To understand this better, let's look a bit more at the process of how the genetic code translates DNA information into proteins. How does the molecular information contained in one particular gene (section of DNA) end up leading to the assembly of one particular protein molecule, with its amino acids all lined up in the right order? The process starts with the DNA itself, which is made up of many different sequences of four nitrogen-bearing chemical compounds known as nucleotides (adenine, thymine, guanine, and cytosine, referred to by the letters A, T, G, C). In a double-stranded DNA molecule (the famous "double-helix"), these nucleotides "pair up" with each other in consistent face-to-face "complementary" ways (A lines up with T, and G lines up with C). In the day-to-day process of protein synthesis, the first step is that such a double-stranded DNA molecule unravels or "opens up," and a slightly different molecule (known as messenger RNA) comes over and lines up to one of the DNA strands, "pairing up" in a similar complementary (or "matching up") process (except that in RNA molecules uracil [U] replaces thymine so that, when RNA lines up to the DNA, A pairs with U, and G still pairs with C). The messenger RNA molecule then "carries" this complementary copy of the information contained on the DNA out of the cell nucleus and into the cell cytoplasm, taking it to cell structures known as ribosomes; there--thanks to other RNA molecules and yet another round of chemical "pairing up"--it gets translated into the instructions for assembling one or another protein molecule.
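The "pairing up" step just described can be sketched in a few lines of Python. This is a simplified illustration, not a model of the real cellular machinery: the function name and the sample DNA string are made up for the example, and the sketch ignores details such as the direction in which strands are read.

```python
# Complementary pairing between a DNA strand and the messenger RNA that
# lines up against it: A pairs with U (uracil replaces thymine in RNA),
# T pairs with A, and G and C pair with each other.
DNA_TO_RNA_PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(dna_strand: str) -> str:
    """Return the complementary messenger RNA for one DNA strand.

    Simplifying assumption: each DNA letter is matched one-for-one,
    left to right, with no attention to strand direction.
    """
    return "".join(DNA_TO_RNA_PAIRING[base] for base in dna_strand)


if __name__ == "__main__":
    # A made-up nine-letter stretch of DNA, used only for illustration.
    print(transcribe("TACAAAACC"))  # prints AUGUUUUGG
```

Each DNA letter is simply swapped for its partner, which is why the messenger RNA carries a faithful complementary copy of the gene's information.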

But let's go back for a minute to the DNA strand at the very beginning of the process: the nucleotides on the sections of DNA strands known as "genes" are organized in little sequences of threes called triplets or "codons" (written here, by convention, as they appear on the messenger RNA copy, which is why they contain U rather than T: for instance U-C-A, or A-U-G, and so on). Each triplet ends up "coding" for a particular amino acid: for instance, the triplet U-U-U codes for the amino acid phenylalanine, the triplet U-G-G codes for the amino acid tryptophan, the triplet G-A-U codes for the amino acid aspartate, etc. There are even triplets that simply send the signal that protein assembly should start (A-U-G), or that it should stop (U-G-A). In addition, it's interesting and relevant to note that while (in almost 100% of organisms) a specific triplet codes for just one amino acid, many amino acids are coded for by more than one triplet. For instance, the triplet sequences A-C-U, A-C-C, A-C-A, and A-C-G all code for the production of the amino acid threonine.
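The triplet "rule book" can be pictured as a simple lookup table. The sketch below includes only the handful of triplets named in this article (the full standard code has 64), and the function name and sample message are invented for illustration. One detail comes from standard biology rather than from the text above: the start triplet A-U-G also codes for the amino acid methionine.

```python
STOP = "STOP"

# A small subset of the standard genetic code, limited to the triplets
# mentioned in the article. The full table has 64 entries.
PARTIAL_CODE = {
    "AUG": "Met",  # start signal; also codes for methionine (standard biology)
    "UUU": "Phe",  # phenylalanine
    "UGG": "Trp",  # tryptophan
    "GAU": "Asp",  # aspartate
    "ACU": "Thr",  # four different triplets, one amino acid: threonine
    "ACC": "Thr",
    "ACA": "Thr",
    "ACG": "Thr",
    "UGA": STOP,   # stop signal
}


def translate(mrna: str) -> list[str]:
    """Read a messenger RNA three letters at a time until a stop triplet."""
    chain = []
    for i in range(0, len(mrna) - 2, 3):
        residue = PARTIAL_CODE[mrna[i:i + 3]]
        if residue == STOP:
            break
        chain.append(residue)
    return chain


if __name__ == "__main__":
    # Made-up message: start, phenylalanine, two threonine triplets, stop.
    print(translate("AUGUUUACUACGUGA"))  # ['Met', 'Phe', 'Thr', 'Thr']
```

Notice that the two different triplets A-C-U and A-C-G both come out as threonine, which is the redundancy described above.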

The overall sequence of different triplet "codons" (the order in which they are lined up) determines the order in which different amino acids end up being assembled, resulting in the different kinds of proteins specified at the start by the different genes on the DNA strands. In the course of this DNA-to-protein process, a number of different molecules (known as messenger RNA, transfer RNA, and ribosomal RNA) essentially serve as intermediaries, first "reading" the original triplet sequences present on the DNA strands, then "transporting" the corresponding complementary sequences to a different part of the cells (the ribosomes), then attaching to them yet another complementary RNA sequence, which grabs up the corresponding amino acids and strings them up in the order which forms a particular protein chain.

This is obviously a pretty complex multi-part process. For the purposes of this discussion it is not really necessary to understand this process in detail. But getting even a rough sense of what is involved in generating different kinds of proteins can help us to better appreciate that the few "variations in the nearly universal genetic code" that have been found to occur in some simple organisms like mycoplasmas (primitive bacteria-like organisms) really are very minor. They involve just a few changes in what just a few of the triplets normally code for. For instance, in almost all organisms, U-G-A triplets code for the "stop" signal which marks the end of a protein assembly process; but in some primitive mycoplasmas, U-G-A triplets code instead for the amino acid tryptophan.
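To see just how small such a variation is, the sketch below reads the same made-up message with two rule books that differ in a single entry. Both tables are deliberately tiny subsets, and the names used are invented for this example. Two details are drawn from standard biology rather than from this article: A-U-G also codes for methionine, and U-A-A is another stop triplet (included so that the variant reading still has a place to stop).

```python
STOP = "STOP"

# Tiny illustrative subset of the standard code.
STANDARD_SUBSET = {
    "AUG": "Met",  # start; also methionine (standard biology)
    "UGG": "Trp",  # tryptophan
    "GAU": "Asp",  # aspartate
    "UGA": STOP,   # stop in almost all organisms
    "UAA": STOP,   # a second stop triplet (standard biology)
}

# The mycoplasma variant described above: identical except that
# U-G-A codes for tryptophan instead of "stop".
MYCOPLASMA_SUBSET = {**STANDARD_SUBSET, "UGA": "Trp"}


def read_message(mrna: str, code: dict[str, str]) -> list[str]:
    """Translate triplet by triplet with the given rule book."""
    chain = []
    for i in range(0, len(mrna) - 2, 3):
        residue = code[mrna[i:i + 3]]
        if residue == STOP:
            break
        chain.append(residue)
    return chain


if __name__ == "__main__":
    message = "AUGUGGUGAGAUUAA"  # made-up example message
    print(read_message(message, STANDARD_SUBSET))    # ['Met', 'Trp']
    print(read_message(message, MYCOPLASMA_SUBSET))  # ['Met', 'Trp', 'Trp', 'Asp']
```

The two rule books agree on every triplet but one, so the difference shows up only where a U-G-A happens to occur in the message.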

Such minor variations don't change the fact that the identical genetic code shared by almost 100% of all plant and animal species (in which all the different triplets code for exactly the same amino acids) is extremely strong evidence that all species are derived from shared ancestors. In addition, molecular geneticists are finding that even those rare cases where there are slight variations in the genetic code can also be explained by the normal workings of natural evolution. Years ago many biologists used to think that the genetic code had remained completely unchanged in all organisms since the earliest origins of life on this planet, because any changes occurring in such basic molecular processes would, they thought, have so totally disrupted normal cellular functioning that any such mutations would likely have been quickly eliminated by natural selection. But today biologists understand that things going on at the molecular level are not quite as fixed and rigid as some people used to think. For instance, we now know that pieces of DNA called transposons frequently "jump" from one place to another on chromosomes, causing nearby genes to mutate. And sometimes a triplet "codon" on the DNA which normally "codes" for the production of just one particular amino acid, can undergo some changes and start coding for a different amino acid, but without causing a total collapse in normal cellular functioning. This is not mere speculation: such changes have actually been observed to occur in laboratory populations of living organisms.

So today most biologists agree that: 1) the variations of the universal genetic code which have been found in a handful of primitive species are not only rare, but also extremely minor, and that it is therefore still essentially correct to speak of a universal (or "nearly universal") genetic code--one that cuts across all species lines and testifies to their relatedness and descent from common ancestor species; 2) the discovery of those minor variations in the genetic code has not disrupted the overall evolutionary phylogenies ("family trees") which had been worked out previously through other means (this simply means that these minor variations are not significant enough to disrupt the basic patterns of relatedness and ancestor-descendant evolutionary sequences in different plant or animal lines which biologists have previously been able to reconstruct from both the fossil evidence and the molecular evidence); and 3) those rare and minor variants in the standard genetic code reveal that the genetic code itself can undergo at least a certain amount of evolutionary modification over time.

It is significant that those few variations in the basic genetic code seem to have had little or no effect on overall protein synthesis or other cellular functions. In the evolutionary past, specific mutations which happened to occur in DNA triplet codons would not have been directly subject to Darwinian natural selection in any case (since natural selection operates principally on the level of whole populations of reproducing individuals, each of which is the manifestation of a complex interplay between its total genetic make-up and its environment); but even where such small modifications may have had no effect one way or the other on the "reproductive fitness" of individual organisms who had these mutations, these small modifications could nevertheless have been passively transmitted through the generations as organisms reproduced and as their populations evolved.

And again, it is becoming increasingly clear that such mutations can occur without completely disrupting a cell's functioning (as many used to predict would happen, and as Creationists still try to argue). It seems this has to do at least in part with the fact that there is typically a lot of redundancy ("multiple copies" of things) in all natural systems, including at the molecular level. One of the implications of such redundancy is that a minor change could take place in a triplet sequence (one normally involved in coding for a particular amino acid in the assembly of a particular protein) without fundamentally disrupting the cell's synthesis of that original protein--simply because there would still be a lot of other copies of that codon which had not mutated, and which could therefore still go on coding for the original protein in the standard ways.

Understanding this helps us understand some of what's basically wrong with the thinking of "Intelligent Design." Creationists like Michael Behe argue that there is some kind of absolute "irreducible complexity" in complex multi-step molecular reactions which would have prevented significant evolutionary modifications in such processes from taking hold in the past because such changes would (according to Behe) have been too disruptive and damaging to overall cellular functioning. Behe's frankly narrow and mechanical reasoning (which underestimates the creative implications of much of the natural "slop" and redundancy of life) is what leads him and others to conclude that evolving life could never have given rise to such complex biochemical processes "all on its own" through just natural processes--and to conclude instead that the very existence of such complex sub-cellular machineries is evidence of conscious design by some kind of "intelligent designer."

It is certainly reasonable to describe the multi-step process of protein synthesis as "complex." And it is also true that if at a given point all the different parts of this process don't work together in just the right ways, a particular protein won't be properly formed, and this can cause problems. But that doesn't mean evolutionary modifications could never have occurred in that system (at one or more points in the past) without causing the system as a whole to collapse. As has been stressed many times in the course of this series, there are always some relative limits and constraints on how much or in what ways a particular feature of living organisms can change at any given point (simply because evolutionary modifications can only take place on the basis of natural variation that is already present in a living population and whose physical and chemical properties will necessarily channel and restrict what change is possible at that time); so it is right to recognize that there are always going to be some material limits on what variations are possible at a given point. But there is never some kind of absolute limit or constraint--one that would prevent any and all change at a given point--at any level of organization of matter.

So, while the basic process of protein synthesis does involve complex molecules and complex mechanisms (about which more is being learned every day), the discovery of those minor exceptions to the general rule that all organisms share the same universal genetic code actually shows us that some changes can in fact take place even in some of the fundamental underlying processes that are basic to the functioning of all life forms on this planet.

The fact that a randomly occurring mutation in a triplet codon can cause that codon to switch over to making a different amino acid without this causing the whole system of protein synthesis to collapse (especially if un-mutated extra copies of the codon continue to perform their original functions) is something that has actually been observed in the lab. Despite even such direct evidence, the Creationists (including the Intelligent Design types) continue to stubbornly ignore much of what biologists now understand about how evolution can build new features of life out of pre-existing material--including at the molecular level--while still preserving and maintaining previous features and functioning.

(For readers interested in a more technical discussion of some of these subjects, see for instance the article "Variations in the Genetic Code: Evolutionary Explanations" by Finn and Jean Pond in the Sep-Oct 2002 issue of the newsletter of the National Center for Science Education [NCSE].)


*Mitochondria and chloroplasts are sub-cellular "organelles" which contain their own, separate, DNA; they are thought to have once been free-living microorganisms themselves, but now function exclusively as part of the energy-producing machinery of the animal and plant cells they are part of.
