© 2016 Shmoop University, Inc. All rights reserved.
Gene to Protein

Gene to Protein

Cracking the Code: Translation

RNA contains only 4 different nucleotides. A protein can contain up to 20 different amino acids. You don't need to be a math genius to realize the problem here. How does something with only four letters encode for something with 20 different subunits? The answer to this question actually wasn't worked out until the 1960s, and it is called the genetic code.

The discovery of the genetic code has some cool history. In fact, it involves brotherhoods, Russians, and code names for all its members. The brotherhood, called the "RNA Tie Club," was founded by the Russian physicist George Gamov. The club was a group of prominent scientists, each of whose members was given a name based on one of the 20 amino acids. The club met a couple times a year to think about how the genetic code might work. Who said scientists couldn't be cool?

Who finally figured out the genetic code? Was it one of the eight Tie Club members who went on to win the Nobel Prize? Nope. It was two "American underdogs" (not members of the Tie Club), Har Gobind Khornana from the University of Wisconsin and Robert Holley from Cornell University. Har Gobind Khorana discovered a way to produce chains of nucleic acids, while Robert Holley figured out the exact structure of a tRNA. The tRNA molecule is the adaptor molecule that connects the protein-encoding mRNA to an amino acid. After these discoveries, the code fell into place.

The Genetic Code Demystified

The genetic code describes how the nucleotide sequence of a gene is translated into an amino acid sequence. RNA acts as a middleman messenger. According to the genetic code, three RNA nucleotides (think of them as a three letter word) code for a single amino acid. This three-letter code is called a codon. And like everything DNA/RNA related, we read the strand starting at the 5ˊ end.

With all those nucleic acids hanging about in the chain, how does translation know where to start? Start codons, of course. The AUG codon signals the beginning of translation, and also codes for the amino acid methionine. There's also stop codons that signal where translation should end. And between the start and stop codons, there's a bunch of other codons that together make up the genetic code. Altogether this genetic code explains how protein expression is possible.

Warning: Don't let the stop codons trick you. They don't actually code for the insertion of an amino acid. All they do is what their names suggest…stop translation. Finally, biology is straightforward, right?

For the most part, the genetic code is universal among organisms, but there are a few slight changes in certain systems. For example, mitochondria and chloroplasts have their own slightly modified code.

Mitochondria and chloroplasts are thought to draw their roots from a prokaryotic ancestor. As a result, they produce their own proteins in a way that is reminiscent of their evolutionary heritage, and different from the cell where they reside. This special code produces many unique proteins that aren't found in other parts of the cell.

Now finally, the table you've been waiting for…the genetic code. Okay, maybe this table isn't "I won the lottery" exciting, but it will make your biology class a whole lot easier.


The genetic code. Please, hold your applause.

There it is, in all its splendor. The genetic code. Each three letter codon encodes for a different amino acid. And some amino acids are encoded by multiple codons.

The Reading Frame

You might be thinking, "Well, that's all fine and good, but how does translation know when to start defining the codons? All it takes is a shift in a single nucleotide, and we've got a whole new set of words." Of course, our cells have that all figured out.

The location where you start reading the RNA sequence is called the reading frame. A protein is said to have three reading frames.

Consider the RNA sequence GUGACUGAUUGU.

By using the table above, you can see which amino acid is needed to make the protein. Let's start by saying that GUG is the first codon. If that's our reading frame, then we'd divide the codons like this:

GUG ACU GAU UGU = valine, threonine, aspartic acid, and cysteine

Now let's move the reading frame over by one base (in other words, skip the first G). Now we have:

UGA CUG AUU GU_ = stop, leucine, isoleucine, and two bases looking for a third

If we move two bases from the beginning, and start with the second G, we get:

GAC UGA UUG U = aspartic acid, stop, leucine, and one lonely amino acid

So you can see that it's important to start translation at the right place. Moving the start site over by even a single base completely changes the amino acid sequence coded for by the RNA. More importantly, in this instance, it actually results in the insertion of a stop codon. When a stop codon is reached in the translation sequence, no further amino acids are added to the chain. This can be disastrous to the cell, because it results in a protein that's prematurely truncated (or shortened). Changes like this one are called nonsense mutations.

Brain Snack

Check out the Nobel Prize website for even more tales of adventure in uncharted intellectual territory.

People who Shmooped this also Shmooped...