DNA Basics

So, what is the difference between a chromosome, a gene, a protein and DNA? I mean where do they all fit in?

-A curious adult from California

December 16, 2008

With words like these being bandied about willy-nilly, it isn't surprising that you are a bit confused by all of this. Here is a breakdown of what each means and how they relate.

DNA is the chemical that chromosomes and genes are made of. DNA itself is made up of four simple chemical units that are abbreviated as A, G, C, and T. These letters are used to form three letter words that cells can read.
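To get a feel for how many three letter words four letters can make, here is a small Python sketch that simply lists them all. The names `LETTERS` and `codons` are just labels picked for this example.

```python
from itertools import product

LETTERS = "AGCT"  # the four DNA letters

# Every ordered combination of three letters: "AAA", "AAG", "AAC", ...
codons = ["".join(combo) for combo in product(LETTERS, repeat=3)]

print(len(codons))   # 4 * 4 * 4 possible words
print(codons[:5])    # the first few words in the list
```

Four choices for each of three positions gives 4 x 4 x 4 = 64 possible words.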

A chromosome is simply a very long piece of DNA that cells can easily copy. Humans usually have 23 pairs of chromosomes. They are numbered 1-22 with the 23rd pair being either XX in girls or XY in boys.

A gene is a stretch of DNA on a chromosome that has the instructions for making a protein. Each chromosome has many genes with humans having over 22,000 genes in all. A gene's instructions for a protein are written in the three letter code I referred to before.

A protein is a molecular machine that does a specific job. Some proteins like amylase help us digest food. Others like opsins help us see colors. And still others like the globins help our blood take oxygen to and carbon dioxide away from our cells.

Beta Globin (Hemoglobin)

Let's look at the HBB gene and the beta globin protein as an example to make this all more concrete. Beta globin is one of the two kinds of protein chain that make up hemoglobin. From here on out, I will keep things simple and just call it hemoglobin.

The hemoglobin gene, HBB, is found on chromosome 11. This chromosome is a little less than 135 million DNA letters long and HBB is just one of its over 1500 different genes. Chromosome 11 is definitely one of the most gene-rich chromosomes we have.

Near one end of this chromosome are 1600 or so DNA letters that together make up the HBB gene. The actual instructions for making hemoglobin are found in 444 letters within this 1600 letter stretch. (The other letters are used to figure out when, where, and how much of the protein to make.)

A quick summary up to this point. Human DNA consists of around 3 billion DNA letters. Around 4.5% of this is chromosome 11. And around 0.000015% of human DNA has the instructions for hemoglobin.
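Those percentages are easy to check. The sketch below redoes the arithmetic with the round numbers used in this answer (3 billion letters in all, 135 million for chromosome 11, 444 for the hemoglobin instructions). The variable names are made up for the example.

```python
GENOME_LETTERS = 3_000_000_000   # approximate size of human DNA
CHR11_LETTERS = 135_000_000      # approximate size of chromosome 11
HBB_CODING_LETTERS = 444         # letters that spell out the protein

# Each share expressed as a percentage of all human DNA
chr11_share = 100 * CHR11_LETTERS / GENOME_LETTERS
hbb_share = 100 * HBB_CODING_LETTERS / GENOME_LETTERS

print(f"Chromosome 11: {chr11_share:.1f}%")
print(f"Hemoglobin instructions: {hbb_share:.6f}%")
```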

The instructions for hemoglobin are written in the three letter words (or codons) I talked about earlier. To understand how these codons work, I need to give a brief description of what a protein is.

Proteins are strings of amino acids stuck together. The number and particular order of the 20 different amino acids determine what a protein can do. Proteins can range anywhere from 34,350 amino acids in the muscle protein titin to fewer than 100 amino acids in proteins like insulin.

As I said, the instructions for putting a protein together are found in the codons of a gene. Each codon tells the cell which amino acid to add to the protein.

For example, the 444 DNA letters of the HBB gene that tell the cell how to make hemoglobin start out like this:


To make things easier, we'll split this up into three letter words:


Each of these codons tells a cell which amino acid to add. For example, an ATG tells the cell to add a methionine (or Met). So, like most other proteins, hemoglobin starts out with Met. Next comes a valine (Val), then a histidine (His), then a leucine (Leu), etc. When we keep adding amino acids we end up with the following:


This goes on for another 141 amino acids and you have hemoglobin.
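To see the codon idea in action, here is a minimal Python sketch that splits a stretch of DNA into three letter words and looks each one up. The 24 letters in `HBB_START` are the opening of the HBB instructions as they are commonly quoted, so treat them as an assumption rather than something taken from this page. The table `CODON_TABLE` is deliberately partial (it holds only the seven codons this example needs), and the names `split_codons` and `translate` are made up for illustration. Note that the third codon gives a histidine (His), which sits between the valine and the leucine.

```python
# Partial codon table: only the codons that appear in this example.
CODON_TABLE = {
    "ATG": "Met", "GTG": "Val", "CAT": "His", "CTG": "Leu",
    "ACT": "Thr", "CCT": "Pro", "GAG": "Glu",
}

def split_codons(dna):
    """Cut a DNA string into three letter words, ignoring any leftover letters."""
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]

def translate(dna):
    """Look up the amino acid for each codon and join them with dashes."""
    return "-".join(CODON_TABLE[codon] for codon in split_codons(dna))

# Assumed opening of the HBB coding sequence (first 8 codons)
HBB_START = "ATGGTGCATCTGACTCCTGAGGAG"

print(" ".join(split_codons(HBB_START)))
print(translate(HBB_START))
```

A full table would list all 64 codons, including the three that tell the cell to stop.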

Hemoglobin is a critical protein in our blood. It carries oxygen to and carbon dioxide away from cells. We need the amino acids to be put together just right or we can end up with a disease like sickle cell anemia.

Sickle cell anemia is a disease that affects around 72,000 people in the U.S. Most of these folks have ancestors who came from Africa.

People with sickle cell anemia have a single difference in the 444 letters of their hemoglobin gene instructions. Their gene starts out with:


This means that their hemoglobin protein starts out with Met-Val-His-Leu-Thr-Pro-Val-Glu... instead of Met-Val-His-Leu-Thr-Pro-Glu-Glu... This one difference is enough to cause sickle cell anemia! In other words, a difference of 1 out of 3 billion letters can cause the problems of sickle cell anemia. Similar small changes in other genes can cause problems like cystic fibrosis, dwarfism, etc.
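Here is a rough Python sketch of that single letter change. Both sequences are the commonly quoted openings of the normal and sickle versions of the HBB instructions, so treat them as assumptions for illustration. The names `NORMAL_START`, `SICKLE_START`, `differences`, and `translate` are invented for this example, and the codon table is again only partial.

```python
# Partial codon table: only the codons that appear in this example.
CODON_TABLE = {
    "ATG": "Met", "GTG": "Val", "CAT": "His", "CTG": "Leu",
    "ACT": "Thr", "CCT": "Pro", "GAG": "Glu",
}

def translate(dna):
    """Read the DNA three letters at a time and name each amino acid."""
    codons = [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]
    return "-".join(CODON_TABLE[codon] for codon in codons)

# Assumed first 24 letters of each version of the gene's instructions
NORMAL_START = "ATGGTGCATCTGACTCCTGAGGAG"
SICKLE_START = "ATGGTGCATCTGACTCCTGTGGAG"

# Compare letter by letter and record (position, normal letter, sickle letter)
differences = [
    (position, normal, sickle)
    for position, (normal, sickle) in enumerate(zip(NORMAL_START, SICKLE_START), start=1)
    if normal != sickle
]

print(differences)
print(translate(NORMAL_START))
print(translate(SICKLE_START))
```

The only mismatch is an A that has become a T, which turns one GAG (Glu) codon into GTG (Val).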

So there you have it. Genes and chromosomes are made up of the 4 letters of DNA. And cells read the letters found in stretches of DNA called genes to make proteins that help our bodies and minds run properly.


By Dr. Barry Starr, Stanford University

The genetic alphabet has 4 letters. The genetic language is made of 64 three letter words.

One DNA letter difference causes these cells to sickle.