Genetics is the study of the organization, expression and transfer of heritable information. Genetics is a central pillar of modern biology. Large areas of study within biology don't make sense except in the context of a firm understanding of genetics. It is a diverse, multidisciplinary field that informs and is informed by other areas of scientific inquiry, including biochemistry, ecology, evolution, geoscience and medicine.

Below is a list of interactive genetics illustrations:

Molecular Genetics

Molecular genetics is the field within genetics that deals with the molecular bases of inheritance. At the center of the field of study is the structure and function of the nucleic acids DNA and RNA.

Nucleotides in DNA

The study of modern genetics depends on an understanding of the physical and chemical characteristics of DNA. Some of the most fundamental properties of DNA emerge from the features of its four basic building blocks, called nucleotides. Knowing the composition of nucleotides and the differences between the four nucleotides that make up DNA is central to understanding DNA’s role in living systems.

This illustration introduces nucleotides and the terminology used to describe them.

DNA is a nucleotide polymer, or polynucleotide. Each nucleotide contains three components:

  1. A five-carbon sugar
  2. A phosphate molecule
  3. A nitrogen-containing base.

The sugar's carbon atoms are numbered 1 to 5. The nitrogenous base attaches to carbon 1, and the phosphate group attaches to carbon 5. DNA polymers are strings of nucleotides. Cells build them from individual nucleotides by linking the phosphate of one nucleotide to the #3 carbon of another. The repeating pattern of phosphate, sugar, then phosphate again is commonly referred to as the backbone of the molecule.

The sugar in DNA is deoxyribose. Deoxyribose differs from ribose (found in RNA) in that the #2 carbon lacks a hydroxyl group (hence the prefix “Deoxy”). This missing hydroxyl group plays a role in the three-dimensional structure and chemical stability of DNA polymers.

Nucleotides in DNA contain four different nitrogenous bases: Thymine, Cytosine, Adenine, or Guanine. There are two groups of bases:

  • Pyrimidines: Cytosine and Thymine each have a single six-member ring.
  • Purines: Guanine and Adenine each have a double ring made up of a five-atom ring attached by one side to a six-atom ring.

The order of nucleotides along DNA polymers encodes the genetic information carried by DNA. DNA polymers can be tens of millions of nucleotides long. At these lengths, the four-letter nucleotide alphabet can encode nearly unlimited information.
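
The claim about information capacity can be made concrete with a little arithmetic. The following is a minimal sketch, assuming only that each position in a strand can hold any one of the four nucleotides; the function names are illustrative and not part of the original material.

```python
import math


def sequence_count(length: int) -> int:
    """Number of distinct DNA sequences of the given length (4 choices per position)."""
    return 4 ** length


def information_bits(length: int) -> float:
    """Upper bound, in bits, on the information a strand of this length can carry."""
    return length * math.log2(4)  # each nucleotide distinguishes 4 options = 2 bits


print(sequence_count(10))      # 1048576
print(information_bits(1000))  # 2000.0
```

Even a ten-nucleotide stretch has over a million possible sequences, and the count grows by a factor of four with every added nucleotide.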

Nucleosides are similar to nucleotides, except they do not contain a phosphate group. Without this phosphate group, they are unable to form chains.

Nucleotides in RNA

Ribonucleic acid, also called RNA, is the intermediary molecule used by organisms to translate the information in DNA into proteins. RNA is also required for DNA replication, regulates gene expression, and can function as an enzyme.

Like DNA, RNA is a polymer made up of chains of nucleotides. These nucleotides have three parts:

  1. A five-carbon ribose sugar
  2. A phosphate molecule
  3. One of four nitrogenous bases: adenine, guanine, cytosine, or uracil

RNA nucleotides form polymers of alternating ribose and phosphate units linked by a phosphodiester bridge between the #3 and #5 carbons of neighboring ribose molecules.

RNA nucleotides differ from DNA nucleotides by a hydroxyl group linked to the #2 carbon of the sugar. This hydroxyl group allows RNA polymers to assume a more diverse range of shapes than DNA polymers. The extra hydroxyl group also makes RNA polymers less stable than DNA polymers. The greater variety of shapes RNA polymers form is part of the reason RNA serves more functions than DNA.

Complementary Nucleotide Bases

DNA is the information molecule of the cell. DNA’s capacity to store and transmit heritable information depends on interactions between nucleotide bases and on the fact that some combinations of bases form stable links, while other combinations do not. Base pairs that form stable connections are called complementary bases.

Consistent pairings of complementary bases allow cells to make double-stranded DNA from a single-strand template, create messenger RNA from DNA, and synthesize proteins from individual amino acids by matching nucleotide bases on messenger RNA with their complementary bases on transfer RNA.

The polynucleotide chains that make up DNA and RNA form via covalent bonds between sugar and phosphate subunits of neighboring nucleotides along a chain. In addition to the strong covalent bonds that hold polynucleotide chains together, bases along a polynucleotide chain can form hydrogen bonds with bases on other chains (or with bases elsewhere on the same chain, as with secondary structure in RNA).

The formation of stable hydrogen bonds depends on the distance between two strands, the size of the bases and geometry of each base. Stable pairings occur between guanine and cytosine and between adenine and thymine (or adenine and uracil in RNA). Three hydrogen bonds form between guanine and cytosine. Two hydrogen bonds form between adenine and thymine or adenine and uracil.

Complementary pairs always involve one purine and one pyrimidine base. Pyrimidine-pyrimidine pairings do not occur because these relatively small molecules do not get close enough to form hydrogen bonds. Purine-purine links do not form because these bases are too large to fit in the space between the polynucleotide strands. Asymmetry in the structure of non-complementary purine-pyrimidine pairs causes some crowding and prevents stable bonds from forming.
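
The pairing rules described above can be expressed as a simple lookup table. This is a minimal sketch for DNA only; the function names are illustrative, and the complement is returned position by position without reversing the strand.

```python
# Watson-Crick pairs in DNA and the number of hydrogen bonds each base forms
# with its complementary partner (G-C: three bonds, A-T: two bonds).
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
HYDROGEN_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(DNA_COMPLEMENT[base] for base in strand.upper())


def hydrogen_bond_count(strand: str) -> int:
    """Total hydrogen bonds between a strand and its perfect complement."""
    return sum(HYDROGEN_BONDS[base] for base in strand.upper())


print(complement("GATTACA"))           # CTAATGT
print(hydrogen_bond_count("GATTACA"))  # 16
```

Every purine (A or G) in the input maps to a pyrimidine (T or C) in the output and vice versa, matching the one-purine, one-pyrimidine rule.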

DNA Polymerase

DNA polymerases are the enzymes that replicate DNA in living cells. They do this by adding individual nucleotides to the 3-prime hydroxyl group of a strand of DNA. The process uses a complementary, single strand of DNA as a template.

The energy required to drive the reaction comes from cutting high-energy phosphate bonds on the nucleotide triphosphates used as the source of the nucleotides needed in the reaction.

The illustration above highlights important aspects of the reaction.

DNA polymerases cannot start new strands of DNA from scratch. They can only synthesize double-stranded DNA from single-stranded DNA. The starting point is a stretch of single-stranded DNA that is double-stranded for at least part of its length. In the polymerase chain reaction, the double-stranded stretch is created by attaching short DNA primers. In living cells, RNA primers are used.

DNA polymerase uses the bases of the longer strand as a template. During strand elongation, two phosphates are cleaved from the incoming nucleotide triphosphate and the resulting nucleotide monophosphate is added to the DNA strand. This results in the:

  • Formation of a phosphodiester bond between the phosphate attached to the 5' carbon of the incoming nucleotide and the hydroxyl group on the trailing 3' carbon
  • Release of a pyrophosphate molecule
  • Extension of the DNA polymer by one nucleotide

Removing two phosphates from the incoming nucleotide and bonding the remaining phosphate to the oxygen on the 3' carbon of the existing strand maintains the repeating sugar-phosphate-sugar-phosphate pattern that makes up the backbone of each DNA polymer.

Orientation of the strand is important. Dependence on energy from the phosphates linked to the 5-prime carbon of the incoming nucleotides means that DNA polymerase can only extend DNA strands by adding nucleotides to the 3-prime end of a DNA strand.
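
The directionality described above can be illustrated with a toy model. This is a simplified sketch, not a model of the enzyme's chemistry: it assumes the template is written 3' to 5' and the growing strand 5' to 3', so that position i of one pairs with position i of the other. The function name and the example sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def extend_primer(template_3_to_5: str, primer_5_to_3: str) -> str:
    """Extend a primer along a template, adding one nucleotide at a time.

    New nucleotides are only ever appended to the 3' end of the growing
    strand, which is the right-hand end of the returned string.
    """
    if not primer_5_to_3:
        # Mirrors the text: the polymerase needs an existing 3' end to extend.
        raise ValueError("a primer is required; polymerase cannot start a strand")

    new_strand = list(primer_5_to_3)

    # The primer must already be base-paired with the template.
    for position, base in enumerate(new_strand):
        if COMPLEMENT[template_3_to_5[position]] != base:
            raise ValueError("primer is not complementary to the template")

    # Each loop iteration stands for one round of nucleotide addition.
    for template_base in template_3_to_5[len(new_strand):]:
        new_strand.append(COMPLEMENT[template_base])

    return "".join(new_strand)


print(extend_primer("TACGGATC", "ATG"))  # ATGCCTAG
```

Because the list is only ever appended to, the sketch cannot add bases at the 5' end, which is the same constraint the text describes for the real enzyme.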

Semiconservative Replication

During DNA replication, double-stranded DNA molecules separate, and the cellular machinery uses each strand as a template for the synthesis of a new strand, resulting in the formation of two identical copies of the original double-stranded molecule. Forming a new double strand from each strand of the original DNA molecule is called semiconservative replication.

The term "semiconservative" captures the idea that each round of DNA replication produces hybrid molecules, each of which contains one old strand and one newly synthesized strand.

The following illustration shows this process over two rounds of replication:

During each round of replication, the amount of DNA doubles. The original strands remain intact and end up in different daughter molecules.

The pattern of semiconservative DNA replication was proposed in a 1953 paper by Watson and Crick. They did not call it semiconservative, but their description captures the idea that the two original strands are used as templates to make new double strands:

"…our model for deoxyribonucleic acid is, in effect, a pair of templates, each of which is complementary to the other. We imagine that prior to duplication the hydrogen bonds are broken, and the two chains unwind and separate. Each chain then acts as a template for the formation onto itself of a new companion chain, so that eventually we shall have two pairs of chains, where we only had one before. Moreover, the sequence of the pairs of bases will have been duplicated exactly."

Source: Watson and Crick, 1953

Watson and Crick's paper proposed a mechanism, but provided no experimental evidence. The evidence for semiconservative replication was provided several years later by a 1958 paper by Matthew Meselson and Franklin W. Stahl.

Replication Fork

The replication fork is a region where a cell's DNA double helix has been unwound and separated to create an area where DNA polymerases and the other enzymes involved can use each strand as a template to synthesize a new double helix.

An enzyme called a helicase catalyzes strand separation. Once the strands are separated, a group of helper proteins (single-strand binding proteins) prevents the strands from coming back together.

DNA polymerase cannot create new polymers. The enzyme can only extend existing strands by adding new nucleotides to the 3'-hydroxyl end of an existing polymer. So before DNA polymerase can begin working, primase (a type of RNA polymerase) binds to each strand of DNA at the replication fork and synthesizes a short (3 to 10 base) strand of RNA. This short RNA polymer, called a primer, provides a strand end for DNA polymerase to add bases to.

Since DNA polymerases can only add nucleotides to the 3'-hydroxyl end of a nucleotide polymer, and the two strands of the original DNA helix are oriented in opposite directions, synthesis of new polymers has to proceed in opposite directions on each of the two template strands at the replication fork.

In one direction, DNA is replicated as one continuous strand. This is called the leading strand. The other strand is called the lagging strand.

On the lagging strand, the new strand's 3'-hydroxyl end points away from the replication fork. This forces the elongation process to occur in a discontinuous manner. As replication moves along the template strand, a series of shorter DNA polymers form. Each stretch is initiated with its own RNA primer.

The shorter lengths of double-stranded DNA formed along the lagging strand are called Okazaki fragments.

DNA polymerase III performs most of the synthesis activity, but when an Okazaki fragment extends to the point that it overlaps with the previous RNA primer, RNA nucleotides are removed and replaced by DNA. This requires DNA Polymerase I, which has exonuclease activity.

Once the RNA primer is completely replaced by DNA, the two DNA fragments are joined by a ligase enzyme.

Note: the process is quite complex and involves numerous enzymes and helper proteins. This discussion is limited to the major enzymes involved.

Meselson–Stahl Experiment

In their second paper on the structure of DNA, Watson and Crick described how DNA's structure suggests a pattern for replication:

"…prior to duplication the hydrogen bonds are broken, and the two chains unwind and separate. Each chain then acts as a template for the formation onto itself of a new companion chain, so that eventually we shall have two pairs of chains, where we only had one before." - Watson and Crick, 1953

This is called semiconservative replication.

Today we know that this is the pattern used by living cells, but the experimental evidence in support of semiconservative replication was not published until 1958. In the five years between Watson and Crick's suggestion and the definitive experiment, semiconservative replication was controversial and other patterns were considered.

Three hypothesized patterns were proposed:

  • Semiconservative replication - the original double strand of DNA separates and each strand acts as a template for the synthesis of a complementary strand.
  • Conservative replication - the original double strand of DNA remains intact and is used as a template to create a new double stranded molecule.
  • Dispersive replication - similar to conservative replication in that the original double strand is used as a template without being separated, but prior to cell division, the strands recombine such that each daughter cell gets a mix of new and old DNA. With each round of replication, the original DNA gets cut up and dispersed evenly between each copy.

Knowing what we know now about how DNA behaves, the fact that the dispersive pattern was popular seems odd, but this pattern was favored by several well-known, prominent scientists. These scientists did not like the semiconservative pattern. They thought the helical nature of the double-stranded DNA molecule would make it difficult for the strands to be unwound, separated and copied in the way needed for semiconservative replication to be possible. The dispersive pattern of cutting the helix once every rotation eliminated the need for unwinding the helix. Support for the dispersive hypothesis remained strong until proof of semiconservative replication was provided by Meselson and Stahl's 1958 paper.

The methods Meselson and Stahl developed allowed them to distinguish existing DNA from newly synthesized DNA and to track new and old DNA over several rounds of replication.

They accomplished this by labeling cells with different stable isotopes of nitrogen. First, a culture of bacterial cells was grown for several generations in a medium containing only 15N (a stable, heavy isotope of nitrogen). After this period of growth, all of the DNA in the cells contained 15N. These cells were then rinsed and put into a medium containing only the more common, lighter isotope of nitrogen (14N). As the cells grew and divided in this fresh medium, all newly synthesized DNA would contain only the lighter nitrogen isotope, while DNA from the original cells would still contain 15N. In the illustration, 15N-labeled DNA is shown in orange and 14N-labeled DNA in green.

The 15N- and 14N-labeled DNA was then tracked using high-speed centrifugation and a density gradient created with cesium chloride (CsCl).

During centrifugation in a CsCl gradient, DNA accumulates in bands along the gradient based on its density. Since 15N is heavier than 14N, 15N-enriched DNA is denser and accumulates lower down in the centrifuge tube than 14N DNA. DNA containing a mixture of 15N and 14N ends up in an intermediate position between the two extremes.

By spinning DNA extracted at different times during the experiment, Meselson and Stahl were able to see how new and old DNA interacted during each round of replication.

The beauty of this experiment was that it allowed them to distinguish between the three different hypothesized replication patterns. The key result occurs at the second generation when all three proposed replication patterns give different results in the CsCl gradient.
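
The predictions of the three hypotheses can be compared with a small bookkeeping model. This is an illustrative sketch, not a reconstruction of Meselson and Stahl's analysis. It assumes each double-stranded molecule is a pair of strands, each strand is described by the fraction of its nitrogen that is 15N, and the band position depends only on the average of the two strands. All function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction

LIGHT = Fraction(0)  # strand made entirely with 14N
HEAVY = Fraction(1)  # strand made entirely with 15N


def semiconservative(duplexes):
    """Each old strand is kept intact and paired with a new, light strand."""
    return [(strand, LIGHT) for duplex in duplexes for strand in duplex]


def conservative(duplexes):
    """Each old duplex stays intact; an all-new light duplex is made beside it."""
    return [d for duplex in duplexes for d in (duplex, (LIGHT, LIGHT))]


def dispersive(duplexes):
    """Old material is spread evenly over all four strands of the two daughters."""
    daughters = []
    for a, b in duplexes:
        share = (a + b) / 4
        daughters += [(share, share), (share, share)]
    return daughters


def band_pattern(model, generations):
    """Map each band (fraction of 15N in the duplex) to the number of duplexes in it."""
    duplexes = [(HEAVY, HEAVY)]  # generation 0: fully 15N-labeled DNA
    for _ in range(generations):
        duplexes = model(duplexes)
    return dict(Counter((a + b) / 2 for a, b in duplexes))


for model in (semiconservative, conservative, dispersive):
    bands = band_pattern(model, 2)
    summary = ", ".join(f"{count} at {fraction} heavy" for fraction, count in bands.items())
    print(f"{model.__name__}: {summary}")
```

Under these assumptions, after two generations the semiconservative model gives two bands (half-heavy and fully light), the conservative model gives two bands (fully heavy and fully light), and the dispersive model gives a single band at one-quarter heavy. After only one generation the semiconservative and dispersive models both give a single half-heavy band, which is why the second generation is decisive.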

Meselson and Stahl's results matched the pattern predicted by the semiconservative hypothesis, providing the definitive experimental evidence in support of the process proposed by Watson and Crick.

Restriction Enzymes

Restriction enzymes cut DNA at specific sites based on the sequence of bases along the strand at the cut site. These enzymes were first identified and studied in strains of the bacterium E. coli in the 1950s and 1960s. The term restriction was used to describe them because their activity restricted the growth of viruses that infect E. coli.

Restriction enzymes are nucleases: enzymes that cut nucleic acid polymers (i.e. DNA and RNA). There are two types of nuclease: endonuclease and exonuclease. Endonucleases make cuts within a DNA polymer. Exonucleases remove individual nucleotides from the end of a strand. Restriction enzymes are a type of endonuclease. They cut at specific sites in the middle of DNA strands.

The ability of these enzymes to cut DNA at specific sites provides bacteria with a type of immune system that cuts up and, therefore, deactivates foreign DNA such as that introduced by viruses. To be effective, a bacterium's restriction enzymes must not cut its own genome. Bacteria typically protect their own DNA by chemically modifying (methylating) the sequences their restriction enzymes recognize. The specificity of restriction enzymes has made them useful tools in many molecular biology procedures and techniques.

An important aspect of many restriction recognition sites is that they are palindromic. This means the short recognition sequence reads the same in the 5' to 3' direction on both strands, so the enzyme cuts both strands of a double-stranded DNA molecule.
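
In this sense, a sequence is palindromic when it equals its own reverse complement. The following minimal sketch checks that property. The function names are illustrative, CCCGGG is the SmaI/XmaI site discussed in this section, and GAATTC (the EcoRI site) is included as an additional well-known example that is not otherwise covered here.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(sequence: str) -> str:
    """Sequence of the opposite strand, read in its own 5' to 3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence.upper()))


def is_palindromic(site: str) -> bool:
    """True if the site reads the same 5' to 3' on both strands."""
    return site.upper() == reverse_complement(site)


print(is_palindromic("CCCGGG"))  # True
print(is_palindromic("GAATTC"))  # True
print(is_palindromic("GATTAC"))  # False
```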

There are several ways to classify restriction enzymes. The most obvious way is in terms of the base pattern each enzyme recognizes. Many of the recognition sequences used in molecular biology are six bases long, but recognition sequence patterns and lengths vary from enzyme to enzyme. The most widely used enzymes require a perfect match to cut, but others allow for some variation.

Another useful classification system for restriction enzymes is the position of the cut. Many restriction enzymes do not cut in the center of their recognition sequence, resulting in overhanging or ‘sticky’ ends. Others cut at the midpoint of the recognition sequence, leaving no overhang. These are referred to as blunt-end cuts.

An example of an enzyme that leaves blunt ends is SmaI. SmaI recognizes the six-base sequence CCCGGG and cuts both strands at its midpoint (CCC|GGG), so the resulting ends are flush, with no unpaired bases.

Contrast this with the enzyme XmaI, which recognizes the same six-base pattern, CCCGGG, but cuts each strand after the first C (C|CCGGG). This leaves overhangs on either side of the cut and produces two new, matching strand ends, each with a four-base 5'-CCGG overhang.

The predictable way in which restriction enzymes cut DNA at specific locations makes them extremely useful for molecular biology. Knowing the recognition site of a particular restriction enzyme and the sequence of a DNA strand makes it possible to predict the number of cuts and the position of the cuts that that enzyme will make on that length of DNA. Cutting a set of strands with a panel of restriction enzymes and running them out on a gel is the basis for the comparative technique known as restriction mapping.
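
Predicting cuts from a known recognition site can be sketched in a few lines. This is a simplified illustration that assumes a linear molecule, looks only at the top strand, and requires a perfect match to the site. The function names and the example sequence are hypothetical. The cut offsets are the number of site bases to the left of the cut: 3 for SmaI (CCC|GGG) and 1 for XmaI (C|CCGGG).

```python
def cut_positions(sequence: str, site: str, cut_offset: int) -> list[int]:
    """Top-strand positions at which the enzyme cuts (0 = start of the sequence)."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)  # keep scanning for more sites
    return positions


def fragment_lengths(sequence: str, site: str, cut_offset: int) -> list[int]:
    """Lengths of the top-strand fragments produced by a complete digest."""
    edges = [0] + cut_positions(sequence, site, cut_offset) + [len(sequence)]
    return [end - begin for begin, end in zip(edges, edges[1:])]


dna = "AATTCCCGGGTTTTAACCCGGGAA"  # made-up sequence containing two CCCGGG sites

print(cut_positions(dna, "CCCGGG", 3))     # SmaI-style cut: [7, 19]
print(fragment_lengths(dna, "CCCGGG", 3))  # [7, 12, 5]
print(fragment_lengths(dna, "CCCGGG", 1))  # XmaI-style cut: [5, 12, 7]
```

Both enzymes find the same two sites and produce the same number of fragments, but the fragment boundaries shift because the cut falls at a different position within the site.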

Cutting two sequences with a restriction enzyme and then gluing them back together with a ligase is a common and relatively easy way to make hybrid DNA sequences in the lab.

Genetics of Organisms

The genetics of organisms deals with the physical expression of genes at the organismal level and the organization and transfer of genetic material as it passes from generation to generation during reproduction.

Alleles, Genotype and Phenotype

Genetics is the study of the organization, expression, and transfer of heritable information. The ability for information to pass from generation to generation requires a mechanism. Living organisms use DNA. DNA is a chain, or polymer, of nucleotides. Individual polymers of DNA can contain hundreds of millions of nucleotides. These long DNA strands are called chromosomes. The order of the individual nucleotides along the chain contains information organisms use for growth and reproduction. The use of DNA as the information molecule is a universal property of all life on Earth. Our cellular machinery reads this genetic information, allowing our bodies to synthesize the many enzymes and proteins required for life.

The illustration explores the relationship between the presence of different alleles at a specific locus and an organism's genotype and phenotype. The organism in the model is a plant. It is diploid, and the trait is flower color. Below is a YouTube video demonstrating the use of the illustration and a problem set you can use to test your understanding of these concepts.

Genetic information is carried in discrete units called genes. Each gene contains the information required to synthesize individual cellular components needed for survival. The coordinated expression of many different genes is responsible for an organism's growth and activity.

Within an individual species, genes occur in set locations on chromosomes. This allows their locations to be mapped. The position of a specific gene on a chromosome is called its locus.

Variations in the order of nucleotides in a DNA molecule allow genes to encode enough information to synthesize the huge diversity of different proteins and enzymes needed for life. In addition to differences between genes, the arrangement of nucleotides can differ between copies of the same gene. This results in different forms of individual genes. Different forms of a gene are called alleles.

Organisms that reproduce sexually receive one complete copy of their genetic material from each parent. Having two complete copies of their genetic material makes them diploid. Matching chromosomes from each parent are called homologous chromosomes. Matching genes from each parent occur at the same location on homologous chromosomes.

A diploid organism can either have two copies of the same allele or one copy each of two different alleles. Individuals who have two copies of the same allele are said to be homozygous at that locus. Individuals who receive different alleles from each parent are said to be heterozygous at that locus. The set of alleles an individual has at a locus is called its genotype. The genotype of an organism is often expressed using letters. The visible expression of the genotype is called an organism's phenotype.

Alleles are not created equal. Some alleles mask the presence of others. Alleles that are masked by others are called recessive alleles. Recessive alleles are only expressed when an organism is homozygous at that locus. Alleles that are expressed regardless of the presence of other alleles are called dominant.

If one allele completely masks the presence of another at the same locus, that allele is said to exhibit complete dominance. However, dominance is not always complete. In cases of incomplete dominance, intermediate phenotypes are possible.
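
The relationship between genotype, zygosity and phenotype can be sketched as a pair of small functions. This is an illustration for a single locus with two alleles, using the common convention that an uppercase letter marks the dominant allele. The flower colors and function names are assumptions chosen for the example, not values taken from the illustration.

```python
def zygosity(genotype: str) -> str:
    """Classify a two-allele genotype such as 'AA', 'Aa' or 'aa'."""
    first, second = genotype
    return "homozygous" if first == second else "heterozygous"


def phenotype(genotype, dominant="purple", recessive="white", intermediate=None):
    """Phenotype for one locus.

    With complete dominance (intermediate=None), one dominant allele is enough
    to show the dominant phenotype. With incomplete dominance, heterozygotes
    show the intermediate phenotype instead.
    """
    dominant_alleles = sum(allele.isupper() for allele in genotype)
    if dominant_alleles == 0:
        return recessive
    if dominant_alleles == 1 and intermediate is not None:
        return intermediate
    return dominant


for g in ("AA", "Aa", "aa"):
    print(g, zygosity(g), phenotype(g))

# Incomplete dominance: the heterozygote is neither red nor white.
print(phenotype("Rr", dominant="red", recessive="white", intermediate="pink"))  # pink
```

With complete dominance, the genotypes AA and Aa give the same phenotype, which is why a phenotype alone does not always reveal the genotype.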

Gene interactions can be quite complicated. The example above demonstrates a simple situation in which a single gene corresponds to an individual trait. In more complicated cases, multiple genes can influence an individual trait. This is called polygenic inheritance. In these situations, the relationship between specific alleles and characteristics is not as straightforward.

In his famous pea plant studies, Mendel studied seven traits that have the characteristics needed to allow the observation of inheritance of discrete traits. The traits he studied were seed shape, seed color, flower color, seed pod shape, seed pod color, flower position, and plant stature.

Among the significant contributions of Mendel's work was the understanding that information was passed from one generation to the next in discrete units rather than through blending.

Punnett Square

During sexual reproduction, a parent is equally likely to pass on to its offspring either of the two alleles it has at each genetic locus. This makes it possible to list and estimate the probability of specific genotypes being produced from the pairing of two individuals. Given two alleles from each parent, four allele combinations are possible. These combinations and their probabilities can be readily visualized using a Punnett square.

To set up a single locus Punnett Square, the genotype of each parent is placed on the sides of a four chambered box. One parent’s alleles are placed across the top. The alleles of the other parent are placed down one side. The alleles on the edges guide how the central squares are filled in. Once complete, a Punnett square shows the genotypes possible from crossing two individuals. Each of the four boxes in the square contains one of the four possible genotypes. The genotype in each box has a 25% probability of occurring every time the two individuals are crossed. If two boxes contain the same genotype, the probability of that genotype occurring doubles to 50%.
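
The procedure above translates directly into code. This is a minimal sketch for a single locus, assuming each parent's genotype is written as a two-letter string such as "Aa". The function names are illustrative.

```python
from collections import Counter


def punnett_square(top_parent: str, side_parent: str) -> list[list[str]]:
    """Fill in the four boxes: one row per side allele, one column per top allele."""
    return [
        # Sorting puts the uppercase allele first, so "aA" and "Aa" count as one genotype.
        ["".join(sorted(top + side)) for top in top_parent]
        for side in side_parent
    ]


def genotype_probabilities(top_parent: str, side_parent: str) -> dict[str, float]:
    """Each box is worth 25%; boxes with the same genotype add together."""
    boxes = [genotype for row in punnett_square(top_parent, side_parent) for genotype in row]
    return {genotype: count / len(boxes) for genotype, count in Counter(boxes).items()}


for row in punnett_square("Aa", "Aa"):
    print(row)

print(genotype_probabilities("Aa", "Aa"))  # {'AA': 0.25, 'Aa': 0.5, 'aa': 0.25}
```

For a cross between two heterozygotes, the two Aa boxes combine to give the 50% probability described above.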

Punnett squares are most commonly used to examine genotype probabilities from one genetic locus at a time. They can be used to look at more than one locus at a time, but some find the resulting diagrams complicated and difficult to interpret.

The model below illustrates the use of a Punnett Square to determine the possible genotypes that can arise from mating two individuals with known genotypes. The organism in the model is a plant. The plant is diploid. The trait is flower color. Below the illustration is a YouTube video demonstrating its use. There is also a problem set you can use to test your understanding of these concepts.

Genotype and Phenotype Probabilities

Patterns of genetic inheritance obey the laws of probability. In a monohybrid cross, where the alleles present in both parents are known, each box in a Punnett Square is equally likely to occur. Since there are four boxes in the square, every offspring produced has a one in four, or 25%, chance of having the genotype shown in any given box.

Like flipping a coin, previous matings do not influence the results of subsequent matings. Because of random variation, the actual number of each genotype produced over a series of matings (or crosses) between two individuals will differ slightly from the expected 25% per box.
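
The gap between expected and observed proportions can be explored with a short simulation. This is an illustrative sketch, assuming a single locus and that each parent passes on either of its two alleles with equal probability. The function name, the seed value and the sample sizes are arbitrary choices for the example.

```python
import random
from collections import Counter


def simulate_crosses(parent1: str, parent2: str, offspring: int, seed=None) -> Counter:
    """Count the genotypes produced by repeatedly crossing two individuals."""
    rng = random.Random(seed)  # a seed makes a run repeatable
    counts = Counter()
    for _ in range(offspring):
        # Each mating is independent: earlier offspring do not affect this one.
        genotype = "".join(sorted(rng.choice(parent1) + rng.choice(parent2)))
        counts[genotype] += 1
    return counts


for n in (20, 2000):
    counts = simulate_crosses("Aa", "Aa", n, seed=1)
    proportions = {g: round(c / n, 3) for g, c in sorted(counts.items())}
    print(n, proportions)
```

With a small number of offspring the observed proportions can stray noticeably from the expected 25% AA, 50% Aa and 25% aa. As the number of offspring grows, the proportions tend to settle closer to those values.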

The illustration above explores how the probabilities predicted by a monohybrid Punnett Square relate to the actual pattern of genotypes and phenotypes produced from repeatedly crossing two individuals.

X-Linked Inheritance

Chromosomes that both males and females possess in matched sets are called autosomes. The X and Y-chromosomes that determine the sex of an individual in mammals follow a different pattern and are called allosomes. The genes present on the X and Y-chromosomes are called sex-linked genes. Sex-linked genes on the X-chromosome are X-linked genes. Genes on the Y-chromosome are Y-linked.

Females have two X-chromosomes. Males have one X and one Y-chromosome.

With both an X and a Y-chromosome, males inherit both X and Y-linked traits, while females only inherit X-linked traits. Since males have only one copy of each sex chromosome, they are hemizygous for all sex-linked genes, and they always express the phenotype of the allele they get. In other words, their phenotypes always match their genotypes.

Females get two copies of X-linked genes, so in females these genes show the more typical dominant-recessive expression patterns of non-sex-linked traits.

These patterns cause expression patterns of sex-linked traits to differ between male and female offspring.

The X-chromosome is larger and contains more genes than the Y-chromosome, so most sex-linked traits are X-linked traits.

Wild-type fruit flies have dark red eyes, but there are recessive alleles of this eye color gene (called the white gene) that cause individuals to have white eyes. As a recessive trait, the white eye phenotype is masked by the presence of a wild-type (red encoding) allele. If the white gene were on an autosome, it would exhibit classical Mendelian inheritance patterns. However, the gene is on the X-chromosome, making it an excellent illustration of sex-linked inheritance patterns.

Since this particular gene that controls eye color is on the X-chromosome, females (XX) carry two copies, and males (XY) only carry one. In females, the presence of one dominant red encoding allele (XW) will produce red eyes even if the individual is heterozygous for the white allele. Females can be:

  • Homozygous dominant for the red encoding allele - genotype: XWXW; phenotype: red eyes.
  • Heterozygous - genotype XWXw; phenotype: red eyes.
  • Homozygous recessive with two white encoding alleles - genotype XwXw; phenotype white eyes.
Three allele combinations possible in females.

With only one copy of the X-chromosome, all males are hemizygous for this gene. They have only two options:

  • Hemizygous dominant - genotype: XWY; phenotype: red eyes
  • Hemizygous recessive - genotype: XwY; phenotype: white eyes.
Two allele combinations possible in males.

Observing the ratio of male and female red and white-eyed individuals produced with reciprocal crosses shows the difference between sex-linked and classic Mendelian inheritance patterns. Reciprocal crosses involve crossing true-breeding red and white-eyed individuals.

Two reciprocal crosses are possible: A) a true-breeding red-eyed female with a white-eyed male and B) a true-breeding white-eyed female with a red-eyed male.

Performing the first reciprocal cross, a true-breeding red-eyed female (homozygous dominant) with a true-breeding white-eyed male (hemizygous recessive), results in an F1 generation composed entirely of red-eyed individuals. 100% of the F1 generation having red eyes is consistent with what would be predicted based on Mendelian inheritance of a recessive allele. However, with an X-linked gene, the reason for red eyes differs between males and females.

All the female offspring are heterozygous, receiving an X-chromosome with a red allele from their mother and an X-chromosome with the white allele from their father. The presence of the red allele from the mother masks the white allele. Male offspring only have one X-chromosome, which they received from their female parent. Since the female parent is homozygous, whichever of her X-chromosomes the males get, they will receive a red-eye allele.

Females are red-eyed because the presence of the recessive copy is masked. Males are red-eyed because they only have one copy of the gene, and that copy is for the red allele.

The females’ phenotype and genotype are consistent with the patterns discovered by Mendel, but the males, as hemizygotes, are not.

The differences between the sexes become more apparent in a cross using the red-eyed F1 males and red-eyed F1 females. This cross produces a 3:1 ratio of red-eyed to white-eyed individuals, but all white-eyed individuals are male. No females have white eyes because they received one of their X-chromosomes from their hemizygous dominant, red-eyed father. The male offspring all received their single X-chromosome from the heterozygous female parent, so half received a red allele, and half received a white allele.

    First three generations of the first reciprocal cross.

    Inheritance patterns with the other reciprocal cross (homozygous recessive female with hemizygous dominant male) diverge from the Mendelian pattern more quickly. The F1 generation contains an equal proportion of white and red-eyed individuals, but all males have white eyes, and all females have red eyes.

    First three generations of the second reciprocal cross.

    Crossing these F1 individuals again results in a 1:1 ratio of red-eyed to white-eyed individuals, but in the F2 generation eye color is no longer split by sex: half the female offspring and half the male offspring have red eyes.

    In both reciprocal crosses, patterns of inheritance beyond the F2 generation vary depending on which F2 individuals are chosen for the cross.
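    The two reciprocal crosses described above can be checked by enumerating every equally likely pairing of egg and sperm. The following is a minimal sketch, not part of the original illustration. The labels "R" (red allele), "w" (white allele) and "Y" (Y chromosome), and the function names, are assumptions made for this example.

```python
from collections import Counter
from itertools import product

# Assumed notation: "R" is the dominant red-eye allele on an X chromosome,
# "w" is the recessive white-eye allele on an X chromosome, and "Y" is the
# Y chromosome, which carries no eye-color allele.

def phenotype(genotype):
    """Return (sex, eye color) for a pair of sex chromosomes."""
    sex = "male" if "Y" in genotype else "female"
    color = "red" if "R" in genotype else "white"
    return sex, color

def cross(mother, father):
    """Count offspring phenotypes over all equally likely gamete pairings."""
    return Counter(phenotype((egg, sperm)) for egg, sperm in product(mother, father))

# First reciprocal cross: red-eyed female (R R) x white-eyed male (w Y).
f1_first = cross(("R", "R"), ("w", "Y"))
# Its F1 x F1 cross: heterozygous female (R w) x red-eyed male (R Y).
f2_first = cross(("R", "w"), ("R", "Y"))

# Second reciprocal cross: white-eyed female (w w) x red-eyed male (R Y).
f1_second = cross(("w", "w"), ("R", "Y"))
# Its F1 x F1 cross: heterozygous female (R w) x white-eyed male (w Y).
f2_second = cross(("R", "w"), ("w", "Y"))

for name, result in [("F1, first cross", f1_first), ("F2, first cross", f2_first),
                     ("F1, second cross", f1_second), ("F2, second cross", f2_second)]:
    print(name, dict(result))
```

    Each count is out of four gamete pairings, so the F2 of the first cross shows three red-eyed offspring for every white-eyed one, with the white-eyed offspring all male.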

    X-linked recessive phenotypes are more commonly observed in males because males are hemizygous for sex-linked traits. Females can be heterozygous for a trait and therefore carry the recessive allele without expressing it. These carrier females have a 50% chance of passing the recessive allele to each of their male offspring. These male offspring cannot be carriers. If they receive the recessive allele, they will express the recessive trait.

    Females expressing detrimental recessive traits like hemophilia are particularly rare because the only way for a female to be more than a carrier is for a carrier or affected female to produce a daughter with an affected male. The extreme case of an affected female mating with an affected male produces 100% affected offspring.

    Test your understanding of the patterns discussed above with the X-linked gene fill-in-the-blank and multiple-choice questions.

    Population Genetics

    Population genetics* is the study of the genetics of interbreeding populations and how they change over time.


    Drift and Selection

    The Hardy-Weinberg equation relates allele frequencies to genotype frequencies in populations. It predicts the future genetic structure of a population the way that Punnett squares predict the results of an individual cross. The equation applies to non-evolving populations. It is based on the observation that, in the absence of evolution, allele frequencies in large, randomly breeding populations remain stable from generation to generation.

    In real populations, evolution does occur and allele frequencies vary over time. This divergence between real, evolving populations and theoretical, non-evolving populations allows the Hardy-Weinberg equation to be used to explore the effect of evolution on populations. Two major factors that cause real populations to diverge from the equilibrium predicted by the Hardy-Weinberg equation are genetic drift and natural selection.

    The following illustration shows changes in actual allele frequencies over time compared to the stable structure predicted by the Hardy-Weinberg equation.

    [Interactive illustration. Controls: number of individuals in the first generation; structure of the parent population (50/50 BB/bb or all heterozygous).]

    Genetic drift is random variation in allele frequencies that arises when, by chance alone, some individuals produce more or fewer offspring than others. This is most pronounced in small populations and is a major reason real allele frequencies do not remain at Hardy-Weinberg equilibrium values. Genetic drift is random and as such does not result in populations becoming more adapted to their environment.
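    The effect of population size on drift can be sketched with a short simulation. The example below assumes a Wright-Fisher style model, in which every allele copy in the next generation is drawn at random from the current gene pool. This is a common textbook model and not necessarily the one behind the illustration. The function name, population sizes and seed are hypothetical choices.

```python
import random

def simulate_drift(pop_size, p0, generations, seed=0):
    """Track one allele's frequency under pure drift (assumed Wright-Fisher sampling)."""
    rng = random.Random(seed)   # fixed seed so a run can be repeated exactly
    n_copies = 2 * pop_size     # diploid population: two allele copies per individual
    p = p0
    history = [p]
    for _ in range(generations):
        # Draw each allele copy of the next generation at random from the current pool.
        count = sum(1 for _ in range(n_copies) if rng.random() < p)
        p = count / n_copies
        history.append(p)
    return history

small = simulate_drift(pop_size=10, p0=0.5, generations=50)
large = simulate_drift(pop_size=1000, p0=0.5, generations=50)

print("small population:", [round(p, 2) for p in small[::10]])
print("large population:", [round(p, 2) for p in large[::10]])
```

    In runs of this kind the small population typically wanders far from its starting frequency and may lose one allele entirely, while the large population stays close to 0.5.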

    Natural selection increases the frequency of a favored allele over another and can cause significant departures from Hardy-Weinberg equilibrium.
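    A deterministic sketch can show how selection shifts allele frequencies from one generation to the next. It assumes the standard one-locus, two-allele model in which each genotype has a relative fitness. The allele labels B and b, the fitness values and the function names are hypothetical and chosen only for illustration.

```python
def next_frequency(p, w_BB, w_Bb, w_bb):
    """Frequency of allele B after one generation of selection on genotypes."""
    q = 1.0 - p
    # Mean fitness weights each Hardy-Weinberg genotype frequency by its fitness.
    mean_fitness = p * p * w_BB + 2 * p * q * w_Bb + q * q * w_bb
    # B copies come from BB individuals and from half the alleles of Bb individuals.
    return (p * p * w_BB + p * q * w_Bb) / mean_fitness

def select(p0, generations, w_BB=1.0, w_Bb=1.0, w_bb=0.8):
    """Return the trajectory of allele B under constant selection."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], w_BB, w_Bb, w_bb))
    return freqs

favored = select(p0=0.5, generations=20)               # bb is assumed less fit
neutral = select(0.5, 20, w_BB=1.0, w_Bb=1.0, w_bb=1.0)  # no fitness differences

print([round(p, 3) for p in favored[::5]])
print([round(p, 3) for p in neutral[::5]])
```

    With equal fitness values the frequency does not move, which is the Hardy-Weinberg expectation. With a fitness cost on bb, the frequency of B rises every generation.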

    Assuming a trait controlled by two alleles where p is the frequency of one allele and q is the frequency of the other allele, the sum of the frequencies must equal 1:

    p + q = 1

    Given p and q, the Hardy-Weinberg equation is:

    p² + 2pq + q² = 1


    • p² equals the proportion of the population that is homozygous for allele 1.
    • q² equals the proportion of the population that is homozygous for allele 2.
    • 2pq equals the proportion of heterozygotes in the population.
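    The terms above can be computed directly from an allele frequency. This minimal sketch uses hypothetical allele labels B and b and a hypothetical starting frequency of 0.7. It also recovers the allele frequency from the genotype frequencies, which illustrates why frequencies stay put from one generation to the next when nothing else acts on the population.

```python
def genotype_frequencies(p):
    """Hardy-Weinberg genotype frequencies for allele B at frequency p."""
    q = 1.0 - p
    return {"BB": p * p, "Bb": 2 * p * q, "bb": q * q}

def allele_frequency(genotypes):
    """Frequency of allele B: all of BB's alleles plus half of Bb's alleles."""
    return genotypes["BB"] + 0.5 * genotypes["Bb"]

freqs = genotype_frequencies(0.7)
print({g: round(f, 2) for g, f in freqs.items()})
print("sum of genotype frequencies:", round(sum(freqs.values()), 6))
print("allele frequency in the next gene pool:", round(allele_frequency(freqs), 6))
```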

    The Hardy-Weinberg Equilibrium only holds if evolution is not occurring. For evolution to not occur, seven conditions need to be met:

    1. No mutations - Allele frequencies are not changing due to mutations.
    2. No natural selection - All genotypes have the same reproductive success.
    3. The population is infinitely large.
    4. Mating is completely random.
    5. No migration - There is no flow of genes in or out of the population due to migration.
    6. All individuals produce the same number of offspring.
    7. Generations are non-overlapping.

    While real populations don’t maintain the stable allele frequencies predicted by the Hardy-Weinberg equilibrium, the equation can be used to determine the rates and types of evolutionary change occurring in a population.

    Exploration of population dynamics using Hardy-Weinberg frequencies reveals many patterns. For example, the Hardy-Weinberg equation shows how poorly represented alleles persist in populations and the role heterozygotes play in producing individuals with deleterious, homozygous recessive traits.
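    The persistence of rare alleles in heterozygotes can be made concrete with a small calculation. The sketch below assumes a population at Hardy-Weinberg equilibrium and a hypothetical recessive trait seen in 1 of every 10,000 individuals. The function name and the numbers are illustrative, not data from a real population.

```python
import math

def carriers_from_affected(affected_frequency):
    """Estimate carrier frequency from the frequency of the recessive phenotype.

    Assumes Hardy-Weinberg equilibrium, so affected_frequency equals q squared.
    """
    q = math.sqrt(affected_frequency)   # frequency of the recessive allele
    p = 1.0 - q                         # frequency of the dominant allele
    carriers = 2 * p * q                # heterozygotes carry one recessive copy
    return {
        "q": q,
        "carriers": carriers,
        "carriers_per_affected": carriers / affected_frequency,
    }

result = carriers_from_affected(1 / 10_000)   # hypothetical trait frequency
print({name: round(value, 4) for name, value in result.items()})
```

    Under these assumptions there are roughly 198 unaffected carriers for every affected individual, so most copies of the rare allele sit in heterozygotes where the trait is not expressed.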

    Test your understanding with the population genetics problem set

    Hardy-Weinberg Equilibrium Calculator

    The relationship between allele frequencies and genotype frequencies in populations at Hardy-Weinberg Equilibrium is usually described using a trait for which there are two alleles present at the locus of interest.

    This calculator demonstrates the application of the Hardy-Weinberg equations to loci with more than two alleles. Visit the genetic drift and selection illustration for more on the Hardy-Weinberg Equilibrium.


    Number of alleles: 2

    Allele Frequencies Equation - p + q = 1

    Genotype Frequencies Equation - p² + q² + 2(p)(q) = 1

    Update the values by changing the allele frequencies in the blue box below the graph. The calculator has a check that prevents the allele frequencies from summing to any value other than 1. To avoid having your values changed, make sure they sum to 1 and enter them from top to bottom (p, then q, then r, and so on).

    Number of genotypes for a given number of alleles

    Given n alleles at a locus, the number of possible genotypes is the sum of the integers from 1 to n:

    • With 2 alleles, there are 1 + 2 = 3 genotypes.
    • With 3 alleles, there are 1 + 2 + 3 = 6 genotypes.
    • With 4 alleles, there are 1 + 2 + 3 + 4 = 10 genotypes.

    The general formula for finding the sum of a set of integers from 1 to n is:

    Genotypes = n * (n + 1) / 2

    The calculator does not go beyond 5 alleles and 15 possible genotypes. However, the equation above can be used to calculate the number of genotypes for a locus with any number of alleles.
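    The count can also be split into homozygous and heterozygous genotypes. This short sketch, with a hypothetical function name, computes both parts and checks them against the sum of the integers from 1 to n.

```python
def genotype_counts(n_alleles):
    """Return (total, homozygous, heterozygous) genotype counts for n alleles."""
    homozygous = n_alleles                            # one per allele
    heterozygous = n_alleles * (n_alleles - 1) // 2   # one per unordered pair of different alleles
    total = n_alleles * (n_alleles + 1) // 2          # closed form for 1 + 2 + ... + n
    return total, homozygous, heterozygous

for n in range(2, 6):
    total, homozygous, heterozygous = genotype_counts(n)
    # The closed form should match both the running sum and the two-part split.
    assert total == sum(range(1, n + 1)) == homozygous + heterozygous
    print(n, "alleles:", total, "genotypes =", homozygous, "homozygous +", heterozygous, "heterozygous")
```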

    If a population has 10 alleles for a specific gene, the total number of homozygous and heterozygous genotypes possible in the population will be:

    (10 * 11) / 2 = 55

    This breaks down to 10 homozygous genotypes and 45 heterozygous genotypes. The sum of the allele frequencies would still need to equal 1:

    p + q + r + s + t + u + v + w + x + y = 1

    As would the sum of the genotype frequencies:

    p² + 2pq + 2pr + 2ps + 2pt + 2pu + 2pv + 2pw + 2px + 2py + q² + 2qr + 2qs + 2qt + 2qu + 2qv + 2qw + 2qx + 2qy + r² + 2rs + 2rt + 2ru + 2rv + 2rw + 2rx + 2ry + s² + 2st + 2su + 2sv + 2sw + 2sx + 2sy + t² + 2tu + 2tv + 2tw + 2tx + 2ty + u² + 2uv + 2uw + 2ux + 2uy + v² + 2vw + 2vx + 2vy + w² + 2wx + 2wy + x² + 2xy + y² = 1
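    Writing out every term by hand becomes impractical as the number of alleles grows, so the expansion can be generated instead. The sketch below is an illustration only and is not the code behind the calculator. It assumes ten alleles named p through y with equal, hypothetical frequencies of 0.1, and the function name is made up for this example.

```python
from itertools import combinations_with_replacement

def genotype_frequencies(allele_freqs):
    """Map each unordered genotype to its Hardy-Weinberg frequency."""
    if abs(sum(allele_freqs.values()) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    result = {}
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        fa, fb = allele_freqs[a], allele_freqs[b]
        # Homozygotes contribute the squared frequency; heterozygotes contribute 2 * fa * fb.
        result[a + b] = fa * fa if a == b else 2 * fa * fb
    return result

# Hypothetical input: ten equally common alleles.
alleles = {name: 0.1 for name in "pqrstuvwxy"}
freqs = genotype_frequencies(alleles)

homozygous = [g for g in freqs if g[0] == g[1]]
heterozygous = [g for g in freqs if g[0] != g[1]]
print(len(freqs), "genotypes:", len(homozygous), "homozygous,", len(heterozygous), "heterozygous")
print("sum of genotype frequencies:", round(sum(freqs.values()), 6))
```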
