2 minute read

Overlapping Genes

Overlapping genes are defined as a pair of adjacent genes whose coding regions are partially overlapping. In other words, a single stretch of DNA codes for portions of two separate proteins. Such an arrangement of genetic code is ubiquitous. Many overlapping genes have been identified in the genomes of prokaryotes, eukaryotes, mitochondria, and viruses.

For two genes to overlap, the signal to begin transcription for one must reside inside the second gene, whose transcriptional start site is further "upstream." In addition, the "stop" signal for the second gene must not be read by the ribosome during translation, using the RNA copy of the gene. This is possible because RNA is read in triplets, meaning that it can contain three separate sequences that can be "read" by the cell's protein-making machinery. Such sequences of nucleotide triplets are called reading frames, and they are different in the RNA transcripts of the overlapping genes.

Overlapping genes enable the production of more proteins from a given region of DNA than is possible if the genes were arranged sequentially. Indeed, for the bacteriophage PhiX174, overlapping of genes is necessary. The amount of DNA present in the circular, single-stranded DNA genome of this virus would not be sufficient to encode the eleven bacteriophage proteins if transcription occurred in a linear fashion, one gene after another.

The genome economy afforded by overlapping genes extends to the human genome. The recently completed sequencing of the human genome has revealed between 30,000 and 70,000 genes. Yet evidence suggests that the human genome encodes 100,000 to 200,000 proteins. At least part of the information for the extra proteins may come from the presence of hitherto undiscovered overlapping genes, although more may come from alternative splicing of exons in a single gene.

In algae called Guillardia, a structure called a nucleomorph contains only about 500,000 base pairs of DNA, a very small genome, yet produces almost 500 proteins. Part of the efficient packaging of the genome is due to 44 overlapping genes. A nucleomorph is a remnant of a nucleus from an ancient eukaryotic organism that became incorporated into the algae.

One consequence of overlapping genes is to reduce the tolerance for mutation. Virtual experiments conducted within the past several years using a software system called Avida have indicated that overlapping reduces the probability of accumulating so-called neutral mutations in a gene (mutations that have no effect). Neutral mutations are unlikely with overlapping genes, because the mutation must have no effect on two genes with different reading frames.

The evolutionary origin of overlapping genes is not yet clear. Recent research indicates that they may have arisen due to several mutational events. These may include the loss of a signal to stop the transcription process in a gene and a shift in the reading sequence of the genetic components.

Brian Hoyle


Douglas, S., et al. "The Highly Reduced Genome of an Enslaved Algal Nucleus."Nature 410 (April 2001): 1091-1096.

Lewis, R., and B. A. Palevitz. "Genome Economy." The Scientist 15 (June 2001): 21.

Internet Resources

"-1 Programmed Frameshift in the Regulation of the PhiX174 Lysis Gene." CarnegieMellon University (2000). <http://info.bio.cmu.edu/Courses/03441/TermPapers/2000TermPapers/group1/phix.html>.

"The Evolution of Genetic Codes." Michigan State University (1999). <http://www.cse.msu.edu/~ofria/home/pubs/abstracts/THESIS.html>.

Additional topics

Medicine EncyclopediaGenetics in Medicine - Part 3