A nucleobase is the information-carrying component of a nucleotide, and therefore of DNA and RNA. The two strands of the DNA double helix are held together by pairs of nucleobases. Nucleobases are nitrogen-containing biological compounds found within DNA, RNA, nucleotides, and nucleosides. They are also termed nitrogenous bases or simply bases. Their ability to form base pairs and to stack upon one another leads directly to the helical structure of DNA and RNA.

The primary nucleobases are cytosine, guanine, adenine, thymine, and uracil, abbreviated as C, G, A, T, and U, respectively. In genetics they are usually simply called bases. Because A, G, C, and T occur in DNA, they are called DNA bases; A, G, C, and U are called RNA bases. Adenine and guanine are purines; cytosine, thymine, and uracil are pyrimidines. In the normal DNA double helix the bases pair between the two strands: A with T and C with G. Purines pair with pyrimidines mainly for dimensional reasons: only this combination fits the constant width of the DNA helix.
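The classification and pairing rules above can be written down as small lookup tables. The following Python sketch is only an illustration; the names `PURINES`, `PYRIMIDINES`, `DNA_PAIRS`, `base_class`, and `is_dna_pair` are hypothetical and not taken from any library.

```python
# Illustrative lookup tables (hypothetical names, not from any library).
PURINES = {"A", "G"}            # adenine, guanine
PYRIMIDINES = {"C", "T", "U"}   # cytosine, thymine, uracil

# Pairing between the two strands of normal double-helical DNA.
DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def base_class(base: str) -> str:
    """Return 'purine' or 'pyrimidine' for a one-letter base code."""
    b = base.upper()
    if b in PURINES:
        return "purine"
    if b in PYRIMIDINES:
        return "pyrimidine"
    raise ValueError(f"unknown base: {base!r}")


def is_dna_pair(a: str, b: str) -> bool:
    """True if bases a and b pair with each other in DNA (A-T or C-G)."""
    return DNA_PAIRS.get(a.upper()) == b.upper()
```

With these tables, every entry of `DNA_PAIRS` joins one purine to one pyrimidine, which is the size constraint described above; `is_dna_pair("A", "T")` is true, while `is_dna_pair("A", "G")` is false.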

Structure and Function of Nucleobases

Along the sides of a nucleic acid, phosphate groups successively connect the sugar rings of adjacent nucleotide monomers, creating a long-chain biomolecule. These sugar-phosphate linkages (with ribose in RNA or deoxyribose in DNA) form the “backbone” strands of a single- or double-helix biomolecule. In the DNA double helix, the two strands run in chemically opposite directions. This arrangement permits base pairing by providing complementarity between the two bases, and it is essential for replication and transcription of the information encoded in DNA.
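Because the two strands run in opposite directions and pair A with T and C with G, the sequence of one strand determines the other. The sketch below derives the partner strand as a reverse complement. It assumes the input is a plain string of the letters A, C, G, and T written in the conventional 5' to 3' direction; the names `DNA_COMPLEMENT` and `reverse_complement` are hypothetical.

```python
# Translation table mapping each DNA base to its pairing partner.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand: str) -> str:
    """Return the partner strand, also read 5' to 3'.

    Each base is swapped for its partner (A<->T, C<->G), and the
    result is reversed because the partner strand runs in the
    opposite direction.
    """
    return strand.upper().translate(DNA_COMPLEMENT)[::-1]
```

For example, `reverse_complement("ATGC")` gives `"GCAT"`, and applying the function twice returns the original sequence, which reflects the complementarity that replication relies on.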

In addition to the four canonical nucleobases, DNA and RNA contain a number of modified nucleosides that extend their chemical information content. RNA is particularly rich in modifications, presumably as an adaptation to its highly complex and variable functions. The modified nucleosides and their chemical structures establish a second layer of information that is of central importance to the function of RNA molecules. The chemical diversity of DNA is also greater than originally thought. In addition to the four canonical bases, the DNA of higher organisms contains four epigenetic bases: m5dC, hm5dC, f5dC, and ca5dC (5-methyl-, 5-hydroxymethyl-, 5-formyl-, and 5-carboxy-2'-deoxycytidine). Although all cells of an organism contain the same genetic material, their vastly different functions and properties in complex higher organisms require the controlled silencing and activation of cell-type-specific genes. Regulating this silencing and activation requires an additional layer of epigenetic information, which is linked to increased chemical diversity. This diversity is provided by the modified, non-canonical nucleosides in both DNA and RNA.