Secondary structure

Secondary structure in biochemistry and structural biology describes the general three-dimensional form of local regions or overall shape of biopolymers. It does not, however, refer to specific positions in three-dimensional space, which are considered to be tertiary structure. The secondary structure of a protein may include regions of alpha helices, beta sheets, turns, and random coil, or a few less common structures. Secondary structures can often be identified by circular dichroism spectroscopy. Nucleic acids also have secondary structure, most notably single-stranded RNA molecules.


3D structure of the protein myoglobin: alpha helices are shown in colour and random coil in white; there are no beta sheets. Myoglobin was the first protein to have its structure solved by X-ray crystallography, by Sir John Cowdery Kendrew and colleagues in 1958. Kendrew shared the 1962 Nobel Prize in Chemistry with Max Perutz.

Proteins

The DSSP Code

The DSSP code is frequently used to describe protein secondary structure with a single-letter code. DSSP is an acronym for "Dictionary of Protein Secondary Structure", the title of the original article, which listed the secondary structure of the proteins with known 3D structure. The secondary structure is assigned from hydrogen bonding patterns like those initially proposed by Pauling et al. in 1951 (before any protein structure had been experimentally determined).

• G = 3-turn helix (3_10 helix). Min length 3 residues.
• H = 4-turn helix (alpha helix). Min length 4 residues.
• I = 5-turn helix (pi helix). Min length 5 residues.
• T = hydrogen bonded turn (3, 4 or 5 turn)
• E = beta sheet in parallel and/or anti-parallel sheet conformation (extended strand). Min length 2 residues.
• B = residue in isolated beta-bridge (single pair beta-sheet hydrogen bond formation)
• S = bend (the only non-hydrogen-bond based assignment)

In DSSP, a residue that is not in any of the above conformations is designated ' ' (space), which is sometimes written as C (coil) or L (loop). The helices (G, H and I) and sheet conformations are all required to have a reasonable length. This means that 2 adjacent residues in the primary structure must form the same hydrogen bonding pattern. If the helix or sheet hydrogen bonding pattern is too short, the residues are designated T or B, respectively. Other protein secondary structure assignment categories exist (sharp turns, omega loops, etc.), but they are less frequently used.
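As a small illustration of the single-letter code, the sketch below splits a DSSP string into runs of identical codes. The names DSSP_CODES, normalize and segments are hypothetical helpers written for this example and are not part of the DSSP program; writing the blank code as C follows the convention mentioned above.

```python
from itertools import groupby

# One-letter DSSP codes as listed above; 'C' stands in for the blank code.
DSSP_CODES = {
    "G": "3-turn helix (3_10 helix)",
    "H": "4-turn helix (alpha helix)",
    "I": "5-turn helix (pi helix)",
    "T": "hydrogen bonded turn",
    "E": "extended strand in a beta sheet",
    "B": "residue in isolated beta-bridge",
    "S": "bend",
    "C": "coil (none of the above)",
}


def normalize(dssp):
    """Write the blank code (and the alternative 'L') as 'C'."""
    return "".join("C" if ch in (" ", "L") else ch for ch in dssp)


def segments(dssp):
    """Return (code, start, length) for each run of identical codes.

    Start positions are 0-based indices into the DSSP string.
    """
    result = []
    pos = 0
    for code, run in groupby(normalize(dssp)):
        length = len(list(run))
        result.append((code, pos, length))
        pos += length
    return result


if __name__ == "__main__":
    # Illustrative string only, not taken from a real protein.
    for code, start, length in segments("  HHHHTT EE"):
        print(start, length, code, DSSP_CODES[code])
```

Grouping by runs makes it easy to check the minimum lengths given in the list above, for example that every H segment spans at least 4 residues.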

RNA

RNA secondary structure is generally divided into helices (contiguous base pairs) and various kinds of loops (unpaired nucleotides surrounded by helices). An equivalent definition is that the secondary structure of an RNA specifies which nucleotides pair with each other; runs of consecutive paired nucleotides then form the helices. RNA secondary structure can also include pseudoknots and base triples.
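The "which nucleotides pair with each other" view can be sketched in code. The example assumes the structure is written in dot-bracket notation, a common text format that this article does not itself introduce: "(" and ")" mark the two partners of a base pair and "." marks an unpaired nucleotide. The function names are hypothetical. Plain dot-bracket strings cannot express pseudoknots or base triples, so those are outside this sketch.

```python
def parse_dot_bracket(structure):
    """Return the sorted list of 0-based base pairs (i, j) with i < j."""
    stack = []
    pairs = []
    for j, ch in enumerate(structure):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {j}")
            pairs.append((stack.pop(), j))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {j}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def helices(pairs):
    """Group base pairs into helices, i.e. runs of contiguous stacked pairs.

    A pair (i, j) extends the current helix when the previous pair is
    (i - 1, j + 1); otherwise it starts a new helix.
    """
    result = []
    for i, j in sorted(pairs):
        if result and result[-1][-1] == (i - 1, j + 1):
            result[-1].append((i, j))
        else:
            result.append([(i, j)])
    return result


if __name__ == "__main__":
    pairs = parse_dot_bracket("((..((...))..))")
    for helix in helices(pairs):
        print(helix)
```

For the string in the usage example, the two outer pairs and the two inner pairs form separate helices, because the unpaired nucleotides between them interrupt the stacking.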

For many RNA molecules, the secondary structure is highly important to the correct function of the RNA, often more so than the actual sequence. This fact aids in the analysis of non-coding RNA, sometimes termed "RNA genes". RNA secondary structure can be predicted with some accuracy by computer, and many bioinformatics applications use some notion of secondary structure in the analysis of RNA.

Alignment

Both protein and RNA secondary structures can be used to analyze sequences by alignment. These alignments can be made more accurate by the inclusion of secondary structure information, in addition to the usual use of sequence.

Distant relationships between proteins whose primary structures are unalignable can sometimes be found by secondary structure.

Prediction

Algorithms to predict RNA secondary structure typically use dynamic programming, and many are based on stochastic context-free grammars.
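To make the dynamic programming idea concrete, here is a minimal sketch of base-pair maximization in the style of the Nussinov algorithm, one of the simplest such methods. It is an illustration only, not the method of any particular prediction program: it counts pairs rather than estimating energies or probabilities. The set of allowed pairs (Watson-Crick plus G-U wobble) and the minimum hairpin loop of 3 unpaired nucleotides are assumptions made for this example.

```python
# Assumed pairing rules for this sketch: Watson-Crick pairs plus G-U wobble.
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Return the maximum number of nested base pairs that seq can form.

    best[i][j] holds the optimum for the subsequence seq[i..j] (inclusive).
    min_loop is the assumed minimum number of unpaired nucleotides
    enclosed by any base pair.
    """
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    # Fill the table from short subsequences to long ones.
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            # Case 1: nucleotide j is left unpaired.
            score = best[i][j - 1]
            # Case 2: nucleotide j pairs with some k, splitting the problem
            # into the part left of k and the part enclosed by (k, j).
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in ALLOWED_PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    inner = best[k + 1][j - 1]
                    score = max(score, left + 1 + inner)
            best[i][j] = score
    return best[0][n - 1]


if __name__ == "__main__":
    print(max_base_pairs("GGGAAAUCC"))
```

Each table entry is computed from entries for shorter subsequences, which is what makes the problem suitable for dynamic programming; the triple loop gives a running time cubic in the sequence length.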

___________________

This article uses material from Wikipedia.