Goal

JAR3D matches RNA hairpin and internal loop sequences to motif groups from the RNA 3D Motif Atlas, by exact sequence match for sequences already observed in 3D and by probabilistic scoring and edit distance for novel sequences. RNA hairpin and internal loops are often represented on secondary structure diagrams as if they are unstructured, but in fact most are structured by non-Watson-Crick basepairs, base stacking, and base-backbone interactions. Analysis of 3D structures shows that different RNA sequences can form the same RNA 3D motif, as is apparent in many motif groups in the RNA 3D Motif Atlas. JAR3D matches sequences to motif groups based on the ability of the sequences to form the same pattern of interactions observed in 3D structures of the motif. Because the RNA 3D Motif Atlas incorporates new RNA 3D structures every four weeks, the performance of JAR3D will improve over time.
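As a rough illustration of the exact-match and edit-distance ideas mentioned above, the sketch below computes the smallest edit distance between a query loop sequence and the sequences already known for a motif group. It assumes plain Levenshtein distance over the whole loop string; the text does not say which edit distance JAR3D uses or how it is combined with the probabilistic score, and the function names and sequences are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (insertions, deletions, substitutions cost 1)."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def min_distance_to_group(query: str, known_sequences: list[str]) -> int:
    """Smallest edit distance from the query to any known sequence.

    A result of 0 means the query exactly matches a known sequence.
    """
    return min(edit_distance(query, s) for s in known_sequences)


# Hypothetical hairpin sequences, for illustration only.
known = ["GCAAC", "GAAAC"]
print(min_distance_to_group("GAAAC", known))  # exact match
print(min_distance_to_group("GAAC", known))   # one deletion from GAAAC
```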

Functions of recurrent 3D motifs include:

  • Playing architectural roles, such as introducing bends in helices (e.g., kink-turns) or changing helical twist (e.g., C-loops)
  • Anchoring RNA tertiary interactions (e.g., GNRA loops and loop-receptors)
  • Providing sites for proteins or small molecules to bind

Inferring the 3D structures of hairpin and internal loops is a step on the way toward correctly predicting full RNA 3D structures starting from sequence.

Input and Output

JAR3D accepts single or multiple sequences containing one or many loops. See the Examples above.

One loop: To specify the break between strands in an internal loop, use an asterisk (*). Sequences without an asterisk and shorter than 25 nucleotides are interpreted as hairpins. Internal and hairpin loops should include the closing Watson-Crick basepairs, with nucleotides running in 5' to 3' order within each strand.

Many loops: If a secondary structure is provided, it is used to extract internal and hairpin loops. Single sequences without a secondary structure are folded by UNAfold, and multiple sequence alignments are folded with RNAalifold.
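The input rules can be sketched in a few lines of Python. This is not JAR3D's own code: the function names are hypothetical, the secondary structure is assumed to be in dot-bracket notation without pseudoknots, and the "unclassified" label for longer sequences without an asterisk is an assumption, since such sequences are presumably folded rather than treated as a single loop.

```python
def classify_loop(seq: str, hairpin_max_len: int = 25) -> dict:
    """Classify one loop sequence using the asterisk and length rules."""
    seq = seq.strip().upper()
    if "*" in seq:
        strands = seq.split("*")
        if len(strands) != 2 or not all(strands):
            raise ValueError("an internal loop needs two non-empty strands")
        return {"type": "internal", "strands": strands}
    if len(seq) < hairpin_max_len:
        return {"type": "hairpin", "strands": [seq]}
    return {"type": "unclassified", "strands": [seq]}  # assumption, see above


def extract_loops(seq: str, structure: str) -> list[str]:
    """Extract hairpin and internal loops, closing basepairs included.

    Internal loops are returned as 'strand1*strand2', each strand 5' to 3'.
    Multi-helix junctions are skipped.
    """
    partner, stack = {}, []
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            i = stack.pop()
            partner[i], partner[pos] = pos, i

    loops = []
    for i, j in sorted((i, j) for i, j in partner.items() if i < j):
        k = i + 1
        while k < j and k not in partner:  # first paired position inside (i, j)
            k += 1
        if k == j:  # nothing paired inside: hairpin loop
            loops.append(seq[i:j + 1])
            continue
        l = partner[k]
        rest_unpaired = all(m not in partner for m in range(l + 1, j))
        if rest_unpaired and (k - i > 1 or j - l > 1):  # not a plain stack
            loops.append(seq[i:k + 1] + "*" + seq[l:j + 1])
    return loops


# Hypothetical sequence and structure, for illustration only.
for loop in extract_loops("GCAGGAAACCUUGC", "((.((...))..))"):
    print(loop, classify_loop(loop)["type"])
```

Each extracted loop keeps its flanking basepairs, so the strings it produces follow the single-loop conventions described above.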

The output shows the best-matching motif groups from the RNA 3D Motif Atlas and statistics measuring the quality of the match. The user can view a representative instance from each motif group and explore the group further at the RNA 3D Motif Atlas page for the motif group.

Method

  1. We extract all hairpin and internal loops from a non-redundant set of RNA 3D structures from the PDB/NDB and cluster them into geometrically similar families.
  2. For each recurrent motif, we construct a probabilistic model for sequence variability based on a hybrid Stochastic Context-Free Grammar/Markov Random Field (SCFG/MRF) method we developed.
  3. To parameterize each model, we use all instances of the motif found in the non-redundant dataset and knowledge of RNA nucleotide interactions, especially isosteric basepairs and their substitution patterns.
  4. Given the sequence of a hairpin or internal loop from a secondary structure as input, each SCFG/MRF model calculates the probability that the sequence forms a given 3D motif. If the score is in the same range as sequences known to form the 3D structure, we infer that the new sequence can form the same 3D structure.
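
The decision in step 4 can be sketched as follows. The text does not say how "the same range" is defined, so this example assumes the simplest rule: a new sequence is accepted for a motif group if its score is at least as high as the lowest score among the sequences known to form that motif. The function names, group names, and score values are all hypothetical.

```python
def matches_known_range(score: float, known_scores: list[float]) -> bool:
    """True if `score` is at least the lowest score of the known instances."""
    if not known_scores:
        raise ValueError("need at least one known score")
    return score >= min(known_scores)


def rank_groups(scores: dict[str, float],
                known: dict[str, list[float]]) -> list[tuple[str, float, bool]]:
    """Return (group, score, in_range) tuples, best score first."""
    rows = [(g, s, matches_known_range(s, known[g])) for g, s in scores.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical log-scale scores of one query sequence against two groups.
query_scores = {"group_A": -8.2, "group_B": -15.0}
# Hypothetical scores of sequences already observed in 3D for each group.
known_scores = {"group_A": [-9.5, -7.1, -6.3], "group_B": [-10.0, -8.8]}

for group, score, ok in rank_groups(query_scores, known_scores):
    print(group, score, "in range" if ok else "below known range")
```

A percentile of the known scores could serve as the cutoff instead of the minimum; the minimum is used here only to keep the sketch short.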