Background
Ribonucleic acid (RNA) molecules play important roles in many biological processes in the cell. Their known roles include protein synthesis and transport, catalysis, and chromosome replication and regulation. Studies have shown that different types of RNA perform these different biological functions, and that RNA molecules adopt a vast number of structures. Using graph theory, we aim to describe and analyze these structures and to apply the findings to important problems such as RNA design. Our graphical representations are limited to RNA secondary structure elements. Still, graph representation allows enumeration of RNA's structural repertoire. Because we find that only a few RNAs have been found compared to the number of possible topologies, these graphs will help the search for missing RNA structures and stimulate the production of RNAs in the laboratory.
RNA trees and pseudoknots are two major types of 2D RNA secondary structures, distinguished by the topology of their base pairing patterns. An RNA tree is a branching network of helical stems interrupted by bulges and junctions that end in loops, except at the 3' and 5' ends. An RNA pseudoknot has a stretch of nucleotides within a hairpin loop that pairs with nucleotides external to that loop. RNA trees can be represented using tree graphs (see How to Produce RNA Tree Graphs). However, tree graphs cannot represent RNA pseudoknot topologies. Moreover, there is another RNA topological type that cannot be represented as a tree: RNAs with stems connected by a single strand. We call such structures RNA bridges. Thus, more general (non-tree) graphs are required to represent existing RNA trees, pseudoknots and bridges.
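The distinction between tree-like and pseudoknotted base pairing can be illustrated with a small check. The sketch below assumes the common formalization in which a pseudoknot shows up as two base pairs (i, j) and (k, l) that cross, that is, i < k < j < l. The function name and the list-of-pairs input format are hypothetical and are not part of this database.

```python
def has_crossing_pairs(pairs):
    """Return True if any two base pairs cross (i < k < j < l).

    `pairs` is an iterable of (i, j) nucleotide positions that are paired.
    Assumption: crossing pairs are taken as the signature of a pseudoknot;
    purely nested or side-by-side pairs are treated as tree-like.
    """
    # Normalize so each pair is (smaller, larger), then sort by start position.
    normalized = sorted((min(i, j), max(i, j)) for i, j in pairs)
    for index, (i, j) in enumerate(normalized):
        # Every later pair starts at or after i, so only one ordering to test.
        for k, l in normalized[index + 1:]:
            if i < k < j < l:
                return True
    return False


if __name__ == "__main__":
    nested = [(1, 10), (2, 9)]    # one stem, pairs stacked inside each other
    crossing = [(1, 10), (5, 15)]  # second pair reaches outside the first
    print(has_crossing_pairs(nested))
    print(has_crossing_pairs(crossing))
```

This check only separates nested from crossing pairings; it does not identify the bridge structures described above.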
Purpose
To represent, catalogue and analyze existing and hypothetical RNA trees, pseudoknots and bridges.
Process
We begin this subproject by taking the pseudoknot sequences available in Pseudobase and drawing the corresponding pseudoknot structures, which are then converted into the dual graphs in this website's database. These dual graphs provide a simplified picture of the actual secondary structures, but they are more complex than the graphs in the Tree Graph Database. Because tree and dual graphs are simple, we can apply the tools of graph theory to study them in more depth. Using computational methods, we calculate the Laplacian matrix of each graph and its eigenvalues. This allows us to analyze the eigenvalues and to search for clusters and relationships between the eigenvalues of RNAs found in nature and those of RNAs that have not yet been found. Finally, these data will be applied to the design of RNA as well as to the search for RNA.
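As an illustration of the eigenvalue step, the following minimal sketch builds the Laplacian matrix L = D - A of a small undirected graph and computes its eigenvalues. It is not the project's own code: the function names, the edge-list input, and the choice of a plain Jacobi rotation method (used here only to keep the example free of external libraries) are all assumptions. Self-loops are rejected because conventions for them differ.

```python
import math


def laplacian(n, edges):
    """Return L = D - A for an undirected multigraph on vertices 0..n-1.

    Repeated (u, v) entries count as parallel edges.
    """
    L = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        if u == v:
            raise ValueError("self-loops are not handled in this sketch")
        L[u][u] += 1.0
        L[v][v] += 1.0
        L[u][v] -= 1.0
        L[v][u] -= 1.0
    return L


def symmetric_eigenvalues(matrix, tol=1e-20, max_sweeps=50):
    """Eigenvalues of a real symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        # Sum of squared off-diagonal entries; stop once it is negligible.
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes a[p][q]: tan(2*theta) = 2*a_pq / diff.
                if diff == 0.0:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


if __name__ == "__main__":
    # Toy example (not a database entry): three vertices joined in a chain.
    spectrum = symmetric_eigenvalues(laplacian(3, [(0, 1), (1, 2)]))
    print([round(value, 6) for value in spectrum])
```

For larger graphs, a numerical library routine for symmetric eigenvalue problems would normally be preferable to this hand-written method.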
Database Description
The dual graphs for existing and not-yet-found RNAs are distinguished by color. In addition, we organize the dual graphs in two ways: by number of vertices, and by functional type or family. You may view all of the graphs with a particular number of vertices, or of a particular type, at one time. In each grouping, graphs are ordered by the second eigenvalue of their Laplacian matrix. More information about a particular structure may be obtained by clicking on its graph. Links to other parts of this database, and to other useful databases and programs, are also provided.
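The ordering by vertex number and second eigenvalue can be sketched as follows. This is only an illustration under stated assumptions: the record layout, the graph identifiers, and the function names are hypothetical, "second eigenvalue" is taken to mean the second smallest Laplacian eigenvalue, and graphs are sorted in ascending order. The spectra listed are those of small textbook graphs, not entries from this database.

```python
from itertools import groupby

# Hypothetical records: (graph id, number of vertices, Laplacian eigenvalues).
GRAPHS = [
    ("triangle", 3, [0.0, 3.0, 3.0]),
    ("path-3", 3, [0.0, 1.0, 3.0]),
    ("double-edge", 2, [0.0, 4.0]),
    ("path-2", 2, [0.0, 2.0]),
]


def second_eigenvalue(record):
    """Second smallest eigenvalue of a record's Laplacian spectrum."""
    return sorted(record[2])[1]


def catalogue_by_vertices(graphs):
    """Group graph ids by vertex count, ordered by second eigenvalue."""
    ordered = sorted(graphs, key=lambda g: (g[1], second_eigenvalue(g)))
    return {
        vertex_count: [g[0] for g in group]
        for vertex_count, group in groupby(ordered, key=lambda g: g[1])
    }


if __name__ == "__main__":
    for vertex_count, ids in catalogue_by_vertices(GRAPHS).items():
        print(vertex_count, ids)
```

Grouping by functional type or family would follow the same pattern with a family label in place of the vertex count.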