RAG: RNA-As-Graphs Database - Concepts, Analysis, and Features





Motivation: RNA's repertoire is growing rapidly with progress in RNA genomics. The direct correlation between the conservation of RNA secondary structures and their functional properties offers an opportunity for cataloguing and classifying RNA secondary motifs and, eventually, predicting novel motifs. Although several RNA databases have been built, there are no comprehensive, quantitative schemes for cataloguing and measuring the range and diversity of RNA's structural repertoire. Understanding RNA's structural diversity is vital for identifying novel RNA structures.

Results: We describe an RNA-As-Graphs (RAG) Database that catalogues and ranks all mathematically possible (including existing and candidate) RNA secondary motifs derived from graphical enumeration results. As recently reported, we represent RNA secondary structures as two-dimensional graphs, which specify the connectivity between RNA secondary structural elements, such as loops, bulges, stems, and junctions. We archive RNA tree motifs as "tree graphs" and other motif types, including pseudoknots, as general graphs called "dual graphs". All RNA motifs are catalogued by graph vertex number (a measure of RNA sequence length) and ranked by topological complexity, measured by the second-smallest eigenvalue of the graph's Laplacian matrix representation, a quantitative tool well known in graph analysis. RAG is an intuitive but systematic catalogue of all possible RNA secondary motifs; significantly, it immediately suggests candidates for novel RNA motifs. Exploring these candidates will likely stimulate searches for novel RNA motifs or submotifs, either naturally occurring or produced synthetically in the laboratory via RNA design. The RAG database can further be used to help identify structural and functional properties of user-supplied RNA secondary structures. Thus, RAG's enumerated RNA motifs provide a useful resource for cataloguing, identifying, and predicting RNA motifs.
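
The ranking measure described above can be illustrated with a minimal sketch. It builds the Laplacian matrix L = D - A of a small graph and returns its second-smallest eigenvalue. The function names (`laplacian`, `symmetric_eigenvalues`, `second_eigenvalue`) and the two example edge lists are hypothetical and chosen only for illustration; they are not taken from the RAG database or its software. The eigenvalues are obtained here with a plain Jacobi rotation scheme so that the example needs only the standard library; this is an assumption of the sketch, not necessarily the method used to build RAG.

```python
import math

def laplacian(n, edges):
    """Build L = D - A for an undirected graph with vertices 0..n-1."""
    L = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1.0   # degree contribution for u
        L[v][v] += 1.0   # degree contribution for v
        L[u][v] -= 1.0   # adjacency contribution
        L[v][u] -= 1.0
    return L

def symmetric_eigenvalues(M, tol=1e-20, max_sweeps=100):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations, ascending."""
    n = len(M)
    A = [row[:] for row in M]
    for _ in range(max_sweeps):
        # Sum of squared off-diagonal entries; stop once it is negligible.
        off = sum(A[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(A[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes the (p, q) entry.
                if A[p][p] == A[q][q]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2.0 * A[p][q] / (A[q][q] - A[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                # Update columns p and q (A <- A J).
                for k in range(n):
                    akp, akq = A[k][p], A[k][q]
                    A[k][p] = c * akp - s * akq
                    A[k][q] = s * akp + c * akq
                # Update rows p and q (A <- J^T A).
                for k in range(n):
                    apk, aqk = A[p][k], A[q][k]
                    A[p][k] = c * apk - s * aqk
                    A[q][k] = s * apk + c * aqk
    return sorted(A[i][i] for i in range(n))

def second_eigenvalue(n, edges):
    """Second-smallest Laplacian eigenvalue of the graph."""
    return symmetric_eigenvalues(laplacian(n, edges))[1]

# Two illustrative four-vertex tree topologies (not database entries).
linear_tree = [(0, 1), (1, 2), (2, 3)]   # unbranched chain
star_tree = [(0, 1), (0, 2), (0, 3)]     # one central branch point

if __name__ == "__main__":
    print("linear:", round(second_eigenvalue(4, linear_tree), 4))
    print("star:  ", round(second_eigenvalue(4, star_tree), 4))
```

For these two four-vertex trees, the unbranched chain gives a smaller second eigenvalue than the star, so with the same vertex number the more branched topology ranks as more complex under this measure.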

Availability: The database is accessible on the web at http://monod.biomath.nyu.edu/rna





