BackofenLab/ExpaRNA

Name: ExpaRNA

Owner: Bioinformatics Lab - Department of Computer Science - University Freiburg

Description: Find the longest common subsequence of exact pattern matchings of two RNAs

Created: 2016-03-31 12:41:35.0

Updated: 2016-03-31 13:03:20.0

Pushed: 2016-11-02 12:27:09.0

Homepage: null

Size: 132

Language: C++


README

ExpaRNA

Pairwise comparison of RNAs based on exact sequence-structure matches

The program finds the longest common subsequence of exact pattern matches (LCS-EPM). This is the best co-linear arrangement of substructures common to two RNAs.

The complexity of the algorithm is O(n²m²) time and O(nm) space for two RNAs of lengths n and m.
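To make "best co-linear arrangement" concrete, here is a minimal C++ sketch of chaining matches. It is not ExpaRNA's implementation: it assumes each match is a single contiguous interval in each RNA, whereas real exact pattern matches are structure-aware and can be nested inside one another. The names `Match` and `bestColinearChain` are hypothetical and chosen only for illustration. The sketch runs in quadratic time in the number of matches and says nothing about the complexity stated above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for an exact pattern match (EPM):
// one contiguous interval per RNA plus the number of matched positions.
// Assumption: nesting and holes in real EPMs are ignored here.
struct Match {
    int start1, end1;  // inclusive interval in the first RNA
    int start2, end2;  // inclusive interval in the second RNA
    int size;          // number of matched positions, used as the score
};

// Returns the maximum total size of a co-linear chain of matches:
// every match must begin after its predecessor ends in BOTH RNAs,
// so overlapping and crossing matches can never be combined.
int bestColinearChain(std::vector<Match> matches) {
    // Sorting by end position in RNA 1 guarantees that any valid
    // predecessor of a match appears before it in the vector.
    std::sort(matches.begin(), matches.end(),
              [](const Match& a, const Match& b) { return a.end1 < b.end1; });

    std::vector<int> best(matches.size(), 0);  // best chain ending at match i
    int overall = 0;
    for (std::size_t i = 0; i < matches.size(); ++i) {
        assert(matches[i].start1 <= matches[i].end1 &&
               matches[i].start2 <= matches[i].end2);
        best[i] = matches[i].size;
        for (std::size_t j = 0; j < i; ++j) {
            const bool precedes = matches[j].end1 < matches[i].start1 &&
                                  matches[j].end2 < matches[i].start2;
            if (precedes) {
                best[i] = std::max(best[i], best[j] + matches[i].size);
            }
        }
        overall = std::max(overall, best[i]);
    }
    return overall;
}
```

Given two matches that cross (their order differs between the two RNAs), the function keeps only the one that yields the larger chain. This mirrors, in simplified form, how a co-linear arrangement is selected from a set of overlapping and crossing substructure matches.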

Motivation: Specific functions of ribonucleic acid (RNA) molecules are often associated with different motifs in the RNA structure. The key feature that forms such an RNA motif is the combination of sequence and structure properties. ExpaRNA is a new RNA sequence-structure comparison method which maintains exact matching substructures. Existing common substructures are treated as a whole unit, while variability is allowed between such structural motifs.

Based on a quickly detectable set of overlapping and crossing substructure matches for two nested RNA secondary structures, our method ExpaRNA (exact pattern of alignment of RNA) computes the longest co-linear sequence of substructures common to two RNAs. Applied to different RNAs, our method correctly identifies sequence-structure similarities between two RNAs.

Results: We have compared ExpaRNA with two other alignment methods that work with given RNA structures, namely RNAforester and RNA_align. The results are in good agreement, but can be obtained in a fraction of the running time, in particular for larger RNAs. We have also used ExpaRNA to speed up state-of-the-art Sankoff-style alignment tools like LocARNA, and observe a tradeoff between quality and speed. However, we get a 4.25-fold speedup even in the highest quality setting, where the quality of the produced alignment is comparable to that of LocARNA alone.

Dependencies
Contribution

Feel free to contribute to this project by writing Issues with feature requests or bug reports.

Cite

If you use ExpaRNA, please cite our article:

 10.1093/bioinformatics/btp065
