LC Sciences News

Technologies for Genomics and Proteomics Discoveries

16
Aug

The Significance of a miRBase Update

Next-gen sequencing has provided new insight into the microRNAome, is accelerating the rate of discovery of new small RNAs, and is responsible for many of the changes to the miRBase sequence database in recent updates. But a miRBase update is not merely the addition of new sequences. Though that is a large part of it, sequences are also updated, corrected, or in some cases deleted.

Deleting Sequences

  • Misannotated sequences are deleted – next-gen sequencing data will enable more of this
  • Cases of duplicate entries mapping to a single genomic locus are cleaned up – some enabled by new genome assembly releases
  • Example: Drosophila pseudoobscura – 211 database entries in V18, 210 database entries in V19.

Renaming Sequences

  • The process of retiring the miR/miR* nomenclature in favour of the -5p/-3p nomenclature began with version 17 of the database.
  • Version 17 – Drosophila sequences renamed.
  • Version 18 – Human, Mouse, and Zebrafish sequences renamed.
  • Version 19 – the miR* nomenclature is finally retired for all species.
  • The miRBase curators note that the names are meant to be useful but are not formally stable, and they recommend against using sequence names to convey complex information. Instead, they recommend using miRNA accession numbers, which do remain stable between releases; quoting the sequence itself is always truly unambiguous. (miRBase Blog post regarding naming)
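
To illustrate why accession numbers are the safer key, here is a minimal Python sketch that compares two releases by accession and reports which names changed. The accessions and names below are hypothetical placeholders, not real miRBase entries, and the helper functions are our own illustration rather than part of any miRBase tool.

```python
# Hypothetical records keyed by accession. These accessions and names are
# placeholders for illustration only, not real miRBase entries.
OLD_RELEASE = {
    "MIMAT9990001": "xxx-miR-1",
    "MIMAT9990002": "xxx-miR-1*",
    "MIMAT9990003": "xxx-miR-2",
}
NEW_RELEASE = {
    "MIMAT9990001": "xxx-miR-1-5p",
    "MIMAT9990002": "xxx-miR-1-3p",
}


def renamed_entries(old, new):
    """Return {accession: (old_name, new_name)} for accessions whose name changed."""
    shared = old.keys() & new.keys()
    return {acc: (old[acc], new[acc]) for acc in shared if old[acc] != new[acc]}


def dropped_entries(old, new):
    """Return the sorted accessions present in the old release but not the new one."""
    return sorted(set(old) - set(new))


for acc, (before, after) in sorted(renamed_entries(OLD_RELEASE, NEW_RELEASE).items()):
    print(f"{acc}: {before} -> {after}")
print("Deleted:", dropped_entries(OLD_RELEASE, NEW_RELEASE))
```

Expression data stored against the accession would survive the rename untouched, whereas data stored against the old miR/miR* name would need to be remapped.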

Adding Sequences

  • Version 19 includes 3171 new hairpins and 3625 novel mature products.  That’s an 18% increase in the database size in a single update.

Because the database is growing at an accelerating rate, it is becoming more and more difficult for array manufacturers (who use the miRBase sequence database for probe content) to keep their microarrays up to date with the latest sequence information. Often, pre-made arrays sitting on the shelf contain only a fraction of the known content for a particular species.

In comparison, LC Sciences’ microRNA microarrays are produced only as needed. This is made possible by our µParaflo® Biochip Technology, a high-performance microfluidic custom microarray platform that enables on-demand synthesis of microarrays.

Our microRNA arrays cover all species for which sequence data are available in the miRBase Sequence Database and the Plant MicroRNA Database.

Comparison of miRBase Versions

Supplier          miRBase Version Available   Total miRBase Entries   miRBase Release Date
LC Sciences       19                          21264                   Aug 2012
Exiqon            18                          18226                   Nov 2011
Phalanx           18                          18226                   Nov 2011
Agilent           17                          16772                   April 2011
Affymetrix        17                          16772                   April 2011
ABI qPCR Cards    14                          10883                   Sept 2009

miRBase Entries

A comparison of the array versions available from various suppliers shows that a significant amount of miRNA expression information is missed when these out-of-date arrays are used for miRNA expression profiling.

Comparison of miRBase Coverage

Supplier            Species   miRBase Version   Unique Mature miRNAs   Information Missed on Earlier Array Versions
LC Sciences         Human     19                2019
LC Sciences         Mouse     19                1265
LC Sciences         Rat       19                722
Exiqon / Phalanx    Human     18                1921                   5%
Exiqon / Phalanx    Mouse     18                1157                   9%
Exiqon / Phalanx    Rat       18                680                    6%
Agilent             Human     16                1212                   40%
Agilent             Mouse     17                1111                   12%
Agilent             Rat       16                679                    6%
Affymetrix          Human     17                1733                   14%
Affymetrix          Mouse     17                1111                   12%
Affymetrix          Rat       17                680                    6%
ABI qPCR Cards      Human     14                894                    56%
ABI qPCR Cards      Mouse     14                700                    44%
ABI qPCR Cards      Rat       14                388                    46%
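
The "information missed" column can be approximated from the counts in the coverage table above. The short Python sketch below assumes the percentage is the share of version 19 unique mature miRNAs that are absent from the older array; that formula is our reading of the table, not a stated method, and it matches the listed figures to within about one percentage point.

```python
# Unique mature miRNA counts for miRBase 19, copied from the coverage table.
CURRENT = {"Human": 2019, "Mouse": 1265, "Rat": 722}

# (supplier, species, unique mature miRNAs on the array), also from the table.
OLDER_ARRAYS = [
    ("Exiqon / Phalanx", "Human", 1921),
    ("Agilent", "Human", 1212),
    ("Affymetrix", "Human", 1733),
    ("ABI qPCR Cards", "Human", 894),
    ("ABI qPCR Cards", "Rat", 388),
]


def percent_missed(array_count, current_count):
    """Assumed formula: percent of current mature miRNAs not on the older array."""
    return 100.0 * (current_count - array_count) / current_count


for supplier, species, count in OLDER_ARRAYS:
    missed = percent_missed(count, CURRENT[species])
    print(f"{supplier:18s} {species:6s} {missed:5.1f}%")
```

The same function can be applied to any other row by supplying its count and the matching version 19 total for that species.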

Mature Comparison Graph