General Considerations

The time to start thinking about validating your microarray results is before you start your experiment, not after you have completed it. Your ability to validate results depends on the statistical validity of your data, which in turn depends on good experimental design.

Good experimental design is essential to ensure statistically significant results and to justify scientific conclusions. Good basic experimental design includes randomization of samples (assignment to conditions, prep order, location in instrument, etc.) and the use of biological replicates. Biological variability is a fundamental characteristic of gene expression. In experiments performed with a small number of biological replicates, results may be due to biological and/or experimental variation rather than to the difference in conditions or treatment. Such results may not be reproducible, and it is impossible to know whether expression patterns are specific to the individuals in the study or are characteristic of the different sample groups. Therefore, you will not be able to validate results from poorly designed experiments (e.g. experiments with only a single sample per test group).
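As an illustration of the randomization step, the following is a minimal Python sketch. The function names, the sample labels, and the balanced-group rule are assumptions made for this example and are not part of any LC Sciences protocol.

```python
import random


def assign_to_groups(subject_ids, groups, seed=None):
    """Randomly assign subjects to groups in equal numbers (balanced design)."""
    if len(subject_ids) % len(groups) != 0:
        raise ValueError("number of subjects must be a multiple of the number of groups")
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    per_group = len(shuffled) // len(groups)
    return {
        group: shuffled[i * per_group:(i + 1) * per_group]
        for i, group in enumerate(groups)
    }


def randomize_run_order(sample_ids, seed=None):
    """Return the samples in a random order for prep or instrument loading."""
    rng = random.Random(seed)
    order = list(sample_ids)
    rng.shuffle(order)
    return order


# Hypothetical subject labels; fixing the seed makes the plan reproducible.
subjects = ["S1", "S2", "S3", "S4", "S5", "S6"]
assignment = assign_to_groups(subjects, ["control", "treated"], seed=1)
prep_order = randomize_run_order(subjects, seed=2)
```

Recording the seed alongside the experiment notes allows the same assignment and run order to be regenerated later.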

The appropriate number of biological replicates to use is dependent on the specifics of your experiment (e.g. species, sample type, expected degree of variability, effect that treatment or condition has on miRNA expression).[1]

Validation Expectations

We define a validated result as one that demonstrates a change in expression in the same direction (e.g. up or down) as the original result. The fold change, or degree of change in expression, may differ simply because of differences inherent to the two methods of determining expression levels. For example, TaqMan QPCR is a very specific method and measures the expression level of one specific miRNA sequence. A microarray, on the other hand, will not necessarily distinguish between the potential isoform sequences of a given miRNA, e.g. if the iso-miRNA is just a shorter or sequence-shifted version. Additionally, TaqMan QPCR and array assays use different enzymatic reactions, involving reverse transcription and ligation, respectively. The two reactions do not always have the same yield.
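The direction-of-change criterion can be sketched as a simple check on the sign of the log2 fold change from each method. The function name is hypothetical, and treating a log2 fold change of exactly zero as "not validated" is an assumption of this sketch.

```python
def same_direction(array_log2_fc, qpcr_log2_fc):
    """Return True when both methods report a change in the same direction."""
    if array_log2_fc == 0 or qpcr_log2_fc == 0:
        # Assumption: "no change" in either method does not confirm a direction.
        return False
    return (array_log2_fc > 0) == (qpcr_log2_fc > 0)
```

Note that the magnitudes are deliberately not compared: an array value of 1.8 and a QPCR value of 0.9 agree in direction even though the fold changes differ.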

Sample Selection

It is very important to always use the exact same samples in validation experiments as you did in the original microarray experiment. You may not be able to validate expression results using similar samples or replicate samples from the same group, or even from a separate culture dish. There is natural variability in expression levels between individual specimens and experimental variability between RNA extractions, hence the need for replicates in experimental design. Therefore, the sample tested should come from the exact same RNA extraction prep as was used for the original microarray experiment.

miRNA Sequence Selection

There are three criteria that should be met in order to consider that a miRNA expression result can be validated: signal intensity, fold-change, and p-value.

     Signal Intensity Value

The signal intensity for each miRNA can be found in the normalized data file named S1XXXXX_MultiArray Analysis_Data.xls. You can use the following intensity values as a guideline to categorize your results:

< 500: very low intensity

~500-2,000: low intensity

~2,000-10,000: medium intensity

> 10,000: high intensity

For validation experiments, we recommend focusing on miRNAs that show a minimum intensity of 500 in at least one of the sample groups.

Lower-intensity miRNAs are difficult to validate because they are detected at higher QPCR cycle numbers, where the standard deviation is also higher. In these cases, the standard deviation may exceed the fold change you are trying to validate.
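The intensity guideline above can be expressed as a small Python sketch. The category boundaries in the guideline are approximate, so the exact handling of values that fall on a boundary (500, 2,000, 10,000) is an assumption here, as are the function names and the input layout.

```python
def intensity_category(signal):
    """Map a normalized signal value to the guideline's intensity category."""
    if signal < 500:
        return "very low"
    if signal < 2000:
        return "low"
    if signal <= 10000:
        return "medium"
    return "high"


def passes_intensity_filter(group_means, minimum=500):
    """True if the miRNA reaches the minimum signal in at least one group.

    group_means: mapping of sample-group name to mean normalized signal.
    """
    return any(value >= minimum for value in group_means.values())
```

For example, a miRNA with group means of 120 and 850 passes the filter because one group reaches 500, while one with means of 120 and 300 does not.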

     Fold Change

The fold-change value for each miRNA has been calculated and can be found in the far right column of the statistical test in your In-depth Data Analysis Report. Values are stated as log2 values.

We recommend selecting miRNAs for validation that exhibit at least a 2-fold change in expression, which corresponds to a log2 value of ≥ +1 or ≤ -1.
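The relationship between fold change and its log2 value can be sketched as follows. The function names are hypothetical, and the direction of the ratio (test group divided by reference group) is an assumption; check which convention your report uses.

```python
import math


def log2_fold_change(test_signal, reference_signal):
    """log2 of the ratio test/reference; positive means up in the test group."""
    return math.log2(test_signal / reference_signal)


def fold_change_from_log2(log2_fc):
    """Convert a log2 value back to a linear ratio."""
    return 2 ** log2_fc


def meets_fold_change(log2_fc, min_fold=2.0):
    """True if the change is at least min_fold in either direction."""
    return abs(log2_fc) >= math.log2(min_fold)
```

A log2 value of +1 is a doubling and -1 is a halving, so both satisfy the 2-fold recommendation, whereas a log2 value of 0.8 (about 1.7-fold) does not.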

     p-Value

The p-value for each miRNA has been calculated and can be found in the statistical test result tables (e.g. T-test-01-XXXX.xls) in the second column – adjacent to the miRNA name. It is a measure of the statistical significance of a given expression result.

Unfortunately, there is no hard and fast p-value criterion for validation, as this value depends on the number of samples in your experiment. In general, a lower p-value indicates a more significant result, one that is more likely to reflect a true biological difference and thus to be validated.

For a typical experiment with 3 biological replicate samples per group, treat a p-value < 0.05 as a significant result.
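Taken together, the three criteria (signal intensity, fold change, and p-value) can be applied as one filter. The sketch below assumes the values have already been read from the data files into Python dictionaries; the field names (`name`, `group_means`, `log2_fc`, `p_value`), the miRNA labels, and the numbers are made up for illustration and do not reflect the layout of the actual report files.

```python
def select_candidates(rows, min_intensity=500, min_abs_log2_fc=1.0, max_p=0.05):
    """Return names of miRNAs that meet all three selection criteria."""
    selected = []
    for row in rows:
        bright_enough = max(row["group_means"].values()) >= min_intensity
        big_change = abs(row["log2_fc"]) >= min_abs_log2_fc
        significant = row["p_value"] < max_p
        if bright_enough and big_change and significant:
            selected.append(row["name"])
    return selected


# Made-up example rows with placeholder miRNA names.
rows = [
    {"name": "miR-A", "group_means": {"control": 900, "treated": 4100},
     "log2_fc": 2.2, "p_value": 0.004},
    {"name": "miR-B", "group_means": {"control": 150, "treated": 420},
     "log2_fc": 1.5, "p_value": 0.010},   # fails: intensity below 500
    {"name": "miR-C", "group_means": {"control": 6000, "treated": 7800},
     "log2_fc": 0.4, "p_value": 0.030},   # fails: less than 2-fold
    {"name": "miR-D", "group_means": {"control": 3000, "treated": 900},
     "log2_fc": -1.7, "p_value": 0.200},  # fails: p-value not significant
]
candidates = select_candidates(rows)
```

The p-value cutoff is left as a parameter because, as noted above, the appropriate threshold depends on the number of samples in the experiment.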

Sequence Verification

Always check that the sequence of the QPCR assay you are using for validation is exactly the same as the sequence listed in your data results from the microarray experiment.

miRBase[2] is the official database of miRNA sequence information on which almost all commercially available microarray content is based (including that of LC Sciences). This database is updated periodically to include newly discovered miRNAs but also to correct name and sequence errors. As a result, the same miRNA name may have a different sequence in different versions of the database. LC Sciences’ custom microarrays are updated in sync with miRBase updates; however, QPCR assays routinely lag behind in updates to sequence information.

Additionally, there have been changes to miRNA nomenclature rules over time. As an example, in 2012 miRBase phased out the miR/miR* nomenclature in favor of the -5p/-3p nomenclature. The human, mouse, and C. elegans miRNAs were updated in the miRBase version 18 release and the remaining species were changed with the version 19 release. (See the miRBase blog for more information on name changes to miRNA sequences: http://www.mirbase.org/blog/)
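Because names can change between database versions, the check described in this section is best made on the sequences themselves rather than on the miRNA names. A minimal sketch follows; it assumes that one source may write the sequence in the DNA alphabet (T) and the other in the RNA alphabet (U), and that case and surrounding whitespace are not meaningful. The function names and the example sequence are made up.

```python
def normalize_sequence(seq):
    """Upper-case, trim whitespace, and write the sequence in the RNA alphabet."""
    return seq.strip().upper().replace("T", "U")


def sequences_match(array_seq, assay_seq):
    """True only if the two sequences are identical over their full length."""
    return normalize_sequence(array_seq) == normalize_sequence(assay_seq)


# Made-up 22-nt sequence, not a real miRNA.
array_seq = "ACGUACGUACGUACGUACGUAC"
```

The comparison is deliberately strict: an assay sequence that is one nucleotide shorter or shifted does not match, which is the isoform situation in which array and QPCR results are most likely to disagree.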

Selection of Controls

Normalization to endogenous control genes is currently the most accurate method to correct for potential RNA input or reverse transcription (RT) efficiency biases. Careful selection of an appropriate set of controls is extremely important as significant variation has been observed between samples.

For each sample, we suggest using at least one, preferably two, small RNA controls from the following list.

Human tissues

RNU48
U47
RNU6B

Mouse tissues

snoRNA202
snoRNA234

Cell lines

RNU24
RNU38B
Z30
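Normalization to endogenous controls is commonly computed with the comparative Ct (delta-delta Ct) approach. That method is not named in this bulletin, so the sketch below is an assumption about how the normalization might be carried out, not a prescribed procedure. It also assumes roughly 100% amplification efficiency (one Ct cycle corresponds to a 2-fold difference) and averages the Ct values of the chosen controls. The function names and Ct values are hypothetical.

```python
from statistics import mean


def delta_ct(target_ct, control_cts):
    """Ct of the target miRNA minus the mean Ct of the endogenous controls."""
    return target_ct - mean(control_cts)


def relative_log2_fc(test_delta_cts, reference_delta_cts):
    """log2 fold change of the test group relative to the reference group.

    A lower delta-Ct means higher expression, hence the sign change.
    """
    return -(mean(test_delta_cts) - mean(reference_delta_cts))


# One sample, two controls (e.g. two small RNAs from the list above).
sample_delta_ct = delta_ct(28.0, [20.0, 22.0])

# Three biological replicates per group, made-up delta-Ct values.
log2_fc = relative_log2_fc([5.0, 5.0, 5.0], [7.0, 7.0, 7.0])
```

The resulting log2 value can be compared in sign with the log2 fold change reported for the microarray.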

Additionally, we recommend adding to your control list one or more miRNAs that have similar expression levels in all your samples (based on the microarray data). The following miRNAs have been found to be relatively stable in human tissues and NCI-60 cell lines. However, each experiment is different. For microarray expression data from LC Sciences, it is preferable that you select control miRNAs that have high signal intensities (>10,000).

Human tissues

hsa-miR-26b
hsa-miR-92

Cell lines

hsa-miR-16
hsa-miR-423
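One way to look for control miRNAs with similar expression in all of your own samples is to rank high-intensity miRNAs by their coefficient of variation (standard deviation divided by mean) across samples. This is a sketch under stated assumptions: the coefficient of variation as the stability measure, the 0.15 cutoff, the function name, and the example values are all illustrative choices, not recommendations from this bulletin. Only the >10,000 intensity preference comes from the text above.

```python
from statistics import mean, stdev


def control_candidates(signals_by_mirna, min_signal=10000, max_cv=0.15):
    """Return miRNA names ordered from most to least stable.

    signals_by_mirna: mapping of miRNA name to a list of normalized signals,
    one per sample (at least two samples are needed to compute a spread).
    """
    ranked = []
    for name, signals in signals_by_mirna.items():
        if min(signals) <= min_signal:
            continue  # not high intensity (>10,000) in every sample
        cv = stdev(signals) / mean(signals)
        if cv <= max_cv:
            ranked.append((cv, name))
    return [name for cv, name in sorted(ranked)]


# Made-up signals for placeholder miRNA names across four samples.
signals = {
    "miR-A": [12000, 12500, 11800, 12200],  # high and stable
    "miR-B": [15000, 30000, 9000, 22000],   # drops below 10,000 in one sample
    "miR-C": [800, 820, 790, 810],          # stable but low intensity
    "miR-D": [20000, 21000, 26000, 15000],  # high but variable
}
stable = control_candidates(signals)
```

Any candidate found this way should still be confirmed by QPCR alongside the small RNA controls.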

Also, please refer to this Application Note from Applied Biosystems – “Endogenous Controls for Real-Time Quantitation of miRNA Using TaqMan® MicroRNA Assays” for more on selection of controls for validation experiments.

 


[1] Zhou X, Zhu Q, Eicken C, Sheng N, Zhang X, Yang L, and Gao X. (2012) MicroRNA profiling using µParaflo microfluidic array technology. Methods Mol Biol 822, 153-82.

[2] Kozomara A and Griffiths-Jones S. (2011) miRBase: integrating microRNA annotation and deep-sequencing data. Nucleic Acids Res 39(Database Issue), D152-D157.

