Methylation analysis methods can benefit from prior targeted capture and amplification of suspected CpG regions in order to reduce the complexity of samples and focus the analysis on specific genomic segments.
CpG (cytosine-phosphate-guanine) dinucleotide sites are positions in DNA where a cytosine nucleotide is immediately followed by a guanine nucleotide in the linear 5′→3′ sequence of bases. In mammals, enzymes (methyltransferases) can add a methyl group to the cytosines in CpG sites to form 5-methylcytosine (5mC). Methylating the cytosines within a gene can turn the gene off, and represents one of several epigenetic mechanisms of gene regulation. Methylation of CpG sites plays an essential role in normal development in mammals, and aberrant methylation has been noted in a variety of diseases, including many forms of cancer.
Hence, tools for the study of DNA methylation are essential. As with most genomic analysis tools, they must be quantitative, high-throughput, cost-effective, and both scalable and flexible with respect to coverage. Ideally, one would be able to efficiently investigate the methylation of large numbers of CpGs in large numbers of samples. The standard method for measuring methylation involves treating DNA with sodium bisulfite, which converts unmethylated cytosines (C) to uracils (U) while leaving 5mCs unchanged. The resulting sequence differences between Cs and 5mCs can then be detected by microarray [1] or sequencing [2] methods. There is a trade-off, as each of these technologies has strengths and weaknesses: microarrays are generally regarded as very high-throughput (many samples can be processed very quickly), whereas next-gen sequencing provides a global view (a single analysis can be genome-wide and cover many CpGs).
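The bisulfite principle described above can be illustrated with a minimal Python sketch. It is a toy in-silico model only: the function names, the example sequence, and the chosen methylated positions are all hypothetical, and it assumes a single strand that has been fully converted and then amplified, so that each U is read as T.

```python
def find_cpg_sites(seq):
    """Return 0-based positions of the C in every CpG dinucleotide."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite treatment followed by amplification of one strand.

    An unmethylated C is converted to U, which is read as T after
    amplification; a C at a methylated position (5mC) is left unchanged.
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq.upper())
    )


def call_methylation(reference, converted):
    """Compare a converted sequence with its reference at each CpG site."""
    return {
        pos: converted[pos] == "C"  # a retained C implies methylation
        for pos in find_cpg_sites(reference)
    }


# Hypothetical example: CpGs at positions 1, 5 and 9; only 1 and 9 are methylated.
reference = "ACGTTCGACCGA"
converted = bisulfite_convert(reference, methylated_positions=[1, 9])
calls = call_methylation(reference, converted)
print(converted)  # ACGTTTGATCGA
print(calls)      # {1: True, 5: False, 9: True}
```

Note that the unmethylated C at position 8, which is not part of a CpG, is also converted. This loss of sequence complexity is one reason bisulfite-treated DNA is harder to capture and map than untreated DNA.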
Both of these methods can benefit from prior targeted capture and amplification of suspected CpG regions in order to reduce the complexity of samples and focus the analysis on specific genomic segments. The use of oligonucleotides for targeted capture increases both sample throughput and coverage while decreasing cost per sample. LC Sciences OligoMix® has been demonstrated to be an effective method of oligo synthesis for capture in targeted sequencing [3,4] applications and a means of reducing error rates in gene synthesis [5,6] applications. Using an OligoMix® synthesis strategy rather than individual oligo synthesis further increases the flexibility, scalability, and cost efficiency of targeted methylation analysis methods.
A group of researchers from Affymetrix and Stanford University has demonstrated the utility of targeted capture in a microarray application for methylation analysis: Methylation by Target Amplification by Capture & Ligation (mTACL) [1]. They went on to show that OligoMix® synthesized oligos perform as well as individually produced probes manufactured using a single-plex PCR amplification and subsequent pooling strategy.
In the mTACL approach, the regions to be analyzed are first captured and ligated to common primers, reacted with bisulfite, amplified, and then analyzed by hybridization of the product to a microarray. This approach is high-throughput and allows analysis of hundreds of thousands of CpGs from many samples in parallel. The researchers designed a large set of probes (~20K) and quantitatively assessed the methylation of over 140,000 CpGs in more than 200 samples using mTACL. They assessed the precision and accuracy of their assay with technical replicates, with a receiver operating characteristic (ROC) curve analysis (which plots the true positive rate against the false positive rate), and by comparison with next-gen sequencing results.
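For readers unfamiliar with ROC analysis, the following Python sketch shows how an ROC curve and its area can be computed from two sets of per-CpG methylation scores. The scores are invented for illustration and the function names are hypothetical; this is not the analysis code used in the mTACL study.

```python
def roc_points(scores_negative, scores_positive):
    """Return (false positive rate, true positive rate) pairs.

    One point is produced per threshold; a measurement is called
    "positive" when its score is greater than or equal to the threshold.
    """
    thresholds = sorted(set(scores_negative) | set(scores_positive), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        tpr = sum(s >= t for s in scores_positive) / len(scores_positive)
        fpr = sum(s >= t for s in scores_negative) / len(scores_negative)
        points.append((fpr, tpr))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical methylation scores measured at the same CpGs in two samples.
unmethylated = [0.02, 0.05, 0.10, 0.20]  # e.g. a 0% methylated control
methylated = [0.15, 0.40, 0.55, 0.60]    # e.g. a partially methylated sample

points = roc_points(unmethylated, methylated)
area = auc(points)
print(area)  # 0.9375
```

An area of 1.0 would mean the two samples are perfectly separated, while 0.5 would mean the assay cannot distinguish them. Comparing curves of this kind for two probe panels is one way to check that they perform similarly.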
Figure 1 – mTACL Assay overview. (A) dU probes are tools for the targeted capture of genomic loci. An individual probe consists of a double-stranded DNA molecule in which all of the T’s in the sequence have been substituted with dU. (B) Scheme for determining methylation using the dU probes and bisulfite treatment. (Nautiyal S et al. PNAS 2010)
One of the challenges of the mTACL methodology lies in the construction of capture probe panels. The study required more than 19,000 single-plex PCR reactions, the products of which were subsequently pooled. The procedure is therefore labor-intensive and costly, making it impractical for the construction of larger and custom panels. As a parallel oligo synthesis technology capable of producing virtually unlimited numbers of oligos of lengths up to 100 nucleotides as a pool, OligoMix® overcomes this barrier and represents a significantly more cost-effective method for constructing probe panels than single-plex PCR. The researchers compared the probes constructed by single-plex PCR with probes generated by OligoMix® using ROC analysis and observed no significant difference in performance.
Figure 2 – ROC analysis using data from each experiment to determine whether the specificity and sensitivity achieved were similar with the oligo pool panel vs. the full-length panel. ROC analysis would be very sensitive to a small deterioration in performance. ROC curves are similar for both methods. (Upper) Panel created from pooled oligos. (Lower) Full-length panel. Sample comparisons are 0% vs. 10% CpG methylated (red); 0% vs. 25% CpG methylated (blue); 0% vs. 50% CpG methylated (magenta); and 50% vs. 100% CpG methylated (green). (Nautiyal S et al. PNAS 2010)
“Oligo pools therefore represent an inexpensive means for constructing large and custom dU probe panels and greatly improve the flexibility of the assay with respect to coverage…Using oligo pools, no enzymatic steps are required after the PCR, making the probe very scalable.”
More recently, a group of researchers from the University of California, San Diego has demonstrated the utility of targeted capture in a next-gen sequencing application for methylation analysis: Library-Free Bisulfite Padlock Probes (BSPPs) [2]. In the BSPP sequencing approach, padlock probes are annealed to bisulfite-converted genomic DNA, and the captured targets are circularized, PCR-amplified with barcoded primers, and directly sequenced by Illumina sequencing. After a broad initial study that used ~330,000 probes to measure methylation of ~500,000 CpG sites, they point out that a quick, focused follow-up on a smaller set of the candidate regions identified by the genome-scale scanning experiment is often necessary.
Figure 3 – BSPP Assay overview – Each padlock probe has a common linker sequence flanked by two target-specific capturing arms (red) that anneal to bisulfite converted genomic DNA. The 3′ end is extended and ligated with the 5′ end to form circularized DNA. After removal of linear DNA, all circularized captured targets are PCR-amplified with barcoded primers and directly sequenced. (Diep D et al. Nature Methods 2012)
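The probe layout described in the caption (two capturing arms flanking a common linker, with the 3′ end extended and ligated to the 5′ end) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the ppDesigner algorithm: the function names, the target sequence, and the placeholder linker are hypothetical, and the sketch ignores practical considerations such as melting temperature, arm uniqueness, and CpGs that fall within the arms.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_padlock(target, gap_start, gap_end, arm_length, linker):
    """Assemble a padlock probe (5'->3') that captures target[gap_start:gap_end].

    `target` is the bisulfite-converted strand to be captured, written 5'->3'.
    The probe's 3' arm anneals immediately downstream of the gap, so that its
    3' end can be extended across the gap; the 5' arm anneals immediately
    upstream, where the extended strand is ligated to close the circle.
    """
    if gap_start < arm_length or gap_end + arm_length > len(target):
        raise ValueError("target too short for the requested arm length")
    upstream = target[gap_start - arm_length:gap_start]
    downstream = target[gap_end:gap_end + arm_length]
    ligation_arm = reverse_complement(upstream)     # probe 5' end
    extension_arm = reverse_complement(downstream)  # probe 3' end
    return ligation_arm + linker + extension_arm


# Hypothetical converted target: 5-nt arms flanking a 4-nt gap holding one CpG.
target = "TTGGA" + "TCGT" + "GATTG"
probe = design_padlock(target, gap_start=5, gap_end=9, arm_length=5, linker="ACTG")
print(probe)  # TCCAAACTGCAATC
```

After extension and ligation, the circle contains the linker plus the reverse complement of the whole captured region (arms and gap), which is what the barcoded primers then amplify.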
When describing the development of a follow-up assay, they note that, “Such an assay needs to be customizable to different genomic targets, scalable to a very large sample size (1,000–100,000 samples), and inexpensive.” Demonstrating its flexibility, the researchers chose OligoMix® to perform the validation experiment on the genomic regions identified by BSPP and to additionally evaluate regions 1 kbp upstream and downstream. They go on to show that even with shorter capturing sequences and a 100-fold smaller target area, OligoMix® still achieved an enrichment factor of ~6,500. They further identified regions of aberrant methylation in induced pluripotent stem cells and demonstrated that aberrant methylation continues further upstream and downstream than observed previously.
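To give a sense of what an enrichment factor measures, the sketch below uses one common definition: the fraction of reads that fall on target divided by the fraction of the genome that the target represents. This definition is an assumption and may differ from the one used in the cited study, and all numbers in the example are hypothetical.

```python
def enrichment_factor(on_target_reads, total_reads, target_size_bp, genome_size_bp):
    """Fold enrichment of target sequence relative to random sampling.

    Assumed definition: (on-target read fraction) / (target genome fraction).
    """
    read_fraction = on_target_reads / total_reads
    genome_fraction = target_size_bp / genome_size_bp
    return read_fraction / genome_fraction


# Hypothetical run: half of all reads land in a 300 kb target
# within a 3 Gb genome.
fold = enrichment_factor(
    on_target_reads=500_000,
    total_reads=1_000_000,
    target_size_bp=300_000,
    genome_size_bp=3_000_000_000,
)
print(round(fold))  # 5000
```

Under this definition, a smaller target makes a given on-target read fraction correspond to a larger fold enrichment, because the same reads are concentrated on a smaller share of the genome.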
Figure 4 – UCSC Genome Browser view showing an example of aberrant iPSC-specific methylation after reprogramming of PGP1 fibroblasts into iPS cells. Each circle represents a location with a measurable methylation state, with black indicating unmethylated and gold indicating methylated. The Agilent 330K probe set identified a small intronic region containing aberrant methylation in the iPS cells that is not present in either the fibroblast progenitors or a control hESC line. The LC Sciences 4K probe set was designed to characterize the methylation state upstream and downstream of this region. This focused assay revealed that the abnormal methylation also extended into the exonic region of GRM7. (Diep D et al. Nature Methods 2012)
This analysis demonstrated that an OligoMix® strategy can be used to produce a focused probe set to validate specific regions of interest identified in global scanning using either their genome-wide probe set or other methods.
The UCSD group has additionally made available software tools (used in this study) for the design of padlock probes (ppDesigner) as well as an analysis pipeline for read mapping and methylation quantification (bisReadMapper).
These tools are available at: http://genome-tech.ucsd.edu/public/Gen2_BSPP/
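As a rough illustration of what methylation quantification from bisulfite sequencing involves, the sketch below estimates the methylation level at each CpG as the fraction of reads that retain a C at that position. It is not the bisReadMapper implementation: the function name, the minimum-depth filter, and the site labels and read counts are all hypothetical.

```python
def methylation_levels(counts, min_depth=10):
    """Estimate per-CpG methylation from bisulfite read counts.

    counts maps a site label to (reads showing C, reads showing T).
    Sites covered by fewer than min_depth reads are skipped.
    """
    levels = {}
    for site, (n_c, n_t) in counts.items():
        depth = n_c + n_t
        if depth >= min_depth:
            levels[site] = n_c / depth  # C retained => methylated
    return levels


# Hypothetical read counts at three CpG sites.
counts = {
    "site_A": (45, 5),   # mostly methylated
    "site_B": (2, 38),   # mostly unmethylated
    "site_C": (3, 1),    # too few reads to call
}
levels = methylation_levels(counts)
print(levels)  # {'site_A': 0.9, 'site_B': 0.05}
```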
1. Nautiyal S, Carlton VE, Lu Y, Ireland JS, Flaucher D, Moorhead M, Gray JW, Spellman P, Mindrinos M, Berg P, Faham M. (2010) High-throughput method for analyzing methylation of CpGs in targeted genomic regions. Proc Natl Acad Sci 107(28), 12587-92.
2. Diep D, Plongthongkum N, Gore A, Fung H, Shoemaker R, Zhang K. (2012) Library-free methylation sequencing with bisulfite padlock probes. Nature Methods 9(3), 270-72.
3. Teer JK, Bonnycastle LL, Chines PS, Hansen NF, Aoyama N, Swift AJ, Abaan HO, Albert TJ; NISC Comparative Sequencing Program, Margulies EH, Green ED, Collins FS, Mullikin JC, Biesecker LG. (2010) Systematic comparison of three genomic enrichment methods for massively parallel DNA sequencing. Genome Res 20(10), 1420-31.
4. Myllykangas S, Buenrostro JD, Natsoulis G, Bell JM, Ji HP. (2011) Efficient targeted resequencing of human germline and cancer genomes by oligonucleotide selective sequencing. Nat Biotechnol 29, 1024-27.
5. Tian J, Gong H, Sheng N, Zhou X, Gulari E, Gao X, Church G. (2004) Accurate multiplex gene synthesis from programmable DNA chips. Nature 432, 1050-54.
6. Matzas M, Stähler PF, Kefer N, Siebelt N, Boisguérin V, Leonard JT, Keller A, Stähler CF, Häberle P, Gharizadeh B, Babrzadeh F, Church GM. (2010) High fidelity gene synthesis by retrieval of sequence-verified DNA identified using high-throughput pyrosequencing. Nat Biotechnol 28(12), 1291-94.