With such large numbers of variants now identified through the rapidly increasing use of next-generation sequencing (NGS), it has become important to determine the function of these variants in order to assess their clinical diagnostic, prognostic, or therapeutic potential.
A Need for Production of Complex Libraries
Synthetic oligonucleotides have been used for decades to recognize specific genomic regions. In the current era of high-throughput genomics, researchers have adopted oligo pools in place of individually synthesized oligos for creating complex libraries of designed sequences.
OligoMix® enables us to synthesize thousands of designed sequences at once on a single microarray chip. Because the sequences are synthesized in massively parallel fashion on a microfluidic chip, the overall cost and time required for synthesis are dramatically decreased, and the cumbersome procedure of running separate reactions for individual sequences is avoided.
Creative Solutions Have Begun to Emerge
By coupling the power of massively parallel DNA synthesis with NGS, OligoMix® users looking to analyze large numbers of genetic variants have begun to develop creative bioassays for functional analysis of these variants.
Genome-wide association studies (GWAS) have identified many disease-associated noncoding variants, but cannot distinguish functional single-nucleotide polymorphisms (fSNPs) from others that reside incidentally within risk loci.
Researchers at Brigham and Women’s Hospital and Harvard Medical School developed an unbiased high-throughput screen that employs type IIS enzymatic restriction to identify fSNPs that allelically modulate the binding of regulatory proteins [1]. They used OligoMix® to produce a library of constructs, each consisting of a 31-bp SNP sequence with the SNP centered on the BpmI cutting site, flanked by two BpmI binding sites, with a primer included for high-throughput sequencing.
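The construct layout described above can be sketched in a few lines of Python. This is a minimal illustration, not the published design: the helper names (`snp_window`, `build_construct`), the toy reference sequence, and the decision to place the BpmI recognition sites (CTGGAG) directly against the 31-bp window are all assumptions, and the primer arguments are empty placeholders because the actual primer sequences are not given here.

```python
# Sketch only: site placement and primer placeholders are assumptions,
# not the construct sequences used in the cited study.
BPMI_SITE = "CTGGAG"  # BpmI recognition sequence
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def snp_window(reference, snp_index, allele, width=31):
    """Return a `width`-bp window centered on the SNP, carrying `allele`."""
    half = width // 2
    if snp_index < half or snp_index + half >= len(reference):
        raise ValueError("SNP too close to the sequence end for a full window")
    window = reference[snp_index - half : snp_index + half + 1]
    return window[:half] + allele + window[half + 1 :]


def build_construct(window, primer5="", primer3=""):
    """Flank the window with two inward-facing BpmI sites (hypothetical layout)."""
    return primer5 + BPMI_SITE + window + reverse_complement(BPMI_SITE) + primer3


# Toy example: an 80-bp made-up reference with a SNP at index 40.
reference = "ACGT" * 20
window_g = snp_window(reference, snp_index=40, allele="G")
construct_g = build_construct(window_g)
print(window_g)
print(construct_g)
```

In a real design, the spacing between each recognition site and the SNP would be checked against the enzyme supplier's datasheet so that cleavage falls at the intended position.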
Diagram of tandem SNP-seq and FREP
a, (1) SNPs that fail to bind regulatory proteins such as transcription factors (TF) are negatively selected by PCR after type IIS restriction enzyme (IIS RE) cleavage (top right); protected fSNPs can then be enriched by PCR (bottom right). (2) SNP-seq construct. A 31-bp SNP sequence with the SNP centered on the BpmI cutting site is flanked by two BpmI binding sites. A primer is included for high-throughput sequencing. The whole construct can be amplified using G5 and G3 primers. Bio, biotin; NGS, next-generation sequencing. b, (1) The FREP construct with BamHI (blue box) and EcoRI (green box) restriction sites flanking a 31-bp sequence centered on the fSNP of interest (red) and attached to a magnetic bead by streptavidin and biotin. Parallel procedures using the test fSNP and a control sequence enable identification of sequence-specific protein associations. (2) Incubation with nuclear extract followed by extraction of constructs from unbound nuclear proteins by magnetic bead separation. (3) EcoRI digestion removes 3′ DNA and proteins. (4) BamHI digestion removes 5′ DNA, the beads and proteins, and proteins binding single-stranded DNA, which is not cut and therefore is extracted with the bead. (5) Protein complex identification with mass spectrometry. (6) Identification of associated proteins for each SNP. Large yellow and blue circles represent SNP-specific binding proteins; other colored shapes represent non-SNP-specific proteins binding to the construct.
The researchers coupled this approach, termed SNP-seq, with flanking restriction enhanced pulldown (FREP) to identify regulation of CD40 by three disease-associated fSNPs via four regulatory proteins, RBPJ, RSRC2 and FUBP-1/TRAP150. Applying this approach across 27 loci associated with juvenile idiopathic arthritis, they identified 148 candidate fSNPs, including two that regulate STAT4 via the regulatory proteins SATB2 and H1.2.
Demonstration of the binding of RBPJ to rs4810485 and TRAP150 to rs6065926
a, Gel supershift showing the binding of RBPJ to rs4810485 and TRAP150 to rs6065926. Arrows indicate the supershifted bands. b, ChIP showing endogenous binding of RBPJ to rs4810485 (top) and TRAP150 to rs6065926 (bottom). c, Sequencing trace showing the heterozygous genotype G/T at rs4810485 from mutant 21. d,e, Flow cytometry and western blot showing reduced expression of CD40 in mutant 21 versus WT control. f, ChIP of mutant 21 with an anti-RBPJ antibody showing the specific binding of RBPJ to the rs4810485 site by comparison with an anti-IgG. g, The ratio of risk allele G versus non-risk allele T at rs4810485 in input and ChIP DNA, showing a significant enrichment of the G allele in the ChIP sample.
These findings establish the utility of tandem SNP-seq/FREP to bridge the gap between GWAS and disease mechanism.
Large-scale sequencing studies and genome-wide association studies (GWASs) have identified thousands of genotype–phenotype associations. Some of the phenotype-associated variants alter gene function, while many others are in linkage disequilibrium with the functional variants. The functional impact of variants can be predicted using bioinformatic algorithms, but the in silico predictions are often incorrect and need experimental validation. While there are several experimental methods to functionally test variants, most do not have the capacity to test large numbers of variants simultaneously.
Researchers from the Indiana University School of Medicine have developed PASSPORT-seq (parallel assessment of polymorphisms in miRNA target-sites by sequencing), a high-throughput bioassay that involves pooled synthesis, parallel cloning and single-well transfection followed by next-generation sequencing (NGS) to functionally test hundreds of mirSNPs at once [2]. This assay produced results that are reproducible and consistent with luciferase reporter assays, a gold-standard platform widely used to assess gene expression in vitro.
Workflow of the PASSPORT-seq bioassay
(A) 100 reference and variant miRNA binding regions, each with the same 15–20 bp flanking sequence, were synthesized as an oligonucleotide pool. (B) Using the flanking universal sequences, the oligonucleotide pool was amplified and made double-stranded by PCR. pIS-0 plasmid was linearized by restriction enzymes. (C) The double-stranded oligonucleotides were inserted into the linear plasmid using the NEBuilder™ gene assembly system. (D) Chemically competent bacteria were transformed with the plasmid pool containing the test miRNA binding regions. Transformed bacteria were plated on four plates. (E) All colonies from the plates were harvested, combined and scaled up in liquid culture. Plasmids were isolated from the liquid culture. (F) Three cell lines were transfected with the plasmid pool and incubated for 48 h, after which cDNA was prepared from total RNA. (G) miRNA binding regions were amplified using universal primers that were uniquely barcoded for replicates within cell lines and for the input plasmid pool. (H) The barcoded PCR products were combined to form the sequencing pool.
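Steps (G) and (H) imply that sequencing reads must later be assigned back to their sample by barcode and to an allele by insert sequence. The following is a minimal sketch of that bookkeeping, assuming a fixed-length barcode at the start of each read and exact sequence matching; the function name `count_alleles`, the barcodes, and the toy sequences are hypothetical and not taken from the published protocol.

```python
from collections import Counter, defaultdict


def count_alleles(reads, barcodes, alleles, barcode_len=6):
    """Tally reads per sample and allele by exact match (illustrative only).

    reads    -- iterable of read strings
    barcodes -- dict mapping barcode sequence -> sample name
    alleles  -- dict mapping binding-region sequence -> allele label
    """
    counts = defaultdict(Counter)
    for read in reads:
        sample = barcodes.get(read[:barcode_len])
        if sample is None:
            continue  # unknown barcode: skip the read
        insert = read[barcode_len:]
        for seq, label in alleles.items():
            if seq in insert:
                counts[sample][label] += 1
                break
    return counts


# Toy data: two samples, one SNP with reference and variant alleles.
barcodes = {"AAAAAA": "plasmid_input", "CCCCCC": "HEK293_rep1"}
alleles = {"ACGTTGCA": "snp1_ref", "ACGTAGCA": "snp1_var"}
reads = [
    "AAAAAAGGACGTTGCATT",
    "AAAAAAGGACGTAGCATT",
    "CCCCCCGGACGTTGCATT",
    "CCCCCCGGACGTTGCATT",
    "GGGGGGGGACGTTGCATT",  # barcode not in the table, ignored
]
counts = count_alleles(reads, barcodes, alleles)
print({sample: dict(c) for sample, c in counts.items()})
```

A production pipeline would typically tolerate sequencing errors in barcodes and inserts; exact matching is used here only to keep the example short.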
The utility of the bioassay was demonstrated by testing 100 mirSNPs in HEK293, HepG2, and HeLa cells. The results of several of the variants were validated in all three cell lines using traditional individual luciferase assays. Fifty-five mirSNPs were functional in at least one of three cell lines (FDR ≤ 0.05); 11, 36, and 27 of them were functional in HEK293, HepG2, and HeLa cells, respectively.
Validation of the PASSPORT-seq assay
(A) Correlation between the percent change of variant alleles compared to the respective reference alleles observed in the experimental and validation PASSPORT-seq runs. (B) Functional mirSNPs identified by the PASSPORT-seq assay. For each SNP, the observed percent change in the expression of the variant allele compared to the respective reference allele in the predicted miRNA binding site was calculated. Statistically significant changes after correction for multiple testing using the Benjamini and Hochberg algorithm are indicated by colored boxes. Blue boxes indicate a reduction in variant allele expression and orange boxes indicate increased expression.
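The two calculations named in this caption, a percent change of variant versus reference allele and a Benjamini and Hochberg correction, can be illustrated as follows. This is a sketch under stated assumptions: normalizing the cDNA allele ratio by the ratio in the input plasmid pool is an assumed choice rather than the authors' documented formula, the read counts and p-values are made up, and the function names are hypothetical.

```python
def percent_change(var_cdna, ref_cdna, var_input, ref_input):
    """Percent change of variant vs. reference allele abundance.

    Assumption: the cDNA variant/reference ratio is normalized by the
    same ratio in the input plasmid pool.
    """
    ratio_cdna = var_cdna / ref_cdna
    ratio_input = var_input / ref_input
    return (ratio_cdna / ratio_input - 1.0) * 100.0


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up numbers for illustration only.
change = percent_change(var_cdna=50, ref_cdna=100, var_input=100, ref_input=100)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
significant = [q <= 0.05 for q in adjusted]
print(change, adjusted, significant)
```

With these toy inputs, a variant allele recovered at half the reference level in cDNA but at equal levels in the input pool corresponds to a 50% reduction in expression.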
The PASSPORT-seq assay is a powerful tool that bridges bioinformatic predictions and high-throughput mechanistic investigation of functional genetic variants that affect miRNA–mRNA interactions.
Thanks to the creative assay development of these researchers and others, OligoMix® has been demonstrated as an effective method of high-throughput oligo synthesis for use in functional variant analysis.
OligoMix® has also been demonstrated as an effective method of high-throughput oligo synthesis for production of guide RNA libraries in CRISPR/Cas9 high-throughput functional genomic screens.
1. Li G, Martínez-Bonet M, Wu D, Yang Y, Cui J, Nguyen HN, Cunin P, Levescot A, Bai M, Westra HJ, Okada Y, Brenner MB, Raychaudhuri S, Hendrickson EA, Maas RL, Nigrovic PA. (2018) High-throughput identification of noncoding functional SNPs via type IIS enzyme restriction. Nat Genet 50(8):1180-1188.
2. Ipe J, Collins KS, Hao Y, Gao H, Bhatia P, Gaedigk A, Liu Y, Skaar TC. (2018) PASSPORT-seq: A Novel High-Throughput Bioassay to Functionally Test Polymorphisms in Micro-RNA Target Sites. Front Genet 9:219.