NGS Library Construction | DNA Technologies Core (2023)

As sequencing capacity and experimental scale increase, generating libraries for sequencing is often the rate-limiting step. We are happy to discuss the options and protocols suitable for your specific research projects. We can create standard and specialized libraries of many types, including genomic DNA with inserts of different sizes, RNA-seq with ribo-depleted or strand-specific options, exome capture, ChIP-seq and microRNA-seq. We have streamlined and automated library preparation and can now create up to 96 different barcoded libraries with IntegenX Apollo 324 robots and a Caliper Sciclone G3 for consistent quality and fast processing. We also offer training and access to the robots if you want to use the instruments for large-scale projects.

The starting material for building an Illumina library is usually double-stranded (ds) DNA from any source: genomic DNA, BACs, PCR amplicons, ChIP samples, or any type of RNA that has been converted to ds-cDNA (mRNA, normalized total RNA, smRNAs), etc. Pretty much anything you can think of ends up as dsDNA or can be converted to dsDNA. This dsDNA is then fragmented (if it isn't already, as with ChIP). The average fragment length should not exceed 600 bp (HiSeq 2500, MiSeq) or 350 bp (HiSeq 3000). Then the ends are repaired and A-tailed, the adapters are ligated, size selection is performed, and PCR is performed to create the final library ready for sequencing. Different library types may differ in detail (e.g. PCR-free libraries), but this is the basic workflow. A great forum for all sorts of sequencing-related questions on all platforms is the SEQanswers.com forum.

Quantification and purity of DNA/RNA samples
The amounts of DNA and RNA given below and in the requirements table apply when samples are quantified using a fluorometric method (e.g. Qubit, PicoGreen, RiboGreen). Fluorometry offers advantages in precision and specificity (e.g. DNA dyes do not bind to or measure RNA). If using a spectrophotometer (e.g. Nanodrop), we recommend sending twice the amount of sample, as this type of measurement is generally unreliable for quantification. In any case, sample amounts in excess of the minimum requirements increase the complexity of the library. Spectrophotometer readings are, however, very useful for assessing the purity of samples. For DNA samples, the 260/280 ratio must be between 1.8 and 2.0 and the 260/230 ratio must be greater than 2.0. For RNA samples, the 260/280 ratio must be between 1.8 and 2.1 and the 260/230 ratio must be greater than 1.5. Values outside these ranges indicate contamination. The Real-time PCR Core can perform DNA and RNA extractions.
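
The purity ranges above can be written down as a simple check. The following is a minimal Python sketch; the function name `check_purity` and the shape of its return value are hypothetical and only encode the thresholds quoted in this section.

```python
def check_purity(sample_type, a260_280, a260_230):
    """Return a list of warnings for a DNA or RNA sample.

    The thresholds are the ones quoted in the text above. The helper
    itself is illustrative and not part of any Core software.
    """
    limits = {
        # sample type: (min 260/280, max 260/280, min 260/230)
        "DNA": (1.8, 2.0, 2.0),
        "RNA": (1.8, 2.1, 1.5),
    }
    if sample_type not in limits:
        raise ValueError("sample_type must be 'DNA' or 'RNA'")
    low, high, min_230 = limits[sample_type]

    warnings = []
    if not (low <= a260_280 <= high):
        warnings.append(f"260/280 = {a260_280} is outside {low}-{high}")
    if not (a260_230 > min_230):
        warnings.append(f"260/230 = {a260_230} is not above {min_230}")
    return warnings


# A clean DNA sample produces no warnings; a contaminated one produces two.
print(check_purity("DNA", 1.85, 2.2))
print(check_purity("DNA", 1.6, 1.2))
```

An empty list means both ratios fall inside the stated ranges; any entry in the list points to likely contamination and is worth resolving before submission.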

Sequencing Library Preparation Services: Sample Requirements

Also see the complete table of requirements.

DNA-based libraries

The performance specifications of the libraries we produce depend on the source material. Genomic DNA, double-stranded cDNA libraries, BACs, or other materials available in microgram quantities will almost always produce high quality libraries.

Guidelines for Submitting Library-Worthy DNA
Please submit 2 µg or more of high quality DNA (concentration > 50 ng/µl, OD 260/280 close to 1.8; 260/230 ratio > 2.0) in EB or TE buffer (preferably EB buffer) or molecular biology grade water. Library construction can also be attempted, with restrictions, from less input. If the total input material for library preparation is less than 100 ng, special library preparation protocols must be used. Sample amounts of 5 µg DNA are recommended for PCR-free libraries; working with less is possible.

ChIP Libraries
We offer the construction of libraries from immunoprecipitated chromatin material. For these more complex experiments, we recommend discussing the suitability of the starting material and the construction strategy with Core personnel. No guarantees are given for this library service other than that we will do our best! For general background, the ChIP-Seq technical note and the ChIP-Seq data sheet from Illumina may be of interest.

Mate Pair Libraries
Mate pair library sequencing produces long-insert paired-end reads. Libraries are created by self-ligating long DNA fragments and marking the junctions to create chimeric library molecules that bring together sequences originally separated by 2 kb to 12 kb. We use the Illumina Nextera Mate Pair Kit, which uses a transposase enzyme to fragment and tag DNA in a single step ("tagmentation"). The tags are biotinylated, allowing the selection of fragments containing junctions. Unlike older mate pair library protocols, the Nextera kit is very reliable, with the exception of the initial fragment sizing. As with all other analyses of long DNA fragments, the quality of the DNA is important here. Please send us a gel image before sending the DNA samples. Samples should run on agarose gels as a band of 20 kb or larger.
The Nextera kit offers two protocols. The "gel-free" version (1 µg input) is of particular interest when little input DNA is available. Mate pair fragment sizes for this protocol generally range from 1.5 kb to 10 kb. Surprisingly, the SSPACE scaffolder can still work with this data.
The "gel-plus" version requires at least 4 µg input DNA (and four times the reagents) and uses gel extractions to size fragments within a range of ±700 bp for shorter mate pairs and within ±2 kb for longer mate pairs of up to 10-12 kb. Due to the fragmentation uncertainties, please send at least double the amount of sample.
Theoretically, the sizes of the fragments resulting from tagmentation depend only on the amount of DNA supplied. In practice, fragment lengths vary considerably between different DNA samples of similar amounts. This variability between samples can be observed even after accurate quantification of DNA by fluorometry. However, reactions can be adjusted for aliquots of the same sample. Especially when very specific size ranges are desired, it is often necessary to repeat the tagmentation reaction with adjusted amounts of DNA. We can then combine similarly sized gel extraction fractions from two tagmentation reactions to generate highly complex libraries for the desired size ranges. Let us know why specific insert size ranges are important to your project.
Because fragment size ranges are difficult to predict, we quote mate pair recharge rates that include two tagmentation reactions. If we can generate the desired library with a single tagmentation, we charge the lower price for library preparation with one tagmentation.

Target Enrichment
Several companies offer services and platforms for whole-exome capture or target amplification. We offer the Fluidigm Access Array, which uses nanofluidics for low-cost target amplification to generate barcoded amplicon libraries ready for Illumina sequencing. Sequence capture libraries are those in which specific genomic regions are enriched after generation of an indexed sequencing library. This strategy allows for very deep and focused sequencing and can be implemented for a variety of applications. Several companies offer platforms that can generate this material, including Illumina, RainDance, Agilent, NimbleGen, and Fluidigm. Technical information about Agilent, NimbleGen, RainDance, and Qiagen (which uses a PCR-based, non-hybridization capture strategy for enrichment) is available but cannot be guaranteed to be up to date. Please consider it a starting point for further investigation (we have contact information for company representatives if needed) and for informational purposes only (no implied endorsements, etc.).

RNA-Seq Libraries

The name is slightly misleading because all libraries end up as DNA; it refers to the starting material. We offer RNA-Seq library preparation with a variety of options such as ribo-depletion, poly-A enrichment, and strand-specific libraries as described below, as well as micro RNA (miRNA) and small RNA library preparations.

Guidelines for submitting library-worthy RNA
Provide at least 1 µg (preferably 2-5 µg) of total RNA at a concentration of at least 50 ng/µl (1 µg for poly-A enrichment; 2 µg for ribo-depleted libraries; using less starting material is possible, but we cannot guarantee results). Make sure your RNA isolation protocol uses a DNase digestion step or other means to remove DNA from the sample. On an agarose gel, DNA contamination is visible as a band of fragments significantly larger than the RNA (>10 kb). To verify the purity of RNA samples, the 260/280 ratio must be between 1.8 and 2.1 and the 260/230 ratio must be greater than 1.5. Poly-A enrichment, ribo-depletion, and strand-specific library preparation are among the commonly requested types of services (see below for more technical details). We recommend following Illumina's recommendations: for human samples, use total RNA with a Bioanalyzer RIN score of 8 or better; for plant material, RIN values may be lower and tissue specific (this depends mainly on chloroplast content). Libraries for slightly degraded RNA samples should be prepared using ribo-depletion protocols. If possible, avoid RNA extraction protocols using Trizol or related phenolic reagents (silica column-based kits are less likely to retain contaminants). When using Trizol, protocols that include column-based purification (e.g. Direct-zol, TRIzol Plus) should be used. Note that an additional column cleanup is required for RNA samples isolated from PAXgene tubes or with PAXgene kits. RNA samples must be eluted in molecular biology grade water, always stored in a -80 °C freezer, and shipped on dry ice. All RNA samples require Bioanalyzer sample quality control (or equivalent). These QC traces can be sent by customers, or we can do the QC for a fee instead.

Poly-A Enrichment
Total RNA samples can contain up to 90% ribosomal RNA sequences that are not significant for transcriptome or gene expression studies, while mRNAs typically represent only 1-2% of total RNA. Therefore, enrichment of samples for mRNAs is highly desirable. Poly-A enrichment is the most commonly used method to enrich mRNA sequences from eukaryotic total RNA samples; mRNAs are selected by hybridization with poly-T oligos attached to magnetic beads.

Ribosomal RNA depletion
There are several commercially available kits to remove ribosomal RNA from your total RNA. rRNA depletion is the method of choice for reducing the abundance of ribosomal RNA when the transcripts of interest do not carry poly-A tails (bacterial RNA), and when you want to retain long non-coding RNAs (lncRNAs) along with polyadenylated transcripts in your sample. Commercial rRNA removal kits are available for different types of total RNA, including human, mouse, rat, bacteria (gram positive or negative), plant leaf, plant seed and root, and yeast. Ribo-depletion protocols can also allow analysis of slightly degraded RNA samples. We ask for at least 2 µg of total RNA for the preparation of ribo-depleted libraries. As always, libraries can be generated with less material, but complexity can suffer.

Micro RNA and Small RNA Libraries
We offer library construction for micro and small RNAs from total RNA using the Illumina protocol and reagents. We size-select the libraries with high precision using the BluePippin system. The recommended minimum amount of total RNA required for these preparations is 1 µg (recommendation for human samples). As the total RNA composition can vary greatly between tissues and organisms, aim to provide at least 2 µg of total RNA. Also make sure that your RNA isolation method actually retains micro and small RNAs. Total RNA samples must be submitted in molecular biology grade water at a concentration of 200 ng/µl. High quality RNA is recommended (total RNA samples must have RIN scores of 8 or greater as determined by Bioanalyzer QC), and samples must have been DNase treated prior to submission.

Strand-specific RNA libraries
By default, we generate strand-specific RNA-seq libraries in the Core. Please let us know if you prefer traditional non-stranded library preparation. Strand-specific (also known as stranded or directional) RNA-seq libraries greatly enhance the value of an RNA-seq experiment. By adding strand-of-origin information, they make it possible to accurately delineate transcript boundaries in regions with genes on opposite strands and to determine the transcribed strand of non-coding RNAs. During cDNA synthesis, dUTP is incorporated in second strand synthesis. After adapter ligation, the strand containing dUTP is selectively degraded, which preserves the strand information in the RNA-seq data. Thus, the forward read of the resulting sequencing data represents the "antisense strand" and the reverse read represents the "sense strand" of the genes (for Trinity transcriptome assemblies, the "RF" orientation should be specified, i.e. --SS_lib_type RF).
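
The read orientation described above can be turned into a small rule for downstream analysis. The following is a minimal Python sketch, assuming properly paired reads from a dUTP-type stranded library and an alignment strand that has already been extracted from the aligner output; the function name `transcript_strand` is hypothetical.

```python
def transcript_strand(read_number, alignment_strand):
    """Infer the strand of the originating transcript for a dUTP library.

    read_number: 1 for the forward read, 2 for the reverse read.
    alignment_strand: '+' or '-', the genome strand the read aligned to.

    In this library type the forward read is antisense to the transcript
    and the reverse read is sense, so read 1 is flipped and read 2 is
    taken as is.
    """
    if read_number not in (1, 2):
        raise ValueError("read_number must be 1 or 2")
    if alignment_strand not in ("+", "-"):
        raise ValueError("alignment_strand must be '+' or '-'")
    if read_number == 2:
        return alignment_strand
    return "-" if alignment_strand == "+" else "+"


# A forward read aligned to the plus strand comes from a minus-strand gene.
print(transcript_strand(1, "+"))
```

Most quantification tools expose this choice as a strandedness setting, so the rule usually only needs to be selected rather than implemented; the sketch is meant to make the orientation explicit.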

More thoughts about the library

PCR-free Libraries
Libraries generated without amplification reduce bias in library preparation. Therefore, they can improve sequencing coverage of difficult genomic regions such as GC-rich regions, promoters, and repeat regions, and improve detection of sequence variants. Note that PCR-free libraries are more difficult to quantify and QC (see bottom of page) and yields from these libraries tend to be lower (10-15%) compared to amplified libraries. Preparation of a PCR-free library also requires a larger amount of starting material (>5-fold).

Indexed Libraries
Indexing, also called barcoding, allows multiple libraries to be sequenced in a single lane, i.e. multiplexing. By default, all libraries generated by us are provided with a barcode. Multiplexing is necessary when the typical lane output of 15-25 million MiSeq reads, 120-180 million HiSeq 2500 reads, or 260-310 million HiSeq 3000 reads is greater than is needed for a single library (e.g. sequencing BACs, PCR-generated fragments, small microbial genomes, transcriptomes, exomes, ChIP and small RNA applications). Multiplexing is also the best way to minimize potential lane-to-lane sequencing variation, as all of your samples are subject to the same sequencing conditions. For example, if you need two lanes of sequencing for six samples, we recommend 6-plexing and sequencing the pool in two lanes rather than 3-plexing per lane.

The principle is that short nucleotide "barcodes" are attached to each library using specific adapters containing these sequences. Libraries containing different indexed adapters are then constructed, quantified, pooled in equimolar amounts and sequenced. Informatic deconvolution of the barcodes then assigns each read to its library, which allows multiple libraries to be sequenced in a single lane, saving cost and time.

So far, two methods have been used for this: using the commercially available indexing kits (Illumina TruSeq, Nextera or Bioo Scientific) or synthesizing your own adapter oligos with your own barcodes. With the Illumina TruSeq v2 A and B Library Prep Kits, you can use up to 24 different barcodes per kit to multiplex up to 48 libraries. Bioo Scientific offers Illumina-compatible barcodes (NEXTflex) with up to 96 barcodes. The Nextera kit (Epicentre/Illumina) uses dual indexing and transposon-mediated fragmentation ("tagmentation") followed by PCR amplification to incorporate barcoded adapters (so a PCR-free library is not an option when using the Nextera kit).
The dual indexing/adapter identification strategy (with up to 12 indexes available for adapter 1 and up to eight indexes for adapter 2) allows for up to 96 unique dual index combinations.
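
The lane and barcode arithmetic above can be sketched in a few lines. This is a minimal Python illustration, assuming equimolar pooling and even read distribution across libraries (real runs deviate from this); the function names are hypothetical, and the lane output used is the lower bound quoted above for the HiSeq 2500.

```python
def reads_per_library(lane_output_reads, lanes, libraries):
    """Expected reads per library when all libraries share all lanes."""
    return lane_output_reads * lanes / libraries


def max_plex_per_lane(lane_output_reads, reads_needed_per_library):
    """Largest number of libraries one lane can carry at the required depth."""
    return lane_output_reads // reads_needed_per_library


HISEQ_2500_LANE = 120_000_000  # lower bound of the range quoted in the text

# Six libraries pooled over two lanes give the same expected depth as
# 3-plexing per lane, but every library sees both lanes.
print(reads_per_library(HISEQ_2500_LANE, lanes=2, libraries=6))
print(reads_per_library(HISEQ_2500_LANE, lanes=1, libraries=3))

# Hypothetical requirement of 10 million reads per library.
print(max_plex_per_lane(HISEQ_2500_LANE, 10_000_000))

# Dual indexing: 12 adapter 1 indexes x 8 adapter 2 indexes.
print(12 * 8)
```

Since the expected depth is identical in both designs, the 6-plex pool costs nothing in reads while spreading any lane-specific effects across all samples.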

Homemade indexing has been used successfully by many users. Avoid the "inline" barcode strategy and use TruSeq or Nextera adapter designs instead (i.e. designs in which the barcodes are read in a separate index read and do not interfere with cluster registration). It is important to ensure that the base composition of the indices in a pool is balanced at every position, so that the image analysis software can discriminate the signals.
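
A balance check of this kind is easy to automate before ordering oligos. The sketch below is a minimal Python example that assumes the four-channel chemistry of the HiSeq and MiSeq, where A and C are read with one laser and G and T with the other, so that every index position should contain at least one base from each group across the pool. The function name and the example index sequences are hypothetical.

```python
LASER_1 = set("AC")  # assumption: A and C share one laser channel
LASER_2 = set("GT")  # assumption: G and T share the other laser channel


def unbalanced_positions(indexes):
    """Return 1-based index positions where only one laser channel is used."""
    if not indexes:
        raise ValueError("need at least one index sequence")
    length = len(indexes[0])
    if any(len(idx) != length for idx in indexes):
        raise ValueError("all index sequences must have the same length")

    bad = []
    for pos in range(length):
        bases = {idx[pos].upper() for idx in indexes}
        # A position is balanced only if both channels carry signal.
        if not (bases & LASER_1) or not (bases & LASER_2):
            bad.append(pos + 1)
    return bad


# Hypothetical 4-base indexes, for illustration only.
print(unbalanced_positions(["ACGT", "GTAC"]))  # balanced pool
print(unbalanced_positions(["AAGT", "CTGA"]))  # positions 1 and 3 are flagged
```

Low-plex pools are the most likely to fail this check, so it is worth running whenever only two or three barcodes share a lane.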

Libraries: Make Your Own

Construction of the library involves fragmentation of the DNA (if necessary, depending on the nature of the initial sample), enzymatic treatment of the DNA to repair and A-tail the fragments, ligation of sequencing adapters to these fragments, and subsequent amplification by PCR (skipped for PCR-free libraries), with or without size selection, depending on the application. Below is more information on these different aspects of library construction. We also offer a training workshop on next-generation library preparation for comprehensive hands-on training in preparing high-quality libraries for Illumina sequencing.

Fragmentation
DNA to be processed into a sequencing library must first be broken into small fragments. The average insert length should not exceed 600 bp (HiSeq 2500, MiSeq) or 350 bp (HiSeq 3000). There are several methods for doing this, each with its own advantages and disadvantages. Many protocols and centers rely on and recommend a Covaris fragmentation device, which uses adaptive focused acoustics to shear DNA into appropriately sized fragments. The Covaris E220 can meet high-throughput library production needs. We mainly use a Covaris E220 and a Diagenode Bioruptor NGS (or Bioruptor UCD-200). Access to these instruments is available through the Core, subject to the standard training and sign-up policies for Core equipment.

Basic DNA and RNA library protocols
We have used library kits from Illumina, Wafergen, Kapa Biosystems, Bioo Scientific, Clontech, and NuGen as the source for fragment repair, tailing, and amplification enzymes. There are a number of other next-generation library products currently being used in the research community. For mRNA-seq libraries, we currently use the stranded Illumina kit. New products keep appearing, and we encourage you to look into them if RNA-seq libraries are of interest. In particular, the ribosomal RNA depletion protocols that integrate into the Illumina kit and the new library production and RNA amplification capabilities from NuGen and Clontech have expanded the services we offer and will undoubtedly continue to do so. In other words, keep checking this site to see how things are progressing.

Illumina Adapter Oligonucleotides
Illumina adapter oligonucleotide sequences are available here. Illumina tends to sell its adapters only together with library preparation kits. Bioo Scientific is another supplier of fully compatible, off-the-shelf adapters. Custom syntheses from companies such as MWG-Eurofins, Bioneer and IDT are another valid option. Two things to note: the indexed "upper" adapter (starting with GATC) needs to be phosphorylated, and the universal "lower" adapter can be synthesized with a special bond between the 3'-terminal T and the preceding C. This phosphorothioate bond makes the protruding T (after annealing the upper and lower adapter oligos) more resistant to nucleases, reducing the likelihood of adapter dimers (more on adapter dimers below).

Libraries: Quality Control (QC)

The quality of the library is the most important factor for the success of your sequencing run, both in terms of the number of reads generated (quantification) and the validity of the sequence obtained (content). Construction and analysis methods are evolving; a useful early paper by Quail et al. from the Sanger Institute lists a number of improvements over standard Illumina protocols in library preparation and analysis. If you create your own libraries, you might want to download this paper and its supplementary methods table for the many practical topics covered. We perform two quality control steps on all libraries sequenced in our Core: analysis on the Agilent Bioanalyzer and quantification with the Kapa Biosystems Illumina Library Quantification Kit.

The Bioanalyzer provides a detailed visual examination of libraries. The electropherogram of a "perfect" library shows a single peak of the expected molecular weight. Other commonly seen species include primer dimers (at about 80-85 bases), adapter dimers (about 120 bases), and broader peaks of higher MW than the expected peak. Primer dimers, which are minimized by the use of magnetic beads, are not a problem unless they completely dominate the reaction. Adapter dimers can pose a problem because they sequence much more efficiently than the library fragments. As a result, whatever the proportion of adapter dimers in your library, an even higher percentage of the reads in your final data files will come from them. Adapter dimers can be minimized by adjusting the adapter:insert ratio during library construction and by taking care during gel extraction or other sizing steps. The larger MW species visible on the Bioanalyzer, generally with more irregular peak shapes, are likely the result of over-amplification during the final PCR step. While some level of these is tolerable, if they are very evident, the library should be reamplified from the material extracted from the gel.

We use a qPCR assay for library quantification: the Kapa Biosystems qPCR assay is performed on every library we sequence (and is included in the sequencing price). This has allowed us to provide much more consistent cluster numbers, resulting in more consistent read counts. It is imperative to maximize the recovered data given the time and cost involved, especially in long runs, which is why we strongly recommend this quantification.

Library, shipping, and storage requirements

Submission
We must receive both electronic (dnatech@ucdavis.edu) and hard copy (shipped with samples) versions of the appropriate submission form. All customer-supplied libraries must be accompanied by Bioanalyzer (or similar) traces. If no traces are provided, we carry out the Bioanalyzer analysis for a fee. Please visit the page to download submission forms and more detailed instructions. The same form is currently used for library preparation and library sequencing submission. Please contact us if you have any questions about the information required; it is important that you enter all of this information to minimize the chance of mistakes in these expensive and time-consuming experiments. One thing we need to know is the approximate desired insert size. In the absence of specific preferences, we recommend around 220 bp for most mRNA and DNA libraries, but insert sizes should be larger for MiSeq runs with longer read lengths. For certain applications, e.g. de novo assembly, different sizes may be required and we can accommodate them. But again, we recommend checking the suitability of these values for the experiment you want to run.

Sequencing library requirements
Standard library submission requirements are at least 15 µl volume at a concentration of 5 nM (e.g. 2.3 ng/µl for a 700 bp library). More volume and/or greater concentration are welcome. We can work with less concentrated libraries (down to 1 nM), but quantification becomes less reproducible, the library becomes less stable, and a relatively larger share of the library DNA adheres to the walls of the storage tube. Lower sequencing yield is the likely result for library concentrations of 1 nM or less, and we cannot guarantee the quantity or quality of data for such libraries. The best buffer for storing and shipping libraries is 10 mM Tris/0.01% Tween-20, pH 8.0 or 8.4, but EB buffer is also acceptable. If possible, use 0.6 mL or 1.5 mL low-binding tubes. If you do not provide a Bioanalyzer trace (or equivalent) of your library, we will generate one for a fee. Note that the DNA insert size(s) must not exceed 700 bp and that most Illumina adapters add approximately 120 bases to the length of the fragment as seen on the Bioanalyzer. When submitting your libraries for sequencing, use our Illumina Sequencing Submission Form (printed copy with samples and email to dnatech@ucdavis.edu), and include the Bioanalyzer profile, the library preparation methods, and the index sequences used. We measure the quantity of your libraries using real-time PCR (included in the sequencing price).

Libraries for HiSeq 3000 sequencing: The latest generation of sequencers has stricter library requirements and requires higher library concentrations. The average insert size should ideally be 350 bp and the "tail" of longer fragments should not exceed 550 bp. The new cluster chemistry is more sensitive to adapter dimers: 5% adapter dimer contamination can result in 60% of reads coming from these dimers. Therefore, it is very important that there is no evidence of an adapter dimer peak (about 120 bp) in the Bioanalyzer trace. Our preferred library submission requirements are at least 15 µl volume at a concentration of 5 nM (e.g. 1.6 ng/µl for a 470 bp library). More volume and/or greater concentration are welcome. Lower sequencing yields are likely at library concentrations of 2 nM or less, and we cannot guarantee the quantity or quality of data for such libraries.
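
The concentration examples quoted above (5 nM corresponding to about 2.3 ng/µl for a 700 bp library and about 1.6 ng/µl for a 470 bp library) follow from the usual molarity conversion for double-stranded DNA. The following is a minimal Python sketch, assuming an average mass of 660 g/mol per base pair and using the full library fragment length including adapters; the function names are hypothetical.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair, a common average for dsDNA


def nm_to_ng_per_ul(conc_nm, length_bp):
    """Convert a dsDNA library concentration from nM to ng/µl."""
    # nmol/L x g/mol = ng/L; dividing by 1e6 µl per litre gives ng/µl.
    return conc_nm * AVG_BP_MASS * length_bp / 1e6


def ng_per_ul_to_nm(conc_ng_per_ul, length_bp):
    """Convert a dsDNA library concentration from ng/µl to nM."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * length_bp)


print(round(nm_to_ng_per_ul(5, 700), 1))  # 2.3
print(round(nm_to_ng_per_ul(5, 470), 1))  # 1.6
```

The reverse conversion is the one typically needed in practice: a fluorometric reading in ng/µl plus the average fragment length from the Bioanalyzer trace gives an estimate of the molarity, which the qPCR quantification then refines.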

QC of PCR-free libraries: The quality of these libraries is difficult to assess. Some of the adapters in these libraries are single-stranded. Therefore, these libraries tend to migrate more slowly on the Bioanalyzer than fully double-stranded amplified libraries. In most cases the libraries appear to be 70 to 100 nt longer than they really are; however, the Bioanalyzer traces can also deviate much more (e.g. by 500 bases). To be certain of the actual lengths of the library fragments, we recommend amplifying an aliquot (1 µl) of the library with 8 cycles of PCR and running both the PCR-free sample and the amplified sample on the Bioanalyzer. If multiple PCR-free libraries are to be pooled, consider quantifying each library by qPCR prior to pooling.

Custom sequencing primers: Please note that these are used for only a small minority of sequencing projects. Custom sequencing primers must be shipped with the libraries at a concentration of 100 µM and in a volume of 20 µl each. Make sure the sequencing primer design fits the chosen Illumina platform. MiSeq and HiSeq platforms use different annealing temperatures.

Scheduling
Once your library or library starting material is ready, you should submit it as quickly as possible to get the next available slot in the queue. Runs occur when we fill either two (rapid mode) or eight (high-output mode) lanes of a HiSeq flow cell, and their timing can vary based on the type of service and Core activity. Turnaround time for MiSeq runs is typically five to eight days, while for HiSeq sequencing you should expect three to five weeks; in either case, allow an additional week or two for library preparation. The HiSeq sequencing schedule is now available on our website.

Sample/Library Storage Policy
Please let us know if you would like to collect your samples/libraries after sequencing and we will be happy to accommodate you; otherwise, due to space limitations, they are only stored for six months after the sequencing runs are completed.
