Comparison of stranded and non-stranded RNA-seq transcriptome profiling and investigation of gene overlap


BMC Genomics, 16:675

Transcriptomic methods


Background
While RNA-sequencing (RNA-seq) is becoming a powerful technology in transcriptome profiling, one significant shortcoming of the first-generation RNA-seq protocol is that it does not retain the strand of origin for each transcript. Without strand information, it is difficult and sometimes impossible to accurately quantify expression levels for genes with overlapping genomic loci that are transcribed from opposite strands. It has recently become possible to retain the strand information by modifying the RNA-seq protocol; the modified protocol is known as strand-specific or stranded RNA-seq. Here, we evaluated the advantages of stranded RNA-seq over non-stranded RNA-seq in transcriptome profiling of whole blood RNA samples, and investigated the influence of gene overlaps on gene expression profiling results, both in practical RNA-seq datasets and from a theoretical perspective.
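The ambiguity described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Gene` and `Read` types, the `assign` helper, and the two toy loci are hypothetical, and the read's strand is assumed to already be expressed as the strand of the originating transcript.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


class Read(NamedTuple):
    start: int
    end: int
    strand: str  # strand of the originating transcript (assumed known for stranded data)


def assign(read: Read, genes: List[Gene], stranded: bool) -> List[str]:
    """Return the names of all genes the read is compatible with."""
    hits = []
    for gene in genes:
        overlaps = read.start <= gene.end and gene.start <= read.end
        # A non-stranded protocol cannot use the strand, so any positional overlap counts.
        if overlaps and (not stranded or gene.strand == read.strand):
            hits.append(gene.name)
    return hits


# Hypothetical toy annotation: two genes overlapping on opposite strands.
genes = [Gene("geneA", 100, 500, "+"), Gene("geneB", 400, 800, "-")]
read = Read(420, 470, "-")

unstranded_hits = assign(read, genes, stranded=False)  # ambiguous: both genes match
stranded_hits = assign(read, genes, stranded=True)     # resolved: only the "-" gene matches
print(unstranded_hits)
print(stranded_hits)
```

In this toy case the read falls in the region shared by both genes, so position alone cannot decide between them, while the strand can.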

Results
Our results demonstrated a substantial impact of stranded RNA-seq on transcriptome profiling and gene expression measurements. As many as 1751 genes in Gencode Release 19 were identified as differentially expressed when comparing stranded and non-stranded RNA-seq whole blood samples. Antisense genes and pseudogenes were significantly enriched in the differential expression analyses. Because stranded RNA-seq retains the strand information of a read, we can resolve read ambiguity in overlapping genes transcribed from opposite strands, which provides a more accurate quantification of gene expression levels than traditional non-stranded RNA-seq. In the human genome, it is not uncommon to find genomic loci where both strands encode distinct genes. Among the over 57,800 annotated genes in Gencode Release 19, an estimated 19 % (about 11,000) are overlapping genes transcribed from opposite strands. Based on our whole blood mRNA-seq datasets, the fractions of overlapping nucleotide bases on the same and opposite strands were estimated at 2.94 % and 3.1 %, respectively. The corresponding theoretical estimates are 3 % and 3.6 %, in good agreement with our findings.
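As a rough illustration of how a gene-level overlap fraction of this kind could be counted, the sketch below flags every gene that overlaps at least one gene on the opposite strand. The function name and the five toy genes are hypothetical, a single chromosome is assumed, and the pairwise scan is for clarity only and is not the method used in the paper. The last line simply checks the arithmetic behind "19 % (about 11,000)" using the rounded gene total of 57,800.

```python
from typing import List, Tuple

GeneRecord = Tuple[str, int, int, str]  # (name, start, end, strand), 1-based inclusive


def opposite_strand_overlap_fraction(genes: List[GeneRecord]) -> float:
    """Fraction of genes overlapping at least one gene on the opposite strand."""
    flagged = set()
    for i, (name1, start1, end1, strand1) in enumerate(genes):
        for name2, start2, end2, strand2 in genes[i + 1:]:
            if strand1 != strand2 and start1 <= end2 and start2 <= end1:
                flagged.update((name1, name2))
    return len(flagged) / len(genes)


# Hypothetical toy annotation on one chromosome.
toy = [
    ("g1", 100, 500, "+"),
    ("g2", 400, 800, "-"),    # overlaps g1 on the opposite strand
    ("g3", 1000, 1500, "+"),
    ("g4", 1200, 1400, "+"),  # overlaps g3, but on the same strand
    ("g5", 2000, 2500, "-"),
]
frac = opposite_strand_overlap_fraction(toy)
print(frac)

# Arithmetic check of the reported figure: 19 % of roughly 57,800 genes.
approx_count = round(0.19 * 57800)
print(approx_count)
```

A real annotation would need the same comparison per chromosome, and an interval-based index would be preferable to the quadratic scan at genome scale.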

Conclusions
Stranded RNA-seq provides a more accurate estimate of transcript expression compared with non-stranded RNA-seq, and is therefore the recommended RNA-seq approach for future mRNA-seq studies.

Keywords
RNA-seq, Gene quantification, Stranded, Non-stranded, Transcriptomics, Transcriptome profiling, Gene overlap

Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1876-7) contains supplementary material, which is available to authorized users.


Author: Shanrong Zhao - Ying Zhang - William Gordon - Jie Quan - Hualin Xi - Sarah Du - David von Schack - Baohong Zhang

