Statistical methods on detecting differentially expressed genes for RNA-seq data

BMC Systems Biology, 5:S1

First Online: 23 December 2011


Background: For RNA-seq data, the aggregated count of short reads from the same gene is used to approximate the gene's expression level. The count data can be modelled as samples from Poisson distributions with possibly different parameters. To detect genes that are differentially expressed between two conditions, statistical methods for detecting the difference between two Poisson means are used. When the expression level of a gene is low, i.e., the count is small, it is usually more difficult to detect a difference in means, and therefore statistical methods that are more powerful at low expression levels are particularly desirable. In the statistical literature, several methods have been proposed to compare two Poisson means (rates). In this paper, we compare these methods using simulated and real RNA-seq data.
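As an illustration of what comparing two Poisson means involves, the sketch below implements the classical conditional exact test: given the total count n = x0 + x1, the count x1 follows a Binomial(n, t1 / (t0 + t1)) distribution under the null hypothesis of equal rates. The function name, the exposure parameters `t0` and `t1` (for example, library sizes), and the two-sided convention (doubling the smaller tail) are assumptions made here for illustration and may differ from the exact definitions used in the paper.

```python
import math


def conditional_exact_test(x0, x1, t0=1.0, t1=1.0):
    """Two-sided conditional exact test of H0: lambda0 == lambda1.

    x0, x1 are the observed counts for one gene under the two conditions;
    t0, t1 are hypothetical exposure terms (e.g. library sizes).
    Assumption: the two-sided p-value doubles the smaller binomial tail.
    """
    n = x0 + x1
    p1 = t1 / (t0 + t1)  # success probability for x1 under H0
    # Binomial(n, p1) probabilities; fine for the small counts of interest here,
    # but not numerically suited to very large n.
    pmf = [math.comb(n, k) * p1**k * (1.0 - p1) ** (n - k) for k in range(n + 1)]
    lower = sum(pmf[: x1 + 1])  # P(X1 <= x1)
    upper = sum(pmf[x1:])       # P(X1 >= x1)
    return min(1.0, 2.0 * min(lower, upper))


if __name__ == "__main__":
    print(conditional_exact_test(5, 5))
    print(conditional_exact_test(0, 5))
```

With counts this small the test can only produce a coarse set of p-values, which is one reason low-expression genes are hard to call.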

Results: Through a simulation study and real data analysis, we find that the Wald test applied to log-transformed data is more powerful than the other methods considered. The likelihood ratio test has power similar to that of the variance stabilizing transformation test, and both are more powerful than the conditional exact test and the Fisher exact test.
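A minimal sketch of two of the tests named above, the Wald test on log-transformed counts and the likelihood ratio test, may help make the comparison concrete. It uses the standard delta-method variance for the log of a Poisson count (approximately 1/x) and the usual chi-square approximation with one degree of freedom for the likelihood ratio statistic. The function names, the exposure parameters `t0` and `t1`, and the choice to reject zero counts in the log-scale test are assumptions of this sketch, not details taken from the paper.

```python
import math


def wald_log_test(x0, x1, t0=1.0, t1=1.0):
    """Two-sided Wald test on the log scale for H0: lambda0 == lambda1.

    Assumption: counts must be positive; how zero counts are adjusted
    is left open here and may be handled differently in the paper.
    """
    if x0 <= 0 or x1 <= 0:
        raise ValueError("log-scale statistic is undefined for zero counts")
    # Difference of log rates divided by its delta-method standard error.
    z = (math.log(x1 / t1) - math.log(x0 / t0)) / math.sqrt(1.0 / x0 + 1.0 / x1)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return z, p


def likelihood_ratio_test(x0, x1, t0=1.0, t1=1.0):
    """Likelihood ratio test for H0: lambda0 == lambda1 (chi-square, 1 df)."""
    pooled = (x0 + x1) / (t0 + t1)  # common rate estimate under H0

    def term(x, t):
        # Convention: x * log(x / expected) is 0 when x == 0.
        return x * math.log(x / (t * pooled)) if x > 0 else 0.0

    stat = max(0.0, 2.0 * (term(x0, t0) + term(x1, t1)))
    p = math.erfc(math.sqrt(stat / 2.0))  # chi-square(1) survival function
    return stat, p


if __name__ == "__main__":
    print(wald_log_test(5, 20))
    print(likelihood_ratio_test(5, 20))
```

Both functions take one gene's pair of counts; applying them across all genes and then correcting for multiple testing would be a separate step not shown here.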

Conclusions: When the count data in RNA-seq can reasonably be modelled as Poisson distributed, the Wald-Log test is more powerful and should be used to detect differentially expressed genes.

Authors: Zhongxue Chen, Jianzhong Liu, Hon Keung Tony Ng, Saralees Nadarajah, Howard L Kaufman, Jack Y Yang, Youping Deng

