RNA-Seq has become the technology of choice for interrogating the transcriptome.
However, most methods for RNA-Seq differential expression (DE) analysis do not
utilize prior knowledge of biological networks to detect DE genes. With the
increased availability and quality of biological network databases, methods that
can utilize this prior knowledge are needed and will offer biologists a
viable, more powerful alternative when analyzing RNA-Seq data. Here,
we propose a three-state Markov Random Field (MRF) method that utilizes known
biological pathways and interactions to improve sensitivity and specificity,
thereby reducing false discovery rates (FDRs) when detecting differentially
expressed genes from RNA-Seq data. The method requires normalized count data
(e.g. Fragments or Reads Per Kilobase of transcript per Million mapped reads,
FPKM/RPKM) as input and is implemented in the R package pathDESeq,
available from GitHub. Simulation studies demonstrate that our method
outperforms the two-state MRF model for various sample sizes. Furthermore, for a
comparable FDR, it has better sensitivity than DESeq, EBSeq, edgeR and NOISeq.
The proposed method also identifies more top Gene Ontology terms and KEGG
pathway terms when applied to real datasets from colorectal cancer and
hepatocellular carcinoma studies, respectively. Overall, these findings clearly
highlight the power of our method relative to the existing methods that do not
utilize prior knowledge of the biological network.
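
As an illustration of the input format mentioned above, the following minimal Python sketch computes RPKM values for a single sample using the standard definition (reads mapped to a gene, scaled by gene length in kilobases and by total mapped reads in millions). The function name `rpkm` and the example counts and lengths are hypothetical and chosen only for illustration; this is not part of pathDESeq, and it assumes the total mapped reads equal the sum of the supplied per-gene counts.

```python
def rpkm(counts, lengths_bp):
    """Return RPKM values for one sample (illustrative sketch only).

    counts     -- reads mapped to each gene
    lengths_bp -- gene (transcript) lengths in base pairs
    Assumes total mapped reads = sum(counts).
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    total_reads = sum(counts)
    if total_reads == 0:
        raise ValueError("sample has no mapped reads")
    # RPKM = reads * 1e9 / (total mapped reads * gene length in bp);
    # the 1e9 factor combines "per kilobase" (1e3) and "per million" (1e6).
    return [c * 1e9 / (total_reads * length)
            for c, length in zip(counts, lengths_bp)]


# Hypothetical toy sample: three genes with made-up counts and lengths.
example_counts = [100, 300, 600]
example_lengths = [1000, 2000, 3000]
values = rpkm(example_counts, example_lengths)
```

For paired-end data, FPKM follows the same formula with fragments counted in place of reads.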