Details of talk
|Title||Multiscale approaches for analyses of high-throughput sequencing data|
|Presenter||Heejung Shim (The University of Melbourne)|
|Session||Biostatistics and Bioinformatics|
Identifying differences between multiple groups in cellular phenotypes measured by high-throughput sequencing assays is a common task in genomics applications. For example, common problems include detecting differences in transcription factor binding or chromatin accessibility across tissues or conditions using ChIP-seq or ATAC-seq data. These high-throughput sequencing data provide high-resolution measurements of how traits vary along the whole genome in each sample. However, typical analyses fail to exploit the full potential of these high-resolution measurements, instead aggregating the data at coarser resolutions, such as genes or windows of fixed length. Previously, we developed a wavelet-based (normal-based) multi-scale method, WaveQTL, that better exploits the high-resolution information, and demonstrated that WaveQTL has more power than a simpler window-based method. Motivated by this, we developed another multi-scale method, multiseq, that models the count nature of the sequencing data directly, which may allow it to perform well with small sample sizes or low read counts. In this talk, I will first present key ideas behind multi-scale approaches for analyses of high-throughput sequencing data. Then, I will introduce the second method, multiseq, and demonstrate that multiseq has better power than negative-binomial-based window methods. Finally, I will discuss how multi-scale models can be used in applications to other biological questions.
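To illustrate why aggregating over a fixed window can lose information that a multi-scale view retains, here is a minimal sketch using an unnormalised Haar-style decomposition. It is not the WaveQTL or multiseq implementation; the function name `haar_pyramid` and the read counts are hypothetical and chosen only for illustration.

```python
import numpy as np

def haar_pyramid(counts):
    """Decompose a length-2^J count vector into a total and per-scale differences.

    Returns (total, levels), where levels[0] holds the finest-scale
    differences between adjacent positions and levels[-1] the coarsest.
    """
    x = np.asarray(counts, dtype=float)
    n = x.size
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    levels = []
    while x.size > 1:
        left, right = x[0::2], x[1::2]
        levels.append(left - right)  # unnormalised Haar detail coefficients
        x = left + right             # aggregate to the next coarser scale
    return x[0], levels

# Hypothetical read counts over 8 positions for two groups (assumed data).
group_a = [4, 5, 3, 4, 4, 5, 3, 4]
group_b = [8, 1, 3, 4, 4, 5, 3, 4]

total_a, levels_a = haar_pyramid(group_a)
total_b, levels_b = haar_pyramid(group_b)

# The window totals are identical, so a single-window test sees no difference.
print("window totals:", total_a, total_b)
# The finest-scale coefficients differ, exposing the local change in shape.
print("finest-scale differences:", levels_a[0], levels_b[0])
```

In this toy case the two profiles have the same window total, so only the fine-scale coefficients distinguish them. A real analysis would additionally need a statistical model for the coefficients (or for the counts themselves), which this sketch does not attempt.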