Abstract
Next-generation sequencing technology generates ultra-high-dimensional data. However, estimating an entire directed acyclic graph (DAG) at such dimensionality is computationally impractical. In this paper, we discuss two problems in estimating subnetworks from ultra-high-dimensional data. The first is to estimate the DAG of a subnetwork adjacent to a target gene; the second is to estimate the DAGs of multiple subnetworks when no information about a target gene is available. To address these problems, we propose efficient methods that estimate subnetworks either by using layer-dependent weights with the BIC or by using community detection approaches to identify clusters as subnetworks. As a practical example, we apply these approaches to TCGA breast cancer gene expression data.
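The abstract does not spell out how the layer-dependent weights are defined, so the following is only a minimal sketch of one possible reading, not the paper's method. It assumes that a "layer" is the hop distance of a gene from the target gene in an undirected association graph, and that the BIC penalty for each gene is scaled by a weight that grows with its layer. The function names (`bfs_layers`, `layer_weight`, `weighted_bic`), the linear weight form, the `rate` parameter, and the toy gene labels are all hypothetical.

```python
from collections import deque
import math


def bfs_layers(adjacency, target):
    """Map each gene reachable from `target` to its hop distance (layer)."""
    layers = {target: 0}
    queue = deque([target])
    while queue:
        gene = queue.popleft()
        for neighbor in adjacency.get(gene, ()):
            if neighbor not in layers:
                layers[neighbor] = layers[gene] + 1
                queue.append(neighbor)
    return layers


def layer_weight(layer, rate=0.5):
    """Hypothetical weight: the penalty grows linearly with the layer."""
    return 1.0 + rate * layer


def weighted_bic(neg2_loglik, n_params_by_gene, layers, n_samples):
    """BIC-style score: -2 log L + sum_g w(layer_g) * k_g * log(n).

    With all weights equal to 1 this reduces to the ordinary BIC.
    """
    penalty = sum(
        layer_weight(layers[gene]) * k * math.log(n_samples)
        for gene, k in n_params_by_gene.items()
    )
    return neg2_loglik + penalty


# Toy undirected association graph (placeholder labels, not real data).
adjacency = {
    "G0": ["G1", "G2"],
    "G1": ["G0", "G3"],
    "G2": ["G0"],
    "G3": ["G1"],
    "G4": ["G5"],  # not connected to the target, so never assigned a layer
    "G5": ["G4"],
}

layers = bfs_layers(adjacency, "G0")

# Score a candidate DAG in which G1 has one parent and G3 has two.
score = weighted_bic(100.0, {"G1": 1, "G3": 2}, layers, n_samples=50)
```

Under these assumptions, genes that cannot be reached from the target never receive a layer, so the search is confined to the subnetwork around the target, and edges far from the target cost more to add than edges close to it.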
Original language | English |
---|---|
Pages (from-to) | 657-676 |
Number of pages | 20 |
Journal | Statistics and its Interface |
Volume | 10 |
Issue number | 4 |
Publication status | Published - 2017 |
Keywords
- Bayesian network
- Directed acyclic graph
- High dimension
- Penalized likelihood
- Subnetworks
ASJC Scopus subject areas
- Statistics and Probability
- Applied Mathematics