Joint skeleton estimation of multiple directed acyclic graphs for heterogeneous population.

Title: Joint skeleton estimation of multiple directed acyclic graphs for heterogeneous population.
Publication Type: Journal Article
Year of Publication: 2019
Authors: Liu, Jianyu, Wei Sun, and Yufeng Liu
Journal: Biometrics
Volume: 75
Issue: 1
Pagination: 36-47
Date Published: 2019 Mar
ISSN: 1541-0420
Keywords: Breast Neoplasms; Computer Graphics; Computer Simulation; Epidemiologic Research Design; Female; Gene Expression; Genes, Neoplasm; Humans; Models, Statistical
Abstract

The directed acyclic graph (DAG) is a powerful tool to model the interactions of high-dimensional variables. While estimating edge directions in a DAG often requires interventional data, one can estimate the skeleton of a DAG (i.e., an undirected graph formed by removing the direction of each edge in a DAG) using observational data. In real data analyses, the samples of the high-dimensional variables may be collected from a mixture of multiple populations. Each population has its own DAG while the DAGs across populations may have significant overlap. In this article, we propose a two-step approach to jointly estimate the DAG skeletons of multiple populations while the population origin of each sample may or may not be labeled. In particular, our method allows a probabilistic soft label for each sample, which can be easily computed and often leads to more accurate skeleton estimation than hard labels. Compared with separate estimation of skeletons for each population, our method is more accurate and robust to labeling errors. We study the estimation consistency for our method, and demonstrate its performance using simulation studies in different settings. Finally, we apply our method to analyze gene expression data from breast cancer patients of multiple cancer subtypes.
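The abstract defines the skeleton of a DAG as the undirected graph obtained by removing the direction of each edge. The following is a minimal sketch of that definition only, not the authors' two-step estimation procedure. The function name `skeleton` and the two toy edge lists are hypothetical, chosen to show how skeletons from two populations can overlap even where edge directions differ.

```python
def skeleton(dag_edges):
    """Drop edge directions: map each directed edge (u, v) to the unordered pair {u, v}."""
    return {frozenset(edge) for edge in dag_edges}


# Hypothetical toy DAGs on nodes 0..3, one per population (illustrative only).
dag_pop1 = [(0, 1), (0, 2), (1, 3), (2, 3)]
dag_pop2 = [(0, 1), (3, 1), (3, 2)]  # edges between 1-3 and 2-3 point the other way

skel1 = skeleton(dag_pop1)
skel2 = skeleton(dag_pop2)

# Undirected edges present in both populations' skeletons.
shared = skel1 & skel2
print(sorted(tuple(sorted(e)) for e in shared))  # [(0, 1), (1, 3), (2, 3)]
```

In this toy case the two DAGs disagree on the direction of the edges between nodes 1 and 3 and between nodes 2 and 3, yet the skeletons share all three of the second population's edges, which is the kind of overlap a joint estimator can exploit.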

DOI: 10.1111/biom.12941
Alternate Journal: Biometrics
Original Publication: Joint skeleton estimation of multiple directed acyclic graphs for heterogeneous population.
PubMed ID: 30081434
PubMed Central ID: PMC6546091
Grant List:
P01 CA142538 / CA / NCI NIH HHS / United States
R01 CA189532 / CA / NCI NIH HHS / United States
R01 GM105785 / GM / NIGMS NIH HHS / United States
R01 GM126550 / GM / NIGMS NIH HHS / United States