Title | SAFE-clustering: Single-cell Aggregated (from Ensemble) clustering for single-cell RNA-seq data. |
Publication Type | Journal Article |
Year of Publication | 2019 |
Authors | Yang, Yuchen, Ruth Huh, Houston W. Culpepper, Yuan Lin, Michael I. Love, and Yun Li |
Journal | Bioinformatics |
Volume | 35 |
Issue | 8 |
Pagination | 1269-1277 |
Date Published | 2019 Apr 15 |
ISSN | 1367-4811 |
Keywords | Algorithms; Cluster Analysis; Gene Expression Profiling; RNA-Seq; Sequence Analysis, RNA; Single-Cell Analysis |
Abstract | MOTIVATION: Accurately clustering cell types from a mass of heterogeneous cells is a crucial first step for the analysis of single-cell RNA-seq (scRNA-Seq) data. Although several methods have been recently developed, they utilize different characteristics of data and yield varying results in terms of both the number of clusters and actual cluster assignments. RESULTS: Here, we present SAFE-clustering, single-cell aggregated (From Ensemble) clustering, a flexible, accurate and robust method for clustering scRNA-Seq data. SAFE-clustering takes as input, results from multiple clustering methods, to build one consensus solution. SAFE-clustering currently embeds four state-of-the-art methods, SC3, CIDR, Seurat and t-SNE + k-means; and ensembles solutions from these four methods using three hypergraph-based partitioning algorithms. Extensive assessment across 12 datasets with the number of clusters ranging from 3 to 14, and the number of single cells ranging from 49 to 32 695 showcases the advantages of SAFE-clustering in terms of both cluster number (18.2-58.1% reduction in absolute deviation to the truth) and cluster assignment (on average 36.0% improvement, and up to 18.5% over the best of the four methods, measured by adjusted Rand index). Moreover, SAFE-clustering is computationally efficient to accommodate large datasets, taking <10 min to process 28 733 cells. AVAILABILITY AND IMPLEMENTATION: SAFEclustering, including source codes and tutorial, is freely available at https://github.com/yycunc/SAFEclustering. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
DOI | 10.1093/bioinformatics/bty793 |
Alternate Journal | Bioinformatics |
Original Publication | SAFE-clustering: Single-cell Aggregated (From Ensemble) clustering for single-cell RNA-seq data. |
PubMed ID | 30202935 |
PubMed Central ID | PMC6477982 |
Grant List | P01 CA142538 / CA / NCI NIH HHS / United States; R01 HG006292 / HG / NHGRI NIH HHS / United States; R01 HL129132 / HL / NHLBI NIH HHS / United States |
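
The abstract describes combining label vectors from several clustering methods into one consensus solution and scoring the result with the adjusted Rand index. The sketch below illustrates that general idea only; it is not the SAFEclustering implementation. It builds the binary cell-by-hyperedge matrix (one hyperedge per cluster per method), then, as an assumed simple stand-in for the paper's three hypergraph partitioning algorithms, links cells that co-cluster in more than half of the methods and takes connected components. All function names, the 0.5 threshold, and the toy labels are hypothetical.

```python
from collections import Counter
from math import comb


def hypergraph_incidence(partitions):
    """Binary cell-by-hyperedge matrix: one column per cluster of each method."""
    columns = []
    for labels in partitions:
        for cluster in sorted(set(labels)):
            columns.append([1 if lab == cluster else 0 for lab in labels])
    n_cells = len(partitions[0])
    return [[col[i] for col in columns] for i in range(n_cells)]


def co_association(partitions):
    """Fraction of methods that place each pair of cells in the same cluster."""
    n_cells = len(partitions[0])
    n_methods = len(partitions)
    return [
        [sum(p[i] == p[j] for p in partitions) / n_methods for j in range(n_cells)]
        for i in range(n_cells)
    ]


def consensus_labels(similarity, threshold=0.5):
    """Connected components of the graph linking cells with similarity > threshold.

    Assumption: a simple stand-in for hypergraph partitioning, for illustration.
    """
    n_cells = len(similarity)
    labels = [-1] * n_cells
    current = 0
    for start in range(n_cells):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n_cells):
                if labels[j] == -1 and similarity[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def adjusted_rand_index(a, b):
    """Adjusted Rand index between two label vectors of equal length."""
    sum_pairs = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / comb(len(a), 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0
    return (sum_pairs - expected) / (max_index - expected)


# Toy data (hypothetical): six cells, two true types, four methods' label vectors.
truth = [0, 0, 0, 1, 1, 1]
method_labels = [
    [0, 0, 0, 1, 1, 1],  # agrees with the truth
    [1, 1, 1, 0, 0, 0],  # same partition, different label names
    [0, 0, 1, 1, 1, 1],  # one cell misassigned
    [0, 0, 0, 0, 1, 1],  # a different cell misassigned
]

incidence = hypergraph_incidence(method_labels)
consensus = consensus_labels(co_association(method_labels))
print("consensus labels:", consensus)
print("consensus ARI:", adjusted_rand_index(consensus, truth))
print("per-method ARI:", [adjusted_rand_index(m, truth) for m in method_labels])
```

In this toy case the two misassignments fall on different cells, so the majority vote outvotes each of them. Real data can require a proper partitioning step, because thresholded connected components may merge clusters through chains of weakly linked cells.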