Learning discriminative and structural samples for rare cell types with deep generative model. 2022

Haiyue Wang and Xiaoke Ma
School of Computer Science and Technology, Xidian University, Xi'an, 710071, China.

Cell types (subpopulations) serve as biomarkers for the diagnosis and therapy of complex diseases, and single-cell RNA sequencing (scRNA-seq) measures gene expression at the level of individual cells, paving the way for the identification of cell types. Although great efforts have been devoted to this problem, identifying rare cell types in scRNA-seq data remains challenging because of the few-shot problem, the lack of interpretability, and the separation of sample generation from cell clustering. To address these issues, a novel deep generative model that leverages the small samples of cells (termed scLDS2) is proposed. It precisely estimates the distribution of different cells and discriminates rare from non-rare cell types with adversarial learning. Specifically, to enhance the interpretability of samples, scLDS2 generates sparse fake samples of cells with the $\ell_1$-norm, in which the relations among cells are learned, facilitating the identification of cell types. Furthermore, scLDS2 obtains cell types directly from the generated samples by learning the block structure with the nuclear norm, such that cells belonging to the same type are similar to each other. scLDS2 integrates sample generation, classification of generated and true samples, and feature extraction into a unified generative framework, which transforms rare cell type detection into a classification problem solved with joint learning. Experimental results on 20 datasets demonstrate that scLDS2 significantly outperforms 17 state-of-the-art methods on various measures, with a 25.12% average improvement in adjusted Rand index, providing an effective strategy for scRNA-seq data with rare cell types. (The software is implemented in Python and is freely available for academic use at https://github.com/xkmaxidian/scLDS2.)
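
For readers unfamiliar with the adjusted Rand index (ARI) reported above, the following is a minimal sketch of the standard pair-counting definition, written with the Python standard library only. It is not taken from the scLDS2 repository; the function name `adjusted_rand_index` and the toy labels are assumptions made for illustration. The toy case shows why ARI is sensitive to rare cell types: merging a two-cell rare type into the dominant cluster drops the score to zero.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Standard pair-counting ARI (illustrative helper, not the authors' code)."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two cells: agreement is trivial

    # Contingency counts: cells sharing both a true type and a predicted cluster.
    contingency = Counter(zip(labels_true, labels_pred))
    index = sum(comb(c, 2) for c in contingency.values())

    # Pairs grouped together within each partition on its own.
    pairs_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    pairs_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = pairs_true * pairs_pred / total_pairs  # chance-level agreement
    max_index = (pairs_true + pairs_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions are a single cluster
    return (index - expected) / (max_index - expected)


# Hypothetical toy data: 8 cells of a common type and 2 cells of a rare type.
truth = ["common"] * 8 + ["rare"] * 2
perfect = [1] * 8 + [0] * 2   # same partition, different label names
merged = [0] * 10             # rare type absorbed into the common cluster

print(adjusted_rand_index(truth, perfect))
print(adjusted_rand_index(truth, merged))
```

Because ARI is corrected for chance, a clustering that ignores the rare type entirely earns no credit here, even though it places 8 of the 10 cells in the "right" cluster.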

UI | MeSH Term | Description | Entries
D000073359 | Exome Sequencing | Techniques used to determine the sequences of EXONS of an organism or individual. | Complete Exome Sequencing,Complete Transcriptome Sequencing,Whole Exome Sequencing,Whole Transcriptome Sequencing,Complete Exome Sequencings,Exome Sequencing, Complete,Exome Sequencing, Whole,Exome Sequencings, Complete,Sequencing, Complete Exome,Sequencing, Complete Transcriptome,Sequencing, Exome,Sequencing, Whole Exome,Sequencing, Whole Transcriptome,Transcriptome Sequencing, Complete,Transcriptome Sequencing, Whole,Transcriptome Sequencings, Complete
D012313 | RNA | A polynucleotide consisting essentially of chains with a repeating backbone of phosphate and ribose units to which nitrogenous bases are attached. RNA is unique among biological macromolecules in that it can encode genetic information, serve as an abundant structural component of cells, and also possesses catalytic activity. (Rieger et al., Glossary of Genetics: Classical and Molecular, 5th ed) | RNA, Non-Polyadenylated,Ribonucleic Acid,Gene Products, RNA,Non-Polyadenylated RNA,Acid, Ribonucleic,Non Polyadenylated RNA,RNA Gene Products,RNA, Non Polyadenylated
D012984 | Software | Sequential operating programs and data which instruct the functioning of a digital computer. | Computer Programs,Computer Software,Open Source Software,Software Engineering,Software Tools,Computer Applications Software,Computer Programs and Programming,Computer Software Applications,Application, Computer Software,Applications Software, Computer,Applications Softwares, Computer,Applications, Computer Software,Computer Applications Softwares,Computer Program,Computer Software Application,Engineering, Software,Open Source Softwares,Program, Computer,Programs, Computer,Software Application, Computer,Software Applications, Computer,Software Tool,Software, Computer,Software, Computer Applications,Software, Open Source,Softwares, Computer Applications,Softwares, Open Source,Source Software, Open,Source Softwares, Open,Tool, Software,Tools, Software
D016000 | Cluster Analysis | A set of statistical methods used to group variables or observations into strongly inter-related subgroups. In epidemiology, it may be used to analyze a closely grouped series of events or cases of disease or other health-related phenomenon with well-defined distribution patterns in relation to time or place or both. | Clustering,Analyses, Cluster,Analysis, Cluster,Cluster Analyses,Clusterings
D017423 | Sequence Analysis, RNA | A multistage process that includes cloning, physical mapping, subcloning, sequencing, and information analysis of an RNA SEQUENCE. | RNA Sequence Analysis,Sequence Determination, RNA,Analysis, RNA Sequence,Determination, RNA Sequence,Determinations, RNA Sequence,RNA Sequence Determination,RNA Sequence Determinations,RNA Sequencing,Sequence Determinations, RNA,Analyses, RNA Sequence,RNA Sequence Analyses,Sequence Analyses, RNA,Sequencing, RNA
D059010 | Single-Cell Analysis | Assaying the products of or monitoring various biochemical processes and reactions in an individual cell. | Analyses, Single-Cell,Analysis, Single-Cell,Single Cell Analysis,Single-Cell Analyses
D020869 | Gene Expression Profiling | The determination of the pattern of genes expressed at the level of GENETIC TRANSCRIPTION, under specific circumstances or in a specific cell. | Gene Expression Analysis,Gene Expression Pattern Analysis,Transcript Expression Analysis,Transcriptome Profiling,Transcriptomics,mRNA Differential Display,Gene Expression Monitoring,Transcriptome Analysis,Analyses, Gene Expression,Analyses, Transcript Expression,Analyses, Transcriptome,Analysis, Gene Expression,Analysis, Transcript Expression,Analysis, Transcriptome,Differential Display, mRNA,Differential Displays, mRNA,Expression Analyses, Gene,Expression Analysis, Gene,Gene Expression Analyses,Gene Expression Monitorings,Gene Expression Profilings,Monitoring, Gene Expression,Monitorings, Gene Expression,Profiling, Gene Expression,Profiling, Transcriptome,Profilings, Gene Expression,Profilings, Transcriptome,Transcript Expression Analyses,Transcriptome Analyses,Transcriptome Profilings,mRNA Differential Displays
