Multimodal data fusion using sparse canonical correlation analysis and cooperative learning: a COVID-19 cohort study. 2024

Ahmet Gorkem Er, Daisy Yi Ding, Berrin Er, Mertcan Uzun, Mehmet Cakmak, Christoph Sadee, Gamze Durhan, Mustafa Nasuh Ozmen, Mine Durusu Tanriover, Arzu Topeli, Yesim Aydin Son, Robert Tibshirani, Serhat Unal, and Olivier Gevaert
Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University, Stanford, CA, 94305, USA. ahmetgorkemer@gmail.com.

Technological innovations now allow patient cohorts to be examined from multiple views using high-dimensional, multiscale biomedical data to classify clinical phenotypes and predict outcomes. Here, we present our approach for analyzing multimodal data using unsupervised and supervised sparse linear methods in a COVID-19 patient cohort. This prospective cohort study of 149 adult patients was conducted in a tertiary care academic center. First, we used sparse canonical correlation analysis (CCA) to identify and quantify relationships across different data modalities, including viral genome sequencing, imaging, clinical data, and laboratory results. Then, we used cooperative learning to predict a clinical outcome of COVID-19 patients: intensive care unit (ICU) admission. We show that serum biomarkers representing severe disease and acute phase response correlate with original and wavelet radiomics features in the LLL frequency channel (cor(Xu1, Zv1) = 0.596, p < 0.001). Among radiomics features, histogram-based first-order features reporting skewness, kurtosis, and uniformity have the most negative coefficients, whereas entropy-related features have the largest positive coefficients. Moreover, unsupervised analysis of clinical data and laboratory results gives insights into distinct clinical phenotypes. Leveraging the availability of global viral genome databases, we demonstrate that the Word2Vec natural language processing model can be used for viral genome encoding. It not only separates major SARS-CoV-2 variants but also preserves the phylogenetic relationships among them. Our quadruple model using Word2Vec encoding achieves better prediction results in the supervised task. The model yields area under the curve (AUC) and accuracy values of 0.87 and 0.77, respectively.
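To make the quantity cor(Xu1, Zv1) concrete, here is a minimal sketch of a rank-1 sparse CCA on synthetic data. It is not the study's pipeline. It alternates soft-thresholded updates of the two weight vectors, in the spirit of penalized matrix decomposition, but simplifies the sparsity control: instead of searching for a threshold that meets an L1 bound, it uses a fixed fraction of the largest entry. The function names, the parameters `frac_u` and `frac_v`, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(a, delta):
    """Shrink each entry of a toward zero by delta; entries below delta become 0."""
    return np.sign(a) * np.maximum(np.abs(a) - delta, 0.0)

def sparse_cca_rank1(X, Z, frac_u=0.5, frac_v=0.5, n_iter=100):
    """Return sparse weights u, v and cor(Xu, Zv) for the first canonical pair.

    frac_u and frac_v (each in [0, 1)) are hypothetical tuning knobs: the
    soft threshold is that fraction of the largest absolute entry.
    """
    # Standardize columns so X.T @ Z is proportional to the cross-correlation matrix.
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)
    K = X.T @ Z
    # Start from the leading right singular vector of K.
    v = np.linalg.svd(K, full_matrices=False)[2][0]
    for _ in range(n_iter):
        a = K @ v
        u = soft_threshold(a, frac_u * np.abs(a).max())
        u /= np.linalg.norm(u)
        b = K.T @ u
        v = soft_threshold(b, frac_v * np.abs(b).max())
        v /= np.linalg.norm(v)
    return u, v, float(np.corrcoef(X @ u, Z @ v)[0, 1])

# Synthetic example: one shared latent factor drives 3 columns of X and 2 of Z.
rng = np.random.default_rng(0)
n = 200
latent = rng.normal(size=n)
X = rng.normal(size=(n, 30))
Z = rng.normal(size=(n, 20))
X[:, :3] += 2.0 * latent[:, None]
Z[:, :2] += 2.0 * latent[:, None]

u1, v1, cor_uv = sparse_cca_rank1(X, Z)
print("nonzero X weights:", np.flatnonzero(u1))
print("nonzero Z weights:", np.flatnonzero(v1))
print("cor(Xu1, Zv1):", round(cor_uv, 3))
```

In this toy setting, the nonzero entries of `u1` and `v1` play the role of the selected features, and their signs and magnitudes correspond to the coefficients discussed in the abstract.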
Our study illustrates that sparse CCA and cooperative learning are powerful techniques for handling high-dimensional, multimodal data to investigate multivariate associations in unsupervised and supervised tasks.
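The idea behind cooperative learning can be sketched for two data views as follows. This is a simplified illustration under stated assumptions, not the model used in the study: it uses a squared-error loss on a continuous outcome rather than a binary ICU label, two views rather than four, and a ridge penalty in place of a sparsity-inducing one so that the solution has a closed form. The objective minimized is ||y - Xθx - Zθz||² + ρ||Xθx - Zθz||² + λ(||θx||² + ||θz||²), where the ρ term rewards agreement between the two views' predictions. The function name `cooperative_ridge` and the simulated data are hypothetical.

```python
import numpy as np

def cooperative_ridge(X, Z, y, rho=0.5, lam=1.0):
    """Two-view cooperative regression with squared-error loss and a ridge penalty.

    The agreement term rho * ||X tx - Z tz||^2 is folded into an ordinary
    least-squares problem by stacking extra rows under the data.
    """
    n, px = X.shape
    s = np.sqrt(rho)
    # Top rows fit y with both views; bottom rows push X tx and Z tz together.
    A = np.vstack([np.hstack([X, Z]), np.hstack([-s * X, s * Z])])
    b = np.concatenate([y, np.zeros(n)])
    theta = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ b)
    return theta[:px], theta[px:]

# Synthetic example: both views are noisy measurements of one latent signal.
rng = np.random.default_rng(0)
n = 100
latent = rng.normal(size=n)
X = latent[:, None] + rng.normal(size=(n, 5))
Z = latent[:, None] + rng.normal(size=(n, 4))
y = latent + 0.1 * rng.normal(size=n)

def disagreement(rho):
    """Squared distance between the two views' fitted contributions."""
    tx, tz = cooperative_ridge(X, Z, y, rho=rho)
    return float(np.sum((X @ tx - Z @ tz) ** 2))

for rho in (0.0, 1.0, 10.0):
    print(f"rho={rho:5.1f}  disagreement={disagreement(rho):.3f}")
```

With ρ = 0 the fit reduces to ridge regression on the concatenated features (early fusion); increasing ρ pulls the per-view predictions toward each other, which is the agreement mechanism the abstract relies on when combining modalities.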
