Development of data fusion techniques for the integration of multi-domain genomic data from uveal melanoma
Max Pfeffer, André Uschmajew, Adriana Amaro, and Ulrich Pfeffer
Submission date: 20. Jun. 2019 (revised version: June 2019)
Uveal melanoma (UM) is a rare cancer that is well characterized at the molecular level. Two to four classes have been identified by analyses of gene expression (mRNA, ncRNA), DNA copy number, DNA methylation and somatic mutations, yet no actual integration of these data has been reported. We therefore applied new data fusion algorithms, joint Singular Value Decomposition (jSVD) and joint Constrained Matrix Factorization (jCMF), as well as similarity network fusion (SNF), to integrate the gene expression, methylation and copy number data of the Cancer Genome Atlas (TCGA) UM dataset. The variant features that most strongly affect the definition of the classes were extracted for biological interpretation. Data fusion identifies the two to four classes previously described. Not all of these classes are evident at every level, indicating that integrative analyses add to genomic discrimination power. The classes are also characterized by different frequencies of somatic mutations in putative driver genes (GNAQ, GNA11, SF3B1, BAP1). Innovative data fusion techniques confirm, as expected, the existence of two main types of uveal melanoma, characterized mainly by copy number alterations. Subtypes were also confirmed but are somewhat less well defined. Data fusion thus allows for real integration of multi-domain genomic data.
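To give a feel for what integrating several data domains over the same samples can look like, here is a minimal Python sketch. It is not the jSVD or jCMF algorithm of the preprint, whose details the abstract does not give. It uses a simpler stand-in: each domain matrix (features by samples) is centered and scaled to equal weight, the matrices are stacked, and an ordinary SVD yields a shared sample embedding. The function names `stacked_svd_embedding` and `demo`, the matrix sizes, and the synthetic two-class data are all assumptions made for illustration; no TCGA data is involved.

```python
import numpy as np


def stacked_svd_embedding(blocks, rank):
    """Shared sample embedding from several (features x samples) blocks.

    Simplified stand-in for a joint decomposition: NOT the preprint's jSVD.
    """
    scaled = []
    for X in blocks:
        X = X - X.mean(axis=1, keepdims=True)       # center each feature over samples
        norm = np.linalg.norm(X)                    # Frobenius norm of the block
        scaled.append(X / norm if norm > 0 else X)  # give each domain equal weight
    stacked = np.vstack(scaled)                     # (all features) x samples
    _, s, Vt = np.linalg.svd(stacked, full_matrices=False)
    return Vt[:rank].T, s[:rank]                    # (samples x rank), singular values


def demo(seed=0):
    """Synthetic example: three domains sharing one two-class sample structure."""
    rng = np.random.default_rng(seed)
    labels = np.repeat([0, 1], 20)                  # 40 hypothetical samples
    signal = np.where(labels == 0, -1.0, 1.0)
    blocks = []
    for n_features in (50, 30, 40):                 # hypothetical domain sizes
        loadings = rng.normal(size=(n_features, 1))
        noise = 0.3 * rng.normal(size=(n_features, labels.size))
        blocks.append(loadings @ signal[None, :] + noise)
    embedding, _ = stacked_svd_embedding(blocks, rank=2)
    predicted = (embedding[:, 0] > 0).astype(int)   # split on the first component
    # The sign of a singular vector is arbitrary, so accept either labeling.
    agreement = max(np.mean(predicted == labels), np.mean(predicted != labels))
    return embedding, agreement


if __name__ == "__main__":
    emb, agreement = demo()
    print(emb.shape, agreement)
```

Scaling every block to unit Frobenius norm keeps a domain with many features from dominating the shared components; this is one simple weighting choice among several, and other normalizations are equally defensible.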