PROJECT CO-FUNDED BY IVADO
For this project, we implemented several machine learning and deep learning methods to analyze data from a variety of omics technologies, including genomics, epigenomics, transcriptomics, proteomics and metabolomics. We explored several representations of genetic data based on the encoding of whole sequences (RecDL), genetic variants (Diet Network), and gene ontology (DeepSimDef). We used machine learning in metabolomics studies of heart disease to investigate the impact of myocardial infarction on patients’ metabolome. Unsupervised learning methods revealed a fatty acid signature that differed depending on the drug. Using XGBoost, we also identified lignoceric acid, a metabolite potentially important in heart failure that is currently undergoing biological validation. Finally, we evaluated the generalizability of our approaches, including the Diet Network approach to predicting ethnicity from genomic data. We demonstrated that this approach could make accurate predictions on independent datasets with different sets of genetic markers and different levels of missing data, a problem that is ubiquitous in omics data. Our work also highlighted the importance of biological interpretation in prediction, which will be the focus of our future work.
Lead Genome Centre: Génome Québec
Partner: IVADO
Co-investigators:
Simon Gravel, Université McGill
Yoshua Bengio, Université de Montréal