Case study


Comparative analysis of supervised integrative methods for multi-omics data


Recent advances in sequencing, mass spectrometry and cytometry technologies have enabled researchers to collect multiple omics datasets from a single sample. These large datasets have led to a growing consensus that a holistic approach is needed to identify new candidate biomarkers and unveil the mechanisms underlying disease aetiology, both of which are key to precision medicine. In collaboration with a US partner, this project aimed to survey and benchmark supervised integrative approaches.


Because the field is relatively new, numerous challenges remain in multi-omics analysis, including:

  • High dimensionality, which significantly impacts inference;
  • Data heterogeneity, which is likely to weaken the biological signal because biases and systematic errors differ across platforms;
  • Interpretation, where the sheer amount of information makes meaningful conclusions difficult to draw.


BIOASTER reviewed and selected cutting-edge machine learning methods representative of the main families of integrative approaches (matrix factorization, multiple kernel methods, ensemble learning and graph-based approaches). The methods were subsequently evaluated on both simulated and real-world datasets; the latter were carefully selected to cover various medical applications (infectious diseases, oncology and vaccines).
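To give a concrete feel for one of these families, the sketch below illustrates the simplest form of multiple kernel integration: one similarity (kernel) matrix is computed per omics block, and the matrices are averaged with fixed weights into a single sample-by-sample matrix that a kernel classifier could then use. It is a minimal illustration under stated assumptions, not the methods benchmarked in this project: the toy values, the block names and the helper functions are all hypothetical, and published multiple kernel methods typically learn the weights rather than fixing them.

```python
import math

def linear_kernel(X):
    """Gram matrix of inner products between samples (rows of X)."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]

def normalize_kernel(K):
    """Cosine-normalize a kernel so that every block has a unit diagonal.

    Assumes no sample has an all-zero feature vector.
    """
    n = len(K)
    d = [math.sqrt(K[i][i]) for i in range(n)]
    return [[K[i][j] / (d[i] * d[j]) for j in range(n)] for i in range(n)]

def combine_kernels(kernels, weights=None):
    """Weighted sum of kernels; uniform weights if none are given."""
    n = len(kernels[0])
    if weights is None:
        weights = [1.0 / len(kernels)] * len(kernels)
    return [
        [sum(w * K[i][j] for w, K in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]

# Hypothetical toy data: 4 samples measured on two omics blocks.
# Samples 0-1 belong to one group, samples 2-3 to another.
transcriptomics = [
    [1.0, 0.0, 2.0],
    [0.9, 0.1, 2.1],
    [0.0, 1.0, 0.0],
    [0.1, 1.2, 0.1],
]
proteomics = [
    [3.0, 1.0],
    [2.8, 1.1],
    [1.0, 3.0],
    [0.9, 3.2],
]

# One normalized kernel per block, then a single integrated kernel.
kernels = [normalize_kernel(linear_kernel(X)) for X in (transcriptomics, proteomics)]
K = combine_kernels(kernels)

for row in K:
    print(" ".join(f"{v:.2f}" for v in row))
```

Normalizing each kernel before combining keeps a block with many features or large values from dominating the integrated similarity, which is one simple way of handling the heterogeneity across platforms mentioned above.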


Integrative approaches performed as well as or better than non-integrative methods on simulations and outperformed them on real-world data. More specifically, multiple kernel and matrix factorization methods demonstrated a strong ability to uncover modest effects in high-dimensional settings.


The expertise acquired in this project will help BIOASTER and its partners both refine biomarker discovery and decipher the molecular functions underlying new modes of action.

Additional information

Want to know more about this case study? Contact us!