Rencontres R 2016 - Sciencesconf.org

sciencesconf.org:r2016-toulouse:106370

Handling Missing Rows in Multi-Omics Data Integration: Multiple Imputation in Multiple Factor Analysis Framework

Valentin Voillet 1, Philippe Besse 2, Laurence Liaubet 1, Magali Sancristobal 1, Ignacio Gonzalez 3

1 : Génétique, Physiologie et Systèmes d'Élevage

Institut national de la recherche agronomique (INRA) : UMR1388

2 : Institut de Mathématiques

Université de Toulouse Paul Sabatier

3 : Mathématiques et Informatiques Appliquées

Institut national de la recherche agronomique (INRA) : UMR875

In omics data integration studies, some individuals are often absent from one or more data tables, for a variety of reasons. Such missing rows are challenging to handle because most statistical methods cannot be applied directly to incomplete datasets. To overcome this issue, we propose a multiple imputation (MI) approach in a multiple factor analysis (MFA) framework. MI fills the missing rows with plausible values, resulting in m completed datasets. MFA is then applied to each completed dataset, leading to m different component configurations. Finally, the m configurations are combined to yield one consensus solution. We showed that MI-MFA configurations were closer to the true configuration (obtained from the original data) even when a significant number of individuals were missing, thus providing improved results.
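The three steps of the abstract (impute, run MFA on each completed dataset, combine) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the hot-deck imputation (copying a randomly drawn observed row), the simplified MFA (standardize each table, divide it by its first singular value, then run a PCA on the concatenation), and the consensus step (Procrustes rotation onto the first configuration followed by averaging) are all assumptions chosen for brevity, and the function names `mfa_scores`, `impute_missing_rows`, `procrustes_rotate`, and `mi_mfa` are hypothetical.

```python
import numpy as np


def mfa_scores(tables, n_components=2):
    """Simplified MFA: standardize each table, weight it by the inverse of
    its first singular value, then run a PCA (via SVD) on the concatenation."""
    blocks = []
    for X in tables:
        Z = X - X.mean(axis=0)
        sd = Z.std(axis=0)
        Z = Z / np.where(sd > 0, sd, 1.0)  # guard against constant columns
        first_sv = np.linalg.svd(Z, compute_uv=False)[0]
        blocks.append(Z / first_sv)
    G = np.hstack(blocks)
    U, s, _ = np.linalg.svd(G, full_matrices=False)
    return U[:, :n_components] * s[:n_components]  # individual coordinates


def impute_missing_rows(X, missing, rng):
    """Assumed imputation model (hot-deck): each missing row is replaced by
    a randomly drawn observed row of the same table."""
    X = X.copy()
    observed = np.setdiff1d(np.arange(X.shape[0]), missing)
    X[missing] = X[rng.choice(observed, size=len(missing))]
    return X


def procrustes_rotate(F, target):
    """Orthogonal matrix R minimizing ||F R - target||_F, applied to F."""
    U, _, Vt = np.linalg.svd(F.T @ target)
    return F @ (U @ Vt)


def mi_mfa(tables, missing_rows, m=20, n_components=2, seed=0):
    """Return the consensus configuration and the m aligned configurations.

    tables       : list of (n x p_k) arrays sharing the same n individuals
    missing_rows : list of index lists, one per table (empty if complete)
    """
    rng = np.random.default_rng(seed)
    configs = []
    for _ in range(m):
        completed = [
            impute_missing_rows(X, miss, rng) if len(miss) else X
            for X, miss in zip(tables, missing_rows)
        ]
        configs.append(mfa_scores(completed, n_components))
    # Assumed consensus rule: align every configuration to the first one,
    # then average the aligned coordinates.
    reference = configs[0]
    aligned = [procrustes_rotate(F, reference) for F in configs]
    return np.mean(aligned, axis=0), aligned
```

For two tables `A` and `B` describing the same individuals, with rows 2, 7 and 11 missing from `B`, a call such as `mi_mfa([A, B], [[], [2, 7, 11]], m=20)` would return one consensus configuration together with the 20 aligned configurations; the spread of the latter around the consensus gives a rough picture of the uncertainty due to imputation.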

Type : poster
Topics : Data analysis