%0 Journal Article
%9 ACL: Articles in peer-reviewed journals listed by AERES
%A Milet, Jacqueline
%A Courtin, David
%A Garcia, André
%A Perdry, H.
%T Mixed logistic regression in genome-wide association studies
%D 2020
%L fdi:010080443
%G ENG
%J BMC Bioinformatics
%@ 1471-2105
%K GWAS ; Mixed-models ; Logistic regression
%K WEST AFRICA
%M ISI:000595711300001
%N 1
%P 536 [17 ]
%U https://www.documentation.ird.fr/hor/fdi:010080443
%> https://horizon.documentation.ird.fr/exl-doc/pleins_textes/divers20-12/010080443.pdf
%V 21
%W Horizon (IRD)
%X Background: Mixed linear models (MLM) have been widely used to account for population structure in case-control genome-wide association studies, the status being analyzed as a quantitative phenotype. Chen et al. proved in 2016 that this method is inappropriate in some situations and proposed GMMAT, a score test for the mixed logistic regression (MLR). However, this test does not produce an estimate of the variants' effects. We propose two computationally efficient methods to estimate the variants' effects. Their properties and those of other methods (MLM, logistic regression) are evaluated using both simulated and real genomic data from a recent GWAS in two geographically close populations in West Africa. Results: We show that, when the disease prevalence differs between population strata, MLM is inappropriate for analyzing binary traits. MLR performs best in all circumstances. The variants' effects are well evaluated by our methods, with a moderate bias when the effect sizes are large. Additionally, we propose a stratified QQ-plot, which enhances the diagnosis of p-value inflation or deflation when population strata are not clearly identified in the sample. Conclusion: The two proposed methods are implemented in the R package milorGWAS, available on CRAN. Both methods scale up to at least 10,000 individuals. The same computational strategies could be applied to other models (e.g. mixed Cox model for survival analysis).
%$ 020