Power of inclusion: Enhancing polygenic prediction with admixed individuals
Yosuke Tanigawa
Sharut Gupta
Zoom Link: https://mit.zoom.us/j/94204370795?pwd=eFZwYXVuWmVsQzE1UTRZN2VtY0lkUT09 with passcode 387975
Abstract: Predicting heritable traits and genetic liability of disease from individuals’ genomes has important implications for tailoring medical prevention and intervention strategies in precision medicine. Polygenic score (PGS), a statistical approach, has recently attracted substantial attention due to its potential relevance in clinical practice. Admixed individuals offer unique opportunities for addressing limited transferability in PGSs. However, they are rarely considered in PGS training, given the challenges in representing ancestry-matched linkage-disequilibrium reference panels for admixed individuals. Here we present inclusive PGS (iPGS), which captures ancestry-shared genetic effects by finding the exact solution for penalized regression on individual-level data and is thus naturally applicable to admixed individuals. We validate our approach in a simulation study across 33 configurations with varying heritability, polygenicity, and ancestry composition in the training set. When iPGS is applied to n = 237,055 ancestry-diverse individuals in the UK Biobank, it shows the greatest improvements in Africans by 48.9% on average across 60 quantitative traits and up to 50-fold improvements for some traits (neutrophil count, R² = 0.058) over the baseline model trained on the same number of European individuals. When we allowed iPGS to use n = 284,661 individuals, we observed an average improvement of 60.8% for African, 11.6% for South Asian, 7.3% for non-British White, 4.8% for White British, and 17.8% for the other individuals. We further developed iPGS+refit to jointly model the ancestry-shared and -dependent genetic effects when heterogeneous genetic associations were present. For neutrophil count, for example, iPGS+refit showed the highest predictive performance in the African group (R² = 0.115), which exceeds the best predictive performance for the White British group (R² = 0.090 in the iPGS model), even though only 1.49% of individuals used in the iPGS training are of African ancestry. Our results indicate the power of including diverse individuals in developing more equitable PGS models.
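To make the phrase "penalized regression on individual-level data" concrete, here is a minimal sketch, not the speaker's implementation. It assumes a lasso (L1) penalty solved by cyclic coordinate descent, and it simulates a toy cohort in which each individual's genotypes are drawn from a mixture of two hypothetical source populations. All function names, sample sizes, allele frequencies, and the penalty value are illustrative assumptions. The point it illustrates is that the fit needs only the genotype matrix and the phenotype, so no ancestry-matched linkage-disequilibrium reference panel is involved.

```python
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def center(values):
    """Subtract the mean from a list of numbers."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def simulate_admixed(n, p, seed=0):
    """Toy cohort (assumption): genotypes in {0, 1, 2} drawn from
    individual-specific mixtures of two source-population allele frequencies,
    with three causal variants whose effects are shared by everyone."""
    rng = random.Random(seed)
    freq_a = [rng.uniform(0.2, 0.8) for _ in range(p)]
    freq_b = [rng.uniform(0.2, 0.8) for _ in range(p)]
    true_beta = [1.0 if j < 3 else 0.0 for j in range(p)]
    X, y = [], []
    for _ in range(n):
        w = rng.random()  # admixture proportion of this individual
        row = []
        for j in range(p):
            f = w * freq_a[j] + (1.0 - w) * freq_b[j]
            row.append(float((rng.random() < f) + (rng.random() < f)))
        X.append(row)
        y.append(sum(b * g for b, g in zip(true_beta, row)) + rng.gauss(0.0, 1.0))
    # Center each variant and the phenotype so that no intercept is needed.
    cols = [center([X[i][j] for i in range(n)]) for j in range(p)]
    X = [[cols[j][i] for j in range(p)] for i in range(n)]
    return X, center(y), true_beta


def objective(X, y, beta, lam):
    """Lasso objective: (1 / 2n) * ||y - X beta||^2 + lam * ||beta||_1."""
    n = len(X)
    rss = sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2 for i in range(n))
    return rss / (2.0 * n) + lam * sum(abs(b) for b in beta)


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise the lasso objective by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # equals y - X beta, since beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # monomorphic variant carries no information
            # Covariance of variant j with the residual that excludes variant j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_b = soft_threshold(rho, lam) / col_sq[j]
            delta = new_b - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new_b
    return beta


X, y, true_beta = simulate_admixed(n=300, p=20, seed=0)
beta_hat = lasso_coordinate_descent(X, y, lam=0.05)
print("non-zero coefficients:", [j for j, b in enumerate(beta_hat) if b != 0.0])
```

In practice the penalty strength would be chosen on a held-out validation set, and biobank-scale data would call for a dedicated solver rather than pure-Python loops.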
Bio: Yosuke Tanigawa, PhD, is a research scientist at MIT’s Computer Science and Artificial Intelligence Lab. To incorporate interindividual differences in disease prevention and treatment, he develops computational and statistical methods, focusing on predictive modeling with high-dimensional human genetics data, multi-omic dissection of disease heterogeneity, and therapeutic target discovery. His recent work focuses on inclusive training strategies for genetic prediction algorithms and on dissecting the molecular, cellular, and genetic basis of phenotypic heterogeneity in Alzheimer’s disease. He has received many awards, including the Charles J. Epstein Trainee Awards for Excellence in Human Genetics Research and MIT Technology Review’s Innovators Under 35 Japan.