Enhancing Genetic Research through Multi-Ethnic Cohort Imputation: Recent Advances and Insights
Genetic imputation is a powerful tool in multi-ethnic cohort studies, enabling researchers to infer genotypes at variants that were not directly assayed by leveraging reference panels that represent diverse ancestral backgrounds. This process not only enhances the statistical power of genome-wide association studies (GWAS) but also aids in uncovering genetic associations across ethnic groups, contributing to a more inclusive understanding of genetic predispositions to diseases and traits.
Rapid and Accurate Multi-Phenotype Imputation for Large Cohorts
Recent advances have introduced efficient machine learning-based algorithms, such as PIXANT, that have significantly improved the speed and accuracy of phenotype imputation for large datasets. For instance, a study applying PIXANT to UK Biobank data imputed 425 phenotypes for over 277,000 individuals; GWAS on the imputed phenotypes identified more loci than the pre-imputation analyses and pointed to novel genes associated with complex traits (Gu et al., 2023).
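To make the idea of phenotype imputation concrete, the sketch below fills in missing values of one phenotype from a correlated phenotype using a least-squares line fitted on complete cases. It is a deliberately simple stand-in and not the PIXANT algorithm; the function names and the toy height/weight values are hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b * x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def impute_phenotype(predictor, target):
    """Fill None entries of `target` using a line fitted on complete cases."""
    complete = [(x, y) for x, y in zip(predictor, target) if y is not None]
    a, b = fit_line([x for x, _ in complete], [y for _, y in complete])
    return [a + b * x if y is None else y for x, y in zip(predictor, target)]

# Hypothetical toy data: height (cm) is observed for everyone,
# weight (kg) is missing for two individuals.
height = [150.0, 160.0, 170.0, 180.0, 190.0]
weight = [50.0, None, 70.0, None, 90.0]
imputed = impute_phenotype(height, weight)
print(imputed)
```

Observed values are left untouched; only the missing entries are predicted. Methods built for biobank-scale data draw on many correlated phenotypes at once rather than a single predictor.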
Enhancing Genotype Imputation in Underrepresented Populations
A significant challenge in genetic imputation is the suboptimal accuracy in populations underrepresented in reference panels. One approach to this problem is to sequence a subset of an underrepresented cohort and use the results to select population-specific SNPs, thereby improving imputation accuracy. The method was demonstrated in a Tanzania-based cohort, where adding tailored SNPs to the base H3Africa array data improved imputation (Xu et al., 2021).
The Role of Diverse Reference Panels in Multi-Ancestry Fine-Mapping
The construction of a large HLA reference panel encompassing global population diversity has shown promising results in multi-ancestry cohorts. This diversity enables more accurate imputation and fine-mapping, facilitating the discovery of novel genetic associations, such as in HIV host response studies (Luo et al., 2021).
The Importance of Genetic Ancestry and Self-Identified Race/Ethnicity
Research demonstrates that considering race/ethnicity information can enhance the understanding of population-specific genetic architecture in GWAS. An algorithm producing a surrogate variable for self-identified racial/ethnic information has shown utility in ethnicity-specific GWAS, enabling more accurate genetic analysis (Fang et al., 2019).
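As a rough illustration of how a surrogate group variable can be derived from genetic data, the sketch below assigns individuals with a missing self-reported label to the nearest group centroid in principal-component space. This nearest-centroid rule is a simplification chosen for brevity, not the algorithm of Fang et al. (2019); the function names, the two-component coordinates, and the group labels "A" and "B" are hypothetical.

```python
from collections import defaultdict
from math import dist

def group_centroids(pcs, labels):
    """Mean PC coordinates per self-reported group, ignoring missing labels."""
    buckets = defaultdict(list)
    for point, label in zip(pcs, labels):
        if label is not None:
            buckets[label].append(point)
    return {
        label: tuple(sum(coord) / len(points) for coord in zip(*points))
        for label, points in buckets.items()
    }

def surrogate_labels(pcs, labels):
    """Keep reported labels; assign missing ones to the nearest centroid."""
    centroids = group_centroids(pcs, labels)
    return [
        label if label is not None
        else min(centroids, key=lambda g: dist(point, centroids[g]))
        for point, label in zip(pcs, labels)
    ]

# Hypothetical toy data: two genetic principal components per individual;
# the last two individuals did not report a group.
pcs = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (0.1, 0.2), (4.8, 5.0)]
labels = ["A", "A", "B", "B", None, None]
result = surrogate_labels(pcs, labels)
print(result)
```

The resulting labels can then be used to stratify a GWAS by group. This sketch leaves reported labels unchanged and does not check whether they agree with genetic ancestry.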
Promoting Equity in Genetic Research Through Improved Imputation
Diverse reference panels, such as the one built by the Trans-Omics for Precision Medicine (TOPMed) program, have begun to reduce disparities in imputation accuracy among ancestries. This work highlights the need to increase diversity in genetic studies to ensure equitable research outcomes (Cahoon et al., 2023).
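Disparities of this kind are commonly quantified as the squared Pearson correlation between imputed dosages and true genotypes, computed separately for each group. The sketch below shows that calculation for a single variant on made-up data; the group names, the record layout, and the values are assumptions for illustration and are not taken from Cahoon et al. (2023).

```python
from collections import defaultdict
from statistics import mean

def squared_correlation(xs, ys):
    """Squared Pearson correlation; assumes both inputs have nonzero variance."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

def accuracy_by_group(records):
    """records: iterable of (group, true_genotype, imputed_dosage) tuples."""
    grouped = defaultdict(lambda: ([], []))
    for group, truth, dosage in records:
        grouped[group][0].append(truth)
        grouped[group][1].append(dosage)
    return {g: squared_correlation(t, d) for g, (t, d) in grouped.items()}

# Hypothetical toy data for one variant: true genotypes are allele counts
# (0, 1, 2) and imputed dosages are expected allele counts in [0, 2].
records = [
    ("well_represented", 0, 0.0),
    ("well_represented", 1, 1.0),
    ("well_represented", 2, 2.0),
    ("underrepresented", 0, 1.0),
    ("underrepresented", 1, 0.0),
    ("underrepresented", 2, 2.0),
]
accuracy = accuracy_by_group(records)
print(accuracy)
```

A gap between groups in this metric signals that the reference panel captures one group's haplotypes better than the other's.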
In conclusion, recent studies underscore the importance of incorporating multi-ethnic perspectives and diverse reference panels in genetic imputation. This inclusivity not only improves the accuracy of imputation across underrepresented populations but also broadens the scope of genetic discoveries, contributing to a more comprehensive understanding of human genetics.
References:
Gu, L., Wu, H., Zhang, Y., Liu, T., He, J., Liu, X., Chen, G., Jiang, D., & Fang, M. (2023). Rapid and accurate multi-phenotype imputation for millions of individuals. bioRxiv.
Xu, Z., Rüeger, S., Zwyer, M., Brites, D., Hiza, H., Reinhard, M., Borrell, S., Isihaka, F., Temba, H., Maroa, T., Naftari, R., Hella, J., Sasamalo, M., Reither, K., Portevin, D., Gagneux, S., & Fellay, J. (2021). Using population-specific add-on polymorphisms to improve genotype imputation in underrepresented populations. PLoS Computational Biology, 18.
Luo, Y., Kanai, M., Choi, W., Li, X., Sakaue, S., Yamamoto, K., Ogawa, K., Gutierrez-Arcelus, M., Gregersen, P., Stuart, P., Elder, J., Forer, L., Schönherr, S., Fuchsberger, C., Smith, A., Fellay, J., Carrington, M., Haas, D., Guo, X., Palmer, N., Chen, Y., Rotter, J., Taylor, K., Rich, S., Correa, A., Wilson, J., Kathiresan, S., Cho, M., Metspalu, A., Esko, T., Okada, Y., Han, B., McLaren, P., & Raychaudhuri, S. (2021). A high-resolution HLA reference panel capturing global population diversity enables multi-ancestry fine-mapping in HIV host response. Nature Genetics, 53, 1504-1516.
Fang, H., Hui, Q., Lynch, J., Honerlaw, J., Assimes, T., Huang, J., Vujkovic, M., Damrauer, S., Pyarajan, S., Gaziano, J., Duvall, S., O'Donnell, C., Cho, K., Chang, K., Wilson, P., Tsao, P., Ramoni, R., Breeling, J., Huang, G., Muralidhar, S., Moser, J., Whitbourne, S., Brewer, J., Concato, J., Warren, S., Argyres, D., Stephens, B., Brophy, M., Humphries, D., Do, N., Shayan, S., Nguyen, X., Hauser, E., Sun, Y., Zhao, H., Wilson, P., McArdle, R., Dellitalia, L., Harley, J., Whittle, J., Beckham, J., Wells, J., Gutierrez, S., Gibson, G., Kaminsky, L., Villareal, G., Kinlay, S., Xu, J., Hamner, M., Haddock, K., Bhushan, S., Iruvanti, P., Godschalk, M., Ballas, Z., Buford, M., Mastorides, S., Klein, J., Ratcliffe, N., Florez, H., Swann, A., Murdoch, M., Sriram, P., Yeh, S., Washburn, R., Jhala, D., Aguayo, S., Cohen, D., Sharma, S., Callaghan, J., Oursler, K., Whooley, M., Ahuja, S., Gutierrez, A., Schifman, R., Greco, J., Rauchman, M., Servatius, R., Oehlert, M., Wallbom, A., Fernando, R., Morgan, T., Stapley, T., Sherman, S., Anderson, G., Sonel, E., Boyko, E., Meyer, L., Gupta, S., Fayad, J., Hung, A., Lichy, J., Hurley, R., Robey, B., Striker, R., Sun, Y., & Tang, H. (2019). Harmonizing Genetic Ancestry and Self-identified Race/Ethnicity in Genome-wide Association Studies. American Journal of Human Genetics.
Cahoon, J., Rui, X., Tang, E., Simons, C., Langie, J., Chen, M., Lo, Y., & Chiang, C. (2023). Imputation Accuracy Across Global Human Populations. bioRxiv.