Decoding Multiple Sclerosis Risk Through Multi-Omics and Machine Learning

Multiple sclerosis (MS) is a complex neuroinflammatory disease marked by immune-mediated demyelination and neurodegeneration in the central nervous system. Although genome-wide association studies (GWAS) have reported more than 200 MS-associated genetic loci, many of the variants lie in noncoding regions, which makes it difficult to determine which genes are functionally responsible for disease risk. The article addresses this problem by integrating genomic, transcriptomic, proteomic, and machine-learning approaches to move beyond association signals and prioritize biologically meaningful MS risk genes.

Multi-Omics Strategy: From GWAS Signals to Functional Genes
The study used a stepwise integrative framework combining MS GWAS summary statistics with brain-derived expression quantitative trait loci (eQTL), splicing quantitative trait loci (sQTL), and protein quantitative trait loci (pQTL). The authors applied summary-data-based Mendelian randomization (SMR) and colocalization analyses to test whether genetic variants associated with MS also influenced gene expression, RNA splicing, or protein abundance through shared causal variants. This design matters because it helps distinguish plausible causal genes from nearby genes that are implicated only by genomic proximity.
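
To make the SMR step more concrete, here is a minimal sketch of the single-variant SMR test as it is commonly described: the effect of a molecular trait on disease is estimated as the ratio of the GWAS effect to the QTL effect at the top cis-QTL variant, and the test statistic combines the two z-scores. The article summary does not give the formulas or software settings the authors used, so the function name `smr_test` and the input numbers are illustrative assumptions, not the authors' implementation.

```python
import math


def smr_test(b_gwas, se_gwas, b_qtl, se_qtl):
    """Approximate SMR test for one top cis-QTL variant.

    b_gwas, se_gwas: variant effect on disease and its standard error.
    b_qtl, se_qtl:   variant effect on the molecular trait (expression,
                     splicing, or protein level) and its standard error.
    Returns (b_xy, t_smr, p_value).
    """
    z_gwas = b_gwas / se_gwas
    z_qtl = b_qtl / se_qtl
    # Ratio estimate of the molecular trait's effect on disease.
    b_xy = b_gwas / b_qtl
    # Approximate test statistic, treated as chi-square with 1 df.
    t_smr = (z_gwas ** 2 * z_qtl ** 2) / (z_gwas ** 2 + z_qtl ** 2)
    # Upper tail of a 1-df chi-square: P(X > t) = erfc(sqrt(t / 2)).
    p_value = math.erfc(math.sqrt(t_smr / 2.0))
    return b_xy, t_smr, p_value


# Made-up summary statistics, for illustration only.
b_xy, t_smr, p_value = smr_test(b_gwas=0.1, se_gwas=0.02, b_qtl=0.5, se_qtl=0.05)
print(f"b_xy={b_xy:.3f}  T_SMR={t_smr:.2f}  p={p_value:.2e}")
```

A significant SMR result alone does not separate a shared causal variant from two distinct variants in linkage, which is why the study pairs it with colocalization analysis.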

Transcriptomic Findings: Expression and Splicing as Mechanistic Layers
The analysis identified multiple MS-associated transcriptomic signals, including both expression-linked and splicing-linked genes. SMR analysis prioritized 28 splicing-associated signals and 66 expression-associated genes, with many passing colocalization criteria, suggesting that the same causal variants may influence both molecular regulation and MS susceptibility. Notably, genes such as TNFRSF1A, SP140, ZC2HC1A, IFITM1, and IFITM3 emerged among risk-associated candidates, emphasizing that altered RNA abundance and alternative splicing may both contribute to MS pathogenesis.

Coexpression Networks and Immune Pathway Convergence
To connect genetic findings with disease-relevant transcriptional organization, the authors performed weighted gene coexpression network analysis (WGCNA) using peripheral blood mononuclear cell (PBMC) expression data from MS patients and healthy controls. Three gene modules were strongly associated with MS, and their overlap with SMR-prioritized genes produced 23 shared candidate genes. Functional enrichment analyses showed consistent involvement of immune-related biology, especially lymphocyte activation, immune-system regulation, NF-κB signaling, and Epstein–Barr virus infection pathways, reinforcing the central role of immune dysregulation in MS.
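
The overlap step can be sketched as a set intersection between module genes and SMR-prioritized genes. The example below also computes a hypergeometric tail probability for the size of the overlap; that test is an addition for illustration and is not something the summary says the authors ran. The function name `overlap_enrichment` and the placeholder gene identifiers are hypothetical.

```python
from math import comb


def overlap_enrichment(module_genes, prioritized_genes, background_genes):
    """Intersect two gene lists and score the overlap.

    Returns (shared_genes, p_value), where p_value is the hypergeometric
    probability of seeing at least this many shared genes by chance,
    given the background gene universe.
    """
    background = set(background_genes)
    module = set(module_genes) & background
    prioritized = set(prioritized_genes) & background
    shared = module & prioritized

    n_bg, n_pri, n_mod, n_shared = (
        len(background), len(prioritized), len(module), len(shared)
    )
    # P(X >= n_shared) when drawing n_mod genes from n_bg,
    # of which n_pri are "prioritized".
    tail = sum(
        comb(n_pri, i) * comb(n_bg - n_pri, n_mod - i)
        for i in range(n_shared, min(n_pri, n_mod) + 1)
    )
    return sorted(shared), tail / comb(n_bg, n_mod)


# Placeholder identifiers, not genes from the study.
background = [f"g{i}" for i in range(10)]
shared, p = overlap_enrichment(["g0", "g1", "g5"], ["g0", "g1", "g2"], background)
print(shared, round(p, 4))
```

The choice of background matters: restricting it to genes that were actually measured in both analyses avoids overstating the enrichment.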

Machine Learning Signature: A Ten-Gene Predictive Model
The study then used least absolute shrinkage and selection operator (LASSO) regression to refine the candidate genes into a 10-gene MS signature: ACP2, IL7, MYNN, RGS1, SAE1, SP140, TRAF3, TSPAN31, TYMP, and ZC2HC1A. This model showed very high internal predictive performance, with an area under the receiver operating characteristic curve (AUC) of 0.983 in validation, and retained moderate-to-strong performance across three independent external datasets. Although these findings are promising, the authors appropriately caution that further validation is needed, particularly against other neurological and inflammatory diseases that can clinically resemble MS.
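
To illustrate how an L1 penalty selects features and how AUC is scored, here is a small, dependency-free sketch: L1-penalized logistic regression fitted by proximal gradient descent, followed by a rank-based AUC. It is not the authors' pipeline. The simulated data, the penalty strength `lam`, the step size, and all function names are assumptions chosen for illustration, and the AUC in the demo is computed on the training data, so it is optimistic.

```python
import math
import random


def soft_threshold(w, t):
    """Shrink w toward zero by t; values within [-t, t] become exactly 0."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, row):
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, row)))


def fit_l1_logistic(X, y, lam=0.1, lr=0.1, n_iter=500):
    """L1-penalized logistic regression via proximal gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        errs = [predict_proba(w, b, row) - yi for row, yi in zip(X, y)]
        grad_b = sum(errs) / n
        grad_w = [sum(e * row[j] for e, row in zip(errs, X)) / n for j in range(p)]
        b -= lr * grad_b  # the intercept is not penalized
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b


def auc(scores, labels):
    """Probability that a random case outscores a random control (ties = 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def make_toy_data(n=200, p=5, seed=0):
    """Simulated data: only feature 0 carries signal; the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [2.0 * label + rng.gauss(0, 1)] + [rng.gauss(0, 1) for _ in range(p - 1)]
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
w, b = fit_l1_logistic(X, y)
scores = [predict_proba(w, b, row) for row in X]
print("coefficients:", [round(wj, 3) for wj in w])
print("training AUC:", round(auc(scores, y), 3))
```

In practice the penalty strength would be chosen by cross-validation, and performance would be reported on held-out or external data, as the study does with its independent datasets.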

Proteomic Validation: ZC2HC1A and TRAF3 as High-Priority Candidates
Among the 10 signature genes, ZC2HC1A and TRAF3 were further validated at the protein level through integration of brain pQTL data with MS GWAS statistics. Both proteins showed significant positive associations with MS risk and strong colocalization evidence, supporting their prioritization as mechanistic candidates. TRAF3 is especially relevant because of its known involvement in TNF receptor signaling, B-cell regulation, NF-κB pathway control, and antiviral immune responses, while ZC2HC1A may influence immune regulation through loci affecting T-cell activation and neighboring immune genes such as IL7.

Broader Significance and Future Directions
Overall, this article presents a rigorous multi-omics framework for translating MS genetic associations into candidate causal genes and potential biomarkers. Its major contribution is the integration of regulatory genomics, coexpression networks, proteomic validation, immune-cell inference, and predictive modeling into a unified analysis. The findings support the hypothesis that MS risk is shaped by genetically regulated immune dysfunction, particularly involving CD4+ T cells, mast-cell states, NF-κB signaling, and potentially Hedgehog pathway activity. Future functional experiments, larger multi-ancestry cohorts, and clinical validation studies will be essential before these genes can be used confidently for diagnosis, prognosis, or therapeutic targeting.

Disclaimer: This blog post is based on the research article cited below and is intended for informational purposes only. It is not intended to provide medical advice. Please consult a healthcare professional about any health concerns.

References:
Chen, M., Zhao, D., Fan, H. et al. Integrated multi-omics and machine learning prioritize key immune genes for multiple sclerosis risk prediction. Mamm Genome 37, 38 (2026). https://doi.org/10.1007/s00335-026-10207-6