Gene–Environment Interactions in Multiple Sclerosis: Evidence from UK Biobank Polygenic Risk Modeling
Multiple sclerosis (MS) is widely understood as a complex disorder whose susceptibility emerges from both inherited variation and non-genetic exposures. The article by Jacobs and colleagues uses the scale and depth of the UK Biobank to address a central unresolved question in MS epidemiology: does genetic liability modify the effect of established environmental risk factors? Rather than asking whether genes and environment independently contribute to MS (a premise already supported by decades of work), the investigators focus on interaction: the possibility that the impact of a given exposure is amplified (or attenuated) among individuals with higher inherited risk. This framing targets the persistent gap between known heritability and explained risk, sometimes discussed as “missing risk,” for which gene–environment interplay is a plausible contributor.
Cohort, Case Ascertainment, and Environmental Exposures
The study draws on nearly the full UK Biobank cohort with both genotype and phenotype data, identifying 2,250 MS cases and ~486,000 controls.
Case status was derived from linked clinical coding (ICD-10 G35 and related sources), self-report, primary care records, and death registration, with a sensitivity analysis requiring corroboration by at least two sources to reduce misclassification.
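The difference between the primary case definition and the stricter sensitivity definition can be sketched in a few lines of Python. This is only an illustration: the source labels, the `is_case` helper, and the toy participants are hypothetical and do not reproduce the article's actual field definitions.

```python
from typing import Dict, Set

# Hypothetical labels for the four ascertainment sources named in the text.
SOURCES = {"hospital_icd10", "self_report", "primary_care", "death_registry"}


def is_case(sources_with_ms: Set[str], min_sources: int = 1) -> bool:
    """Return True if MS is recorded in at least `min_sources` distinct sources."""
    return len(sources_with_ms & SOURCES) >= min_sources


# Toy participants, invented for illustration only.
participants: Dict[str, Set[str]] = {
    "p1": {"hospital_icd10", "primary_care"},
    "p2": {"self_report"},
    "p3": set(),
}

# Primary definition: any single source is enough.
primary = {pid for pid, s in participants.items() if is_case(s, min_sources=1)}
# Sensitivity definition: at least two corroborating sources.
strict = {pid for pid, s in participants.items() if is_case(s, min_sources=2)}
```

Under the stricter rule, a participant with only a self-report drops out of the case group, which is the trade-off the sensitivity analysis makes: fewer cases, but less misclassification.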
To mitigate reverse causation, the authors prioritized exposures plausibly occurring before MS onset (e.g., childhood/adolescent variables), including childhood body size at age 10 (as a proxy for childhood obesity), smoking before age 20, age at menarche, infectious mononucleosis before age 20, and several additional early-life metrics.
This design choice is methodologically important because MS prodromal changes could otherwise distort exposure–disease associations.
Baseline Association Analyses: Which Exposures Track With MS in UK Biobank?
Before testing interaction, the investigators first re-evaluate established MS risk factors within UK Biobank using multivariable logistic regression adjusted for age, sex, ethnicity, deprivation, and birth latitude.
Three exposures show strong evidence of association after Bonferroni correction: larger childhood body size at age 10 (e.g., “plumper” vs “thinner,” OR ≈ 1.36), smoking before age 20 (OR ≈ 1.21), and earlier age at menarche (OR < 1 per year, consistent with higher risk at younger menarche ages).
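For readers unfamiliar with Bonferroni correction, the idea is simply to divide the significance level by the number of tests performed. The sketch below assumes a conventional alpha of 0.05 and uses invented p-values and placeholder exposure names; none of these numbers come from the article.

```python
from typing import Dict, Tuple


def bonferroni(p_values: Dict[str, float], alpha: float = 0.05) -> Tuple[float, Dict[str, bool]]:
    """Return the per-test threshold (alpha / m) and a pass/fail flag per test."""
    m = len(p_values)
    threshold = alpha / m
    flags = {name: p < threshold for name, p in p_values.items()}
    return threshold, flags


# Hypothetical p-values, NOT taken from the article.
toy_p = {
    "childhood_body_size": 0.0001,
    "smoking_before_20": 0.002,
    "age_at_menarche": 0.004,
    "exposure_d": 0.03,  # placeholder name
    "exposure_e": 0.40,  # placeholder name
}

threshold, flags = bonferroni(toy_p)
```

With five tests the per-test threshold becomes 0.01, so an exposure with p = 0.03 would look significant on its own but does not survive correction.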
The forest plot on page 7 of the article summarizes these exposure effects and shows that childhood body size and early smoking stand out among the candidate early-life variables evaluated.
Importantly, the authors also report that these environmental associations remain similar after incorporating the major MS-associated HLA alleles (HLA-DRB1*15:01 risk; HLA-A*02:01 protective), supporting at least partial independence at the level of main effects.
Building Genetic Liability Measures: PRS With and Without the MHC
A distinguishing feature of the study is its careful construction of polygenic risk scores (PRS) using external weights from the largest MS genome-wide association study meta-analysis, and the deliberate creation of two PRS variants: one including the major histocompatibility complex (MHC) and one excluding it.
This is not a cosmetic choice: the MHC contains the strongest common variant signals for MS, and separating MHC from non-MHC burden helps clarify whether interactions reflect classic HLA biology or broader polygenic architecture. In the validation cohort, both PRS are strongly associated with MS, with the MHC-inclusive PRS explaining more variance (Nagelkerke pseudo-R² ~0.033) than the non-MHC PRS (~0.013).
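At its core, a PRS is a weighted sum of risk-allele dosages, and the MHC-excluded variant simply skips variants in that region. The sketch below illustrates this under stated assumptions: the MHC boundaries (chromosome 6, 25 to 35 Mb) are a commonly used approximation rather than a value confirmed from the article, and the variant IDs, weights, and the `prs` helper are hypothetical.

```python
from typing import Dict, NamedTuple


class Variant(NamedTuple):
    chrom: int
    pos: int
    weight: float  # per-allele log odds ratio from an external GWAS


# Assumed approximate MHC boundaries; not confirmed from the article.
MHC_CHROM, MHC_START, MHC_END = 6, 25_000_000, 35_000_000


def in_mhc(v: Variant) -> bool:
    """True if the variant falls inside the assumed MHC window."""
    return v.chrom == MHC_CHROM and MHC_START <= v.pos <= MHC_END


def prs(dosages: Dict[str, float], variants: Dict[str, Variant], include_mhc: bool = True) -> float:
    """Weighted sum of effect-allele dosages (0, 1, or 2 per variant).

    Missing dosages are treated as 0 here, which is a simplification.
    """
    total = 0.0
    for vid, v in variants.items():
        if not include_mhc and in_mhc(v):
            continue
        total += v.weight * dosages.get(vid, 0.0)
    return total


# Toy variants and one toy individual (placeholder IDs, invented weights).
variants = {
    "rsA": Variant(6, 32_000_000, 1.0),    # inside the assumed MHC window
    "rsB": Variant(1, 1_000_000, 0.5),
    "rsC": Variant(6, 90_000_000, -0.25),  # chromosome 6 but outside the MHC
}
dosages = {"rsA": 1, "rsB": 2, "rsC": 1}

score_with_mhc = prs(dosages, variants, include_mhc=True)
score_without_mhc = prs(dosages, variants, include_mhc=False)
```

Real pipelines also handle linkage disequilibrium clumping, p-value thresholds, and allele alignment; those steps are omitted here.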
The graphics on page 8 of the article show the relative performance of multiple PRS specifications and illustrate that MS odds increase monotonically across PRS deciles. This serves as an internal consistency check that supports score validity even when overall explained variance remains modest.
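A decile check of this kind can be approximated with a crude, unadjusted calculation: rank individuals by score, cut them into ten equal bins, and compute the odds of being a case in each bin. The following is a minimal sketch on synthetic data, not the article's covariate-adjusted analysis; the function names and the toy cohort are assumptions.

```python
from typing import List, Sequence


def odds_by_decile(scores: Sequence[float], is_case: Sequence[bool]) -> List[float]:
    """Rank by score, split into 10 equal-sized bins, return case odds per bin."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(order)
    odds = []
    for d in range(10):
        idx = order[d * n // 10:(d + 1) * n // 10]
        cases = sum(1 for i in idx if is_case[i])
        controls = len(idx) - cases
        odds.append(cases / controls if controls else float("inf"))
    return odds


def is_monotonic_increasing(values: Sequence[float]) -> bool:
    """True if each value is at least as large as the one before it."""
    return all(a <= b for a, b in zip(values, values[1:]))


# Synthetic cohort of 200 people: decile d (0..9) contains d + 1 cases out of 20.
scores = [float(i) for i in range(200)]
labels = [(i % 20) < (i // 20) + 1 for i in range(200)]

decile_odds = odds_by_decile(scores, labels)
# Odds ratios relative to the lowest decile.
odds_ratios = [o / decile_odds[0] for o in decile_odds]
```

In this toy cohort the odds rise from 1/19 in the lowest decile to 1.0 in the highest, so the monotonicity check passes by construction.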
Principal Finding: Interaction Between Childhood Obesity Proxy and Polygenic Risk
The central result is evidence for additive interaction between childhood body size and polygenic susceptibility. Specifically, the attributable proportion due to interaction (AP) between childhood body size and PRS is approximately 0.17 for both the MHC-inclusive and non-MHC PRS, with confidence intervals excluding zero and p-values below conventional thresholds.
Additive interaction here means that the combined effect of high genetic risk and larger childhood body size exceeds what would be expected from summing their independent contributions on the risk scale. The forest plot on page 10 (Figure 4A) provides a compact visual representation: among evaluated exposure–PRS pairs, childhood body size is the interaction signal with the most robust support.
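The AP statistic follows from standard epidemiological definitions: the relative excess risk due to interaction (RERI) is the joint odds ratio minus each single-factor odds ratio plus one, and AP is RERI divided by the joint odds ratio. The sketch below uses those textbook formulas with hypothetical odds ratios; it does not reproduce how the article parameterized the PRS, and it omits confidence intervals.

```python
def reri(or_both: float, or_exposure_only: float, or_genetic_only: float) -> float:
    """Relative excess risk due to interaction, using odds ratios as risk ratio proxies."""
    return or_both - or_exposure_only - or_genetic_only + 1.0


def attributable_proportion(or_both: float, or_exposure_only: float, or_genetic_only: float) -> float:
    """Share of the joint effect attributable to interaction (AP = RERI / OR_both)."""
    return reri(or_both, or_exposure_only, or_genetic_only) / or_both


# Hypothetical odds ratios relative to the doubly unexposed group; NOT the article's estimates.
example_reri = reri(or_both=3.0, or_exposure_only=1.4, or_genetic_only=2.0)
example_ap = attributable_proportion(or_both=3.0, or_exposure_only=1.4, or_genetic_only=2.0)
```

An AP of zero means the joint effect is exactly what additivity predicts; a positive AP means part of the risk in the doubly exposed group arises only when both factors are present.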
Notably, the authors report no strong evidence of multiplicative interaction. This pattern, additive interaction without multiplicative interaction, is common in complex disease contexts and underscores why both scales are informative: biologically meaningful synergy can appear on one scale but not the other.
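A small numerical example shows how the two scales can disagree. The odds ratios below are hypothetical and chosen only to make the arithmetic clean: the joint effect equals the product of the single-factor effects (no multiplicative interaction), yet it exceeds what additivity would predict.

```python
def multiplicative_ratio(or_both: float, or_a: float, or_b: float) -> float:
    """Ratio of the joint OR to the product of single-factor ORs (1.0 = no interaction)."""
    return or_both / (or_a * or_b)


def additive_excess(or_both: float, or_a: float, or_b: float) -> float:
    """Joint OR minus the additive expectation OR_a + OR_b - 1 (0.0 = no interaction)."""
    return or_both - (or_a + or_b - 1.0)


# Hypothetical odds ratios, NOT taken from the article.
or_a, or_b, or_both = 2.0, 1.5, 3.0

ratio = multiplicative_ratio(or_both, or_a, or_b)  # 3.0 / (2.0 * 1.5)
excess = additive_excess(or_both, or_a, or_b)      # 3.0 - (2.0 + 1.5 - 1.0)
```

Here the multiplicative ratio is exactly 1.0 while the additive excess is 0.5, so the same data would be read as "no interaction" on one scale and "positive interaction" on the other.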
Interpretation, Biological Plausibility, and Translational Implications
A key interpretive nuance emphasized by the authors is that statistical interaction is not synonymous with biological interaction. Nevertheless, the childhood obesity signal has substantial plausibility in MS pathogenesis given convergent evidence cited by the paper, including Mendelian randomization studies supporting a causal role of higher BMI/childhood obesity in MS risk.
From a translational perspective, the study motivates a prevention framework that is rarely feasible in low-incidence diseases: risk stratification. If the adverse effect of childhood obesity is meaningfully stronger among individuals with high genome-wide genetic liability, then interventions targeting childhood obesity might yield greater absolute benefit in genetically enriched groups, potentially improving the efficiency and power of prevention trials.
The authors also describe suggestive evidence for gene–gene interplay, reporting additive interaction between non-MHC PRS and HLA-DRB1*15:01 carriage (page 7), which, if replicated, would further support the notion of layered genetic architecture shaping MS susceptibility.
Limitations and What a Robust Next Step Would Require
The paper is explicit about limitations that constrain causal and predictive interpretation. MS case definition relies on health-record coding and self-report rather than criteria-confirmed diagnoses; exposures such as childhood body size and adolescent smoking are retrospective and thus vulnerable to recall bias; and UK Biobank’s demographic structure (predominantly White, older at recruitment, socioeconomically selected) creates risks of selection and collider bias.
Critically, while the authors split the dataset into training and testing subsets to reduce PRS overfitting, this is not equivalent to replication in an independent cohort.
A rigorous next step would therefore combine (i) replication of the PRS–childhood obesity interaction in an external MS cohort with harmonized exposure definitions, and (ii) deeper localization analyses to identify which variants or pathways drive the observed interaction, as explicitly recommended by the authors.
If those steps confirm robustness, the work would strengthen a clinically relevant narrative: prevention in MS may depend not only on reducing exposure prevalence, but on understanding who benefits most from exposure modification based on inherited susceptibility.
Disclaimer: This blog post is based on the research article cited below and is intended for informational purposes only. It is not intended to provide medical advice. Please consult with a healthcare professional for any health concerns.
References:
Jacobs, B. M., Noyce, A. J., Bestwick, J., Belete, D., Giovannoni, G., & Dobson, R. (2021). Gene-environment interactions in multiple sclerosis: a UK Biobank study. Neurology: Neuroimmunology & Neuroinflammation, 8(4), e1007.
