Blood protein levels predict leading incident morbidities and mortality in the UK Biobank 

Objective:

This study examined how the circulating proteome relates to age-related diseases and mortality in the UK Biobank. Using blood data from 47,600 individuals followed over 16 years and linked to electronic health records, we tested associations between the levels of 1,468 proteins measured with Olink and the occurrence of 23 age-related diseases and mortality. The goal was to clarify the biological mechanisms and pathways that precede disease, and to show how the circulating proteome can help identify early indicators of disease.

Solutions:

  • Developed protein-based scores (ProteinScores) using penalised Cox regression.
  • Used these scores to quantify and stratify the risk of disease onset.
  • Applied ProteinScores to test sets to evaluate how well they predict the 10-year onset of incident outcomes.
  • Compared ProteinScores with conventional metrics such as age, sex, and lifestyle factors to assess their predictive capability.
  • Investigated the potential of ProteinScores in identifying and stratifying risk for major age-related diseases, particularly type 2 diabetes.
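
As a rough illustration of the approach listed above, the sketch below fits a penalised Cox model on simulated data, turns the coefficients into a linear score, and checks on a held-out test set whether the score separates higher-risk from lower-risk individuals. Everything in it is an assumption made for the example rather than the study's pipeline: the data are simulated with two made-up "proteins", the penalty is a simple ridge term fitted by gradient ascent, and the names `cox_ridge_fit` and `protein_score` are hypothetical.

```python
import math
import random


def cox_ridge_fit(X, time, event, penalty=0.05, lr=0.3, n_iter=300):
    """Gradient ascent on the ridge-penalised Cox partial log-likelihood.

    Assumes no tied event times (true for the continuous simulated times below).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    # Visit subjects from longest to shortest follow-up, so that the running
    # sums below always cover the risk set of the current subject.
    order = sorted(range(n), key=lambda i: -time[i])
    for _ in range(n_iter):
        grad = [0.0] * p
        s0 = 0.0          # sum of exp(x . beta) over the risk set
        s1 = [0.0] * p    # sum of x * exp(x . beta) over the risk set
        for i in order:
            w = math.exp(sum(b * x for b, x in zip(beta, X[i])))
            s0 += w
            for j in range(p):
                s1[j] += w * X[i][j]
            if event[i]:
                for j in range(p):
                    grad[j] += X[i][j] - s1[j] / s0
        for j in range(p):
            # Average log-likelihood gradient minus the ridge shrinkage term.
            beta[j] += lr * (grad[j] / n - penalty * beta[j])
    return beta


def protein_score(beta, x):
    """Linear score: weighted sum of (standardised) protein levels."""
    return sum(b * v for b, v in zip(beta, x))


# Simulated cohort (assumption): protein 0 raises the hazard, protein 1 is noise.
random.seed(0)
n = 300
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(n)]
t_event = [random.expovariate(0.1 * math.exp(1.0 * x[0])) for x in X]
follow_up = 10.0  # censor everyone at 10 years, echoing the 10-year horizon
time = [min(t, follow_up) for t in t_event]
event = [t < follow_up for t in t_event]

# Fit on a training split, then score the held-out test split.
train, test = range(0, 200), range(200, n)
beta = cox_ridge_fit([X[i] for i in train],
                     [time[i] for i in train],
                     [event[i] for i in train])

test_scores = [protein_score(beta, X[i]) for i in test]
cutoff = sorted(test_scores)[len(test_scores) // 2]  # median split
high = [event[i] for i, s in zip(test, test_scores) if s >= cutoff]
low = [event[i] for i, s in zip(test, test_scores) if s < cutoff]
rate_high = sum(high) / len(high)
rate_low = sum(low) / len(low)

print("coefficients:", [round(b, 2) for b in beta])
print("10-year event rate, high-score half:", round(rate_high, 2))
print("10-year event rate, low-score half: ", round(rate_low, 2))
```

In practice, a survival analysis library with an elastic-net or lasso penalty would be a more natural choice than a hand-written ridge fit, and discrimination on the test set would usually be summarised with a metric such as the concordance index rather than a median split.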

Challenges:

  • Handling and analyzing a large volume of UK Biobank blood data from 47,600 individuals followed over 16 years.
  • Ensuring the accuracy and reliability of protein measurements and their associations with diseases.
  • Developing robust protein-based scores that effectively predict disease onset and outperform existing markers.
  • Addressing potential confounding variables and biases in the data analysis process.
  • Integrating findings from proteomic analysis with clinical and genetic data for comprehensive disease prognosis.

Impact:

  • Identified 3,201 associations between 961 protein levels and 21 distinct incident outcomes, providing valuable insights into disease mechanisms.
  • Established ProteinScores that demonstrated enhanced predictive accuracy compared to conventional metrics and other clinically relevant biomarkers.
  • Highlighted the potential of the plasma proteome in early risk identification and stratification for major age-related diseases.
  • Showed that ProteinScores surpassed existing clinical markers and polygenic risk scores, most notably in predicting the onset of type 2 diabetes.
  • Contributed to a deeper understanding of proteomic signatures and their crucial role in disease prognosis through collaborative efforts with academic and industry partners.