
Foresight AI case study

The Foresight project represents a groundbreaking AI initiative to transform predictive healthcare in the UK.

Researchers at University College London (UCL) and King's College London (KCL) are using de-identified NHS data from approximately 57 million people in England to develop a system that helps clinicians, planners and policymakers predict health outcomes more effectively in the wake of the COVID-19 pandemic, while maintaining strict patient privacy standards.


Data integration and patient impact

Foresight is an AI model specifically designed for healthcare that has been trained in the NHS England Secure Data Environment (SDE) on de-identified NHS data.

Like ChatGPT, Foresight learns to predict what happens next based on previous events, working like an auto-complete function for medical timelines. Predictions are validated against real-world data. 
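
The "auto-complete for medical timelines" idea can be illustrated with a deliberately simple sketch. It is not Foresight's actual model: it swaps the large AI model for plain frequency counts of which event tends to follow which, and then checks its predictions against held-out timelines as a stand-in for validation against real-world data. The event names, timelines and function names are all made up for illustration.

```python
from collections import Counter, defaultdict


def train_next_event_model(timelines):
    """Count how often each event is directly followed by each other event."""
    transitions = defaultdict(Counter)
    for timeline in timelines:
        for current, following in zip(timeline, timeline[1:]):
            transitions[current][following] += 1
    return transitions


def predict_next(transitions, event):
    """Return the most frequent follower of `event`, or None if it was never seen."""
    followers = transitions.get(event)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def holdout_accuracy(transitions, timelines):
    """Fraction of next-event predictions that match the held-out timelines."""
    correct = total = 0
    for timeline in timelines:
        for current, actual in zip(timeline, timeline[1:]):
            total += 1
            if predict_next(transitions, current) == actual:
                correct += 1
    return correct / total if total else 0.0


# Hypothetical toy timelines (invented event names, not NHS data).
train = [
    ["gp_visit", "blood_test", "diagnosis", "prescription"],
    ["gp_visit", "blood_test", "diagnosis", "prescription"],
    ["gp_visit", "blood_test", "prescription"],
    ["admission", "procedure", "discharge"],
]
held_out = [
    ["gp_visit", "blood_test", "prescription"],
    ["admission", "procedure", "discharge"],
]

model = train_next_event_model(train)
print(predict_next(model, "gp_visit"))
print(holdout_accuracy(model, held_out))
```

A counting model like this only looks one event back. A model of the kind described here would condition on a much longer history, but the principle of learning from past sequences and checking predictions against unseen ones is the same.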

The NHS England SDE provided secure access to extensive de-identified records to train Foresight AI. It included: 

  • data from 57 million individuals (including hospital admissions, vaccinations and other medical records) 

  • over 10 billion healthcare events across more than 40,000 types (for example, admissions, diagnoses, procedures and medications) 

  • records spanning November 2018 to December 2023

By drawing on population-wide data, Foresight AI can also study minority groups and rare diseases, which are often overlooked in smaller datasets.

Training large language AI models like Foresight requires significant computing power. Industry partners Amazon Web Services (AWS) and Databricks provide this essential resource while never accessing the data or AI model. The research team's access to the SDE was facilitated by the BHF Data Science Centre at Health Data Research UK, following rigorous approval processes and involving members of the public in approving and shaping the research.

Through this innovative collaboration between NHS England, academia and industry, Foresight is exploring how large-scale, de-identified health data can improve healthcare efficiency and fairness across the population. 

Foresight shows the incredible potential of AI combined with secure access to the de-identified data held in NHS England's SDE. This project is paving the way for smarter, more predictive healthcare – helping us improve patient outcomes across the UK.

AI models are only as good as the data they are trained on. To benefit all patients, the AI must be trained on data that represents everyone.

We are delighted to support the Foresight team. Through access to secure and scalable cloud computing resources, and the latest cutting-edge GPUs for training and optimising AI models, the NHS will be able to innovate faster and accelerate the development of this AI tool.

Developing AI models at a large scale is challenging. By collaborating with NHSE and AWS using Mosaic AI, we have demonstrated what is possible for truly impacting patient care. We're proud to be part of this pioneering initiative.

Only approved researchers working under strict NHS governance can access the data, which remains within the SDE at all times, as does the trained model. Industry partners provide infrastructure but cannot access the raw data, AI model or any outputs. 


Patient impact

While still in development, Foresight demonstrates significant potential benefits for healthcare delivery.





Conclusion

Foresight exemplifies how data-driven healthcare can predict and manage health risks proactively, supporting strategic healthcare shifts. Patient privacy remains central, with stringent measures in place: 

  • all data is de-identified 

  • only approved researchers can access data within the SDE 

  • no data or outputs leave the secure environment without oversight 

  • public and patient involvement ensures alignment with public expectations 

To expand Foresight and validate the model further, the research team aims to: 

  • add more data sources – clinician notes, lab results, imaging data 

  • extend historical data analysis to deepen understanding of long-term health trends 

  • expand use cases beyond COVID-19-related research 

  • test in other nations to verify model effectiveness  

The current dataset is broad but shallow. We hope to improve Foresight-SDE by incorporating richer sources like clinician notes, blood tests and scans.

Partners

NHS England Secure Data Environment 

University College London (UCL) 

King's College London (KCL) 

NHS Foundation Trusts (King's College Hospital, South London and Maudsley) 

British Heart Foundation Data Science Centre's CVD-COVID-UK/COVID-IMPACT Consortium 

Amazon Web Services (AWS) 

Databricks 

CogStack 

National Institute for Health and Care Research Biomedical Research Centres 

UK Research and Innovation 

Medical Research Council 

Last edited: 7 May 2025 8:01 am