
Advanced analysis components to support SNOMED PaLM mapping project

The goal of this guide is to outline advanced analytics components and key considerations that may be relevant to the SNOMED PaLM mapping project. These insights will support the development of best practice guidance and an options paper for the programme.



Author: Jonny Pearson – Lead Data Scientist, Data Science Team, NHS England

Acknowledgment: Special thanks to Avish Vijayaraghavan for providing reference materials from the internal project 'Resource-Constrained Annotation Workflows for Paediatric Histopathology Reports using LLMs.'

A literature review was conducted to investigate current practices in ontology mapping. Its findings were combined with an existing review on using large language models (LLMs) for histopathology reports. Much of the broader literature was excluded because it covered:

  • associated tasks using ontology mapping to improve classification performance
  • ill-formed ontologies or those affected by missing data in a data set (missingness)
  • studies focused solely on extensional matchers or matching via upper ontologies

Structure

The role of advanced analytics in the SNOMED PaLM mapping project can be considered across 3 areas, 2 of which are in scope and 1 of which is future-facing.

  1. Current scope: Parsing semi-structured data to structured

Involves converting a semi-structured lab Read code into identifiable entities (for example, component, property, inheres in and direct site). This includes handling many-to-many mappings, where one source term can yield several values for the same entity.

  2. Current scope: Matching using terminological methods

Uses the extracted categories to create matching scores against a target ontology. Includes pre-processing for acronyms and using referential maps (such as existing PBCL mappings).

  3. Future scope: Parsing unstructured data to structured

Primarily focuses on extracting mappable content from full pathology reports in free-text form. This process precedes or supplements the first two steps but is beyond the current scope.
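
As an illustration of the first area (parsing semi-structured data to structured), the sketch below splits a lab term description into candidate entities using small lookup vocabularies. It is a minimal example under stated assumptions, not the project's method: the vocabularies, the `ParsedTerm` class and the `parse_lab_term` function are all hypothetical, and real Read code descriptions are likely to need far richer rules. Each entity is held as a list so that one source term can yield several values.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field

# Hypothetical vocabularies for illustration only. A real project would
# derive these from curated terminology resources.
DIRECT_SITE_TERMS = {"serum": "serum", "plasma": "plasma", "urine": "urine", "blood": "blood"}
PROPERTY_TERMS = {"level": "concentration", "concentration": "concentration",
                  "count": "count", "ratio": "ratio"}


@dataclass
class ParsedTerm:
    """Candidate entities extracted from one source description."""
    source: str
    component: list[str] = field(default_factory=list)
    property: list[str] = field(default_factory=list)
    direct_site: list[str] = field(default_factory=list)


def parse_lab_term(text: str) -> ParsedTerm:
    """Assign each token to an entity slot; unrecognised tokens form the component."""
    parsed = ParsedTerm(source=text)
    leftover = []
    # Lower-case the text and split on anything that is not a letter or digit.
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        if token in DIRECT_SITE_TERMS:
            parsed.direct_site.append(DIRECT_SITE_TERMS[token])
        elif token in PROPERTY_TERMS:
            parsed.property.append(PROPERTY_TERMS[token])
        else:
            leftover.append(token)
    if leftover:
        parsed.component.append(" ".join(leftover))
    return parsed


if __name__ == "__main__":
    print(parse_lab_term("Serum sodium level"))
    # A term naming two specimens produces two direct site candidates.
    print(parse_lab_term("Serum/plasma glucose level"))
```

Keeping every slot as a list means a downstream matcher can expand one source term into several candidate combinations rather than forcing a single interpretation at parse time.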

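The second area (matching using terminological methods) could be sketched as follows. This is an assumption-laden example rather than the project's approach: the acronym table, the target terms with placeholder identifiers (`T001` and so on), the referential map and the function names are all hypothetical, and the similarity measure is Python's standard-library `difflib.SequenceMatcher`, swapped in here as a simple stand-in for whichever string metric the project selects.

```python
from difflib import SequenceMatcher

# Hypothetical data for illustration only. Identifiers are placeholders,
# not real SNOMED CT concept IDs.
ACRONYMS = {"hb": "haemoglobin", "wbc": "white blood cell", "na": "sodium"}
TARGET_TERMS = {
    "T001": "haemoglobin concentration in blood",
    "T002": "white blood cell count in blood",
    "T003": "sodium concentration in serum",
}
# Stand-in for an existing referential map (for example, prior mappings).
REFERENCE_MAP = {"serum na level": "T003"}


def normalise(text):
    """Lower-case the text and expand any known acronyms token by token."""
    return " ".join(ACRONYMS.get(tok, tok) for tok in text.lower().split())


def similarity(a, b):
    """Return a string similarity score between 0 and 1."""
    return SequenceMatcher(None, a, b).ratio()


def rank_matches(source, top_n=3):
    """Return up to top_n (target_id, score) pairs, best first."""
    key = source.lower().strip()
    # An existing reference mapping takes priority over fuzzy scoring.
    if key in REFERENCE_MAP:
        return [(REFERENCE_MAP[key], 1.0)]
    query = normalise(source)
    scored = [(tid, similarity(query, term)) for tid, term in TARGET_TERMS.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    print(rank_matches("Serum Na level"))
    print(rank_matches("Hb concentration blood"))
```

Checking the referential map before scoring keeps previously agreed mappings stable, while the ranked list for everything else leaves the final choice to a reviewer.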

Last edited: 23 May 2025 9:20 am