Person_ID and Token_Person_ID in HES outpatients 2022/23

Summary

These derived fields assign activity within a data product to a specific person. This makes it possible to calculate metrics based on counts of people and to link activity within and across other data sets that use these fields.

In June 2024, an issue was identified in provisional HES data for 2023/24 for a subset of activity reported in these fields. Records with unmatched one-time-use-id values for Person_ID and Token_Person_ID were affected: these values were being duplicated when they should have been unique.

This issue has now been resolved across all data products relating to 2023/24 activity. However, further investigation of earlier years of HES data has found that it also affects the finalised HES Outpatient data for 2022/23.

Because these duplicate values make up a very small proportion of the overall finalised asset, we will not correct the issue within the asset itself. Instead, we are making users aware of the issue and its relative scale and impact through the analyses below.


Record counts

The monthly counts of outpatient records affected by a duplicated one-time-use-id are shown below:


Users can download counts by provider from the open data file below. Note that TOTAL_RECORDS in this file is significantly lower than the summary TOTAL_RECORDS above, because the file excludes providers and months that do not have any duplicates:


Records affected

All affected records are those where the Person_ID has been assigned a one-time-use-id. This happens when the record does not have sufficient information to be matched to PDS or to the MPS record index. By definition, a one-time-use-id should only be used once, but there are cases where the same value has been used multiple times across different people.

Impacted records all have a Person_ID starting with ‘U’ in the clear data, though not all the ‘U’ values are impacted. If you have access to clear data, the workaround is to treat all Person_IDs that begin with ‘U’ as individual people. There is currently no way to identify these from the tokenised Token_Person_ID. There are no plans to correct the HES Outpatient data for 2022/23 due to the small number of records affected.
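For users with access to clear data, the workaround described above can be sketched in Python. This is a minimal illustration, not an official method. It assumes the Person_ID values have already been read into a list of strings, and it reads "treat as individual people" as counting every record with a 'U' value as a separate person. The function names and sample values are hypothetical and are not taken from HES.

```python
from collections import Counter

# Per the notice: one-time-use-ids start with 'U' in the clear data.
ONE_TIME_PREFIX = "U"


def is_one_time_use_id(person_id):
    """Return True if the clear Person_ID looks like a one-time-use-id."""
    return person_id.startswith(ONE_TIME_PREFIX)


def count_people(person_ids):
    """Count people, treating each record with a 'U' Person_ID as its own person.

    Matched Person_IDs are de-duplicated as usual; one-time-use-ids are
    counted once per record, because a repeated 'U' value may belong to
    different people.
    """
    matched = set()
    one_time_records = 0
    for pid in person_ids:
        if is_one_time_use_id(pid):
            one_time_records += 1
        else:
            matched.add(pid)
    return len(matched) + one_time_records


def reused_one_time_ids(person_ids):
    """Return the 'U' values that appear on more than one record, with counts."""
    counts = Counter(pid for pid in person_ids if is_one_time_use_id(pid))
    return {pid: n for pid, n in counts.items() if n > 1}


# Made-up example values, not real HES data.
records = ["A001", "A001", "B002", "U123", "U123", "U456"]
print(count_people(records))         # 5
print(reused_one_time_ids(records))  # {'U123': 2}
```

A naive distinct count of the sample values would give 4, because the repeated 'U123' would be merged into one person. The workaround counts 5. This approach cannot be applied to Token_Person_ID, since the notice states that the affected values cannot be identified in the tokenised field.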

Last edited: 16 July 2024 1:48 pm