Merging ONS data to HES
The deaths recorded in HES and the ONS mortality data are merged to create the linked data so that there is only 1 record per patient (PERSON_ID). The death record used (DRU), a derived field in the linked data, indicates the source of the death record (either HES or ONS). The DRU takes 1 of 5 distinct values: HES1, HES2, ONS1, ONS2 or MIX1. Table 1 outlines the meaning of these values and indicates which record takes precedence under a range of scenarios. The record that does not take precedence is removed from the final data.
When a patient dies in hospital, the discharge date is taken as the date of death. If the discharge date is not recorded, then the episode end date is used instead.
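The fallback rule above can be sketched as a small function. This is a minimal illustration only, not the production linkage logic; the parameter names `disdate` and `epiend` are assumptions chosen to stand for the discharge date and episode end date.

```python
from datetime import date
from typing import Optional


def hes_date_of_death(disdate: Optional[date], epiend: Optional[date]) -> Optional[date]:
    """Return the HES-derived date of death for a patient who died in hospital.

    disdate and epiend are hypothetical parameter names for the discharge
    date and the episode end date.
    """
    # Prefer the discharge date; fall back to the episode end date
    # only when the discharge date is not recorded.
    return disdate if disdate is not None else epiend
```

If neither date is recorded, the sketch returns `None` rather than guessing a date.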
Table 1: Explanation of death record used in the linked data
Source | ONS mortality date | DRU | Match rank | Death record from |
---|---|---|---|---|
ONS only | not applicable | ONS2 | 1 and 8 | ONS – implies that the death was recorded only in ONS. |
HES only | not applicable | HES1 | 0 | HES – implies that the death was only recorded in HES. |
Both | More than 3 days after HES mortality date | HES2 | 1 and 8 | HES – where contradictory date of death information is present HES takes precedence. Information like date of registration and cause of death are from ONS. |
Both | 1-3 days inclusive after HES mortality date | MIX1 | 1 and 8 | ONS, but date of death from HES – proximity of dates suggests slight recording error. |
Both | 0-3 days inclusive before HES mortality date | ONS1 | 1 and 8 | ONS – implies delay in patient leaving hospital after death. |
Both | More than 3 days before HES mortality date | HES2 | 1 and 8 | HES – contradictory death information is present so HES takes precedence. Information like date of registration and cause of death are from ONS. |
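The rules in Table 1 can be expressed as a short function. This is a sketch under stated assumptions: the function names and the use of `None` for a missing record are illustrative and are not taken from the actual linkage system.

```python
from datetime import date
from typing import Optional


def derive_dru(hes_dod: Optional[date], ons_dod: Optional[date]) -> Optional[str]:
    """Derive the death record used (DRU) from the HES and ONS dates of death.

    None stands for "no death record in that source" (an assumption of this sketch).
    """
    if hes_dod is None and ons_dod is None:
        return None  # no death recorded in either source
    if ons_dod is None:
        return "HES1"  # death recorded only in HES
    if hes_dod is None:
        return "ONS2"  # death recorded only in ONS
    # Positive when the ONS date falls after the HES date.
    diff = (ons_dod - hes_dod).days
    if 1 <= diff <= 3:
        return "MIX1"  # ONS record, but date of death taken from HES
    if -3 <= diff <= 0:
        return "ONS1"  # ONS date equal to, or up to 3 days before, the HES date
    return "HES2"  # dates differ by more than 3 days, so HES takes precedence


def linked_date_of_death(hes_dod: Optional[date], ons_dod: Optional[date]) -> Optional[date]:
    """Return the date of death carried into the linked data under Table 1."""
    dru = derive_dru(hes_dod, ons_dod)
    if dru in ("ONS1", "ONS2"):
        return ons_dod
    # HES1, HES2 and MIX1 all take the date of death from HES.
    return hes_dod
```

For example, a HES date of 10 January with an ONS date of 12 January gives MIX1, while an ONS date of 14 January gives HES2.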
If DRU is ONS2, it means that the death record was present only in ONS at the time of linkage. This could be because the individual died outside of hospital.
If DRU is HES1, it indicates that the death record was present only in HES at the time of linkage. Many deaths initially recorded with only a HES-generated mortality record (DRU = HES1) are later updated with an ONS record. This might be due to delays in registering the death, such as in the case of inquests by a coroner.
When there is a matching ONS record, the DRU changes in the linked data from HES1 to ONS1, MIX1 or HES2, depending on the date of death in ONS. The death record that is lower in precedence is removed from the linked data. ONS mortality records contain a richer set of information about the death than HES records. Where HES and ONS mortality records share the same date of death, the ONS record takes precedence (DRU = ONS1).
In cases where the ONS mortality data is used in the linked data (DRU is either ONS1, ONS2 or MIX1), the match rank will always be non-zero.
In cases where HES mortality data is used (DRU is either HES1 or HES2) in the linked data, the match rank is recorded as 0 if no matching data is present in ONS (DRU = HES1) and 1 or 8 if a match can be made to the ONS data, but the dates of death disagree by more than 3 days (DRU = HES2).
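The relationship between DRU and match rank described above lends itself to a simple data quality check. The sketch below assumes the match rank is held as an integer and that 1 and 8 are the only non-zero ranks in use, as Table 1 suggests; the function name is hypothetical.

```python
def match_rank_is_consistent(dru: str, match_rank: int) -> bool:
    """Check that a match rank agrees with the DRU, following Table 1."""
    if dru == "HES1":
        # No matching ONS record, so the match rank should be 0.
        return match_rank == 0
    if dru in {"HES2", "ONS1", "ONS2", "MIX1"}:
        # An ONS record is involved, so the rank should be 1 or 8
        # (assumed here to be the only non-zero ranks).
        return match_rank in {1, 8}
    raise ValueError(f"Unknown DRU value: {dru!r}")
```

A record with DRU = ONS1 and a match rank of 0 would fail this check and may warrant investigation.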
Figure 3 below illustrates how the DRU is defined using the ONS date of death as the reference.
Figure 4: Percentage of records corresponding to each death record used in the HES-ONS linked data for April 2023.
Figure 4 illustrates the percentage of records corresponding to each DRU in the HES-ONS linked data for April 2023. This is obtained by linking the mortality data in ONS up to February 2023 with the HES data until January 2023.
From Figure 4 we can infer that 52.20% of deaths are recorded only in the ONS data and 46.16% of deaths are recorded in both HES and ONS. The remaining 1.64% of deaths are recorded only in HES or could not be linked to a death record from ONS, even though every death must be registered within 5 days of occurrence. Some of the potential reasons for deaths not appearing in the ONS data are:
- deaths referred to coroners for inquests
- provisional nature of the ONS data, which is not yet finalised
- the HES and ONS record not linking successfully due to missing or inaccurately recorded patient identifiers
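The percentages quoted from Figure 4 group the 5 DRU values into 3 sources: ONS only (ONS2), HES only (HES1) and both (ONS1, MIX1 and HES2). A minimal sketch of that grouping follows; the function name and the idea of passing a plain list of DRU values are illustrative assumptions, and no real data is used.

```python
from collections import Counter
from typing import Dict, Iterable

# Mapping from DRU value to source group, following Table 1.
SOURCE_GROUP = {
    "ONS2": "ONS only",
    "HES1": "HES only",
    "ONS1": "Both",
    "MIX1": "Both",
    "HES2": "Both",
}


def share_by_source(dru_values: Iterable[str]) -> Dict[str, float]:
    """Return the percentage of records in each source group."""
    counts = Counter(SOURCE_GROUP[dru] for dru in dru_values)
    total = sum(counts.values())
    # Percentages are rounded to 2 decimal places, as in the text.
    return {group: round(100 * n / total, 2) for group, n in counts.items()}
```

Applied to an extract of the linked data, the 3 percentages should sum to approximately 100, subject to rounding.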
While the vast majority of dates of death within the 2 data sets agree, there is a significant minority where the dates are different. There may be a number of reasons for this:
- the patient may have died late at night and the hospital were unable to record the discharge until the next day
- the patient may have died on a previous day but was not released until tests were performed on a subsequent day
- there was a data input error, meaning that the discharge date was incorrect
Flagging patients in the linked data with subsequent activity in HES
On rare occasions patients may appear to have activity in HES after the mortality record indicates that they have died. This is a data quality issue, either in the patient identifiers (causing an incorrect data linkage between HES and ONS), or due to a patient being incorrectly recorded in HES. The linked data flags these records as having activity in HES after the date of death.
These flagged records are available to users of the data. They are flagged and not removed because it is possible that the activity was incorrectly recorded in HES – for example where a patient had an outpatient appointment, but died before the appointment, resulting in the data being incorrectly sent by the patient administration system (PAS). Often such records appear in the monthly HES data but disappear after the HES annual refresh, as providers correct their submissions. In these cases, the flag will be removed once the submission is corrected.
In the case of outpatient appointments, subsequent activity is only flagged if the ATTENDED field has a value of 5 (patient seen), 6 (patient arrived late but was seen) or 7 (patient arrived late and could not be seen). Where the HES death record has a date of death between 0 and 3 days after the ONS date of death (DRU = ONS1), this is not counted as subsequent activity.
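The flagging rules in this section can be sketched as follows. This is an illustration under stated assumptions rather than the actual implementation: the dataset label `"OP"` for outpatient activity, the integer coding of ATTENDED, and the `is_hes_death_record` parameter are all hypothetical.

```python
from datetime import date
from typing import Optional

# ATTENDED values that count as activity for outpatient appointments.
ATTENDED_COUNTED = {5, 6, 7}


def is_subsequent_activity(
    death_date: date,
    activity_date: date,
    dataset: str,
    attended: Optional[int] = None,
    dru: Optional[str] = None,
    is_hes_death_record: bool = False,
) -> bool:
    """Return True if a HES activity record should be flagged as after death."""
    days_after = (activity_date - death_date).days
    # The HES death record itself, dated 0 to 3 days after the ONS date of
    # death (DRU = ONS1), is not counted as subsequent activity.
    if dru == "ONS1" and is_hes_death_record and 0 <= days_after <= 3:
        return False
    if days_after <= 0:
        return False  # activity on or before the date of death
    # Outpatient appointments count only for the listed ATTENDED values.
    if dataset == "OP" and attended not in ATTENDED_COUNTED:
        return False
    return True
```

In this sketch, an outpatient appointment booked after the date of death with any other ATTENDED value is not flagged.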
Mortality records of patients having more than 1 PERSON_ID
Hospital activity data submitted by providers of healthcare is, at times, incomplete or incorrect. This can cause the system to create new PERSON_IDs for patients who already have a PERSON_ID, resulting in multiple PERSON_IDs for some patients.
In the mortality linkage process, we deal with such PERSON_IDs by maintaining separate death records in the linked data for both new and old PERSON_IDs, but with the same mortality data. Duplicates are therefore intentionally maintained in the linked data. Retaining them means that users of the mortality data can link it to HES data extracts taken at different points in time, even if the patient's PERSON_ID has changed over time.
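The duplication described above can be pictured as copying one death record across every PERSON_ID held for the same patient. The sketch below is illustrative only: the dictionary layout, the function name and the field names other than PERSON_ID are assumptions.

```python
from typing import Dict, List


def expand_death_records(death_record: Dict[str, str], person_ids: List[str]) -> List[Dict[str, str]]:
    """Create one death record per PERSON_ID, all sharing the same mortality data."""
    # Copy the record for each identifier, overwriting only PERSON_ID.
    return [{**death_record, "PERSON_ID": pid} for pid in person_ids]


# Hypothetical example: one patient known under an old and a new PERSON_ID.
records = expand_death_records(
    {"PERSON_ID": "OLD1", "DOD": "2023-01-10", "DRU": "ONS1"},
    ["OLD1", "NEW1"],
)
```

Each resulting record carries identical mortality data, so an extract keyed on either PERSON_ID links to the same death.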
Last edited: 25 May 2023 1:00 pm