NHS Maternity Statistics - data quality guidance
This guidance is to support both providers of maternity data and users of the annual NHS Maternity Statistics series, in understanding and explaining the quality of the data and identifying how to make improvements to it.
This guidance covers four main sections:
1. Data Sources
2. Tools available
3. Interpreting Data Quality issues
4. Solving Data Quality issues
Part 1 - Data Sources
Overview
The annual NHS Maternity Statistics publication uses data from two data sets: the Hospital Episode Statistics (HES) data warehouse and the Maternity Service Data Set (MSDS). The data in these are submitted by providers and processed for analysis by NHS England. Both data sets are secondary uses data sets which re-use clinical and operational data for purposes other than direct patient care.
The MSDS is a national-level data set made up of submissions from NHS-funded maternity services. It provides information to help with monitoring outcomes, commissioning, and addressing health inequalities. It defines the data items, definitions and associated value sets extracted or derived from local information systems. The MSDS also contributes to the monthly statistical publication; however, some measures in the monthly publication are not included in the annual publication, such as the Clinical Quality Improvement Metrics (CQIMs) and Continuity of Carer (CofC) measures.
An updated version of this data set, MSDS v2.0, was implemented in April 2019 to meet requirements arising from the National Maternity Review. This brought numerous changes to the contents and structure of the data set, so data from April 2019 onwards is not directly comparable to data from previous years.
The HES database contains details of all admissions, outpatient appointments and A&E attendances at NHS hospitals in England. HES A&E was retired in March 2020 and superseded by a new Emergency Care dataset, ECDS, which became the official source of A&E data from April 2020. Each HES record contains a wide range of information about an individual patient admitted to an NHS hospital, including clinical, patient, administrative and geographical information. The HES data relating to pregnancies and births are referred to as delivery and birth episodes, where an episode is defined as a period in which a patient receives care from one consultant at one provider. Only completed episodes are included in HES.
Datasets and additional resources
Measures
Measures in the annual report come from both MSDS and HES. Some of these measures are available from both datasets and some are unique to one specific data set. HES data is used for the national totals, as it has historically been more complete.
Measures common to both data sets include the total count of deliveries, the ethnicity of the mother, and the babies’ birthweight. There are also some measures which may appear similar but have subtle differences, for example, miscarriage and ectopic pregnancies are recorded in HES but only miscarriage is recorded in MSDS and so the figures may not be directly comparable.
A number of statistics are sourced only from MSDS and are not captured in HES. These include:
- Apgar Score (health check for new-borns)
- mother’s smoking status
- skin to skin status (physical contact between the mother and the new-born)
- first feed information (whether the baby was breastfed or not)
Further information on the data captured from MSDS can be found in the Technical Output Specification (TOS) and user guidance found on the tools and guidance webpage, and the MSDS Metadata file accompanying the 2023-24 annual publication.
Other measures in the annual publication, sourced from HES but not from MSDS, include:
- the status of the person conducting the delivery (midwife, doctor, etc.)
- which anaesthetic, if any, was used during delivery
- the length of antenatal and postnatal hospital stays
- the number of babies delivered at the end of a pregnancy (single baby, twins, etc.)
- the number of miscarriages and ectopic pregnancies
Further information on the data captured by HES can be found in the Technical Output Specification (HES TOS), and the HES Metadata file accompanying the 2023-24 annual publication.
Issues with data quality
For NHS England to provide the best possible analysis for all users of our data, we require high quality data to work with. An accurate picture of maternity services allows clinicians, commissioners, and others to act in the most informed manner possible, ultimately leading to improved outcomes for patients. It also allows for better transparency around the work carried out by the NHS.
There are several known data quality issues with MSDS. For example, the number of providers submitting valid data for each data table and data item can vary. As a relatively recently revised national level data set this is somewhat expected, however the issue of non-response from providers has in turn impacted on the geographical coverage expected of the data set, leading to less reliable figures at levels higher than individual provider level.
Known HES DQ issues are documented on the HES processing cycle and data quality page.
Detailed MSDS Data Quality Statements can be found included with each annual NHS Maternity Statistics publication, alongside a CSV file which presents an analysis of the data quality of the submissions from MSDS from maternity service providers within the reporting period. This is named the MSDS Data Quality file and can be found in the Resources section of each publication.
If you do encounter problems with your MSDS data which you cannot resolve using the resources outlined in this guidance, you can contact [email protected].
Broader information on data quality across the NHS is presented in our data quality page.
Improving your Trust’s data quality should enable you to have a more comprehensive understanding of how your service is operating and your outcomes. Participating trusts are also assessed on their MSDS data quality as part of the Maternity Incentive Scheme organised by NHS Resolution.
If your Trust is within scope of Maternity Services and does not currently submit to the Maternity Services Dataset, find out how you can register to submit data as your first action towards meeting Maternity Incentive Scheme requirements.
Improving your trust’s submitted data should also reduce the differences seen between related HES and MSDS published figures, and so enable the results of analysis of both datasets to be used together to gain a deeper understanding of your service and patients, and the impacts of the actions you take.
Some MSDS measures represented in the NHS Maternity Statistics annual publication series are directly or closely comparable to the measures in the Maternity Services Monthly Statistics publication series, and its accompanying national Maternity Services Dashboard.
Annual dimension | Annual measure | Monthly measure |
---|---|---|
ComplexSocialFactorsInd | All | Complex social factor |
SmokingAtBooking | Smoker | CQIMSmokingBooking: Women who were current smokers at booking |
BabyFirstFeedBreastMilkStatus | CQIMSmokingBooking: Women who were current smokers at booking | CQIMBreastFeeding: Babies with a first feed of breast milk |
ApgarScore5TermGroup7 | 0 to 6 | CQIMApgar: Babies with an APGAR score between 0 and 6 (rate per 1000) |
PreviousCaesareanSectionsGroup | Zero previous births | CQIMRobsonGroup1: women in RG1 having a caesarean section with no previous births (Percent) |
PreviousCaesareanSectionsGroup | Zero previous births | CQIMRobsonGroup2: women in RG2 having a caesarean section with no previous births (Percent) |
PreviousCaesareanSectionsGroup | At least one Caesarean | CQIMRobsonGroup5: women in RG5 having a caesarean section with at least one previous birth (Percent) |
Part 2 - Tools available
This section is intended to provide you with an overview of the tools and resources available to help identify data quality issues, for both providers of maternity data and users of the annual NHS Maternity Statistics publication.
As part of addressing data quality concerns, it is always important to ensure clear, robust communication is in place between providers of clinical services and those submitting the data. For example, the process of making MSDS data submissions should involve all those providing care to the women and babies, those entering the data, and those making the submission.
Data included in the NHS England statistical publications should also be cross-referenced against local data to ensure the HES and MSDS records are providing an accurate picture of the service provided.
Maternity Services Data Set (MSDS)
Data for the maternity services data set (MSDS) is supplied to NHS England via the cloud-based strategic data collection services (SDCS Cloud).
There is dedicated guidance available on the SDCS submission process, and on submitting data for the MSDS.
Data submitted via the SDCS goes through a series of automatic validation checks. Some submissions are rejected at submission if they do not fulfil the submission criteria, as specified in the MSDS Technical Output Specification (TOS). Upon submission, providers also receive a validation report detailing errors and record rejections to flag these issues to the provider. Further information on the data quality checks that are part of the SDCS can be found in the SDCS Data Quality guidance. Additionally, there is a tool designed specifically to help providers of data for MSDS to better understand the validation reports they receive, the MSDS Data Quality Submission Summary Tool.
Sharing information with maternity services about the rejections and warnings received at the point of data submission can be valuable in ensuring accurate data corrections are made, and to support a greater understanding of common data quality themes which could be addressed through collaboration and clearer data entry guidance.
Finally, there is an MSDS validation report released as part of the annual NHS Maternity Statistics publication which groups submitted data into categories (Valid, Default, Invalid, and Missing) based on whether the data conforms to the validation requirements. This data is provider-level and can be found in the Resources section of the annual publication.
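As a rough illustration of how values might be grouped into the four categories named above, the following Python sketch classifies the submitted values for a single data item. The permitted codes and the default code are hypothetical; the real value sets are defined in the MSDS Technical Output Specification, and the published report may apply additional rules.

```python
from collections import Counter
from typing import Iterable, Optional

# Hypothetical value set and default code for one illustrative data item.
# The real permitted values are defined in the MSDS TOS.
VALID_CODES = {"01", "02", "03"}
DEFAULT_CODES = {"99"}

CATEGORIES = ("Valid", "Default", "Invalid", "Missing")


def categorise(value: Optional[str]) -> str:
    """Assign one submitted value to a reporting category."""
    if value is None or value.strip() == "":
        return "Missing"
    value = value.strip()
    if value in DEFAULT_CODES:
        return "Default"
    if value in VALID_CODES:
        return "Valid"
    return "Invalid"


def summarise(values: Iterable[Optional[str]]) -> dict:
    """Count how many submitted values fall into each category."""
    counts = Counter(categorise(v) for v in values)
    return {category: counts.get(category, 0) for category in CATEGORIES}


if __name__ == "__main__":
    print(summarise(["01", "99", "2", None, "", "03"]))
```

A high count of Default or Invalid values for an item is usually a prompt to check how that item is recorded locally, rather than a conclusion in itself.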
Interactive dashboards
The Data Quality Dashboard for the MSDS presents information about the quality of data submitted each month and includes provider-level data quality information to help users understand the impact of local issues and to support data quality improvements.
The SNOMED data quality dashboard allows users to explore the SNOMED data recorded in the Maternity Services Data Set (MSDS). This dashboard has been designed to help providers improve their usage of SNOMED and thereby improve the submission of SNOMED to MSDS.
There is also an annual NHS Maternity Dashboard attached to the annual publication, to help providers identify data quality issues across both HES and MSDS data sets. This includes pages to highlight when a provider’s MSDS total deliveries count is significantly below its HES equivalent, and pages which enable users to compare the number of records submitted to HES and MSDS for each measure present in both data sets and to see what proportion of the data submitted for a measure is missing meaningful values.
Hospital Episode Statistics (HES)
Data for the hospital episode statistics (HES) data set is submitted by providers directly to the Secondary Uses Service (SUS) within NHS England, from the information recorded for clinical purposes by hospitals and other healthcare providers. From here, the HES data set is produced by monthly extraction of data from SUS. Learn more about the collection process.
Upon receipt of data for the HES data set, a number of automated cleaning rules are applied to the data. Data quality reports and checks are completed at various stages in the cleaning and processing cycle. More information on data quality as it relates to HES, including a monthly publication of known data quality issues, is available.
Additional resources for clinicians on the correct use of clinical codes and the importance of good quality data are provided.
Monthly provisional and annual reports produced from HES can be found under the HES Publications section.
Other resources
There is also reporting on the quality of data across NHS data sets in the form of the Data Quality Maturity Index (DQMI) monthly publication, which provides data submitters with timely and transparent information on the state of their data quality. Further information about the DQMI publication, both current and historical, and the methodologies used in its construction are available.
Part 3 - Interpreting data quality issues
This section is intended to help data providers understand their own data quality issues, and to help guide readers of the annual NHS Maternity Statistics publication to resources that can be used to interpret the quality of the data contained within the publication and therefore better understand the possibilities and limitations of the data.
It is also intended to support data submitters in identifying and understanding the different data quality issues that can arise within the maternity data. As part of local data quality improvement work, it is essential that data submitters liaise directly with their local maternity services to rectify issues with maternity data input and to ensure the dataset records provide an accurate picture of the service provided and the care women and babies receive.
Understanding data quality: MSDS data quality feedback
In order to identify problems with the quality of data, feedback is provided at the point the data is submitted to the SDCS Cloud portal and further data quality checks are run within NHS England as the data is processed and cleaned ready for incorporation into the monthly and annual publications:
- Providers receive immediate feedback on the quality of their submission in a file containing validation reports. This file includes record-level reports of any submission errors, intended to give the data providers detailed information about which records caused which errors. Providers should then be able to address the specific errors highlighted and resubmit a more accurate data return. Data files can be submitted as many times as necessary during the submission window for each publication month. This is approximately a two-month window. Find out the MSDS publication dates for each month.
- Providing maternity services with sight of the data submission rejections and warnings helps ensure accurate corrections are made, and that any data quality themes that emerge can be addressed more comprehensively in how care is recorded.
- A variety of data quality checks are then run on the processed data as part of the validation and load process for monthly data, prior to production of the Maternity Services Monthly Statistics publication. These validated and cleaned monthly datasets are also used for the annual NHS Maternity Statistics publication. Where there are notable concerns about data quality, we contact providers directly so that any issues with local data extraction processes can be addressed for future submissions.
Understanding data quality: What MSDS data is included?
The Maternity Services Dataset (MSDS) is structured by linking data recorded in many different tables to connect all the information related to a specific pregnancy. This linkage is typically, but not exclusively, via a unique pregnancy identifier. The MSDS receives the populated data tables via monthly submissions from providers, as described in previous sections of this guidance.
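To make the idea of linking tables through a pregnancy identifier concrete, here is a minimal Python sketch. The table names, field names and rows are all hypothetical and heavily simplified; the actual MSDS tables and identifiers are defined in the Technical Output Specification. The sketch groups rows from two tables under a shared identifier and lists deliveries that have no matching booking record.

```python
from collections import defaultdict

# Hypothetical, simplified rows. Real MSDS tables and field names differ.
bookings = [
    {"pregnancy_id": "P1", "booking_date": "2023-05-02"},
    {"pregnancy_id": "P2", "booking_date": "2023-06-11"},
]
deliveries = [
    {"pregnancy_id": "P1", "delivery_date": "2023-12-01"},
    {"pregnancy_id": "P3", "delivery_date": "2023-12-09"},
]


def link_by_pregnancy(booking_rows, delivery_rows):
    """Group rows from both tables under their shared pregnancy identifier."""
    linked = defaultdict(lambda: {"booking": None, "deliveries": []})
    for row in booking_rows:
        linked[row["pregnancy_id"]]["booking"] = row
    for row in delivery_rows:
        linked[row["pregnancy_id"]]["deliveries"].append(row)
    return dict(linked)


def unlinked_deliveries(linked):
    """Return identifiers that have delivery rows but no booking row."""
    return sorted(
        pid for pid, record in linked.items()
        if record["booking"] is None and record["deliveries"]
    )


if __name__ == "__main__":
    linked = link_by_pregnancy(bookings, deliveries)
    print(unlinked_deliveries(linked))
```

Records that cannot be linked in this way are one reason why a table can be accepted at submission yet still contribute little to later measures.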
A detailed explanation of which MSDS records are included in the annual publication can be found on the 'Records Included' worksheet in the MSDS Metadata file, which is published as part of the annual publication and can be found in the list of resources.
Understanding data quality: Which MSDS issues are most important?
For MSDS data providers, data quality issues within the submission can be prioritised based on:
- whether it is a warning or a validation failure
- whether the data item is mandatory or required
- and at which level the validation issue has occurred.
Each of these criteria will be outlined below.
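The three criteria above can be combined into a simple ordering of issues. The Python sketch below is one possible way to do this and is not an official prioritisation rule: the rank values, the level names and the `Issue` structure are assumptions made for illustration only.

```python
from dataclasses import dataclass

# Assumed orderings for illustration only: a lower number is reviewed first.
SEVERITY_RANK = {"failure": 0, "warning": 1}
ITEM_RANK = {"mandatory": 0, "required": 1}
LEVEL_RANK = {"file": 0, "group": 1, "record": 2}


@dataclass(frozen=True)
class Issue:
    """A hypothetical, simplified entry from a validation report."""
    message: str
    severity: str   # "failure" or "warning"
    item_type: str  # "mandatory" or "required"
    level: str      # "file", "group" or "record"


def prioritise(issues):
    """Sort issues by severity, then item type, then validation level."""
    return sorted(
        issues,
        key=lambda i: (
            SEVERITY_RANK[i.severity],
            ITEM_RANK[i.item_type],
            LEVEL_RANK[i.level],
        ),
    )


if __name__ == "__main__":
    report = [
        Issue("Value not in expected list", "warning", "required", "record"),
        Issue("Invalid format", "failure", "mandatory", "record"),
        Issue("Group rejected", "failure", "mandatory", "group"),
    ]
    for issue in prioritise(report):
        print(issue.message)
```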
Data that has successfully been submitted can also then be reviewed in the later statistical publication outputs such as the NHS Maternity Statistics publication, to check for instances of useable but incomplete data such as that shown in the MSDS measures counts of records with a “Missing value / Value outside reporting parameters”.
Understanding data quality: Common mistakes and things to look out for
Some general pointers for data providers to consider are listed below. It is also always important for data submitters to liaise with their local maternity services as part of rectifying issues with data input:
- Group-level rejections are one of the most common data quality issues, highlighted in DQ reports.
- Invalid format (record-level rejections): in many cases the root cause may be the way the data was imported into the submitted Access database, causing the ‘leading zero’ to be dropped (for example ‘2’ submitted rather than ‘02’).
- An issue relating to an invalid date format could be due to the way data was imported into the submitted Access database whereby the date format is changed automatically during import.
- Many warnings and rejections can be caused by incorrect values being submitted for organisation-related data items. This could be due to an issue at the point of data collection. The legacy Bureau Service Portal (BSP) may have historically allowed expired codes, or in certain cases, it may have accepted certain data without issuing warnings or rejections, meaning that some issues may not have been identified.
- Errors can be caused by the incorrect usage of SNOMED CT codes. Guidance on the use of SNOMED CT can be found in the SNOMED CT Mapping guidance tool.
- In certain cases, where a data item value is populated, a corresponding value is required in another data item field.
- If a large amount of data is submitted outside of the required date range, then numerous rejection messages will be generated back to the provider. This may hinder the provider’s ability to identify 'real' rejection messages that require corrections to be made to “included” data. Users are advised to check the date validation rules prior to submission to identify and submit data that is relevant to the reporting period only.
- When making a submission it is good practice for providers to access the pre-deadline extract, as this will show exactly what records have been accepted for each table.
- Tables and records once accepted for submission can still contain omissions which affect the quality of later measures of the service and care provided. The statistical publications and data quality validation tools, including interactive dashboards, mentioned in this guidance can assist with identifying where these gaps are and what the impact of them has been. They can then help data submitters identify what changes are needed to correct the issues for future submissions.
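Several of the pointers above (dropped leading zeros, altered date formats, and records outside the reporting period) lend themselves to local checks before a file is submitted. The Python sketch below shows one way such checks might look. It assumes two-character codes and dates held as `YYYY-MM-DD` text; both are assumptions made for illustration, and the formats actually required are those in the MSDS Technical Output Specification.

```python
from datetime import date, datetime
from typing import Optional


def pad_code(value: str, width: int = 2) -> str:
    """Restore leading zeros lost on import, for example '2' becomes '02'."""
    value = value.strip()
    return value.zfill(width) if value.isdigit() else value


def parse_date(text: str) -> Optional[date]:
    """Return a date for 'YYYY-MM-DD' text, or None if it does not parse."""
    try:
        return datetime.strptime(text.strip(), "%Y-%m-%d").date()
    except ValueError:
        return None


def in_reporting_period(text: str, start: date, end: date) -> bool:
    """True only if the text is a parseable date inside the period."""
    parsed = parse_date(text)
    return parsed is not None and start <= parsed <= end


if __name__ == "__main__":
    period_start, period_end = date(2023, 5, 1), date(2023, 5, 31)
    print(pad_code("2"))
    print(parse_date("01/05/2023"))
    print(in_reporting_period("2023-05-15", period_start, period_end))
```

Filtering out-of-period records locally in this way should reduce the number of rejection messages returned, making the ones that need correction easier to spot.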
Understanding data quality: Annual publication figures
As part of the monthly and annual publications using maternity data, a CSV 'Data Quality' file is provided which contains information on the data quality of the MSDS submissions from maternity service providers. This file provides a count of the valid records within a field, as well as the missing and invalid records and the data items which have been left as the default value. It also breaks down the data by item, provider and region, and can be used to assess the limitations of the data used in the annual publication.
This can be used in conjunction with a review of local reporting to identify and understand where data completion and quality issues have arisen.
- An interactive visualisation of this data can be found in the annual NHS Maternity Interactive Dashboard
- Specific types of data entry problem can also be investigated in the Data Quality Dashboard for the MSDS.
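For users who prefer to work with the Data Quality CSV file directly, the following Python sketch shows how the counts might be turned into a percentage of valid records per provider and data item. The column names, provider codes and figures are invented for illustration and will not match the published file, so check its header row and adjust before use.

```python
import csv
import io

# Invented sample with hypothetical column names, provider codes and counts.
SAMPLE = """Provider,DataItem,Valid,Default,Invalid,Missing
RXX,ItemA,90,5,3,2
RYY,ItemA,40,30,10,20
"""

COUNT_COLUMNS = ("Valid", "Default", "Invalid", "Missing")


def valid_percentages(csv_text: str) -> dict:
    """Return the percentage of valid records for each provider and item."""
    result = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts = {name: int(row[name]) for name in COUNT_COLUMNS}
        total = sum(counts.values())
        # Avoid dividing by zero where nothing was submitted for the item.
        percent = round(100 * counts["Valid"] / total, 1) if total else None
        result[(row["Provider"], row["DataItem"])] = percent
    return result


if __name__ == "__main__":
    for key, percent in valid_percentages(SAMPLE).items():
        print(key, percent)
```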
Comparisons of the data taken from HES and MSDS can also be broadly used to assess the completeness of the MSDS submissions, as HES is generally considered to be more complete and can thus act as a benchmark against which to measure the completeness of MSDS. However, this greater completion may not apply within every provider and region and relying on it for data quality assessments can risk masking separate data quality issues within HES data. There are also rare provider-level cases where the MSDS figures may in fact be more complete than HES. These comparisons can therefore be a useful starting point for understanding the quality of maternity data, but differences once identified should be investigated thoroughly to understand their underlying causes. Additionally, measure-level comparisons of HES and MSDS data can cause issues when similar sounding measures are in fact constructed differently, meaning that they will inherently lead to different resulting measurements. These MSDS and HES comparisons are available in the annual NHS Maternity Statistics Interactive Dashboard, including for specific measures where relevant. It is important to refer to the metadata information for both datasets when making such comparisons.
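The benchmark comparison described above can be sketched in Python as a simple ratio of MSDS to HES delivery counts per provider. The 0.9 threshold and the provider labels are arbitrary assumptions for illustration and are not the rule used in the dashboard. As the paragraph above notes, a low ratio is a starting point for investigation and not proof of an MSDS problem.

```python
from typing import Optional


def completeness_ratio(msds_count: int, hes_count: int) -> Optional[float]:
    """MSDS deliveries as a share of HES deliveries, or None if HES is zero."""
    if hes_count == 0:
        return None
    return msds_count / hes_count


def flag_providers(counts: dict, threshold: float = 0.9) -> dict:
    """Return providers whose MSDS count falls below the assumed threshold.

    `counts` maps a provider label to a (msds_count, hes_count) pair.
    """
    flagged = {}
    for provider, (msds, hes) in counts.items():
        ratio = completeness_ratio(msds, hes)
        if ratio is not None and ratio < threshold:
            flagged[provider] = round(ratio, 2)
    return flagged


if __name__ == "__main__":
    # Invented figures for four hypothetical providers.
    example = {"A": (450, 500), "B": (300, 500), "C": (510, 500), "D": (10, 0)}
    print(flag_providers(example))
```

Provider C in the example has more MSDS than HES deliveries, which mirrors the rare cases mentioned above where MSDS may be the more complete source.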
Detailed statements on the quality of data from both HES and MSDS can be found in the data quality statements within the written report, which are provided separately for MSDS data, and for HES data.
As stated within the MSDS data quality statement, users of the maternity data must make their own assessment of the quality of the data for a particular purpose, drawing on these resources. In addition, local knowledge or other comparative data sources may be required to distinguish changes in data volumes between reporting periods which reflect changes in actual service delivery, from those that are an artefact of changes in the underlying data quality.
Part 4 - Solving data quality issues
This section is intended to inform providers of maternity data, especially for the maternity service data set (MSDS), on how to resolve data quality issues and monitor the outcomes of such resolutions.
Alongside the practical steps outlined below, we would always encourage data submitters to include maternity service providers in these discussions and investigations and ensure they compare submitted and published data to locally held data. This should support a clearer understanding of how front-line care is translated into data inputting, and how the resulting data submissions are translated into publication outputs.
What can be done?
Last edited: 12 December 2024 1:04 pm