Part of Validation of National Child Measurement Programme data
Post deadline validations
This section describes the additional validation and data quality reporting carried out by NHS Digital once all LAs have submitted data and the collection period has ended. These are summarised below and fuller explanations follow.
Summary
- NHS Digital examines the data quality indicators that the LA has signed off as part of the submission process and queries any which do not match the required conditions.
- NHS Digital carries out additional post deadline validations on the record level data.
- The data quality indicators are published as part of the NHS Digital national report, so that users are aware of any LAs whose submitted data is of poor quality relative to their peers.
What does NHS Digital do when a data quality issue is identified?
Where issues are identified, NHS Digital contacts the LA concerned. A range of options is available at this point, including:
Allowing the LA to resubmit if there is sufficient time before publication
This option is used only by exception. Once the collection period closes, there is usually not enough time for NHS Digital to allow LAs to resubmit data and still publish the national report on time.
NHS Digital correcting data after submission
This option is rarely used, as outlined in the principles section earlier. An example from a previous year was an LA that had submitted all its pupil ethnicity data as one value and realised this was an error when queried by NHS Digital. As there was insufficient time for the LA to collate and resubmit correct ethnicity data for its pupils, a decision was taken for NHS Digital to set all the ethnicity data for that LA to “unknown”.
Flagging data as a data quality concern
This is the most commonly used option. It involves adding a data item to the dataset which flags records where there is a concern about the quality of the data. The flag takes a specific value that tells users why the record has been flagged, so they can make an informed decision on whether to include or exclude specific records.
For example, the 2013/14 data extract had a data quality flag which took the following values:
- = no data quality issue
- = records from one school in London Borough of Enfield which had a high proportion of extreme measurements (60 records)
- = records from Wakefield Council where one or more measurement warnings were not suppressed at submission (42 records)
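As a minimal sketch of how a user of the extract might act on such a flag, the Python below counts records per flag value and then excludes one flagged group. The field name `dq_flag`, the record layout and the codes 0, 1 and 2 are hypothetical placeholders chosen for illustration. They are not the codes used in the published extract.

```python
from collections import Counter

# Hypothetical flag codes, for illustration only (not the published values).
NO_ISSUE = 0
EXTREME_MEASUREMENTS_SCHOOL = 1
WARNINGS_NOT_SUPPRESSED = 2


def filter_records(records, excluded_flags):
    """Return the records whose data quality flag is not in excluded_flags."""
    return [r for r in records if r["dq_flag"] not in excluded_flags]


# Toy records standing in for rows of the extract.
records = [
    {"id": "a", "dq_flag": NO_ISSUE},
    {"id": "b", "dq_flag": EXTREME_MEASUREMENTS_SCHOOL},
    {"id": "c", "dq_flag": WARNINGS_NOT_SUPPRESSED},
    {"id": "d", "dq_flag": NO_ISSUE},
]

# Count records per flag value before deciding what to exclude.
flag_counts = Counter(r["dq_flag"] for r in records)

# Example decision: keep everything except the extreme-measurement school.
kept = filter_records(records, {EXTREME_MEASUREMENTS_SCHOOL})
```

Keeping the flag as a separate data item, rather than deleting records, leaves the inclusion decision with the analyst, which is the intent described above.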
Data quality table
Data providers also have access to a summary data quality table throughout the collection period. More details on this table are given in the following section as it is also used as part of the post deadline validation.
Additional post-deadline validations
The following additional checks are also carried out. All checks are carried out for year R and year 6 data combined unless otherwise specified.
- Changes in number measured – the number of children measured should not differ by more than 10% from the previous year. This check is carried out separately for children in reception and year 6.
- Eligible pupil numbers – any LA not providing its own headcounts for more than 90% of schools on its list and with a participation rate below 90% will be queried to ensure the system-supplied numbers from DfE are correct.
- Changes in BMI prevalence – the prevalence rate for any BMI category should not change by more than 5 percentage points from the previous collection year.
- Changes in ethnicity groups – the proportion of children in an ethnic category will be queried if it differs by more than 20 percentage points from the previous year and that category is among the top five by proportion. However, LAs will not be queried if the ethnic categories of “not stated” or “unknown” have decreased by more than 20 percentage points, as this represents an improvement in data quality rather than a potential miscoding error.
- Schools with a high proportion of extreme measurements – the proportion of children in a school with an extreme score should not exceed 10%. This check is carried out separately for height, weight and BMI for both year R and year 6 pupils (but schools with 20 or fewer pupils in that school year will not be queried). Measurements are defined as “extreme” when the measurement z-score is lower than -3 or above 4.
- Schools with a high number of extreme home postcode to school postcode distances – a school should not have 3 or more pupils whose home postcode is more than 60km from the school.
- Schools removed and added by the local authority – the number of schools a local authority removes from its school list should not exceed the number it adds by 3 or more.
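Several of the checks above are simple threshold rules, which can be sketched in Python as follows. This is an illustration under stated assumptions, not NHS Digital's implementation: the function names and input shapes are hypothetical, z-scores are assumed to be precomputed, and the handling of values exactly on a threshold is one reading of the wording. The schools check assumes that an LA is queried when removals exceed additions by 3 or more.

```python
def number_measured_query(current, previous, limit_pct=10):
    """True if the number measured differs from last year by more than limit_pct."""
    # Integer arithmetic avoids floating point rounding at the threshold.
    return abs(current - previous) * 100 > limit_pct * previous


def prevalence_query(current_pct, previous_pct, limit_points=5.0):
    """True if a BMI category's prevalence moved by more than limit_points."""
    return abs(current_pct - previous_pct) > limit_points


def is_extreme(z_score):
    """A measurement is extreme when its z-score is below -3 or above 4."""
    return z_score < -3 or z_score > 4


def school_extreme_query(z_scores, limit_pct=10, small_school=20):
    """True if more than limit_pct of a school's year group has extreme scores.

    Schools with 20 or fewer pupils in the year group are never queried.
    """
    if len(z_scores) <= small_school:
        return False
    extreme = sum(1 for z in z_scores if is_extreme(z))
    return extreme * 100 > limit_pct * len(z_scores)


def distance_query(distances_km, limit_km=60, query_at=3):
    """True if 3 or more pupils live more than limit_km from the school."""
    return sum(1 for d in distances_km if d > limit_km) >= query_at


def schools_list_query(removed, added, query_at=3):
    """True if removals exceed additions by 3 or more (assumed reading)."""
    return removed - added >= query_at


# Example: a year group of 21 pupils, 3 of whom have extreme z-scores.
example_flagged = school_extreme_query([0.0] * 18 + [4.5] * 3)
```

Each function returns `True` when the LA would be queried, so the results can be collected into a per-LA list of issues to follow up. The same `school_extreme_query` call would be repeated for height, weight and BMI, and for each year group.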
Last edited: 18 January 2022 8:38 am