Smoking, Drinking and Drug Use among Young People in England, 2021
National statistics, Accredited official statistics
Correction to sources of information on drug use data (part 10)
Following the initial publication it was discovered that around half of pupil responses to the question on 'Sources of helpful information about drug use' had been excluded from the results. This was corrected and the affected tables and commentary have been re-issued.
In Part 10: Young people and drugs: the context, the affected outputs were tables 10.19, 10.20 and 10.21, and the associated chart and commentary in the section on 'Sources of helpful information about drug use'. Although some of the quoted figures changed by 0 to 3 percentage points, there was no effect on the ranking of the most common sources.
4 November 2022
Appendix B: Technical Notes
B1 - Limitations of the statistics
This publication is based on survey data. It is therefore subject to potential limitations inherent in all surveys, including:
- Sampling error: This occurs when the sample of schools and pupils selected to take part in the survey is not representative of all schools and pupils. It is mitigated by randomly selecting schools, and pupils within schools. Sampling error will also vary depending on the level of disaggregation at which results are presented.
- Non-response error: This occurs when those pupils who take part in the survey are not representative of all pupils. It results in systematic bias due to non-response by schools and pupils selected to take part. In an attempt to correct for differential non-response, estimates are weighted using population totals (more details on weighting are available in appendix A8).
- Survey coverage: SDD covers pupils in all mainstream state maintained and independent schools in England. Some small groups of schools, such as pupil referral units (PRUs), are not included, and pupils in PRUs may be more likely to partake in risky behaviours than other pupils. In addition, pupils who were playing truant, excluded, or absent for other reasons are not generally included in the survey (although “mop-up” visits were made to a small number of schools to capture pupils who were not there on the day).
- Sample size: Although SDD has a relatively large sample size, detailed analysis of some subgroups of pupils may require several years of data to be combined.
- Under-reporting of risky behaviours: Pupils may not wish to admit to some of the behaviours asked about in the survey for fear of how this will be perceived by the interviewer or their teachers, despite the questionnaire containing no personally identifiable information such as their name. They may also feel bad about undertaking such behaviours and so not want to admit to them. However, previous analysis has shown SDD to provide the most accurate measures of risky behaviours (see section A8 of Health and Wellbeing of 15-year-olds in England - Main findings from the What About YOUth? Survey 2014)
Since the results compared in this report are from surveys in the SDD series conducted in a similar way and using the same methods of collecting information, other types of error should be similar in each survey and so will not affect comparisons. However, the social desirability of these behaviours may also affect whether pupils over-report or under-report, and as social desirability may change over time, this may affect comparability.
B2 - Confidence intervals, significance testing and sampling error
Estimates in this report are subject to uncertainty due to sampling error as they were obtained from a relatively small sample of all pupils in secondary school (i.e. the “eligible population”). Any sample is only one of an almost infinite number that might have been selected which would all produce slightly different estimates. Sampling error stems from the probability that any selected sample is not completely representative of the population from which it is drawn.
It is possible to calculate the level of uncertainty around a survey estimate by exploring how that estimate would change if many survey samples were taken for the same time period instead of just one. A range of uncertainty can be placed around the survey estimate which is called the Confidence Interval. Confidence intervals are typically set up so that users of the data can be 95% sure that the true value lies within the range – in which case this range is referred to as a “95% confidence interval”. Confidence intervals can be used as a guide to the size of sampling error and a wider confidence interval indicates a greater uncertainty around the estimate. Generally, a smaller sample size will lead to estimates that have a wider confidence interval than estimates from larger sample sizes. This is because a smaller sample is less likely than a larger sample to reflect the characteristics of the total population and therefore there will be more uncertainty around the estimates derived from the sample.
Statistical significance indicates whether an estimated change or difference is likely to have arisen only from variation in the survey’s sampling. It is most often used when talking about a change over time (e.g. comparison with the previous survey) or a difference between groups (e.g. between boys and girls). A statistically significant change or difference is one that is unlikely to be due to sampling alone, and is therefore likely to reflect a real change or difference in the population. Plotting estimates and their confidence intervals (a measure of the uncertainty of an estimate) gives an indication of whether or not a difference is significant. In general, if the confidence intervals of two estimates do not overlap, the estimates are significantly different. If they do overlap, however, they may still be statistically significantly different, so a significance test is required. The following formula produces a statistic which can be compared with the standard normal distribution to see if the difference is statistically different from zero. If it falls below the 2.5th percentile or above the 97.5th percentile (i.e. outside ±1.96), the difference is statistically significant:
\(\frac{p_1 - p_2}{\sqrt{s_1^2 + s_2^2}}\)

where:
- \(p_1\) = prevalence estimate for the first category
- \(s_1\) = standard error for the first category
- \(p_2\) = prevalence estimate for the second category
- \(s_2\) = standard error for the second category
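As an illustration, this test can be carried out in a few lines of Python (the prevalence figures and standard errors below are hypothetical, not survey estimates):

```python
from math import sqrt

def significance_test(p1, s1, p2, s2, z_critical=1.96):
    """Compare two prevalence estimates using the formula above.

    Returns the test statistic and whether the difference is
    statistically significant at the 95% confidence level.
    """
    z = (p1 - p2) / sqrt(s1 ** 2 + s2 ** 2)
    return z, abs(z) > z_critical

# Hypothetical example: 18% vs 12% prevalence, each with a
# standard error of 1.5 percentage points.
z, significant = significance_test(0.18, 0.015, 0.12, 0.015)
# z is about 2.83, outside the +/-1.96 range, so the difference
# would be reported as statistically significant.
```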
In general, attention is drawn to differences between estimates in this report only when they are significant at the 95% confidence level. This indicates that there is less than a 5% probability that the observed difference is due to sampling variation rather than a real difference in the population. The Excel confidence interval data tables give true standard errors and 95% confidence intervals for the sample design for a range of key outputs, broken down by age, gender, region and ethnicity. Standard errors and design effects (defts) were calculated in RStudio, using a Taylor series expansion method.
The deft is a measure of the efficiency of the sample design, with a value greater than 1 indicating statistical inefficiency in the sample. The deft can be interpreted as the relationship between the achieved sample size and the number of pupils that would be needed from a simple random sample to achieve the same level of precision.
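The relationship between the deft and precision can be sketched as follows (the standard errors and sample size are hypothetical illustrations):

```python
def design_effect_factor(se_design, se_srs):
    """Deft: ratio of the standard error under the actual sample design
    to the standard error from a simple random sample of the same size."""
    return se_design / se_srs

def effective_sample_size(n_achieved, deft):
    """Size of the simple random sample that would give the same
    precision as the achieved sample under the complex design."""
    return n_achieved / deft ** 2

# Hypothetical figures: a deft of 1.25 means 10,000 sampled pupils give
# the precision of a 6,400-pupil simple random sample.
deft = design_effect_factor(0.020, 0.016)
n_effective = effective_sample_size(10_000, deft)
```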
B3 - Logistic regression analysis and odds ratios
B3.1 Running the model
Logistic regression modelling has been used in this report to examine the factors that might be associated with selected outcome variables after adjusting for other factors. Models were constructed for outcomes of interest: current smoking, drinking alcohol in the last week and taking drugs in the last month.
The models included a variety of variables (factors) relating to pupil characteristics (e.g. age, sex, region, smoking, drinking, drug use, family deprivation). Although each model used comparable variables as far as possible, they also included variables specific to particular outcomes. For example, the current smoking model included families’ attitudes to pupils’ smoking and recall of lessons on smoking but not recall of lessons on drinking or drugs misuse.
For each model, the variables are grouped into a number of discrete categories which includes missing values. Sample code to create a model is shown below.
The placeholders 'var1', 'var2' and 'var3' represent where the variables would be entered (e.g. age, region and sex), and 'indicator' represents the dependent variable being modelled (e.g. a binary variable reflecting whether the pupil is a current cigarette smoker).
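As an illustration only, a model of this form could be fitted in Python with the statsmodels package; the dataset below is a hypothetical stand-in for the survey data, using the same placeholder names:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500

# Hypothetical stand-in data: in the survey, each placeholder would be a
# grouped categorical variable, with missing values as their own category.
pupils = pd.DataFrame({
    "var1": rng.choice(["11", "12", "13", "14", "15"], n),  # e.g. age
    "var2": rng.choice(["North", "Midlands", "South"], n),  # e.g. region
    "var3": rng.choice(["Boy", "Girl"], n),                 # e.g. sex
})
# Synthetic binary outcome loosely related to var1, so the model fits.
pupils["indicator"] = (rng.random(n) < pupils["var1"].astype(int) / 30).astype(int)

# C() marks each placeholder as categorical, so every category gets its
# own coefficient relative to a reference category.
model = smf.logit("indicator ~ C(var1) + C(var2) + C(var3)", data=pupils).fit(disp=0)
print(model.summary())
```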
The full modelling process was developed using a mixture of R and Python code and the complete codebase can be found on GitHub.
The final models were developed using an iterative process, comparing a number of different variables and testing for significant associations. Variables were rejected from each model if the association was not statistically significant at the 5% level (p ≥ 0.05).
The results of the regression analysis, which includes the significant variables, are presented in data tables 1.10, 5.26 and 8.10. The results include odds ratios for the final models, together with the probability that each association is statistically significant.
Details of the variables included in the different models can also be found in the GitHub codebase.
B3.2 Interpreting the odds ratios
The models show the relative odds of the outcome of interest (e.g. current smoking) for each category of the explanatory variable (e.g. being a boy or a girl).
For categorical variables, odds are expressed relative to a reference category, which is assigned a value of 1. Odds ratios greater than 1 indicate higher odds (increased likelihood), and odds ratios less than 1 indicate lower odds (reduced likelihood). 95% confidence intervals for the odds ratios are shown. Where the interval does not include 1, the category differs significantly from the reference category.
For continuous variables, there is a single p-value. Continuous variables do not have a reference category; the odds ratio represents the change in odds associated with each additional point in the range (for example each extra year of age, or unit of alcohol drunk). Again, the 95% confidence interval is shown, and the odds ratio is significant if the interval does not include 1.
B3.3 Interpreting the C-statistic
The c-statistic can be used to assess the goodness of fit, with values ranging from 0.5 to 1.0. A value of 0.5 indicates that the model is no better than chance at predicting membership in a group. A value of 1.0 indicates that the model perfectly identifies those within a group and those not.
Models are typically considered reasonable when the c-statistic is higher than 0.7 and strong when the c-statistic exceeds 0.8 (Hosmer and Lemeshow, 2000).
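The c-statistic can be computed directly by counting concordant pairs (as described in B3.4), with ties counting as half. A short Python sketch with made-up predicted probabilities:

```python
from itertools import product

def c_statistic(predicted, actual):
    """Concordance of predicted probabilities: over all pairs of one
    positive and one negative record, the fraction where the positive
    record receives the higher probability (ties count as half)."""
    positives = [p for p, a in zip(predicted, actual) if a == 1]
    negatives = [p for p, a in zip(predicted, actual) if a == 0]
    concordant = 0.0
    for pos, neg in product(positives, negatives):
        if pos > neg:
            concordant += 1
        elif pos == neg:
            concordant += 0.5
    return concordant / (len(positives) * len(negatives))

# Made-up predicted probabilities for five records:
probs = [0.9, 0.8, 0.4, 0.3, 0.2]
actual = [1, 0, 1, 0, 0]
# 5 of the 6 positive-negative pairs are concordant: c is about 0.833
```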
B3.4 Estimates for contribution of each variable to the model
The complexity of the interactions between variables in a model makes it difficult to untangle their relative contributions. Nonetheless, an estimate of a variable’s contribution can be made using the model comparison technique.
For example, consider a simple model containing three variables: age band, ethnicity and sex and a dataset that has 1,000,000 possible pairs where one record has the outcome (1) and the other does not (0). When run, the model assigns a higher probability to (1) than for (0) for 700,000 pairs (correct guesses), giving a concordance (or c-statistic) of 0.700.
Real models are also likely to have ties, which are counted as half a correct guess. To perform a model comparison, the logistic regression is re-run with one (and only one) of the predictor variables removed each time. The resultant c-statistics indicate how much the removed variable contributed to the final model.
In this example, when the age band is removed, the number of incorrect guesses increases from 300,000 to 450,000, a difference of 150,000. Therefore, the inclusion of the age band reduces the proportion of incorrect guesses by 33.3 per cent (150,000 / 450,000): this is the estimated contribution. Using the same methodology, ethnicity and sex have estimated contributions of 6.3 and 0.3 per cent, respectively. It can therefore be deduced that age band makes the largest contribution to the model’s predictive power.
This is illustrated in the table below. Please note that estimated contributions cannot be added together.
Table 1: Example of estimating contribution to the logistic regression model

| Model | c-statistic | Pairs (000s) | Correct guesses (000s) | Incorrect guesses (000s) | Reduction in incorrect guesses when included (000s) | Reduction (%) |
|---|---|---|---|---|---|---|
| All variables | 0.700 | 1,000 | 700 | 300 | | |
| Variable excluded: | | | | | | |
| Age band | 0.550 | 1,000 | 550 | 450 | 150 | 33.3% |
| Ethnicity | 0.680 | 1,000 | 680 | 320 | 20 | 6.3% |
| Sex | 0.699 | 1,000 | 699 | 301 | 1 | 0.3% |
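The contribution estimates in Table 1 can be reproduced with a short calculation (figures in thousands of pairs, taken from the table):

```python
def estimated_contribution(incorrect_full, incorrect_without):
    """Proportion by which including a variable reduces the number of
    incorrect guesses, relative to the model fitted without it."""
    return (incorrect_without - incorrect_full) / incorrect_without

# Figures (in thousands of pairs) from Table 1:
contribution_age = estimated_contribution(300, 450)        # 150/450, about 33.3%
contribution_ethnicity = estimated_contribution(300, 320)  # 20/320, about 6.3%
contribution_sex = estimated_contribution(300, 301)        # 1/301, about 0.3%
```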
B4 - Extending the fieldwork into the spring term
The fieldwork in surveys prior to 2016 was conducted during the autumn school term. It has always been felt important to complete the fieldwork by the end of the autumn term because, anecdotally, it is believed that some pupils have their first experience of risky behaviours (particularly drinking alcohol) during the Christmas period, either with their peers or at family gatherings.
However in all surveys since 2016, it was decided to continue the fieldwork until the end of January to boost the number of schools and therefore pupils who took part in the survey.
There was a concern that extending the fieldwork in this way could inflate the estimates of drinking prevalence, with some possible impact on smoking and drug misuse as well. It was felt, therefore, that while it might be possible to include the post-Christmas sample in some of the report tables, such as attitudinal responses or sources of alcohol, the impact on the drinking prevalence estimates could be great enough that pupils surveyed after Christmas should be excluded from some of the prevalence tables. The advantage of including them in the analysis was that they would greatly increase the sample size and therefore reduce the confidence intervals around the survey estimates.
Following the 2016 survey, there were two pieces of work carried out to test the impact of including the post-Christmas sample in the analysis, details of which can be found in Appendix B4 from the 2016 report.
This concluded that the impact of extending the fieldwork by one month was not sufficiently significant to justify excluding pupils surveyed after Christmas from the main prevalence estimates.
B5 - Measuring alcohol consumption
B5.1 Conversion to units
Pupils who had drunk in the last seven days were asked how much they had drunk in that period. Their answers were used to calculate their consumption in units (one unit of alcohol is equivalent to 10ml by volume of pure alcohol). These questions about alcohol consumption have been asked in a consistent way since 1990, with minor changes in 2002. The questionnaire specified six types of drink; for each type, pupils were asked whether they had drunk any in the last seven days and, if so, how much. The types of drink covered in the questionnaire (with the quantities asked about for each) were:
- Beer, lager and cider: pints, half pints, large cans, small cans, bottles
- Shandy: pints, half pints, large cans, small cans
- Wine, martini, sherry: glasses*
- Spirits and liqueurs (e.g. whisky, vodka, gin, tequila, Baileys, Tia Maria): glasses
- Alcopops (e.g. Bacardi Breezer, Reef, Smirnoff Ice, Vodka Kick, WKD): small cans, bottles.
- Other alcoholic drinks.
*Before 2014, wine was asked about separately from martini and sherry. The two categories were combined in recognition that there is increasing convergence in the alcoholic content of the drinks within these categories.
Pupils who had drunk beer, lager or cider were asked if they usually drank normal strength or strong beer (alcohol volume of 6% or more). For the 2016 survey an additional category was added where the pupil could say they didn’t know the strength and the implication of this change is discussed later in this section.
Attempting to accurately measure alcohol consumption among 11 to 15 year olds presents similar but not identical challenges to surveys of adults. For both adults and children, recall of their drinking can be erroneous; a generally acknowledged problem for all surveys measuring alcohol consumption. Also, the majority of pupils’ drinking is in informal settings, and the quantities they drink are not necessarily standard measures. In addition, the survey method limits the amount of detail that can be recorded about the alcoholic strength and quantities drunk, so that, to convert actual drinks into units of alcohol consumed, it is necessary to make consistent assumptions about the strength and size of each type of drink.
Since the established unit measurement was introduced in 1990 there have been significant changes in the alcohol content of drinks and the variability in glass size. As a result, the 2006 General Household Survey and the Health Survey for England changed the method by which adult alcohol consumption is converted into units of alcohol. The 2007 report in this survey series revised the method of calculating units in line with these surveys of adults and reported ‘original’ and ‘revised’ units of alcohol. This resulted in a higher, more accurate estimate of alcohol consumption among pupils, and reflected a likely gradual change in drinking behaviour since the 1990s. From 2008, consumption has been shown only in ‘revised’ units and so direct comparisons between consumption of alcohol from 2008 onwards and trend data based on the original units from 2006 and before are not possible.
The conversion factors used in this report are shown in the table below.
Table 2: Conversion factors used to estimate consumption in units of alcohol

| Type of drink | Measure | Units of alcohol |
|---|---|---|
| Beer, lager or cider | Pint | 2 |
| | Half pint | 1 |
| | Large can | 2 |
| | Small can or bottle | 1.5 |
| | Less than half a pint | 0.5 |
| Shandy | Pint | 1 |
| | Half pint | 0.5 |
| | Large can | 0 |
| | Small can or bottle | 0 |
| | Less than half a pint | 0.25 |
| Wine, martini, sherry | Glass | 2 |
| | Less than a glass | 0.5 |
| Spirits and liqueurs | Glass | 1 |
| | Less than a glass | 0.5 |
| Alcopops | Can or bottle | 1.5 |
| | Less than a bottle | 0.75 |
Where pupils have indicated that they normally drink strong rather than normal strength beer, lager or cider, the number of units has been multiplied by 1.5. Where they indicated they did not know the strength then the number of units has been multiplied by 1.25.
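Putting Table 2 and the strength multipliers together, the calculation for beer, lager or cider can be sketched as follows (a hypothetical helper for illustration, not the survey's production code):

```python
# Units per measure for beer, lager or cider (from Table 2) and the
# strength multipliers described above.
BEER_UNITS = {
    "pint": 2,
    "half pint": 1,
    "large can": 2,
    "small can or bottle": 1.5,
    "less than half a pint": 0.5,
}
STRENGTH_MULTIPLIER = {"normal": 1.0, "strong": 1.5, "don't know": 1.25}

def beer_units(counts, strength="normal"):
    """counts maps a measure to the number drunk in the last seven days."""
    base = sum(BEER_UNITS[measure] * n for measure, n in counts.items())
    return base * STRENGTH_MULTIPLIER[strength]

# Two pints and one small can, strength unknown:
# (2 * 2 + 1.5) * 1.25 = 6.875 units
units = beer_units({"pint": 2, "small can or bottle": 1}, strength="don't know")
```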
B5.2 Question on alcohol strength
As mentioned previously, the methodology for the number of alcohol units consumed allows for an adjustment to be made if the pupil normally drinks strong beer, cider or lager. The adjustment is to multiply the units of alcohol consumed via beer, cider and lager by 1.5 if the pupil indicates they normally drink stronger variants. Prior to 2016, the only options were “normal” and “strong”, but during cognitive testing of the survey questions some pupils reported they were often unsure about the strength of the alcohol they were drinking, so they were either not answering the question or guessing. An additional option of “don’t know” was therefore added from 2016 onwards, although this presented the challenge of which multiplier should be used for pupils who chose it. The following scenarios were considered:
- Use a multiplier of 1 - i.e. treat “don’t knows” as if they answered “normal”, the same treatment applied to those who did not answer the equivalent question in previous years.
- Use a multiplier of 1.5 - i.e. treat the “don’t knows” as if they answered “strong”
- Use a multiplier of 1.25 - i.e. an average of “strong” and “normal”.
For the 2016 survey, the impact of the different scenarios on the mean number of alcohol units drunk, by age and sex of the pupil, was examined. The differences were very small, and it was therefore decided to use a multiplier of 1.25, the average of the strong and normal strength multipliers.
B6 - Measuring wellbeing
From 2018 the survey moved to measuring wellbeing using questions recommended by ONS. These questions represent a harmonised standard for measuring personal wellbeing and are used in many surveys across the UK.
Pupils were asked to rank their feelings from 0 to 10 in relation to the following questions:
- Overall, how satisfied are you with life nowadays?
- Overall, to what extent do you feel that the things you do in life are worthwhile?
- Overall, how happy did you feel yesterday?
- Overall, how anxious did you feel yesterday?
The responses were then allocated to one of 4 categories per wellbeing question as shown in the table below.
Table 3: Personal well-being thresholds

| Life satisfaction, life worthwhile and happiness scores: response on an 11 point scale | Label | Anxiety scores: response on an 11 point scale | Label |
|---|---|---|---|
| 0 to 4 | Low | 0 to 1 | Very low |
| 5 to 6 | Medium | 2 to 3 | Low |
| 7 to 8 | High | 4 to 5 | Medium |
| 9 to 10 | Very high | 6 to 10 | High |

Source: Office for National Statistics
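The banding in Table 3 can be expressed as a small helper function (a hypothetical sketch, not the survey's production code):

```python
def wellbeing_label(score, anxiety=False):
    """Map a 0 to 10 response to its band from Table 3. Anxiety is
    banded in the opposite direction to the other three questions."""
    if anxiety:
        bands = [(1, "Very low"), (3, "Low"), (5, "Medium"), (10, "High")]
    else:
        bands = [(4, "Low"), (6, "Medium"), (8, "High"), (10, "Very high")]
    for upper, label in bands:
        if score <= upper:
            return label

# For example, a happiness score of 9 is "Very high", while an
# anxiety score of 9 is "High".
```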
B7 - Measuring family affluence
Family affluence is measured using the methodology of the HBSC (Health Behaviour in School-aged Children) study, a research collaboration with World Health Organisation offices across Europe and North America: http://www.hbsc.org/
It uses the following questions to produce an overall family affluence ranking of low, medium or high:
- Do you have your own bedroom for yourself?
- Does your family have a dishwasher at home?
- How many times did you and your family travel outside of the UK for a holiday last year?
- How many computers (including laptops and tablets, not including game consoles and smartphones) does your family own?
- How many cars, vans or trucks does your family own?
- How many bathrooms (room with a bath/shower or both) are there in your home?
Last edited: 10 October 2024 11:40 am