Data sharing standard 3 - Data Minimisation

This standard is part of a series of guidance documents to support the various stages of a Data Access Request Service (DARS) application.

 

Standard description

The General Data Protection Regulation (GDPR) states that:

Personal data shall be adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed (data minimisation)

This means that the amount of data requested must be justified by the purpose stated within the application. In assessing this, the following questions will be considered. Additional justification is required where identifiable or sensitive data, or data for patients aged under 16, is requested.

Datasets

All datasets should be relevant to the purpose.

  • Is it possible to reduce the number of datasets requested? If not, why not?
  • Can the purpose be achieved in a less intrusive way? For example, can the purpose be achieved using anonymised or pseudonymised data?

Years

The number of years must be justified within the application.

  • Is it possible to reduce the number of years requested? If not, why not?

Filtering

Explain why the number of patients whose data is requested is justified by the purpose.

  • Can the data be narrowed by geography? If not, why not?
  • Can the data be narrowed by demographics (such as age)? If not, why not?
  • Can the data be narrowed by clinical factors (such as diagnosis/procedure)? If not, why not?
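
The three filtering questions above can be pictured as successive conditions applied to a record set. The following is a minimal sketch only: the `Episode` class, its field names and the `narrow` function are hypothetical illustrations, not real HES field names or a DARS tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    # Hypothetical field names, chosen for illustration only.
    patient_id: str
    region: str
    age: int
    diagnosis: str


def narrow(episodes, regions, min_age, max_age, diagnosis_prefixes):
    """Keep only episodes that pass the geographic, demographic and clinical filters."""
    prefixes = tuple(diagnosis_prefixes)
    return [
        e for e in episodes
        if e.region in regions                    # geography
        and min_age <= e.age <= max_age           # demographics
        and e.diagnosis.startswith(prefixes)      # clinical factor
    ]
```

Each filter that can be applied reduces the number of patients in the extract, so any filter left out should be explained in the application.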

Episodes

  • Are all the patients' episodes required to achieve the purpose? If so, why?
  • Are all elective episodes required to achieve the purpose? If so, why?
  • Are maternity episodes required to achieve the purpose? If yes, are the unborn child and neonatal records necessary? If so, why?
  • Is a timeframe around the index event (such as a procedure or diagnosis) required? If so, why?
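
A timeframe around an index event can be expressed as a date window. This sketch assumes episodes are represented by their dates alone and that the window is inclusive at both ends; the function name and parameters are hypothetical.

```python
from datetime import date, timedelta


def within_window(episode_dates, index_date, days_before, days_after):
    """Return the episode dates that fall inside the window around the index event."""
    start = index_date - timedelta(days=days_before)
    end = index_date + timedelta(days=days_after)
    # Inclusive bounds are an assumption made for this sketch.
    return [d for d in episode_dates if start <= d <= end]
```

Episodes outside the window are dropped, so the application only needs to justify the chosen number of days before and after the index event.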

Fields

  • For the records requested, are all fields necessary to achieve the purpose? If not, why not?
  • If identifiable or sensitive fields have been chosen, is it possible to reduce the risk of intrusion? For example, by supplying a flag for 30-day mortality rather than the full date of death, by supplying survival days (which gives an exact age in days) in place of date of birth (DOB) and date of death (DOD), or by replacing specific diagnosis codes with categories.
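
As an illustration of reducing intrusion at field level, the sketch below derives a 30-day mortality flag and a coarser diagnosis category so that the date of death and the full diagnosis code need not be supplied. The function and field names are hypothetical, "within 30 days" is assumed to mean 0 to 30 days inclusive after the index date, and truncating a code to its first three characters is only one possible way to form a category.

```python
from datetime import date


def reduce_fields(index_date, date_of_death, diagnosis_code, window_days=30):
    """Replace identifying fields with less intrusive derived fields."""
    if date_of_death is None:
        died = False
    else:
        # Assumption: death counts if it occurs 0..window_days days after the index date.
        died = 0 <= (date_of_death - index_date).days <= window_days
    return {
        "died_within_window": died,                # flag instead of full date of death
        "diagnosis_category": diagnosis_code[:3],  # category instead of specific code
    }
```

The returned record contains no dates, so the recipient can analyse short-term mortality without holding the date of death itself.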

Cohorts/linkages

  • For a data linkage, can additional filters be applied? For example, where Hospital Episode Statistics (HES) data is linked to mental health data, only the HES records that have an associated mental health record may be required.
  • Can a HES cohort be created to minimise the data being provided? For example, if a customer is interested in all episodes for patients with a specific diagnosis/procedure, we can find the HES IDs for these people and provide all episodes for these people (meaning HES is filtered by condition, and Office for National Statistics mortality records are only provided for patients who appear in the HES extract).
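
The cohort approach described above can be sketched in a few lines. This is an illustration under stated assumptions, not the production method: records are modelled as plain dictionaries, and the keys `hes_id` and `diagnosis` and the function name are hypothetical.

```python
def build_cohort_extract(hes_episodes, mortality_records, diagnosis_code):
    """Restrict both datasets to patients who have the diagnosis of interest."""
    # Step 1: find the IDs of patients with at least one matching episode.
    cohort_ids = {e["hes_id"] for e in hes_episodes if e["diagnosis"] == diagnosis_code}
    # Step 2: keep all episodes for those patients, not only the matching ones.
    episodes = [e for e in hes_episodes if e["hes_id"] in cohort_ids]
    # Step 3: supply mortality records only for patients in the HES extract.
    deaths = [m for m in mortality_records if m["hes_id"] in cohort_ids]
    return episodes, deaths
```

Patients outside the cohort contribute no rows to either output, which is the minimisation the cohort is intended to achieve.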

Some of the above considerations will require discussion between the relevant production team and the customer. Others can be answered straight away by the customer.

Last edited: 12 May 2025 12:11 pm