Data classification decision tool for research

My research data includes… Data classification
Personal health information (PHI).[1] Level 4
Data subject to export controls or the Controlled Goods Program.[2] Level 4
Other sensitive research areas and data, including but not limited to [3]:

  • Confirmed dual-use potential (military, intelligence or dual military/civilian applications).
  • Biological agent and toxin biosecurity, including security sensitive biological agents (SSBA).
  • National security/strategic implications.
Level 4
Personal data classified as “extra sensitive” or similar under the General Data Protection Regulation (GDPR) or equivalent privacy legislation. Level 4
Identifiable human subjects’ data:

  • Directly identifiable information.
  • De-identified data that can be re-identified or linked using publicly available data.
  • Collections/constellations of variables or indirectly identifiable information that, when merged, become sensitive.
Level 3
Administrative records or data used for research purposes:

  • Student records.
  • Employee records.
  • General-purpose emails and business records.
Level 3
Data classified as confidential[4] or sensitive by partners (data use agreements), funding agencies, research ethics boards, legislation, regulations or the researcher.

The following is a non-exhaustive list of considerations that can help gauge the degree of risk or potential for harm present within one’s research data:

  • Vulnerability of the individual or community from which the data originates.
  • Social and cultural norms, wherein disclosure of controversial or stigmatized behaviour would be concerning or harmful to the individual’s wellbeing.
  • Local laws and geopolitical situations, wherein disclosure of information would be concerning or harmful to the individual’s wellbeing.
  • Likelihood that nation-state, criminal or other malicious groups or individuals might want to steal, halt, destroy or alter research data.
  • The financial, reputational/social, psychological, behavioural, legal, and/or physical risk, impact or harm that an unauthorized disclosure might cause to the data subject, community or researcher.
  • The volume of data stored, wherein the scale of information which could be affected by a possible unauthorized disclosure requires additional security controls to limit risk.
  • The data subjects’ ability to provide consent to the use of their data for research purposes.
Level 3 or Level 4
Non-identifiable human subjects’ data (non-PHI):

  • De-identified information (e.g., anonymized and/or coded information).
    • Note: The code or data keys for purposes of re-linkage are classified at the same level as the original, uncoded data.
  • Anonymous information where no identifiers were collected.
Level 2
Most active and/or unpublished research and intellectual property, by default (unless otherwise classified). Level 2
Published research data under embargo by publisher or other body. Level 2
Published research not subject to embargo or beyond embargo period. Level 1
Publicly available data and datasets. Level 1
Unpublished research and intellectual property (not otherwise classified), which the Principal Investigator (PI) wishes to be made generally accessible. Level 1
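The table above can be read as a lookup from data category to classification level. The following is a minimal Python sketch of that lookup, not an official implementation. The category keys and the `classify` function are hypothetical names introduced for illustration. Taking the highest level when a dataset falls into several categories is an assumption; the table itself does not state how to combine rows. Data classified as confidential or sensitive by partners is represented as a range (3 to 4), because the final level depends on the considerations listed for that row and needs human judgment.

```python
# Levels are copied from the table above; the dictionary keys are
# hypothetical labels invented for this sketch.
# Each value is (lowest possible level, highest possible level).
CATEGORY_LEVELS = {
    "personal_health_information": (4, 4),
    "export_controlled_or_controlled_goods": (4, 4),
    "other_sensitive_research": (4, 4),
    "gdpr_extra_sensitive_personal_data": (4, 4),
    "identifiable_human_subjects": (3, 3),
    "administrative_records": (3, 3),
    "confidential_or_sensitive_by_partner": (3, 4),  # needs manual review
    "non_identifiable_human_subjects": (2, 2),
    "active_or_unpublished_research": (2, 2),
    "published_under_embargo": (2, 2),
    "published_not_embargoed": (1, 1),
    "publicly_available": (1, 1),
    "unpublished_pi_wants_accessible": (1, 1),
}


def classify(categories):
    """Return (lowest, highest) possible level for a dataset.

    Assumption (not stated in the table): when several categories
    apply, the most restrictive one determines the level.
    """
    if not categories:
        raise ValueError("at least one category is required")
    ranges = [CATEGORY_LEVELS[c] for c in categories]
    low = max(r[0] for r in ranges)
    high = max(r[1] for r in ranges)
    return low, high
```

For example, `classify(["administrative_records", "personal_health_information"])` returns `(4, 4)`, while `classify(["confidential_or_sensitive_by_partner"])` returns `(3, 4)`, signalling that the researcher still has to decide between Level 3 and Level 4.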

Footnotes