Openly sharing and preserving research data after my research project has finished
Sharing and making data open access
As a result of the national and international push towards ‘open research’, it’s the policy of the University and all major funders that you make the research data which underpins your publications open access (i.e. publicly available to download) at the end of your project, unless there is an ethical, legal or contractual reason not to do so. Not only does this greatly increase the transparency of the research process by allowing results to be re-analysed and conclusions re-tested, it also enables data to be reused by other researchers, industry, charities, governments and even the general public for creative and innovative purposes.
Do I have to share my research data?
It is the University’s policy (along with the policy of major funders) that you publicly share your research data if it underpins published research findings or has potential long-term value.
The caveat to this policy is that if there is a legal, contractual or ethical reason not to share your research data, then you do not have to do so. For example, there may be a contractual requirement from a commercial partner to keep research data confidential. Also, you do not have to share your research data if you have plans for future commercial exploitation. Similarly, you must not share your research data if it wouldn't be legal or ethical to do so, for example if it's not possible to fully anonymise your research data (i.e. remove all personally identifiable information) and sharing it would violate the GDPR.
Sharing personal data
You must NOT share personal data (i.e. unanonymised data) beyond the originally agreed project team, unless you have explicit consent from participants allowing their personal data to be shared.
Therefore, to share this type of data, you must first fully anonymise it. Once the data is anonymised, it is no longer covered by the DPA/GDPR. However, from an ethical perspective you should still gain participants’ consent for sharing it, and make it clear to them what other processing will occur, for example an anonymised report of participation being included in open research.
In the more unusual scenario where the research requires that the data is shared publicly in a format that identifies the individual, the participants must be clearly told this before they take part in the research so they can decide not to engage with the research at all. The data can only be used and shared in an identifiable format if explicit consent has been given by the participants allowing them to be identified, but even then, you should also be prepared to remove this data if the individual changes their mind.
Guidance on Sharing Qualitative Data
The following guidance is intended to provide a brief overview of the types of qualitative data and associated metadata that should be shared.
Please note: If your research involves human subjects, you need to gain their consent to both take part in your research and also to share their data, even if it is anonymised. Guidance on suggested wording is provided via the Ethics page, in particular for Participant Information Sheet guidance (Section 12) and Consent Form guidance ('Wider use of data, tissue, DNA' section).
Types of qualitative data
Examples of types of qualitative data that are potentially eligible to be archived for secondary analysis include:
- In-depth/unstructured interviews, including video
- Semi-structured interviews
- Structured interview questionnaires containing substantial open comments
- Focus groups
- Unstructured or semi-structured diaries
- Observation field notes/technical fieldwork notes
- Case study notes
- Minutes of meetings
- Press clippings
- Court transcripts
Confidentiality in qualitative data
You should remove information that would allow any of your research subjects to be identified. (In unusual cases where you have consent from subjects to share non-anonymised data, this doesn’t apply.) This process can be made less arduous by creating an anonymisation scheme prior to data collection and anonymising the data as the qualitative files are created for the analysis.
Guidance on the anonymisation of qualitative data is available at the UK Data Service.
Where the anonymisation of data would make the data useless, the act of anonymisation would require excessive effort (e.g. the blurring of a significant amount of video), or the data are considered too sensitive, restricted access control should be considered.
Documentation for qualitative data (metadata)
Any information that could provide context and clarity to a secondary user should be provided. In order for qualitative data to be used in secondary analysis (i.e. by others), it is extremely important that the data are well-documented. The UK Data Service provides guidance on study-level and data-level documentation.
You should also document any modifications that you have made to mask confidential information in the qualitative data. This information (i.e. where you have modified data for this reason) should be made available to secondary users to assist them with their use of the data.
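As an illustration of applying an anonymisation scheme and documenting the modifications it makes, here is a minimal Python sketch. The names, the placeholder labels and the `pseudonymise` helper are hypothetical and are not taken from the UK Data Service guidance. Simple text replacement of known identifiers is only a starting point and does not by itself guarantee that data are fully anonymised.

```python
import re

def pseudonymise(text, scheme):
    """Replace each identifier in `scheme` with its placeholder.

    Returns the masked text and a log counting the replacements made
    per placeholder, which can be shared as documentation.
    """
    log = {}
    # Handle longer identifiers first so "Jane Smith" is replaced before "Jane".
    for identifier in sorted(scheme, key=len, reverse=True):
        placeholder = scheme[identifier]
        pattern = re.compile(r"\b" + re.escape(identifier) + r"\b", re.IGNORECASE)
        # A function replacement inserts the placeholder literally.
        text, count = pattern.subn(lambda _match: placeholder, text)
        if count:
            log[placeholder] = log.get(placeholder, 0) + count
    return text, log

# Hypothetical scheme and transcript excerpt, for illustration only.
scheme = {
    "Jane Smith": "[Participant 1]",
    "Jane": "[Participant 1]",
    "Harbour Clinic": "[Clinic A]",
}
transcript = "Jane Smith said the Harbour Clinic was busy. Jane left early."

masked, log = pseudonymise(transcript, scheme)
print(masked)
print(log)
```

The log records which placeholders were introduced and how often, without revealing the original identifiers, so it can accompany the shared files. The scheme itself maps real names to placeholders and should be kept securely with the unshared data.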
Checklist - Annex A covers the metadata you should (as applicable) provide.
(Adapted from: ICPSR (n.d.). Retrieved from https://www.icpsr.umich.edu/web/pages/datamanagement/index.html )
Where can I share my research data?
In terms of which repository (archive) to use for long-term preservation and to make your data publicly available and open access, researchers need to follow these steps:
- First, if you are externally funded and/or have external collaborators, see if your funder(s)/collaborator(s) require the data to be deposited in a particular repository, e.g. all research funded by the ESRC needs to be deposited in the UK Data Archive.
- If not, then see if a recognised, accredited subject or disciplinary repository (international or national) exists where the data can be deposited. This is advised as it affords the research greater visibility. The Registry of Research Data Repositories (https://www.re3data.org/) provides a useful index to a large number of repositories.
- Finally, if a suitable external repository doesn’t exist, then the Portsmouth Research Portal (Pure) can be used to archive your research data and make it open access (see Adding Datasets to Pure instructions).
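The order of precedence in the steps above can be sketched as a small Python function. This is only an illustration: the function name and its parameters are hypothetical, and the decision still depends on checking your funder's and collaborators' actual requirements.

```python
from typing import Optional

PURE = "Portsmouth Research Portal (Pure)"

def choose_repository(required_repository: Optional[str],
                      subject_repository: Optional[str]) -> str:
    """Return where to deposit data, following the order of precedence above.

    required_repository: repository mandated by a funder or collaborator, if any.
    subject_repository: suitable accredited subject or disciplinary repository, if any.
    """
    if required_repository:
        return required_repository
    if subject_repository:
        return subject_repository
    # No suitable external repository exists, so fall back to Pure.
    return PURE

print(choose_repository("UK Data Archive", None))
print(choose_repository(None, None))
```

Whichever repository is chosen, a metadata record still needs to be added to Pure.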
You may also want to consider promoting your dataset by publishing in a data journal. The Australian National Data Service provides a brief overview of data journals and their purpose.
In all of the above scenarios, a metadata record (and link to external repository if relevant) also needs to be added to Pure so that the University has a record of the data produced.
If you would like help in selecting a repository, then please contact the University’s Research Data Officer - firstname.lastname@example.org.
What else must I do to share my data?
Your data should be stored in a format suitable for long-term preservation. This is to ensure that it can still be accessed over time, despite changes in software and hardware used to originally create or access it. Thus, the research data should be stored in an open standard format, and where applicable (e.g. when storing images) this format should be lossless. See the UK Data Service for further help or contact email@example.com.
You also need to provide sufficient documentation / metadata so that someone else who wishes to use your data can understand it. Please see the documentation section.
Plus, you should specify what licence (e.g. CC BY) you intend to apply to your data, as this enables its reuse while also ensuring you gain full credit for having produced the data.
How do I obtain a DOI for my data?
A digital object identifier (DOI) is a unique alphanumeric string that’s used to identify content and provide a persistent (i.e. won't change) link to where it is on the Internet. When you publish a journal article, your publisher assigns a DOI to your article. DOIs also need to be assigned to datasets to enable people to find them more easily.
When you deposit your data in any of the above types of repository, a DOI will be automatically created for you.
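To illustrate what a persistent link looks like in practice, the following Python sketch turns a DOI into a link via the https://doi.org/ resolver. The DOI shown is a made-up placeholder, and the pattern is a deliberately lenient check (a "10." prefix, a registrant number, a slash and a suffix) rather than a full validation.

```python
import re

# Lenient shape check: "10.", 4 to 9 digits, "/", then a non-empty suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

def doi_to_url(doi: str) -> str:
    """Return the resolver link for a DOI, or raise ValueError if it looks malformed."""
    doi = doi.strip()
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"Not a recognisable DOI: {doi!r}")
    return "https://doi.org/" + doi

# Hypothetical DOI, for illustration only.
print(doi_to_url("10.1234/example-dataset"))
```

Citing the resolver link rather than the repository's own page address means the reference keeps working even if the repository later reorganises its website.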
Clinical trials - registration
It has been estimated that 85% of research funding and effort is wasted because it “asks the wrong questions, is badly designed, not published or poorly reported” (Chalmers and Glasziou article in The Lancet). Research registration is one important tool that has been developed to address the problem of reporting bias. It is a legal requirement for some types of trials to be registered, but the Declaration of Helsinki gives strong methodological and ethical reasons for all other types of research involving human participants to register.
If you are an academic or researcher working at the University of Portsmouth and you are running clinical trials (including experiments and studies involving humans in medical and health care areas) then you need to do one of the following.
If the University of Portsmouth is the sponsor* of your research, then you must:
- Register the research with a relevant registry. Advice for registration of health-based research can be found on the NHS Health Research Authority website. Please be aware in your choice of registries that some do make a charge. You will need to arrange for this to be met from research funds before registering.
- Then email firstname.lastname@example.org and provide a link to your registered study so that the University can record the registration.
- Update the registry when the study is complete. If you fail to do this, your Head of Department will be informed and it will ultimately be regarded as academic misconduct.
If an NHS body (e.g. Portsmouth Hospitals Trust - PHT) is a sponsor* of your research, then you need to contact the research office of the hospital that you're working with. Please use the NHS directory to find the contact details. If another organisation (e.g. another university) is the sponsor, then you need to liaise with your collaborators to ensure they have registered the research.
*A sponsor is the organisation that has overall responsibility for or control of the study, i.e. it manages the ethical approval process, produces the annual reports, is responsible for the finance, manages the running of the study etc. If you are unsure who your sponsor is, then please contact email@example.com.
Can I share my research data with some restrictions imposed?
You can do this, but only if you have a legal, contractual or ethical reason to do so.
One example of a restriction is that the degree to which the research data are made open access may vary. For example, data could be provided on request (as opposed to being free to download), or the data could be provided subject to certain terms and conditions. Or, for example, there may be a contractual licensing requirement from a commercial partner to keep the research data confidential until after an embargo period. One way of sharing data, but restricting access, is to apply a suitable licence.
If you are sharing sensitive or confidential data externally, you will need to put a data sharing agreement in place. Please contact firstname.lastname@example.org for advice.
How can I retain my research data without sharing it?
In addition to making your research data open access wherever possible (as noted above), you may also have some other parts of your research data which you need to retain at the end of your project but can’t make open access for ethical, legal or contractual reasons - e.g. consent forms, participant contact details and unanonymised raw data kept for follow up research.
When collecting the personal or sensitive personal data, the research participants should be told whether or not it will be necessary to retain the data in an identifiable format and therefore asked for their consent to this ‘further’ processing. Furthermore, it should be made clear whether the data in its raw or non-anonymised form will be used for any further research or only research approved by an ethics committee, and whether data will subsequently be anonymised. Where applicable, paper records may be scanned and originals destroyed.
All retained (but not shared) research data needs to be stored securely to prevent unauthorised access using one of the solutions described in Data storage and security while my research project is taking place. All personal and sensitive personal data needs to be stored in accordance with the DPA/GDPR, as described in Managing personal data.
How long after the project should I store my research data for?
You need to look at the various policies that impinge on your research data and use the longest retention period. The retention applies to both the data and consent forms. The policies that will typically apply are:
- When it comes to the research data that you’ve shared publicly (see above), the overarching goal is long-term preservation. Each repository imposes its own minimum period for keeping your data.
- Your research data is also subject to the University’s Research Data Management Policy and Retention Schedule 7. Project Records, which has a minimum requirement of 10 years (see section 7.3.2.).
- Plus your data may be subject to the policy of a particular funder. For example, the Medical Research Council has a retention policy of 30 years.
Also, to complicate things further, if you are storing personal data then the GDPR may come into effect if a participant requests that you destroy their data (see below).
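The "use the longest retention period" rule can be sketched in a few lines of Python. The two periods below are the ones quoted above; the function name is hypothetical, and you would need to add the minimum period of your chosen repository and any other policies that apply to your project.

```python
def longest_retention(policy_years):
    """Return the (policy name, years) pair with the longest retention period."""
    if not policy_years:
        raise ValueError("At least one retention policy is required")
    name = max(policy_years, key=policy_years.get)
    return name, policy_years[name]

# Retention periods quoted in this guidance, in years.
policies = {
    "University Retention Schedule 7 (section 7.3.2)": 10,
    "Medical Research Council": 30,
}

policy, years = longest_retention(policies)
print(f"Retain data and consent forms for at least {years} years ({policy})")
```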
How can I link my publication to my research data?
You should include data availability statements in all of your publications. This may be an explicit statement in a dedicated section of the article, and many journals have data statement templates you can follow.
However, not all journals have data statement sections. Therefore you may have to include equivalent text elsewhere in the article text.
Data statements can be thought of as falling into four broad categories:
- No new data
- The data are in the paper
- The data are openly available and links provided
- Access to the data is restricted in some way, e.g. for legal/ethical reasons, because it is restricted third-party data, or because it is available only on request from the author(s).
Suggested wording for data statements/equivalent text can be found here.
You may need to plan how you will destroy your research data when you no longer need to retain it. See above for retention periods.
Some studies entail the destruction of any personal data following analysis for research purposes while the anonymised data is retained for at least the minimum time of the applicable retention period. Note that unless there is a justification for deletion/destruction, anonymised research data will not automatically be deleted. For example, a justification for destruction might be continuing prohibitive costs of keeping a large dataset.
With regard to the GDPR and the destruction of data, if you have used ‘consent’ as your ‘legal basis’ for collecting and processing participants’ personal data, then you do have to delete it if a participant requests you to do so. If you have already anonymised it and so cannot delete it because you cannot identify it, then you would need to explain that to the participant.
Preserving data that cannot be shared: Research Data Archive
Research data that cannot be shared can be stored in our Research Data Archive.
This is a secure archive set up on the University's network storage. Please contact email@example.com.