Research data: life cycle and archiving


 

 

(Ingram, 2016)

In recent years, there’s been a global drive towards open science, the public sharing of research data and good data management practice. This page explains what this means in practice for academic staff/researchers/doctoral students at Portsmouth. (This information is also useful for undergraduates and masters students, but always consult your tutor/supervisor in the first instance.) If you have any RDM queries please contact researchdata@port.ac.uk.

There is an important distinction between the two stages of research data, live and archive, which is worth keeping in mind when reading this page.

Page contents list:

Introduction to Research Data Management (RDM) Support at UoP

What is research data?
What is RDM? (a.k.a “What’s in it for me?”)
When do I need to think about RDM?
Why are open science and data sharing important?
What support is available and who can help me?
What are the University's and funders' policies on research data?

 

Data Management Plans (DMPs)

What is a DMP?
Do I have to write a DMP?
How do I write a DMP?
How do I work out the cost of a project?

 

Reusing Existing Data

Can I use someone else’s data?
Where can I find existing data to reuse?

 

Data Creation and Organisation

What do I need to consider when creating and organising my data?
What documentation do I need to produce?

 

Managing personal data

What is personal (and sensitive personal) data?
How do I store personal (and sensitive personal) data?

 

Data storage and security WHILE my research project is taking place

Digital storage
USB sticks, external hard drives and similar
Transferring personal data via email
Paper records


Openly sharing and preservation of research data AFTER my research project has finished

Sharing and making research data open access
Do I have to share my research data?
Sharing personal data
Where can I share my research data?
What else must I do to share my data?
How do I obtain a DOI for my data?
Can I share my research data with some restrictions imposed?
How can I retain my research data without sharing it?
How long after the project should I store my research data for?
How can I link my publication to my research data?
Destruction

 

Intellectual Property Rights, Copyright and Licensing

What do I need to know about Intellectual Property Rights? (a.k.a. who ‘owns’ research data?)
What do I need to know about Copyright?
What do I need to know about Licensing?

 

Ethics

What do I need to consider?


Useful links

Summary of key links.

 

References

 


If you’ve spotted something that’s missing from these pages or need more information, then please contact the University’s Research Data Officer, Dr Gary Pike, researchdata@port.ac.uk.



Introduction to Research Data Management (RDM) Support at UoP

What is research data?

In a nutshell, research data is information (in any format) that you collect during the course of your research and from which you draw conclusions or make inferences. It can range from raw measurement data collected in a spreadsheet, to transcribed interview text, to qualitative observational data ... the list is endless!

The University’s Research Data Management Policy (page 4) provides a more detailed definition. You may also find the definitions adopted in RCUK's Concordat on Open Research Data (page 3) useful.


What is RDM? (a.k.a “What’s in it for me?”)

Your research data is extremely valuable, both to you as an individual and to the research community and wider society as a whole. So it is important to take good care of it.

Research Data Management (RDM) is about efficiently organising data in your project. It is concerned with the creation, processing and preservation of your data during the lifetime of your project and beyond. It’s an umbrella term for a group of actions and processes, some of which you probably already engage with, for example:

Using data formats that can be easily accessed and shared.

Controlling file versions.

Organising your files and storing them securely.

Backing up your data to prevent loss.

Controlling access to data, particularly personal data.

Documenting processes, analytical methods and issues with the data (i.e. metadata) to allow your work to be verified by others (...or to remind yourself what you actually did!).

Depositing/archiving data at the end of the project to enable it to be shared and made open access wherever possible.



When do I need to think about RDM?

(Ingram, 2016)

You might think of RDM in three phases during the lifecycle of your project:

  1. At the start of your project, while planning and mapping out how you intend to create, process and preserve your data.
  2. During the project, where you manage your data on a day-to-day basis, documenting processes and revising your plan if necessary.
  3. At the end of the project, to ensure that the data and metadata to be archived (and made open access whenever possible) are fit for purpose.


Why are open science and data sharing important?

With the increasing international momentum towards open science, there is growing emphasis on openly sharing data and publications. Sharing research data enables the widest benefits to be gained from the data - for other researchers, businesses and society at large. Given that the majority of research is funded with public money, it’s only reasonable to share the data produced.


Sharing research data also has direct benefits to you:

For research integrity and transparency - it allows you to verify the processes and conclusions drawn by other researchers (and likewise allows other researchers to verify your research).

It makes your outputs more visible and potentially improves citation rates (see Piwowar & Vision, 2013).

It enables collaboration by facilitating the reuse and sharing of data, and avoiding the duplication of work where possible.

 

...and good RDM practice in general also benefits you:

It reduces the risk of inadvertent data loss / unintended disclosure.

It helps continuity within the project should team members leave / new members join.

 

These significant benefits have been recognised by national and international organisations. Therefore it’s increasingly a policy requirement and expectation of major funders, institutions and publishers that (wherever possible) research data is shared publicly and made open access at the end of a project. Effective data management is necessary to make that happen.

 

Sources / further information:

The section below on how to share your research data gives practical advice.

Source: adapted from Ingram (2016). JISC ‘How and why you should manage your research data: a guide for researchers’.


Concordat on Open Research Data
http://www.rcuk.ac.uk/documents/documents/concordatonopenresearchdata-pdf/


Open University
http://www.open.ac.uk/library-research-support/research-data-management


 

What support is available?

Although RDM is a complex area, there's a lot of support available to help you at every stage in your project.

The University’s Research Data Officer 

 

The Research Data Officer, Dr Gary Pike, is a part of the Research Outputs Team based in the University Library. He is available to offer RDM advice to academic staff at any stage of a project. Please feel free to contact researchdata@port.ac.uk.

Gary also works with the Research and Innovation Services (RIS) Grants Team, where he reviews the data management plan (DMP) component of funding applications that are being processed by the Peer Review College (PRC). The PRC provides support for funding applications to the RCUK funders, Royal Society & British Academy (>£15k), and for fellowships. Any applications that meet these criteria will go through the PRC process. Please note that applications via the PRC will take priority; however, please feel free to ask for advice on other funding applications or projects.

Further information and guidance about the PRC can be found on the RIS Guide page.

Academics are welcome to seek advice at any stage of the application process, but you are strongly encouraged to seek advice at the initial drafting stage of the project: the earlier the better. There may also be specialist advice available in your department, faculty or externally in your discipline.

 

Workshops and Presentations

As part of RIS’s Research and Innovation Development Programme (RISDP), data management related workshops (along with other grant related topics) are run on a regular basis throughout the academic year. You can check dates and book a place via the Research Support page. Where there is demand, information sessions can be arranged for groups within faculties and/or departments.

For doctoral students please refer to the Graduate School Development Programme.

DMPonline and the DCC

The internationally recognised Digital Curation Centre (DCC) is a government funded organisation aimed at assisting researchers with all things RDM related. It includes the free DMPonline tool, which provides data management plan (DMP - see below) templates and guidance for major UK funders, as well as generic guidance for all other projects.

The DCC also provide examples of DMPs for most major funders. We strongly recommend that you use the DMPonline tool.

 

Other external resources

There are significant data management resources available externally, notably the Digital Curation Centre, as already discussed. Other useful resources include:

The University of Bristol provides an introduction to research data through its Interactive Bootcamp Tutorial.

The University of Edinburgh provides Research Data Management training through its free online MANTRA program.

The UK Data Service offer in-depth RDM advice.


What are the University's and funders' policies on research data?

The University’s Research Data Management Policy is available with additional clarifying guidelines. The guidance notes refer to much of the content on this page.

If your research is externally funded (e.g. by RCUK), it is extremely likely that your funder will also have a research data policy which you must adhere to. Please check your funder’s website for more details. For example, please see RCUK’s Research Data Policy.



Data Management Plans (DMPs) 

What is a DMP?

A DMP is a document that you write which lays out how you intend to manage your research data. It will normally explain how you will organise the creation, processing and preservation of your data during the lifetime of your project and beyond. During the early stages of a project it should be considered a live document. Depending on the funding body, DMPs can range in size (and complexity) from a statement of around 200 words to a 2- or 3-page document (e.g. AHRC, ESRC).

In the UoP's Research Data Management Policy the University uses the following definition:

“A ‘Data Management Plan’ (DMP) typically states what research data are likely to be created as a result of certain research activities and outlines the plans for sharing, dissemination, storage, preservation, eventual (possible) destruction of such research data. A DMP might take the form of a stand-alone document or be part of a research funding application, or research design protocol (for ethical review).”

 

Just to make things interesting, please note that different funding bodies use different names for DMPs so you may come across the following terms, or similar: Data Management Plan, Access and Data Management Plans, Data Access Plans, Data Management and Sharing Plans, Statement on Data Sharing, Technical Plans, etc.

Examples of DMPs for some major funders can be found on the DCC website.

For more in-depth information on data management plans please review the DCC website, and/or feel free to ask for advice via researchdata@port.ac.uk.


Do I have to write a DMP?

In short, yes - however, it can be a bit more complicated than that!

Nearly all major funders require a DMP at the application stage (see the overview of funders’ data policies below). DMPs are reviewed by funders as part of the application process. A DMP lacking sufficient detail may jeopardise a funding application. In instances where no data is created, a covering statement explaining why a DMP is not required will normally be needed.

Above: Overview of funders’ data policies (DCC, 2018b).

If your research is not funded, you are still required to ensure that adequate plans for the management of your data are in place. Your data should be managed in accordance with the UoP’s Research Data Management Policy.


How do I write a DMP?

It’s strongly recommended that you use the DMPonline tool. This free tool is provided by the DCC and offers structured templates for writing DMPs, which include embedded guidance taken directly from the major funders. The DCC is an organisation established to provide expert data curation advice for the UK HE and research community.

The DMPonline tool has custom templates for each major funder (see the screenshot below). Each funder has different requirements for their DMPs, so it is important to select the correct template and also review the relevant funder’s research data policy (or equivalent) in detail.

If you are not applying to a major funder, the DMPonline tool also includes a generic template and advice section relevant to all other projects.

 

Example template. Note the guidance options on the right (DCC, 2018a). (You'll need to register...)

For the social scientists amongst you, the Consortium of European Social Sciences Data Archives (CESSDA) also provides a useful list of questions to answer (or at least consider) for your DMP: CESSDA - Adapt your Data Management Plan.

Examples of DMPs for some major funders can be found on the DCC website. If you need advice at any stage please contact researchdata@port.ac.uk.


How do I work out the cost of a project?

Many funders allow you to apply for the cost of data storage in your funding application. If the opportunity is there, then you need to take advantage of it. Please contact researchdata@port.ac.uk in the first instance for advice.



Reusing Existing Data 

Can I use someone else’s data?

The short answer is yes!

One of the ideals of open data is that data is as freely and openly accessible as possible to encourage the reuse of data. Data can be expensive and time consuming to collect, so where possible the reuse of data is encouraged to efficiently use available resources and promote collaboration. The reuse of data is viewed favourably by funders. In fact, the ESRC actually requires the researcher to justify the collection of new data as part of their application for funding. Therefore, before collecting research data you should consider if any suitable data already exists which you could reuse.

(One caveat: if you're a doctoral student, it may be a requirement of your study programme to collect your own data. Please consult your supervisor.)

If you’re planning on using someone else’s data there are a number of things you should be aware of. You need to be aware of the scope and limitations of the data: e.g. when was it collected, whether it can be used to answer your research question, etc. Does the associated metadata (information describing their data) give you sufficient information for you to make that judgement? You will also need to be aware of the licence under which the data you wish to use has been released. This is to ensure that you can use and share the data as you intend.

The UK Data Service provide a guide on the reuse of data, including examples of how data have been reused.


Where can I find existing data to reuse?

Data repositories
To find existing datasets to reuse there are a variety of international, national, subject area and institutional repositories available to search.

(re3data.org, 2018)

re3data.org (above; from the drop-down menu select Browse/Browse by subject, then click on your area of interest on the wheel) is a good place to start a search, as it provides a global registry of data repositories from different academic disciplines.

There are many subject-specific repositories to explore, e.g. the UK Data Service houses social, economic and population data from the UK. Plus there are general purpose data repositories for a wide variety of data types, for example Dryad, Figshare and Zenodo.

If you require help finding a suitable repository please contact researchdata@port.ac.uk. Don’t forget that your colleagues may be able to suggest suitable options to you as well.

 

Data journals
Another avenue for finding data (and publishing data, for that matter) is data journals. These are publications that treat datasets as you would research articles. Examples include Scientific Data (Nature), Data in Brief (Elsevier) and the Data Science Journal (CODATA).

 

Research publications themselves
Finally, it’s worth highlighting that you may also find datasets within the relevant publications themselves. There’s an increasing drive from publishers for authors to include information about how to access the underlying research data. So when exploring related literature (e.g. journal articles) you may be able to see how to access and use the data in the published literature itself.



Data Creation and Organisation

What do I need to consider when creating and organising my data?

You’re almost certainly already familiar with the basics of creating and organising data in your research area; however, some aspects may be new to you. As a refresher, the UK Data Service provide guidance on all aspects, including: File Formats and Software; Recommended Formats; Organising Data (e.g. file structure); Quality Assurance; Version Control and Authenticity; Transcription; and Digitising Data.

You may also want to review metadata standards for your subject area. The DCC has links on its Disciplinary Metadata page.


What documentation do I need to produce?

You need to produce documentation that includes all the information required to re-run the entire experiment/process from scratch. It doesn’t need to be long, but it should include settings, conditions, significant decisions made, etc., to enable the data to be fully understood. This will enable your conclusions to be independently verified and your data to be reused, thus making the research process transparent. When sharing your research data at the end of your project you’ll also need to share your documentation.

The UK Data Service’s ‘Document your data’ page provides useful advice on this topic. If you need advice at any stage please contact researchdata@port.ac.uk.
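As a rough illustration of the kind of documentation described above, the sketch below records settings and significant decisions in a small machine-readable file that could sit alongside a dataset. The field names and example values are hypothetical; they are not a University or UK Data Service template, so adapt them to the metadata standard used in your discipline.

```python
import json
from datetime import date


def build_data_record(title, collected_on, instrument_settings, decisions):
    """Return a dictionary describing how a dataset was produced."""
    return {
        "title": title,
        "collected_on": collected_on.isoformat(),
        "instrument_settings": instrument_settings,
        "significant_decisions": decisions,
    }


# Hypothetical example values, for illustration only.
record = build_data_record(
    title="Example survey, wave 1",
    collected_on=date(2018, 3, 1),
    instrument_settings={"sampling_rate_hz": 50},
    decisions=["Excluded responses with more than 20% missing answers"],
)

# Serialise to text that could be saved next to the data files.
readme_text = json.dumps(record, indent=2)
print(readme_text)
```

A plain-text README covering the same points would serve equally well; the aim is simply that someone else could understand and re-run the work.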



Managing personal data

This section is only relevant if you are collecting or using personal data, e.g. from studies with human participants. If you are collecting personal data, you will also need to seek ethical approval for your research and include a summary of how you will manage personal data in your DMP.


What is personal (and sensitive personal) data?

If you are collecting personal data (including sensitive personal data), it must be processed in compliance with the Data Protection Act (1998) (DPA) until 25th May 2018, and the General Data Protection Regulation (GDPR) and the forthcoming Data Protection Act 2017 from 25th May 2018. (See the Corporate Governance page and Data Protection Policy for more details.) Because the GDPR will come into effect shortly, most references within this page have been made to the new legislation rather than the outgoing legislation. (Further information.) You need to pay particular attention to the secure storage of personal data and ensure confidentiality will be maintained.

Personal data is data which identifies individual participants, whether by name or another identifier, such as an ID number, IP address, or by particular circumstances relating to that individual. If you can tell from your data which measurements, responses, observations, etc, came from which participant, then you are processing personal data. Consent forms also count as personal data. Also, please be aware that some data collection instruments (e.g. surveys) allow participants to be identified by their IP address and hence also count as personal data.

A subset of personal data is sensitive personal data (or, from 25th May 2018, “special categories of personal data”), which relates to areas including ethnicity, religion, sexuality, trade union membership, political views, mental and physical health and, from 25th May 2018, genetic and biometric data that is processed for the purpose of uniquely identifying an individual.

Please be aware that even if participants aren’t actually named, it still counts as personal (or sensitive personal) data if it’s possible to deduce their identity. For example, if a survey didn’t collect people's names but did collect job titles and company names, then it’s likely this would include personal data because (in some cases at least) you could work out which responses came from which people by their job titles.
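To make the job title example concrete, the sketch below flags responses whose combination of indirect identifiers appears only once in a dataset. It is a rough screening aid under assumed, hypothetical field names (`job_title`, `company`), not a legal test: a combination shared by several people can still be identifying in context.

```python
from collections import Counter


def find_identifiable_rows(rows, quasi_identifiers):
    """Return rows whose combination of quasi-identifier values is unique."""
    def key(row):
        return tuple(row[field] for field in quasi_identifiers)

    counts = Counter(key(row) for row in rows)
    return [row for row in rows if counts[key(row)] == 1]


# Hypothetical survey responses with no names collected.
responses = [
    {"job_title": "Analyst", "company": "Acme Ltd", "answer": 4},
    {"job_title": "Analyst", "company": "Acme Ltd", "answer": 2},
    {"job_title": "Finance Director", "company": "Acme Ltd", "answer": 5},
]

# Only one respondent holds this job title at this company,
# so that response could be traced back to an individual.
at_risk = find_identifiable_rows(responses, ["job_title", "company"])
print(at_risk)
```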

Data can be linked-anonymised (pseudonymised). This means that a participant's identity is held in a separate document from their responses / observations, and the two documents are only linked via a 'key' (e.g. participant names are replaced with numbers and only the 'key' document links the participant name with their number). Strictly speaking, under the GDPR pseudonymised data counts as personal (or sensitive personal) data. However, the Information Commissioner's Office is currently reviewing the situation and should be producing some pragmatic guidance.
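The key-document approach described above can be sketched as follows. This is a minimal illustration, assuming records held as dictionaries with a hypothetical `name` field; it is not a University-provided tool, and the resulting key table would still need to be stored separately and securely from the responses.

```python
def pseudonymise(records, name_field="name"):
    """Split records into a key table and pseudonymised responses.

    The key table maps participant numbers to names and must be kept
    in a separate, access-controlled location from the responses.
    """
    key_table = {}        # participant number -> name
    numbers_by_name = {}  # name -> participant number
    responses = []
    for record in records:
        name = record[name_field]
        if name not in numbers_by_name:
            number = len(numbers_by_name) + 1
            numbers_by_name[name] = number
            key_table[number] = name
        # Copy every field except the name, then attach the number.
        stripped = {k: v for k, v in record.items() if k != name_field}
        stripped["participant_id"] = numbers_by_name[name]
        responses.append(stripped)
    return key_table, responses


# Hypothetical raw data, for illustration only.
raw = [
    {"name": "Alice Example", "score": 7},
    {"name": "Bob Example", "score": 3},
    {"name": "Alice Example", "score": 9},
]
key_table, responses = pseudonymise(raw)
print(responses)
```

Note that the other fields are passed through unchanged, so any indirect identifiers they contain would remain in the pseudonymised responses.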

The GDPR requires that personal data should only be kept (see below for information on how to store it) in an identifiable form for as long as necessary, and that, where possible, the data is anonymised as soon as feasible. 

Once the data has been anonymised, it no longer falls within the requirements of the DPA 1998 or the GDPR.


How do I store personal (and sensitive personal) data?

When it is necessary to retain the data in an identifiable format, e.g. if there is a need to maintain contact with the participant or to keep a record of the individual’s participation in case of any later enquiry or complaint, it must be stored securely in accordance with the GDPR. (See above information on personal data.)

The University provides storage facilities that can be configured to comply with the DPA and GDPR’s requirements for storing both personal and sensitive personal data, i.e. the storage meets the GDPR’s ‘safeguards’ required to store and process personal data. Therefore, you MUST use one of the storage facilities outlined in the next section.

The main difference between storing personal and sensitive personal data is the access control. When storing personal data it may be appropriate for all members of the project team to see all of the data, but with sensitive personal data (depending on the nature of the research) access to the data must be restricted to only those who need to see it - for example, the researcher rather than the administrator.

For further information, you may like to read the GDPR articles 89, 5, 9 and 14, and recitals 156-162.



Live research data

Data storage and security WHILE my research project is taking place

While your project is taking place, you’ll need to store your research data securely and protect it from loss and from unlawful or unethical access.

If you are collecting personal data then please read the previous section first.

If you are interested in archiving your data AFTER your project has completed, then please see the next section.


Digital storage

You have a number of options for storing your ‘live’ research data during your project. The aim is to find a flexible solution, which best fits your requirements.

To help with this, and to keep your Service Delivery Manager informed of new project data storage requirements, please access and complete a Research Data Enquiry form.

(Please note, in the rare instances where research involves the handling and storage of illegal material, you must read the Handling Illegal Material advisory (Information Security Advisories, under section 1: Governance) before proceeding and contact servicedesk@port.ac.uk in the first instance.)


Staff and students have several options:

 

1. Google Team Drive:

Google Team Drive has unlimited storage, is easy to access off-campus and is useful when working with external collaborators. (Please note - you must only use a Google Team Drive associated with your official UoP account.)

Where appropriate the University recommends the use of Google Team Drives for research data storage. External options, such as Dropbox, should only be considered in exceptional circumstances - if you think this is the case please contact IS Service Desk; a Privacy Impact Assessment may be required if personal data is involved.

However, before choosing the Google Team Drive, there are three things that you need to be aware of:

It may not be ideal when working with large files (e.g. videos or high resolution images) due to the time needed to download/upload or stream them.

If you’re storing personal data then it is essential that (in addition to following Steps 1-3) you follow Step 4 in its entirety.

You’ll need to consider how you’ll ensure that multiple team members aren't working on the same file at the same time. (The exception is if you’re only using Google Docs, as you can edit them online.)

 

So although Google Team Drive is an extremely useful tool, there may be some specific circumstances where it may not be practical and you will need to contact your Service Delivery Manager (below) for an alternative.

 

The steps below explain how to set up and configure a Google Team Drive. If you have any questions, please contact servicedesk@port.ac.uk.

Step 1: Create your Team Drive
To set up a Google Team Drive, please follow these instructions.

Step 2: Set up access permissions
You must restrict access to just the project team and monitor who has access. Please follow these instructions.

We are conscious that if the members of a project team leave the University there will no longer be an active data steward, and the Project Team Drive may become difficult to access. We recommend that all Research Google Team Drives have ownership shared with researchdata@port.ac.uk. This is purely to ensure that a monitored University account is always connected to the project data. The content will not be viewed unless specifically requested, for example when a Principal Investigator or Project Lead leaves the University.

Step 3 (optional, but recommended): Set up Google Drive File Stream

It is recommended that you also use the ‘add-on’, Google Drive File Stream. Using Google Drive File Stream allows you to stream files from your Google Team Drive, as opposed to manually downloading/uploading files each time you need to work on them. Or in other words, this allows your Google Team Drive to appear as another drive on your computer, much like the K drive.

It also allows you to choose to have a file or folder offline, so you can have access even when not connected to the network (for example when travelling, etc). This downloads a copy to your computer, so you do need to be mindful of the space that this will take up.

To set up Google Drive File Stream, please follow these instructions. Alternatively you can contact IS Service Desk (servicedesk@port.ac.uk) and request it to be installed on your PC.

Step 4: Storage of personal data on Google Drive
If your project involves personal data (or any other data deemed to be sensitive or confidential), it is a legal requirement that you must store the data in a manner that meets the requirements of the General Data Protection Regulation (GDPR).

In order to do this, it is the PI’s responsibility (or supervisor in the case of student projects) to ensure that the Google Team Drive is configured in the following way. Please contact servicedesk@port.ac.uk if you have any queries:

You must restrict access to your Google Team Drive to just project team members who need to have access (and have participants’ consent to have access) to the files (Step 2). Please also see the note in the previous section about controlling access further if you’re dealing with sensitive personal (special categories) data. It is essential that you are extremely careful with these ‘access sharing’ settings. 

Please also be aware that any project team member to whom you grant ‘edit’ access to files and folders in your Google Team Drive will technically be able to share these files and folders with other people. It is the PI’s responsibility to ensure this does not happen. Therefore, where possible, it is advised that the team members are only given ‘view’ access.

Where you do download a personal data file (e.g. if you select to work on it offline while using Google Drive File Stream), this will save a local copy to your computer. Therefore, you must also delete the downloaded copy of the data from your device as soon as you’ve finished working on it (e.g. at the end of each day) and empty the Recycle Bin to remove all copies of the document/s from your computer.

With regard to encrypting files, you must encrypt files that contain sensitive (special category - see above) personal data at all times, both while the files are stored on the Google Team Drive and also if you have downloaded a copy to work on. If your files contain personal data (but not sensitive / special category data) then you only need to encrypt them if you have downloaded a copy to work on.

You must encrypt the computer/laptop/any other device you are using to access these files. For Windows machines please use BitLocker, and use FileVault if you are using a Mac:

BitLocker: Start-> Control Panel->BitLocker Drive Encryption->...follow instructions.

BitLocker To Go (for encrypting portable devices) on the same page.

FileVault: instructions (Gizmodo blog)

 

And finally, just to reiterate the point above: you must use a Google Team Drive associated with your University network account, not with any other G Suite account that you may have access to. Each project team member must access the data using their own individual login. You must not create ‘generic’ Gmail logins that multiple people can use, or allow other collaborators to use them.

 



2. Folder on your department K drive (staff only):

Capacity: 1GB

Advantage:

If you are storing files that contain personal data, then you may find the K drive slightly easier to work with than a Google Team Drive. This is because the access and sharing permissions are set up for you by IS and centrally controlled. However, IS are also available to help you set up your Google Team Drive, so this isn't a major issue.

Disadvantages:

It’s harder (but still possible) to access off campus (instructions).

Limited storage capacity.

If your project involves personal data (or any other data deemed to be sensitive or confidential), you MUST put further precautions in place when using the K drive:

Ensure that your K drive folder is configured (by contacting servicedesk@port.ac.uk) to restrict access using a password to just the project team. Please see the note in the previous section about further access control if you’re dealing with sensitive personal (special categories) data. You must not save local copies of files containing personal data to your own computer / laptop / device.

(Back to the top)


3. Folder on your department N drive (students only):

Capacity: 20GB

Advantage:

If you are storing files that contain personal data, then you may find the N drive slightly easier to work with than a Google Drive. This is because the access and sharing permissions are set up for you by IS and centrally controlled. However, IS are also available to help you set up your Google Drive, so this isn't a major issue.

Disadvantage:

It’s harder (but still possible) to access off campus (instructions).

 

(Staff should not use their N drive to store research data as it creates issues for sharing data with other team members and issues when members of staff leave.)

 

Please contact your Service Delivery Manager (SDM) if the storage options listed above do not meet your needs. They will be able to advise you on the most suitable storage solution for your research project. Please ensure that your Head of Department (or supervisor if you’re a PhD student) has agreed that your project can take place before contacting your SDM. Also, please be aware that there may be a cost involved if you require a very large amount of storage.

As noted at the top of this section please access and complete a Research Data Enquiry form.

 

Contact details for the Service Delivery Managers:

Faculty of Science - stuart.graves@port.ac.uk

Faculty of Humanities and Social Science - nancy.jefferies@port.ac.uk

Faculty of Technology - barrie.miles@port.ac.uk

Faculty of Business and Law - robert.cox@port.ac.uk

Faculty of Creative and Cultural Industries - les.black@port.ac.uk

 (Back to the top)

USB sticks, external hard drives and similar

The use of unencrypted portable devices (e.g. laptops, memory sticks, portable hard drives, DVDs) to store any data (including personal data), even for temporary storage, is not permitted for staff or students. It is recommended that staff and students purchase encrypted devices. However, please be aware that if an encrypted device fails then the data will be irretrievably lost. Therefore, encrypted portable devices should only be used for temporary storage when absolutely necessary (e.g. during fieldwork) and the data must be transferred to network storage/Google Team Drive at the earliest opportunity.

 (Back to the top)

Transferring personal data via email

If you follow the instructions above for storing personal (and sensitive personal) data, then you shouldn't need to send personal (or sensitive personal) data via email. However, if you must send personal data by email, then you MUST send the data as an encrypted attachment (i.e. not in the email text itself). Please review the Information Security advisory on 'Transferring Restricted Data by Email' (Section 4, Information). If you are not fully confident with the process, seek advice from Information Services, and obtain the approval of management beforehand. Encryption utilities are built into the Microsoft Office products, and the Axcrypt encryption software is available on the University network. The UK Data Service has further information on encryption in general.

 (Back to the top)

 

Paper records

All paper records which contain personal data, including consent forms, must also be stored securely. In reality, this means storing the paper records in a lockable filing cabinet. When dealing with sensitive personal data, the filing cabinet must not be unlocked until the data is required, and the data returned as soon as it is not needed. Sensitive personal data must not be left unattended on a desk. Only those members of the project team who have permission and need to access the sensitive personal data should be given physical access to the filing cabinet.      

(Back to the top)


Openly sharing and preservation of research data AFTER my research project has finished

Archived data

Sharing and making data open access

As a result of the national and international push towards ‘open research’, it’s the policy of the University and all major funders that you make the research data which underpins your publications open access (i.e. publicly available to download) at the end of your project, unless there is an ethical, legal or contractual reason not to do so. Not only does this greatly increase the transparency of the research process by allowing results to be re-analysed and conclusions re-tested, it also enables data to be reused by other researchers, industry, charities, governments and even the general public for creative and innovative purposes.

 (Back to the top)

Do I have to share my research data?

It is the University’s policy (along with the policy of major funders) that you publicly share your research data if it underpins published research findings or has potential long-term value.

The caveat to this policy is that if there is a legal, contractual or ethical reason not to share your research data, then you do not have to do so. For example, there may be a contractual requirement from a commercial partner to keep research data confidential. Also, you do not have to share your research data if you have plans for future commercial exploitation. Similarly, you must not share your research data if it wouldn't be legal/ethical to do so. For example, this applies if it's not possible to fully anonymise your research data (i.e. remove all personally identifiable information) and sharing it would therefore violate the GDPR.

 (Back to the top)

Sharing personal data

You must NOT share personal data (i.e. unanonymised data) beyond the originally agreed project team, unless you have explicit consent from participants allowing their personal data to be shared.

Therefore, to share this type of data, you must first fully anonymise it and you should still gain participants' consent for sharing.  To explain this a bit further, once the data is anonymised it's no longer covered by the DPA/GDPR. However, even when the data is anonymised, you should still make it clear to participants that other processing will occur, for example, an anonymised report of participation being included in open research. Thus, although data is no longer covered by the DPA/GDPR once it has been anonymised, from an ethical perspective you should still gain participants’ consent for sharing the data.

In the more unusual scenario where the research requires that the data is shared publicly in a format that identifies the individual, the participants must be clearly told this before they take part in the research so they can decide not to engage with the research at all. The data can only be used and shared in an identifiable format if explicit consent has been given by the participants allowing them to be identified, but even then, you should also be prepared to remove this data if the individual changes their mind.

 (Back to the top)

Where can I share my research data?

In terms of which repository (archive) to use for long-term preservation and to make your data publicly available and open access, researchers need to follow these steps:

  1. First, if you are externally funded and/or have external collaborators, see if your funder(s)/collaborator(s) require the data to be deposited in a particular repository, e.g. all research funded by the ESRC needs to be deposited in the UK Data Archive.
  2. If not, then see if a recognised, accredited subject or disciplinary international or national repository exists where the data can be deposited. This is advised as it affords the research greater visibility. The re3data.org registry (https://www.re3data.org/) provides a useful index to a large number of repositories.
  3. Finally, if a suitable external repository doesn’t exist, then the Portsmouth Research Portal (Pure) can be used to archive your research data and make it open access (see Adding Datasets to Pure instructions).

 

 

In all of the above scenarios, a metadata record (and link to external repository if relevant) also needs to be added to Pure so that the University has a record of the data produced.
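
The three steps above form a simple order of priority, which can be sketched as follows. This is an illustration only: the function name and its parameters are hypothetical, and deciding whether a suitable subject repository exists still needs human judgement.

```python
from typing import Optional

PURE = "Portsmouth Research Portal (Pure)"


def choose_repository(funder_required: Optional[str] = None,
                      subject_repository: Optional[str] = None) -> str:
    """Pick where to deposit data, following the three steps above in order."""
    # Step 1: a repository required by a funder or collaborator takes priority.
    if funder_required:
        return funder_required
    # Step 2: otherwise use a recognised subject/disciplinary repository, if one exists.
    if subject_repository:
        return subject_repository
    # Step 3: otherwise fall back to Pure.
    return PURE


# Whichever repository is chosen, a metadata record must also be added to Pure.
print(choose_repository(funder_required="UK Data Archive"))
print(choose_repository())
```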

If you would like help in selecting a repository, then please contact the University’s Research Data Officer - researchdata@port.ac.uk.

(Back to the top)

 

What else must I do to share my data?

Your data should be stored in a format suitable for long-term preservation. This is to ensure that it can still be accessed over time, despite changes in the software and hardware used to originally create or access it. Thus, the research data should be stored in an open standard format, and where applicable (e.g. when storing images) this format should be lossless. See the UK Data Service for further help or contact researchdata@port.ac.uk.

You also need to ensure that you provide sufficient documentation / metadata for someone else who wishes to use your data to understand it. Please see the documentation section.

Plus, you should specify what licence (e.g. CC BY) you intend to apply to your data, as this enables its reuse while also ensuring you gain full credit for having produced the data.

(Back to the top)

How do I obtain a DOI for my data?

A digital object identifier (DOI) is a unique alphanumeric string that’s used to identify content and provide a persistent (i.e. won't change) link to where it is on the Internet. When you publish a journal article, your publisher assigns a DOI to your article. DOIs also need to be assigned to datasets to enable people to find them more easily.

When you deposit your data in any of the above types of repository, a DOI will be automatically created for you.
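
Once you have a DOI, the persistent link is formed by prefixing it with the https://doi.org/ resolver address. The short sketch below shows this; the function name is hypothetical and the DOI used in the example is a made-up placeholder, not a real dataset.

```python
def doi_to_url(doi: str) -> str:
    """Turn a DOI such as '10.1234/abc' into a persistent https://doi.org/ link."""
    doi = doi.strip()
    # Accept common ways of writing a DOI and strip them back to the bare identifier.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # All DOIs begin with the directory indicator "10."
    if not doi.startswith("10."):
        raise ValueError(f"Not a DOI: {doi!r}")
    return "https://doi.org/" + doi


# Placeholder DOI for illustration only.
print(doi_to_url("doi:10.1234/example-dataset"))
```

This link is the form to quote in the data statement of your publication.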

 (Back to the top)

Can I share my research data with some restrictions imposed?

You can do this, but only if you have a legal, contractual or ethical reason to do so.

One example of a restriction is that the degree to which the research data are made open access may vary. For example, data could be provided on request (as opposed to being free to download), or the data could be provided subject to certain terms and conditions. Or, for example, there may be a contractual licensing requirement from a commercial partner to keep the research data confidential until after an embargo period. One way of sharing data, but restricting access, is to apply a suitable licence.

(Back to the top)

How can I retain my research data without sharing it?

In addition to making your research data open access wherever possible (as noted above), you may also have some other parts of your research data which you need to retain at the end of your project but can’t make open access for ethical, legal or contractual reasons - e.g. consent forms, participant contact details and unanonymised raw data kept for follow up research.

When collecting the personal or sensitive personal data, the research participants should be told whether or not it will be necessary to retain the data in an identifiable format and therefore asked for their consent to this ‘further’ processing. Furthermore, it should be made clear whether the data in its raw or non-anonymised form will be used for any further research or only research approved by an ethics committee, and whether data will subsequently be anonymised. Where applicable, paper records may be scanned and originals destroyed.

All retained (but not shared) research data needs to be stored securely to prevent unauthorised access using one of the solutions described in the Digital Storage section. All personal and sensitive personal data needs to be stored in accordance with the DPA/GDPR, as described in the previous section.

 (Back to the top)

 

How long after the project should I store my research data for?

You need to look at the various policies that impinge on your research data and use the longest retention period. The retention applies to both the data and consent forms. The policies that will typically apply are:

When it comes to the research data that you’ve shared publicly (see above), the overarching goal is long-term preservation. Each repository imposes its own minimum period for keeping your data.

Your research data is also subject to the UoP Retention Schedule for Research Data and the University’s Research Data Management Policy, which requires a minimum of 10 years (see Section 12 for the finer detail).

Plus your data may be subject to the policy of a particular funder. For example, the Medical Research Council has a retention policy of 30 years.
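
As a worked example of "use the longest retention period", the sketch below takes the two figures quoted above (10 years for the University policy and 30 years for the Medical Research Council) and works out the earliest date on which the data could be destroyed. The function name and the project end date are hypothetical, and counting the period from the project end date is an assumption; check the relevant policies for the actual starting point and for any repository minimum.

```python
from datetime import date
from typing import Dict, Tuple


def retention_end(project_end: date, retention_years: Dict[str, int]) -> Tuple[str, date]:
    """Return the policy with the longest retention period and the date it ends."""
    policy, years = max(retention_years.items(), key=lambda item: item[1])
    try:
        end = project_end.replace(year=project_end.year + years)
    except ValueError:
        # 29 February in a target year that is not a leap year: use 28 February.
        end = project_end.replace(year=project_end.year + years, day=28)
    return policy, end


policies = {
    "UoP Research Data Management Policy": 10,
    "Medical Research Council": 30,
}
policy, end = retention_end(date(2020, 9, 30), policies)  # hypothetical project end date
print(policy, end)  # Medical Research Council 2050-09-30
```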

 

Also, to complicate things further, if you are storing personal data then the GDPR may come into effect if a participant requests that you destroy their data (see the Destruction section below).

 (Back to the top)

How can I link my publication to my research data?

In your publications, you need to include a statement of where the reader can find a copy of your data. Examples include:

"Supporting/supplementary research data are openly available at [web address or DOI]".
"Supporting/supplementary research data are available, subject to conditions of use, from [web address or DOI]".
"The study analysed an existing research dataset, available from [web address or DOI]."
"Supporting research data are available on request from the [add name]".
"Due to licence restrictions, supporting research data are not openly available."
"No Research Data were created during this study."
"Supporting data are included in the body of this publication."

 (Back to the top)

 

Destruction

You need to plan how you will destroy your research data when you no longer need to retain it. See above for retention periods. (Some studies entail the destruction of any personal data following analysis for research purposes while the anonymised data is retained for the full retention period.)

With regard to the GDPR and the destruction of data, if you have used ‘consent’ as your ‘legal basis’ for collecting and processing participants' personal data then you do have to delete a participant's data if they request you to do so. If you have already anonymised it and so cannot delete it because you cannot identify it, then you would need to explain that to the participant.

Please contact the University’s Research Data Officer (researchdata@port.ac.uk) for further advice.

(Back to the top)


Intellectual Property Rights, Copyright and Licensing

What do I need to know about Intellectual Property Rights (IPR)? (a.k.a. Who ‘owns’ the research data?)

By default the University asserts ownership over the research data collected by academics (or co-ownership with third parties), unless alternative arrangements have been agreed (e.g. by a separate written contract).

Please see the University’s Intellectual Property (IP) Policy for further details. IPR may have an impact on what licence(s) you release the data under.

 (Back to the top)

 

What do I need to know about copyright?

Copyright is a type of intellectual property protection. Copyright protects your work/data by preventing others from copying, adapting, and commercialising your work without permission. Your work is normally automatically copyrighted (in the case of research data, nearly always to the University) upon its creation (i.e. there’s no requirement to apply for it) (see above). 

For copyright queries please contact david.sherren@port.ac.uk. (The University's Copyright Policy relates to the use of third party material. For ownership of copyright see the Intellectual Property Policy above.)

 (Back to the top)

 

What do I need to know about licensing?

A licence grants other people / organisations permission to use your research data under certain conditions. Therefore, sharing your research data and making it open access ‘under licence’ allows others to reuse it, while also allowing the University to (typically) retain the IPR of the data and ensuring that you and the University are fully credited for its creation.

There are two aspects you need to consider when thinking about licensing:

  1. How you apply a licence to your own research data to enable it to be shared.
  2. When re-using someone else's research data, what are the implications of any licence that’s been placed on it?
     

If you are ever in any doubt about licences, please contact researchdata@port.ac.uk.

This raises the question of which licence to apply to your research data. The default rule is that you should use a suitably permissive licence, unless you’ve got a good reason to use a more restrictive licence.

For example, the Creative Commons ‘Attribution’ (CC BY) licence is commonly used, which allows other people to re-use and adapt your research data, but they must provide an attribution back to you (and the University) as being the original producers. This licence does not put any restriction on the data being used for commercial purposes.

If your research is externally funded or conducted as part of a contractual agreement, then you also need to check the terms and conditions of your funders/contract. In the case of contracts, there may be restrictions on whether or not you can actually share your data and on the licence you can apply. Or, for example, if you are funded by RCUK then they have expressed a strong preference towards the CC BY licence.

If your research is not externally funded or not conducted as part of a contractual agreement, then it falls under the University of Portsmouth’s Research Data Policy, which expresses a strong preference towards the CC BY licence.

When re-using someone else’s data, if it has a licence applied (which it probably will have) then you need to respect the conditions of the licence. For example, if it has the CC BY-NC licence then you must publicly acknowledge (e.g. in your research article) the original source of the data and you must not use this research data for commercial purposes (see below). The licence already applied to any data that you re-use can also affect the licensing under which you can share your work onwards. Please contact researchdata@port.ac.uk if you need advice.

Releasing your data under a licence is very simple. Once you’ve selected your licence (see list below), you would typically state your licence both on the metadata record for your data and in a sensible place within the data itself (e.g. in a header / footer). This is normally a statement that the data is released under the chosen licence, which will include a link to the full text of the licence itself.

This is an example text for a licensing statement:

[This database is/These data are/<name of dataset> is] made available under the Public Domain Dedication and Licence v1.0 whose full text can be found at: http://opendatacommons.org/licenses/pddl/1.0/ 
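
If you release several datasets, a statement in the style of the example above can be generated consistently with a few lines of code. This is a minimal sketch: the function name and the dataset name are hypothetical, and the licence shown (CC BY 4.0) is only one possible choice.

```python
def licence_statement(dataset_name: str, licence_name: str, licence_url: str) -> str:
    """Build a licensing statement modelled on the example text above."""
    return (f"{dataset_name} is made available under the {licence_name} "
            f"whose full text can be found at: {licence_url}")


statement = licence_statement(
    "Example survey dataset",  # hypothetical dataset name
    "Creative Commons Attribution 4.0 International (CC BY 4.0) licence",
    "https://creativecommons.org/licenses/by/4.0/",
)
print(statement)
```

The resulting text can then be pasted into the metadata record and into a header, footer or accompanying README file for the data.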

 (Back to the top)

List of licences:

Standard licences

Creative Commons (CC) - The most common and widely used type of licensing for most forms of original content and data. Simple and robust. Combination of attribution (CC BY), share-alike (-SA), no derivatives (-ND) (no altering) and non-commercial (-NC) aspects.

 

Creative Commons Licence: What it means

CC BY (Attribution) (least restrictive)

The CC BY licence lets all other people/organisations use your* research data.

They can also redistribute, remix, tweak, and build upon your* research data.

People who use your* data must credit you* for the data’s original creation.

This licence allows your* data to be used for any purpose, including commercial purposes.

This licence is recommended for maximum dissemination and use of licensed materials.

CC BY-SA (Attribution, Share Alike)

The CC BY-SA licence lets all other people/organisations use your* research data.

They can also redistribute, remix, tweak, and build upon your* research data.

People who use your* data must credit you* for the data’s original creation.

This licence allows your* data to be used for any purpose, including commercial purposes.

Plus, people who use your* data must license their new creations under the identical terms, i.e. must also be licensed under the CC BY-SA licence.

CC BY-ND (Attribution, No Derivatives)

The CC BY-ND licence lets all other people/organisations use your* research data.

People who use your* data must credit you* for the data’s original creation.

This licence allows your* data to be used for any purpose, including commercial purposes.

Plus, people who use your* data must only share it onwards in its whole and unchanged form.

CC BY-NC (Attribution, Non-Commercial)

The CC BY-NC licence lets all other people/organisations use your* research data.

They can also redistribute, remix, tweak, and build upon your* research data.

People who use your* data must credit you* for the data’s original creation.

This licence does NOT allow your* data to be used for commercial purposes.

CC BY-NC-SA (Attribution, Non-Commercial, Share Alike)

The CC BY-NC-SA licence lets all other people/organisations use your* research data.

They can also redistribute, remix, tweak, and build upon your* research data.

People who use your* data must credit you* for the data’s original creation.

This licence does NOT allow your* data to be used for commercial purposes.

Plus, people who use your* data must license their new creations under the identical terms, i.e. must also be licensed under the CC BY-NC-SA licence.

CC BY-NC-ND (Attribution, Non-Commercial, No Derivatives) (most restrictive)

The CC BY-NC-ND licence lets all other people/organisations use your* research data.

People who use your* data must credit you* for the data’s original creation.

This licence does NOT allow your* data to be used for commercial purposes.

Plus, people who use your* data must only share it onwards in its whole and unchanged form.

 (Back to the top)

* In the table above, 'you/your' means the licensor, i.e. yourself (the individual academic) and typically the University and other collaborators.

Note that the Creative Commons 4.0 Legal Code itself uses the term differently: 'You means the individual or entity exercising the Licensed Rights under this Public License. Your has a corresponding meaning.' (Creative Commons 4.0 Legal Code, Section 1.k), i.e. the person or organisation reusing the data.

 

Under the CC licences there is also a Public Domain dedication (CC0), although this is rarely used for research data.
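
When checking whether a planned reuse is compatible with a dataset's licence, the Creative Commons table above can be expressed as a simple lookup. The sketch below encodes only what the table states; the names `Terms`, `CC_LICENCES` and `reuse_allowed` are hypothetical, and the result is no substitute for reading the licence text itself.

```python
from typing import NamedTuple


class Terms(NamedTuple):
    commercial_use: bool  # may the data be used for commercial purposes?
    adaptations: bool     # may the data be remixed, tweaked and built upon?
    share_alike: bool     # must new creations carry the identical licence?


# Every licence listed here also requires attribution (the "BY" element).
CC_LICENCES = {
    "CC BY":       Terms(commercial_use=True,  adaptations=True,  share_alike=False),
    "CC BY-SA":    Terms(commercial_use=True,  adaptations=True,  share_alike=True),
    "CC BY-ND":    Terms(commercial_use=True,  adaptations=False, share_alike=False),
    "CC BY-NC":    Terms(commercial_use=False, adaptations=True,  share_alike=False),
    "CC BY-NC-SA": Terms(commercial_use=False, adaptations=True,  share_alike=True),
    "CC BY-NC-ND": Terms(commercial_use=False, adaptations=False, share_alike=False),
}


def reuse_allowed(licence: str, commercial: bool, adapting: bool) -> bool:
    """Check a planned reuse against the terms summarised in the table."""
    terms = CC_LICENCES[licence]
    if commercial and not terms.commercial_use:
        return False
    if adapting and not terms.adaptations:
        return False
    return True


# Example: adapting CC BY-NC data for a commercial product is not permitted.
print(reuse_allowed("CC BY-NC", commercial=True, adapting=True))
```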

 

Open Data Commons - aimed at databases.

Open / Non-commercial Government Licence - UK public sector and government resources.

 

Prepared Licences
You may be required to apply a prepared licence. For example, depositors at the UK Data Service will need to sign a licence agreement as part of the deposition process. Safeguarded/controlled data can be made available under bespoke licensing options that apply access restrictions to data. For example, see the UK Data Service Data Access Policy page for more information.

 

Multiple licences
Used where no one licence is entirely satisfactory. Often used with open source software.

 

Bespoke licences
Normally the standard licences cover all requirements. However, if you require a bespoke licence you will need to contact the Contracts & Post Award Team in Research & Innovation Services (riscontracts@port.ac.uk).

 

(Source: adapted from Ball, A. (2014). ‘How to License Research Data’. DCC How-to Guides. Edinburgh: Digital Curation Centre. Available online: http://www.dcc.ac.uk/resources/how-guides/license-research-data)

(Back to the top)


Ethics

What do I need to consider?

The University provides an online reviewing system (see UoP Ethics page) to help identify ethical concerns and provide a formal record of ethical review (even if it is just to say that there are no issues of concern). Where there are issues that need to be explored in greater depth, you will be invited to complete an Application for Ethics Review form.

You may also like to read the Managing Personal Data section above.

(Back to the top)


Useful links

Internal:

Corporate Governance - General Data Protection Regulation

Ethics

Information Security Advisories

Researcher Support (Research and Innovation Services)

University of Portsmouth's Data Protection Policy

University of Portsmouth's Research Data Management Policy

 

External:

Digital Curation Centre Data Management Plans

DMPonline

UK Data Service

Get Data

Use Data

Manage Data

Deposit Data

 (Back to the top)


References

Ball, A. (2014). How to License Research Data. DCC How-to Guides. Edinburgh: Digital Curation Centre. Retrieved from http://www.dcc.ac.uk/resources/how-guides/license-research-data

Digital Curation Centre (DCC). (2018a). DMPonline. Retrieved from http://www.dcc.ac.uk/dmponline

Digital Curation Centre (DCC). (2018b). Overview of funders' data policies. Retrieved from http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies

Ingram, C. (2016). How and why you should manage your research data. Retrieved from https://www.jisc.ac.uk/guides/how-and-why-you-should-manage-your-research-data

re3data.org (2018). Retrieved from https://www.re3data.org/browse/by-subject/