Selecting which data to keep for long-term preservation can be difficult, particularly as it is impossible to predict what information will be required in the future. There is no right or wrong answer about what to keep and what to dispose of, which makes the decision harder and ultimately reliant on subjective judgement.
Keeping an audit trail detailing what data was disposed of, why and when, and ensuring that funder and institutional requirements are met, should help you deal with any unforeseen issues that arise.
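One way to keep such an audit trail is a simple log file that gains one row per disposal. The following is a minimal sketch in Python, not a prescribed method: the function name `log_disposal`, the CSV format, and the column names are all assumptions made for illustration, and the `authorised_by` column goes beyond the what, why and when mentioned above. Check your funder and institutional requirements for the fields you actually need.

```python
import csv
from datetime import date
from pathlib import Path

# Hypothetical column names: what was disposed of, when, why, and who approved it.
FIELDS = ["dataset", "disposed_on", "reason", "authorised_by"]


def log_disposal(log_path, dataset, reason, authorised_by, disposed_on=None):
    """Append one disposal record to a CSV audit trail and return the record."""
    record = {
        "dataset": dataset,
        # Default to today's date if no disposal date is given.
        "disposed_on": (disposed_on or date.today()).isoformat(),
        "reason": reason,
        "authorised_by": authorised_by,
    }
    path = Path(log_path)
    # Write the header row only when the log is new or empty.
    write_header = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerow(record)
    return record
```

For example, `log_disposal("disposal_log.csv", "pilot_survey_raw", "superseded by cleaned version", "project lead")` would add one dated row to the log. The file is only ever appended to, so earlier entries are left untouched.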
Is it not possible to just keep everything?
You may think that keeping all of your data is the easiest solution. However, this is not practical for a number of reasons, including:
- the costs associated with storing data in the long term
- the difficulty it may cause others in trying to find the data that is ‘useful’ to them
- any information stored must be disclosed, if requested, under the Freedom of Information Act
How should I select what to keep?
There are several things to consider when deciding what data to keep. The following checklist may provide a useful starting point:
- what are my funder and institutional requirements on what data to keep?
- who holds the intellectual property and legal rights to this data in relation to storage and re-use? Can I negotiate these rights if it is not me?
- is there sufficient metadata to enable future users to locate the data effectively?
- if the costs of storing the data are my responsibility, can I afford them?
- is the data transient or a ‘one-off’ that cannot be replicated, e.g. weather records?
Acknowledgement: This content is based on guidance provided by the University of Cambridge, the University of Glasgow, the University of Leicester, the Digital Curation Centre and the UK Data Archive.