Philosophy

A brief introduction to best practices in data management.

Citation

“Someone unfamiliar with your project should be able to look at your computer files and understand in detail what you did and why.”

– William Stafford Noble, A Quick Guide to Organizing Computational Biology Projects (Article →)

Overview

In this section, we will discuss:

  • Best practices for data storage & organisation.
  • Rules and conventions for naming files.
  • Typical components of genomics data analysis projects.
  • Records, metadata, and traceability.
  • Backups and archives.
  • Data transfers to and from remote computers.

Throughout, the focus is on genomics data sets and pipelines.

Prerequisites

Refer to the section Files and directories for the essential concepts and commands needed to manage files and directories.