Reproducibility and Reusability

What is reproducibility and why does it matter?

An increasing number of funding agencies and journal publishers require researchers to share data in ways that allow others to reproduce the reported results. Facilitating the reproducibility of published work is one of the objectives of research data management, and it helps others verify or improve research results over time.

  • Reproducibility and Replication (as defined by the National Science Foundation's Subcommittee on Replicability in Science, 2015)
    • Reproducibility: The ability of a researcher to duplicate the results of a prior study using the same materials and procedures used by the original investigator.
    • Replication: The same procedures used by the original investigator are followed, but new data are collected.
  • Empirical, Computational, and Statistical Reproducibility (Stodden, 2014)
    • Empirical: Data and collection details are made freely available.
    • Computational: Code, software, hardware, and implementation details are provided.
    • Statistical: Details on the choice of statistical tests and model parameters are provided.
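
As one illustration of the computational item above, the software and hardware details of an analysis run can be recorded alongside the results. The following is a minimal sketch in Python using only the standard library; the function name capture_environment and the layout of the record are assumptions made for this example, not a required format.

```python
import json
import platform
from importlib import metadata


def capture_environment(packages=()):
    """Collect basic software and hardware details for a run (illustrative helper)."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return {
        "python_version": platform.python_version(),
        "implementation": platform.python_implementation(),
        "operating_system": platform.platform(),
        "machine": platform.machine(),
        "packages": versions,
    }


if __name__ == "__main__":
    # "pip" is only an example; list the packages your analysis actually uses.
    print(json.dumps(capture_environment(["pip"]), indent=2))
```

Saving a record like this next to the shared code and data gives others a starting point for rebuilding a comparable environment.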

What should I know about FAIR data principles?

The FAIR data principles call for data that are Findable, Accessible, Interoperable, and Reusable, not only by humans but also by machines. They are intended as a guideline for those wishing to enhance the reusability of their data. The FAIR principles are becoming widely used within the data management and data sharing community and are becoming a standard in funders' requirements. An increasing number of data repositories (such as the Harvard Dataverse) are aligned with, and mostly compliant with, the FAIR data principles.
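
To make the idea of machine-readable data more concrete, the sketch below shows a small metadata record and a check for missing fields, written in Python. The field names, the REQUIRED_FIELDS list, and the placeholder identifier are all hypothetical choices for illustration; they do not represent a formal FAIR schema or the metadata format of any particular repository.

```python
import json

# Hypothetical minimal set of fields; real repositories define their own schemas.
REQUIRED_FIELDS = ("identifier", "title", "creators", "license", "file_format")


def missing_fields(record):
    """Return the required fields that are absent or empty in a metadata record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


record = {
    "identifier": "doi:10.xxxx/EXAMPLE",  # placeholder, not a real DOI
    "title": "Example survey dataset",
    "creators": ["Example Researcher"],
    "license": "CC0-1.0",
    "file_format": "text/csv",
}

if __name__ == "__main__":
    print(json.dumps(record, indent=2))
    print("Missing fields:", missing_fields(record))
```

Loosely speaking, a persistent identifier supports findability, an open file format supports interoperability, and an explicit license supports reuse.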