User:Shawndouglas/sandbox/sublevel13

From LIMSWiki


Title: Why are the FAIR data principles increasingly important to research laboratories and their software?

Author for citation: Shawn E. Douglas

License for content: Creative Commons Attribution-ShareAlike 4.0 International

Publication date: May 2024

Introduction

What are the FAIR data principles?

The FAIR data principles were published by Wilkinson et al. in 2016 as the product of a stakeholder collaboration that sought to make research "objects" (i.e., research data and information of all shapes and formats) more universally findable, accessible, interoperable, and reusable (FAIR) by both machines and people.[1] The authors released the FAIR principles while recognizing that "one of the grand challenges of data-intensive science ... is to improve knowledge discovery through assisting both humans and their computational agents in the discovery of, access to, and integration and analysis of task-appropriate scientific data and other scholarly digital objects."[1] Since the principles were published, other researchers have taken the somewhat broad set of principles and adapted them to their own scientific disciplines, as well as to other types of research objects, including the research software used to generate those objects.[2][3][4][5][6][7]

But why are research laboratories increasingly pushing for more findable, accessible, interoperable, and reusable research objects and software? The short answer, as evidenced by the Wilkinson et al. quote above, is that greater innovation can be gained through improved knowledge discovery. The discovery process necessary for that greater innovation, whether through traditional research methods or artificial intelligence (AI)-driven methods, is enhanced when research objects and software are compatible with the core ideas of FAIR.[1][8][9]

A slightly longer answer, suitable for a Q&A topic, requires looking at a few more details of the FAIR principles as applied to both research objects and research software. Research laboratories, whether located in an organization or contracted out as third parties, exist to innovate. That innovation can come in the form of discovering new materials that may or may not have a future application, developing a pharmaceutical to improve patient outcomes for a particular disease, or improving an existing food or beverage recipe, among others. In academic research labs, innovation usually takes the form of knowledge advancement and the publishing of research results, whereas in industry research labs, it typically takes the form of more practical applications of research concepts to new or existing products or services. In both cases, research software was likely involved at some point, whether it be something like a researcher-developed bioinformatics application or a commercial vendor-developed electronic laboratory notebook (ELN).

Regarding research objects themselves, the FAIR principles essentially say "vast amounts of data and information in largely heterogeneous formats, spread across disparate sources both electronic and paper, make modern research workflows difficult, tedious, and at times impossible. Further, repeatability, reproducibility, and replicability of published (from academic research organizations) or internal (for industry research organizations) research results are at risk, giving less confidence to academic peers in the published research, or less confidence to critical stakeholders in the viability of a researched prototype." As such, research objects (which include not only their inherent data and information but also any metadata that describe features of that data and information) need to be[10]:

  • findable, with globally unique and persistent identifiers, rich metadata that link to the identifier of the data described, and an ability to be indexed as an effectively searchable resource;
  • accessible, retrievable by their identifiers (with metadata remaining retrievable even when the data are no longer available) using secure, standardized communication protocols that are open, free, and universally implementable, with authentication and authorization mechanisms;
  • interoperable, represented using formal, accessible, shared, and relevant language models and vocabularies that abide by FAIR principles, as well as with qualified linkage to other metadata; and
  • reusable, being richly described by accurate and relevant metadata, released with a clear and accessible data usage license, associated with sufficiently detailed provenance information, and compliant with discipline-specific community standards.
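
As a rough illustration of how the four requirements above might translate into something a laboratory's software could check, the following Python sketch models a metadata record for a single research object and reports which FAIR-related fields are still empty. The class name, the field names, the mapping of fields to principles, and the example identifier and URL are all hypothetical choices made for this sketch; they are not drawn from Wilkinson et al., the FAIR Cookbook, or any standard metadata schema, and an empty result here is not a formal FAIR assessment.

```python
from dataclasses import dataclass, field


@dataclass
class ResearchObjectMetadata:
    """Hypothetical metadata record for one research object (not a standard schema)."""
    identifier: str = ""      # globally unique, persistent ID (e.g., a DOI)
    title: str = ""
    description: str = ""
    access_url: str = ""      # where the object (or at least its metadata) can be retrieved
    vocabulary: str = ""      # shared vocabulary or ontology used to describe the data
    related_identifiers: list = field(default_factory=list)  # links to other (meta)data
    license: str = ""         # data usage license
    provenance: str = ""      # how, when, and by whom the data were produced


# Illustrative (assumed) mapping of each FAIR principle to the fields checked here.
FAIR_FIELDS = {
    "findable": ["identifier", "title", "description"],
    "accessible": ["access_url"],
    "interoperable": ["vocabulary", "related_identifiers"],
    "reusable": ["license", "provenance"],
}


def fair_gaps(record):
    """Return, per principle, the names of the fields that are empty in the record."""
    gaps = {}
    for principle, names in FAIR_FIELDS.items():
        missing = [name for name in names if not getattr(record, name)]
        if missing:
            gaps[principle] = missing
    return gaps


# Example record with made-up values; vocabulary, links, and provenance are left out.
record = ResearchObjectMetadata(
    identifier="doi:10.1234/example",
    title="Example assay results",
    description="Plate reader output for a hypothetical assay.",
    access_url="https://repository.example.org/objects/42",
    license="CC-BY-4.0",
)
print(fair_gaps(record))
```

For the example record, the check reports gaps under "interoperable" (no vocabulary or related identifiers) and "reusable" (no provenance). A check like this only confirms that a field is populated; whether an identifier is truly persistent or a license is appropriate still requires human or repository-level review.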


References

  1. Wilkinson, Mark D.; Dumontier, Michel; Aalbersberg, IJsbrand Jan; Appleton, Gabrielle; Axton, Myles; Baak, Arie; Blomberg, Niklas; Boiten, Jan-Willem et al. (15 March 2016). "The FAIR Guiding Principles for scientific data management and stewardship" (in en). Scientific Data 3 (1): 160018. doi:10.1038/sdata.2016.18. ISSN 2052-4463. PMC PMC4792175. PMID 26978244. https://www.nature.com/articles/sdata201618.
  2. "fair data principles". PubMed Search. National Institutes of Health, National Library of Medicine. https://pubmed.ncbi.nlm.nih.gov/?term=fair+data+principles. Retrieved 30 April 2024. 
  3. Hasselbring, Wilhelm; Carr, Leslie; Hettrick, Simon; Packer, Heather; Tiropanis, Thanassis (25 February 2020). "From FAIR research data toward FAIR and open research software" (in en). it - Information Technology 62 (1): 39–47. doi:10.1515/itit-2019-0040. ISSN 2196-7032. https://www.degruyter.com/document/doi/10.1515/itit-2019-0040/html. 
  4. Gruenpeter, M. (23 November 2020). "FAIR + Software: Decoding the principles" (PDF). FAIRsFAIR “Fostering FAIR Data Practices In Europe”. https://www.fairsfair.eu/sites/default/files/FAIR%20%2B%20software.pdf. Retrieved 30 April 2024. 
  5. Barker, Michelle; Chue Hong, Neil P.; Katz, Daniel S.; Lamprecht, Anna-Lena; Martinez-Ortiz, Carlos; Psomopoulos, Fotis; Harrow, Jennifer; Castro, Leyla Jael et al. (14 October 2022). "Introducing the FAIR Principles for research software" (in en). Scientific Data 9 (1): 622. doi:10.1038/s41597-022-01710-x. ISSN 2052-4463. PMC PMC9562067. PMID 36241754. https://www.nature.com/articles/s41597-022-01710-x. 
  6. Patel, Bhavesh; Soundarajan, Sanjay; Ménager, Hervé; Hu, Zicheng (23 August 2023). "Making Biomedical Research Software FAIR: Actionable Step-by-step Guidelines with a User-support Tool" (in en). Scientific Data 10 (1): 557. doi:10.1038/s41597-023-02463-x. ISSN 2052-4463. PMC PMC10447492. PMID 37612312. https://www.nature.com/articles/s41597-023-02463-x. 
  7. Du, Xinsong; Dastmalchi, Farhad; Ye, Hao; Garrett, Timothy J.; Diller, Matthew A.; Liu, Mei; Hogan, William R.; Brochhausen, Mathias et al. (6 February 2023). "Evaluating LC-HRMS metabolomics data processing software using FAIR principles for research software" (in en). Metabolomics 19 (2): 11. doi:10.1007/s11306-023-01974-3. ISSN 1573-3890. https://link.springer.com/10.1007/s11306-023-01974-3. 
  8. Olsen, C. (1 September 2023). "Embracing FAIR Data on the Path to AI-Readiness". Pharma's Almanac. https://www.pharmasalmanac.com/articles/embracing-fair-data-on-the-path-to-ai-readiness. Retrieved 3 May 2024.
  9. Huerta, E. A.; Blaiszik, Ben; Brinson, L. Catherine; Bouchard, Kristofer E.; Diaz, Daniel; Doglioni, Caterina; Duarte, Javier M.; Emani, Murali et al. (26 July 2023). "FAIR for AI: An interdisciplinary and international community building perspective" (in en). Scientific Data 10 (1): 487. doi:10.1038/s41597-023-02298-6. ISSN 2052-4463. PMC PMC10372139. PMID 37495591. https://www.nature.com/articles/s41597-023-02298-6. 
  10. Rocca-Serra, Philippe; Sansone, Susanna-Assunta; Gu, Wei; Welter, Danielle; Abbassi Daloii, Tooba; Portell-Silva, Laura (30 June 2022). "Introducing the FAIR Principles". D2.1 FAIR Cookbook. doi:10.5281/ZENODO.6783564. https://zenodo.org/record/6783564.