Journal:Structure-based knowledge acquisition from electronic lab notebooks for research data provenance documentation
Full article title | Structure-based knowledge acquisition from electronic lab notebooks for research data provenance documentation |
---|---|
Journal | Journal of Biomedical Semantics |
Author(s) | Schröder, Max; Staehlke, Susanne; Groth, Paul; Nebe, J. Barbara; Spors, Sascha; Krüger, Frank |
Author affiliation(s) | University of Rostock, University Medical Center Rostock, University of Amsterdam |
Primary contact | Email: max dot schroeder at uni-rostock dot de |
Year published | 2022 |
Volume and issue | 13 |
Article # | 4 (2022) |
DOI | 10.1186/s13326-021-00257-x |
ISSN | 2041-1480 |
Distribution license | Creative Commons Attribution 4.0 International |
Website | https://jbiomedsem.biomedcentral.com/articles/10.1186/s13326-021-00257-x |
Download | https://jbiomedsem.biomedcentral.com/track/pdf/10.1186/s13326-021-00257-x.pdf (PDF) |
This article should be considered a work in progress and incomplete. Consider this article incomplete until this notice is removed.
Abstract
Background: Electronic laboratory notebooks (ELNs) are used to document experiments and investigations in the wet lab. Protocols in ELNs contain a detailed description of the conducted steps, including the information necessary to understand the procedure and the collected research data, as well as to reproduce the research investigation. The purpose of this study is to investigate whether such ELN protocols can be used to create semantic documentation of the provenance of research data by the use of ontologies and linked data methodologies.
Methods: Based on an ELN protocol of a biomedical wet lab experiment, a retrospective provenance model of the collected research data, describing the details of the experiment in a machine-interpretable way, is manually engineered. Furthermore, an automated approach for knowledge acquisition from ELN protocols is derived from these results. This structure-based approach exploits the structure in the experiment’s description (such as headings, tables, and links) to translate the ELN protocol into a semantic knowledge representation. To satisfy the FAIR guiding principles (making data findable, accessible, interoperable, and reusable), a ready-to-publish bundle is created that contains the research data together with their semantic documentation.
Results: While the manual modelling efforts serve as proof of concept by employing one protocol, the automated structure-based approach demonstrates the potential for generalization with seven ELN protocols. For each of those protocols, a ready-to-publish bundle is created, and the SPARQL query language is used to illustrate how questions about the processes and the obtained research data can be answered.
Conclusions: The semantic documentation of research data obtained from the ELN protocols allows for the representation of the retrospective provenance of research data in a machine-interpretable way. Research Object Crate (RO-Crate) bundles including these models enable researchers to easily share the research data, including the corresponding documentation, as well as to search the experiments and relate them to each other.
Keywords: research data, provenance, knowledge acquisition, electronic laboratory notebooks, semantic documentation, RO-Crate, FAIR
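The structure-based idea described in the abstract's Methods paragraph can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the ELN protocol is available as simple HTML, treats each heading as an experimental step, each two-cell table row as a parameter of that step, and each link as a reference from that step. The `ex:` predicate names, the `ProtocolParser` class, the `extract_triples` function, and the sample protocol are all hypothetical and chosen only for illustration; a real pipeline would map these elements to ontology terms and serialize them as RDF.

```python
from html.parser import HTMLParser

HEADINGS = ("h1", "h2", "h3")


class ProtocolParser(HTMLParser):
    """Collect (subject, predicate, object) triples from headings, table rows, and links."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self.step = None   # text of the most recent heading, used as the subject
        self.tag = None    # heading or cell tag whose text is being collected
        self.buf = []      # text fragments of the element being collected
        self.row = []      # cell texts of the table row being read

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS or tag in ("td", "th"):
            self.tag, self.buf = tag, []
        elif tag == "tr":
            self.row = []
        elif tag == "a" and self.step:
            href = dict(attrs).get("href")
            if href:
                # Hypothetical predicate: a link becomes a reference of the step.
                self.triples.append((self.step, "ex:references", href))

    def handle_data(self, data):
        if self.tag:
            self.buf.append(data)

    def handle_endtag(self, tag):
        if tag == self.tag:
            text = "".join(self.buf).strip()
            if tag in HEADINGS:
                self.step = text
                self.triples.append((text, "rdf:type", "ex:Step"))
            else:
                self.row.append(text)
            self.tag = None
        elif tag == "tr" and self.step and len(self.row) == 2:
            # Hypothetical mapping: a key/value row becomes a step parameter.
            key = self.row[0].replace(" ", "_")
            self.triples.append((self.step, "ex:" + key, self.row[1]))


def extract_triples(html):
    """Return the triples found in an HTML protocol string."""
    parser = ProtocolParser()
    parser.feed(html)
    parser.close()
    return parser.triples


# Made-up protocol fragment, not taken from the article.
PROTOCOL = """
<h2>Cell seeding</h2>
<table>
  <tr><td>temperature</td><td>37 C</td></tr>
</table>
<p>See <a href="https://example.org/sop/seeding">SOP</a>.</p>
"""

triples = extract_triples(PROTOCOL)
for triple in triples:
    print(triple)
```

Plain tuples keep the sketch dependency-free. Loading the same triples into an RDF library would allow them to be queried with SPARQL, as the Results paragraph describes.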
Background
References
Notes
This presentation is faithful to the original, with only a few minor changes to formatting. In some cases important information was missing from the references, and that information was added.