Journal:Persistent identification of instruments

From LIMSWiki
Revision as of 22:36, 15 September 2020 by Shawndouglas (talk | contribs) (Saving and adding more.)
Full article title: Persistent identification of instruments
Journal: Data Science Journal
Author(s): Stocker, Markus; Darroch, Louise; Krahl, Rolf; Habermann, Ted; Devaraju, Anusuriya; Schwardmann, Ulrich; D'Onofrio, Claudio; Häggström, Ingemar
Author affiliation(s): TIB Leibniz Information Centre for Science and Technology, University of Bremen, British Oceanographic Data Centre, Helmholtz-Zentrum Berlin für Materialien und Energie, Metadata Game Changers, Gesellschaft für wissenschaftliche Datenverarbeitung Göttingen, Lund University, EISCAT Scientific Association
Primary contact (email): markus dot stocker at tib dot eu
Year published: 2020
Volume and issue: 19(1)
Article #: 18
DOI: 10.5334/dsj-2020-018
ISSN: 1683-1470
Distribution license: Creative Commons Attribution 4.0 International
Website: https://datascience.codata.org/articles/10.5334/dsj-2020-018/
Download: https://datascience.codata.org/articles/10.5334/dsj-2020-018/galley/962/download/ (PDF)

Abstract

Instruments play an essential role in creating research data. Given the importance of instruments and associated metadata to the assessment of data quality and data reuse, globally unique, persistent, and resolvable identification of instruments is crucial. The Research Data Alliance Working Group Persistent Identification of Instruments (PIDINST) developed a community-driven solution for persistent identification of instruments, which we present and discuss in this paper. Based on an analysis of 10 use cases, PIDINST developed a metadata schema and prototyped schema implementation with DataCite and ePIC as representative persistent identifier infrastructures, and with HZB (Helmholtz-Zentrum Berlin für Materialien und Energie) and the BODC (British Oceanographic Data Centre) as representative institutional instrument providers. These implementations demonstrate the viability of the proposed solution in practice. Moving forward, PIDINST will further catalyze adoption and consolidate the schema by addressing new stakeholder requirements.

Keywords: persistent identification, instruments, metadata, DOI, handle

Introduction

Between March 2018 and October 2019, the Research Data Alliance (RDA) Working Group (WG) Persistent Identification of Instruments (PIDINST) explored a community-driven solution for globally unambiguous and persistent identification of operational scientific measuring instruments. A "measuring instrument" is understood to be a “device used for making measurements, alone or in conjunction with one or more supplementary devices,” as defined by the Joint Committee for Guides in Metrology (JCGM).[1] Hence, PIDINST chose to address the problem of persistently identifying the devices themselves (i.e., each unique device), the real-world assets with instantaneous capabilities and configurations, rather than the identification of material instrument designs (i.e., models).

Instruments are employed in numerous and diverse scientific disciplines. Instruments can be static (e.g., weather station, laboratory instrument) or mobile when mounted on moving platforms (e.g., remotely operated underwater vehicles, drones). They may be used in observation or experimentation research activities. They may be owned and operated by individual researchers; research groups; national, international, or global research infrastructures; or other types of institutions. For instance, at the time of writing, the Integrated Carbon Observation System (ICOS) operates approximately 3,000 instruments at over 130 stations in 12 European countries. Astronomy is well known for its intensive use of telescopes. Those working in the life sciences employ an array of instrument types, ranging from microscopes to sequencers. The engineering sciences, too, make heavy use of instruments.

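Persistent identifiers (PIDs) have a long tradition for the globally unique identification of entities relevant to or involved in research. They were developed “to address challenges arising from the distributed and disorganised nature of the internet, which often resulted in URLs to internet endpoints becoming invalid” (Klump and Huber, 2017), making it difficult to maintain a persistent record of science. Examples of well-established persistent identifiers include digital object identifiers (DOIs)[2], ORCID identifiers for researchers[3], identifiers for physical samples such as the International Geo Sample Number (IGSN)[4], and the Research Resource Identifiers (RRIDs) of the Resource Identification Initiative.[5]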

Borgman suggested that “to interpret a digital dataset, much must be known about the hardware used to generate the data, whether sensor networks or laboratory machines.”[6] Borgman also highlights that “when questions arise […] about calibration […], they sometimes have to locate the departed student or postdoctoral fellow most closely involved.”[6] A persistent identifier for instruments would enable research data to be persistently associated with such crucial metadata, helping to set data into context. Moreover, discovering and retrieving an instrument’s metadata through resolvable identifiers aligns with the FAIR data management principles, a set of guiding principles for the management of research data and its metadata by making them findable, accessible, interoperable, and reusable.[7] Buck et al. suggested that data provenance information is fundamental to a user’s trust in data and any data products generated. They also recommended persistent identifiers for instruments as one of the next levels of data interoperability required to better understand and evaluate our oceans.[8] Thus, more broadly, FAIR metadata about instruments is critical to scientific and research endeavors.
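To make the idea of a resolvable identifier concrete, the following is a minimal sketch of how a dataset record could carry the PID of the instrument that produced it and turn that PID into a resolvable URL. It assumes only the public proxy resolvers of the DOI and Handle systems (https://doi.org/ and https://hdl.handle.net/). The `Dataset` class, its field names, and the identifier `10.1234/example-instrument` are hypothetical and are not part of the PIDINST schema.

```python
from dataclasses import dataclass
from urllib.parse import quote

# Public proxy resolvers for the DOI and Handle systems.
RESOLVERS = {
    "doi": "https://doi.org/",
    "handle": "https://hdl.handle.net/",
}


def resolver_url(pid: str, scheme: str = "doi") -> str:
    """Return the HTTP URL at which a persistent identifier resolves."""
    if scheme not in RESOLVERS:
        raise ValueError(f"unsupported identifier scheme: {scheme}")
    # Keep the slash between prefix and suffix; percent-encode unsafe characters.
    return RESOLVERS[scheme] + quote(pid.strip(), safe="/")


@dataclass(frozen=True)
class Dataset:
    """Hypothetical dataset record pointing at the instrument that produced it."""
    title: str
    instrument_pid: str  # identifies one physical device, not a model
    instrument_pid_scheme: str = "doi"

    def instrument_url(self) -> str:
        return resolver_url(self.instrument_pid, self.instrument_pid_scheme)


# "10.1234/example-instrument" is a made-up identifier for illustration only.
dataset = Dataset(
    title="Example measurement series",
    instrument_pid="10.1234/example-instrument",
)
print(dataset.instrument_url())
```

Because the record stores the identifier rather than a landing-page URL, the link from data to instrument metadata would survive a change in where that metadata is hosted, which is the property the persistence argument relies on.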

In addition to improving the FAIRness of instrument metadata, the persistent identification of instruments is also important for trusted cross-linking to valuable scientific objects, such as the research data they produce, which can be persistently identified themselves. A similar argument can be made for cross-links between instruments and literature since instruments (typically the instrument model) are generally mentioned in the literature as materials. Such cross-linking has received considerable attention in the community. The Scholix project[9] and the corresponding RDA/WDS Scholarly Link Exchange (Scholix) WG have recently proposed and implemented a common schema to standardize the exchange of information about the links between literature and data. As a result, it is now easier for a data publisher that discovers a link between data and literature to share this information, and for the publisher of the article to benefit by establishing a cross-link from literature to data. With the PID Graph[10], the FREYA Project is now generalizing the cross-linking of literature and data to other entities, including people, organizations, and funders. Arguably, it makes good sense to enrich these connections by adding instruments.
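As a rough illustration of what adding instruments to such a graph of links could enable, the sketch below stores typed links between identified objects and walks them backwards from an instrument to the articles that reference its data. The `Link` shape, the relation names `wasMeasuredBy` and `references`, and all identifiers are assumptions made for this example. They are not the Scholix schema or the PID Graph data model.

```python
from collections import defaultdict
from typing import NamedTuple


class Link(NamedTuple):
    """One directed, typed link between two identified objects (hypothetical shape)."""
    source: str
    relation: str
    target: str


# All identifiers below are made up for illustration.
links = [
    Link("doi:10.1234/dataset-a", "wasMeasuredBy", "doi:10.1234/instrument-1"),
    Link("doi:10.1234/dataset-b", "wasMeasuredBy", "doi:10.1234/instrument-1"),
    Link("doi:10.1234/article-x", "references", "doi:10.1234/dataset-a"),
]


def index_by_target(all_links):
    """Group links by target so that inbound links can be looked up directly."""
    index = defaultdict(list)
    for link in all_links:
        index[link.target].append(link)
    return index


def articles_using(instrument, all_links):
    """Follow article -> dataset -> instrument links backwards from an instrument."""
    inbound = index_by_target(all_links)
    datasets = [
        link.source
        for link in inbound[instrument]
        if link.relation == "wasMeasuredBy"
    ]
    return sorted(
        {
            link.source
            for dataset in datasets
            for link in inbound[dataset]
            if link.relation == "references"
        }
    )


print(articles_using("doi:10.1234/instrument-1", links))
```

Without an identifier for the instrument itself, the first hop in this traversal has nothing to anchor to, so the two datasets could not be recognized as coming from the same device.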


References

  1. Joint Committee for Guides in Metrology (2012). "3.1 measuring instrument". International vocabulary of metrology – Basic and general concepts and associated terms (VIM) (3rd ed.). Joint Committee for Guides in Metrology. p. 34. https://www.bipm.org/en/publications/guides/#vim. 
  2. Paskin, N. (2009). "Digital Object Identifier (DOI®) System". In Bates, M.J.; Maack, M.N. Encyclopedia of Library and Information Sciences (3rd ed.). Taylor & Francis Group. pp. 1586–92. doi:10.1081/e-elis3-120044418. 
  3. Haak, L.L.; Fenner, M.; Paglione, L. et al. (2012). "ORCID: a system to uniquely identify researchers". Learned Publishing 25 (4): 259–64. doi:10.1087/20120404. 
  4. Devaraju, A.; Klump, J.; Cox, S.J.D. et al. (2016). "Representing and publishing physical sample descriptions". Computers & Geosciences 96: 1–10. doi:10.1016/j.cageo.2016.07.018. 
  5. Bandrowski, A.; Brush, M.; Grethe, J.S. et al. (2015). "The Resource Identification Initiative: A cultural shift in publishing (Version 2, Peer review 2 approved)". F1000Research 4: 134. doi:10.12688/f1000research.6555.2. 
  6. Borgman, C.L. (2015). Big Data, Little Data, No Data: Scholarship in the Networked World. MIT Press. doi:10.7551/mitpress/9963.001.0001. ISBN 9780262327862. 
  7. Wilkinson, M.D.; Dumontier, M.; Aalbersberg, I.J. et al. (2016). "The FAIR Guiding Principles for scientific data management and stewardship". Scientific Data 3: 160018. doi:10.1038/sdata.2016.18. PMC 4792175. PMID 26978244. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4792175. 
  8. Buck, J.J.H.; Bainbridge, S.J.; Burger, E.F. et al. (2019). "Ocean Data Product Integration Through Innovation-The Next Level of Data Interoperability". Frontiers in Marine Science 6: 32. doi:10.3389/fmars.2019.00032. 
  9. Burton, A.; Koers, H.; Manghi, P. et al. (2017). "The Scholix Framework for Interoperability in Data-Literature Information Exchange". D-Lib Magazine 23 (1–2). doi:10.1045/january2017-burton. 
  10. Fenner, M.; Aryani, A. (28 March 2019). "Introducing the PID Graph". DataCite Blog. doi:10.5438/jwvf-8a66. https://blog.datacite.org/introducing-the-pid-graph/. 

Notes

This presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. The original article lists references in alphabetical order; however, this version lists them in order of appearance, by design. All footnotes—which are simply URLs—from the original article were turned into either external links or full citations for this version.