[Figure: FAIR resources graphic, Australian Research Data Commons, 2018]

Title: What are the potential implications of the FAIR data principles to laboratory informatics applications?

Author for citation: Shawn E. Douglas

License for content: Creative Commons Attribution-ShareAlike 4.0 International

Publication date: May 2024

Introduction

https://www.limswiki.org/index.php/Journal:Infrastructure_tools_to_support_an_effective_radiation_oncology_learning_health_system

This brief topical article will examine the potential implications of the FAIR data principles for laboratory informatics applications.

The "FAIR-ification" of research objects and software

First discussed during a 2014 FORCE-11 workshop dedicated to "overcoming data discovery and reuse obstacles," the FAIR data principles were published by Wilkinson et al. in 2016 as a stakeholder collaboration driven to see research "objects" (i.e., research data and information of all shapes and formats) become more universally findable, accessible, interoperable, and reusable (FAIR) by both machines and people.^[1] The authors released the FAIR principles while recognizing that "one of the grand challenges of data-intensive science ... is to improve knowledge discovery through assisting both humans and their computational agents in the discovery of, access to, and integration and analysis of task-appropriate scientific data and other scholarly digital objects."^[1]

Since 2016, other research stakeholders have published their thoughts about how the FAIR principles apply to their fields of study and practice^[2], including in ways beyond what Wilkinson et al. perhaps originally imagined. For example, multiple authors have examined whether the software used in scientific endeavors can itself be considered a research object worth developing and managing in tandem with the FAIR data principles.^[3]^[4]^[5]^[6]^[7] Researchers quickly recognized that any planning around updating processes and systems to make research objects more FAIR would have to be tailored to specific research contexts, recognize that digital research objects go beyond data and information, and recognize "the specific nature of software" rather than consider it "just data."^[4] The end result has been that the core concepts of FAIR are applied to research software differently than they are to data: because research software is more than just data, making it FAIR requires more nuance and a different type of planning than applying FAIR to digital data and information.

A 2019 survey by Europe's FAIRsFAIR found that researchers seeking and re-using relevant research software on the internet faced multiple challenges, including understanding and/or maintaining the necessary software environment and its dependencies, finding sufficient documentation, struggling with accessibility and licensing issues, having the time and skills to install and/or use the software, finding quality control of the source code lacking, and having an insufficient (or non-existent) software sustainability and management plan.^[4] These challenges highlight the importance of software to researchers and other stakeholders, and the role FAIR has in better ensuring such software is findable, interoperable, and reusable, which in turn better ensures researchers' software-driven research is repeatable (by the same research team, with the same experimental setup), reproducible (by a different research team, with the same experimental setup), and replicable (by a different research team, with a different experimental setup).^[4]
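One of the challenges named above, understanding the software environment and its dependencies, can be partly eased by recording that environment alongside the research output. The following is a minimal Python sketch of that idea, not a method described by the cited authors. The function name `capture_environment` and the shape of the resulting dictionary are assumptions made for illustration; the sketch relies only on the Python standard library.

```python
import json
import platform
from importlib import metadata


def capture_environment():
    """Describe the interpreter, operating system, and installed packages."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if not name:
            continue  # skip distributions whose metadata is incomplete
        packages[name] = dist.metadata.get("Version") or "unknown"
    return {
        "python_version": platform.python_version(),
        "implementation": platform.python_implementation(),
        "operating_system": platform.platform(),
        "machine": platform.machine(),
        # Sorted by name so that two manifests are easy to compare.
        "packages": [
            {"name": name, "version": packages[name]}
            for name in sorted(packages, key=str.lower)
        ],
    }


if __name__ == "__main__":
    # Print the manifest as JSON so it can be stored next to the results.
    print(json.dumps(capture_environment(), indent=2))
```

Storing such a manifest with each analysis run gives a later team a starting point for rebuilding the same setup, though it does not capture hardware details or system libraries outside of Python.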

At this point, the topic of what "research software" represents must be addressed further, and, unsurprisingly, it's not straightforward. Ask 20 researchers what "research software" is, and you may get 20 different opinions. Some definitions can be more objectively viewed as too narrow, while others may be viewed as too broad, with some level of controversy inherent in any mutual discussion.^[8]^[9]^[10] In 2021, as part of the FAIRsFAIR initiative, Gruenpeter et al. made a good-faith effort to define "research software" with the feedback of multiple stakeholders. Their efforts resulted in this definition^[8]:

Research software includes source code files, algorithms, scripts, computational workflows, and executables that were created during the research process, or for a research purpose. Software components (e.g., operating systems, libraries, dependencies, packages, scripts, etc.) that are used for research but were not created during, or with a clear research intent, should be considered "software [used] in research" and not research software. This differentiation may vary between disciplines. The minimal requirement for achieving computational reproducibility is that all the computational components (i.e., research software, software used in research, documentation, and hardware) used during the research are identified, described, and made accessible to the extent that is possible.

Note that while the definition primarily recognizes software created during the research process, software created (whether by the research group, other open-source software developers outside the organization, or even commercial software developers) "for a research purpose" outside the actual research process is also recognized as research software. This notably can lead to disagreement about whether a proprietary, commercial spreadsheet or laboratory information management system (LIMS) offering that conducts analyses and visualizations of research data can genuinely be called research software, or simply classified as software used in research. van Nieuwpoort and Katz further elaborated on this concept, at least indirectly, by formally defining the roles of research software in 2023. Their definition of the various roles of research software—without using terms such as "open-source," "commercial," or "proprietary"—essentially further defined what research software is^[10]:

Research software is a component of our instruments.
Research software is the instrument.
Research software analyzes research data.
Research software presents research results.
Research software assembles or integrates existing components into a working whole.
Research software is infrastructure or an underlying tool.
Research software facilitates distinctively research-oriented collaboration.

When considering these definitions^[8]^[10] of research software and their adoption by other entities^[11], it would appear that at least in part some laboratory informatics software—whether open-source or commercially proprietary—fills these roles in academic, military, and industry research laboratories of many types. In particular, electronic laboratory notebooks (ELNs) like open-source Jupyter Notebook or proprietary ELNs from commercial software developers fill the role of analyzing and visualizing research data, including developing molecular models for new promising research routes.^[10] Even more advanced LIMS solutions that go beyond simply collating, auditing, securing, and reporting analytical results could conceivably fall under the umbrella of research software, particularly if many of the analytical, integration, and collaboration tools required in modern research facilities are included in the LIMS.

Ultimately, assuming that some laboratory informatics software can be considered research software and not just "software used in research," it's tough not to arrive at some deeper implications of research organizations' increasing need for FAIR data objects and software, particularly for laboratory informatics software and the developers of it.

Implications of the FAIR concept to laboratory informatics software

The global FAIR initiative affects, and even benefits, commercial laboratory informatics research software developers as much as it does academic and institutional ones

To be clear, there is undoubtedly a difference in the software development approach of "homegrown" research software by academics and institutions, and the more streamlined and experienced approach of commercial software development houses as applied to research software. Moynihan of Invenia Technical Computing described the difference in software development approaches thusly in 2020, while discussing the concept of "research software engineering"^[12]:

Since the environment and incentives around building academic research software are very different to those of industry, the workflows around the former are, in general, not guided by the same engineering practices that are valued in the latter. That is to say: there is a difference between what is important in writing software for research, and for a user-focused software product. Academic research software prioritizes scientific correctness and flexibility to experiment above all else in pursuit of the researchers’ end product: published papers. Industry software, on the other hand, prioritizes maintainability, robustness, and testing, as the software (generally speaking) is the product. However, the two tracks share many common goals as well, such as catering to “users” [and] emphasizing performance and reproducibility, but most importantly both ventures are collaborative. Arguably then, both sets of principles are needed to write and maintain high-quality research software.

This brings us to our first point: applying small-scale, FAIR-driven academic research software engineering practices to the development of commercial laboratory informatics software, and conversely applying commercial-scale development practices to small, FAIR-focused academic and institutional research software engineering efforts, has the potential to better support all research laboratories, whether they use independently developed or commercial research software.

The concept of the research software engineer (RSE) began to take full form in 2012, and since then universities and institutions of many types have formally developed their own RSE groups and academic programs.^[13]^[14]^[15] RSEs range from pure software developers with little knowledge of a given research discipline, to scientific researchers just beginning to learn how to develop software for their research project(s). While in the past, broadly speaking, researchers often cobbled together research software with less a focus on quality and reproducibility and more on getting their research published, today's push for FAIR data and software by academic journals, institutions, and other researchers seeking to collaborate has placed a much greater focus on the concept of "better software, better research."^[13]^[16] Elaborating on that concept, Cohen et al. add that "ultimately, good research software can make the difference between valid, sustainable, reproducible research outputs and short-lived, potentially unreliable or erroneous outputs."^[16]

The concept of software quality management (SQM) has traditionally not been lost on professional, commercial software development businesses. Good SQM practices have been less prevalent in homegrown research software development; however, the expanded adoption of FAIR data and FAIR software approaches has shifted the focus onto the repeatability, reproducibility, and interoperability of research results and data produced by more sustainable research software. The adoption of FAIR by academic and institutional research labs not only brings commercial SQM and other software development approaches into their workflow, but also gives commercial laboratory informatics software developers an opportunity to embrace many aspects of the FAIR approach to laboratory research practices, including lessons learned and development practices from the growing number of RSEs. This doesn't mean commercial developers are going to suddenly take an open-source approach to their code, and it doesn't mean academic and institutional research labs are going to give up the benefits of the open-source paradigm as applied to research software.^[17] However, as Moynihan noted, both research software development paradigms stand to gain from the shift to more FAIR data and software.^[12] Additionally, if commercial laboratory informatics vendors want to continue to competitively market relevant and sustainable research software to research labs, they have little choice but to commit extra resources to learning how the FAIR principles apply to their offerings tailored to those labs.

The focus on data types and metadata within the scope of FAIR is shifting how laboratory informatics software developers and RSEs make their research software and choose their database approaches

Close to the core of any deep discussion of the FAIR data principles are the concepts of data models, data types, metadata, and persistent unique identifiers (PIDs). Making research objects more findable, accessible, interoperable, and reusable is no easy task when data types and approaches to metadata assignment (if there even is such an approach) are widely differing and inconsistent. Metadata is a means for better storing and characterizing research objects for the purposes of ensuring provenance and reproducibility of those research objects.^[18]^[19] This means as early as possible implementing a software-based approach that is FAIR-driven, capturing FAIR metadata using flexible domain-driven ontologies (i.e., controlled vocabularies) at the source and cleaning up old research objects that aren't FAIR-ready while also limiting hindrances to research processes as much as possible.^[19] And that approach must value the importance of metadata and PIDs. As Weigel et al. note in a discussion on making laboratory data and workflows more machine-findable: "Metadata capture must be highly automated and reliable, both in terms of technical reliability and ensured metadata quality. This requires an approach that may be very different from established procedures."^[20] Enter non-relational RDF knowledge graph databases.
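The idea of capturing metadata automatically at the source, with a controlled vocabulary and an identifier assigned from the start, can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration: the function `capture_metadata`, the field names, and the tiny `ALLOWED_SAMPLE_TYPES` vocabulary are hypothetical, and the random `urn:uuid` value merely stands in for a PID that a real system would obtain from a registration service.

```python
import hashlib
import uuid
from datetime import datetime, timezone

# Hypothetical controlled vocabulary; a real lab would use a domain ontology.
ALLOWED_SAMPLE_TYPES = {"serum", "plasma", "tissue", "soil"}


def capture_metadata(raw_bytes, sample_type, instrument_id):
    """Build a metadata record for a newly created research object."""
    if sample_type not in ALLOWED_SAMPLE_TYPES:
        # Rejecting free-text terms keeps the metadata consistent.
        raise ValueError(f"Unknown sample type: {sample_type!r}")
    return {
        # Stand-in identifier (assumption): not a registered PID.
        "identifier": f"urn:uuid:{uuid.uuid4()}",
        "created": datetime.now(timezone.utc).isoformat(),
        # The checksum lets others verify they hold the same bytes.
        "sha256": hashlib.sha256(raw_bytes).hexdigest(),
        "size_bytes": len(raw_bytes),
        "sample_type": sample_type,
        "instrument_id": instrument_id,
    }


if __name__ == "__main__":
    record = capture_metadata(b"1.0,2.0", "serum", "HPLC-01")
    print(record["identifier"], record["sample_type"])
```

Because the record is produced by the software rather than typed in by a person, the researcher's workflow is not interrupted, and every object receives the same set of fields.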

This brings us to our second point: given the importance of metadata and PIDs to FAIRifying research objects (and even research software), established, more traditional research software development methods using common relational databases may not be enough, even for commercial laboratory informatics software developers. Non-relational Resource Description Framework (RDF) knowledge graph databases used in FAIR-driven, well-designed laboratory informatics software help make research objects more FAIR for all research labs.

Research objects can take many forms (i.e., data types), making the storage and management of those objects challenging, particularly in research settings with great diversity of data, as with materials research. Some have approached this challenge by combining different database and systems technologies that are best suited for each data type.^[21] However, while query performance and storage footprint improve with this approach, data across the different storage mechanisms typically remains unlinked and non-compliant with FAIR principles. Here, either a full RDF knowledge graph database or a similar integration layer is required to make the research objects more interoperable and reusable, whether they are materials records or specimen data.^[21]^[22]

It is beyond the scope of this Q&A article to discuss RDF knowledge graph databases at length. (For a deeper dive on this topic, see Rocca-Serra et al. and the FAIR Cookbook.^[23]) However, know that the primary strength of these databases to FAIRification of research objects is their ability to provide semantic transparency (i.e., provide a framework for better understanding and reusing the greater research object through basic examination of the relationships of its associated metadata and their constituents), making these objects more easily accessible, interoperable, and machine-readable.^[21] The resulting knowledge graphs, with their "subject-property-object" syntax and PIDs or uniform resource identifiers (URIs) helping to link data, metadata, ontology classes, and more, can be interpreted, searched, and linked by machines, and made human-readable, resulting in better research through derivation of new knowledge from the existing research objects. The end result is a representation of heterogeneous data and metadata that complies with the FAIR guiding principles.^[21]^[22]^[23]^[24]^[25]^[26] This concept can even be extended to post factum visualizations of the knowledge graph data^[25], as well as the FAIR management of computational laboratory workflows.^[27]
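To make the "subject-property-object" structure more concrete, here is a minimal Python sketch that models a handful of triples as plain tuples and follows the links between them. It is a toy stand-in for illustration only: a real deployment would use an RDF store and a query language such as SPARQL. All URIs under `https://example.org/` and the property names (`hasType`, `derivedFrom`, and so on) are hypothetical.

```python
# Each fact is a (subject, property, object) triple. All identifiers below
# are hypothetical example URIs, not real PIDs.
EX = "https://example.org/"

triples = {
    (EX + "sample/42", EX + "vocab/hasType", EX + "vocab/Serum"),
    (EX + "sample/42", EX + "vocab/measuredBy", EX + "instrument/hplc-01"),
    (EX + "result/7", EX + "vocab/derivedFrom", EX + "sample/42"),
    (EX + "result/7", EX + "vocab/hasValue", "4.2"),
}


def match(store, subject=None, prop=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (subject is None or t[0] == subject)
        and (prop is None or t[1] == prop)
        and (obj is None or t[2] == obj)
    )


def results_for_sample_type(store, sample_type):
    """Follow two links: result -> derivedFrom -> sample -> hasType -> type."""
    samples = {
        s for s, _, _ in match(store, prop=EX + "vocab/hasType", obj=sample_type)
    }
    return sorted(
        s for s, _, o in match(store, prop=EX + "vocab/derivedFrom")
        if o in samples
    )


if __name__ == "__main__":
    print(results_for_sample_type(triples, EX + "vocab/Serum"))
```

The second function shows why linked identifiers matter: the result never states its sample type directly, yet a machine can derive it by traversing the relationships.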

While rare, some commercial laboratory informatics vendors like Semaphore Solutions have already recognized the potential of RDF knowledge graph databases to FAIR-driven laboratory research, having implemented such structures into their offerings.^[24] (The use of knowledge graphs has already been demonstrated in academic research software, such as with the ELN tools developed by RSEs at the University of Rostock and University of Amsterdam.^[28]) As noted in the prior point, it is potentially advantageous to not only laboratory informatics vendors to provide but also research labs to use relevant and sustainable research software that has the FAIR principles embedded in the software's design. Turning to knowledge graph databases is another example of keeping such software relevant and FAIR to research labs.

Applying FAIR-driven metadata schemes to laboratory informatics software development gives data a FAIRer chance at being ready for machine learning and artificial intelligence applications

The third and final point for this Q&A article highlights another positive consequence of engineering laboratory informatics software with FAIR in mind: FAIRified research objects are much closer to being usable for the trending inclusion of machine learning (ML) and artificial intelligence (AI) tools in laboratory informatics platforms and other companion research software. By developing laboratory informatics software with a focus on FAIR-driven metadata and database schemes, not only are research objects more FAIR but also "cleaner" and more machine-ready for advanced analytical uses as with ML and AI.

To be sure, the FAIRness of any structured dataset alone is not enough to make it ready for ML and AI applications. Factors such as classification, completeness, context, correctness, duplicity, integrity, mislabeling, outliers, relevancy, sample size, and timeliness of the research object and its contents are also important to consider.^[29]^[30] When those factors aren't appropriately addressed as part of a FAIRification effort towards AI readiness (as well as part of the development of research software of all types), research data and metadata have a higher likelihood of revealing themselves to be inconsistent. As such, searches and analytics using that data and metadata become muddled, and the ultimate ML or AI output will also be muddled (i.e., "garbage in, garbage out"). Whether retroactively updating existing research objects to a more FAIRified state or ensuring research objects (e.g., those originating in an ELN or LIMS) are more FAIR and AI-ready from the start, research software updating or generating those research objects has to address ontologies, data models, data types, identifiers, and more in a thorough yet flexible way.^[31]
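A few of the factors listed above (completeness, duplication, and outliers) lend themselves to automated checks before a dataset is handed to an ML or AI tool. The Python sketch below is one possible approach under stated assumptions, not a procedure taken from the cited sources. The function name `quality_report`, the record layout, and the choice of a modified z-score based on the median absolute deviation (with the commonly used 0.6745 constant and 3.5 cutoff) are all illustrative choices.

```python
from statistics import median


def quality_report(records, required_fields, value_field, cutoff=3.5):
    """Return indices of records that look incomplete, duplicated, or outlying."""
    # Completeness: a required field is absent, None, or an empty string.
    missing = [
        i for i, r in enumerate(records)
        if any(r.get(f) in (None, "") for f in required_fields)
    ]

    # Duplication: an identical record has already been seen.
    seen, duplicates = set(), []
    for i, r in enumerate(records):
        key = tuple(sorted(r.items()))
        if key in seen:
            duplicates.append(i)
        else:
            seen.add(key)

    # Outliers: modified z-score using the median absolute deviation (MAD).
    numeric = [
        (i, r[value_field]) for i, r in enumerate(records)
        if isinstance(r.get(value_field), (int, float))
    ]
    outliers = []
    if numeric:
        center = median(v for _, v in numeric)
        mad = median(abs(v - center) for _, v in numeric)
        if mad > 0:  # with MAD of zero the score is undefined, so skip
            outliers = [
                i for i, v in numeric
                if 0.6745 * abs(v - center) / mad > cutoff
            ]
    return {"missing": missing, "duplicates": duplicates, "outliers": outliers}


if __name__ == "__main__":
    demo = [
        {"id": "a", "unit": "mg/L", "value": 5.0},
        {"id": "b", "unit": "", "value": 5.1},
        {"id": "c", "unit": "mg/L", "value": 50.0},
    ]
    print(quality_report(demo, ["id", "unit"], "value"))
```

Checks like these address only part of AI readiness. Factors such as context, relevancy, and mislabeling still require domain knowledge and well-chosen ontologies.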

Noting that Wilkinson et al. originally highlighted the importance of machine-readability of FAIR data, Huerta et al. add that this core principle of FAIRness "is synergistic with the rapid adoption and increased use of AI in research."^[32] They go on to discuss the positive interactions of FAIR research objects with FAIR-driven, AI-based research. The benefits include^[32]:

greater findability of FAIR research objects for further AI-driven scientific discovery;
greater reproducibility of FAIR research objects and any AI models published with them;
improved generalization of AI-driven medical research models when exposed to diverse and FAIR research objects;
improved reporting of AI-driven research results using FAIRified research objects, lending further credibility to those results;
more uniform comparison of AI models using well-defined hyperstructure and information training conditions from FAIRified research objects;
more developed and interoperable "data e-infrastructure," which can further drive a more effective "AI services layer";
reduced bias in AI-driven processes through the use of FAIR research objects and AI models; and
improved surety of scientific correctness where reproducibility in AI-driven research can't be guaranteed.

In the end, developers of research software (whether discipline-specific research software or broader laboratory informatics solutions) would be advised to keep in mind the growing trends of FAIR research, FAIR software, and ML- and AI-driven research, especially in the life sciences, but also a variety of other fields.^[32]

Restricted clinical data and its FAIRification for greater research innovation

Broader discussion continues in the research community regarding how best to ethically make restricted or privacy-protected clinical data and information FAIR for greater innovation and, by extension, improved patient outcomes, particularly in the wake of the COVID-19 pandemic.^[33]^[34]^[35] (Note that while there are other types of restricted and privacy-protected data, this section will focus largely on clinical data and research objects as the most obvious type.)

These efforts have usually revolved around pulling reusable clinical patient or research data from hospital information systems (HIS), electronic medical records (EMRs), clinical trial management systems (CTMSs), and research databases (often relational in nature) that either contain de-identified data or can de-identify aspects of data and information before access and extraction. Sometimes that clinical data or research object may have already in part been FAIRified, but often it may not be. In all cases, the concepts of privacy, security, and anonymization come up as part of any desire to gain access to that clinical material. However, any FAIRified clinical data isn't necessarily readily open for access. As Snoeijer et al. note: "The authors of the FAIR principles, however, clearly indicate that 'accessible' does not mean open. It means that clarity and transparency is required around the conditions governing access and reuse."^[36]

This is being mentioned in the context of laboratory informatics applications for a couple of reasons. First, a well-designed commercial LIMS that supports clinical research laboratory workflows is already going to address privacy and security aspects, as part of the developer recognizing the need for those labs to adhere to regulations such as the Health Insurance Portability and Accountability Act (HIPAA) and comply with standards such as ISO 15189. However, such a system may not have been developed with FAIR data principles in mind, and any built-in metadata and ontology schemes may be insufficient for full FAIRification of laboratory-based clinical trial research objects. As Queralt-Rosinach et al. note, however, "interestingly, ontologies may also be used to describe data access restrictions to complement FAIR metadata with information that supports data safety and patient privacy."^[34] Essentially, the authors are suggesting that while a HIS or LIS may have built-in access management tools, setting up ontologies and metadata mechanisms that link privacy aspects of a research object (e.g., "has consent form for," "is de-identified," etc.) to the object's metadata allows for even more flexible, FAIR-driven approaches to privacy and security. Research software developers creating such information management tools for the regulated clinical research space may want to apply FAIR concepts such as this to how access control and privacy restrictions are managed. This will inevitably mean any research objects exported with machine-readable privacy-concerning metadata will be more reusable in a way that still "supports data safety and patient privacy."^[34]
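The idea of linking privacy aspects to a research object's metadata can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not an implementation of any published ontology: the property names `is_de_identified` and `has_consent_for` simply echo the examples in the paragraph above, and the identifiers and purposes are hypothetical.

```python
# Hypothetical privacy properties, loosely modelled on the examples in the
# text ("is de-identified", "has consent form for"). They are not taken
# from ICO, DUO, or ODRL.
def is_exportable(research_object, purpose):
    """Allow export only for de-identified objects with consent for `purpose`."""
    privacy = research_object.get("privacy", {})
    return (
        privacy.get("is_de_identified") is True
        and purpose in privacy.get("has_consent_for", ())
    )


def export_for_purpose(research_objects, purpose):
    """Return exportable objects, keeping their privacy metadata attached."""
    return [obj for obj in research_objects if is_exportable(obj, purpose)]


records = [
    {"identifier": "urn:example:obj-1",
     "privacy": {"is_de_identified": True,
                 "has_consent_for": ["oncology research"]}},
    {"identifier": "urn:example:obj-2",
     "privacy": {"is_de_identified": False,
                 "has_consent_for": ["oncology research"]}},
    {"identifier": "urn:example:obj-3",
     "privacy": {"is_de_identified": True, "has_consent_for": []}},
]

if __name__ == "__main__":
    for obj in export_for_purpose(records, "oncology research"):
        print(obj["identifier"])
```

Because the privacy metadata travels with each exported object, a downstream system can apply the same checks again rather than relying on the access controls of the originating LIMS.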

Second, a well-designed research software solution working with clinical data will provide not only support for open, community-supported data models and vocabularies for clinical data, but also standardized community-driven ontologies that are specifically developed for access control and privacy. Queralt-Rosinach et al. continue^[34]:

Also, very important for accessibility and data privacy is that the digital objects per se can accommodate the criteria and protocols necessary to comply with regulatory and governance frameworks. Ontologies can aid in opening and protecting patient data by exposing logical definitions of data use conditions. Indeed, there are ontologies to define access and reuse conditions for patient data such as the Informed Consent Ontology (ICO), the Global Alliance for Genomics and Health Data Use Ontology (DUO) standard, and the Open Digital Rights Language (ODRL) vocabulary recommended by W3C.

Also of note here is the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) and its OHDSI standardized vocabularies. In all these cases, a developer-driven approach to research software that incorporates community-driven standards that support FAIR principles is welcome. However, as Maxwell et al. noted in their Lancet review article in late 2023, "few platforms or registries applied community-developed standards for participant-level data, further restricting the interoperability of ... data-sharing initiatives [like FAIR]."^[33] As the FAIR principles continue to gain ground in clinical research and diagnostics settings, software developers will need to be more attuned to translating old ways of development to ones that incorporate FAIR data and software principles. Demand for FAIR data will only continue to grow, and any efforts to improve interoperability and reusability while honoring (and enhancing) privacy and security aspects of restricted data will be appreciated by clinical researchers. However, just as FAIR is not an overall goal for researchers, software built with FAIR principles in mind is not the end point for research organizations managing restricted and privacy-protected research objects. Ultimately, those organizations will have to make other considerations about restricted data in the scope of FAIR, including addressing data management plans, data use agreements, disclosure review practices, and training as it applies to their research software and generated research objects.^[37]

Conclusion

Laboratory informatics developers will also need to remember that FAIRification of research in itself is not a goal for research laboratories; it is a continual process that recognizes improved scientific research and greater innovation as a more likely outcome.^[1]^[31]^[32]

References

1. Wilkinson, Mark D.; Dumontier, Michel; Aalbersberg, IJsbrand Jan; Appleton, Gabrielle; Axton, Myles; Baak, Arie; Blomberg, Niklas; Boiten, Jan-Willem et al. (15 March 2016). "The FAIR Guiding Principles for scientific data management and stewardship" (in en). Scientific Data 3 (1): 160018. doi:10.1038/sdata.2016.18. ISSN 2052-4463. PMC4792175. PMID 26978244. https://www.nature.com/articles/sdata201618.
2. "fair data principles". PubMed Search. National Institutes of Health, National Library of Medicine. https://pubmed.ncbi.nlm.nih.gov/?term=fair+data+principles. Retrieved 30 April 2024.
3. Hasselbring, Wilhelm; Carr, Leslie; Hettrick, Simon; Packer, Heather; Tiropanis, Thanassis (25 February 2020). "From FAIR research data toward FAIR and open research software" (in en). it - Information Technology 62 (1): 39–47. doi:10.1515/itit-2019-0040. ISSN 2196-7032. https://www.degruyter.com/document/doi/10.1515/itit-2019-0040/html.
4. Gruenpeter, M. (23 November 2020). "FAIR + Software: Decoding the principles" (PDF). FAIRsFAIR “Fostering FAIR Data Practices In Europe”. https://www.fairsfair.eu/sites/default/files/FAIR%20%2B%20software.pdf. Retrieved 30 April 2024.
5. Barker, Michelle; Chue Hong, Neil P.; Katz, Daniel S.; Lamprecht, Anna-Lena; Martinez-Ortiz, Carlos; Psomopoulos, Fotis; Harrow, Jennifer; Castro, Leyla Jael et al. (14 October 2022). "Introducing the FAIR Principles for research software" (in en). Scientific Data 9 (1): 622. doi:10.1038/s41597-022-01710-x. ISSN 2052-4463. PMC9562067. PMID 36241754. https://www.nature.com/articles/s41597-022-01710-x.
6. Patel, Bhavesh; Soundarajan, Sanjay; Ménager, Hervé; Hu, Zicheng (23 August 2023). "Making Biomedical Research Software FAIR: Actionable Step-by-step Guidelines with a User-support Tool" (in en). Scientific Data 10 (1): 557. doi:10.1038/s41597-023-02463-x. ISSN 2052-4463. PMC10447492. PMID 37612312. https://www.nature.com/articles/s41597-023-02463-x.
7. Du, Xinsong; Dastmalchi, Farhad; Ye, Hao; Garrett, Timothy J.; Diller, Matthew A.; Liu, Mei; Hogan, William R.; Brochhausen, Mathias et al. (6 February 2023). "Evaluating LC-HRMS metabolomics data processing software using FAIR principles for research software" (in en). Metabolomics 19 (2): 11. doi:10.1007/s11306-023-01974-3. ISSN 1573-3890. https://link.springer.com/10.1007/s11306-023-01974-3.
8. Gruenpeter, Morane; Katz, Daniel S.; Lamprecht, Anna-Lena; Honeyman, Tom; Garijo, Daniel; Struck, Alexander; Niehues, Anna; Martinez, Paula Andrea et al. (13 September 2021). "Defining Research Software: a controversial discussion". Zenodo. doi:10.5281/zenodo.5504016. https://zenodo.org/record/5504016.
9. "What is Research Software?". JuRSE, the Community of Practice for Research Software Engineering. Forschungszentrum Jülich. 13 February 2024. https://www.fz-juelich.de/en/rse/about-rse/what-is-research-software. Retrieved 30 April 2024.
10. van Nieuwpoort, Rob; Katz, Daniel S. (14 March 2023) (in en). Defining the roles of research software. doi:10.54900/9akm9y5-5ject5y. https://upstream.force11.org/defining-the-roles-of-research-software.
11. "Open source software and code". F1000 Research Ltd. 2024. https://www.f1000.com/resources-for-researchers/open-research/open-source-software-code/. Retrieved 30 April 2024.
12. Moynihan, G. (7 July 2020). "The Hitchhiker’s Guide to Research Software Engineering: From PhD to RSE". Invenia Blog. Invenia Technical Computing Corporation. https://invenia.github.io/blog/2020/07/07/software-engineering/.
13. Woolston, Chris (31 May 2022). "Why science needs more research software engineers" (in en). Nature: d41586-022-01516-2. doi:10.1038/d41586-022-01516-2. ISSN 0028-0836. https://www.nature.com/articles/d41586-022-01516-2.
14. "RSE@KIT". Karlsruhe Institute of Technology. 20 February 2024. https://www.rse-community.kit.edu/index.php. Retrieved 01 May 2024.
15. "Purdue Center for Research Software Engineering". Purdue University. 2024. https://www.rcac.purdue.edu/rse. Retrieved 01 May 2024.
16. Cohen, Jeremy; Katz, Daniel S.; Barker, Michelle; Chue Hong, Neil; Haines, Robert; Jay, Caroline (1 January 2021). "The Four Pillars of Research Software Engineering". IEEE Software 38 (1): 97–105. doi:10.1109/MS.2020.2973362. ISSN 0740-7459. https://ieeexplore.ieee.org/document/8994167/.
17. Hasselbring, Wilhelm; Carr, Leslie; Hettrick, Simon; Packer, Heather; Tiropanis, Thanassis (25 February 2020). "From FAIR research data toward FAIR and open research software" (in en). it - Information Technology 62 (1): 39–47. doi:10.1515/itit-2019-0040. ISSN 2196-7032. https://www.degruyter.com/document/doi/10.1515/itit-2019-0040/html.
18. Ghiringhelli, Luca M.; Baldauf, Carsten; Bereau, Tristan; Brockhauser, Sandor; Carbogno, Christian; Chamanara, Javad; Cozzini, Stefano; Curtarolo, Stefano et al. (14 September 2023). "Shared metadata for data-centric materials science" (in en). Scientific Data 10 (1): 626. doi:10.1038/s41597-023-02501-8. ISSN 2052-4463. PMC10502089. PMID 37709811. https://www.nature.com/articles/s41597-023-02501-8.
19. Fitschen, Timm; tom Wörden, Henrik; Schlemmer, Alexander; Spreckelsen, Florian; Hornung, Daniel (12 October 2022). "Agile Research Data Management with FDOs using LinkAhead". Research Ideas and Outcomes 8: e96075. doi:10.3897/rio.8.e96075. ISSN 2367-7163. https://riojournal.com/article/96075/.
20. Weigel, Tobias; Schwardmann, Ulrich; Klump, Jens; Bendoukha, Sofiane; Quick, Robert (1 January 2020). "Making Data and Workflows Findable for Machines" (in en). Data Intelligence 2 (1-2): 40–46. doi:10.1162/dint_a_00026. ISSN 2641-435X. https://direct.mit.edu/dint/article/2/1-2/40-46/9994.
21. Aggour, Kareem S.; Kumar, Vijay S.; Gupta, Vipul K.; Gabaldon, Alfredo; Cuddihy, Paul; Mulwad, Varish (9 April 2024). "Semantics-Enabled Data Federation: Bringing Materials Scientists Closer to FAIR Data" (in en). Integrating Materials and Manufacturing Innovation. doi:10.1007/s40192-024-00348-4. ISSN 2193-9764. https://link.springer.com/10.1007/s40192-024-00348-4.
22. Grobe, Peter; Baum, Roman; Bhatty, Philipp; Köhler, Christian; Meid, Sandra; Quast, Björn; Vogt, Lars (26 June 2019). "From Data to Knowledge: A semantic knowledge graph application for curating specimen data" (in en). Biodiversity Information Science and Standards 3: e37412. doi:10.3897/biss.3.37412. ISSN 2535-0897. https://biss.pensoft.net/article/37412/.
23. Rocca-Serra, Philippe; Sansone, Susanna-Assunta; Gu, Wei; Welter, Danielle; Abbassi Daloii, Tooba; Portell-Silva, Laura (30 June 2022). "FAIR and Knowledge graphs". D2.1 FAIR Cookbook. doi:10.5281/ZENODO.6783564. https://zenodo.org/record/6783564.
24. Tomlinson, E. (28 July 2023). "RDF Knowledge Graph Databases: A Better Choice for Life Science Lab Software" (PDF). Semaphore Solutions, Inc. https://21624527.fs1.hubspotusercontent-na1.net/hubfs/21624527/Resources/RDF%20Knowledge%20Graph%20Databases%20White%20Paper.pdf. Retrieved 01 May 2024.
25. Deagen, Michael E.; McCusker, Jamie P.; Fateye, Tolulomo; Stouffer, Samuel; Brinson, L. Cate; McGuinness, Deborah L.; Schadler, Linda S. (27 May 2022). "FAIR and Interactive Data Graphics from a Scientific Knowledge Graph" (in en). Scientific Data 9 (1): 239. doi:10.1038/s41597-022-01352-z. ISSN 2052-4463. PMC9142568. PMID 35624233. https://www.nature.com/articles/s41597-022-01352-z.
↑ Brandizi, Marco; Singh, Ajit; Rawlings, Christopher; Hassani-Pak, Keywan (25 September 2018). "Towards FAIRer Biological Knowledge Networks Using a Hybrid Linked Data and Graph Database Approach" (in en). Journal of Integrative Bioinformatics 15 (3): 20180023. doi:10.1515/jib-2018-0023. ISSN 1613-4516. PMC PMC6340125. PMID 30085931. https://www.degruyter.com/document/doi/10.1515/jib-2018-0023/html.
↑ de Visser, Casper; Johansson, Lennart F.; Kulkarni, Purva; Mei, Hailiang; Neerincx, Pieter; Joeri van der Velde, K.; Horvatovich, Péter; van Gool, Alain J. et al. (28 September 2023). Palagi, Patricia M.. ed. "Ten quick tips for building FAIR workflows" (in en). PLOS Computational Biology 19 (9): e1011369. doi:10.1371/journal.pcbi.1011369. ISSN 1553-7358. PMC PMC10538699. PMID 37768885. https://dx.plos.org/10.1371/journal.pcbi.1011369.
↑ Schröder, Max; Staehlke, Susanne; Groth, Paul; Nebe, J. Barbara; Spors, Sascha; Krüger, Frank (1 December 2022). "Structure-based knowledge acquisition from electronic lab notebooks for research data provenance documentation" (in en). Journal of Biomedical Semantics 13 (1): 4. doi:10.1186/s13326-021-00257-x. ISSN 2041-1480. PMC PMC8802522. PMID 35101121. https://jbiomedsem.biomedcentral.com/articles/10.1186/s13326-021-00257-x.
↑ Hiniduma, Kaveen; Byna, Suren; Bez, Jean Luca (2024). "Data Readiness for AI: A 360-Degree Survey". arXiv. doi:10.48550/ARXIV.2404.05779. https://arxiv.org/abs/2404.05779.
↑ Fletcher, Lydia (16 April 2024). FAIR Re-use: Implications for AI-Readiness. The University of Texas at Austin. doi:10.26153/TSW/51475. https://repositories.lib.utexas.edu/handle/2152/124873.
↑ ^31.0 ^31.1 Olsen, C. (1 September 2023). "Embracing FAIR Data on the Path to AI-Readiness". Pharma's Almanac. https://www.pharmasalmanac.com/articles/embracing-fair-data-on-the-path-to-ai-readiness. Retrieved 03 May 2024.
↑ ^32.0 ^32.1 ^32.2 ^32.3 Huerta, E. A.; Blaiszik, Ben; Brinson, L. Catherine; Bouchard, Kristofer E.; Diaz, Daniel; Doglioni, Caterina; Duarte, Javier M.; Emani, Murali et al. (26 July 2023). "FAIR for AI: An interdisciplinary and international community building perspective" (in en). Scientific Data 10 (1): 487. doi:10.1038/s41597-023-02298-6. ISSN 2052-4463. PMC PMC10372139. PMID 37495591. https://www.nature.com/articles/s41597-023-02298-6.
↑ ^33.0 ^33.1 Maxwell, Lauren; Shreedhar, Priya; Dauga, Delphine; McQuilton, Peter; Terry, Robert F; Denisiuk, Alisa; Molnar-Gabor, Fruzsina; Saxena, Abha et al. (1 October 2023). "FAIR, ethical, and coordinated data sharing for COVID-19 response: a scoping review and cross-sectional survey of COVID-19 data sharing platforms and registries" (in en). The Lancet Digital Health 5 (10): e712–e736. doi:10.1016/S2589-7500(23)00129-2. PMC PMC10552001. PMID 37775189. https://linkinghub.elsevier.com/retrieve/pii/S2589750023001292.
↑ ^34.0 ^34.1 ^34.2 ^34.3 Queralt-Rosinach, Núria; Kaliyaperumal, Rajaram; Bernabé, César H.; Long, Qinqin; Joosten, Simone A.; van der Wijk, Henk Jan; Flikkenschild, Erik L.A.; Burger, Kees et al. (1 December 2022). "Applying the FAIR principles to data in a hospital: challenges and opportunities in a pandemic" (in en). Journal of Biomedical Semantics 13 (1): 12. doi:10.1186/s13326-022-00263-7. ISSN 2041-1480. PMC PMC9036506. PMID 35468846. https://jbiomedsem.biomedcentral.com/articles/10.1186/s13326-022-00263-7.
↑ Martínez-García, Alicia; Alvarez-Romero, Celia; Román-Villarán, Esther; Bernabeu-Wittel, Máximo; Luis Parra-Calderón, Carlos (1 May 2023). "FAIR principles to improve the impact on health research management outcomes" (in en). Heliyon 9 (5): e15733. doi:10.1016/j.heliyon.2023.e15733. PMC PMC10189186. PMID 37205991. https://linkinghub.elsevier.com/retrieve/pii/S2405844023029407.
↑ Snoeijer, B.; Pasapula, V.; Covucci, A. et al. (2019). "Paper SA04 - Processing big data from multiple sources" (PDF). Proceedings of PHUSE Connect EU 2019. PHUSE Limited. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2019/Connect/EU/Amsterdam/PAP_SA04.pdf. Retrieved 03 May 2024.
↑ Jang, Joy Bohyun; Pienta, Amy; Levenstein, Margaret; Saul, Joe (6 December 2023). "Restricted data management: the current practice and the future". Journal of Privacy and Confidentiality 13 (2). doi:10.29012/jpc.844. ISSN 2575-8527. PMC PMC10956935. PMID 38515607. https://journalprivacyconfidentiality.org/index.php/jpc/article/view/844.

@@ Line 8: / Line 8: @@
 ==Sandbox begins below==
 <div class="nonumtoc">__TOC__</div>
-[[File:Médécin au Laboratoire Hopital Douala Cameroun.jpg|thumb|380px|right|Hospitals and labs around the world depend on a laboratory information system to manage and report patient data and test results.]]
+[[File:FAIRResourcesGraphic AustralianResearchDataCommons 2018.png|right|520px]]
-A '''laboratory information system''' (LIS) is a software system that records, manages, and stores data for clinical [[Laboratory|laboratories]]. An LIS has traditionally been most adept at sending laboratory test orders to lab instruments, tracking those orders, and then recording the results, typically to a searchable database.<ref name="biohealth">{{cite web |url=http://www.biohealthmatics.com/technologies/his/lis.aspx |archiveurl=https://web.archive.org/web/20180106000403/http://www.biohealthmatics.com/technologies/his/lis.aspx |title=Laboratory Information Systems |work=Biohealthmatics.com |publisher=Biomedical Informatics Ltd |date=10 August 2006 |archivedate=06 January 2020 |accessdate=13 March 2024}}</ref> The standard LIS has supported the operations of public health institutions (like [[Hospital|hospitals]] and clinics) and their associated labs by managing and reporting critical data concerning "the status of infection, immunology, and care and treatment status of patients."<ref name="APHLLIS">{{cite web |url=https://www.aphl.org/MRC/Documents/GH_2005Oct_LIS-Quick-Start-Guide.pdf |archiveurl=https://web.archive.org/web/20170919184029/https://www.aphl.org/MRC/Documents/GH_2005Oct_LIS-Quick-Start-Guide.pdf |format=PDF |title=Quick Start Guide to Laboratory Information System (LIS) Implementation |publisher=Association of Public Health Laboratories |date=October 2005 |archivedate=19 September 2017 |accessdate=13 March 2024}}</ref>
+'''Title''': ''What are the potential implications of the FAIR data principles to laboratory informatics applications?''
-==History of the LIS==
+'''Author for citation''': Shawn E. Douglas
-Advances in computational technology in the early 1960s led some to experiment with time and data management functions in the healthcare setting. Company Bolt Beranek Newman and the Massachusetts General Hospital worked together to create a system that "included time-sharing and multiuser techniques that would later be essential to the implementation of the modern LIS."<ref name="APLISReview">{{cite journal |journal=Advances in Anatomic Pathology |title=Anatomic Pathology Laboratory Information Systems: A Review |author=Park, S.L.; Pantanowitz, L.; Sharma, G.; Parwani, A.V. |volume=19 |issue=2 |page=81–96 |year=2012 |doi=10.1097/PAP.0b013e318248b787}}</ref> At around the same time General Electric announced plans to program a [[hospital information system]] (HIS), though those plans eventually fell through.<ref name="HistMedInfo">{{cite book |url=https://books.google.com/books/about/A_History_of_medical_informatics.html?id=AR5rAAAAMAAJ |title=A History of Medical Informatics |author=Blum, B.I.; Duncan, K.A. |publisher=ACM Press |year=1990 |pages=141–53 |isbn=9780201501287}}</ref>
-Aside from the Massachusetts General Hospital experiment, the idea of a software system capable of managing time and data management functions wasn't heavily explored until the late 1960s, primarily because of the lack of proper technology and of communication between providers and end-users. The development of the Massachusetts General Hospital Utility Multi-Programming System (MUMPS) in the mid-'60s certainly helped as it suddenly allowed for a multi-user interface and a hierarchical system for persistent storage of data.<ref name="APLISReview" /> Yet due to its advanced nature, fragmented use across multiple entities, and inherent difficulty in extracting and analyzing data from the database, development of healthcare and laboratory systems on MUMPS was sporadic at best.<ref name="HistMedInfo" /> By the 1980s, however, the advent of Structured Query Language ([[SQL]]), [[Database|relational database management systems]] (RDBMS), and [[Health Level 7]] (HL7) allowed software developers to expand the functionality and interoperability of the LIS, including the application of business analytics and business intelligence techniques to clinical data.<ref name="PractPathInfo">{{cite book |url=https://books.google.com/books?id=WerUyK618fcC |title=Practical Pathology Informatics: Demstifying Informatics for the Practicing Anatomic Pathologist |author=Sinard, J.H. |publisher=Springer |year=2006 |pages=393 |isbn=0387280588}}</ref>
+'''License for content''': [https://creativecommons.org/licenses/by-sa/4.0/ Creative Commons Attribution-ShareAlike 4.0 International]
-In the early 2010s, web-based and database-centric internet applications of [[laboratory informatics]] software changed the way researchers and technicians interacted with data, with web-driven data formatting technologies like [[Extensible Markup Language]] (XML) making LIS and [[electronic medical record]] (EMR) interoperability a much-needed reality.<ref name="OverBarEMR">{{cite journal |title=Overcoming barriers to electronic medical record (EMR) implementation in the US healthcare system: A comparative study |journal=Health Informatics Journal |author=Kumar, S.; Aldrich, K. |volume=16 |issue=4 |year=2011 |doi=10.1177/1460458210380523}}</ref> [[Software as a service|SaaS]] and cloud computing technologies have since further changed how the LIS is implemented, while at the same time raising new questions about security and stability.<ref name="APLISReview" />
+'''Publication date''': May 2024
-The modern LIS has evolved to take on new functionalities not previously seen, including configurable [[Clinical decision support system|clinical decision support]] rules, system integration, laboratory outreach tools, and support for point-of-care testing (POCT) data. LIS modules have also begun to show up in EMR and [[electronic health record|EHR]] products, giving some laboratories the option to have an enterprise-wide solution that can cover multiple aspects of the lab.<ref name="FutrellWhatsNew17">{{cite web |url=https://www.mlo-online.com/continuing-education/article/13009013/whats-new-in-todays-lis |title=What's new in today's LIS? |author=Futrell, K. |work=Medical Laboratory Observer |publisher=NP Communications, LLC |date=23 January 2017 |accessdate=13 March 2024}}</ref> Additionally, the distinction between an LIS and a [[laboratory information management system]] (LIMS) has blurred somewhat, with some vendors choosing to use the "LIMS" acronym to market their clinical laboratory data management system.
+==Introduction==
+https://www.limswiki.org/index.php/Journal:Infrastructure_tools_to_support_an_effective_radiation_oncology_learning_health_system
-==Purpose and functionality==
+This brief topical article will examine
-An LIS is a software solution designed to allow end users to better manage a wide variety of operational and quality management aspects of the academic, government, and commercial [[Clinical laboratory|clinical laboratories]] appearing in and around public, private, and mobile healthcare facilities. The reasons for adopting an LIS and other [[laboratory informatics]] solutions varies by laboratory, but a December 2019 survey by ''Medical Laboratory Observer'', consisting of 273 respondents, is somewhat revealing of the common purposes for adopting an LIS. Ninety-five percent of respondents indicated they use it to streamline their electronic order entry and result management, with medical data connectivity being the second most popular use. Automation tools, customer relationship management, scheduling, inventory management, revenue management, quality management, and reporting were all also mentioned as important to users.<ref name="SilvaITSol19">{{cite web |url=https://www.mlo-online.com/information-technology/article/21117759/it-solutions-in-the-clinical-lab |title=IT solutions in the clinical lab |author=Silva, B. |work=Medical Laboratory Observer |date=19 December 2019 |accessdate=18 November 2021}}</ref> When asked to select from five choices (or provide some other reason) in regard to what their top priority was in selecting an LIS, respondents indicated that their most important priority was providing data analysis mechanisms for all types of pathology, followed by multi-lab interoperability, integration with [[electronic medical record]]s (EMRs), flexible laboratory management functionality, and real-time automated inventory management.<ref name="SilvaITSol19" />
-These responses help paint a picture of what a LIS can do, but there's definitely more to it. Through regulatory, market, patient, and technological pressures, many laboratories have decided that increasingly digitizing the laboratory makes sense in its efforts towards greater compliance, competitiveness, patient outcomes, and efficiency.<ref name="AstrixProgress22">{{cite web |url=https://astrixinc.com/wp-content/uploads/2022/06/Progress-Snapshot-on-Enabling-the-Digital-Lab-of-the-Future-v4a.pdf |format=PDF |title=2022 Laboratory Informatics: Progress Snapshot on Enabling the Digital Lab of the Future |publisher=Astrix Technology, LLC |pages=18–23 |date=June 2022 |accessdate=12 March 2024}}</ref><ref name="LiscouskiJust23">{{cite book |url=https://www.limswiki.org/index.php/LII:Justifying_LIMS_Acquisition_and_Deployment_within_Your_Organization/Introduction_to_LIMS_and_its_acquisition_and_deployment |title=Justifying LIMS Acquisition and Deployment within Your Organization |chapter=1. Introduction to LIMS and its acquisition and deployment |author=Liscouski, J. |editor=Douglas, S.E. |publisher=LIMSwiki |date=July 2023 |accessdate=13 March 2024}}</ref> An LIS deployment primarily focuses on specimen and patient management, data acquisition, and reporting activities; however, its scope can expand much further depending on the scientific discipline (e.g., [[Clinical pathology|clinical]] vs. [[anatomical pathology|anatomic pathology]]) or role the lab plays (e.g., commercial clinical diagnostics vs. government public health).
+==The "FAIR-ification" of research objects and software==
+First discussed during a 2014 FORCE-11 workshop dedicated to "overcoming data discovery and reuse obstacles," the [[Journal:The FAIR Guiding Principles for scientific data management and stewardship|FAIR data principles]] were published by Wilkinson ''et al.'' in 2016 as a stakeholder collaboration driven to see research "objects" (i.e., research data and [[information]] of all shapes and formats) become more universally findable, accessible, interoperable, and reusable (FAIR) by both machines and people.<ref name="WilkinsonTheFAIR16">{{Cite journal |last=Wilkinson |first=Mark D. |last2=Dumontier |first2=Michel |last3=Aalbersberg |first3=IJsbrand Jan |last4=Appleton |first4=Gabrielle |last5=Axton |first5=Myles |last6=Baak |first6=Arie |last7=Blomberg |first7=Niklas |last8=Boiten |first8=Jan-Willem |last9=da Silva Santos |first9=Luiz Bonino |last10=Bourne |first10=Philip E. |last11=Bouwman |first11=Jildau |date=2016-03-15 |title=The FAIR Guiding Principles for scientific data management and stewardship |url=https://www.nature.com/articles/sdata201618 |journal=Scientific Data |language=en |volume=3 |issue=1 |pages=160018 |doi=10.1038/sdata.2016.18 |issn=2052-4463 |pmc=PMC4792175 |pmid=26978244}}</ref> The authors released the FAIR principles while recognizing that "one of the grand challenges of data-intensive science ... is to improve knowledge discovery through assisting both humans and their computational agents in the discovery of, access to, and integration and analysis of task-appropriate scientific data and other scholarly digital objects."<ref name="WilkinsonTheFAIR16" />
-===LIS functionality===
+Since 2016, other research stakeholders have taken to publishing their thoughts about how the FAIR principles apply to their fields of study and practice,<ref name="NIHPubMedSearch">{{cite web |url=https://pubmed.ncbi.nlm.nih.gov/?term=fair+data+principles |title=fair data principles |work=PubMed Search |publisher=National Institutes of Health, National Library of Medicine |accessdate=30 April 2024}}</ref> including in ways perhaps beyond what Wilkinson ''et al.'' originally imagined. For example, multiple authors have examined whether the software used in scientific endeavors can itself be considered a research object worth developing and managing in tandem with the FAIR data principles.<ref>{{Cite journal |last=Hasselbring |first=Wilhelm |last2=Carr |first2=Leslie |last3=Hettrick |first3=Simon |last4=Packer |first4=Heather |last5=Tiropanis |first5=Thanassis |date=2020-02-25 |title=From FAIR research data toward FAIR and open research software |url=https://www.degruyter.com/document/doi/10.1515/itit-2019-0040/html |journal=it - Information Technology |language=en |volume=62 |issue=1 |pages=39–47 |doi=10.1515/itit-2019-0040 |issn=2196-7032}}</ref><ref name="GruenpeterFAIRPlus20">{{Cite web |last=Gruenpeter, M. |date=23 November 2020 |title=FAIR + Software: Decoding the principles |url=https://www.fairsfair.eu/sites/default/files/FAIR%20%2B%20software.pdf |format=PDF |publisher=FAIRsFAIR “Fostering FAIR Data Practices In Europe” |accessdate=30 April 2024}}</ref><ref>{{Cite journal |last=Barker |first=Michelle |last2=Chue Hong |first2=Neil P. |last3=Katz |first3=Daniel S. 
|last4=Lamprecht |first4=Anna-Lena |last5=Martinez-Ortiz |first5=Carlos |last6=Psomopoulos |first6=Fotis |last7=Harrow |first7=Jennifer |last8=Castro |first8=Leyla Jael |last9=Gruenpeter |first9=Morane |last10=Martinez |first10=Paula Andrea |last11=Honeyman |first11=Tom |date=2022-10-14 |title=Introducing the FAIR Principles for research software |url=https://www.nature.com/articles/s41597-022-01710-x |journal=Scientific Data |language=en |volume=9 |issue=1 |pages=622 |doi=10.1038/s41597-022-01710-x |issn=2052-4463 |pmc=PMC9562067 |pmid=36241754}}</ref><ref>{{Cite journal |last=Patel |first=Bhavesh |last2=Soundarajan |first2=Sanjay |last3=Ménager |first3=Hervé |last4=Hu |first4=Zicheng |date=2023-08-23 |title=Making Biomedical Research Software FAIR: Actionable Step-by-step Guidelines with a User-support Tool |url=https://www.nature.com/articles/s41597-023-02463-x |journal=Scientific Data |language=en |volume=10 |issue=1 |pages=557 |doi=10.1038/s41597-023-02463-x |issn=2052-4463 |pmc=PMC10447492 |pmid=37612312}}</ref><ref>{{Cite journal |last=Du |first=Xinsong |last2=Dastmalchi |first2=Farhad |last3=Ye |first3=Hao |last4=Garrett |first4=Timothy J. |last5=Diller |first5=Matthew A. |last6=Liu |first6=Mei |last7=Hogan |first7=William R. |last8=Brochhausen |first8=Mathias |last9=Lemas |first9=Dominick J. 
|date=2023-02-06 |title=Evaluating LC-HRMS metabolomics data processing software using FAIR principles for research software |url=https://link.springer.com/10.1007/s11306-023-01974-3 |journal=Metabolomics |language=en |volume=19 |issue=2 |pages=11 |doi=10.1007/s11306-023-01974-3 |issn=1573-3890}}</ref> Researchers quickly recognized that any planning around updating processes and systems to make research objects more FAIR would have to be tailored to specific research contexts, recognize that digital research objects go beyond data and information, and recognize "the specific nature of software" and not consider it "just data."<ref name="GruenpeterFAIRPlus20" /> The end result has been the application of FAIR's core concepts to research software in a way that differs from their application to data: because research software is more than just data, it requires more nuance and a different type of planning than applying FAIR to digital data and information.
-An LIS can have a complex list of features, or it may have minimal functionality. Software developers with competent and experienced personnel usually do well with a collection of the required base features, plus any industry-specific features a laboratory may need. But not all developers get it right. The following is a list of LIS functionality that is considered by a variety of experts to be vital to almost any clinical diagnostic or research laboratory.<ref name="APHLLab19">{{cite web |url=https://www.aphl.org/aboutAPHL/publications/Documents/GH-2019May-LIS-Guidebook-web.pdf |format=PDF |title=Laboratory Information Systems Project Management: A Guidebook for International Implementations |author=Association of Public Health Laboratories |publisher=APHL |date=May 2019 |accessdate=13 March 2024}}</ref><ref name="KyobeSelecting17">{{cite journal |title=Selecting a Laboratory Information Management System for Biorepositories in Low- and Middle-Income Countries: The H3Africa Experience and Lessons Learned |journal=Biopreservation and Biobanking |author=Kyobe, S.; Musinguzi, H.; Lwanga, N. et al. |volume=15 |issue=2 |pages=111–15 |year=2017 |doi=10.1089/bio.2017.0006 |pmc=PMC5397240}}</ref><ref name="ListEffic14">{{cite journal |title=Efficient sample tracking with OpenLabFramework |journal=Scientific Reports |author=List, M.; Schmidt, S.; Trojnar, J. et al. |volume=4 |pages=4278 |year=2014 |doi=10.1038/srep04278 |pmid=24589879 |pmc=PMC3940979}}</ref><ref name="APILISTool13">{{cite web |url=https://www.pathologyinformatics.org/toolkit.php |archiveurl=https://web.archive.org/web/20210801175420/https://www.pathologyinformatics.org/toolkit.php |title=LIS Functionality Assessment Toolkit |author=Splitz, A.R.; Balis, U.J.; Friedman, B.A. et al. |publisher=Association for Pathology Informatics |date=20 September 2013 |archivedate=01 August 2021 |accessdate=13 March 2024}}</ref>
-'''''Test, experiment, and patient management'''''
+A 2019 survey by Europe's FAIRsFAIR found that researchers seeking and re-using relevant research software on the internet faced multiple challenges, including understanding and/or maintaining the necessary software environment and its dependencies, finding sufficient documentation, struggling with accessibility and licensing issues, having the time and skills to install and/or use the software, finding quality control of the source code lacking, and having an insufficient (or non-existent) software sustainability and management plan.<ref name="GruenpeterFAIRPlus20" /> These challenges highlight the importance of software to researchers and other stakeholders, and the role FAIR has in better ensuring such software is findable, interoperable, and reusable, which in turn better ensures researchers' software-driven research is repeatable (by the same research team, with the same experimental setup), reproducible (by a different research team, with the same experimental setup), and replicable (by a different research team, with a different experimental setup).<ref name="GruenpeterFAIRPlus20" />
-*specimen log-in and management, with support for unique IDs
+At this point, the topic of what "research software" represents must be addressed further, and, unsurprisingly, it's not straightforward. Ask 20 researchers what "research software" is, and you may get 20 different opinions. Some definitions can be more objectively viewed as too narrow, while others may be viewed as too broad, with some level of controversy inherent in any mutual discussion.<ref name="GruenpeterDefining21">{{Cite journal |last=Gruenpeter, Morane |last2=Katz, Daniel S. |last3=Lamprecht, Anna-Lena |last4=Honeyman, Tom |last5=Garijo, Daniel |last6=Struck, Alexander |last7=Niehues, Anna |last8=Martinez, Paula Andrea |last9=Castro, Leyla Jael |last10=Rabemanantsoa, Tovo |last11=Chue Hong, Neil P. |date=2021-09-13 |title=Defining Research Software: a controversial discussion |url=https://zenodo.org/record/5504016 |journal=Zenodo |doi=10.5281/zenodo.5504016}}</ref><ref name="JulichWhatIsRes24">{{cite web |url=https://www.fz-juelich.de/en/rse/about-rse/what-is-research-software |title=What is Research Software? |work=JuRSE, the Community of Practice for Research Software Engineering |publisher=Forschungszentrum Jülich |date=13 February 2024 |accessdate=30 April 2024}}</ref><ref name="vanNieuwpoortDefining24">{{Cite journal |last=van Nieuwpoort |first=Rob |last2=Katz |first2=Daniel S. |date=2023-03-14 |title=Defining the roles of research software |url=https://upstream.force11.org/defining-the-roles-of-research-software |language=en |doi=10.54900/9akm9y5-5ject5y}}</ref> In 2021, as part of the FAIRsFAIR initiative, Gruenpeter ''et al.'' made a good-faith effort to define "research software" with the feedback of multiple stakeholders. Their efforts resulted in this definition<ref name="GruenpeterDefining21" />:
-*batching support
-*barcode and RFID support
-*specimen tracking
-*clinical decision support, including test ordering tools and duplicate test checks
-*custom test management
-*event and instrument scheduling
-*templates, forms, and data fields that are configurable
-*analytical tools, including data visualization, trend analysis, and data mining features
-*data import and export
-*robust query tools
-*document and image management
-*project and experiment management
-*workflow management
-*patient management
-*case management
-*physician and supplier management
-'''''Quality, security, and compliance'''''
+<blockquote>Research software includes source code files, algorithms, scripts, computational workflows, and executables that were created during the research process, or for a research purpose. Software components (e.g., operating systems, libraries, dependencies, packages, scripts, etc.) that are used for research but were not created during, or with a clear research intent, should be considered "software [used] in research" and not research software. This differentiation may vary between disciplines. The minimal requirement for achieving computational reproducibility is that all the computational components (i.e., research software, software used in research, documentation, and hardware) used during the research are identified, described, and made accessible to the extent that is possible.</blockquote>
-*quality assurance / quality control mechanisms, including tracking of nonconformance
+Note that while the definition primarily recognizes software created during the research process, software created (whether by the research group, other open-source software developers outside the organization, or even commercial software developers) "for a research purpose" outside the actual research process is also recognized as research software. This notably can lead to disagreement about whether a proprietary, commercial spreadsheet or [[laboratory information management system]] (LIMS) offering that conducts analyses and visualizations of research data can genuinely be called research software, or simply classified as software used in research. van Nieuwpoort and Katz further elaborated on this concept, at least indirectly, by formally defining the roles of research software in 2023. Their definition of the various roles of research software—without using terms such as "open-source," "commercial," or "proprietary"—essentially further defined what research software is<ref name="vanNieuwpoortDefining24" />:
-*data normalization and validation
-*results review and approval
-*version control
-*user qualification, performance, and training management
-*audit trails and chain of custody support
-*configurable and granular role-based security
-*configurable system access and use (log-in requirements, account usage rules, account locking, etc.)
-*electronic signature support
-*configurable alarms and alerts
-*data encryption and secure communication protocols
-*data archiving and retention support
-*configurable data backups
-*environmental monitoring and control
-'''''Operations management and reporting'''''
+*Research software is a component of our instruments.
+*Research software is the instrument.
+*Research software analyzes research data.
+*Research software presents research results.
+*Research software assembles or integrates existing components into a working whole.
+*Research software is infrastructure or an underlying tool.
+*Research software facilitates distinctively research-oriented collaboration.
-*customizable rich-text reporting, with multiple supported output formats
+When considering these definitions<ref name="GruenpeterDefining21" /><ref name="vanNieuwpoortDefining24" /> of research software and their adoption by other entities<ref name="F1000Open24">{{cite web |url=https://www.f1000.com/resources-for-researchers/open-research/open-source-software-code/ |title=Open source software and code |publisher=F1000 Research Ltd |date=2024 |accessdate=30 April 2024}}</ref>, it would appear that at least in part some [[laboratory informatics]] software—whether open-source or commercially proprietary—fills these roles in academic, military, and industry research laboratories of many types. In particular, [[electronic laboratory notebook]]s (ELNs) like open-source [[Jupyter Notebook]] or proprietary ELNs from commercial software developers fill the role of analyzing and visualizing research data, including developing molecular models for new promising research routes.<ref name="vanNieuwpoortDefining24" /> Even more advanced LIMS solutions that go beyond simply collating, auditing, securing, and reporting analytical results could conceivably fall under the umbrella of research software, particularly if many of the analytical, integration, and collaboration tools required in modern research facilities are included in the LIMS.
-*synoptic reporting
-*industry-compliant labeling
-*email integration
-*internal messaging system
-*revenue management
-*instrument interfacing and data management
-*instrument calibration and maintenance tracking
-*inventory and reagent management
-*third-party software and database interfacing
-*mobile device support
-*voice recognition capability
-*results portal for external parties
-*integrated (or online) system help
-*configurable language
-===Discipline-specific LIS===
+Ultimately, assuming that some laboratory informatics software can be considered research software and not just "software used in research," it's tough not to arrive at some deeper implications of research organizations' increasing need for FAIR data objects and software, particularly for laboratory informatics software and the developers of it.
-Through the early 2010s, the laboratory information system had been roughly segmented into two broad categories (though other variations existed): the clinical pathology and anatomic pathology LIS.<ref name="MedLabInfoPaper">{{cite journal |title=Medical Laboratory Informatics |journal=Clinics in Laboratory Medicine |author=Pantanowitz, L.; Henricks, W.H.; Beckwith, B.A. |volume=27 |issue=4 |pages=823–43 |year=2007 |doi=10.1016/j.cll.2007.07.011}}</ref><ref name="MedLabInfoDesc">{{cite web |url=http://clinfowiki.org/wiki/index.php/Medical_laboratory_informatics |title=Medical laboratory informatics |work=ClinfoWiki |date=19 November 2011 |accessdate=13 March 2024}}</ref><ref name="CPAPLISDiffs">{{cite web |url=http://www.pathinformatics.pitt.edu/sites/default/files/2012Powerpoints/01HenricksTues.pdf |archiveurl=https://web.archive.org/web/20150910050825/http://www.pathinformatics.pitt.edu/sites/default/files/2012Powerpoints/01HenricksTues.pdf |format=PDF |title=LIS Basics: CP and AP LIS Design and Operations |work=Pathology Informatics 2012 |author=Henricks, W.H. |publisher=University of Pittsburgh |date=09 October 2012 |archivedate=10 September 2015 |accessdate=13 March 2024}}</ref>
-In clinical pathology the chemical, hormonal, and biochemical components of body fluids are analyzed and interpreted to determine if a disease is present, while anatomic pathology tends to focus on the analysis and interpretation of a wide variety of tissue structures, from small slivers via biopsy to complete organs from a surgery or autopsy.<ref name="ForensicMedBook">{{cite book |url=https://books.google.com/books?id=x5FftcZOv1UC&pg=PA3 |title=Forensic Medicine |author=Adelman, H.C. |pages=3–4 |publisher=Infobase Publishing |year=2009 |isbn=1438103816 |accessdate=13 March 2024}}</ref> These differences may appear to be small, but the differentiation in laboratory workflow of these two medical specialties led to the creation of different functionalities within LISs. Specimen collection, receipt, and tracking; work distribution; and report generation varies—sometimes significantly—between the two types of labs, requiring targeted functionality in the LIS.<ref name="CPAPLISDiffs" /><ref name="EvolvingLIS">{{cite web |url=https://www.mlo-online.com/home/article/13004085/the-evolving-lis-needs-to-be-everything-for-todays-laboratories |title=The evolving LIS needs to be "everything" for today's laboratories |author=Clifford, L.-J. |work=Medical Laboratory Observer |publisher=NP Communications, LLC |date=01 August 2011 |accessdate=13 March 2024}}</ref> Other differences among these two disciplines include<ref name="APLISReview" />:
+==Implications of the FAIR concept to laboratory informatics software==
+===The global FAIR initiative affects, and even benefits, commercial laboratory informatics research software developers as much as it does academic and institutional ones===
+To be clear, there is undoubtedly a difference in the software development approach of "homegrown" research software by academics and institutions, and the more streamlined and experienced approach of commercial software development houses as applied to research software. Moynihan of Invenia Technical Computing described the difference in software development approaches thusly in 2020, while discussing the concept of "research software engineering"<ref name="MoynihanTheHitch20">{{cite web |url=https://invenia.github.io/blog/2020/07/07/software-engineering/ |title=The Hitchhiker’s Guide to Research Software Engineering: From PhD to RSE |author=Moynihan, G. |work=Invenia Blog |publisher=Invenia Technical Computing Corporation |date=07 July 2020}}</ref>:
-* Specific dictionary-driven tests are found in clinical pathology environments but not so much in anatomic pathology environments.
+<blockquote>Since the environment and incentives around building academic research software are very different to those of industry, the workflows around the former are, in general, not guided by the same engineering practices that are valued in the latter. That is to say: there is a difference between what is important in writing software for research, and for a user-focused software product. Academic research software prioritizes scientific correctness and flexibility to experiment above all else in pursuit of the researchers’ end product: published papers. Industry software, on the other hand, prioritizes maintainability, robustness, and testing, as the software (generally speaking) is the product. However, the two tracks share many common goals as well, such as catering to “users” [and] emphasizing performance and reproducibility, but most importantly both ventures are collaborative. Arguably then, both sets of principles are needed to write and maintain high-quality research software.</blockquote>
-* Ordered anatomic pathology tests typically require more [[information]] than clinical pathology tests.
-* A single anatomic pathology order may be comprised of several tissues from several organs; clinical pathology orders usually do not.
-* Anatomic pathology specimen collection may be a very procedural, multi-step processes, while clinical pathology specimen collection is routinely more simple.
-Over time, the LIS has evolved to address a wide variety of other clinical disciplines, including [[toxicology]], blood banking, [[molecular diagnostics]], [[public health]] and [[epidemiology]], and [[Medical research|clinical research]]. Each has their own set of analyses, workflows, operational requirements, and regulatory considerations<ref name="DouglasLabInfoCh1_22">{{cite web |url=https://www.limswiki.org/index.php/LII:Laboratory_Informatics_Buyer%27s_Guide_for_Medical_Diagnostics_and_Research/Introduction_to_medical_diagnostics_and_research_laboratories |title=1. Introduction to medical diagnostics and research laboratories |work=Laboratory Informatics Buyer's Guide for Medical Diagnostics and Research |author=Douglas, S.E. |publisher=LIMSwiki |date=January 2022 |accessdate=13 March 2024}}</ref>, in turn requiring specialty functionality in an LIS to address the needs of each. Examples include support for stain panels and histology worksheets in pathology, supporting surge capacity for high-priority analyses in public health, providing medication-based compliance monitoring and interpretive reporting for toxicology, and allowing for electronic crossmatch of human-based medical products in blood banking and transfusion.<ref name="DouglasLabInfoCh2_22">{{cite web |url=https://www.limswiki.org/index.php/LII:Laboratory_Informatics_Buyer%27s_Guide_for_Medical_Diagnostics_and_Research/Choosing_laboratory_informatics_software_for_your_lab |title=2. Choosing laboratory informatics software for your lab |work=Laboratory Informatics Buyer's Guide for Medical Diagnostics and Research |author=Douglas, S.E.; Vaughn, A. |publisher=LIMSwiki |date=January 2022 |accessdate=13 March 2024}}</ref>
+This brings us to our first point: applying small-scale, FAIR-driven academic research software engineering practices and elements to the development of commercial laboratory informatics software, and, conversely, applying commercial-scale development practices to small FAIR-focused academic and institutional research software engineering efforts, has the potential to better support all research laboratories using both independently developed and commercial research software.
-== Differences between an LIS and a LIMS ==
+The concept of the research software engineer (RSE) began to take full form in 2012, and since then universities and institutions of many types have formally developed their own RSE groups and academic programs.<ref name="WoolstonWhySci22">{{Cite journal |last=Woolston |first=Chris |date=2022-05-31 |title=Why science needs more research software engineers |url=https://www.nature.com/articles/d41586-022-01516-2 |journal=Nature |language=en |pages=d41586–022–01516-2 |doi=10.1038/d41586-022-01516-2 |issn=0028-0836}}</ref><ref name="KITRSE@KIT24">{{cite web |url=https://www.rse-community.kit.edu/index.php |title=RSE@KIT |publisher=Karlsruhe Institute of Technology |date=20 February 2024 |accessdate=01 May 2024}}</ref><ref name="PUPurdueCenter">{{cite web |url=https://www.rcac.purdue.edu/rse |title=Purdue Center for Research Software Engineering |publisher=Purdue University |date=2024 |accessdate=01 May 2024}}</ref> RSEs range from pure software developers with little knowledge of a given research discipline, to scientific researchers just beginning to learn how to develop software for their research project(s). While in the past, broadly speaking, researchers often cobbled together research software with less of a focus on quality and reproducibility and more on getting their research published, today's push for FAIR data and software by academic journals, institutions, and other researchers seeking to collaborate has placed a much greater focus on the concept of "better software, better research."<ref name="WoolstonWhySci22" /><ref name="CohenTheFour21">{{Cite journal |last=Cohen |first=Jeremy |last2=Katz |first2=Daniel S. 
|last3=Barker |first3=Michelle |last4=Chue Hong |first4=Neil |last5=Haines |first5=Robert |last6=Jay |first6=Caroline |date=2021-01 |title=The Four Pillars of Research Software Engineering |url=https://ieeexplore.ieee.org/document/8994167/ |journal=IEEE Software |volume=38 |issue=1 |pages=97–105 |doi=10.1109/MS.2020.2973362 |issn=0740-7459}}</ref> Elaborating on that concept, Cohen ''et al.'' add that "ultimately, good research software can make the difference between valid, sustainable, reproducible research outputs and short-lived, potentially unreliable or erroneous outputs."<ref name="CohenTheFour21" />
-There is often confusion regarding the difference between an LIS and a LIMS. While the two laboratory informatics components are related, their purposes diverged early in their existences. Up until recently, the LIS and LIMS have historically exhibited a few key differences<ref name="StarlimsLimsLis">{{cite web |url=http://www.starlims.com/en-us/services-and-resources/resources/lis-vs-lims/ |archiveurl=https://web.archive.org/web/20140428060811/http://www.starlims.com/en-us/resources/white-papers/lis-vs-lims/ |title=Adding "Management" to Your LIS |publisher=STARLIMS Corporation |date=2012 |archivedate=28 April 2014 |accessdate=14 March 2024}}</ref>:
-. An LIS was traditionally designed primarily for processing and reporting data related to individual patients in a clinical setting. A LIMS was traditionally designed to process and report data related to batches of samples from drug trials, water treatment facilities, and other entities that handle complex batches of data.<ref name="lislims1">{{cite web |url=http://labsoftnews.typepad.com/lab_soft_news/2008/11/liss-vs-limss-its-time-to-consider-merging-the-two-types-of-systems.html |title=LIS vs. LIMS: It's Time to Blend the Two Types of Lab Information Systems |author=Friedman, B. |work=Lab Soft News |date=04 November 2008 |accessdate=14 March 2024}}</ref><ref name="analytica">{{cite web |url=https://www.analytica-world.com/en/news/35566/lims-lis-market-and-poct-supplement.html |title=LIMS/LIS Market and POCT Supplement |work=analytica-world.com |date=20 February 2004 |accessdate=14 March 2024}}</ref>
+The concept of [[software quality management]] (SQM) has traditionally not been lost on professional, commercial software development businesses. Good SQM practices have been less prevalent in homegrown research software development; however, the expanded adoption of FAIR data and FAIR software approaches has shifted the focus onto the repeatability, reproducibility, and interoperability of research results and data produced by more sustainable research software. The adoption of FAIR by academic and institutional research labs not only brings commercial SQM and other software development approaches into their workflow, but also gives commercial laboratory informatics software developers an opportunity to embrace many aspects of the FAIR approach to laboratory research practices, including lessons learned and development practices from the growing number of RSEs. This doesn't mean commercial developers are going to suddenly take an open-source approach to their code, and it doesn't mean academic and institutional research labs are going to give up the benefits of the open-source paradigm as applied to research software.<ref>{{Cite journal |last=Hasselbring |first=Wilhelm |last2=Carr |first2=Leslie |last3=Hettrick |first3=Simon |last4=Packer |first4=Heather |last5=Tiropanis |first5=Thanassis |date=2020-02-25 |title=From FAIR research data toward FAIR and open research software |url=https://www.degruyter.com/document/doi/10.1515/itit-2019-0040/html |journal=it - Information Technology |language=en |volume=62 |issue=1 |pages=39–47 |doi=10.1515/itit-2019-0040 |issn=2196-7032}}</ref> However, as Moynihan noted, both research software development paradigms stand to gain from the shift to more FAIR data and software.<ref name="MoynihanTheHitch20" /> Additionally, if commercial laboratory informatics vendors want to continue to competitively market relevant and sustainable research software to research labs, they frankly have little choice but to commit extra resources 
to learning about the application of FAIR principles to their offerings tailored to those labs.
-. An LIS would need to satisfy the reporting and auditing needs of hospital accreditation agencies, [[HIPAA]], and other clinical medical practitioners. A LIMS, however, would need to satisfy [[good manufacturing practice]] (GMP) and meet the reporting and audit needs of the U.S. [[Food and Drug Administration]] and research scientists in many different industries.<ref name="lislims1" />
+===The focus on data types and metadata within the scope of FAIR is shifting how laboratory informatics software developers and RSEs make their research software and choose their database approaches===
+Close to the core of any deep discussion of the FAIR data principles are the concepts of data models, data types, [[metadata]], and persistent unique identifiers (PIDs). Making research objects more findable, accessible, interoperable, and reusable is no easy task when data types and approaches to metadata assignment (if there even is such an approach) are widely differing and inconsistent. Metadata is a means for better storing and characterizing research objects for the purposes of ensuring provenance and reproducibility of those research objects.<ref name="GhiringhelliShared23">{{Cite journal |last=Ghiringhelli |first=Luca M. |last2=Baldauf |first2=Carsten |last3=Bereau |first3=Tristan |last4=Brockhauser |first4=Sandor |last5=Carbogno |first5=Christian |last6=Chamanara |first6=Javad |last7=Cozzini |first7=Stefano |last8=Curtarolo |first8=Stefano |last9=Draxl |first9=Claudia |last10=Dwaraknath |first10=Shyam |last11=Fekete |first11=Ádám |date=2023-09-14 |title=Shared metadata for data-centric materials science |url=https://www.nature.com/articles/s41597-023-02501-8 |journal=Scientific Data |language=en |volume=10 |issue=1 |pages=626 |doi=10.1038/s41597-023-02501-8 |issn=2052-4463 |pmc=PMC10502089 |pmid=37709811}}</ref><ref name="FirschenAgile22">{{Cite journal |last=Fitschen |first=Timm |last2=tom Wörden |first2=Henrik |last3=Schlemmer |first3=Alexander |last4=Spreckelsen |first4=Florian |last5=Hornung |first5=Daniel |date=2022-10-12 |title=Agile Research Data Management with FDOs using LinkAhead |url=https://riojournal.com/article/96075/ |journal=Research Ideas and Outcomes |volume=8 |pages=e96075 |doi=10.3897/rio.8.e96075 |issn=2367-7163}}</ref> This means as early as possible implementing a software-based approach that is FAIR-driven, capturing FAIR metadata using flexible domain-driven [[Ontology (information science)|ontologies]] (i.e., controlled vocabularies) at the source and cleaning up old research objects that aren't FAIR-ready while also limiting 
hindrances to research processes as much as possible.<ref name="FirschenAgile22" /> And that approach must value the importance of metadata and PIDs. As Weigel ''et al.'' note in a discussion on making laboratory data and workflows more machine-findable: "Metadata capture must be highly automated and reliable, both in terms of technical reliability and ensured metadata quality. This requires an approach that may be very different from established procedures."<ref>{{Cite journal |last=Weigel |first=Tobias |last2=Schwardmann |first2=Ulrich |last3=Klump |first3=Jens |last4=Bendoukha |first4=Sofiane |last5=Quick |first5=Robert |date=2020-01 |title=Making Data and Workflows Findable for Machines |url=https://direct.mit.edu/dint/article/2/1-2/40-46/9994 |journal=Data Intelligence |language=en |volume=2 |issue=1-2 |pages=40–46 |doi=10.1162/dint_a_00026 |issn=2641-435X}}</ref> Enter non-relational RDF [[knowledge graph]] [[database]]s.
-. An LIS was usually most competitive in patient-centric settings (dealing with "subjects" and "specimens") and clinical labs, whereas a LIMS was most competitive in group-centric settings (dealing with "batches" and "samples") that often deal with mostly anonymous research-specific laboratory data.<ref name="analytica" /><ref name="lislims2">{{cite web |url=http://labsoftnews.typepad.com/lab_soft_news/2008/11/lis-vs-lims.html |title=LIS vs. LIMS: Some New Insights |author=Friedman, B. |work=Lab Soft News |date=19 November 2008 |accessdate=14 March 2024}}</ref><ref name="starlims">{{cite web |url=http://blog.starlims.com/2009/07/01/swimming-in-the-clinical-pool-why-lims-are-supplanting-old-school-clinical-lis-applications/ |archiveurl=https://web.archive.org/web/20110313145726/http://blog.starlims.com/2009/07/01/swimming-in-the-clinical-pool-why-lims-are-supplanting-old-school-clinical-lis-applications/ |title=Swimming in the Clinical Pool: Why LIMS are supplanting old-school clinical LIS applications |author=Hice, R. |publisher=STARLIMS Corporation |date=01 July 2009 |archivedate=13 March 2011 |accessdate=14 March 2024}}</ref>
+This brings us to our second point: given the importance of metadata and PIDs to FAIRifying research objects (and even research software), established, more traditional research software development methods using common relational databases may not be enough, even for commercial laboratory informatics software developers. Non-relational [[Resource Description Framework]] (RDF) knowledge graph databases used in FAIR-driven, well-designed laboratory informatics software help make research objects more FAIR for all research labs.
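To make the triple-based model behind RDF more concrete, the following is a minimal sketch, not a real RDF implementation or any vendor's design. It assumes a toy in-memory store (the hypothetical <code>TinyGraph</code> class), made-up PIDs under <code>example.org</code>, and invented <code>ex:</code> vocabulary terms. It only illustrates how subject-predicate-object statements keyed by PIDs let metadata about a sample and an analytical run be linked and queried by pattern.

```python
from collections import namedtuple

# An RDF-style statement: subject, predicate, object.
Triple = namedtuple("Triple", ["subject", "predicate", "obj"])


class TinyGraph:
    """Toy in-memory triple store (hypothetical; illustrative only)."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        """Record one statement; duplicates are ignored because a set is used."""
        self.triples.add(Triple(subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return sorted triples matching the pattern; None is a wildcard."""
        return sorted(
            t for t in self.triples
            if (subject is None or t.subject == subject)
            and (predicate is None or t.predicate == predicate)
            and (obj is None or t.obj == obj)
        )


# Hypothetical persistent identifiers (PIDs) for two research objects.
SAMPLE = "https://example.org/pid/sample-0001"
RUN = "https://example.org/pid/run-0042"

graph = TinyGraph()
# Metadata captured at the source, using invented "ex:" vocabulary terms.
graph.add(SAMPLE, "ex:materialType", "ex:Alloy")
graph.add(RUN, "ex:analyzedSample", SAMPLE)
graph.add(RUN, "ex:usedInstrument", "https://example.org/pid/instrument-7")
graph.add(RUN, "ex:performedOn", "2024-04-30")

# Pattern query: which runs analyzed this sample?
runs = [t.subject for t in graph.match(predicate="ex:analyzedSample", obj=SAMPLE)]
# Everything recorded about the run.
run_metadata = graph.match(subject=RUN)
```

Because every statement has the same three-part shape, new kinds of metadata can be added without a schema migration, which is one reason graph models are often discussed alongside FAIR. A production system would more plausibly rely on an established triple store and query language than on hand-rolled code like this.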
-However, these distinctions began to fade somewhat in the early 2010s as some LIMS vendors began to adopt the case-centric information management normally reserved for an LIS, blurring the lines between the two components further.<ref name="starlims" /> [[Thermo Scientific]]'s Clinical LIMS was an example of this merger of the LIS with LIMS, with Dave Champagne, informatics vice president and general manager, stating: "Routine molecular diagnostics requires a convergence of the up-to-now separate systems that have managed work in the lab (the LIMS) and the clinic (the LIS). The industry is asking for, and the science is requiring, a single lab-centric solution that delivers patient-centric results."<ref name="ConvergeLimsLis">{{cite web |url=https://clpmag.com/lab-essentials/information-technology/convergence-of-lims-and-lis/ |title=Convergence of LIMS and LIS |author=Tufel, G. |work=Clinical Lab Products |publisher=MEDQOR |date=01 February 2012 |accessdate=14 March 2024}}</ref> [[Abbott Informatics Corporation]]'s STARLIMS product was another example of this LIS/LIMS merger.<ref name="StarlimsLimsLis" /> With the distinction between the two entities becoming less clear, discussions within the laboratory informatics community began to includes the question of whether or not the two entities should be considered the same.<ref name="LinkedInDifLisLims">{{cite web |url=https://www.linkedin.com/feed/update/urn:li:groupPost:2069898-98494737/ |title=What is the difference between a LIS and a LIMS? |author=Jones, J. |publisher=LinkedIn |date=March 2012 |accessdate=13 March 2024}}</ref><ref name="LinkedInLisLimsSame">{{cite web |url=http://www.linkedin.com/groups/Are-LIMS-LIS-same-thing-2069898.S.147132083 |title=Are LIMS and LIS the same thing? 
|author=Jones, John |publisher=LinkedIn |date=September 2012 |accessdate=07 November 2012}}{{Dead link}}</ref> {{As of|2024}}, vendors continue to recognize the historical differences between the two products while also continuing to acknowledge that some developed LIMS are taking on more of the clinical aspects usually reserved for a LIS.<ref name="AgilabFAQ">{{cite web |url=http://agilab.com/faq/ |archiveurl=https://web.archive.org/web/20190325075813/http://agilab.com/faq/ |title=FAQ: What is the difference between a LIMS and a medical laboratory quality system? |publisher=AgiLab SAS |archivedate=25 March 2019 |accessdate=13 March 2024}}</ref><ref name="ReisenwitzWhatIs17">{{cite web |url=https://www.capterra.com/resources/what-is-a-laboratory-information-management-system/ |title=What Is a Laboratory Information Management System? |author=Reisenwitz, C. |work=Capterra Medical Software Blog |publisher=Capterra, Inc |date=11 May 2017 |accessdate=13 March 2024}}</ref><ref name="CloudLISDifference16">{{cite web |url=https://cloudlims.com/lims-vs-lis/ |title=LIS vs LIMS: Uncover the Difference & Choose the Right Informatics Solution |publisher=CloudLIMS.com, LLC |date=12 October 2023 |accessdate=13 March 2024}}</ref>
+Research objects can take many forms (i.e., data types), making the storage and management of those objects challenging, particularly in research settings with great diversity of data, as with materials research. Some have approached this challenge by combining different database and systems technologies that are best suited for each data type.<ref name="AggourSemantics24">{{Cite journal |last=Aggour |first=Kareem S. |last2=Kumar |first2=Vijay S. |last3=Gupta |first3=Vipul K. |last4=Gabaldon |first4=Alfredo |last5=Cuddihy |first5=Paul |last6=Mulwad |first6=Varish |date=2024-04-09 |title=Semantics-Enabled Data Federation: Bringing Materials Scientists Closer to FAIR Data |url=https://link.springer.com/10.1007/s40192-024-00348-4 |journal=Integrating Materials and Manufacturing Innovation |language=en |doi=10.1007/s40192-024-00348-4 |issn=2193-9764}}</ref> However, while query performance and storage footprint improve with this approach, data across the different storage mechanisms typically remains unlinked and non-compliant with FAIR principles. Here, either a full RDF knowledge graph database or a similar integration layer is required to make the research objects more interoperable and reusable, whether they are materials records or specimen data.<ref name="AggourSemantics24" /><ref name="GrobeFromData19">{{Cite journal |last=Grobe |first=Peter |last2=Baum |first2=Roman |last3=Bhatty |first3=Philipp |last4=Köhler |first4=Christian |last5=Meid |first5=Sandra |last6=Quast |first6=Björn |last7=Vogt |first7=Lars |date=2019-06-26 |title=From Data to Knowledge: A semantic knowledge graph application for curating specimen data |url=https://biss.pensoft.net/article/37412/ |journal=Biodiversity Information Science and Standards |language=en |volume=3 |pages=e37412 |doi=10.3897/biss.3.37412 |issn=2535-0897}}</ref>
-==Regulations, standards, and best practices affecting LIS development and use==
+It is beyond the scope of this Q&A article to discuss RDF knowledge graph databases at length. (For a deeper dive on this topic, see Rocca-Serra ''et al.'' and the FAIR Cookbook.<ref name="Rocca-SerraFAIRCook22">{{Cite book |last=Rocca-Serra, Philippe |last2=Sansone, Susanna-Assunta |last3=Gu, Wei |last4=Welter, Danielle |last5=Abbassi Daloii, Tooba |last6=Portell-Silva, Laura |date=2022-06-30 |title=D2.1 FAIR Cookbook |url=https://zenodo.org/record/6783564 |chapter=FAIR and Knowledge graphs |doi=10.5281/ZENODO.6783564}}</ref>) However, know that the primary strength of these databases to FAIRification of research objects is their ability to provide [[Semantics|semantic]] transparency (i.e., provide a framework for better understanding and reusing the greater research object through basic examination of the relationships of its associated metadata and their constituents), making these objects more easily accessible, interoperable, and machine-readable.<ref name="AggourSemantics24" /> The resulting knowledge graphs, with their "subject-property-object" syntax and PIDs or uniform resource identifiers (URIs) helping to link data, metadata, ontology classes, and more, can be interpreted, searched, and linked by machines, and made human-readable, resulting in better research through derivation of new knowledge from the existing research objects. The end result is a representation of heterogeneous data and metadata that complies with the FAIR guiding principles.<ref name="AggourSemantics24" /><ref name="GrobeFromData19" /><ref name="Rocca-SerraFAIRCook22" /><ref name="TomlinsonRDF23">{{cite web |url=https://21624527.fs1.hubspotusercontent-na1.net/hubfs/21624527/Resources/RDF%20Knowledge%20Graph%20Databases%20White%20Paper.pdf |format=PDF |title=RDF Knowledge Graph Databases: A Better Choice for Life Science Lab Software |author=Tomlinson, E. 
|publisher=Semaphore Solutions, Inc |date=28 July 2023 |accessdate=01 May 2024}}</ref><ref name="DeagenFAIRAnd22">{{Cite journal |last=Deagen |first=Michael E. |last2=McCusker |first2=Jamie P. |last3=Fateye |first3=Tolulomo |last4=Stouffer |first4=Samuel |last5=Brinson |first5=L. Cate |last6=McGuinness |first6=Deborah L. |last7=Schadler |first7=Linda S. |date=2022-05-27 |title=FAIR and Interactive Data Graphics from a Scientific Knowledge Graph |url=https://www.nature.com/articles/s41597-022-01352-z |journal=Scientific Data |language=en |volume=9 |issue=1 |pages=239 |doi=10.1038/s41597-022-01352-z |issn=2052-4463 |pmc=PMC9142568 |pmid=35624233}}</ref><ref>{{Cite journal |last=Brandizi |first=Marco |last2=Singh |first2=Ajit |last3=Rawlings |first3=Christopher |last4=Hassani-Pak |first4=Keywan |date=2018-09-25 |title=Towards FAIRer Biological Knowledge Networks Using a Hybrid Linked Data and Graph Database Approach |url=https://www.degruyter.com/document/doi/10.1515/jib-2018-0023/html |journal=Journal of Integrative Bioinformatics |language=en |volume=15 |issue=3 |pages=20180023 |doi=10.1515/jib-2018-0023 |issn=1613-4516 |pmc=PMC6340125 |pmid=30085931}}</ref> This concept can even be extended to ''post factum'' visualizations of the knowledge graph data<ref name="DeagenFAIRAnd22" />, as well as the FAIR management of computational laboratory [[workflow]]s.<ref>{{Cite journal |last=de Visser |first=Casper |last2=Johansson |first2=Lennart F. |last3=Kulkarni |first3=Purva |last4=Mei |first4=Hailiang |last5=Neerincx |first5=Pieter |last6=Joeri van der Velde |first6=K. |last7=Horvatovich |first7=Péter |last8=van Gool |first8=Alain J. |last9=Swertz |first9=Morris A. |last10=Hoen |first10=Peter A. C. ‘t |last11=Niehues |first11=Anna |date=2023-09-28 |editor-last=Palagi |editor-first=Patricia M. 
|title=Ten quick tips for building FAIR workflows |url=https://dx.plos.org/10.1371/journal.pcbi.1011369 |journal=PLOS Computational Biology |language=en |volume=19 |issue=9 |pages=e1011369 |doi=10.1371/journal.pcbi.1011369 |issn=1553-7358 |pmc=PMC10538699 |pmid=37768885}}</ref>
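The "subject-property-object" structure described above can be illustrated with a minimal sketch. It uses only the Python standard library rather than a real RDF store, and the `https://example.org/` namespace, the property names, and the `TinyGraph` class are all hypothetical placeholders introduced for illustration, not part of any cited system.

```python
from typing import Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, property, object), each a URI string


class TinyGraph:
    """A toy in-memory triple store; a stand-in for a real RDF database."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, prop: str, obj: str) -> None:
        self._triples.add((subject, prop, obj))

    def match(
        self,
        s: Optional[str] = None,
        p: Optional[str] = None,
        o: Optional[str] = None,
    ) -> Iterator[Triple]:
        """Yield triples matching the pattern; None acts as a wildcard."""
        for triple in sorted(self._triples):
            if (
                (s is None or triple[0] == s)
                and (p is None or triple[1] == p)
                and (o is None or triple[2] == o)
            ):
                yield triple


# Hypothetical namespace and identifiers (assumptions for illustration only).
EX = "https://example.org/"
DERIVED_FROM = EX + "prop/derivedFrom"
MEASURED_BY = EX + "prop/measuredBy"
COLLECTED_AT = EX + "prop/collectedAt"

graph = TinyGraph()
graph.add(EX + "sample/42", DERIVED_FROM, EX + "specimen/7")
graph.add(EX + "sample/42", MEASURED_BY, EX + "instrument/hplc-1")
graph.add(EX + "specimen/7", COLLECTED_AT, EX + "site/lab-a")


def collection_sites(g: TinyGraph, sample: str) -> List[str]:
    """Follow two links: sample -> source specimen -> collection site."""
    sites = []
    for _, _, specimen in g.match(s=sample, p=DERIVED_FROM):
        for _, _, site in g.match(s=specimen, p=COLLECTED_AT):
            sites.append(site)
    return sites


print(collection_sites(graph, EX + "sample/42"))
```

Because every node and property is an identifier rather than a free-text label, a machine can traverse from a sample to its specimen to its collection site without knowing in advance how the records were stored, which is the kind of linking the paragraph above attributes to knowledge graphs.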
-A LIS' development and use is affected by regulations, standards, and best practices such as:
-*[[21 CFR Part 11]] ''Electronic records; Electronic signature'': Regulated clinical-focused industries, such as [[medical device]]s or pharmaceuticals, are expected to comply with U.S. [[Food and Drug Administration]] (FDA) regulations like 21 CFR Part 11, which address matters of software validation, data integrity, [[data retention]], audit trails, signed records, and secured access to data. These matters pertain to software systems like LIMS and ELN, as well as other systems employed in modern laboratories.<ref name="21CFR11">{{cite web |url=https://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfCFR/CFRSearch.cfm?CFRPart=11&showFR=1 |title=CFR - Code of Federal Regulations Title 21, Part 11 Electronic Records; Electronic Signatures |publisher=U.S. Food and Drug Administration |date=22 December 2023 |accessdate=12 March 2024}}</ref><ref name="RDEditorsAQuick12">{{cite web |url=https://www.rdworldonline.com/a-quick-guide-to-eln-regulatory-requirements/ |title=A Quick Guide to ELN Regulatory Requirements |author=R&D Editors |work=R&D World |date=10 May 2012 |accessdate=12 March 2024}}</ref><ref name="LFWhite20">{{cite web |url=https://labfolder.com/wp-content/uploads/2020/01/Labfolder-CFR21-Part11-Whitepaper.docx-2.pdf |format=PDF |title=Whitepaper: FDA's 21 CFR Part 11 |publisher=Labforward GmbH |date=January 2020 |accessdate=12 March 2024}}</ref>
+While rare, some commercial laboratory informatics vendors like Semaphore Solutions have already recognized the potential of RDF knowledge graph databases for FAIR-driven laboratory research, having implemented such structures into their offerings.<ref name="TomlinsonRDF23" /> (The use of knowledge graphs has already been demonstrated in academic research software, such as with the ELN tools developed by RSEs at the University of Rostock and University of Amsterdam.<ref>{{Cite journal |last=Schröder |first=Max |last2=Staehlke |first2=Susanne |last3=Groth |first3=Paul |last4=Nebe |first4=J. Barbara |last5=Spors |first5=Sascha |last6=Krüger |first6=Frank |date=2022-12 |title=Structure-based knowledge acquisition from electronic lab notebooks for research data provenance documentation |url=https://jbiomedsem.biomedcentral.com/articles/10.1186/s13326-021-00257-x |journal=Journal of Biomedical Semantics |language=en |volume=13 |issue=1 |pages=4 |doi=10.1186/s13326-021-00257-x |issn=2041-1480 |pmc=PMC8802522 |pmid=35101121}}</ref>) As noted in the prior point, it is potentially advantageous both for laboratory informatics vendors to provide, and for research labs to use, relevant and sustainable research software that has the FAIR principles embedded in its design. Turning to knowledge graph databases is another way of keeping such software relevant and FAIR for research labs.
-*[[ASTM E1578]] ''Standard Guide for Laboratory Informatics'': This standard is geared towards a variety of stakeholders having some sort of professional interest in laboratory informatics. It intends to educate on and recommend approaches to laboratory software development, acquisition, implementation, and maintenance, including as how they relate to LIMS.<ref name="ASTME1578">{{cite web |url=https://www.astm.org/e1578-18.html |title=ASTM E1578-18 Standard Guide for Laboratory Informatics |publisher=ASTM International |date=23 August 2019 |accessdate=13 March 2024}}</ref>
-*Good clinical laboratory practice (GCLP): GLP is a quality- and data-driven approach to ensuring the safety, consistency, high quality, and reliability of developed and produced goods. These best practices address a variety of aspects of non-clinical research and manufacturing laboratory workflows, from personnel and equipment to tests and reporting. Some LIMS are developed to help labs better enforce a GLP approach to its operations.<ref>{{Cite book |date=2023 |editor-last=Elzagheid |editor-first=Mohamed |title=Chemical technicians: good laboratory practice and laboratory information management systems |series=De Gruyter Textbook |edition=1st |publisher=De Gruyter |place=Boston |isbn=978-3-11-119110-2}}</ref>
-*[[ISO 15189]] ''Medical laboratories — Requirements for quality and competence'': This standard specifies quality management approaches to clinical laboratory settings. It pulls inspiration from ISO/IEC 17025 while acknowledging the unique characteristics and needs of the clinical lab. The standard's requirements on laboratory need for addressing cybersecurity, system validation, and more apply directly to LIMS development and implementation.<ref>{{Cite journal |last=Ilinca |first=Radu |last2=Chiriac |first2=Ionuț A. |last3=Luțescu |first3=Dan A. |last4=Ganea |first4=Ionela |last5=Hristodorescu-Grigore |first5=Smaranda |last6=Dănciulescu-Miulescu |first6=Rucsandra-Elena |date=2023-04-01 |title=Understanding the key differences between ISO 15189:2022 and ISO 15189:2012 for an improved medical laboratory quality of service |url=https://www.sciendo.com/article/10.2478/rrlm-2023-0011 |journal=Revista Romana de Medicina de Laborator |language=en |volume=31 |issue=2 |pages=77–82 |doi=10.2478/rrlm-2023-0011 |issn=2284-5623}}</ref>
-==See also==
+===Applying FAIR-driven metadata schemes to laboratory informatics software development gives data a FAIRer chance at being ready for machine learning and artificial intelligence applications===
-*[[Laboratory informatics]]
+The third and final point for this Q&A article highlights another positive consequence of engineering laboratory informatics software with FAIR in mind: FAIRified research objects are much closer to being usable for the trending inclusion of [[machine learning]] (ML) and [[artificial intelligence]] (AI) tools in laboratory informatics platforms and other companion research software. By developing laboratory informatics software with a focus on FAIR-driven metadata and database schemes, not only are research objects more FAIR but also "cleaner" and more machine-ready for advanced analytical uses as with ML and AI.
-*[[LIS feature|Common LIS features]]
+To be sure, the FAIRness of any structured dataset alone is not enough to make it ready for ML and AI applications. Factors such as classification, completeness, context, correctness, duplication, integrity, mislabeling, outliers, relevancy, sample size, and timeliness of the research object and its contents are also important to consider.<ref name="HinidumaDataRead24">{{Cite journal |last=Hiniduma |first=Kaveen |last2=Byna |first2=Suren |last3=Bez |first3=Jean Luca |date=2024 |title=Data Readiness for AI: A 360-Degree Survey |url=https://arxiv.org/abs/2404.05779 |journal=arXiv |doi=10.48550/ARXIV.2404.05779}}</ref><ref name="FletcherFAIRRe24">{{Cite journal |last=Fletcher |first=Lydia |date=2024-04-16 |others=The University Of Texas At Austin, The University Of Texas At Austin |title=FAIR Re-use: Implications for AI-Readiness |url=https://repositories.lib.utexas.edu/handle/2152/124873 |doi=10.26153/TSW/51475}}</ref> When those factors aren't appropriately addressed as part of a FAIRification effort towards AI readiness (as well as part of the development of research software of all types), research data and metadata are more likely to prove inconsistent. As such, searches and analytics using that data and metadata become muddled, and the ultimate ML or AI output will also be muddled (i.e., "garbage in, garbage out"). Whether retroactively updating existing research objects to a more FAIRified state or ensuring research objects (e.g., those originating in an ELN or LIMS) are more FAIR and AI-ready from the start, research software updating or generating those research objects has to address ontologies, data models, data types, identifiers, and more in a thorough yet flexible way.<ref name="OlsenEmbracing23">{{cite web |url=https://www.pharmasalmanac.com/articles/embracing-fair-data-on-the-path-to-ai-readiness |title=Embracing FAIR Data on the Path to AI-Readiness |author=Olsen, C. |work=Pharma's Almanac |date=01 September 2023 |accessdate=03 May 2024}}</ref>
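Two of the readiness factors listed above, completeness and duplication, lend themselves to a simple automated check. The following is a minimal sketch under stated assumptions: the record layout, the field names (`pid`, `analyte`, `unit`), and the `readiness_report` function are hypothetical and are not drawn from any cited tool or from a particular ELN or LIMS.

```python
from collections import Counter
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]


def readiness_report(
    records: List[Record], required: List[str], id_field: str
) -> Dict[str, object]:
    """Summarize completeness and duplicated identifiers for a record set."""
    total = len(records)
    # A record counts as complete when every required field is non-empty.
    complete = sum(
        1
        for r in records
        if all(r.get(f) not in (None, "") for f in required)
    )
    # Identifiers that appear more than once point to duplicated objects.
    id_counts = Counter(r.get(id_field) for r in records if r.get(id_field))
    duplicates = sorted(i for i, n in id_counts.items() if n > 1)
    return {
        "total": total,
        "completeness": complete / total if total else 0.0,
        "duplicate_ids": duplicates,
    }


# Hypothetical example records (assumed field names and values).
records = [
    {"pid": "ex:1", "analyte": "glucose", "unit": "mmol/L"},
    {"pid": "ex:2", "analyte": "glucose", "unit": None},
    {"pid": "ex:1", "analyte": "glucose", "unit": "mmol/L"},
    {"pid": "ex:3", "analyte": "", "unit": "mg/dL"},
]

report = readiness_report(
    records, required=["pid", "analyte", "unit"], id_field="pid"
)
print(report)
```

A check like this covers only a small part of the list above; factors such as context, relevancy, and correctness generally need domain knowledge and cannot be reduced to a field-level test.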
+Noting that Wilkinson ''et al.'' originally highlighted the importance of machine-readability of FAIR data, Huerta ''et al.'' add that this core principle of FAIRness "is synergistic with the rapid adoption and increased use of AI in research."<ref name="HuertaFAIRForAI23">{{Cite journal |last=Huerta |first=E. A. |last2=Blaiszik |first2=Ben |last3=Brinson |first3=L. Catherine |last4=Bouchard |first4=Kristofer E. |last5=Diaz |first5=Daniel |last6=Doglioni |first6=Caterina |last7=Duarte |first7=Javier M. |last8=Emani |first8=Murali |last9=Foster |first9=Ian |last10=Fox |first10=Geoffrey |last11=Harris |first11=Philip |date=2023-07-26 |title=FAIR for AI: An interdisciplinary and international community building perspective |url=https://www.nature.com/articles/s41597-023-02298-6 |journal=Scientific Data |language=en |volume=10 |issue=1 |pages=487 |doi=10.1038/s41597-023-02298-6 |issn=2052-4463 |pmc=PMC10372139 |pmid=37495591}}</ref> They go on to discuss the positive interactions of FAIR research objects with FAIR-driven, AI-based research. The benefits include<ref name="HuertaFAIRForAI23" />:
+*greater findability of FAIR research objects for further AI-driven scientific discovery;
+*greater reproducibility of FAIR research objects and any AI models published with them;
+*improved generalization of AI-driven medical research models when exposed to diverse and FAIR research objects;
+*improved reporting of AI-driven research results using FAIRified research objects, lending further credibility to those results;
+*more uniform comparison of AI models using well-defined hyperstructure and information training conditions from FAIRified research objects;
+*more developed and interoperable "data e-infrastructure," which can further drive a more effective "AI services layer";
+*reduced bias in AI-driven processes through the use of FAIR research objects and AI models; and
+*improved surety of scientific correctness where reproducibility in AI-driven research can't be guaranteed.
+In the end, developers of research software (whether discipline-specific research software or broader laboratory informatics solutions) would be well advised to keep in mind the growing trends of FAIR research, FAIR software, and ML- and AI-driven research, especially in the [[life sciences]] but also in a variety of other fields.<ref name="HuertaFAIRForAI23" />
+===Restricted clinical data and its FAIRification for greater research innovation===
+Broader discussion continues in the research community about how best to ethically make restricted or privacy-protected clinical data and information FAIR for greater innovation and, by extension, improved patient outcomes, particularly in the wake of the [[COVID-19]] [[pandemic]].<ref name="MaxwellFAIREthic23">{{Cite journal |last=Maxwell |first=Lauren |last2=Shreedhar |first2=Priya |last3=Dauga |first3=Delphine |last4=McQuilton |first4=Peter |last5=Terry |first5=Robert F |last6=Denisiuk |first6=Alisa |last7=Molnar-Gabor |first7=Fruzsina |last8=Saxena |first8=Abha |last9=Sansone |first9=Susanna-Assunta |date=2023-10 |title=FAIR, ethical, and coordinated data sharing for COVID-19 response: a scoping review and cross-sectional survey of COVID-19 data sharing platforms and registries |url=https://linkinghub.elsevier.com/retrieve/pii/S2589750023001292 |journal=The Lancet Digital Health |language=en |volume=5 |issue=10 |pages=e712–e736 |doi=10.1016/S2589-7500(23)00129-2 |pmc=PMC10552001 |pmid=37775189}}</ref><ref name="Queralt-RosinachApplying22">{{Cite journal |last=Queralt-Rosinach |first=Núria |last2=Kaliyaperumal |first2=Rajaram |last3=Bernabé |first3=César H. |last4=Long |first4=Qinqin |last5=Joosten |first5=Simone A. |last6=van der Wijk |first6=Henk Jan |last7=Flikkenschild |first7=Erik L.A. 
|last8=Burger |first8=Kees |last9=Jacobsen |first9=Annika |last10=Mons |first10=Barend |last11=Roos |first11=Marco |date=2022-12 |title=Applying the FAIR principles to data in a hospital: challenges and opportunities in a pandemic |url=https://jbiomedsem.biomedcentral.com/articles/10.1186/s13326-022-00263-7 |journal=Journal of Biomedical Semantics |language=en |volume=13 |issue=1 |pages=12 |doi=10.1186/s13326-022-00263-7 |issn=2041-1480 |pmc=PMC9036506 |pmid=35468846}}</ref><ref>{{Cite journal |last=Martínez-García |first=Alicia |last2=Alvarez-Romero |first2=Celia |last3=Román-Villarán |first3=Esther |last4=Bernabeu-Wittel |first4=Máximo |last5=Luis Parra-Calderón |first5=Carlos |date=2023-05 |title=FAIR principles to improve the impact on health research management outcomes |url=https://linkinghub.elsevier.com/retrieve/pii/S2405844023029407 |journal=Heliyon |language=en |volume=9 |issue=5 |pages=e15733 |doi=10.1016/j.heliyon.2023.e15733 |pmc=PMC10189186 |pmid=37205991}}</ref> (Note that while there are other types of restricted and privacy-protected data, this section will focus largely on clinical data and research objects as the most obvious type.)
+These efforts have usually revolved around pulling reusable clinical patient or research data from [[hospital information system]]s (HIS), [[electronic medical record]]s (EMRs), [[clinical trial management system]]s (CTMSs), and research databases (often relational in nature) that either contain de-identified data or can de-identify aspects of data and information before access and extraction. Sometimes that clinical data or research object may have already in part been FAIRified, but often it may not be. In all cases, the concepts of privacy, security, and anonymization come up as part of any desire to gain access to that clinical material. However, any FAIRified clinical data isn't necessarily readily open for access. As Snoeijer ''et al.'' note: "The authors of the FAIR principles, however, clearly indicate that 'accessible' does not mean open. It means that clarity and transparency is required around the conditions governing access and reuse."<ref name="SnoeijerProcess19">{{cite book |url=https://phuse.s3.eu-central-1.amazonaws.com/Archive/2019/Connect/EU/Amsterdam/PAP_SA04.pdf |format=PDF |chapter=Paper SA04 - Processing big data from multiple sources |title=Proceedings of PHUSE Connect EU 2019 |author=Snoeijer, B.; Pasapula, V.; Covucci, A. et al. |publisher=PHUSE Limited |year=2019 |accessdate=03 May 2024}}</ref>
+This is being mentioned in the context of laboratory informatics applications for a couple of reasons. First, a well-designed commercial LIMS that supports clinical research laboratory workflows is already going to address privacy and security aspects, as part of the developer recognizing the need for those labs to adhere to regulations such as the [[Health Insurance Portability and Accountability Act]] (HIPAA) and comply with standards such as [[ISO 15189]]. However, such a system may not have been developed with FAIR data principles in mind, and any built-in metadata and ontology schemes may be insufficient for full FAIRification of laboratory-based clinical trial research objects. As Queralt-Rosinach ''et al.'' note, however, "interestingly, ontologies may also be used to describe data access restrictions to complement FAIR metadata with information that supports data safety and patient privacy."<ref name="Queralt-RosinachApplying22" /> Essentially, the authors are suggesting that while a HIS or LIS may have built-in access management tools, setting up ontologies and metadata mechanisms that link privacy aspects of a research object (e.g., "has consent form for," "is de-identified," etc.) to the object's metadata allows for even more flexible, FAIR-driven approaches to privacy and security. Research software developers creating such information management tools for the regulated clinical research space may want to apply FAIR concepts such as this to how access control and privacy restrictions are managed. This will inevitably mean any research objects exported with machine-readable privacy-concerning metadata will be more reusable in a way that still "supports data safety and patient privacy."<ref name="Queralt-RosinachApplying22" />
+Second, a well-designed research software solution working with clinical data will provide not only support for open, community-supported data models and vocabularies for clinical data, but also standardized community-driven ontologies that are specifically developed for access control and privacy. Queralt-Rosinach ''et al.'' continue<ref name="Queralt-RosinachApplying22" />:
+<blockquote>Also, very important for accessibility and data privacy is that the digital objects ''per se'' can accommodate the criteria and protocols necessary to comply with regulatory and governance frameworks. Ontologies can aid in opening and protecting patient data by exposing logical definitions of data use conditions. Indeed, there are ontologies to define access and reuse conditions for patient data such as the Informed Consent Ontology (ICO), the Global Alliance for Genomics and Health Data Use Ontology (DUO) standard, and the Open Digital Rights Language (ODRL) vocabulary recommended by W3C.</blockquote>
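The idea of attaching machine-readable access conditions to a research object's metadata, as discussed above, can be sketched as follows. This is only an illustration under stated assumptions: the class names, fields, and purpose labels are hypothetical, and the purpose strings are plain placeholders rather than actual ICO, DUO, or ODRL terms.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class ResearchObject:
    """A research object carrying privacy-related metadata (hypothetical)."""

    pid: str
    is_deidentified: bool
    permitted_uses: FrozenSet[str]  # placeholder labels, not real DUO codes


@dataclass(frozen=True)
class AccessRequest:
    requester: str
    purpose: str
    has_signed_agreement: bool


def access_decision(
    obj: ResearchObject, req: AccessRequest
) -> Tuple[bool, str]:
    """Evaluate a request against the conditions stored with the object."""
    if not obj.is_deidentified:
        return False, "object is not de-identified"
    if not req.has_signed_agreement:
        return False, "no data use agreement on file"
    if req.purpose not in obj.permitted_uses:
        return False, f"purpose '{req.purpose}' is not a permitted use"
    return True, "access conditions satisfied"


# Hypothetical object and requests (all values are illustrative).
cohort = ResearchObject(
    pid="https://example.org/object/cohort-1",
    is_deidentified=True,
    permitted_uses=frozenset({"disease-specific-research"}),
)
ok_request = AccessRequest("lab-a", "disease-specific-research", True)
bad_request = AccessRequest("lab-b", "commercial-use", True)

print(access_decision(cohort, ok_request))
print(access_decision(cohort, bad_request))
```

Keeping the conditions with the object, rather than only in an application's access control tables, means they can travel with the object on export, so a receiving system can evaluate the same rules.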
+Also of note here is the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) and its OHDSI standardized vocabularies. In all these cases, a developer-driven approach to research software that incorporates community-driven standards that support FAIR principles is welcome. However, as Maxwell ''et al.'' noted in their ''Lancet'' review article in late 2023, "few platforms or registries applied community-developed standards for participant-level data, further restricting the interoperability of ... data-sharing initiatives [like FAIR]."<ref name="MaxwellFAIREthic23" /> As the FAIR principles continue to gain ground in clinical research and diagnostics settings, software developers will need to be more attuned to translating old ways of development to ones that incorporate FAIR data and software principles. Demand for FAIR data will only continue to grow, and any efforts to improve interoperability and reusability while honoring (and enhancing) privacy and security aspects of restricted data will be appreciated by clinical researchers. However, just as FAIR is not an overall goal for researchers, software built with FAIR principles in mind is not the end point of research organizations managing restricted and privacy-protected research objects. 
Ultimately, those organizations will have to make other considerations about restricted data in the scope of FAIR, including addressing data management plans, data use agreements, disclosure review practices, and training as they apply to their research software and generated research objects.<ref>{{Cite journal |last=Jang |first=Joy Bohyun |last2=Pienta |first2=Amy |last3=Levenstein |first3=Margaret |last4=Saul |first4=Joe |date=2023-12-06 |title=Restricted data management: the current practice and the future |url=https://journalprivacyconfidentiality.org/index.php/jpc/article/view/844 |journal=Journal of Privacy and Confidentiality |volume=13 |issue=2 |doi=10.29012/jpc.844 |issn=2575-8527 |pmc=PMC10956935 |pmid=38515607}}</ref>
+==Conclusion==
+Laboratory informatics developers will also need to remember that FAIRification of research in itself is not a goal for research laboratories; it is a continual process that recognizes improved scientific research and greater innovation as a more likely outcome.<ref name="WilkinsonTheFAIR16" /><ref name="OlsenEmbracing23" /><ref name="HuertaFAIRForAI23" />
-== Further reading ==
-* {{cite web |url=http://www.pathinformatics.pitt.edu/sites/default/files/2012Powerpoints/01HenricksTues.pdf |archiveurl=https://web.archive.org/web/20150910050825/http://www.pathinformatics.pitt.edu/sites/default/files/2012Powerpoints/01HenricksTues.pdf |format=PDF |title=LIS Basics: CP and AP LIS Design and Operations |work=Pathology Informatics 2012 |author=Henricks, W.H. |publisher=University of Pittsburgh |date=09 October 2012 |archivedate=10 September 2015}}
-* {{cite journal |journal=Advances in Anatomic Pathology |title=Anatomic Pathology Laboratory Information Systems: A Review |author=Park, S.L.; Pantanowitz, L.; Sharma, G.; Parwani, A.V. |volume=19 |issue=2 |page=81–96 |year=2012 |doi=10.1097/PAP.0b013e318248b787}}
 ==References==
 {{Reflist|colwidth=30em}}
 <!---Place all category tags here-->

Contents

Sandbox begins below

Introduction

The "FAIR-ification" of research objects and software

Implications of the FAIR concept to laboratory informatics software

The global FAIR initiative affects, and even benefits, commercial laboratory informatics research software developers as much as it does academic and institutional ones

The focus on data types and metadata within the scope of FAIR is shifting how laboratory informatics software developers and RSEs make their research software and choose their database approaches

Applying FAIR-driven metadata schemes to laboratory informatics software development gives data a FAIRer chance at being ready for machine learning and artificial intelligence applications

Restricted clinical data and its FAIRification for greater research innovation

Conclusion

References
