==Sandbox begins below==
<div class="nonumtoc">__TOC__</div>
<div class="nonumtoc">__TOC__</div>
==5. Develop and create the cybersecurity plan==
[[File:FAIRResourcesGraphic AustralianResearchDataCommons 2018.png|right|520px]]
What follows is a template to help guide you in developing your own [[cybersecurity]] plan. Remember that this is a template and strategy for developing the cybersecurity plan for your organization, not a regulatory guidance document. This template has at its core a modified version of the template structure suggested in the late 2018 ''Cybersecurity Strategy Development Guide'' created for the National Association of Regulatory Utility Commissioners (NARUC).<ref name="NARUCCyber18">{{cite web |url=https://pubs.naruc.org/pub/8C1D5CDD-A2C8-DA11-6DF8-FCC89B5A3204 |format=PDF |title=Cybersecurity Strategy Development Guide |author=Cadmus Group, LLC |publisher=National Association of Regulatory Utility Commissioners |date=30 October 2018 |accessdate=10 March 2023}}</ref> While their document focuses on cybersecurity for utility cooperatives and commissions, much of what NARUC suggests can still be more broadly applied to all but the tiniest of businesses. Additional resources such as the American Health Information Management Association's ''AHIMA Guidelines: The Cybersecurity Plan''<ref name="DowningAHIMA17">{{cite web |url=https://journal.ahima.org/wp-content/uploads/2017/12/AHIMA-Guidelines-Cybersecurity-Plan.pdf |archiveurl=https://web.archive.org/web/20220119204903/https://journal.ahima.org/wp-content/uploads/2017/12/AHIMA-Guidelines-Cybersecurity-Plan.pdf |format=PDF |title=AHIMA Guidelines: The Cybersecurity Plan |author=Downing, K. |publisher=American Health Information Management Association |date=December 2017 |archivedate=19 January 2022 |accessdate=10 March 2023}}</ref>; National Rural Electric Cooperative Association (NRECA), Cooperative Research Network's ''Guide to Developing a Cyber Security and Risk Mitigation Plan''<ref name="LebanidzeGuide11">{{cite web |url=https://www.cooperative.com/programs-services/bts/documents/guide-cybersecurity-mitigation-plan.pdf |format=PDF |title=Guide to Developing a Cyber Security and Risk Mitigation Plan |author=Lebanidze, E. |publisher=National Rural Electric Cooperative Association, Cooperative Research Network |date=2011 |accessdate=10 March 2023}}</ref>; and various cybersecurity experts' articles<ref name="LagoHowTo19">{{cite web |url=https://www.cio.com/article/222076/how-to-implement-a-successful-security-plan.html |title=How to implement a successful cybersecurity plan |author=Lago, C. |work=CIO |publisher=IDG Communications, Inc |date=10 July 2019 |accessdate=10 March 2023}}</ref><ref name="NortonSimilar18">{{cite web |url=https://intraprisehealth.com/similar-but-different-gap-assessment-vs-risk-assessment/ |title=Similar but Different: Gap Assessment vs Risk Analysis |author=Norton, K. |publisher=IntrapriseHEALTH |date=21 June 2018 |accessdate=10 March 2023}}</ref><ref name="EwingFourWays17">{{cite web |url=https://deltarisk.com/blog/4-ways-to-integrate-your-cyber-security-incident-response-and-business-continuity-plans/ |title=4 Ways to Integrate Your Cyber Security Incident Response and Business Continuity Plans |author=Ewing, S. |publisher=Delta Risk |date=12 July 2017 |accessdate=10 March 2023}}</ref><ref name="KrasnowCyber17">{{cite web |url=https://www.irmi.com/articles/expert-commentary/cyber-security-event-recovery-plans |title=Cyber-Security Event Recovery Plans |author=Krasnow, M.J. 
|publisher=International Risk Management Institute, Inc |date=February 2017 |accessdate=10 March 2023}}</ref><ref name="CopelandHowToDev18">{{cite web |url=https://www.copelanddata.com/blog/how-to-develop-a-cybersecurity-plan/ |title=How to Develop A Cybersecurity Plan For Your Company (checklist included) |publisher=Copeland Technology Solutions |date=17 July 2018 |accessdate=10 March 2023}}</ref><ref name="TalamantesDoesYour17">{{cite web |url=https://www.redteamsecure.com/blog/does-your-cybersecurity-plan-need-an-update |title=Does Your Cybersecurity Plan Need an Update? |author=Talamantes, J. |work=RedTeam Knowledge Base |publisher=RedTeam Security Corporation |date=06 September 2017 |accessdate=10 March 2023}}</ref> have been reviewed to further supplement the template. This template covers 10 main cybersecurity planning steps, each with multiple sub-steps. Additional commentary, guidance, and citation is included with those sub-steps.
'''Title''': ''What are the potential implications of the FAIR data principles for laboratory informatics applications?''


Note that before development begins, you'll want to consider the knowledge resources available and key stakeholders involved. Do you have the expertise available in-house to address all 10 planning steps, or will you need to acquire help from one or more third parties? Who are the key individuals providing critical support to the business and its operations? Having the critical expertise and stakeholders involved with the plan's development process early on can enhance the overall plan and provide for more effective strategic outcomes.<ref name="NARUCCyber18" />
'''Author for citation''': Shawn E. Douglas


Also remind yourself that completing this plan will likely not be a straightforward, by-the-numbers exercise. More realistically, you will find yourself jumping among steps and filling in blanks or revising statements in previous portions of the plan. While the ordering of these steps is deliberate, completing them in order may not make the best sense for your organization. Don't be afraid to jump around or go back and update sections you've worked on previously using newfound knowledge. For example, some organizations with limited professional expertise in cybersecurity may find value in jumping to the end of section 5.3 and reviewing the wording of some of the cybersecurity controls early in the process in order to become more familiar with the related vocabulary.
'''License for content''': [https://creativecommons.org/licenses/by-sa/4.0/ Creative Commons Attribution-ShareAlike 4.0 International]


Finally, the various steps of this plan will recommend the development of a variety of other policies, procedures, and documents, e.g., a communications plan and a response and continuity plan. As NIST notes in its SP 800-53 framework, effective security plans make reference to other policy and procedure documents and don't necessarily fully contain those actual policies and procedures themselves. Rather, the plan should "provide explicitly or by reference, sufficient [[information]] to define what needs to be accomplished" by those policies and procedures. All of that is to say that when going through the steps below, be cognizant of that advice. Recommendations to make a communications plan or response plan don't necessarily mean those plans should be an actual portion of your overall cybersecurity plan, but rather a component external to the plan yet referenced and detailed sufficiently within the plan.
'''Publication date''': May 2024


'''''An Example Cybersecurity Plan'''''
==Introduction==


The following instructional template for developing a cybersecurity plan is admittedly a lot of information to take in at once. Some people are much better at understanding a concept through examples. As such, what is modestly called ''An Example Cybersecurity Plan'' has been developed to accompany this guide. That example plan includes an introduction to provide more context concerning its creation, as well as a simple outline of the following steps 5.1 through 5.10. The example plan itself comes afterwards, presented from the perspective of the fictional environmental laboratory company ABC123 Co. This example is slightly unorthodox in that it presents a cybersecurity plan in an iterative state of development, emphasizing the "living document" aspect of a cybersecurity plan. The document demonstrates the concepts emphasized in this guide, including the concept of referencing other relevant policies and documents without duplicating them within the cybersecurity plan. Note that while it is a separate document, ''An Example Cybersecurity Plan'' is released under the same Creative Commons license as this guide, and those license requirements should still be followed.
This brief topical article will examine the potential implications of the FAIR data principles for laboratory informatics applications.


'''Link to file''': [[:File:An Example Cybersecurity Plan - Shawn Douglas - v1.0.pdf|''An Example Cybersecurity Plan'']]
==The "FAIR-ification" of research objects and software==
First discussed during a 2014 FORCE-11 workshop dedicated to "overcoming data discovery and reuse obstacles," the [[Journal:The FAIR Guiding Principles for scientific data management and stewardship|FAIR data principles]] were published by Wilkinson ''et al.'' in 2016 as a stakeholder collaboration driven to see research "objects" (i.e., research data and [[information]] of all shapes and formats) become more universally findable, accessible, interoperable, and reusable (FAIR) by both machines and people.<ref name="WilkinsonTheFAIR16">{{Cite journal |last=Wilkinson |first=Mark D. |last2=Dumontier |first2=Michel |last3=Aalbersberg |first3=IJsbrand Jan |last4=Appleton |first4=Gabrielle |last5=Axton |first5=Myles |last6=Baak |first6=Arie |last7=Blomberg |first7=Niklas |last8=Boiten |first8=Jan-Willem |last9=da Silva Santos |first9=Luiz Bonino |last10=Bourne |first10=Philip E. |last11=Bouwman |first11=Jildau |date=2016-03-15 |title=The FAIR Guiding Principles for scientific data management and stewardship |url=https://www.nature.com/articles/sdata201618 |journal=Scientific Data |language=en |volume=3 |issue=1 |pages=160018 |doi=10.1038/sdata.2016.18 |issn=2052-4463 |pmc=PMC4792175 |pmid=26978244}}</ref> The authors released the FAIR principles while recognizing that "one of the grand challenges of data-intensive science ... is to improve knowledge discovery through assisting both humans and their computational agents in the discovery of, access to, and integration and analysis of task-appropriate scientific data and other scholarly digital objects."<ref name="WilkinsonTheFAIR16" />


'''Instructions''': After clicking the above link, click the link (underneath the PDF icon) at the top of the resulting page to view the document in your browser, or right-click it and "save as" to save a copy.
Since 2016, other research stakeholders have taken to publishing their thoughts about how the FAIR principles apply to their fields of study and practice<ref name="NIHPubMedSearch">{{cite web |url=https://pubmed.ncbi.nlm.nih.gov/?term=fair+data+principles |title=fair data principles |work=PubMed Search |publisher=National Institutes of Health, National Library of Medicine |accessdate=30 April 2024}}</ref>, including in ways beyond what perhaps was originally imagined by Wilkinson ''et al.''. For example, multiple authors have examined whether or not the software used in scientific endeavors itself can be considered a research object worth being developed and managed in tandem with the FAIR data principles.<ref>{{Cite journal |last=Hasselbring |first=Wilhelm |last2=Carr |first2=Leslie |last3=Hettrick |first3=Simon |last4=Packer |first4=Heather |last5=Tiropanis |first5=Thanassis |date=2020-02-25 |title=From FAIR research data toward FAIR and open research software |url=https://www.degruyter.com/document/doi/10.1515/itit-2019-0040/html |journal=it - Information Technology |language=en |volume=62 |issue=1 |pages=39–47 |doi=10.1515/itit-2019-0040 |issn=2196-7032}}</ref><ref name="GruenpeterFAIRPlus20">{{Cite web |last=Gruenpeter, M. |date=23 November 2020 |title=FAIR + Software: Decoding the principles |url=https://www.fairsfair.eu/sites/default/files/FAIR%20%2B%20software.pdf |format=PDF |publisher=FAIRsFAIR “Fostering FAIR Data Practices In Europe” |accessdate=30 April 2024}}</ref><ref>{{Cite journal |last=Barker |first=Michelle |last2=Chue Hong |first2=Neil P. |last3=Katz |first3=Daniel S. |last4=Lamprecht |first4=Anna-Lena |last5=Martinez-Ortiz |first5=Carlos |last6=Psomopoulos |first6=Fotis |last7=Harrow |first7=Jennifer |last8=Castro |first8=Leyla Jael |last9=Gruenpeter |first9=Morane |last10=Martinez |first10=Paula Andrea |last11=Honeyman |first11=Tom |date=2022-10-14 |title=Introducing the FAIR Principles for research software |url=https://www.nature.com/articles/s41597-022-01710-x |journal=Scientific Data |language=en |volume=9 |issue=1 |pages=622 |doi=10.1038/s41597-022-01710-x |issn=2052-4463 |pmc=PMC9562067 |pmid=36241754}}</ref><ref>{{Cite journal |last=Patel |first=Bhavesh |last2=Soundarajan |first2=Sanjay |last3=Ménager |first3=Hervé |last4=Hu |first4=Zicheng |date=2023-08-23 |title=Making Biomedical Research Software FAIR: Actionable Step-by-step Guidelines with a User-support Tool |url=https://www.nature.com/articles/s41597-023-02463-x |journal=Scientific Data |language=en |volume=10 |issue=1 |pages=557 |doi=10.1038/s41597-023-02463-x |issn=2052-4463 |pmc=PMC10447492 |pmid=37612312}}</ref><ref>{{Cite journal |last=Du |first=Xinsong |last2=Dastmalchi |first2=Farhad |last3=Ye |first3=Hao |last4=Garrett |first4=Timothy J. |last5=Diller |first5=Matthew A. |last6=Liu |first6=Mei |last7=Hogan |first7=William R. |last8=Brochhausen |first8=Mathias |last9=Lemas |first9=Dominick J. 
|date=2023-02-06 |title=Evaluating LC-HRMS metabolomics data processing software using FAIR principles for research software |url=https://link.springer.com/10.1007/s11306-023-01974-3 |journal=Metabolomics |language=en |volume=19 |issue=2 |pages=11 |doi=10.1007/s11306-023-01974-3 |issn=1573-3890}}</ref> Researchers quickly recognized that any planning around updating processes and systems to make research objects more FAIR would have to be tailored to specific research contexts, recognize that digital research objects go beyond data and information, and recognize "the specific nature of software" and not consider it "just data."<ref name="GruenpeterFAIRPlus20" /> The end result has been the application of FAIR's core concepts to research software, but with the recognition that software is more than just data and thus requires more nuance and a different type of planning than applying FAIR to digital data and information.


A 2019 survey by Europe's FAIRsFAIR found that researchers seeking and re-using relevant research software on the internet faced multiple challenges, including understanding and/or maintaining the necessary software environment and its dependencies, finding sufficient documentation, struggling with accessibility and licensing issues, lacking the time and skills to install and/or use the software, finding quality control of the source code lacking, and having an insufficient (or non-existent) software sustainability and management plan.<ref name="GruenpeterFAIRPlus20" /> These challenges highlight the importance of software to researchers and other stakeholders, and the role FAIR has in better ensuring such software is findable, accessible, interoperable, and reusable, which in turn better ensures researchers' software-driven research is repeatable (by the same research team, with the same experimental setup), reproducible (by a different research team, with the same experimental setup), and replicable (by a different research team, with a different experimental setup).<ref name="GruenpeterFAIRPlus20" />


===5.1. Develop strategic cybersecurity goals and define success===
At this point, the topic of what "research software" represents must be addressed further, and, unsurprisingly, it's not straightforward. Ask 20 researchers what "research software" is, and you may get 20 different opinions. Some definitions can be more objectively viewed as too narrow, while others may be viewed as too broad, with some level of controversy inherent in any mutual discussion.<ref name="GruenpeterDefining21">{{Cite journal |last=Gruenpeter, Morane |last2=Katz, Daniel S. |last3=Lamprecht, Anna-Lena |last4=Honeyman, Tom |last5=Garijo, Daniel |last6=Struck, Alexander |last7=Niehues, Anna |last8=Martinez, Paula Andrea |last9=Castro, Leyla Jael |last10=Rabemanantsoa, Tovo |last11=Chue Hong, Neil P. |date=2021-09-13 |title=Defining Research Software: a controversial discussion |url=https://zenodo.org/record/5504016 |journal=Zenodo |doi=10.5281/zenodo.5504016}}</ref><ref name="JulichWhatIsRes24">{{cite web |url=https://www.fz-juelich.de/en/rse/about-rse/what-is-research-software |title=What is Research Software? |work=JuRSE, the Community of Practice for Research Software Engineering |publisher=Forschungszentrum Jülich |date=13 February 2024 |accessdate=30 April 2024}}</ref><ref name="vanNieuwpoortDefining24">{{Cite journal |last=van Nieuwpoort |first=Rob |last2=Katz |first2=Daniel S. |date=2023-03-14 |title=Defining the roles of research software |url=https://upstream.force11.org/defining-the-roles-of-research-software |language=en |doi=10.54900/9akm9y5-5ject5y}}</ref> In 2021, as part of the FAIRsFAIR initiative, Gruenpeter ''et al.'' made a good-faith effort to define "research software" with the feedback of multiple stakeholders. Their efforts resulted in this definition<ref name="GruenpeterDefining21" />:
[[File:NICE Cybersecurity Workforce Framework.jpg|right|300px]]
====5.1.1 Broadly articulate business goals and how information technology relates====
Something should drive you to want to implement a cybersecurity plan. Sometimes the impetus may be external, such as a major breach at another company that affects millions of people. But more often than not, well-formulated business goals and the resources, regulations, and motivations tied to them will propel development of the plan. Business goals have, hopefully, already been developed by the time you consider a cybersecurity plan. Now is the time to identify the technology and data that are tied to those goals. A [[clinical laboratory]], for example, may have as a business goal "to provide prompt, accurate analysis of specimens submitted to the laboratory." Does the lab utilize [[information management]] systems as a means to better meet that goal? How secure are the systems? What are the consequences of having mission-critical data compromised in said systems?


====5.1.2 Articulate why cybersecurity is vital to achieving those goals====
<blockquote>Research software includes source code files, algorithms, scripts, computational workflows, and executables that were created during the research process, or for a research purpose. Software components (e.g., operating systems, libraries, dependencies, packages, scripts, etc.) that are used for research but were not created during, or with a clear research intent, should be considered "software [used] in research" and not research software. This differentiation may vary between disciplines. The minimal requirement for achieving computational reproducibility is that all the computational components (i.e., research software, software used in research, documentation, and hardware) used during the research are identified, described, and made accessible to the extent that is possible.</blockquote>
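
One practical way to meet the definition's minimal requirement that computational components be identified, described, and made accessible is to publish a machine-readable metadata record alongside the software. The following is only an illustrative sketch, loosely following the community-developed CodeMeta vocabulary (not something prescribed by the definition above); every name, version, and URL shown is a hypothetical placeholder.

<syntaxhighlight lang="python">
import json

# Illustrative metadata record for a piece of research software, loosely following
# the CodeMeta vocabulary. All names, versions, and URLs are hypothetical placeholders.
software_record = {
    "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
    "@type": "SoftwareSourceCode",
    "name": "spectra-cleaner",                                    # hypothetical package name
    "version": "1.2.0",
    "description": "Scripts created during the research process to normalize spectra.",
    "programmingLanguage": "Python",
    "license": "https://spdx.org/licenses/MIT",
    "codeRepository": "https://example.org/lab/spectra-cleaner",  # placeholder URL
    "softwareRequirements": ["numpy>=1.24", "pandas>=2.0"],
    "identifier": "https://example.org/id/placeholder-pid"        # placeholder persistent identifier
}

# Publishing the record alongside the code helps make the software findable and citable.
with open("codemeta.json", "w") as f:
    json.dump(software_record, f, indent=2)
</syntaxhighlight>
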
Looking to your business goals for the technology, data, and other resources used to achieve those goals gives you an opportunity to turn the magnifying glass towards why the technology, data, and resources need to be secure. For example, the clinical testing lab will likely be dealing with [[protected health information]] (PHI), and an electric cooperative must reliably provide service practically 100 percent of the time. Both the data and the service must be protected from physical and cyber intrusion, at risk of significant and costly consequence. Be clear about what the potential consequences actually may be, as well as how business goals could be hindered without proper cybersecurity for critical assets. Or, conversely, clearly state what will be positively achieved by addressing cybersecurity for those assets.


====5.1.3 State the cybersecurity mission and define how to achieve it, based on the above====
Note that while the definition primarily recognizes software created during the research process, software created (whether by the research group, other open-source software developers outside the organization, or even commercial software developers) "for a research purpose" outside the actual research process is also recognized as research software. This notably can lead to disagreement about whether a proprietary, commercial spreadsheet or [[laboratory information management system]] (LIMS) offering that conducts analyses and visualizations of research data can genuinely be called research software, or simply classified as software used in research. van Nieuwpoort and Katz further elaborated on this concept, at least indirectly, by formally defining the roles of research software in 2023. Their definition of the various roles of research software—without using terms such as "open-source," "commercial," or "proprietary"—essentially further defined what research software is<ref name="vanNieuwpoortDefining24" />:
You've stated your business goals, how technology and data play a role in them, and why it's vital to ensure their security. Now it's time to develop your strategic mission with regard to cybersecurity. You may wish to take a few extra steps before defining the goals of that mission, however. The NARUC has this to say in that regard<ref name="NARUCCyber18" />:


<blockquote>Establishing a strategic [mission] is a critical first step that sets the tone for the entire process of drafting the strategy. Before developing [the mission], a commission may want to do an internal inventory of key stakeholders; conduct blue-sky thinking exercises; and do an environmental assessment and literature review to identify near-, mid-, and long-term drivers of change that may affect its goals.</blockquote>
*Research software is a component of our instruments.
*Research software is the instrument.
*Research software analyzes research data.
*Research software presents research results.
*Research software assembles or integrates existing components into a working whole.
*Research software is infrastructure or an underlying tool.
*Research software facilitates distinctively research-oriented collaboration.


Whatever cybersecurity mission goals you ultimately declare, you'll want to be sure they "provide a sense of purpose, identity, and long-term direction" and clearly communicate to internal and external customers what's most important with regard to cybersecurity. Also consider adding concise points that paint the overall mission as one dedicated to limiting vulnerabilities and keeping risks mitigated.<ref name="NARUCCyber18" />
When considering these definitions<ref name="GruenpeterDefining21" /><ref name="vanNieuwpoortDefining24" /> of research software and their adoption by other entities<ref name="F1000Open24">{{cite web |url=https://www.f1000.com/resources-for-researchers/open-research/open-source-software-code/ |title=Open source software and code |publisher=F1000 Research Ltd |date=2024 |accessdate=30 April 2024}}</ref>, it would appear that at least in part some [[laboratory informatics]] software—whether open-source or commercially proprietary—fills these roles in academic, military, and industry research laboratories of many types. In particular, [[electronic laboratory notebook]]s (ELNs) like open-source [[Jupyter Notebook]] or proprietary ELNs from commercial software developers fill the role of analyzing and visualizing research data, including developing molecular models for new promising research routes.<ref name="vanNieuwpoortDefining24" /> Even more advanced LIMS solutions that go beyond simply collating, auditing, securing, and reporting analytical results could conceivably fall under the umbrella of research software, particularly if many of the analytical, integration, and collaboration tools required in modern research facilities are included in the LIMS.


====5.1.4 Gain and promote active and visible support from executive management in achieving the cybersecurity mission====
Ultimately, assuming that some laboratory informatics software can be considered research software and not just "software used in research," it's difficult to ignore the deeper implications of research organizations' increasing need for FAIR data objects and software, particularly for laboratory informatics software and its developers.
Ensuring executive management is fully on board with your stated cybersecurity mission is vital. If key business leaders have not been intimately involved with the process as of yet, it is now time to gain their input and full support. As NARUC notes, "with leadership buy-in, it will be easier to institutionalize the idea that cybersecurity is a priority and can result in more readily available resources."<ref name="NARUCCyber18" /> Consider what AHIMA calls a "State of the Union" approach to presenting the cybersecurity mission goals to leadership, being prepared to answer questions from them about responsible parties, communication policies, and "cyber insurance."<ref name="DowningAHIMA17" /> (Answers to such questions are addressed further on in this template. You may wish to have some of what follows informally addressed before taking it to leadership. Or perhaps have an agreement to keep leadership apprised throughout cybersecurity plan development, gaining their feedback and overall acceptance of the plan as development comes to a close.)


<div class="nonumtoc">__TOC__</div>
==Implications of the FAIR concept to laboratory informatics software==
===5.2 Define scope and responsibilities===
===The global FAIR initiative affects, and even benefits, commercial laboratory informatics research software developers as much as it does academic and institutional ones===
[[File:Innovation & Research Symposium Cisco and Ecole Polytechnique 9-10 April 2018 Artificial Intelligence & Cybersecurity (40631791164).jpg|right|400px]]
To be clear, there is undoubtedly a difference between the software development approach taken with "homegrown" research software by academics and institutions and the more streamlined, experienced approach commercial software development houses apply to research software. While discussing the concept of "research software engineering" in 2020, Moynihan of Invenia Technical Computing described the difference as follows<ref name="MoynihanTheHitch20">{{cite web |url=https://invenia.github.io/blog/2020/07/07/software-engineering/ |title=The Hitchhiker’s Guide to Research Software Engineering: From PhD to RSE |author=Moynihan, G. |work=Invenia Blog |publisher=Invenia Technical Computing Corporation |date=07 July 2020}}</ref>:
====5.2.1 Define the scope and applicability through key requirements and boundaries====
Now that the cybersecurity mission goals are clear and supported by leadership, it's time to tailor strategies based on those stated goals.


How broadly will the mission goals take you across your business assets? Information technology (IT) and data will surely be at the forefront, but don't forget to address operational technology (OT) assets as well.<ref name="NARUCCyber18" /> One helpful tool in determining the strategies and requirements needed to meet mission goals is to clearly define the logical and physical boundaries of your information system.<ref name="NARUCCyber18" /><ref name="LebanidzeGuide11" /> When considering those boundaries, remember the following<ref name="LebanidzeGuide11" />:
<blockquote>Since the environment and incentives around building academic research software are very different to those of industry, the workflows around the former are, in general, not guided by the same engineering practices that are valued in the latter. That is to say: there is a difference between what is important in writing software for research, and for a user-focused software product. Academic research software prioritizes scientific correctness and flexibility to experiment above all else in pursuit of the researchers’ end product: published papers. Industry software, on the other hand, prioritizes maintainability, robustness, and testing, as the software (generally speaking) is the product. However, the two tracks share many common goals as well, such as catering to “users” [and] emphasizing performance and reproducibility, but most importantly both ventures are collaborative. Arguably then, both sets of principles are needed to write and maintain high-quality research software.</blockquote>


* An information system is more than a piece of software; it's a collection of all the components and other resources within the system's environment. Some of those will be internal and some external.
This brings us to our first point: applying small-scale, FAIR-driven academic research software engineering practices to the development of larger commercial laboratory informatics software, and conversely applying commercial-scale development practices to small, FAIR-focused academic and institutional research software engineering efforts, has the potential to better support all research laboratories using both independently developed and commercial research software.
* The system is more than just hardware; the interfaces—physical and logical—as well as communication protocols also make up the system.
* The system has physical, logical, and security control boundaries, as well as data flows tied to those boundaries.
* The data housed and transmitted in the system is likely composed of varying degrees of sensitivity, further shaping boundaries.
* The information system's primary functions are directly tied to the goals of the business.


Additionally, when considering the scope of the plan, you'll also want to take into account advancements in both technology and cyber threats. "Unprecedented cybersecurity challenges loom just beyond the horizon," states CNA, a nonprofit research and analysis organization located in Arlington, Virginia. But we have to focus on more than just the "now." CNA adds that "today's operational security agenda is too narrow in scope to address the wide range of issues likely to emerge in the coming years."<ref name="CNACyber19">{{cite web |url=https://www.cna.org/centers/ipr/safety-security/cyber-security-project |archiveurl=https://web.archive.org/web/20220109120854/https://www.cna.org/centers/ipr/safety-security/cyber-security-project |title=Cybersecurity Futures 2025 |work=Institute for Public Research |publisher=CNA |date=2019 |archivedate=09 January 2022 |accessdate=10 March 2023}}</ref> Just as CNA is preparing a global initiative to shape policy on future cybersecurity challenges, so should you apply some focus to what potential technology upgrades may be made and what new cyber threats may appear.  
The concept of the research software engineer (RSE) began to take full form in 2012, and since then universities and institutions of many types have formally developed their own RSE groups and academic programs.<ref name="WoolstonWhySci22">{{Cite journal |last=Woolston |first=Chris |date=2022-05-31 |title=Why science needs more research software engineers |url=https://www.nature.com/articles/d41586-022-01516-2 |journal=Nature |language=en |pages=d41586–022–01516-2 |doi=10.1038/d41586-022-01516-2 |issn=0028-0836}}</ref><ref name="KITRSE@KIT24">{{cite web |url=https://www.rse-community.kit.edu/index.php |title=RSE@KIT |publisher=Karlsruhe Institute of Technology |date=20 February 2024 |accessdate=01 May 2024}}</ref><ref name="PUPurdueCenter">{{cite web |url=https://www.rcac.purdue.edu/rse |title=Purdue Center for Research Software Engineering |publisher=Purdue University |date=2024 |accessdate=01 May 2024}}</ref> RSEs range from pure software developers with little knowledge of a given research discipline, to scientific researchers just beginning to learn how to develop software for their research project(s). While in the past, broadly speaking, researchers often cobbled together research software with less a focus on quality and reproducibility and more on getting their research published, today's push for FAIR data and software by academic journals, institutions, and other researchers seeking to collaborate has placed a much greater focus on the concept of "better software, better research."<ref name="WoolstonWhySci22" /><ref name="CohenTheFour21">{{Cite journal |last=Cohen |first=Jeremy |last2=Katz |first2=Daniel S. |last3=Barker |first3=Michelle |last4=Chue Hong |first4=Neil |last5=Haines |first5=Robert |last6=Jay |first6=Caroline |date=2021-01 |title=The Four Pillars of Research Software Engineering |url=https://ieeexplore.ieee.org/document/8994167/ |journal=IEEE Software |volume=38 |issue=1 |pages=97–105 |doi=10.1109/MS.2020.2973362 |issn=0740-7459}}</ref> Elaborating on that concept, Cohen ''et al.'' add that "ultimately, good research software can make the difference between valid, sustainable, reproducible research outputs and short-lived, potentially unreliable or erroneous outputs."<ref name="CohenTheFour21" />


Finally, some of the plan's scope may be dictated by prioritized assessment of risks to critical assets—addressed in the next section—and other assessments. It's important to keep this in mind when developing the scope; it may be affected by other parts of the plan. As you develop further sections of the plan, you may need to update previous sections with what you've learned.
The concept of [[software quality management]] (SQM) has traditionally not been lost on professional, commercial software development businesses. Good SQM practices have been less prevalent in homegrown research software development; however, the expanded adoption of FAIR data and FAIR software approaches has shifted the focus to the repeatability, reproducibility, and interoperability of research results and data produced by more sustainable research software. The adoption of FAIR by academic and institutional research labs not only brings commercial SQM and other software development approaches into their workflow, but also gives commercial laboratory informatics software developers an opportunity to embrace many aspects of the FAIR approach to laboratory research practices, including lessons learned and development practices from the growing number of RSEs. This doesn't mean commercial developers are going to suddenly take an open-source approach to their code, and it doesn't mean academic and institutional research labs are going to give up the benefits of the open-source paradigm as applied to research software.<ref>{{Cite journal |last=Hasselbring |first=Wilhelm |last2=Carr |first2=Leslie |last3=Hettrick |first3=Simon |last4=Packer |first4=Heather |last5=Tiropanis |first5=Thanassis |date=2020-02-25 |title=From FAIR research data toward FAIR and open research software |url=https://www.degruyter.com/document/doi/10.1515/itit-2019-0040/html |journal=it - Information Technology |language=en |volume=62 |issue=1 |pages=39–47 |doi=10.1515/itit-2019-0040 |issn=2196-7032}}</ref> However, as Moynihan noted, both research software development paradigms stand to gain from the shift to more FAIR data and software.<ref name="MoynihanTheHitch20" /> Additionally, if commercial laboratory informatics vendors want to continue to competitively market relevant and sustainable research software to research labs, they frankly have little choice but to commit extra resources to learning about the application of FAIR principles to their offerings tailored to those labs.
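
As a small illustration of the kind of SQM practice that carries over well to research software, consider an automated regression test. This is only a sketch under assumed conditions: the normalize_intensities() routine and its expected behavior are hypothetical, and the tests follow the plain-assert convention used by common Python test runners such as pytest.

<syntaxhighlight lang="python">
import math

def normalize_intensities(values):
    """Hypothetical research-software routine: scale intensities so they sum to 1."""
    total = sum(values)
    return [v / total for v in values]

def test_normalized_intensities_sum_to_one():
    # Regression test: later refactoring cannot silently change this behavior
    # without the test failing, which supports reproducibility of results.
    result = normalize_intensities([2.0, 3.0, 5.0])
    assert math.isclose(sum(result), 1.0)

def test_normalization_preserves_ratios():
    assert normalize_intensities([2.0, 2.0]) == [0.5, 0.5]
</syntaxhighlight>
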


====5.2.2 Define the roles, responsibilities, and chain of command of those enacting and updating the cybersecurity plan====
===The focus on data types and metadata within the scope of FAIR is shifting how laboratory informatics software developers and RSEs make their research software and choose their database approaches===
You'll also want to define who will fill what roles, what responsibilities they will have, and who reports to whom as part of the scope of your plan. This includes not only who's responsible for developing the cybersecurity plan (which you'll have hopefully determined early on) but also who's responsible for implementing, enforcing, and updating it. Having a senior manager who's able to oversee these responsibilities, make decisions, and enforce requirements will improve the plan's chance of success. Having clearly defined security-related roles and responsibilities (including security risk management) at one or more organizational levels (depending on how big your organization is) will also improve success rates.<ref name="NARUCCyber18" /><ref name="LebanidzeGuide11" /><ref name="CopelandHowToDev18" /><ref name="TalamantesDoesYour17" />
Close to the core of any deep discussion of the FAIR data principles are the concepts of data models, data types, [[metadata]], and persistent unique identifiers (PIDs). Making research objects more findable, accessible, interoperable, and reusable is no easy task when data types and approaches to metadata assignment (if there even is such an approach) are widely differing and inconsistent. Metadata is a means for better storing and characterizing research objects for the purposes of ensuring provenance and reproducibility of those research objects.<ref name="GhiringhelliShared23">{{Cite journal |last=Ghiringhelli |first=Luca M. |last2=Baldauf |first2=Carsten |last3=Bereau |first3=Tristan |last4=Brockhauser |first4=Sandor |last5=Carbogno |first5=Christian |last6=Chamanara |first6=Javad |last7=Cozzini |first7=Stefano |last8=Curtarolo |first8=Stefano |last9=Draxl |first9=Claudia |last10=Dwaraknath |first10=Shyam |last11=Fekete |first11=Ádám |date=2023-09-14 |title=Shared metadata for data-centric materials science |url=https://www.nature.com/articles/s41597-023-02501-8 |journal=Scientific Data |language=en |volume=10 |issue=1 |pages=626 |doi=10.1038/s41597-023-02501-8 |issn=2052-4463 |pmc=PMC10502089 |pmid=37709811}}</ref><ref name="FirschenAgile22">{{Cite journal |last=Fitschen |first=Timm |last2=tom Wörden |first2=Henrik |last3=Schlemmer |first3=Alexander |last4=Spreckelsen |first4=Florian |last5=Hornung |first5=Daniel |date=2022-10-12 |title=Agile Research Data Management with FDOs using LinkAhead |url=https://riojournal.com/article/96075/ |journal=Research Ideas and Outcomes |volume=8 |pages=e96075 |doi=10.3897/rio.8.e96075 |issn=2367-7163}}</ref> This means as early as possible implementing a software-based approach that is FAIR-driven, capturing FAIR metadata using flexible domain-driven [[Ontology (information science)|ontologies]] (i.e., controlled vocabularies) at the source and cleaning up old research objects that aren't FAIR-ready while also limiting hindrances to research processes as much as possible.<ref name="FirschenAgile22" /> And that approach must value the importance of metadata and PIDs. As Weigel ''et al.'' note in a discussion on making laboratory data and workflows more machine-findable: "Metadata capture must be highly automated and reliable, both in terms of technical reliability and ensured metadata quality. This requires an approach that may be very different from established procedures."<ref>{{Cite journal |last=Weigel |first=Tobias |last2=Schwardmann |first2=Ulrich |last3=Klump |first3=Jens |last4=Bendoukha |first4=Sofiane |last5=Quick |first5=Robert |date=2020-01 |title=Making Data and Workflows Findable for Machines |url=https://direct.mit.edu/dint/article/2/1-2/40-46/9994 |journal=Data Intelligence |language=en |volume=2 |issue=1-2 |pages=40–46 |doi=10.1162/dint_a_00026 |issn=2641-435X}}</ref> Enter non-relational RDF [[knowledge graph]] [[database]]s.
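
To make "capturing FAIR metadata at the source" more concrete, the sketch below records a measurement together with a controlled-vocabulary term and a persistent identifier at the moment the result is generated. The vocabulary IRIs, identifier scheme, instrument name, and field names are all hypothetical placeholders rather than a reference to any particular ontology or informatics product.

<syntaxhighlight lang="python">
import uuid
from datetime import datetime, timezone

# Hypothetical controlled-vocabulary (ontology) terms the capturing software maps to.
VOCAB = {
    "pH": "https://example.org/vocab/analyte/ph",        # placeholder ontology IRIs
    "lead": "https://example.org/vocab/analyte/pb",
}

def record_measurement(analyte: str, value: float, unit: str) -> dict:
    """Capture a result plus FAIR-oriented metadata at the point of generation."""
    return {
        "pid": f"https://example.org/id/{uuid.uuid4()}",  # placeholder persistent identifier
        "analyte_label": analyte,
        "analyte_term": VOCAB[analyte],                   # controlled-vocabulary term, not free text
        "value": value,
        "unit": unit,
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "instrument": "ICP-MS-02",                        # hypothetical instrument identifier
    }

print(record_measurement("lead", 0.012, "mg/L"))
</syntaxhighlight>
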


====5.2.3 Ensure that roles and responsibility for security (the “who” of it) are clear====
This brings us to our second point: given the importance of metadata and PIDs to FAIRifying research objects (and even research software), established, more traditional research software development methods using common relational databases may not be enough, even for commercial laboratory informatics software developers. Non-relational [[Resource Description Framework]] (RDF) knowledge graph databases used in FAIR-driven, well-designed laboratory informatics software help make research objects more FAIR for all research labs.  
Defining roles, responsibilities, and chain of command isn't enough. Effectively communicating these roles and responsibilities to everyone inside and outside the organization—including third parties such as contractors and [[Cloud computing|cloud providers]]—is vital. This typically involves encouraging transparency around the organization's cybersecurity and responsibility goals, as well as addressing everyday communications and the education of everyone affected by the cybersecurity plan.<ref name="NARUCCyber18" /><ref name="LebanidzeGuide11" /><ref name="CopelandHowToDev18" /> Through it all, keep in mind for future communications and training that security is ultimately everyone's responsibility, from employees to contractors, not just those enacting and updating the plan.


<div class="nonumtoc">__TOC__</div>
Research objects can take many forms (i.e., data types), making the storage and management of those objects challenging, particularly in research settings with great diversity of data, as with materials research. Some have approached this challenge by combining different database and systems technologies that are best suited for each data type.<ref name="AggourSemantics24">{{Cite journal |last=Aggour |first=Kareem S. |last2=Kumar |first2=Vijay S. |last3=Gupta |first3=Vipul K. |last4=Gabaldon |first4=Alfredo |last5=Cuddihy |first5=Paul |last6=Mulwad |first6=Varish |date=2024-04-09 |title=Semantics-Enabled Data Federation: Bringing Materials Scientists Closer to FAIR Data |url=https://link.springer.com/10.1007/s40192-024-00348-4 |journal=Integrating Materials and Manufacturing Innovation |language=en |doi=10.1007/s40192-024-00348-4 |issn=2193-9764}}</ref> However, while query performance and storage footprint improve with this approach, data across the different storage mechanisms typically remains unlinked and non-compliant with FAIR principles. Here, either a full RDF knowledge graph database or a similar integration layer is required to make the research objects more interoperable and reusable, whether they are materials records or specimen data.<ref name="AggourSemantics24" /><ref name="GrobeFromData19">{{Cite journal |last=Grobe |first=Peter |last2=Baum |first2=Roman |last3=Bhatty |first3=Philipp |last4=Köhler |first4=Christian |last5=Meid |first5=Sandra |last6=Quast |first6=Björn |last7=Vogt |first7=Lars |date=2019-06-26 |title=From Data to Knowledge: A semantic knowledge graph application for curating specimen data |url=https://biss.pensoft.net/article/37412/ |journal=Biodiversity Information Science and Standards |language=en |volume=3 |pages=e37412 |doi=10.3897/biss.3.37412 |issn=2535-0897}}</ref>
===5.3 Identify cybersecurity requirements and objectives===
[[File:Cybersecurity Strategy 5 Layer CS5L.png|right|450px]]
====5.3.1 Detail the existing system and classify its critical and non-critical cyber assets====
AHIMA recommends you "create an information asset inventory as a base for risk analysis that defines where all data and information are stored across the entire organization."<ref name="DowningAHIMA17" /> Consider any applications and systems used within the periphery of your operations, including business intelligence software, mobile devices, and legacy systems. Remember that any networked application or system could potentially be compromised and turned into a vector of attack. Additionally, classify and gauge those assets based on type, risk, and criticality. What are their essential functions? How can they be grouped? How do they communicate: internally, externally, or not at all?<ref name="LebanidzeGuide11" /> As AHIMA notes, you'll be able to use this asset inventory, in combination with a variety of additional assessments described below, as a base for your risk assessment.
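
An asset inventory of the kind AHIMA recommends can start as a simple structured list that records each asset's type, criticality, essential function, and how it communicates. The sketch below is purely illustrative; the asset names and category labels are hypothetical and would be replaced by whatever scheme your organization adopts.

<syntaxhighlight lang="python">
from dataclasses import dataclass, asdict

@dataclass
class CyberAsset:
    name: str
    asset_type: str       # e.g., "application", "instrument", "mobile device", "legacy system"
    criticality: str      # e.g., "critical" or "non-critical"
    communicates: str     # "internal", "external", or "none"
    essential_function: str

# Hypothetical inventory entries, for illustration only.
inventory = [
    CyberAsset("LIMS server", "application", "critical", "internal", "Sample and result management"),
    CyberAsset("Courier tablet", "mobile device", "non-critical", "external", "Chain-of-custody capture"),
    CyberAsset("Legacy balance PC", "legacy system", "critical", "none", "Weight measurements"),
]

# A flat export like this can feed the risk assessment described later in the plan.
for asset in inventory:
    print(asdict(asset))
</syntaxhighlight>
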


====5.3.2 Define the contained data and classify its criticality====
It is beyond the scope of this Q&A article to discuss RDF knowledge graph databases at length. (For a deeper dive on this topic, see Rocca-Serra ''et al.'' and the FAIR Cookbook.<ref name="Rocca-SerraFAIRCook22">{{Cite book |last=Rocca-Serra, Philippe |last2=Sansone, Susanna-Assunta |last3=Gu, Wei |last4=Welter, Danielle |last5=Abbassi Daloii, Tooba |last6=Portell-Silva, Laura |date=2022-06-30 |title=D2.1 FAIR Cookbook |url=https://zenodo.org/record/6783564 |chapter=FAIR and Knowledge graphs |doi=10.5281/ZENODO.6783564}}</ref>) However, know that the primary strength of these databases to FAIRification of research objects is their ability to provide [[Semantics|semantic]] transparency (i.e., provide a framework for better understanding and reusing the greater research object through basic examination of the relationships of its associated metadata and their constituents), making these objects more easily accessible, interoperable, and machine-readable.<ref name="AggourSemantics24" /> The resulting knowledge graphs, with their "subject-property-object" syntax and PIDs or uniform resource identifiers (URIs) helping to link data, metadata, ontology classes, and more, can be interpreted, searched, and linked by machines, and made human-readable, resulting in better research through derivation of new knowledge from the existing research objects. The end result is a representation of heterogeneous data and metadata that complies with the FAIR guiding principles.<ref name="AggourSemantics24" /><ref name="GrobeFromData19" /><ref name="Rocca-SerraFAIRCook22" /><ref name="TomlinsonRDF23">{{cite web |url=https://21624527.fs1.hubspotusercontent-na1.net/hubfs/21624527/Resources/RDF%20Knowledge%20Graph%20Databases%20White%20Paper.pdf |format=PDF |title=RDF Knowledge Graph Databases: A Better Choice for Life Science Lab Software |author=Tomlinson, E. |publisher=Semaphore Solutions, Inc |date=28 July 2023 |accessdate=01 May 2024}}</ref><ref name="DeagenFAIRAnd22">{{Cite journal |last=Deagen |first=Michael E. |last2=McCusker |first2=Jamie P. |last3=Fateye |first3=Tolulomo |last4=Stouffer |first4=Samuel |last5=Brinson |first5=L. Cate |last6=McGuinness |first6=Deborah L. |last7=Schadler |first7=Linda S. |date=2022-05-27 |title=FAIR and Interactive Data Graphics from a Scientific Knowledge Graph |url=https://www.nature.com/articles/s41597-022-01352-z |journal=Scientific Data |language=en |volume=9 |issue=1 |pages=239 |doi=10.1038/s41597-022-01352-z |issn=2052-4463 |pmc=PMC9142568 |pmid=35624233}}</ref><ref>{{Cite journal |last=Brandizi |first=Marco |last2=Singh |first2=Ajit |last3=Rawlings |first3=Christopher |last4=Hassani-Pak |first4=Keywan |date=2018-09-25 |title=Towards FAIRer Biological Knowledge Networks Using a Hybrid Linked Data and Graph Database Approach |url=https://www.degruyter.com/document/doi/10.1515/jib-2018-0023/html |journal=Journal of Integrative Bioinformatics |language=en |volume=15 |issue=3 |pages=20180023 |doi=10.1515/jib-2018-0023 |issn=1613-4516 |pmc=PMC6340125 |pmid=30085931}}</ref> This concept can even be extended to ''post factum'' visualizations of the knowledge graph data<ref name="DeagenFAIRAnd22" />, as well as the FAIR management of computational laboratory [[workflow]]s.<ref>{{Cite journal |last=de Visser |first=Casper |last2=Johansson |first2=Lennart F. |last3=Kulkarni |first3=Purva |last4=Mei |first4=Hailiang |last5=Neerincx |first5=Pieter |last6=Joeri van der Velde |first6=K. |last7=Horvatovich |first7=Péter |last8=van Gool |first8=Alain J. |last9=Swertz |first9=Morris A. 
|last10=Hoen |first10=Peter A. C. ‘t |last11=Niehues |first11=Anna |date=2023-09-28 |editor-last=Palagi |editor-first=Patricia M. |title=Ten quick tips for building FAIR workflows |url=https://dx.plos.org/10.1371/journal.pcbi.1011369 |journal=PLOS Computational Biology |language=en |volume=19 |issue=9 |pages=e1011369 |doi=10.1371/journal.pcbi.1011369 |issn=1553-7358 |pmc=PMC10538699 |pmid=37768885}}</ref>
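
The "subject-property-object" structure described above can be illustrated in a few lines of Python using the open-source rdflib library. The namespace, identifiers, and property names below are hypothetical placeholders rather than a recommended schema; the point is simply how triples link data, metadata, and vocabulary terms via URIs.

<syntaxhighlight lang="python">
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, DCTERMS

EX = Namespace("https://example.org/lab/")                       # placeholder namespace
g = Graph()

specimen = URIRef("https://example.org/lab/specimen/S-0042")     # placeholder PID/URI
result = URIRef("https://example.org/lab/result/R-1001")

# Each statement is a subject-property-object triple linking data and metadata.
g.add((specimen, RDF.type, EX.Specimen))
g.add((specimen, DCTERMS.identifier, Literal("S-0042")))
g.add((result, EX.measuredOn, specimen))
g.add((result, EX.analyte, EX.Lead))
g.add((result, EX.valueMgPerL, Literal(0.012)))

# Serializing as Turtle yields a human-readable, machine-interpretable graph.
print(g.serialize(format="turtle"))
</syntaxhighlight>
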
During the asset inventory, you'll also want to address classifying the type of data contained or transported by the cyber asset, which aids in decision-making regarding the controls you'll need to adequately protect the assets.<ref name="LebanidzeGuide11" /> Use a consistent nomenclature to define the data. For example, universities such as the University of Illinois and Carnegie Mellon University provide guidance on how to classify institutional data based on characteristics such as criticality, sensitivity, and risk. The University of Illinois has a defined set of standardized terms such as "high-risk," "sensitive," "internal," and "public,"<ref name="UoIData19">{{cite web |url=https://cybersecurity.uillinois.edu/data_classification |title=Data Classification Overview |work=Cybersecurity |publisher=University of Illinois System |date=2019 |accessdate=10 March 2023}}</ref> whereas Carnegie Mellon uses "restricted," "private," and "public."<ref name="CMUGuidelines18">{{cite web |url=https://www.cmu.edu/data/guidelines/data-classification.html |title=Guidelines for Data Classification |work=Information Security Office Guidelines |publisher=Carnegie Mellon University |date=16 November 2022 |accessdate=10 March 2023}}</ref> You don't necessarily need to use anyone's classification system verbatim; however, do use a consistent set of terminology to define and classify data.<ref name="LebanidzeGuide11" /> Consider also adding details about whether the data is in-motion, in-use, or at-rest.<ref name="BowieSEC19">{{cite web |url=https://adeliarisk.com/sec-cybersecurity-guidance-data-loss-prevention/ |archiveurl=https://web.archive.org/web/20191130181159/https://adeliarisk.com/sec-cybersecurity-guidance-data-loss-prevention/ |title=SEC Cybersecurity Guidance: Data Loss Prevention |author=Bowie, K. |publisher=Adelia Associates, LLC |date=09 April 2019 |archivedate=30 November 2019 |accessdate=10 March 2023}}</ref>
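
Whichever labels you adopt, the classification scheme can be captured in a simple, consistent structure and applied uniformly. The sketch below reuses the University of Illinois-style labels cited above purely as an example; the data set names, mappings, and the toy encryption rule are hypothetical.

<syntaxhighlight lang="python">
# Example classification tiers (following the University of Illinois-style labels cited above).
CLASSIFICATION_LEVELS = ["high-risk", "sensitive", "internal", "public"]
DATA_STATES = ["in-motion", "in-use", "at-rest"]

# Hypothetical mapping of data sets to classification level and state, for illustration only.
data_classification = {
    "patient_results":    {"level": "high-risk", "states": ["at-rest", "in-motion"]},
    "qc_control_charts":  {"level": "internal",  "states": ["in-use", "at-rest"]},
    "public_method_list": {"level": "public",    "states": ["at-rest"]},
}

def requires_encryption(entry: dict) -> bool:
    """Toy decision rule: high-risk or sensitive data moving over a network gets encrypted."""
    return entry["level"] in ("high-risk", "sensitive") and "in-motion" in entry["states"]

for name, entry in data_classification.items():
    print(name, "-> encrypt in transit:", requires_encryption(entry))
</syntaxhighlight>
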


If you have difficulties classifying the data, pose a series of data protection questions concerning the data's characteristics. One such baseline for questions could be the European Union's definition of what constitutes personal data. For example<ref name="LebanidzeGuide11" /><ref name="KochWhatIs19">{{cite web |url=https://gdpr.eu/eu-gdpr-personal-data/ |title=What is considered personal data under the EU GDPR? |author=Koch, R. |publisher=Proton Technologies AG |date=01 February 2019 |accessdate=10 March 2023}}</ref>:
Though still uncommon, some commercial laboratory informatics vendors, like Semaphore Solutions, have already recognized the potential of RDF knowledge graph databases for FAIR-driven laboratory research, implementing such structures in their offerings.<ref name="TomlinsonRDF23" /> (The use of knowledge graphs has already been demonstrated in academic research software, such as with the ELN tools developed by RSEs at the University of Rostock and University of Amsterdam.<ref>{{Cite journal |last=Schröder |first=Max |last2=Staehlke |first2=Susanne |last3=Groth |first3=Paul |last4=Nebe |first4=J. Barbara |last5=Spors |first5=Sascha |last6=Krüger |first6=Frank |date=2022-12 |title=Structure-based knowledge acquisition from electronic lab notebooks for research data provenance documentation |url=https://jbiomedsem.biomedcentral.com/articles/10.1186/s13326-021-00257-x |journal=Journal of Biomedical Semantics |language=en |volume=13 |issue=1 |pages=4 |doi=10.1186/s13326-021-00257-x |issn=2041-1480 |pmc=PMC8802522 |pmid=35101121}}</ref>) As noted in the prior point, it is potentially advantageous not only for laboratory informatics vendors to provide, but also for research labs to use, relevant and sustainable research software that has the FAIR principles embedded in its design. Turning to knowledge graph databases is another example of keeping such software relevant and FAIR for research labs.


* Does the data identify an individual directly?
===Applying FAIR-driven metadata schemes to laboratory informatics software development gives data a FAIRer chance at being ready for machine learning and artificial intelligence applications===
* Does the data relate specifically to an identifiable person?
The third and final point for this Q&A article highlights another positive consequence of engineering laboratory informatics software with FAIR in mind: FAIRified research objects are much closer to being usable by the [[machine learning]] (ML) and [[artificial intelligence]] (AI) tools increasingly being added to laboratory informatics platforms and other companion research software. By developing laboratory informatics software with a focus on FAIR-driven metadata and database schemes, research objects become not only more FAIR but also "cleaner" and more machine-ready for advanced analytical uses such as ML and AI.
* Could the data—when processed, lost, or misused—have an impact on an individual?


====5.3.3 Identify current and previous cybersecurity policy and tools, as well as their effectiveness====
To be sure, the FAIRness of any structured dataset alone is not enough to make it ready for ML and AI applications. Factors such as classification, completeness, context, correctness, duplicity, integrity, mislabeling, outliers, relevancy, sample size, and timeliness of the research object and its contents are also important to consider.<ref name="HinidumaDataRead24">{{Cite journal |last=Hiniduma |first=Kaveen |last2=Byna |first2=Suren |last3=Bez |first3=Jean Luca |date=2024 |title=Data Readiness for AI: A 360-Degree Survey |url=https://arxiv.org/abs/2404.05779 |journal=arXiv |doi=10.48550/ARXIV.2404.05779}}</ref><ref name="FletcherFAIRRe24">{{Cite journal |last=Fletcher |first=Lydia |date=2024-04-16 |others=The University Of Texas At Austin, The University Of Texas At Austin |title=FAIR Re-use: Implications for AI-Readiness |url=https://repositories.lib.utexas.edu/handle/2152/124873 |doi=10.26153/TSW/51475}}</ref> When those factors aren't appropriately addressed as part of a FAIRification effort towards AI readiness (as well as part of the development of research software of all types), research data and metadata have a higher likelihood of revealing themselves to be inconsistent. As such, searches and analytics using that data and metadata become muddled, and the ultimate ML or AI output will also be muddled (i.e., "garbage in, garbage out"). Whether retroactively updating existing research objects to a more FAIRified state or ensuring research objects (e.g., those originating in an ELN or LIMS) are more FAIR and AI-ready from the start, research software updating or generating those research objects has to address ontologies, data models, data types, identifiers, and more in a thorough yet flexible way.<ref name="OlsenEmbracing23">{{cite web |url=https://www.pharmasalmanac.com/articles/embracing-fair-data-on-the-path-to-ai-readiness |title=Embracing FAIR Data on the Path to AI-Readiness |author=Olsen, C. |work=Pharma's Almanac |date=01 September 2023 |accessdate=03 May 2024}}</ref>
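
As a small illustration of checking a few of these readiness factors before analysis, the sketch below profiles a toy dataset for completeness and duplication using the pandas library; the column names, values, and the readiness threshold are hypothetical.

<syntaxhighlight lang="python">
import pandas as pd

# Toy research dataset with a deliberate gap and a duplicated record (illustrative only).
df = pd.DataFrame({
    "sample_id": ["S-001", "S-002", "S-002", "S-003"],
    "analyte":   ["lead", "lead", "lead", None],
    "value":     [0.012, 0.015, 0.015, 0.040],
})

completeness = 1.0 - df.isna().mean()          # fraction of non-missing values per column
duplicate_rows = int(df.duplicated().sum())    # exact duplicate records

print("Completeness by column:\n", completeness)
print("Duplicate rows:", duplicate_rows)

# A simple (hypothetical) gate before the data is considered ML/AI-ready.
ready = bool((completeness >= 0.95).all()) and duplicate_rows == 0
print("AI-ready by this toy rule:", ready)
</syntaxhighlight>
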
Unless your business is in the formative stages, some type of technology infrastructure and policy likely exists. What, if any, cybersecurity policies and tools have you implemented in the past? Review any current access control protocols (e.g., role-based and "least privilege" policies) and security policies. Have they been updated to take into consideration recent changes in threats, risks, criticality, technology, or regulation?<ref name="DowningAHIMA17" /><ref name="LagoHowTo19">{{cite web |url=https://www.cio.com/article/222076/how-to-implement-a-successful-security-plan.html |title=How to implement a successful cybersecurity plan |author=Lago, C. |work=CIO |publisher=IDG Communications, Inc |date=10 July 2019 |accessdate=10 March 2023}}</ref><ref name="CopelandHowToDev18">{{cite web |url=https://www.copelanddata.com/blog/how-to-develop-a-cybersecurity-plan/ |title=How to Develop A Cybersecurity Plan For Your Company (checklist included) |publisher=Copeland Technology Solutions |date=17 July 2018 |accessdate=10 March 2023}}</ref> In the same way, identify any past security policies and why they were discontinued. It may be convenient to track all these security protocols and policies in a master sheet, rather than spread out across multiple documents. Also, now might be a good time to identify how security-aware personnel are overall.<ref name="LagoHowTo19" /> Of course, if protocols and policies aren't in place, create them, remembering to factor proper communication, scheduled policy reviews, and training into the equation.
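If a simple "master sheet" approach appeals, even a lightweight, scripted register can help flag policies that are overdue for review. The following Python sketch is purely illustrative; the policy names, owners, and annual review interval are assumptions.

<syntaxhighlight lang="python">
# Minimal sketch of a scripted policy "master sheet" used to flag security policies
# that are overdue for review. The policy names, owners, and the 365-day review
# interval are illustrative assumptions.
from datetime import date

policies = [
    {"name": "Password policy",      "owner": "IT", "last_review": date(2022, 6, 1),  "status": "active"},
    {"name": "Remote access policy", "owner": "IT", "last_review": date(2023, 1, 15), "status": "active"},
    {"name": "Legacy VPN policy",    "owner": "IT", "last_review": date(2019, 3, 1),  "status": "retired"},
]

REVIEW_INTERVAL_DAYS = 365  # assumed annual review cycle

for p in policies:
    age = (date.today() - p["last_review"]).days
    if p["status"] == "active" and age > REVIEW_INTERVAL_DAYS:
        print(f'{p["name"]}: review overdue ({age} days since last review)')
</syntaxhighlight>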


====5.3.4 Identify the regulations, standards, and best practices affecting your assets and data====
Noting that Wilkinson ''et al.'' originally highlighted the importance of machine-readability of FAIR data, Huerta ''et al.'' add that this core principle of FAIRness "is synergistic with the rapid adoption and increased use of AI in research."<ref name="HuertaFAIRForAI23">{{Cite journal |last=Huerta |first=E. A. |last2=Blaiszik |first2=Ben |last3=Brinson |first3=L. Catherine |last4=Bouchard |first4=Kristofer E. |last5=Diaz |first5=Daniel |last6=Doglioni |first6=Caterina |last7=Duarte |first7=Javier M. |last8=Emani |first8=Murali |last9=Foster |first9=Ian |last10=Fox |first10=Geoffrey |last11=Harris |first11=Philip |date=2023-07-26 |title=FAIR for AI: An interdisciplinary and international community building perspective |url=https://www.nature.com/articles/s41597-023-02298-6 |journal=Scientific Data |language=en |volume=10 |issue=1 |pages=487 |doi=10.1038/s41597-023-02298-6 |issn=2052-4463 |pmc=PMC10372139 |pmid=37495591}}</ref> They go on to discuss the positive interactions of FAIR research objects with FAIR-driven, AI-based research. The benefits include<ref name="HuertaFAIRForAI23" />:
Arguably, most business types will be impacted by [[Regulatory compliance|regulations]], standards, or best practices. Even niche professions like cinema editors are guided by best practices set forth by professional organizations.<ref name="ACEBest17">{{cite web |url=https://americancinemaeditors.org/best-practices-guide/ |title=ACE Best Practices Guide for Post Production |publisher=American Cinema Editors |date=2017 |accessdate=10 March 2023}}</ref> In the case of laboratories, multiple regulations and standards apply to operations, including information management and privacy practices. Presumably one or more executives in your business are familiar with the legal and professional aspects of how the business should be run. If not, significant research and outside consultant help may be required. Regardless, when approaching this task, ensure everyone understands the distinctions among "regulation," "standard," and "best practice."  


Remember that while regulators may dictate how you manage your cybersecurity assets, setting policy that goes above and beyond regulation can occasionally be detrimental to your business. [[Data retention]] requirements, for example, are important to consider, not only for regulatory purposes but also for data management and security reasons. To be sure, numerous U.S. Code of Federal Regulations (e.g., [[21 CFR Part 11]], 40 CFR Part 141, and 45 CFR Part 164), European Union regulations (e.g., E.U. Annex 11 and E.U. Commission Directive 2003/94/EC), and even global entities (e.g., WHO Technical Report Series, #986, Annex 2) address the need for record retention. However, as AHIMA points out, records shouldn't be kept forever<ref name="DowningAHIMA17" />:
*greater findability of FAIR research objects for further AI-driven scientific discovery;
*greater reproducibility of FAIR research objects and any AI models published with them;
*improved generalization of AI-driven medical research models when exposed to diverse and FAIR research objects;
*improved reporting of AI-driven research results using FAIRified research objects, lending further credibility to those results;
*more uniform comparison of AI models using well-defined hyperparameters and training information from FAIRified research objects;
*more developed and interoperable "data e-infrastructure," which can further drive a more effective "AI services layer";
*reduced bias in AI-driven processes through the use of FAIR research objects and AI models; and
*improved surety of scientific correctness where reproducibility in AI-driven research can't be guaranteed.


<blockquote>Healthcare organizations have been storing and maintaining records and information well beyond record retention requirements. This creates significant additional security risks as systems and records must be maintained, patched, backed up, and provisioned (access) for longer than necessary or required by law ...  In the era of big data the idea of keeping “everything forever” must end. It simply is not feasible, practical, or economical to secure legacy and older systems forever.</blockquote>
In the end, developers of research software (whether discipline-specific research software or broader laboratory informatics solutions) would be well advised to keep in mind the growing trends of FAIR research, FAIR software, and ML- and AI-driven research, especially in the [[life sciences]] but also in a variety of other fields.<ref name="HuertaFAIRForAI23" />


This example illustrates the idea that while regulatory compliance is imperative, going well beyond compliance limits has its own costs, not only financially but also by increasing cybersecurity risk.
===Restricted clinical data and its FAIRification for greater research innovation===
Broader discussion continues in the research community regarding how best to ethically make restricted or privacy-protected clinical data and information FAIR for greater innovation and, by extension, improved patient outcomes, particularly in the wake of the [[COVID-19]] [[pandemic]].<ref name="MaxwellFAIREthic23">{{Cite journal |last=Maxwell |first=Lauren |last2=Shreedhar |first2=Priya |last3=Dauga |first3=Delphine |last4=McQuilton |first4=Peter |last5=Terry |first5=Robert F |last6=Denisiuk |first6=Alisa |last7=Molnar-Gabor |first7=Fruzsina |last8=Saxena |first8=Abha |last9=Sansone |first9=Susanna-Assunta |date=2023-10 |title=FAIR, ethical, and coordinated data sharing for COVID-19 response: a scoping review and cross-sectional survey of COVID-19 data sharing platforms and registries |url=https://linkinghub.elsevier.com/retrieve/pii/S2589750023001292 |journal=The Lancet Digital Health |language=en |volume=5 |issue=10 |pages=e712–e736 |doi=10.1016/S2589-7500(23)00129-2 |pmc=PMC10552001 |pmid=37775189}}</ref><ref name="Queralt-RosinachApplying22">{{Cite journal |last=Queralt-Rosinach |first=Núria |last2=Kaliyaperumal |first2=Rajaram |last3=Bernabé |first3=César H. |last4=Long |first4=Qinqin |last5=Joosten |first5=Simone A. |last6=van der Wijk |first6=Henk Jan |last7=Flikkenschild |first7=Erik L.A. |last8=Burger |first8=Kees |last9=Jacobsen |first9=Annika |last10=Mons |first10=Barend |last11=Roos |first11=Marco |date=2022-12 |title=Applying the FAIR principles to data in a hospital: challenges and opportunities in a pandemic |url=https://jbiomedsem.biomedcentral.com/articles/10.1186/s13326-022-00263-7 |journal=Journal of Biomedical Semantics |language=en |volume=13 |issue=1 |pages=12 |doi=10.1186/s13326-022-00263-7 |issn=2041-1480 |pmc=PMC9036506 |pmid=35468846}}</ref><ref>{{Cite journal |last=Martínez-García |first=Alicia |last2=Alvarez-Romero |first2=Celia |last3=Román-Villarán |first3=Esther |last4=Bernabeu-Wittel |first4=Máximo |last5=Luis Parra-Calderón |first5=Carlos |date=2023-05 |title=FAIR principles to improve the impact on health research management outcomes |url=https://linkinghub.elsevier.com/retrieve/pii/S2405844023029407 |journal=Heliyon |language=en |volume=9 |issue=5 |pages=e15733 |doi=10.1016/j.heliyon.2023.e15733 |pmc=PMC10189186 |pmid=37205991}}</ref> (Note that while there are other types of restricted and privacy-protected data, this section will focus largely on clinical data and research objects as the most obvious type.)


====5.3.5 Identify and analyze logical and physical system entry points and configurations====
These efforts have usually revolved around pulling reusable clinical patient or research data from [[hospital information system]]s (HIS), [[electronic medical record]]s (EMRs), [[clinical trial management system]]s (CTMSs), and research databases (often relational in nature) that either contain de-identified data or can de-identify aspects of data and information before access and extraction. Sometimes that clinical data or research object may already have been partly FAIRified, but often it has not been. In all cases, the concepts of privacy, security, and anonymization come up as part of any desire to gain access to that clinical material. However, FAIRified clinical data isn't necessarily readily open for access. As Snoeijer ''et al.'' note: "The authors of the FAIR principles, however, clearly indicate that 'accessible' does not mean open. It means that clarity and transparency is required around the conditions governing access and reuse."<ref name="SnoeijerProcess19">{{cite book |url=https://phuse.s3.eu-central-1.amazonaws.com/Archive/2019/Connect/EU/Amsterdam/PAP_SA04.pdf |format=PDF |chapter=Paper SA04 - Processing big data from multiple sources |title=Proceedings of PHUSE Connect EU 2019 |author=Snoeijer, B.; Pasapula, V.; Covucci, A. et al. |publisher=PHUSE Limited |year=2019 |accessdate=03 May 2024}}</ref>
This step is actually closely tied to the next step concerning gap analysis. As such, you may wish to address both steps together. You've already identified your critical and non-critical assets, and performing a gap analysis on them may be a useful start in finding and analyzing the logical entry points of a system. But what are some of the most common entry points that attackers may use?<ref name="KumarDiscover16">{{cite web |url=https://resources.infosecinstitute.com/topic/discovering-entry-points/ |title=Discovering Entry Points |author=Kumar, A.J. |publisher=InfoSec Institute |date=06 September 2016 |accessdate=10 March 2023}}</ref><ref name="AhmedIndustrial19">{{cite web |url=https://www.controleng.com/articles/industrial-control-system-ics-cybersecurity-advice-best-practices/ |title=Industrial control system (ICS) cybersecurity advice, best practices |author=Ahmed, O.; Rehman, A.; Habib, A. |work=Control Engineering |publisher=CFE Media LLC |date=12 May 2019 |accessdate=10 March 2023}}</ref><ref name="BonderudPodcast19">{{cite web |url=https://securityintelligence.com/media/podcast-lateral-movement-combating-high-risk-low-noise-threats/ |title=Podcast: Lateral Movement: Combating High-Risk, Low-Noise Threats |author=Bonderud, D. |work=SecurityIntelligence |publisher=IBM |date=11 June 2019 |accessdate=10 March 2023}}</ref><ref name="VerizonIncident19">{{cite web |url=https://www.verizon.com/business/resources/reports/dbir/2019/incident-classification-patterns-subsets/ |title=Incident Classification Patterns and Subsets |work=2019 Data Breach Investigations Report |publisher=Verizon |date=2019 |accessdate=10 March 2023}}</ref>


* Inbound network-based attacks through software, network gateways, and online repositories
This is worth mentioning in the context of laboratory informatics applications for a couple of reasons. First, a well-designed commercial LIMS that supports clinical research laboratory workflows is already going to address privacy and security aspects, as part of the developer recognizing the need for those labs to adhere to regulations such as the [[Health Insurance Portability and Accountability Act]] (HIPAA) and comply with standards such as [[ISO 15189]]. However, such a system may not have been developed with FAIR data principles in mind, and any built-in metadata and ontology schemes may be insufficient for full FAIRification of laboratory-based clinical trial research objects. As Queralt-Rosinach ''et al.'' note, however, "interestingly, ontologies may also be used to describe data access restrictions to complement FAIR metadata with information that supports data safety and patient privacy."<ref name="Queralt-RosinachApplying22" /> Essentially, the authors are suggesting that while a HIS or LIS may have built-in access management tools, setting up ontologies and metadata mechanisms that link privacy aspects of a research object (e.g., "has consent form for," "is de-identified," etc.) to the object's metadata allows for even more flexible, FAIR-driven approaches to privacy and security. Research software developers creating such information management tools for the regulated clinical research space may want to apply FAIR concepts such as this to how access control and privacy restrictions are managed. This will inevitably mean any research objects exported with machine-readable, privacy-related metadata will be more reusable in a way that still "supports data safety and patient privacy."<ref name="Queralt-RosinachApplying22" />
* Inbound network-based attacks through misconfigured firewalls and gateways
* Access to systems using stolen credentials (networked and physical)
* Access to peripheral systems via communication protocols, insecure credentials, etc. through lateral movement in the network


From email and [[enterprise resource planning]] (ERP) applications and servers to networking devices and tools, a wide variety of vectors for attack exist in the system, some more common than others. Analyzing these components and configurations takes significant expertise. If internal expertise is unavailable for this, it may require a third-party security assessment to gain a clearer picture of the entry points into your system. Even employees and their lack of cybersecurity knowledge may represent points of entry, via phishing schemes.<ref name="DowningAHIMA17" /><ref name="VerizonIncident19" /> This is where training and internal random testing (addressed later) come into play.<ref name="DowningAHIMA17" />
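For the network-facing entry points above, even a simple reachability check can serve as a first-pass inventory of listening services, though a real assessment would rely on dedicated scanning and configuration-review tooling. The host and port list in the following Python sketch are illustrative, and any such check should only be run against systems you are authorized to test.

<syntaxhighlight lang="python">
# Minimal sketch of enumerating one class of logical entry point: TCP services
# listening on a host. The host and port list are illustrative placeholders, and a
# real assessment would use dedicated scanning and configuration-review tooling.
import socket

HOST = "127.0.0.1"                 # assumed: a host you are authorized to test
PORTS = [22, 80, 443, 1433, 3389]  # common remote-access and database ports

for port in PORTS:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        reachable = s.connect_ex((HOST, port)) == 0  # 0 means the connection succeeded
    print(f"port {port}: {'open' if reachable else 'closed or filtered'}")
</syntaxhighlight>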
Second, a well-designed research software solution working with clinical data will support not only open, community-supported data models and vocabularies, but also standardized, community-driven ontologies specifically developed for access control and privacy. Queralt-Rosinach ''et al.'' continue<ref name="Queralt-RosinachApplying22" />:


Physical access to system components and data also represents a significant attack vector, more so in particular industries and network set-ups. For example, industrial control systems in manufacturing plants may require extra consideration, with some control system vendors now offering an added layer of physical security in the form of physical locks that prevent code from being executed on the controller.<ref name="AhmedIndustrial19" /> Cloud-based data centers and field-based monitoring systems represent other specialist situations requiring added physical controls.<ref name="LebanidzeGuide11" /><ref name="DowningAHIMA17" /><ref name="CopelandHowToDev18" /> That's not to say that even small businesses shouldn't worry about physical security; their workstations, laptops, USB drives, mobile devices, etc. can be compromised if offices and other work spaces are easy for the general public to access.<ref name="CopelandHowToDev18" /> In regulated environments, physical access controls and facility monitoring may even be mandated.
<blockquote>Also, very important for accessibility and data privacy is that the digital objects ''per se'' can accommodate the criteria and protocols necessary to comply with regulatory and governance frameworks. Ontologies can aid in opening and protecting patient data by exposing logical definitions of data use conditions. Indeed, there are ontologies to define access and reuse conditions for patient data such as the Informed Consent Ontology (ICO), the Global Alliance for Genomics and Health Data Use Ontology (DUO) standard, and the Open Digital Rights Language (ODRL) vocabulary recommended by W3C.</blockquote>
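To make the idea concrete, the following minimal Python (rdflib) sketch attaches human- and machine-readable use conditions to a dataset's metadata in the spirit of the ontologies named above. The dataset IRI, the custom predicate, and the specific DUO term identifier are placeholders to be replaced with terms from the published ontologies and the organization's own vocabulary.

<syntaxhighlight lang="python">
# Minimal sketch of attaching machine-readable data-use conditions to a dataset's
# metadata with rdflib. The dataset IRI, the custom dataUseCondition predicate, and
# the DUO term identifier are placeholders; real implementations should use terms
# taken directly from the published ontologies (e.g., ICO, DUO, ODRL).
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DCTERMS, RDF

DCAT = Namespace("http://www.w3.org/ns/dcat#")
OBO = Namespace("http://purl.obolibrary.org/obo/")  # DUO terms are published under this base IRI
EX = Namespace("https://example.org/study/")         # hypothetical study namespace

g = Graph()
dataset = URIRef(EX["cohort-A-labs"])

g.add((dataset, RDF.type, DCAT.Dataset))
g.add((dataset, DCTERMS.title, Literal("De-identified cohort A laboratory results")))
# Human-readable statement of access conditions (complements, not replaces, access control):
g.add((dataset, DCTERMS.accessRights,
       Literal("Available to approved researchers under a data use agreement")))
# Machine-readable data-use condition; DUO_0000042 is used here only as a placeholder ID.
g.add((dataset, EX.dataUseCondition, OBO["DUO_0000042"]))

print(g.serialize(format="turtle"))
</syntaxhighlight>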


====5.3.6 Perform a gap analysis====
Also of note here are the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) and its OHDSI standardized vocabularies. In all these cases, a developer-driven approach to research software that incorporates community-driven standards that support FAIR principles is welcome. However, as Maxwell ''et al.'' noted in their ''Lancet'' review article in late 2023, "few platforms or registries applied community-developed standards for participant-level data, further restricting the interoperability of ... data-sharing initiatives [like FAIR]."<ref name="MaxwellFAIREthic23" /> As the FAIR principles continue to gain ground in clinical research and diagnostics settings, software developers will need to be more attuned to translating old ways of development to ones that incorporate FAIR data and software principles. Demand for FAIR data will only continue to grow, and any efforts to improve interoperability and reusability while honoring (and enhancing) privacy and security aspects of restricted data will be appreciated by clinical researchers. However, just as FAIR is not an overall goal for researchers, software built with FAIR principles in mind is not the end point for research organizations managing restricted and privacy-protected research objects. Ultimately, those organizations will have to make other considerations about restricted data in the scope of FAIR, including addressing data management plans, data use agreements, disclosure review practices, and training as it applies to their research software and generated research objects.<ref>{{Cite journal |last=Jang |first=Joy Bohyun |last2=Pienta |first2=Amy |last3=Levenstein |first3=Margaret |last4=Saul |first4=Joe |date=2023-12-06 |title=Restricted data management: the current practice and the future |url=https://journalprivacyconfidentiality.org/index.php/jpc/article/view/844 |journal=Journal of Privacy and Confidentiality |volume=13 |issue=2 |doi=10.29012/jpc.844 |issn=2575-8527 |pmc=PMC10956935 |pmid=38515607}}</ref>
A gap analysis is different from a risk analysis in that the gap analysis represents a high-level, narrowly focused comparison of the technical, physical, and administrative safeguards in place with how well they actually perform against a cyber attack. As such, the gap analysis can be thought of as an introduction to potential vulnerabilities in a system, which is part of an overall risk analysis.<ref name="NortonSimilar18">{{cite web |url=https://intraprisehealth.com/similar-but-different-gap-assessment-vs-risk-assessment/ |title=Similar but Different: Gap Assessment vs Risk Analysis |author=Norton, K. |publisher=HIPAA One |date=21 June 2018 |accessdate=10 March 2023}}</ref> The gap analysis asks what your cyber capabilities are, what the major threats are, and what the differences are between the two. Additionally, you may want to consider what the potential impacts would be if a threat were realized.<ref name="NARUCCyber18">{{cite web |url=https://pubs.naruc.org/pub/8C1D5CDD-A2C8-DA11-6DF8-FCC89B5A3204 |format=PDF |title=Cybersecurity Strategy Development Guide |author=Cadmus Group, LLC |publisher=National Association of Regulatory Utility Commissioners |date=30 October 2018 |accessdate=10 March 2023}}</ref>


The gap analysis can also be looked at as a measure of the current safeguards in place vs. what industry best practice controls dictate. This may be done by choosing an industry-standard security framework—we're using the NIST SP 800-53, Rev. 5 framework for this guide—and evaluating key stakeholder policies, responsibilities, and processes against that framework.<ref name="SellHowTo15">{{cite web |url=https://www.cio.com/article/251153/how-to-conduct-an-information-security-gap-analysis.html |title=How To Conduct An Information Security Gap Analysis |author=Sell, C. |work=CIO |publisher=IDG Communications, Inc |date=28 January 2015 |accessdate=10 March 2023}}</ref>
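As a simplified illustration of that framework-oriented comparison, the following Python sketch checks an assumed set of implemented safeguards against a small target list of controls. The control identifiers follow NIST SP 800-53 naming, but the target list and implementation statuses are illustrative, not a recommended baseline.

<syntaxhighlight lang="python">
# Minimal sketch of a control-oriented gap check: comparing safeguards currently in
# place against a small target list of framework controls. Control IDs follow NIST
# SP 800-53 naming, but the target list and statuses are illustrative, not a
# recommended baseline.
target_controls = {
    "AC-2": "Account management",
    "AC-17": "Remote access",
    "AU-2": "Event logging",
    "IR-1": "Incident response policy and procedures",
    "CP-9": "System backup",
}

implemented = {"AC-2", "AU-2"}  # assumed current state from the asset and policy review

gaps = {cid: name for cid, name in target_controls.items() if cid not in implemented}
for cid, name in sorted(gaps.items()):
    print(f"GAP: {cid} - {name}")
</syntaxhighlight>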
==Conclusion==
Laboratory informatics developers will also need to remember that FAIRification of research in itself is not a goal for research laboratories; it is a continual process that recognizes improved scientific research and greater innovation as a more likely outcome.<ref name="WilkinsonTheFAIR16" /><ref name="OlsenEmbracing23" /><ref name="HuertaFAIRForAI23" />


====5.3.7 Perform a risk analysis and prioritize risk based on threat, vulnerability, likelihood, and impact====
==References==
With cybersecurity goals, asset inventory, and gap analysis in hand, it's time to go comprehensive with [[risk assessment]] and prioritization. Regardless of whether or not you're hosting and transmitting PHI or other types of sensitive information, you'll want to look at all your cybersecurity goals, systems, and applications as part of the risk analysis (a brief prioritization sketch follows the list below).<ref name="DowningAHIMA17" /> Functions of risk analysis include, but are not limited to<ref name="LebanidzeGuide11" /><ref name="NortonSimilar18" /><ref name="TalamantesDoesYour17">{{cite web |url=https://www.redteamsecure.com/blog/does-your-cybersecurity-plan-need-an-update |title=Does Your Cybersecurity Plan Need an Update? |author=Talamantes, J. |work=RedTeam Knowledge Base |publisher=RedTeam Security Corporation |date=06 September 2017 |accessdate=10 March 2023}}</ref>:
{{Reflist|colwidth=30em}}
 
<!---Place all category tags here-->
* considering the operations supporting business goals and how those operations use technology to achieve them;
* considering the various ways the system functionality and entry points could be abused and compromised (threat modeling);
* comparing the current system's or component's architecture and features to various threat models; and
* compiling the risks identified during threat modeling and architecture analysis and prioritizing them based on threat, vulnerability, likelihood, and impact.
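The prioritization step can be as simple or as sophisticated as the organization needs. As a minimal sketch, the following Python snippet scores hypothetical risks by likelihood and impact on a one-to-five scale and sorts them; the example entries, scales, and multiplicative scoring are illustrative assumptions rather than a recommended risk model.

<syntaxhighlight lang="python">
# Minimal sketch of prioritizing identified risks by a simple likelihood x impact
# score on a 1-to-5 scale. The example entries, scales, and multiplicative scoring
# are illustrative assumptions rather than a recommended risk model.
risks = [
    {"risk": "Phishing leads to credential theft",      "likelihood": 4, "impact": 4},
    {"risk": "Ransomware on laboratory file server",    "likelihood": 3, "impact": 5},
    {"risk": "Misconfigured firewall exposes database", "likelihood": 2, "impact": 5},
    {"risk": "Lost unencrypted USB drive",              "likelihood": 3, "impact": 3},
]

for r in risks:
    r["score"] = r["likelihood"] * r["impact"]

# Highest-scoring risks float to the top of the remediation queue.
for r in sorted(risks, key=lambda entry: entry["score"], reverse=True):
    print(f'{r["score"]:>2}  {r["risk"]}')
</syntaxhighlight>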
 
Additionally, as part of this process, you'll want to examine the human element of risk in your business. How thorough are your background checks of new employees and third parties accessing your systems? How easy is it for them to access the software and the hardware? Is the principle of "least privilege" being used appropriately? Have any employee loyalties shifted drastically lately? Are the vendors supplying your IT and data services thoroughly vetted? These and other questions can supplement the human-based aspect of cybersecurity risk assessment.
 
====5.3.8 Declare and describe objectives based on the outcomes of the above assessments====
After performing all that research, it's finally time to distill it down into a clear set of cybersecurity objectives. Those objectives will act as the underlying core for the actions the business will take to develop policies, fill in security gaps, monitor progress, and educate staff. The objectives that come out of this step "should be specific, realistic, and actionable."<ref name="NARUCCyber18" /> In a world where cyber threats are constantly evolving, being "100 percent secure against cyber threats" is an unrealistic goal, for example.<ref name="NARUCCyber18" />
 
One way to go about this process is to go back to the cybersecurity strategy you created (see 5.1.3) and place your objectives under the strategic goals they support. Perhaps one of your goals is to promote a culture of cybersecurity awareness among all employees and contractors. Under that goal you could list objectives such as "improve subject-matter expertise among leadership" and "support and encourage biannual cybersecurity training exercises throughout the business." At this point, you may also want to note how the objectives are prioritized and what progress measurement mechanisms can and should be put in place. Finally, work a certain level of adaptability into the objectives, not only in what they should achieve but also in how they should be evaluated and updated. Technology, attack vectors, and even business needs can change rapidly. The objectives you develop should take this into consideration, as should the review and update policy for the cybersecurity plan itself (discussed later).
 
It's important to note that at this point of identifying cybersecurity requirements and objectives, and into the subsequent two steps of identifying policies and selecting and refining controls, the concept of risk management should be front and center. In a 2016 letter to the U.S. Commission on Enhancing National Cybersecurity, computing experts Lipner and Lampson emphasized the difficulty of cybersecurity risk management<ref name="LipnerRisk16">{{cite web |url=https://www.nist.gov/system/files/documents/2016/09/16/s.lipner-b.lampson_rfi_response.pdf |format=PDF |title=Risk Management and the Cybersecurity of the U.S. Government - Input to the Commission on Enhancing National Cybersecurity |author=Lipner, S.B.; Lampson, B.W. |date=September 2016 |accessdate=10 March 2023}}</ref>:
 
<blockquote>"It is impossible to measure precisely either the amount of security a given investment buys or the expected consequences of less than perfect security. Thus, decision makers must make security investments whose benefits are very uncertain. This makes it tempting to spend less on security and more on new programs or other alternatives with more visible benefits."</blockquote>
 
Despite these difficulties, the process of translating gap analysis and risk analysis into actionable and realistic objectives must be done. But you're close to reaching this point, or may have already made the inroads to get there. You've inventoried and examined the hardware and network settings you already use (5.3.1). You've examined your existing (and possibly even future) data and declared its criticality (5.3.2) in the context of the regulations, standards, and best practices affecting your assets and data (5.3.4). And you've reviewed your existing policies, looking for areas of improvement (5.3.3). You know where you are and where you want to go. Now the risk management components arrive in the form of objectives (5.3.8) that align with your overall cybersecurity strategy (5.1.3), the tangential cybersecurity policies requiring creation and modification (5.3.9), and the security controls chosen to support those objectives and policies (5.3.10). If you realize this full approach, cybersecurity risk management should come naturally, by extension.
 
====5.3.9 Identify policies for creation or modification concerning passwords, physical security, etc., particularly where gaps have been identified from the prior assessments and objectives====
Say you previously noted your business has a few password and other access management policies in place, but they are relatively weak compared to where you want to be. If you haven't already, now is a good time to start tracking those and other policies that need to be updated or even created from scratch. Note that you may also discover additional policies that should be addressed during the next step of selecting and refining security controls. Consider performing this step in tandem with the next to gain the clearest idea of all the policy updates the business will require to meet its cybersecurity objectives.
 
====5.3.10 Select and refine security controls for identification, protection, detection, response, and recovery based on the assessments, objectives, and policies above====
According to cybersecurity solutions company Tenable, 84 percent of U.S. organizations turn to at least one cybersecurity framework in their organization, and 44 percent work with more than one.<ref name="WatsonTopFour19">{{cite web |url=https://www.itgovernanceusa.com/blog/top-4-cybersecurity-frameworks |title=Top 4 cybersecurity frameworks |author=Watson, M. |work=IT Governance USA Blog |publisher=GRC International Group plc |date=17 January 2019 |accessdate=10 March 2023}}</ref> This is done in part to comply with a mix of regulations affecting organizations today, as well as to provide a baseline of security policies and protocols, even for the smallest of organizations. In that regard, it makes sense to heavily involve cybersecurity frameworks and controls in the development of your cybersecurity plan.
 
For the purposes of this guide, the NIST control descriptions found in [https://csrc.nist.gov/publications/detail/sp/800-53/rev-5/final NIST Special Publication 800-53, Revision 5]: ''Security and Privacy Controls for Information Systems and Organizations'' are used. Most of the "Low" baseline controls, as well as select "Moderate" and "High" baseline controls, are highly worthy of consideration for organizations. Additionally, a simplified version of controls was derived from 800-53 in the form of [https://csrc.nist.gov/publications/detail/sp/800-171/rev-2/final NIST Special Publication 800-171, Revision 2]: ''Protecting Controlled Unclassified Information in Nonfederal Systems and Organizations''. One of the benefits of this set of NIST controls is that it maps to both NIST SP 800-53 and the [[ISO/IEC 27000-series|ISO/IEC 27001:2013]] controls. And many of the cybersecurity groups and frameworks listed above have at their base—or are mapped to—those same two groups of controls.
 
You'll probably want to refer to Appendix 1 of this guide. There you'll find the "Low" baseline controls of NIST SP 800-53, as well as select "Moderate" and "High" baseline controls. For basic organizations working with non-federal data, these controls should prove a perfectly useful baseline. The control descriptions have been simplified somewhat for quick reading, and any references or additional recommended reading are also added. Finally, you'll also see mapping to what's known as "[[LII:LIMSpec 2022 R2|LIMSpec]]," an evolving set of software requirements specifications for [[laboratory informatics]] systems. If you're not in the laboratory industry, you may not find that mapping entirely useful; however, LIMSpec still includes many specifications that could apply to a broad array of software systems.
 
Regardless of which frameworks and control groups you consider, you'll want to choose at least one and browse through its controls. Select the controls that map readily to the assessments, objectives, and policies you've already developed. You should even notice that some of the controls map to elements of the cybersecurity plan development steps found in this guide. The NIST control "IR-1 Incident response policy and procedures," for example, ties into step 5.8 of this guide, discussed later.
 
 
 
===5.6 Determine resource needs===
[[File:Figure 1- Cybersecurity Funding at IRS, Fiscal Years 2014 Estimated, 2015 Actual, 2016 Enacted, and 2017 Requested (Dollars in Millions) (28979530692).jpg|right|500px]]
====5.6.1 Determine whether sufficient in-house subject-matter expertise exists, and if not, how it will be acquired====
Businesses come in many sizes, and not all have the in-house expertise to take the deep dive into cybersecurity. To be fair, the size of a business isn't the only determiner of IT resources. Hiring practices and hosting decisions for both software and IT (e.g., [[software as a service]] [SaaS] and [[infrastructure as a service]] [IaaS] vs. local hosting) may also impact the level of cybersecurity expertise in the business. Regardless, it's doubtless imperative to have some type of expertise involved in assisting with the implementation of your organization's cybersecurity plan. You have probably already addressed this during parts two and three of making the cybersecurity plan, but now is an excellent time to double-check that, aside from any short-term expertise you're tapping into while formulating your plan, you have long-term support for the implementation and monitoring of the plan's components.
 
====5.6.2 Estimate time commitments and resource allocation towards training exercises, professional assistance, infrastructure, asset management, and recovery and continuity====
The realities of business dictate that time is indeed valuable.<ref name="CakmakTime19">{{cite web |url=https://techonomy.com/time-money-money-time-means-tech/ |title=Time is Money, Money is Time, and What That Means for Tech |author=Cakmak, J. |work=Techonomy |publisher=Techonomy Media, Inc |date=11 January 2019 |accessdate=10 March 2023}}</ref> For a business to meet its primary goals, an investment of time and resources is required by those involved in the business. For a clinical laboratory, that means laboratorians performing analyses, making [[quality control]] checks, managing test results and reporting, and more. How much time do they truly need to commit in any given week to developing cybersecurity skills? And beyond the individual level, how much time does the business as a whole want to commit? With a need for training, infrastructure management, policy development and management, and recovery and continuity activities, your business has a lot to consider. These and other questions must be asked in relation to the realistic amount of resources available to the business and its personnel.
 
Here are a few additional questions to ask, as suggested by NARUC<ref name="NARUCCyber18">{{cite web |url=https://pubs.naruc.org/pub/8C1D5CDD-A2C8-DA11-6DF8-FCC89B5A3204 |format=PDF |title=Cybersecurity Strategy Development Guide |author=Cadmus Group, LLC |publisher=National Association of Regulatory Utility Commissioners |date=30 October 2018 |accessdate=10 March 2023}}</ref>:
 
* "What level of staff time should [a business] dedicate to learning about cybersecurity and developing skills necessary to achieve stated goals?"
* "Do staff need to become subject-matter experts, or is it enough that they are familiar with the language and terms?"
* "Do any staff need one-time training, ongoing training, certifications, or security clearances?"
* "Does the [business] have enough personnel to build and maintain relationships with [cybersecurity stakeholders]?"
 
====5.6.3 Review the budget====
Of course, the realities of business also dictate that money is a key component of business operations. That means budgeting that all-important resource. What share of the overall budget will cybersecurity take, as proposed vs. what can realistically be allotted? This is where the previously conducted gap assessment and risk assessment come into play again. You ended up identifying critical gaps in your current infrastructure and prioritizing cyber risks based on threat, vulnerability, likelihood, and impact. Those assessments guided your goals and objectives. Does your budget align with those goals and objectives? If not, what concessions must be made? If you're a small retail shop, antivirus software and firewalls may be enough. And as editor Cristina Lago notes in her 2019 article for ''CIO'': "Be realistic about what you can afford. After all, you don’t need a huge budget to have a successful security plan. Invest in knowledge and skills."<ref name="LagoHowTo19">{{cite web |url=https://www.cio.com/article/222076/how-to-implement-a-successful-security-plan.html |title=How to implement a successful cybersecurity plan |author=Lago, C. |work=CIO |publisher=IDG Communications, Inc |date=10 July 2019 |accessdate=10 March 2023}}</ref>
 
===5.7 Develop a communications plan===
[[File:Cybersecurity and the nation's digital future.jpg|right|300px]]
====5.7.1 Address the need for transparency in improving the cybersecurity culture====
<blockquote>"If you look at it historically, the best ways to handle [cybersecurity] incidents is the more transparent you are the more you are able to maintain a level of trust. Obviously, every time there’s an incident, trust in your organization goes down. But the most transparent and communicative organizations tend to reduce the financial impact of that incident.” - McAfee CTO Ian Yip<ref name="LagoHowTo19">{{cite web |url=https://www.cio.com/article/222076/how-to-implement-a-successful-security-plan.html |title=How to implement a successful cybersecurity plan |author=Lago, C. |work=CIO |publisher=IDG Communications, Inc |date=10 July 2019 |accessdate=10 March 2023}}</ref></blockquote>
 
When your organization spreads the idea of improving cybersecurity and the culture around it, it shouldn't forget to talk about the importance of transparency. That includes the development process for the cybersecurity plan itself. Stakeholders will appreciate a forthright plan development and implementation strategy that clearly and concisely addresses the critical information system protections, monitoring, and communication that should be enacted.<ref name="NARUCCyber18">{{cite web |url=https://pubs.naruc.org/pub/8C1D5CDD-A2C8-DA11-6DF8-FCC89B5A3204 |format=PDF |title=Cybersecurity Strategy Development Guide |author=Cadmus Group, LLC |publisher=National Association of Regulatory Utility Commissioners |date=30 October 2018 |accessdate=10 March 2023}}</ref><ref name="LagoHowTo19" /> Not only should internal communication about plan status be clear and regular, but greater openness should also be placed on promptly informing affected individuals of cybersecurity risks and incidents. Of course, trust can be indirectly built up in other ways, such as ensuring training material is relevant and understandable, improving user management in critical systems, and ensuring communication barriers between people are limited.
 
====5.7.2 Determine guidelines for everyday communication and mandatory reporting to meet cybersecurity goals====
Sure, your IT specialists and system administrators know and understand the language of cybersecurity, but do the rest of your staff know and understand the topic enough to meet various cybersecurity business goals? One aspect of solving this issue involves ensuring clear, consistent communication and understanding across all levels of the organization. (Another aspect, of course, is training, discussed below.) If everyone is speaking the same language, planning and implementation for cybersecurity becomes more effective.<ref name="NARUCCyber18" /> This extends to everyday communications and reporting. Tips include:
 
* Clearly and politely communicate what consequences exist for those who violate cybersecurity policy, better ensuring compliance.<ref name="LebanidzeGuide11">{{cite web |url=https://www.cooperative.com/programs-services/bts/documents/guide-cybersecurity-mitigation-plan.pdf |format=PDF |title=Guide to Developing a Cyber Security and Risk Mitigation Plan |author=Lebanidze, E. |publisher=National Rural Electric Cooperative Association, Cooperative Research Network |date=2011 |accessdate=10 March 2023}}</ref><ref name="CopelandHowToDev18">{{cite web |url=https://www.copelanddata.com/blog/how-to-develop-a-cybersecurity-plan/ |title=How to Develop A Cybersecurity Plan For Your Company (checklist included) |publisher=Copeland Technology Solutions |date=17 July 2018 |accessdate=10 March 2023}}</ref>
* Consider developing and using communication and reporting templates for a variety of everyday emails, letters, and reports.<ref name="NARUCCyber18" />
* Don't forget to communicate organizational privacy policies and other security policies to third parties such as vendors and contractors.
* Don't forget to communicate changes of cybersecurity policy to all affected.
* Be flexible with the various routes of communication you can use; not everyone is diligent with email, for example.
 
====5.7.3 Determine guidelines for handling or discussing sensitive information====
Safely and correctly working with sensitive, protected, or confidential data in the organization is no simple task, requiring extra precautions, attention to regulations, and improved awareness throughout the workflow. In the clinical realm, organizations have PHI to worry about, while [[forensic laboratories]] must be mindful of working with classified data. Most businesses keep some sort of financial transaction data, and even the smallest of businesses may be working with trade secrets. These and other types of data require special attention by those creating a cybersecurity plan. Important considerations include staying informed of changes to local, state, and federal law; being vigilant with any role-based access to sensitive data; developing and enforcing clear policy on documenting and disposing of cyber assets containing such data; and developing boundary protection mechanisms for confining sensitive communications to trusted zones.<ref name="LebanidzeGuide11" /> Cybersecurity standards and frameworks provide additional guidance in this realm.
 
====5.7.4 Address incident reporting and response, as well as corrective action====
As discussed earlier, fostering an environment of transparency regarding cybersecurity matters is beneficial to the business. By extension, this includes properly disseminating notice of cybersecurity risks, breaches, and associated responses. Steve McGaw, the chief marketing officer for AT&T Business Solutions, had this to say about it in 2017<ref name="McGawBreaching17">{{cite journal |url=https://apps.prsa.org/Intelligence/TheStrategist/Articles/view/11873/1152/Breaching_the_Secret_to_Cybersecurity_Communicatio |archiveurl=https://web.archive.org/web/20220815122956/https://apps.prsa.org/Intelligence/TheStrategist/Articles/view/11873/1152/Breaching_the_Secret_to_Cybersecurity_Communicatio |title=Breaching the secret to cybersecurity communications |author=McGaw, S. |journal=The Public Relations Strategist |issue=Spring 2017 |year=2017 |archivedate=15 August 2022 |accessdate=10 March 2023}}</ref>:
 
<blockquote>When a breach is revealed, the attacked company is portrayed not as a victim, but as negligent and, in a subtle way, complicit in the event that ultimately exposed partners and customers. In short, it’s clearer than ever that cyberattacks can have an existential impact on companies. If customers don’t trust a company, then they simply won’t do business with them. These types of brand implications are indelible, and a communication strategy is invaluable.</blockquote>
 
This is where you decide how to communicate cybersecurity incidents and respond to them. McGaw and others offer the following advice in that regard<ref name="NARUCCyber18" /><ref name="LagoHowTo19" /><ref name="McGawBreaching17" /><ref name="HamburgAlign18">{{cite book |chapter=Chapter 4: Aligning a Cybersecurity Strategy with Communication Management in Organizations |title=Digital Communication Management |author=Hamburg, I.; Grosch, K.R |editor=Peña-Acuña, B. |publisher=IntechOpen |year=2018 |isbn=9781838814908 |doi=10.5772/intechopen.75952}}</ref>:
 
* Organize an incident response team of IT professionals, writers, leaders, and legal advisers and together develop protocols for how revelation of a cybersecurity incident should be handled, from the start.
* Ensure that, upon an identified breach, the issue and its likely impact are clearly understood before communicating them to stakeholders. Communicating a hastily written, vague message creates more problems than solutions.
* Provide messaging on the solution (corrective action), not just the problem. Sometimes the solution is complex and difficult, but it's still beneficial to at least let stakeholders know action is being taken to correct the issue and limit its impact.
* Consider the use of playbooks, report templates, and training drills as part of your communication plan. Practice resolving security incidents with your assembled incident response team, and seek outside help when needed.
* When crafting your message, avoid jargon, use clear and simple language, be transparent (avoid "may" and "might"; be up-front), and keep your business values in context with the message.
* Don't forget to extend transparent messaging to internal stakeholders.
 
====5.7.5 Address cybersecurity training methodology, requirements, and status tracking====
While the topic of cybersecurity training could arguably receive its own section, training and communication planning go hand-in-hand. What is training but another form of imparting (communicating) information to others to act upon? And getting the word out about the cybersecurity plan and the culture it wants to promote is just another impetus for providing training to the relevant stakeholders.
 
The training methodology, requirements, and tracking used will largely be shaped by the goals and objectives detailed prior, as well as the budget allotted by management. For example, businesses with ample budget may be able to add new software firewalls and custom firmware updates to their system; however, small businesses with limited resources may get more out of training users on proper cyber hygiene than investing heavily in IT.<ref name="NARUCCyber18" /> Regardless, addressing training in the workplace remains a critical aspect of your cybersecurity plan. As the NRECA notes<ref name="LebanidzeGuide11" />: "Insufficiently trained personnel are often the weakest security link in the organization’s security perimeter and are the target of social engineering attacks. It is therefore crucial to provide adequate security awareness training to all new hires, as well as refresher training to current employees on a yearly basis."
 
You'll find additional guidance on training recommendations and requirements by looking at existing regulations. Various NIST cybersecurity framework publications such as [https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-53r5.pdf 800-53], [https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-171r2.pdf 800-171], and the [https://doi.org/10.6028/NIST.CSWP.04162018 NIST Cybersecurity Framework] (PDFs) may also provide insight into training.
 
===5.8 Develop a response and continuity plan===
[[File:Micro Data Center.jpg|right|300px]]
====5.8.1 Consider linking a cybersecurity incident response plan and communication tools with a business continuity plan and its communication tools====
In the previous section, we discussed transparently and effectively communicating the details of a cybersecurity incident, as part of a communications plan. As it turns out, those communications also play a role in developing a recovery and continuity plan, which in turn helps limit the effects of a cyber incident. However, some planners end up confusing terminology, using "incident response" in place of either "business continuity" or "disaster recovery." While unfortunate, this gives you an opportunity to address both.
 
A cybersecurity incident response plan is a plan that focuses on the processes and procedures of managing the consequences of a particular cyber attack or other such incident. Traditionally, this plan has been the responsibility of the IT department and less the overall business. On the other hand, a business continuity plan is a plan that focuses on the processes and procedures of managing the consequences of any major disruption to business operations across the entire organization. A disaster recovery plan is one component of the business continuity plan that specifically addresses restoring IT infrastructure and operations after the major disruption. The business continuity plan looks at natural disasters like floods, fires and earthquakes, as well as other events, and it's usually developed with the help of management or senior leadership.<ref name="KrasnowCyber17">{{cite web |url=https://www.irmi.com/articles/expert-commentary/cyber-security-event-recovery-plans |title=Cyber-Security Event Recovery Plans |author=Krasnow, M.J. |publisher=International Risk Management Institute, Inc |date=February 2017 |accessdate=10 March 2023}}</ref><ref name="LindrosHowTo17">{{cite web |url=https://www.cio.com/article/288554/best-practices-how-to-create-an-effective-business-continuity-plan.html |title=How to create an effective business continuity plan |author=Lindros, K.; Tittel, E. |work=CIO |publisher=IDG Communications, Inc |date=18 July 2017 |accessdate=10 March 2023}}</ref>
 
All of these plans have utility, but consider linking your cybersecurity incident response plan with your new or existing business continuity plan. You may garner several benefits from doing so. In fact, some experts already view cyber incident response "as part of a larger business continuity plan, which may include other plans and procedures for ensuring minimal impact to business functions."<ref name="KrasnowCyber17" /><ref name="LindrosHowTo17" /><ref name="EwingFourWays17">{{cite web |url=https://deltarisk.com/blog/4-ways-to-integrate-your-cyber-security-incident-response-and-business-continuity-plans/ |title=4 Ways to Integrate Your Cyber Security Incident Response and Business Continuity Plans |author=Ewing, S. |publisher=Delta Risk |date=12 July 2017 |accessdate=10 March 2023}}</ref> Stephanie Ewing of Delta Risk offers four tips in integrating cybersecurity incident recovery with business continuity. First, she suggests using a similar process approach to creating and reviewing your plans, including establishing an organizational hierarchy of the plans for improved understanding of how they work together. Second, Ewing notes that both plans speak in terms of incident classifications, response thresholds, and affected technologies, adding that it would be advantageous to share those linkages for consistency and improved collaboration. Similarly, linking the experience of operations in developing training exercises and drills with the technological expertise of IT creates a logical match in efforts to test both plans. Finally, Ewing examines the tendency of operations teams to use different communications tools and language from IT, creating additional problems. She suggests removing the walls and silos and establishing a common communication between the two planning groups to ensure greater cohesion across the enterprise.<ref name="EwingFourWays17" />
 
For the specifics of what should be contained in your recovery and continuity planning, you may want to turn to reference works such as ''[https://books.google.com/books?id=DXhvDwAAQBAJ&printsec=frontcover Cybersecurity Incident Response]'', as well as existing incident response plans (e.g., [https://web.archive.org/web/20210320130805/https://www.it.miami.edu/_assets/pdf/security/cyber-security-incident-response-guide.pdf University of Miami]) and [https://www.irmi.com/articles/expert-commentary/cyber-security-event-recovery-plans expert advice].
 
====5.8.2 Include a listing of organizational resources and their criticality, a set of formal recovery processes, security and dependency maps, a list of responsible personnel, a (previously mentioned) communication plan, and information sharing criteria====
A lot of this material has already been developed as part of your overall cybersecurity plan, but it is all relevant to developing incident response plans. Having the list of technological components and their defined criticality will help you create the organizational hierarchy of the various aspects of your incident response and business continuity plans. Having the formal recovery processes in place beforehand allows your organization to develop training exercises around them, increasing preparedness. Application dependency mapping allows you to "understand risk, model policy, create mitigation strategies, set up compensating controls, and verify that those policies, strategies, and controls are working as you intend to mitigate risk."<ref name="KirnerTime17">{{cite web |url=https://www.illumio.com/blog/security-evolution-application-mapping |archiveurl=https://web.archive.org/web/20191204160526/https://www.illumio.com/blog/security-evolution-application-mapping |title=You need a map to evolve security |work=Time for a {r}evolution in data center and cloud security |author=Kirner, P.J. |publisher=Illumio |date=09 August 2017 |archivedate=04 December 2019 |accessdate=10 March 2023}}</ref> Knowing who's in charge of what aspect of recovery ensures a more rapid approach. And having a communication and information sharing strategy in place helps to limit rumors and transparently relate what happened, what's being done, and what the future looks like after the cyber incident.
 
===5.9 Establish how the overall cybersecurity plan will be implemented===
[[File:Cybersecurity.png|right|300px]]
====5.9.1 Detail the specific steps regarding how all the above will be implemented====
Weeks, months, perhaps even years of planning have led you to this point: how do we go about implementing the details of our cybersecurity plan? It may seem a daunting process, but this is where management expertise comes in handy. A formal project manager should be taking the reins of the implementation, as that person preferably has experience initializing change processes, evaluating milestones as realistic or flawed, implementing ad hoc revisions to the plan, and finalizing the processes and procedures for reporting and evaluating the implementation.<ref name="NARUCCyber18">{{cite web |url=https://pubs.naruc.org/pub/8C1D5CDD-A2C8-DA11-6DF8-FCC89B5A3204 |format=PDF |title=Cybersecurity Strategy Development Guide |author=Cadmus Group, LLC |publisher=National Association of Regulatory Utility Commissioners |date=30 October 2018 |accessdate=10 March 2023}}</ref> The manager also has the benefit of being able to ensure the implementation will stay true to the proposed budget and make the necessary adjustments along the way.<ref name="LebanidzeGuide11">{{cite web |url=https://www.cooperative.com/programs-services/bts/documents/guide-cybersecurity-mitigation-plan.pdf |format=PDF |title=Guide to Developing a Cyber Security and Risk Mitigation Plan |author=Lebanidze, E. |publisher=National Rural Electric Cooperative Association, Cooperative Research Network |date=2011 |accessdate=10 March 2023}}</ref>
 
====5.9.2 State the major implementation milestones====
In Martinelli and Milosevic's ''Project Management ToolBox: Tools and Techniques for the Practicing Project Manager'', milestones and milestone charts are discussed as integral project management tools. They define a milestone as "a point in time or event whose importance lies in it being the climax point for many converging activities."<ref name="MarinelliProject16">{{cite book |url=https://books.google.com/books?id=SbA7CwAAQBAJ&pg=PA150 |title=Project Management ToolBox: Tools and Techniques for the Practicing Project Manager |author=Martinelli, R.J.; Milosevic, D.Z. |publisher=John Wiley & Sons |year=2016 |pages=150–54 |isbn=9781118973202}}</ref> They go on to give examples of milestones, including deliverables, project phase transitions, extensive reviews, and external events. Deciding what the key milestones of plan implementation will be is up to the project manager, who will likely consider traditional milestones or focus on the major synchronization and decision points along the entire process. This includes studying the dependencies in the various implementation steps and anticipating how they will converge, ensuring also that the milestones are adequately spaced and have received team input.<ref name="MarinelliProject16" />
 
====5.9.3 Determine how best to communicate progress on the plan’s implementation====
The project manager will also likely oversee dissemination of communications related to plan implementation. Without a doubt, internal stakeholders will want to be kept aware of the implementation status of the cybersecurity plan. When should IT go live with the improved firewall installation? Are the new password requirements going into effect later than expected? Has the training literature you handed out last week been updated to reflect the critical changes your staff had to make over the weekend? Keeping everyone in the loop will help build trust in the attempt to build cybersecurity culture into the workplace. This also means concise and comprehensible documentation is being made available and is updated as changes in implementation take place. This is all in addition to deciding how to best communicate implementation progress (e.g., reports, emails, meetings, project website).

[Figure: FAIR resources graphic, Australian Research Data Commons, 2018]

Title: What are the potential implications of the FAIR data principles to laboratory informatics applications?

Author for citation: Shawn E. Douglas

License for content: Creative Commons Attribution-ShareAlike 4.0 International

Publication date: May 2024

Introduction

This brief topical article will examine the potential implications of the FAIR data principles for laboratory informatics applications and the developers behind them.

The "FAIR-ification" of research objects and software

First discussed during a 2014 FORCE-11 workshop dedicated to "overcoming data discovery and reuse obstacles," the FAIR data principles were published by Wilkinson et al. in 2016, the result of a stakeholder collaboration driven to see research "objects" (i.e., research data and information of all shapes and formats) become more universally findable, accessible, interoperable, and reusable (FAIR) by both machines and people.[1] The authors released the FAIR principles while recognizing that "one of the grand challenges of data-intensive science ... is to improve knowledge discovery through assisting both humans and their computational agents in the discovery of, access to, and integration and analysis of task-appropriate scientific data and other scholarly digital objects."[1]

Since 2016, other research stakeholders have published their own thoughts about how the FAIR principles apply to their fields of study and practice[2], including in ways beyond what was perhaps originally imagined by Wilkinson et al. For example, multiple authors have examined whether the software used in scientific endeavors can itself be considered a research object worth being developed and managed in tandem with the FAIR data principles.[3][4][5][6][7] Researchers quickly recognized that any planning around updating processes and systems to make research objects more FAIR would have to be tailored to specific research contexts, recognize that digital research objects go beyond data and information, and recognize "the specific nature of software" rather than treating it as "just data."[4] The end result has been applying the core concepts of FAIR to research software, but with the recognition that software is more than just data, requiring more nuance and a different type of planning than applying FAIR to digital data and information.

A 2019 survey by Europe's FAIRsFAIR initiative found that researchers seeking and re-using relevant research software on the internet faced multiple challenges, including understanding and/or maintaining the necessary software environment and its dependencies, finding sufficient documentation, struggling with accessibility and licensing issues, having the time and skills to install and/or use the software, finding quality control of the source code lacking, and having an insufficient (or non-existent) software sustainability and management plan.[4] These challenges highlight the importance of software to researchers and other stakeholders, and the role FAIR has in better ensuring such software is findable, accessible, interoperable, and reusable, which in turn better ensures researchers' software-driven research is repeatable (by the same research team, with the same experimental setup), reproducible (by a different research team, with the same experimental setup), and replicable (by a different research team, with a different experimental setup).[4]

At this point, the topic of what "research software" represents must be addressed further, and, unsurprisingly, it's not straightforward. Ask 20 researchers what "research software" is, and you may get 20 different opinions. Some definitions can be objectively viewed as too narrow, while others may be viewed as too broad, with some level of controversy inherent in any group discussion of the topic.[8][9][10] In 2021, as part of the FAIRsFAIR initiative, Gruenpeter et al. made a good-faith effort to define "research software" with the feedback of multiple stakeholders. Their efforts resulted in this definition[8]:

Research software includes source code files, algorithms, scripts, computational workflows, and executables that were created during the research process, or for a research purpose. Software components (e.g., operating systems, libraries, dependencies, packages, scripts, etc.) that are used for research but were not created during, or with a clear research intent, should be considered "software [used] in research" and not research software. This differentiation may vary between disciplines. The minimal requirement for achieving computational reproducibility is that all the computational components (i.e., research software, software used in research, documentation, and hardware) used during the research are identified, described, and made accessible to the extent that is possible.

Note that while the definition primarily recognizes software created during the research process, software created "for a research purpose" outside the actual research process (whether by the research group, open-source developers outside the organization, or even commercial software developers) is also recognized as research software. This notably can lead to disagreement about whether a proprietary, commercial spreadsheet or laboratory information management system (LIMS) offering that conducts analyses and visualizations of research data can genuinely be called research software, or whether it is simply software used in research. In 2023, van Nieuwpoort and Katz elaborated on this concept, at least indirectly, by formally defining the roles of research software. Their list of roles, which avoids terms such as "open-source," "commercial," and "proprietary," essentially serves as a further definition of what research software is[10]:

  • Research software is a component of our instruments.
  • Research software is the instrument.
  • Research software analyzes research data.
  • Research software presents research results.
  • Research software assembles or integrates existing components into a working whole.
  • Research software is infrastructure or an underlying tool.
  • Research software facilitates distinctively research-oriented collaboration.

When considering these definitions[8][10] of research software and their adoption by other entities[11], it would appear that at least some laboratory informatics software, whether open-source or commercially proprietary, fills these roles in academic, military, and industry research laboratories of many types. In particular, electronic laboratory notebooks (ELNs), like the open-source Jupyter Notebook or proprietary ELNs from commercial software developers, fill the role of analyzing and visualizing research data, including developing molecular models for promising new research routes.[10] More advanced LIMS solutions that go beyond simply collating, auditing, securing, and reporting analytical results could also conceivably fall under the umbrella of research software, particularly if many of the analytical, integration, and collaboration tools required in modern research facilities are included in the LIMS.

Ultimately, assuming that some laboratory informatics software can be considered research software and not just "software used in research," it's hard not to arrive at some deeper implications of research organizations' increasing need for FAIR data objects and software, particularly for laboratory informatics software and its developers.

Implications of the FAIR concept to laboratory informatics software

The global FAIR initiative affects, and even benefits, commercial laboratory informatics research software developers as much as it does academic and institutional ones

To be clear, there is undoubtedly a difference between the "homegrown" approach academics and institutions take to developing research software and the more streamlined, experienced approach that commercial software development houses apply to research software. Moynihan of Invenia Technical Computing described the difference in software development approaches in 2020, while discussing the concept of "research software engineering"[12]:

Since the environment and incentives around building academic research software are very different to those of industry, the workflows around the former are, in general, not guided by the same engineering practices that are valued in the latter. That is to say: there is a difference between what is important in writing software for research, and for a user-focused software product. Academic research software prioritizes scientific correctness and flexibility to experiment above all else in pursuit of the researchers’ end product: published papers. Industry software, on the other hand, prioritizes maintainability, robustness, and testing, as the software (generally speaking) is the product. However, the two tracks share many common goals as well, such as catering to “users” [and] emphasizing performance and reproducibility, but most importantly both ventures are collaborative. Arguably then, both sets of principles are needed to write and maintain high-quality research software.

This brings us to our first point: applying small-scale, FAIR-driven academic research software engineering practices to the development of larger, more commercial laboratory informatics software, and conversely applying commercial-scale development practices to small, FAIR-focused academic and institutional research software engineering efforts, has the potential to better support all research laboratories, whether they use independently developed or commercial research software.

The concept of the research software engineer (RSE) began to take full form in 2012, and since then universities and institutions of many types have formally developed their own RSE groups and academic programs.[13][14][15] RSEs range from pure software developers with little knowledge of a given research discipline to scientific researchers just beginning to learn how to develop software for their research projects. Broadly speaking, researchers in the past often cobbled together research software with less of a focus on quality and reproducibility than on getting their research published. Today's push for FAIR data and software by academic journals, institutions, and researchers seeking to collaborate has placed a much greater focus on the concept of "better software, better research."[13][16] Elaborating on that concept, Cohen et al. add that "ultimately, good research software can make the difference between valid, sustainable, reproducible research outputs and short-lived, potentially unreliable or erroneous outputs."[16]

The concept of software quality management (SQM) has traditionally not been lost on professional, commercial software development businesses. Good SQM practices have been less prevalent in homegrown research software development; however, the expanded adoption of FAIR data and FAIR software approaches has shifted the focus to the repeatability, reproducibility, and interoperability of research results and data produced by more sustainable research software. The adoption of FAIR by academic and institutional research labs not only brings commercial SQM and other software development approaches into their workflows but also gives commercial laboratory informatics software developers an opportunity to embrace many aspects of the FAIR approach to laboratory research practices, including lessons learned and development practices from the growing number of RSEs. This doesn't mean commercial developers will suddenly take an open-source approach to their code, and it doesn't mean academic and institutional research labs will give up the benefits of the open-source paradigm as applied to research software.[17] However, as Moynihan noted, both research software development paradigms stand to gain from the shift to more FAIR data and software.[12] Additionally, if commercial laboratory informatics vendors want to continue to competitively market relevant and sustainable research software to research labs, they frankly have little choice but to commit extra resources to learning how the FAIR principles apply to the offerings they tailor to those labs.

The focus on data types and metadata within the scope of FAIR is shifting how laboratory informatics software developers and RSEs make their research software and choose their database approaches

Close to the core of any deep discussion of the FAIR data principles are the concepts of data models, data types, metadata, and persistent identifiers (PIDs). Making research objects more findable, accessible, interoperable, and reusable is no easy task when data types and approaches to metadata assignment (if there even is such an approach) differ widely and are inconsistently applied. Metadata is a means of better storing and characterizing research objects so as to ensure their provenance and reproducibility.[18][19] This means implementing, as early as possible, a FAIR-driven, software-based approach that captures FAIR metadata at the source using flexible, domain-driven ontologies (i.e., controlled vocabularies) and cleans up old research objects that aren't FAIR-ready, while limiting hindrances to research processes as much as possible.[19] That approach must also value the importance of metadata and PIDs. As Weigel et al. note in a discussion on making laboratory data and workflows more machine-findable: "Metadata capture must be highly automated and reliable, both in terms of technical reliability and ensured metadata quality. This requires an approach that may be very different from established procedures."[20] Enter non-relational RDF knowledge graph databases.

This brings us to our second point: given the importance of metadata and PIDs to FAIRifying research objects (and even research software), established, more traditional research software development methods using common relational databases may not be enough, even for commercial laboratory informatics software developers. Non-relational Resource Description Framework (RDF) knowledge graph databases used in FAIR-driven, well-designed laboratory informatics software help make research objects more FAIR for all research labs.

Research objects can take many forms (i.e., data types), making the storage and management of those objects challenging, particularly in research settings with great diversity of data, as with materials research. Some have approached this challenge by combining different database and systems technologies best suited to each data type.[21] However, while query performance and storage footprint improve with this approach, data across the different storage mechanisms typically remain unlinked and non-compliant with FAIR principles. Here, either a full RDF knowledge graph database or a similar integration layer is required to make the research objects more interoperable and reusable, whether they are materials records or specimen data.[21][22]

It is beyond the scope of this Q&A article to discuss RDF knowledge graph databases at length. (For a deeper dive on this topic, see Rocca-Serra et al. and the FAIR Cookbook.[23]) However, know that the primary strength of these databases for FAIRification of research objects is their ability to provide semantic transparency (i.e., provide a framework for better understanding and reusing the greater research object through basic examination of the relationships of its associated metadata and their constituents), making these objects more easily accessible, interoperable, and machine-readable.[21] The resulting knowledge graphs, with their "subject-predicate-object" syntax and PIDs or uniform resource identifiers (URIs) helping to link data, metadata, ontology classes, and more, can be interpreted, searched, and linked by machines, and made human-readable, resulting in better research through the derivation of new knowledge from existing research objects. The end result is a representation of heterogeneous data and metadata that complies with the FAIR guiding principles.[21][22][23][24][25][26] This concept can even be extended to post factum visualizations of the knowledge graph data[25], as well as the FAIR management of computational laboratory workflows.[27]
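To make the "subject-predicate-object" idea more concrete, the following minimal Python sketch uses the rdflib library to describe a hypothetical laboratory dataset as a small RDF knowledge graph. The namespace, sample URI, and property names (e.g., ex:measuredBy) are illustrative assumptions rather than terms from any cited system; only the rdflib calls and the standard Dublin Core and RDF vocabularies are real.

  from rdflib import Graph, Literal, Namespace, URIRef
  from rdflib.namespace import DCTERMS, RDF, RDFS, XSD

  # Hypothetical namespace for a lab's research objects (illustrative only)
  EX = Namespace("https://example.org/lab/")

  g = Graph()
  g.bind("dcterms", DCTERMS)
  g.bind("ex", EX)

  # A research object (e.g., a dataset exported from an ELN or LIMS), identified
  # by a resolvable URI that stands in here for a persistent identifier such as a DOI.
  dataset = URIRef("https://example.org/lab/dataset/0001")

  # Subject-predicate-object statements ("triples") describing the object
  g.add((dataset, RDF.type, EX.Dataset))
  g.add((dataset, DCTERMS.title, Literal("Chromatography run, 2024-05-01")))
  g.add((dataset, DCTERMS.creator, Literal("Example Research Group")))
  g.add((dataset, DCTERMS.created, Literal("2024-05-01", datatype=XSD.date)))
  g.add((dataset, DCTERMS.license, URIRef("https://creativecommons.org/licenses/by/4.0/")))
  g.add((dataset, EX.measuredBy, EX.Instrument_HPLC_07))  # links to another node in the graph
  g.add((EX.Instrument_HPLC_07, RDFS.label, Literal("HPLC instrument #7")))

  # Serialize as Turtle; the same triples could be loaded into an RDF triple store
  # and queried with SPARQL by both machines and people.
  print(g.serialize(format="turtle"))

The design benefit illustrated here is that relationships are explicit and machine-resolvable rather than implicit in relational table joins, which is what makes knowledge graph approaches attractive for FAIRification.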

While rare, some commercial laboratory informatics vendors like Semaphore Solutions have already recognized the potential of RDF knowledge graph databases for FAIR-driven laboratory research, having implemented such structures into their offerings.[24] (The use of knowledge graphs has already been demonstrated in academic research software, such as with the ELN tools developed by RSEs at the University of Rostock and University of Amsterdam.[28]) As noted in the prior point, it is potentially advantageous not only for laboratory informatics vendors to provide, but also for research labs to use, relevant and sustainable research software that has the FAIR principles embedded in its design. Turning to knowledge graph databases is another example of keeping such software relevant and FAIR for research labs.

Applying FAIR-driven metadata schemes to laboratory informatics software development gives data a FAIRer chance at being ready for machine learning and artificial intelligence applications

The third and final point of this Q&A article highlights another positive consequence of engineering laboratory informatics software with FAIR in mind: FAIRified research objects are much closer to being usable for the growing inclusion of machine learning (ML) and artificial intelligence (AI) tools in laboratory informatics platforms and other companion research software. By developing laboratory informatics software with a focus on FAIR-driven metadata and database schemes, research objects become not only more FAIR but also "cleaner" and more machine-ready for advanced analytical uses such as ML and AI.

To be sure, the FAIRness of any structured dataset alone is not enough to make it ready for ML and AI applications. Factors such as classification, completeness, context, correctness, duplicity, integrity, mislabeling, outliers, relevancy, sample size, and timeliness of the research object and its contents are also important to consider.[29][30] When those factors aren't appropriately addressed as part of a FAIRification effort towards AI readiness (as well as part of the development of research software of all types), research data and metadata are more likely to prove inconsistent. As such, searches and analytics using that data and metadata become muddled, and the ultimate ML or AI output will also be muddled (i.e., "garbage in, garbage out"). Whether retroactively updating existing research objects to a more FAIRified state or ensuring research objects (e.g., those originating in an ELN or LIMS) are more FAIR and AI-ready from the start, research software updating or generating those research objects has to address ontologies, data models, data types, identifiers, and more in a thorough yet flexible way.[31]
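As a toy illustration of what such a pre-flight check might look like inside research software, the following Python sketch validates exported records against a minimal set of required metadata fields before they are handed to an ML pipeline. The field names, example identifiers, and the check_ai_readiness function are invented for illustration; a production system would validate against community schemas and ontologies rather than a hard-coded list.

  from dataclasses import dataclass, field
  from typing import Dict, List

  # Hypothetical minimal metadata expected on each exported research object before
  # it is considered ML/AI-ready; real schemas would be richer and ontology-driven.
  REQUIRED_FIELDS = ["identifier", "title", "license", "ontology_term", "created"]

  @dataclass
  class ReadinessReport:
      complete: List[str] = field(default_factory=list)
      incomplete: Dict[str, List[str]] = field(default_factory=dict)

  def check_ai_readiness(records: List[Dict[str, str]]) -> ReadinessReport:
      """Flag records missing (or holding empty) required metadata fields."""
      report = ReadinessReport()
      for record in records:
          missing = [f for f in REQUIRED_FIELDS if not record.get(f)]
          rid = record.get("identifier", "<no identifier>")
          if missing:
              report.incomplete[rid] = missing
          else:
              report.complete.append(rid)
      return report

  if __name__ == "__main__":
      sample_export = [
          {"identifier": "doi:10.1234/demo.1", "title": "Assay A", "license": "CC-BY-4.0",
           "ontology_term": "CHEBI:15377", "created": "2024-05-01"},
          {"identifier": "doi:10.1234/demo.2", "title": "Assay B"},  # missing fields
      ]
      result = check_ai_readiness(sample_export)
      print("Ready:", result.complete)
      print("Needs curation:", result.incomplete)

Running a check like this at export time, rather than after the fact, reflects the "FAIR and AI-ready from the start" idea described above.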

Noting that Wilkinson et al. originally highlighted the importance of machine-readability of FAIR data, Huerta et al. add that this core principle of FAIRness "is synergistic with the rapid adoption and increased use of AI in research."[32] They go on to discuss the positive interactions of FAIR research objects with FAIR-driven, AI-based research. The benefits include[32]:

  • greater findability of FAIR research objects for further AI-driven scientific discovery;
  • greater reproducibility of FAIR research objects and any AI models published with them;
  • improved generalization of AI-driven medical research models when exposed to diverse and FAIR research objects;
  • improved reporting of AI-driven research results using FAIRified research objects, lending further credibility to those results;
  • more uniform comparison of AI models using well-defined hyperparameters and training conditions from FAIRified research objects;
  • more developed and interoperable "data e-infrastructure," which can further drive a more effective "AI services layer";
  • reduced bias in AI-driven processes through the use of FAIR research objects and AI models; and
  • improved surety of scientific correctness where reproducibility in AI-driven research can't be guaranteed.

In the end, developers of research software (whether discipline-specific research software or broader laboratory informatics solutions) would be well advised to keep in mind the growing trends of FAIR research, FAIR software, and ML- and AI-driven research, especially in the life sciences but also in a variety of other fields.[32]

Restricted clinical data and its FAIRification for greater research innovation

Broader discussion continues in the research community regarding how best to ethically make restricted or privacy-protected clinical data and information FAIR for greater innovation and, by extension, improved patient outcomes, particularly in the wake of the COVID-19 pandemic.[33][34][35] (Note that while there are other types of restricted and privacy-protected data, this section will focus largely on clinical data and research objects as the most obvious type.)

These efforts have usually revolved around pulling reusable clinical patient or research data from hospital information systems (HIS), electronic medical records (EMRs), clinical trial management systems (CTMSs), and research databases (often relational in nature) that either contain de-identified data or can de-identify aspects of data and information before access and extraction. Sometimes that clinical data or research object may already have been partly FAIRified, but often it has not been. In all cases, the concepts of privacy, security, and anonymization come up as part of any desire to gain access to that clinical material. However, FAIRified clinical data isn't necessarily readily open for access. As Snoeijer et al. note: "The authors of the FAIR principles, however, clearly indicate that 'accessible' does not mean open. It means that clarity and transparency is required around the conditions governing access and reuse."[36]

This is mentioned in the context of laboratory informatics applications for a couple of reasons. First, a well-designed commercial LIMS that supports clinical research laboratory workflows will already address privacy and security aspects, as part of the developer recognizing the need for those labs to adhere to regulations such as the Health Insurance Portability and Accountability Act (HIPAA) and comply with standards such as ISO 15189. However, such a system may not have been developed with FAIR data principles in mind, and any built-in metadata and ontology schemes may be insufficient for full FAIRification of laboratory-based clinical trial research objects. As Queralt-Rosinach et al. note, however, "interestingly, ontologies may also be used to describe data access restrictions to complement FAIR metadata with information that supports data safety and patient privacy."[34] Essentially, the authors are suggesting that while a HIS or LIS may have built-in access management tools, setting up ontologies and metadata mechanisms that link privacy aspects of a research object (e.g., "has consent form for," "is de-identified," etc.) to the object's metadata allows for even more flexible, FAIR-driven approaches to privacy and security. Research software developers creating such information management tools for the regulated clinical research space may want to apply FAIR concepts like this to how access control and privacy restrictions are managed. This will inevitably mean any research objects exported with machine-readable, privacy-related metadata will be more reusable in a way that still "supports data safety and patient privacy."[34]
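As a rough illustration of that suggestion, the sketch below attaches privacy-related assertions directly to a hypothetical clinical research object's metadata using Python and rdflib. All namespaces, predicates (ex:isDeIdentified, ex:hasConsentFormFor), and values are invented for illustration; a real deployment would draw such terms from community-developed access and reuse ontologies like those named later in this section.

  from rdflib import Graph, Literal, Namespace, URIRef
  from rdflib.namespace import RDF, XSD

  # Illustrative namespace only; production systems would reuse community ontologies
  # for consent and data use conditions rather than ad hoc terms.
  EX = Namespace("https://example.org/clinic/")

  g = Graph()
  g.bind("ex", EX)

  specimen_record = URIRef("https://example.org/clinic/record/0042")

  # Privacy-relevant assertions attached directly to the research object's metadata,
  # mirroring the "is de-identified" / "has consent form for" idea from the text.
  g.add((specimen_record, RDF.type, EX.ClinicalResearchObject))
  g.add((specimen_record, EX.isDeIdentified, Literal(True)))
  g.add((specimen_record, EX.hasConsentFormFor, EX.Study_COVID_Serology))
  g.add((specimen_record, EX.accessCondition, Literal("controlled; data use agreement required")))
  g.add((specimen_record, EX.consentExpires, Literal("2027-12-31", datatype=XSD.date)))

  # A downstream system (or reviewer) can query these assertions before granting
  # access, keeping the object FAIR without making it open.
  for s, p, o in g.triples((specimen_record, None, None)):
      print(p, "->", o)

Because the access conditions live in the same graph as the rest of the object's metadata, they remain machine-readable alongside the scientific description, which echoes the point above that "accessible" does not mean open.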

Second, a well-designed research software solution working with clinical data will support not only open, community-developed data models and vocabularies for that data, but also standardized, community-driven ontologies specifically developed for access control and privacy. Queralt-Rosinach et al. continue[34]:

Also, very important for accessibility and data privacy is that the digital objects per se can accommodate the criteria and protocols necessary to comply with regulatory and governance frameworks. Ontologies can aid in opening and protecting patient data by exposing logical definitions of data use conditions. Indeed, there are ontologies to define access and reuse conditions for patient data such as the Informed Consent Ontology (ICO), the Global Alliance for Genomics and Health Data Use Ontology (DUO) standard, and the Open Digital Rights Language (ODRL) vocabulary recommended by W3C.

Also of note here are the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) and its OHDSI standardized vocabularies. In all these cases, a developer-driven approach to research software that incorporates community-driven standards supporting FAIR principles is welcome. However, as Maxwell et al. noted in their Lancet review article in late 2023, "few platforms or registries applied community-developed standards for participant-level data, further restricting the interoperability of ... data-sharing initiatives [like FAIR]."[33] As the FAIR principles continue to gain ground in clinical research and diagnostics settings, software developers will need to be more attuned to translating old ways of development into ones that incorporate FAIR data and software principles. Demand for FAIR data will only continue to grow, and any efforts to improve interoperability and reusability while honoring (and enhancing) the privacy and security aspects of restricted data will be appreciated by clinical researchers. However, just as FAIR is not an end in itself for researchers, software built with FAIR principles in mind is not the end point for research organizations managing restricted and privacy-protected research objects. Ultimately, those organizations will have to make other considerations about restricted data within the scope of FAIR, including addressing data management plans, data use agreements, disclosure review practices, and training as it applies to their research software and generated research objects.[37]

Conclusion

Laboratory informatics developers will also need to remember that FAIRification of research is not in itself a goal for research laboratories; it is a continual process through which improved scientific research and greater innovation become more likely outcomes.[1][31][32]

References

  1. Wilkinson, Mark D.; Dumontier, Michel; Aalbersberg, IJsbrand Jan; Appleton, Gabrielle; Axton, Myles; Baak, Arie; Blomberg, Niklas; Boiten, Jan-Willem et al. (15 March 2016). "The FAIR Guiding Principles for scientific data management and stewardship" (in en). Scientific Data 3 (1): 160018. doi:10.1038/sdata.2016.18. ISSN 2052-4463. PMC PMC4792175. PMID 26978244. https://www.nature.com/articles/sdata201618. 
  2. "fair data principles". PubMed Search. National Institutes of Health, National Library of Medicine. https://pubmed.ncbi.nlm.nih.gov/?term=fair+data+principles. Retrieved 30 April 2024. 
  3. Hasselbring, Wilhelm; Carr, Leslie; Hettrick, Simon; Packer, Heather; Tiropanis, Thanassis (25 February 2020). "From FAIR research data toward FAIR and open research software" (in en). it - Information Technology 62 (1): 39–47. doi:10.1515/itit-2019-0040. ISSN 2196-7032. https://www.degruyter.com/document/doi/10.1515/itit-2019-0040/html. 
  4. Gruenpeter, M. (23 November 2020). "FAIR + Software: Decoding the principles" (PDF). FAIRsFAIR “Fostering FAIR Data Practices In Europe”. https://www.fairsfair.eu/sites/default/files/FAIR%20%2B%20software.pdf. Retrieved 30 April 2024. 
  5. Barker, Michelle; Chue Hong, Neil P.; Katz, Daniel S.; Lamprecht, Anna-Lena; Martinez-Ortiz, Carlos; Psomopoulos, Fotis; Harrow, Jennifer; Castro, Leyla Jael et al. (14 October 2022). "Introducing the FAIR Principles for research software" (in en). Scientific Data 9 (1): 622. doi:10.1038/s41597-022-01710-x. ISSN 2052-4463. PMC PMC9562067. PMID 36241754. https://www.nature.com/articles/s41597-022-01710-x. 
  6. Patel, Bhavesh; Soundarajan, Sanjay; Ménager, Hervé; Hu, Zicheng (23 August 2023). "Making Biomedical Research Software FAIR: Actionable Step-by-step Guidelines with a User-support Tool" (in en). Scientific Data 10 (1): 557. doi:10.1038/s41597-023-02463-x. ISSN 2052-4463. PMC PMC10447492. PMID 37612312. https://www.nature.com/articles/s41597-023-02463-x. 
  7. Du, Xinsong; Dastmalchi, Farhad; Ye, Hao; Garrett, Timothy J.; Diller, Matthew A.; Liu, Mei; Hogan, William R.; Brochhausen, Mathias et al. (6 February 2023). "Evaluating LC-HRMS metabolomics data processing software using FAIR principles for research software" (in en). Metabolomics 19 (2): 11. doi:10.1007/s11306-023-01974-3. ISSN 1573-3890. https://link.springer.com/10.1007/s11306-023-01974-3. 
  8. Gruenpeter, Morane; Katz, Daniel S.; Lamprecht, Anna-Lena; Honeyman, Tom; Garijo, Daniel; Struck, Alexander; Niehues, Anna; Martinez, Paula Andrea et al. (13 September 2021). "Defining Research Software: a controversial discussion". Zenodo. doi:10.5281/zenodo.5504016. https://zenodo.org/record/5504016. 
  9. "What is Research Software?". JuRSE, the Community of Practice for Research Software Engineering. Forschungszentrum Jülich. 13 February 2024. https://www.fz-juelich.de/en/rse/about-rse/what-is-research-software. Retrieved 30 April 2024. 
  10. van Nieuwpoort, Rob; Katz, Daniel S. (14 March 2023) (in en). Defining the roles of research software. doi:10.54900/9akm9y5-5ject5y. https://upstream.force11.org/defining-the-roles-of-research-software. 
  11. "Open source software and code". F1000 Research Ltd. 2024. https://www.f1000.com/resources-for-researchers/open-research/open-source-software-code/. Retrieved 30 April 2024. 
  12. Moynihan, G. (7 July 2020). "The Hitchhiker’s Guide to Research Software Engineering: From PhD to RSE". Invenia Blog. Invenia Technical Computing Corporation. https://invenia.github.io/blog/2020/07/07/software-engineering/. 
  13. Woolston, Chris (31 May 2022). "Why science needs more research software engineers" (in en). Nature: d41586–022–01516-2. doi:10.1038/d41586-022-01516-2. ISSN 0028-0836. https://www.nature.com/articles/d41586-022-01516-2. 
  14. "RSE@KIT". Karlsruhe Institute of Technology. 20 February 2024. https://www.rse-community.kit.edu/index.php. Retrieved 01 May 2024. 
  15. "Purdue Center for Research Software Engineering". Purdue University. 2024. https://www.rcac.purdue.edu/rse. Retrieved 01 May 2024. 
  16. Cohen, Jeremy; Katz, Daniel S.; Barker, Michelle; Chue Hong, Neil; Haines, Robert; Jay, Caroline (1 January 2021). "The Four Pillars of Research Software Engineering". IEEE Software 38 (1): 97–105. doi:10.1109/MS.2020.2973362. ISSN 0740-7459. https://ieeexplore.ieee.org/document/8994167/. 
  17. Hasselbring, Wilhelm; Carr, Leslie; Hettrick, Simon; Packer, Heather; Tiropanis, Thanassis (25 February 2020). "From FAIR research data toward FAIR and open research software" (in en). it - Information Technology 62 (1): 39–47. doi:10.1515/itit-2019-0040. ISSN 2196-7032. https://www.degruyter.com/document/doi/10.1515/itit-2019-0040/html. 
  18. Ghiringhelli, Luca M.; Baldauf, Carsten; Bereau, Tristan; Brockhauser, Sandor; Carbogno, Christian; Chamanara, Javad; Cozzini, Stefano; Curtarolo, Stefano et al. (14 September 2023). "Shared metadata for data-centric materials science" (in en). Scientific Data 10 (1): 626. doi:10.1038/s41597-023-02501-8. ISSN 2052-4463. PMC PMC10502089. PMID 37709811. https://www.nature.com/articles/s41597-023-02501-8. 
  19. Fitschen, Timm; tom Wörden, Henrik; Schlemmer, Alexander; Spreckelsen, Florian; Hornung, Daniel (12 October 2022). "Agile Research Data Management with FDOs using LinkAhead". Research Ideas and Outcomes 8: e96075. doi:10.3897/rio.8.e96075. ISSN 2367-7163. https://riojournal.com/article/96075/. 
  20. Weigel, Tobias; Schwardmann, Ulrich; Klump, Jens; Bendoukha, Sofiane; Quick, Robert (1 January 2020). "Making Data and Workflows Findable for Machines" (in en). Data Intelligence 2 (1-2): 40–46. doi:10.1162/dint_a_00026. ISSN 2641-435X. https://direct.mit.edu/dint/article/2/1-2/40-46/9994. 
  21. Aggour, Kareem S.; Kumar, Vijay S.; Gupta, Vipul K.; Gabaldon, Alfredo; Cuddihy, Paul; Mulwad, Varish (9 April 2024). "Semantics-Enabled Data Federation: Bringing Materials Scientists Closer to FAIR Data" (in en). Integrating Materials and Manufacturing Innovation. doi:10.1007/s40192-024-00348-4. ISSN 2193-9764. https://link.springer.com/10.1007/s40192-024-00348-4. 
  22. Grobe, Peter; Baum, Roman; Bhatty, Philipp; Köhler, Christian; Meid, Sandra; Quast, Björn; Vogt, Lars (26 June 2019). "From Data to Knowledge: A semantic knowledge graph application for curating specimen data" (in en). Biodiversity Information Science and Standards 3: e37412. doi:10.3897/biss.3.37412. ISSN 2535-0897. https://biss.pensoft.net/article/37412/. 
  23. Rocca-Serra, Philippe; Sansone, Susanna-Assunta; Gu, Wei; Welter, Danielle; Abbassi Daloii, Tooba; Portell-Silva, Laura (30 June 2022). "FAIR and Knowledge graphs". D2.1 FAIR Cookbook. doi:10.5281/ZENODO.6783564. https://zenodo.org/record/6783564. 
  24. Tomlinson, E. (28 July 2023). "RDF Knowledge Graph Databases: A Better Choice for Life Science Lab Software" (PDF). Semaphore Solutions, Inc. https://21624527.fs1.hubspotusercontent-na1.net/hubfs/21624527/Resources/RDF%20Knowledge%20Graph%20Databases%20White%20Paper.pdf. Retrieved 01 May 2024. 
  25. Deagen, Michael E.; McCusker, Jamie P.; Fateye, Tolulomo; Stouffer, Samuel; Brinson, L. Cate; McGuinness, Deborah L.; Schadler, Linda S. (27 May 2022). "FAIR and Interactive Data Graphics from a Scientific Knowledge Graph" (in en). Scientific Data 9 (1): 239. doi:10.1038/s41597-022-01352-z. ISSN 2052-4463. PMC PMC9142568. PMID 35624233. https://www.nature.com/articles/s41597-022-01352-z. 
  26. Brandizi, Marco; Singh, Ajit; Rawlings, Christopher; Hassani-Pak, Keywan (25 September 2018). "Towards FAIRer Biological Knowledge Networks Using a Hybrid Linked Data and Graph Database Approach" (in en). Journal of Integrative Bioinformatics 15 (3): 20180023. doi:10.1515/jib-2018-0023. ISSN 1613-4516. PMC PMC6340125. PMID 30085931. https://www.degruyter.com/document/doi/10.1515/jib-2018-0023/html. 
  27. de Visser, Casper; Johansson, Lennart F.; Kulkarni, Purva; Mei, Hailiang; Neerincx, Pieter; Joeri van der Velde, K.; Horvatovich, Péter; van Gool, Alain J. et al. (28 September 2023). Palagi, Patricia M.. ed. "Ten quick tips for building FAIR workflows" (in en). PLOS Computational Biology 19 (9): e1011369. doi:10.1371/journal.pcbi.1011369. ISSN 1553-7358. PMC PMC10538699. PMID 37768885. https://dx.plos.org/10.1371/journal.pcbi.1011369. 
  28. Schröder, Max; Staehlke, Susanne; Groth, Paul; Nebe, J. Barbara; Spors, Sascha; Krüger, Frank (1 December 2022). "Structure-based knowledge acquisition from electronic lab notebooks for research data provenance documentation" (in en). Journal of Biomedical Semantics 13 (1): 4. doi:10.1186/s13326-021-00257-x. ISSN 2041-1480. PMC PMC8802522. PMID 35101121. https://jbiomedsem.biomedcentral.com/articles/10.1186/s13326-021-00257-x. 
  29. Hiniduma, Kaveen; Byna, Suren; Bez, Jean Luca (2024). "Data Readiness for AI: A 360-Degree Survey". arXiv. doi:10.48550/ARXIV.2404.05779. https://arxiv.org/abs/2404.05779. 
  30. Fletcher, Lydia (16 April 2024). FAIR Re-use: Implications for AI-Readiness. The University Of Texas At Austin, The University Of Texas At Austin. doi:10.26153/TSW/51475. https://repositories.lib.utexas.edu/handle/2152/124873. 
  31. Olsen, C. (1 September 2023). "Embracing FAIR Data on the Path to AI-Readiness". Pharma's Almanac. https://www.pharmasalmanac.com/articles/embracing-fair-data-on-the-path-to-ai-readiness. Retrieved 03 May 2024. 
  32. Huerta, E. A.; Blaiszik, Ben; Brinson, L. Catherine; Bouchard, Kristofer E.; Diaz, Daniel; Doglioni, Caterina; Duarte, Javier M.; Emani, Murali et al. (26 July 2023). "FAIR for AI: An interdisciplinary and international community building perspective" (in en). Scientific Data 10 (1): 487. doi:10.1038/s41597-023-02298-6. ISSN 2052-4463. PMC PMC10372139. PMID 37495591. https://www.nature.com/articles/s41597-023-02298-6. 
  33. Maxwell, Lauren; Shreedhar, Priya; Dauga, Delphine; McQuilton, Peter; Terry, Robert F; Denisiuk, Alisa; Molnar-Gabor, Fruzsina; Saxena, Abha et al. (1 October 2023). "FAIR, ethical, and coordinated data sharing for COVID-19 response: a scoping review and cross-sectional survey of COVID-19 data sharing platforms and registries" (in en). The Lancet Digital Health 5 (10): e712–e736. doi:10.1016/S2589-7500(23)00129-2. PMC PMC10552001. PMID 37775189. https://linkinghub.elsevier.com/retrieve/pii/S2589750023001292. 
  34. Queralt-Rosinach, Núria; Kaliyaperumal, Rajaram; Bernabé, César H.; Long, Qinqin; Joosten, Simone A.; van der Wijk, Henk Jan; Flikkenschild, Erik L.A.; Burger, Kees et al. (1 December 2022). "Applying the FAIR principles to data in a hospital: challenges and opportunities in a pandemic" (in en). Journal of Biomedical Semantics 13 (1): 12. doi:10.1186/s13326-022-00263-7. ISSN 2041-1480. PMC PMC9036506. PMID 35468846. https://jbiomedsem.biomedcentral.com/articles/10.1186/s13326-022-00263-7. 
  35. Martínez-García, Alicia; Alvarez-Romero, Celia; Román-Villarán, Esther; Bernabeu-Wittel, Máximo; Luis Parra-Calderón, Carlos (1 May 2023). "FAIR principles to improve the impact on health research management outcomes" (in en). Heliyon 9 (5): e15733. doi:10.1016/j.heliyon.2023.e15733. PMC PMC10189186. PMID 37205991. https://linkinghub.elsevier.com/retrieve/pii/S2405844023029407. 
  36. Snoeijer, B.; Pasapula, V.; Covucci, A. et al. (2019). "Paper SA04 - Processing big data from multiple sources" (PDF). Proceedings of PHUSE Connect EU 2019. PHUSE Limited. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2019/Connect/EU/Amsterdam/PAP_SA04.pdf. Retrieved 03 May 2024. 
  37. Jang, Joy Bohyun; Pienta, Amy; Levenstein, Margaret; Saul, Joe (6 December 2023). "Restricted data management: the current practice and the future". Journal of Privacy and Confidentiality 13 (2). doi:10.29012/jpc.844. ISSN 2575-8527. PMC PMC10956935. PMID 38515607. https://journalprivacyconfidentiality.org/index.php/jpc/article/view/844.