Journal:Expert search strategies: The information retrieval practices of healthcare information professionals

From LIMSWiki

Latest revision as of 19:45, 19 September 2021

Full article title	Expert search strategies: The information retrieval practices of healthcare information professionals
Journal	JMIR Medical Informatics
Author(s)	Russell-Rose, Tony; Chamberlain, Jon
Author affiliation(s)	UXLabs Ltd., University of Essex
Primary contact	Email: tgr at uxlabs dot co dot uk
Editors	Eysenbach, G.
Year published	2017
Volume and issue	5 (4)
Page(s)	e33
DOI	10.2196/medinform.7680
ISSN	2291-9694
Distribution license	Creative Commons Attribution 4.0 International
Website	http://medinform.jmir.org/2017/4/e33/
Download	http://medinform.jmir.org/2017/4/e33/pdf (PDF)

Abstract

Background: Healthcare information professionals play a key role in closing the knowledge gap between medical research and clinical practice. Their work involves meticulous searching of literature databases using complex search strategies that can consist of hundreds of keywords, operators, and ontology terms. This process is prone to error and can lead to inefficiency and bias if performed incorrectly.

Objective: The aim of this study was to investigate the search behavior of healthcare information professionals, uncovering their needs, goals, and requirements for information retrieval systems.

Methods: A survey was distributed to healthcare information professionals via professional association email discussion lists. It investigated the search tasks they undertake, their techniques for search strategy formulation, their approaches to evaluating search results, and their preferred functionality for searching library-style databases. The popular literature search system PubMed was then evaluated to determine the extent to which their needs were met.

Results: The 107 respondents indicated that their information retrieval process relied on the use of complex, repeatable, and transparent search strategies. On average it took 60 minutes to formulate a search strategy, with a search task taking four hours and consisting of 15 strategy lines. Respondents reviewed a median of 175 results per search task, far more than they would ideally like (100). The most desired features of a search system were merging search queries and combining search results.

Conclusions: Healthcare information professionals routinely address some of the most challenging information retrieval problems of any profession. However, their needs are not fully supported by current literature search systems, and there is demand for improved functionality, in particular regarding the development and management of search strategies.

Keywords: review, surveys and questionnaires, search engine, information management, information systems

Introduction

Background

Medical knowledge is growing so rapidly that it is difficult for healthcare professionals to keep up. As the volume of published studies increases each year^[1], the gap between research knowledge and professional practice grows.^[2] Frontline healthcare providers (such as general practitioners [GPs]) responding to the immediate needs of patients may employ a web-style search for diagnostic purposes, with Google being reported to be a useful diagnostic tool^[3]; however, the credibility of results depends on the domain.^[4] Medical staff may also perform more in-depth searches, such as rapid evidence reviews, where a concise summary of what is known about a topic or intervention is required.^[5]

Healthcare information professionals play the primary role in closing the gap between published research and medical practice, by synthesizing the complex, incomplete, and at times conflicting findings of biomedical research into a form that can readily inform healthcare decision making.^[6] The systematic literature review process relies on the painstaking and meticulous searching of multiple databases using complex Boolean search strategies that often consist of hundreds of keywords, operators, and ontology terms^[7] (Textbox 1).

Textbox 1. An example of a multi-line search strategy
1. Attention Deficit Disorder with Hyperactivity/
2. adhd
3. addh
4. adhs
5. hyperactiv$
6. hyperkin$
7. attention deficit$
8. brain dysfunction
9. or/1-8
10. Child/
11. Adolescent/
12. child$ or boy$ or girl$ or schoolchild$ or adolescen$ or teen$ or “young person$” or “young people$” or youth$
13. or/10-12
14. acupuncture therapy/ or acupuncture, ear/ or electroacupuncture/
15. accupunct$
16. or/14-15
17. 9 and 13 and 16
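To illustrate how combination lines such as "or/1-8" and "9 and 13 and 16" build on earlier lines, the following is a minimal sketch in Python. It is not part of the original study: the function name `combine`, the shortened six-line strategy, and the document IDs are all hypothetical, and the sketch assumes that "or/M-N" means the union of lines M to N and that numeric references joined by "and"/"or" are evaluated left to right.

```python
import re

def combine(strategy, hits):
    """Resolve each numbered strategy line to a set of document IDs.

    strategy: dict mapping line number -> line text
    hits: dict mapping line number -> set of IDs for plain term lines
          (toy data here; a real database would supply these)
    """
    results = {}
    for n in sorted(strategy):
        text = strategy[n].strip()
        span = re.fullmatch(r"(or|and)/(\d+)-(\d+)", text)
        if span:
            # Range form, e.g. "or/1-8": combine every line in the range.
            op, lo, hi = span.group(1), int(span.group(2)), int(span.group(3))
            sets = [results[i] for i in range(lo, hi + 1)]
            results[n] = set.union(*sets) if op == "or" else set.intersection(*sets)
        elif re.fullmatch(r"\d+( (and|or) \d+)+", text):
            # Explicit form, e.g. "9 and 13 and 16": evaluated left to right
            # (an assumption; no operator precedence is modelled).
            tokens = text.split()
            acc = set(results[int(tokens[0])])
            for op, ref in zip(tokens[1::2], tokens[2::2]):
                other = results[int(ref)]
                acc = acc | other if op == "or" else acc & other
            results[n] = acc
        else:
            # Plain term line: look up its (hypothetical) hits.
            results[n] = set(hits.get(n, set()))
    return results

# Shortened, hypothetical strategy in the style of Textbox 1.
strategy = {
    1: "adhd",
    2: "hyperactiv$",
    3: "or/1-2",
    4: "child$",
    5: "acupuncture therapy/",
    6: "3 and 4 and 5",
}
hits = {1: {101, 102}, 2: {102, 103}, 4: {102, 103, 104}, 5: {103, 105}}
resolved = combine(strategy, hits)
final = resolved[6]
```

Each concept (condition, population, intervention) is first broadened with "or" and the concepts are then intersected with "and", which is why the final line of a strategy typically references only the combination lines.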

Performing a systematic review is a resource-intensive and time-consuming undertaking, sometimes taking years to complete.^[8] It involves a lengthy content production process whose output relies heavily on the quality of the initial search strategy, particularly in ensuring that the scope is sufficiently exhaustive and that the review is not biased by easily accessible studies.^[9]

Numerous studies have been performed to investigate the healthcare information retrieval process and to better understand the challenges involved in strategy development, as it has been noted that online health resources are not created by healthcare professionals.^[10] For example, Grant^[11] used a combination of a semi-structured questionnaire and interviews to study researchers’ experiences of searching the literature, with particular reference to the use of optimal search strategies. McGowan et al.^[12] used a combination of a web-based survey and peer review forums to investigate what elements of the search process have the most impact on the overall quality of the resulting evidence base. Similarly, Gillies et al.^[13] used an online survey to investigate the review process, with a view to identifying problems and barriers for authors of Cochrane reviews. Ciapponi and Glujovsky^[14] also used an online survey to study the early stages of systematic review.

No single database can cover all the medical literature required for a systematic review, although some are considered to be a core element of any healthcare search strategy, such as MEDLINE^[15], Embase^[16], and the Cochrane Library.^[17] Consequently, healthcare information professionals may consult these sources along with a number of other, more specialized databases to fit the precise scope area.^[18]

A survey^[1] of online tools for searching literature databases using PubMed^[19], the online literature search service primarily for MEDLINE, showed that most tools were developed for managing search results (such as ranking, clustering into topics and enriching with semantics). Very few tools improved on the standard PubMed search interface or offered advanced Boolean string editing methods in order to support complex literature searching.

Objective

To improve the accuracy and efficiency of the literature search process, it is essential that information retrieval applications (in this case, databases of medical literature and the interfaces through which they are accessed) are designed to support the tasks, needs, and expectations of their users. To do so, they should consider the layers of context that influence the search task^[20] and how this affects the various phases in the search process.^[21] This study was designed to fill gaps in this knowledge by investigating the information retrieval practices of healthcare information professionals and contrasting their requirements with the level of support offered by a widely used literature search tool (PubMed).

The specific research questions addressed by this study were (1) How long do search tasks take when performed by healthcare information professionals? (2) How do they formulate search strategies and what kind of search functionality do they use? (3) How are search results evaluated? (4) What functionality do they value in a literature search system? (5) To what extent are their requirements and aspirations met by the PubMed literature search system?

In answering these research questions, we hope to provide direct comparisons with other professions (e.g., in terms of the structure, complexity, and duration of their search tasks).

Methods

Online survey

The survey instrument consisted of an online questionnaire of 58 questions divided into five sections. It was designed to align with the structure and content of Joho et al.’s^[22] survey of patent searchers and wherever possible also with Geschwandtner et al.’s^[23] survey of medical professionals to facilitate comparisons with other professions. The following were the five sections: (1) Demographics, the background and professional experience of the respondents; (2) Search tasks, the tasks that respondents perform when searching literature databases; (3) Query formulation, the techniques respondents used to formulate search strategies; (4) Evaluating search results, how respondents evaluate the results of their search tasks; and (5) Ideal functionality for searching databases, any other features that respondents value when searching literature databases.

The survey was designed to be completed in approximately 15 minutes and was pre-tested for face validity by two health sciences librarians.

Survey respondents were recruited by sending an email invitation with a link to the survey to five healthcare professional association mailing lists that deal with systematic reviews and medical librarianship: LIS-MEDICAL^[24], CLIN-LIB^[25], EVIDENCE-BASED-HEALTH^[26], expertsearching^[27], and the Cochrane Information Retrieval Methods Group (IRMG).^[28] It was also sent directly to the members of the Chartered Institute of Library and Information Professionals (CILIP) Healthcare Libraries special interest group.^[29] The recruitment message and start page of the survey described the eligibility criteria for survey participants, expected time to complete the survey, its purpose, and funding source.

The survey (Multimedia Appendix 1) was conducted using SurveyMonkey, a web-based software application.^[30] Data were collected from July to September 2015. A total of 218 responses were received, of which 107 (49.1%, 107/218) were complete (meaning all pages of the survey had been viewed and all compulsory questions responded to). Only complete surveys were examined. Since the number of unique individuals reached by the mailing list announcements is unknown, the participation rate cannot be determined.

Responses to numeric questions were not constrained to integers, as a pilot survey had shown that respondents preferred to put in approximate and/or expressive values. Text responses corresponding to numerical questions (questions 14 to 22 and 32 to 38; 16 in total) were normalized as follows: (1) when the respondent specified a range (e.g., 10 to 20 hours), the midpoint was entered (e.g., 15 hours); (2) when the respondent indicated a minimum (e.g., 10 years and greater), the minimum was entered (e.g., 10 years); and (3) when the respondent entered an approximate number (e.g., about 20), that number was entered (e.g., 20).
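The three normalization rules above can be sketched as a small Python function. This is an illustration only, not the procedure used in the study: the function name `normalize`, the regular expressions, and the handling of thousands separators are assumptions.

```python
import re

NUMBER = r"\d+(?:\.\d+)?"

def normalize(response):
    """Return a float for a free-text numeric answer, or None if no number is found."""
    # Drop commas so that "1,000" is read as 1000 (an assumption about input style).
    text = response.lower().replace(",", "")
    # Rule 1: a range such as "10 to 20" or "10-20" becomes its midpoint.
    rng = re.search(rf"({NUMBER})\s*(?:to|-)\s*({NUMBER})", text)
    if rng:
        low, high = float(rng.group(1)), float(rng.group(2))
        return (low + high) / 2
    # Rules 2 and 3: a stated minimum ("10 years and greater") or an
    # approximation ("about 20") both reduce to the number that was given.
    single = re.search(NUMBER, text)
    if single:
        return float(single.group())
    # No numerical data in the response.
    return None

examples = {
    "10 to 20 hours": normalize("10 to 20 hours"),
    "10 years and greater": normalize("10 years and greater"),
    "about 20": normalize("about 20"),
    "it varies": normalize("it varies"),
}
```

Responses for which the function returns `None` correspond to answers containing no numerical data.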

After normalizing, 8.29% (142/1712) of responses contained no numerical data and 21.61% (370/1712) of responses were normalized.
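The denominator of 1712 follows from the 16 numeric questions answered by each of the 107 complete respondents. A short arithmetic check, written as a Python sketch with illustrative variable names:

```python
respondents = 107
# Questions 14 to 22 and 32 to 38 were treated as numeric.
numeric_questions = len(range(14, 23)) + len(range(32, 39))
total_responses = respondents * numeric_questions

no_data_pct = round(100 * 142 / total_responses, 2)
normalized_pct = round(100 * 370 / total_responses, 2)
```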

Evaluation of PubMed

An evaluation of the PubMed search system was performed using online documentation^[31], best practice advice^[32], and direct testing of the interface using Boolean commands. In addition to the search portal, users can register to My NCBI, which provides additional functionality for saving search queries, managing results sets, and customizing filters, so this was included in the comparison. The mobile version of PubMed, PubMed Mobile^[33], does not offer extended functionality, so it was not considered in the evaluation. Although beyond the scope of this study, information seeking by healthcare practitioners on hand-held devices has been shown to save time and improve the early learning of new developments.^[34]

Results

Demographics

Of the respondents, 89.3% (92/103) were female. Their ages were distributed bi-modally, with peaks at 39 to 45 and 53 to 59, with a conflated average age of 46.0 (SD 10.9, N=104) (Figure 1).

Fig. 1 Age of respondents

The mean time for respondents' experience in their profession was 16.6 years (SD 10.0), greater than their 12.0 (SD 9.0) years of experience in the review of scientific literature (N=107, P<.01, paired t test). Most respondents worked full time (78.5%, 84/107), and the commissioning agents for their searches were predominantly internal (i.e., within the same organization [72.9%, 78/107]).

The majority of respondents were based in the U.K. (51.4%, 55/107), the U.S. (27.1%, 29/107), or Canada (7.5%, 8/107). The remaining respondents were from Australia (2.8%, 3/107), the Netherlands, Norway, and Germany (1.9% each, 2/107), as well as Denmark, Singapore, Uruguay, South Africa, Belgium, and Ireland (0.9% each, 1/107). All (100.0%, 107/107) respondents stated that the language they used most frequently for searching was English; however, 6.5% (7/107) stated that they did not use English most frequently for communication in their workplace.

The majority of respondents (81.3%, 87/107) worked in organizations that provide systematic reviews. These organizations also provided other services including reference management (72.0%, 77/107), rapid evidence reviews (63.6%, 68/107), background reviews (60.7%, 65/107), and critical appraisals (52.3%, 56/107).

Search tasks

We considered a search task in this context to be the creation of one or more strategy lines to search a specific collection of documents or databases, with task completion resulting in a set of search results that will be subject to further analysis. The output of this process is the search strategy, which is often published as part of the search documentation. This rationalization is in line with healthcare information professionals’ understanding, but the complexity of search tasks in this domain is discussed in more detail later.

The time respondents spent formulating search strategies, the time spent completing search tasks, and the number of strategy lines they used are shown in Table 1. Respondents were asked to estimate a minimum, average, and maximum for each of these measures, and the values reported here are the medians of each, with the interquartile range (IQR) shown in brackets (in the form Q1 to Q3). The final row shows the minimum, average, and maximum answers to the question: “What would you consider to be the ideal number of results returned for a typical search task?” On average, it takes 60 minutes to formulate a search strategy for a document collection, with the search task taking four hours to complete, and the final strategy consisting of 15 lines.

Table 1. Effort to complete search tasks and evaluate results
Task	Minimum (IQR^a)	Average (IQR)	Maximum (IQR)
Search time per document collection/database, minutes	20 (10-30)	60 (27.5-150)	228 (86-480)
Search task completion time, hours	1 (0.5-2)	4 (2-6.5)	14 (7-30)
Strategy lines per search task, n	5 (2.8-10)	15 (9.1-30)	59 (30-105)
Results examined from a search task, n	10 (5-32)	175 (75-500)	850 (400-5250)
Time to assess relevance of a single result/document, minutes	1 (0.5-2)	3 (1-5)	10 (5-25)
Ideal number of search results per search task, n	0	100	10,000
^aIQR: interquartile range
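As an illustration of how a cell of the form "median (Q1-Q3)" can be derived from a set of estimates, the following Python sketch uses the standard library `statistics` module. The input values are hypothetical and are not the survey data, and the quartile method (the module's default "exclusive" method) is an assumption, since the method used for Table 1 is not stated.

```python
import statistics

# Hypothetical "average" estimates of search time, in minutes.
estimates = [20, 30, 45, 60, 60, 90, 120, 150, 240]

median = statistics.median(estimates)
# quantiles(..., n=4) returns the three cut points Q1, Q2, and Q3.
q1, _, q3 = statistics.quantiles(estimates, n=4)

# Format in the same style as a table cell: median (Q1-Q3).
cell = f"{median:g} ({q1:g}-{q3:g})"
```

Different quartile conventions can give slightly different Q1 and Q3 values for small samples, so the method should be reported alongside the figures where possible.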

The data sources most frequently searched were MEDLINE (96.3%, 103/107), the Cochrane Library (87.9%, 94/107), and Embase (80.4%, 86/107) (Figure 2).

Fig. 2 Data sources most frequently searched

The majority of respondents (86.9%, 93/107) used previous search strategies or templates at least sometimes, suggesting that the value embodied in them is recognized and should be re-used wherever possible. In addition, most respondents (89.7%, 96/107) routinely share their search strategies in some form, either with colleagues in their workgroup, more broadly within their organization, or in some other capacity (e.g., with clients or as part of a published review).

Query formulation

We examined the mechanics of the query formulation process by asking respondents to indicate a level of agreement to statements using a five-point Likert scale ranging from 1 (strong disagreement) to 5 (strong agreement). The results are shown in Figure 3.

Fig. 3 Importance of query formulation functionality

When asked which taxonomies are regularly used, 74.8% (80/107) of respondents indicated they used MeSH, 45.8% (49/107) Emtree, and 18.7% (20/107) CINAHL headings.

When asked which combination of techniques they used to create their search strategies, 44.9% (48/107) stated they used a form-based query builder, 41.1% (44/107) did so manually on paper, and 40.2% (43/107) used a text editor. Only 9.3% (10/107) used some form of visual query builder.

Evaluating search results

Respondents indicated that the ideal number of results returned for a search task would be 100 documents, yet in practice they evaluate more than this (a median of 175 documents; Table 1). The ideal number of results and the actual number of results evaluated are strongly correlated (N=66, ρ=.661 [Spearman rank correlation]). The average time to assess relevance of a single document was three minutes.
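For readers unfamiliar with the Spearman rank correlation, the sketch below shows one way to compute it in plain Python: each variable is converted to ranks (ties receive the average of their positions) and the Pearson correlation of the ranks is taken. The function names and the five paired values are hypothetical and are not drawn from the survey data.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical paired answers: ideal vs. actual number of results examined.
ideal = [50, 100, 100, 200, 500]
actual = [80, 150, 175, 300, 400]
rho = spearman_rho(ideal, actual)
```

Because the measure depends only on rank order, it is less sensitive than a Pearson correlation to the very large values that some respondents reported.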

Respondents were asked to indicate on a five-point Likert scale how frequently they use search limits and restriction criteria to narrow down results. The results are shown in Figure 4.

Fig. 4 Usage of restriction criteria

We also examined respondents’ strategies for examining the search results. The most popular approaches were to “start with the result that looked most relevant” (54.2%, 58/107) or simply “select the first result” (23.4%, 25/107). No respondent suggested selecting the “most trustworthy source.”

Respondents were asked what types of activities^[35] they typically engaged in whilst completing their search task (Figure 5). “Locating, verifying, and evaluating results” were the most common activities (see Multimedia Appendix 1 for the full description of each activity, as provided to the respondents).

Fig. 5 Activities that respondents engage in when completing a search task

Ideal functionality for searching databases

We also examined other features related to search management, organization, and history that respondents value when performing search tasks. Respondents were asked to indicate a level of agreement to a statement using a five-point Likert scale ranging from 1 (strong disagreement) to 5 (strong agreement). The results are shown in Figure 6.

Fig. 6 Ideal features of a literature search system

Discussion

Here, we discuss the implications of the results, together with verbatim responses to the question “How could the process of creating and managing search strategies be improved for you?”, and contextualize the findings in relation to the PubMed literature search system.