Entrez

The Entrez (/ɒnˈtreɪ/)^[1] Global Query Cross-Database Search System is a federated search engine, or web portal, that allows users to search many discrete health sciences databases at the National Center for Biotechnology Information (NCBI) website.^[2] The NCBI is a part of the National Library of Medicine (NLM), which is itself a component of the National Institutes of Health (NIH), which in turn is a part of the United States Department of Health and Human Services. The name "Entrez" (a greeting meaning "Come in" in French) was chosen to reflect the spirit of welcoming the public to search the content available from the NLM.

Entrez Global Query is an integrated search and retrieval system that provides access to all databases simultaneously with a single query string and user interface. Entrez can efficiently retrieve related sequences, structures, and references. The Entrez system can provide views of gene and protein sequences and chromosome maps. Some textbooks are also available online through the Entrez system.

Features

The Entrez front page provides, by default, access to the global query. All databases indexed by Entrez can be searched via a single query string, which supports Boolean operators and search term tags that limit parts of the search statement to particular fields. The search returns a unified results page that shows the number of hits in each database, with each count linked to the actual search results for that database.
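As a rough illustration of how such a query string is put together, the following Python sketch composes field-tagged terms with a Boolean operator. The helper names `tag` and `combine` are hypothetical and not part of any NCBI library, and the field tags shown (`Gene Name`, `Organism`) are examples only; the tags that are actually available differ from database to database.

```python
def tag(term, field=None):
    """Return a search term, quoted if it contains spaces, with an optional [Field] tag."""
    quoted = f'"{term}"' if " " in term else term
    return f"{quoted}[{field}]" if field else quoted


def combine(operator, *clauses):
    """Join clauses with a Boolean operator (AND, OR, NOT) and wrap them in parentheses."""
    operator = operator.upper()
    if operator not in {"AND", "OR", "NOT"}:
        raise ValueError(f"unsupported Boolean operator: {operator}")
    return "(" + f" {operator} ".join(clauses) + ")"


# Example field tags are assumptions chosen for illustration.
query = combine(
    "AND",
    tag("BRCA1", "Gene Name"),
    tag("Homo sapiens", "Organism"),
)
print(query)
```

The resulting string could be pasted into a search box or passed to a programmatic interface. Parenthesising each combined group keeps operator precedence explicit when groups are nested.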

Entrez also provides a similar interface for searching each particular database and for refining search results. The Limits feature allows the user to narrow a search through a web forms interface. The History feature gives a numbered list of recently performed queries. Results of previous queries can be referred to by number and combined via Boolean operators. Search results can be saved temporarily in a Clipboard. Users with a MyNCBI account can save queries indefinitely and can also choose to have updates with new search results e-mailed for saved queries of most databases. Entrez is widely used in the field of biotechnology as a reference tool by students and professionals alike.

Databases

Entrez searches the following databases:

PubMed: biomedical literature citations and abstracts, including Medline (articles from mainly medical journals). Links to PubMed Central and other full-text resources are provided for articles from the 1990s.
PubMed Central: free, full-text journal articles
Site Search: NCBI web and FTP sites
Books: online books
Online Mendelian Inheritance in Man (OMIM)
Nucleotide: sequence database (GenBank)
Protein: sequence database (GenPept)
Genome: whole genome sequences and mapping
Structure: three-dimensional macromolecular structures
Taxonomy: organisms in GenBank Taxonomy
dbSNP: single nucleotide polymorphism
Gene:^[3] gene-centered information
HomoloGene: eukaryotic homology groups
PubChem Compound: unique small molecule chemical structures
PubChem Substance: deposited chemical substance records
Genome Project: genome project information
UniGene: gene-oriented clusters of transcript sequences
CDD: conserved protein domain database
PopSet: population study data sets (epidemiology)
GEO Profiles: expression and molecular abundance profiles
GEO DataSets: experimental sets of GEO data
Sequence read archive: high-throughput sequencing data
Cancer Chromosomes: cytogenetic databases
PubChem BioAssay: bioactivity screens of chemical substances
Probe: sequence-specific reagents
NLM Catalog: NLM bibliographic data for over 1.2 million journals, books, audiovisuals, computer software, electronic resources, and other materials resident in LocatorPlus (updated every weekday).

Access

In addition to using the search engine forms to query the data in Entrez, NCBI provides the Entrez Programming Utilities^[4] (eUtils) for more direct access to query results. The eUtils are accessed by posting specially formed URLs to the NCBI server and parsing the XML response. An eUtils SOAP interface also existed, but it was terminated in July 2015.^[5]
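The request-and-parse cycle can be sketched in Python using only the standard library. This is a minimal illustration, not a reference client. It assumes the ESearch utility (`esearch.fcgi`) with its `db`, `term`, and `retmax` parameters, and it builds a GET-style URL rather than a POST body. The function names and the sample XML are made up for the example, and the sample is a trimmed imitation of an ESearch response rather than real output.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Base address of the E-utilities endpoints.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db, term, retmax=20):
    """Build an ESearch URL; urlencode escapes spaces and brackets in the term."""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def parse_esearch(xml_text):
    """Extract the total hit count and the list of returned record IDs."""
    root = ET.fromstring(xml_text)
    count = int(root.findtext("Count", default="0"))
    ids = [el.text for el in root.findall("./IdList/Id")]
    return count, ids


# Trimmed, made-up sample shaped like an ESearch response (illustrative IDs).
SAMPLE_XML = """<eSearchResult>
  <Count>2</Count>
  <RetMax>2</RetMax>
  <RetStart>0</RetStart>
  <IdList>
    <Id>11111111</Id>
    <Id>22222222</Id>
  </IdList>
</eSearchResult>"""

url = build_esearch_url("pubmed", "cancer AND review[Publication Type]", retmax=5)
count, ids = parse_esearch(SAMPLE_XML)
print(url)
print(count, ids)

# To query the live service instead of the sample, something like:
#   from urllib.request import urlopen
#   with urlopen(url) as response:
#       count, ids = parse_esearch(response.read())
```

Keeping URL construction separate from XML parsing lets each half be checked without network access. A real client would also need to handle errors and respect NCBI's usage guidelines.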

History

In 1991, Entrez was introduced in CD form. In 1993, a client-server version of the software provided connectivity with the internet. In 1994, NCBI established a website, and Entrez was a part of this initial release. In 2001, the Entrez Bookshelf was released, and in 2003, the Entrez Gene database was developed.^[6]