Difference between revisions of "Chemical informatics"

Revision as of 21:55, 19 May 2014

Chemical informatics (more commonly known as chemoinformatics and cheminformatics) is the application of computer and informational techniques to a range of problems in the field of chemistry. While the field has been around since roughly the 1990s, the rise in high-throughput screening (a scientific experimentation method primarily used in drug discovery) and combinatorial chemistry (a method of synthesizing a large number of compounds in a single process), as well as increases in computing power and data storage sizes, have increased interest in the field in the twenty-first century.^[1]

Outside of pharmaceutical research, other applications of chemical informatics include topology, chemical graph theory, and the mining of chemical space. It can also be applied to data analysis for the paper, pulp, and dye industries.^[2]^[1]

History

The 1960s saw the introduction of databases for the storage and retrieval of chemical structures, as well as three-dimensional molecular modeling methods, laying the groundwork for future generations to improve computational methods of chemical and molecular analysis.^[2]

The term "chemoinformatics" was defined by F.K. Brown in 1998 as follows:^[3]^[4]

Chemoinformatics is the mixing of those information resources to transform data into information and information into knowledge for the intended purpose of making better decisions faster in the area of drug lead identification and optimization.

Since then, both the "chem" and "chemo" spellings have been used. European academia settled on the term "chemoinformatics" for its 2006 Obernai research and teaching workshop.^[5] Other entities like the Journal of Cheminformatics and Slovak company Molinspiration have trended towards "cheminformatics."^[6]^[7]

Application

Storage and retrieval

The primary application of chemical informatics is in the storage and retrieval of both structured and unstructured information relating to chemical structures, molecular models, and other chemical data. Efficiently querying and retrieving that stored information extends into other realms of computer science like data mining and machine learning. Other forms of data querying include graph, molecule, sequence, and tree mining.^[8]
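As a rough illustration of graph mining over stored structures, the sketch below counts which bonded element pairs recur across a small database and retrieves the molecules containing a given pair. The database contents, the function names, and the simplification of ignoring hydrogens and bond orders are all assumptions made for this example; it is not taken from any particular chemical database system.

```python
from collections import Counter

# Hypothetical toy database: each molecule is a list of atom symbols plus
# bonds given as pairs of atom indices (hydrogens and bond orders omitted).
DATABASE = {
    "ethanol": (["C", "C", "O"], [(0, 1), (1, 2)]),
    "methylamine": (["C", "N"], [(0, 1)]),
    "acetic acid": (["C", "C", "O", "O"], [(0, 1), (1, 2), (1, 3)]),
}


def bond_labels(atoms, bonds):
    """Return the set of element-pair labels (e.g. 'C-O') present in one molecule."""
    return {"-".join(sorted((atoms[i], atoms[j]))) for i, j in bonds}


def frequent_bonds(database, min_support):
    """Count how many molecules contain each bond label; keep the frequent ones."""
    support = Counter()
    for atoms, bonds in database.values():
        support.update(bond_labels(atoms, bonds))
    return {label: n for label, n in support.items() if n >= min_support}


def find_with_bond(database, label):
    """Retrieve the names of all molecules that contain the given bond label."""
    return sorted(
        name
        for name, (atoms, bonds) in database.items()
        if label in bond_labels(atoms, bonds)
    )
```

Calling frequent_bonds(DATABASE, 2) keeps only the pairs found in at least two molecules, while find_with_bond(DATABASE, "C-N") acts as a simple retrieval query. Real graph mining considers larger connected subgraphs rather than single bonds.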

Representation

The in silico representation of chemical structures uses specialized formats such as the XML-based Chemical Markup Language or Simplified Molecular-Input Line-Entry System (SMILES) specifications. These representations are often used for storage in large chemical databases. While some formats are suited for visual representations in two or three dimensions, others are more suited for studying physical interactions, modeling, and docking studies.^[8]
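To give a feel for how compact a line notation like SMILES is, the following sketch counts the heavy (non-hydrogen) atoms in a SMILES string. It is a deliberately partial reader and not a SMILES parser: it assumes only unbracketed "organic subset" atoms appear, ignores bonds, branches, and ring closures, and rejects bracket atoms outright. The function and pattern names are made up for this example.

```python
import re
from collections import Counter

# Matches only unbracketed organic-subset atoms. Two-letter symbols come
# first so that "Cl" is not read as carbon followed by a stray "l".
ATOM_PATTERN = re.compile(r"Cl|Br|[BCNOPSFI]|[bcnops]")


def heavy_atom_counts(smiles):
    """Count heavy atoms per element in a simple SMILES string.

    Lowercase (aromatic) atoms are folded into their element symbol.
    Bracket atoms such as [Na+] are outside the scope of this sketch.
    """
    if "[" in smiles:
        raise ValueError("bracket atoms are not handled by this sketch")
    atoms = (token.capitalize() for token in ATOM_PATTERN.findall(smiles))
    return dict(Counter(atoms))
```

For instance, heavy_atom_counts("CC(=O)O") reads acetic acid as two carbons and two oxygens, and heavy_atom_counts("c1ccccc1") reads benzene as six carbons. A production toolkit would build the full molecular graph instead of discarding the connectivity.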

Virtual libraries

Stored chemical data can pertain to both real and virtual molecules. Virtual libraries of such molecules and compounds may be generated in various ways to explore chemical space and hypothesize novel compounds with desired properties. The Fragment Optimized Growth (FOG) algorithm, for example, was developed to "grow" novel classes of compounds like drugs, natural products, and diversity-oriented synthetic products from a training database of existing compounds.^[9]^[10]
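The general idea of growing new compounds from a training database can be sketched with a first-order Markov chain over fragment labels. This is a toy illustration only and should not be read as the FOG algorithm itself: the fragment labels, the training sequences, and the function names are hypothetical, and no chemical validity checks are performed.

```python
import random
from collections import defaultdict

START, END = "<start>", "<end>"

# Hypothetical training database: each compound is a sequence of fragment labels.
TRAINING = [
    ["methyl", "methylene", "hydroxyl"],
    ["methyl", "amine", "methyl"],
    ["phenyl", "methylene", "hydroxyl"],
]


def train_transitions(sequences):
    """Count how often each fragment is followed by each other fragment."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        path = [START, *seq, END]
        for current, nxt in zip(path, path[1:]):
            counts[current][nxt] += 1
    return counts


def grow(counts, rng, max_len=8):
    """Grow one fragment sequence by sampling transitions in proportion to their counts."""
    current, result = START, []
    while len(result) < max_len:
        options = counts[current]
        nxt = rng.choices(list(options), weights=list(options.values()))[0]
        if nxt == END:
            break
        result.append(nxt)
        current = nxt
    return result
```

A call such as grow(train_transitions(TRAINING), random.Random(0)) returns a new fragment sequence in which every step from one fragment to the next was observed somewhere in the training data, which is what keeps the generated compounds similar to the database they were trained on.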

Virtual screening

In contrast to high-throughput screening, virtual screening involves computationally screening in silico libraries of compounds, by means of various methods such as docking, to identify members likely to possess desired properties such as biological activity against a given target. In some cases, combinatorial chemistry is used in the development of the library to increase the efficiency in mining the chemical space. More commonly, a diverse library of small molecules or natural products is screened.^[1]
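Docking requires three-dimensional structures and a scoring function, so the sketch below swaps in a simpler and widely used ligand-based approach: ranking library members by the Tanimoto similarity of their fingerprints to a known active compound. The fingerprints here are hypothetical sets of feature identifiers, and the compound names, threshold, and function names are assumptions made for the example.

```python
# Hypothetical fingerprints: each compound is a set of structural feature IDs.
LIBRARY = {
    "cmpd_a": frozenset({1, 2, 3, 4}),
    "cmpd_b": frozenset({2, 3, 4, 5, 6}),
    "cmpd_c": frozenset({7, 8}),
}
QUERY = frozenset({1, 2, 3, 4})  # fingerprint of a hypothetical known active


def tanimoto(a, b):
    """Tanimoto coefficient: shared features divided by total distinct features."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def screen(library, query, threshold):
    """Return (name, score) pairs scoring at or above the threshold, best first."""
    scored = ((name, tanimoto(fp, query)) for name, fp in library.items())
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

Running screen(LIBRARY, QUERY, 0.5) keeps the two compounds that share at least half of their combined features with the query and drops the unrelated one. In practice the fingerprints would be computed from the structures by a cheminformatics toolkit.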

Quantitative structure-activity relationship (QSAR)

QSAR is the calculation of quantitative structure-activity relationship and quantitative structure-property relationship (QSPR) values, used to predict the activity of compounds from their structures. In this context there is also a strong relationship to chemometrics, the science of extracting information from chemical systems by data-driven means. Chemical expert systems are also relevant since they represent parts of chemical knowledge as an in silico representation.^[1]
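In its simplest form, a structure-activity model is a regression from a computed descriptor to a measured activity. The sketch below fits a one-descriptor linear model by ordinary least squares and uses it to predict the activity of a new compound. The descriptor and activity values are made-up numbers for illustration, and real QSAR models typically use many descriptors and more robust statistical methods.

```python
# Hypothetical training data: one structural descriptor per compound and the
# corresponding measured activity (made-up values, for illustration only).
DESCRIPTOR = [1.0, 2.0, 3.0, 4.0]
ACTIVITY = [3.0, 5.0, 7.0, 9.0]


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Predict the activity of a compound from its descriptor value."""
    slope, intercept = model
    return slope * x + intercept
```

After model = fit_line(DESCRIPTOR, ACTIVITY), a call like predict(model, 5.0) estimates the activity of an untested compound whose descriptor value is 5.0.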

External links

Notes

This article reuses portions of content from the Wikipedia article.

References

[1] Leach, Andrew R.; Gillet, Valerie J. (2007). An Introduction to Chemoinformatics. Springer. pp. 256. ISBN 9781402062902. http://books.google.com/books?id=4z7Q87HgBdwC&printsec=frontcover. Retrieved 19 May 2014.
[2] Gasteiger, Johann (ed.); Engel, Thomas (ed.) (2006). "Chapter 1: Introduction". Chemoinformatics: A Textbook. John Wiley & Sons. pp. 1–14. ISBN 9783527606504. http://books.google.com/books?id=LCD-1vHBHIAC&printsec=frontcover.
[3] Brown, F.K. (1998). "Chapter 35. Chemoinformatics: What is it and how does it impact drug discovery". Annual Reports in Medicinal Chemistry 33: 375–384. doi:10.1016/S0065-7743(08)61100-8. http://www.sciencedirect.com/science/article/pii/S0065774308611008.
[4] Brown, Frank (May 2005). "Editorial Opinion: Chemoinformatics – a ten year update". Current Opinion in Drug Discovery & Development 8 (3): 298–302. http://www.ncbi.nlm.nih.gov/pubmed/15892243.
[5] "Workshop Chemoinformatics in Europe: Research and Teaching". Laboratoire de Chémoinformatique, University of Strasbourg. 2006. http://infochim.u-strasbg.fr/chemoinformatics/. Retrieved 19 May 2014.
[6] "Cheminformatics or Chemoinformatics?". Molinspiration Cheminformatics. December 2009. http://www.molinspiration.com/chemoinformatics.html. Retrieved 19 May 2014.
[7] "About Journal of Cheminformatics". Chemistry Central. http://www.jcheminf.com/about. Retrieved 19 May 2014.
[8] Gasteiger, Johann (ed.); Engel, Thomas (ed.) (2006). "Chapter 2: Representation of Chemical Compounds". Chemoinformatics: A Textbook. John Wiley & Sons. pp. 15–157. ISBN 9783527606504. http://books.google.com/books?id=LCD-1vHBHIAC&printsec=frontcover.
[9] Kutchukian, Peter S.; Lou, David; Shakhnovich, Eugene I. (2009). "FOG: Fragment Optimized Growth Algorithm for the de Novo Generation of Molecules Occupying Druglike Chemical Space". Journal of Chemical Information and Modeling 49 (7): 1630–1642. doi:10.1021/ci9000458. PMID 19527020. http://pubs.acs.org/doi/abs/10.1021/ci9000458.
[10] Kutchukian, Peter S.; Virtanen, Salla I.; Lounkine, Eugen; Glick, Meir; Shakhnovich, Eugene I.; Schneider, Gisbert (ed.) (2013). "Chapter 13: Construction of Drug-Like Compounds by Markov Chains". De novo Molecular Design. John Wiley & Sons. ISBN 9783527677009. http://books.google.com/books?id=Jf1QAQAAQBAJ&pg=PA311. Retrieved 19 May 2014.


@@ Line 1: / Line 1: @@
-'''Cheminformatics''' (also known as '''chemoinformatics''' and '''chemical informatics''')  is the use of computer and informational techniques, applied to a range of problems in the field of chemistry. These ''in silico'' techniques are used in pharmaceutical companies in the process of [[drug design|drug discovery]]. These methods can also be used in chemical and allied industries in various other forms.
+'''Chemical informatics''' (more commonly known as '''chemoinformatics''' and '''cheminformatics''')  is the use of computer and informational techniques applied to a range of problems in the field of chemistry. While the field has roughly been around around since the 1990s, the rise in high-throughput screening (a scientific experimentation method primarily used in drug discovery) and combinatorial chemistry (a method of synthesizing a large number of compounds in a single process), as well as increases in computing power and data storage sizes, have increased interest in the field in the twenty-first century.<ref name="LeachIntroChem">{{cite book |url=http://books.google.com/books?id=4z7Q87HgBdwC&printsec=frontcover |title=An Introduction to Chemoinformatics |author=Leach, Andrew R.; Gillet, Valerie J. |publisher=Springer |version=Revised |year=2007 |pages=256 |isbn=9781402062902 |accessdate=19 May 2014}}</ref>
-== History ==
+Outside of pharmaceutical research, other applications of chemical informatics include the area of topology, chemical graph theory, and mining the chemical space. It can also be applied to data analysis for the paper, pulp, and dye industries.<ref name="Gasteiger2006">{{cite book |url=http://books.google.com/books?id=LCD-1vHBHIAC&printsec=frontcover |title=Chemoinformatics: A Textbook |chapter=Chapter 1: Introduction |author=Gasteiger, Johann (ed.) ; Engel, Thomas (ed.) |publisher=John Wiley & Sons |year=2006 |pages=1–14 |isbn=9783527606504}}</ref><ref name="LeachIntroChem" />
-The term chemoinformatics was defined by F.K. Brown <ref name="Brown_1998">{{cite journal | author = F.K. Brown | title = Chapter 35. Chemoinformatics: What is it and How does it Impact Drug Discovery | journal = Annual Reports in Med. Chem. | year = 1998 | volume = 33 | pages = 375 | doi = 10.1016/S0065-7743(08)61100-8}}</ref><ref>{{cite journal | author = Brown, Frank | title = Editorial Opinion: Chemoinformatics – a ten year update | journal = Current Opinion in Drug Discovery & Development| year = 2005 | volume = 8 | issue = 3 | pages = 296–302}}</ref> in 1998:
+==History==
+The 1960s saw the introduction of databases for the storage and retrieval of chemical structures, as well as three-dimensional molecular modeling methods, laying the groundwork for future generations to improve computational methods of chemical and molecular analysis.<ref name="Gasteiger2006" />
+The term "chemoinformatics" was defined by F.K. Brown<ref name="Brown1998">{{cite journal |url=http://www.sciencedirect.com/science/article/pii/S0065774308611008 |journal=Annual Reports in Medicinal Chemistry |title=Chapter 35. Chemoinformatics: What is it and how does it impact drug discovery |author=Brown, F.K. |year=1998 |volume=33 |pages=375–384 |doi=10.1016/S0065-7743(08)61100-8}}</ref><ref>{{cite journal |url=http://www.ncbi.nlm.nih.gov/pubmed/15892243 |journal=Current Opinion in Drug Discovery & Development |title=Editorial Opinion: Chemoinformatics – a ten year update |author=Brown, Frank |year=May 2005 |volume=8 |issue=3 |pages=298–302}}</ref> in 1998 as such:
 <blockquote>
@@ Line 9: / Line 13: @@
 </blockquote>
-Since then, both spellings have been used, and some have evolved to be established as Cheminformatics,<ref>[http://www.molinspiration.com/chemoinformatics.html Cheminformatics or Chemoinformatics ?<!-- Bot generated title -->]</ref> while European Academia settled in 2006 for Chemoinformatics.<ref name="Obernai">[http://infochim.u-strasbg.fr/chemoinformatics/Obernai%20Declaration.pdf Obernai Declaration]</ref> The recent establishment of the Journal of Cheminformatics is a strong push towards the shorter variant.
+Since then, both the "chem" and "chemo" spellings have been used. European academia settled on the term "chemoinformatics" for its 2006 Obernai research and teaching workshop.<ref name="Obernai">{{cite web |url=http://infochim.u-strasbg.fr/chemoinformatics/ |title=Workshop Chemoinformatics in Europe: Research and Teaching |publisher=Laboratoire de Chémoinformatique, University of Strasbourg |year=2006 |accessdate=19 May 2014}}</ref> Other entities like the Journal of Cheminformatics and Slovak company Molinspiration have trended towards "cheminformatics."<ref name="MolChem">{{cite web |url=http://www.molinspiration.com/chemoinformatics.html |title=Cheminformatics or Chemoinformatics? |publisher=Molinspiration Cheminformatics |date=December 2009 |accessdate=19 May 2014}}</ref><ref name="JournChem">{{cite web |url=http://www.jcheminf.com/about |title=About ''Journal of Cheminformatics'' |publisher=Chemistry Central |accessdate=19 May 2014}}</ref>
-== Basics ==
+==Application==
-Cheminformatics combines the scientific working fields of chemistry and computer science for example in the area of topology and chemical graph theory and mining the chemical space.<ref name="Gasteiger_2004">Gasteiger J.(Editor), Engel T.(Editor): ''Chemoinformatics : A Textbook''. John Wiley & Sons, 2004, ISBN 3-527-30681-1</ref><ref>A.R. Leach, V.J. Gillet: ''An Introduction to Chemoinformatics''.  Springer, 2003, ISBN 1-4020-1347-7</ref>
-Cheminformatics can also be applied to data analysis for various industries like [[paper industry|paper]] and [[pulp industry|pulp]], [[dye industry|dyes]] and such allied industries.
-== Applications ==
 ===Storage and retrieval===
-The primary application of cheminformatics is in the storage of information relating to compounds. The efficient search of such stored information includes topics that are dealt with in computer science as [[data mining]] and machine learning. Related research topics include:
+The primary application of chemical informatics is in the storage and retrieval of both structured and unstructured information relating to chemical structures, molecular models and other chemical data. Efficiently querying and retrieving that stored information extends into other realms of computer science like data mining and machine learning. Other forms of data querying include graph, molecule, sequence, and tree mining.<ref name="Gasteiger2006Ch2">{{cite book |url=http://books.google.com/books?id=LCD-1vHBHIAC&printsec=frontcover |title=Chemoinformatics: A Textbook |chapter=Chapter 2: Representation of Chemical Compounds |author=Gasteiger, Johann (ed.) ; Engel, Thomas (ed.) |publisher=John Wiley & Sons |year=2006 |pages=15–157 |isbn=9783527606504}}</ref>
-* Unstructured data
-* Structured Data Mining and mining of Structured data
-** [[Database mining]]
-** Graph mining
-** Molecule mining
-** Sequence mining
-** Tree mining
-==== File formats ====
-The ''in silico'' representation of chemical structures uses specialized formats such as the XML-based Chemical Markup Language or SMILES.  These representations are often used for storage in large chemical databases. While some formats are suited for visual representations in 2 or 3 dimensions, others are more suited for studying physical interactions, modeling and docking studies.
-=== Virtual libraries ===
+===Representation===
-Chemical data can pertain to real or virtual molecules.  Virtual libraries of compounds
+The ''in silico'' representation of chemical structures uses specialized formats such as the XML-based Chemical Markup Language or Simplified Molecular-Input Line-Entry System (SMILES) specifications. These representations are often used for storage in large chemical databases. While some formats are suited for visual representations in two or three dimensions, others are more suited for studying physical interactions, modeling, and docking studies.<ref name="Gasteiger2006Ch2" />
-may be generated in various ways to explore chemical space and hypothesize novel
-compounds with desired properties.
-Virtual libraries of classes of compounds (drugs, natural products, diversity-oriented synthetic products) were recently generated using the FOG (fragment optimized growth) algorithm.
+===Virtual libraries===
-<ref>{{cite journal|title=FOG: Fragment Optimized Growth Algorithm for the de Novo Generation of Molecules occupying Druglike Chemical | last=Kutchukian | first=Peter  | coauthors=Lou, David; Shakhnovich, Eugene |journal=Journal of Chemical Information and Modeling | year=2009 |volume=49 | pages=1630–1642|doi=10.1021/ci9000458|pmid=19527020|issue=7 }}</ref>  This was done by using cheminformatic tools to train transition probabilities of a Markov chain on authentic classes of compounds, and then using the Markov chain to generate novel compounds that were similar to the training database.
+Stored chemical data can pertain to both real and virtual molecules. Virtual libraries of such molecules and compounds may be generated in various ways to explore chemical space and hypothesize novel compounds with desired properties. The Fragment Optimized Growth (FOG) algorithm, for example, was developed to "grow" novel classes of compounds like drugs, natural products, and diversity-oriented synthetic products from a training database of existing compounds.<ref name="KutchFOG">{{cite journal |url=http://pubs.acs.org/doi/abs/10.1021/ci9000458 |journal=Journal of Chemical Information and Modeling |title=FOG: Fragment Optimized Growth Algorithm for the de Novo Generation of Molecules Occupying Druglike Chemical Space |author=Kutchukian, Peter S.; Lou, David; Shakhnovich, Eugene I. |year=2009 |volume=49 |issue=7 | pages=1630–1642 |doi=10.1021/ci9000458 |pmid=19527020}}</ref><ref name="SchneiderDeNovo">{{cite book |url=http://books.google.com/books?id=Jf1QAQAAQBAJ&pg=PA311 |chapter=Chapter 13: Construction of Drug-Like Compounds by Markov Chains |title=De novo Molecular Design |author=Kutchukian, Peter S.; Virtanen, Salla I.; Lounkine, Eugen; Glick, Meir; Shakhnovich, Eugene I.; Schneider, Gisbert (ed.) |publisher=John Wiley & Sons |year=2013 |isbn=9783527677009 |accessdate=19 May 2014}}</ref>
-=== Virtual screening ===
+===Virtual screening===
-In contrast to high-throughput screening, virtual screening involves computationally
+In contrast to high-throughput screening, virtual screening involves computationally screening ''in silico'' libraries of compounds, by means of various methods such as docking, to identify members likely to possess desired properties such as biological activity against a given target. In some cases, combinatorial chemistry is used in the development of the library to increase the efficiency in mining the chemical space. More commonly, a diverse library of small molecules or natural products is screened.<ref name="LeachIntroChem" />
-screening ''in silico'' libraries of compounds, by means of various methods such as
-docking, to identify members likely to possess desired properties
-such as biological activity against a given target. In some cases, combinatorial chemistry is used in the development of the library to increase the efficiency in mining the chemical space. More commonly, a diverse library of small molecules or natural products is screened.
-===Quantitative structure-activity relationship (QSAR) ===
+===Quantitative structure-activity relationship (QSAR)===
-This is the calculation of quantitative structure-activity relationship and quantitative structure property relationship values, used to predict the activity of compounds from their structures. In this context there is also a strong relationship to [[Chemometrics]]. Chemical expert systems are also relevant, since they represent parts of chemical knowledge as an ''in silico'' representation.
+This is the calculation of quantitative structure-activity relationship and quantitative structure property relationship values, used to predict the activity of compounds from their structures. In this context there is also a strong relationship to chemometrics, the science of extracting information from chemical systems by data-driven means. Chemical expert systems are also relevant since they represent parts of chemical knowledge as an ''in silico'' representation.<ref name="LeachIntroChem" />
-== See also ==
+==See also==
 * [[Bioinformatics]]
 * [[Data analysis]]
 == External links ==
-* [http://www.eyesopen.com/oechem-tk OEChem Cheminformatics Programming Toolkit]
+* [http://www.genomicglossaries.com/content/chemoinformatics_gloss.asp Cambridge Healthtech Institute Cheminformatics/ Chemoinformatics Glossary & Taxonomy]
-* The [http://www.blueobelisk.org/ Blue Obelisk] Movement
+* [http://icep.wikispaces.com/ Indiana Cheminformatics Education Portal]
-* The [http://www.echeminfo.com/ eCheminfo] Network and Community of Practice
+* [http://www.blueobelisk.org/ The Blue Obelisk Project]
-* [http://cheminfo.informatics.indiana.edu Cheminformatics at Indiana University]
-* [http://icep.wikispaces.com Indiana Cheminformatics Education Portal]
-* [http://reccr.chem.rpi.edu Cheminformatics at Rensselaer Polytechnic Institute]
 * [http://www.csa-trust.org The Chemical Structure Association Trust]
-* [http://www.cheminformatics.org Comprehensive cheminformatics link list and data set repository]
+* [http://www.echeminfo.com/ The eCheminfo Network and Community of Practice]
-* [http://www.genomicglossaries.com/content/chemoinformatics_gloss.asp A cheminformatics glossary]
+* [http://www.ukqsar.org The UK-QSAR and ChemoInformatics Group]
-* [http://moltable.ncl.res.in  Chemoinformatics initiatives at NCL Pune, India]
-* [http://moltable.ncl.res.in/icci  International Conference on Chemoinformatics at NCL,Pune]
-* Chemical Informatics [http://www.informatics.indiana.edu/academics/chem.asp Education] and [http://www.informatics.indiana.edu/djwild Research] at Indiana University
-* Famous [http://joelib.sourceforge.net/wiki/index.php/Cheminformatics_and_mining_quotation Cheminformatics quotations]
-* [http://www.qsar.org The Cheminformatics and QSAR Society]
-* [http://www.ukqsar.org UK-QSAR and ChemoInformatics Group]
-* [http://www.zbh.uni-hamburg.de/study/what_is_CI/index.php?language=en Education and Research at the University of Hamburg]
-* [http://www-ucc.ch.cam.ac.uk/ Cheminformatics research at the Unilever Centre for Molecular Informatics, Cambridge, UK]
-* [http://daniel.iut.univ-metz.fr/yachs YACHS Yet Another CHemistry Summarizer, Laboratoire Informatique d'Avignon LIA, France]
-* [http://www.novamechanics.com/ Cheminformatics research at NovaMechanics Cyprus]
-* [http://www.qspr.pe.kr/my/index.php?option=com_bookmarks&Itemid=28 Weblink-Cheminformatics SW and DB]
-* [http://chemoinformatician.co.uk Cheminformatics studies from Unilever Centre for Molecular Informatics to OpenEye]
 ==Notes==
-This article heavily reuses content from [http://en.wikipedia.org/wiki/Cheminformatics the Wikipedia article].
+This article reuses portions of content from [http://en.wikipedia.org/wiki/Cheminformatics the Wikipedia article].
 ==References==
