Journal:OpenChrom: A cross-platform open source software for the mass spectrometric analysis of chromatographic data

|vol_iss      = '''11'''
|pages        = 405
|doi          = [http://doi.org/10.1186/1471-2105-11-405 10.1186/1471-2105-11-405]
|issn        = 1471-2105
|license      = [http://creativecommons.org/licenses/by/2.0 Creative Commons Attribution 2.0 Generic]
|download    = [http://bmcbioinformatics.biomedcentral.com/track/pdf/10.1186/1471-2105-11-405?site=bmcbioinformatics.biomedcentral.com http://bmcbioinformatics.biomedcentral.com/track/pdf/10.1186/1471-2105-11-405] (PDF)
}}
==Abstract==
===Background===
Software has become an integral part of analysis techniques. Especially in the area of gas chromatography/mass spectrometry, automatic samplers enable high throughput analyses. Software assists handling large amounts of data generated by automated and fast operating analytical instruments. Modern computer systems are inexpensive, powerful and allow analysis techniques that could not have been applied in the past. Deconvolution, a chromatographic quality enhancing technique, demonstrates for instance that increasing processor power makes new analysis techniques applicable. The technique of deconvolution has been described by Biller and Biemann<ref name="BillerIdent77">{{cite journal |title=Identification of the components of complex mixtures by GC-MS |journal=Abstracts Of Papers Of The American Chemical Society |author=Biller, J.E.; Herlihy, W.C.; Biemann, K. |volume=173 |issue=MAR20 |pages=23–23 |year=1977 |url=http://pubs.acs.org/doi/abs/10.1021/bk-1977-0054.ch002}}</ref><ref name="BillerRecon74">{{cite journal |title=Reconstructed Mass Spectra, A Novel Approach for the Utilization of Gas Chromatograph—Mass Spectrometer Data |journal=Analytical Letters |author=Biller, J.E.; Biemann, K. |volume=7 |issue=7 |pages=515–528 |year=1974 |doi=10.1080/00032717408058783}}</ref>, Dromey et al.<ref name="FromeyExtract76">{{cite journal |title=Extraction of mass spectra free of background and neighboring component contributions from gas chromatography/mass spectrometry data |journal=Analytical Chemistry |author=Dromey, R.G.; Stefik, M.J.; Rindfleisch, T.C.; Duffield, A.M. |volume=48 |issue=9 |pages=1368–1375 |year=1976 |doi=10.1021/ac50003a027}}</ref>, Colby<ref name="ColbySpect92">{{cite journal |title=Spectral deconvolution for overlapping GC/MS components |journal=Journal of the American Society for Mass Spectrometry |author=Colby, B.N. 
|volume=3 |issue=5 |pages=558–562 |year=1992 |doi=10.1016/1044-0305(92)85033-G |pmid=24234499}}</ref>, Hindmarch et al.<ref name="HindmarchDecon96">{{cite journal |title=Deconvolution and spectral clean-up of two-component mixtures by factor analysis of gas chromatographic–mass spectrometric data |journal=Analyst |author=Hindmarch, P.; Demir, C.; Brereton, R.G. |volume=121 |issue=8 |pages=993-1001 |year=1996 |doi=10.1039/AN9962100993}}</ref>, Halket et al.<ref name="HalketDecon99">{{cite journal |title=Deconvolution gas chromatography/mass spectrometry of urinary organic acids – potential for pattern recognition and automated identification of metabolic disorders |journal=Rapid Communications In Mass Spectrometry |author=Halket, J.M.; Przyborowska, A.; Stein, S.E. et al. |volume=13 |issue=4 |pages=279–284 |year=1999 |doi=10.1002/(SICI)1097-0231(19990228)13:4<279::AID-RCM478>3.0.CO;2-I |pmid=10097403}}</ref>, Kong et al.<ref name="KongDecon05">{{cite journal |title=Deconvolution of overlapped peaks based on the exponentially modified Gaussian model in comprehensive two-dimensional gas chromatography |journal=Journal of Chromatography A |author=Kong, H.W.; Ye, F.; Lu, X.; Guo, L.; Tian, J.; Xu, G.W. |volume=1086 |issue=1–2 |pages=160–164 |year=2005 |doi=10.1016/j.chroma.2005.05.103 |pmid=16130668}}</ref>, Taylor et al.<ref name="TaylorTheDecon98">{{cite journal |title=The deconvolution of pyrolysis mass spectra using genetic programming: Application to the identification of some ''Eubacterium'' species |journal=FEMS Microbiology Letters |author=Taylor, J.; Goodacre, R.; Wade, W.G. |volume=160 |issue=2 |pages=237–246 |year=1998 |doi=10.1111/j.1574-6968.1998.tb12917.x |pmid=9532743}}</ref>, Pool et al.<ref name="PoolBack96">{{cite journal |title=Backfolding applied to differential gas chromatography/mass spectrometry as a mathematical enhancement of chromatographic resolution |journal=Journal Of Mass Spectrometry |author=Pool, W.G.; deLeeuw, J.W.; vandeGraaf, B. 
|volume=31 |issue=5 |pages=509–516 |year=1996 |doi=10.1002/(SICI)1096-9888(199605)31:5<509::AID-JMS323>3.0.CO;2-B}}</ref><ref name="PoolAuto97">{{cite journal |title=Automated extraction of pure mass spectra from gas chromatographic/mass spectrometric data |journal=Journal Of Mass Spectrometry |author=Pool, W.G.; deLeeuw, J.W.; vandeGraaf, B. |volume=32 |issue=4 |pages=438–443 |year=1997 |doi=10.1002/(SICI)1096-9888(199704)32:4<438::AID-JMS499>3.0.CO;2-N}}</ref> and Davies<ref name="DaviesTheNew98">{{cite journal |title=The new Automated Mass Spectrometry Deconvolution and Identification System (AMDIS) |journal=Spectrometry Europe |author=Davies, A. |volume=10 |issue=3 |pages=22–26 |year=1998}}</ref> in various ways. Stein<ref name="SteinAnInt99">{{cite journal |title=An integrated method for spectrum extraction and compound identification from gas chromatography/mass spectrometry data |journal=Journal Of the American Society for Mass Spectrometry |author=Stein, S.E. |volume=10 |issue=8 |pages=770–781 |year=1999 |doi=10.1016/S1044-0305(99)00047-1}}</ref> published an enhanced deconvolution algorithm that has been implemented in the software AMDIS (Automated Mass Spectral Deconvolution and Identification System).<ref name="AMDIS">{{cite web |url=http://chemdata.nist.gov/dokuwiki/doku.php?id=chemdata:amdis |title=AMDIS |publisher=The National Institute of Standards and Technology}}</ref> AMDIS is available free of charge from the National Institute of Standards and Technology (NIST). Windig et al.<ref name="WindigChemo07">{{cite journal |title=Chemometric analysis of complex hyphenated data: Improvements of the component detection algorithm |journal=Journal of Chromatography A |author=Windig, W.; Smith, W.F. 
|volume=1158 |issue=1–2 |pages=251–257 |year=2007 |doi=10.1016/j.chroma.2007.03.081 |pmid=17418223}}</ref><ref name="WindigANoise96">{{cite journal |title=A noise and background reduction method for component detection in liquid chromatography/mass spectrometry |journal=Analytical Chemistry |author=Windig, W.; Phalp, J.M.; Payne, A.W. |volume=68 |issue=20 |pages=3602–3606 |year=1996 |doi=10.1021/ac960435y}}</ref> described another approach to enhance chromatographic quality by a deconvolution method called CODA (Component Detection Algorithm). The commercially available software ACD/MS Manager<ref name="ACD/MS">{{cite web |url=http://www.acdlabs.com/ |title=ACD/Labs |publisher=Advanced Chemistry Development, Inc}}</ref> offers an implementation of this approach.


Increasing computational power enables new applications, but there is still a lack of interoperability. Instrument vendors, such as [[Vendor:Agilent Technologies, Inc.|Agilent Technologies]], [[Vendor:Shimadzu Corporation|Shimadzu]], [[Vendor:Thermo Scientific|Thermo Fisher Scientific]] and [[Vendor:Waters Corporation|Waters Corporation]] have created their own software and data format. Usually, the mass spectral data formats are binary and can only be accessed by the instrument vendors' proprietary software. Some commercial tools exist to convert the mass spectral data files into other formats, such as MASS Transit from PALISADE Corporation.<ref name="PALISADE">{{cite web |url=http://www.palisade.com/ |title=PALISADE |publisher=Palisade Corporation}}</ref> To avoid these limitations, some efforts have been made to design and implement interoperable data formats and software libraries as for example NetCDF<ref name="NetCDF">{{cite web |url=http://www.unidata.ucar.edu/software/netcdf/ |title=Network Common Data Form (NetCDF) |publisher=University Corporation for Atmospheric Research}}</ref> or mzXML.<ref name="PedrioliACommon04">{{cite journal |title=A common open representation of mass spectrometry data and its application to proteomics research |journal=Nature Biotechnology |author=Pedrioli, P.G.A.; Eng, J.K.; Hubley, R. |volume=22 |issue=11 |pages=1459–1466 |year=2004 |doi=10.1038/nbt1031 |pmid=15529173}}</ref><ref name="FalknerProt07">{{cite journal |title=ProteomeCommons.org IO Framework: Reading and writing multiple proteomics data formats |journal=Bioinformatics |author=Falkner, J.A.; Falkner, J.W.; Andrews, P.C. 
|volume=23 |issue=2 |pages=262–263 |year=2007 |doi=10.1093/bioinformatics/btl573 |pmid=17121776}}</ref> But even if the data files can be converted to other formats, drawbacks in data processing remain: each software package implements its own functions, has its own graphical user interface and is in most cases only commercially available, as for example ChemStation, Xcalibur or MassLynx. Hence, users are forced to become familiar with different software systems, user interfaces and methods. Moreover, the software tools primarily target specific operating systems, such as Microsoft Windows and Mac OS X. The number of software applications that are independent of the operating system and can also be run under Unix or Linux is limited. Linux systems are open source and available at no cost, and their usage is increasing in scientific research (see Scientific Linux<ref name="SciLin">{{cite web |url=https://en.wikipedia.org/wiki/Scientific_Linux |title=Scientific Linux |work=Wikipedia |publisher=Wikimedia Foundation, Inc}}</ref>), as well as in the public sector.<ref name="Wienux">{{cite web |url=https://en.wikipedia.org/wiki/Wienux |title=Wienux |work=Wikipedia |publisher=Wikimedia Foundation, Inc}}</ref><ref name="LiMux">{{cite web |url=http://www.muenchen.de/rathaus/Stadtverwaltung/Direktorium/LiMux.html |title=Das Projekt LiMux |publisher=Portal München Betriebs-GmbH & Co. KG}}</ref> Software applications such as AMDIS have been published to be used free of charge, but their source code is not available. Thus, it is not possible to evaluate the algorithms implemented in the software. This matters especially in scientific research, where the algorithms can be neither inspected nor extended. 
Even if algorithms are described in published papers<ref name="BillerRecon74" /><ref name="ColbySpect92" /><ref name="PoolBack96" /><ref name="SteinAnInt99" /><ref name="AlfassiOnThe04">{{cite journal |title=On the normalization of a mass spectrum for comparison of two spectra |journal=Journal of the American Society for Mass Spectrometry |author=Alfassi, Z.B. |volume=15 |issue=3 |pages=385-387 |year=2004 |doi=10.1016/j.jasms.2003.11.008 |pmid=14998540}}</ref>, it is often impossible to validate them manually due to the complexity of chromatographic data. Other applications like ChemStation, Xcalibur, and ACD/MS Manager are proprietary and closed source. They are only commercially available. There is no means of revealing the correctness of their utilized algorithms. Efforts have been made to solve the problems of missing interoperability and restricted access to source codes and algorithms.<ref name="SpjuthBio07">{{cite journal |title=Bioclipse: An open source workbench for chemo- and bioinformatics |journal=BMC Bioinformatics |author=Spjuth, O.; Helmus, T.; Willighagen, E.L. et al. |volume=8 |pages=59 |year=2007 |doi=10.1186/1471-2105-8-59 |pmid=17316423 |pmc=PMC1808478}}</ref> Bioclipse is a sophisticated project that is open source and is focused with its algorithms on metabolism analysis and gene sequencing. Its techniques are state-of-the-art. Some other projects are mMass<ref name="mMassArch">{{cite web |url=http://mmass.biographics.cz/ |archiveurl=https://web.archive.org/web/20090827071924/http://mmass.biographics.cz/ |title=mMass - Open Source Mass Spectrometry Tool |publisher=Martin Strohalm |archivedate=27 August 2009}}</ref>, COMSPARI<ref name="COMSPARI">{{cite web |url=http://www.biomechanic.org/comspari/ |title=The COMSPARI Homepage |publisher=J. Katz and J. 
Hau}}</ref> and fityk<ref name="fitykArch">{{cite web |url=http://www.unipress.waw.pl/fityk/ |archiveurl=https://web.archive.org/web/20100304192315/http://www.unipress.waw.pl/fityk |title=Fityk home |publisher=Institute of High Pressure Physics of the Polish Academy of Sciences |archivedate=04 March 2010}}</ref>, but they have some restrictions regarding their interoperability and extensibility. BioSunMS<ref name="CaoBio09">{{cite journal |title=BioSunMS: A plug-in-based software for the management of patients information and the analysis of peptide profiles from mass spectrometry |journal=BMC Medical Informatics and Decision Making |author=Cao, Y.; Wang, N.; Ying, X.M. et al. |volume=9 |pages=13 |year=2009 |doi=10.1186/1472-6947-9-13 |pmid=17316423 |pmc=PMC1808478}}</ref> is a tool to read TOF (time-of-flight) mass spectral data files, but it is not able to read instrument vendors' native data files. The Chemistry Development Kit (CDK)<ref name="SteinbeckTheChem03">{{cite journal |title=The Chemistry Development Kit (CDK): An open-source Java library for Chemo- and Bioinformatics |journal=Journal of Chemical Information and Computer Sciences |author=Steinbeck, C.; Han, Y.Q.; Kuhn, S. et al. |volume=43 |issue=2 |pages=493–500 |year=2003 |pmid=12653513}}</ref> implements convenient features to edit chemical data and structures, but it has no appropriate user interface. The open source tool OpenMS<ref name="SturmOpenMS08">{{cite journal |title=OpenMS – An open-source software framework for mass spectrometry |journal=BMC Bioinformatics |author=Sturm, M.; Bertsch, A.; Gropl, C. et al. |volume=9 |pages=163 |year=2008 |doi=10.1186/1471-2105-9-163 |pmid=18366760 |pmc=PMC2311306}}</ref> aims to edit mass spectrometric data, but it is not completely platform independent, as it is written in the C++ programming language.
 
Projects like Bioclipse, Sashimi<ref name="Sashimi">{{cite web |url=http://sourceforge.net/projects/sashimi/ |title=Sashimi |publisher=SourceForge}}</ref> or TPP (Trans-Proteomic Pipeline)<ref name="TPP">{{cite web |url=http://tools.proteomecenter.org/ |title=Seattle Proteome Center (SPC) - Proteomics Tools |publisher=Institute for System Biology}}</ref> are focused on the evaluation of metabolism products and gene sequencing and make extensive use of accurate mass resolution techniques. However, there is still a lack of software systems that can enhance nominal mass spectral data files, that are flexible and extensible, and that offer an easy-to-use graphical user interface. To the authors' knowledge, no application both imports vendor-specific chromatographic data files and allows chromatograms to be edited and analyzed in the way ChemStation and AMDIS do. No application combines flexibility in analysis with being easily extensible, open source, platform independent and equipped with a configurable graphical user interface.
 
===Implementation===
====Architecture====
OpenChrom is open source software that aims to remove the restrictions described above. It is based on the Eclipse Rich Client Platform (RCP)<ref name="RCP">{{cite web |url=http://wiki.eclipse.org/Rich_Client_Platform |title=Rich Client Platform |publisher=The Eclipse Foundation}}</ref>, an OSGi (Open Service Gateway Initiative) based application environment for building modular and flexible software systems. With the OSGi platform it is possible to extend the functionality of an application by dividing its components into different bundles. OpenChrom is written in Java, which is compiled to bytecode and executed by the Java Virtual Machine (JVM); this allows the software to run on several operating systems (Microsoft Windows, Mac OS X, Unix, Linux) and processor platforms (x86, PPC, AMD64, IA64, SPARC). It utilizes SWT (Standard Widget Toolkit) to render its graphical user interface by using the native resources of the underlying operating system. The Rich Client Platform is state-of-the-art in today's software development. Because of these concepts, the platform can be extended later and does not need to be full-fledged from the beginning; further methods and implementations can be developed separately. Nonetheless, some effort is still necessary to design a platform that covers all needs of a software application to edit, evaluate and modify chromatographic data. In contrast to Bioclipse, Sashimi or TPP, OpenChrom has a slightly different scope, as it is focused primarily on nominal mass resolution data. Mass spectrometers with nominal mass resolution, such as quadrupole or ion trap instruments, are inexpensive, but the data acquisition limits the range of possible applications. Software has the potential to enhance the quality of the recorded data despite these limitations. Hence, the Rich Client Platform and the Java programming language were chosen, as they offer excellent support for a highly extensible and abstract base framework. The OSGi based Rich Client Platform Equinox supports the definition of extension points. The use of different class paths makes it possible to execute code from separate bundles (Figure 1). New functionality, e.g. exporting a given chromatogram to a PDF file, can be implemented in a separate bundle that uses the extension point mechanism to import and export chromatographic data.
 
[[File:Fig1 Wenig BMCBioinformatics2010 11.jpg|400px]]
{{clear}}
{|
| STYLE="vertical-align:top;"|
{| border="0" cellpadding="5" cellspacing="0" width="400px"
|-
  | style="background-color:white; padding-left:10px; padding-right:10px;"| <blockquote>'''Figure 1. RCP/OSGi and OpenChrom architecture.''' The RCP/OSGi and OpenChrom architecture shows the supported processor platforms and operating systems.</blockquote>
|-
|}
|}
 
Tools in different areas have been implemented based on the Rich Client Platform, such as the Eclipse IDE (Integrated Development Environment), Lotus Notes, Bioclipse, BioSunMS, XMind, Apache Directory Studio and several more. It is part of the OpenChrom architecture to define useful extension points and to build a suitable object model.
 
====Object model====
OpenChrom provides an object model that defines chromatograms, scans, mass spectra, peaks and baselines. It is important to abstract the base model, as it reduces dependencies in code and allows for the implementation of further extensions. Therefore, the decision was made to support an enhanced chromatogram, mass spectrum and peak model, written in Java. No separate compilation is necessary for different operating systems. Furthermore, it is possible to cover special needs regarding the import of instrument vendors' binary chromatographic files. An excerpt of the OpenChrom object model is shown in a simplified UML (Unified Modeling Language) diagram (Figure 2). Java, as an object-oriented language, supports the four basic principles of object orientation: abstraction, encapsulation, polymorphism and inheritance.<ref name="HorstmannCore01">{{cite book |url=https://books.google.com/books?id=W6bomXWB-TYC |title=Core Java 2: Fundamentals |author=Horstmann, C.S.; Cornell, G. |publisher=Prentice Hall Professional |location=Upper Saddle River, NJ |year=2001 |pages=806 |isbn=9780130894687}}</ref> OpenChrom makes extensive use of the object-oriented concept. The interface "IChromatogram" and the abstract class "AbstractChromatogram" define and implement methods that are common to all types of chromatograms, independent of the instrument vendors' data format. Therefore, it is not necessary to implement them repeatedly in each vendor-specific chromatogram class. The base framework and extension points, such as peak detectors and integrators, work with instances of the type "IChromatogram" and do not need to take into account, for example, the differences between an Agilent and a NetCDF chromatogram. The object model for mass spectra and mass fragments, peaks and baselines is implemented in a similar way.
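
The relationship between the interface and the abstract class can be sketched as follows. Only the type names <code>IChromatogram</code> and <code>AbstractChromatogram</code> come from the text; all methods, the two vendor subclasses and the helper <code>meanSignal</code> are illustrative assumptions and do not reproduce the actual OpenChrom API.

```java
import java.util.ArrayList;
import java.util.List;

// Interface name taken from the article; the methods are assumptions.
interface IChromatogram {
    void addScan(double totalSignal);
    int getNumberOfScans();
    double getTotalSignal();
}

// Class name taken from the article; shared behaviour is implemented once here.
abstract class AbstractChromatogram implements IChromatogram {
    private final List<Double> scanSignals = new ArrayList<>();

    public void addScan(double totalSignal) { scanSignals.add(totalSignal); }

    public int getNumberOfScans() { return scanSignals.size(); }

    public double getTotalSignal() {
        double sum = 0.0;
        for (double signal : scanSignals) sum += signal;
        return sum;
    }

    // Only the vendor-specific part is left to subclasses (hypothetical method).
    abstract String getFormatName();
}

// Hypothetical vendor-specific classes that inherit the common methods.
class AgilentChromatogram extends AbstractChromatogram {
    String getFormatName() { return "Agilent"; }
}

class NetCdfChromatogram extends AbstractChromatogram {
    String getFormatName() { return "NetCDF"; }
}

class ObjectModelSketch {
    // Works with any IChromatogram, regardless of the vendor format behind it.
    static double meanSignal(IChromatogram chromatogram) {
        int n = chromatogram.getNumberOfScans();
        return n == 0 ? 0.0 : chromatogram.getTotalSignal() / n;
    }

    public static void main(String[] args) {
        IChromatogram agilent = new AgilentChromatogram();
        agilent.addScan(100.0);
        agilent.addScan(300.0);
        IChromatogram netCdf = new NetCdfChromatogram();
        netCdf.addScan(50.0);
        System.out.println(meanSignal(agilent));
        System.out.println(meanSignal(netCdf));
    }
}
```

Code such as <code>meanSignal</code> depends only on the interface, which is the property the base framework relies on when it passes chromatograms to detectors and integrators.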
 
[[File:Fig2 Wenig BMCBioinformatics2010 11.jpg|800px]]
{{clear}}
{|
| STYLE="vertical-align:top;"|
{| border="0" cellpadding="5" cellspacing="0" width="800px"
|-
  | style="background-color:white; padding-left:10px; padding-right:10px;"| <blockquote>'''Figure 2. OpenChrom chromatogram object model.''' The OpenChrom chromatogram object model shows a simplified UML diagram of the chromatographic model OpenChrom uses.</blockquote>
|-
|}
|}
 
====Extension points====
The OpenChrom framework offers several bundles (Table 1). The most important one defines the methods needed to implement specialized bundles that handle the import of chromatographic mass spectral data. It is possible to supply a bundle that is able to read the binary chromatogram files of a specific instrument vendor. The bundle takes care of how to read a given file or directory. Furthermore, the framework offers extension points to detect and integrate peaks. Peak detection and integration have been separated to make it possible to detect peaks with several peak detector methods and to integrate them with a specified integrator. This results in a more complex but also more flexible system. Another extension point allows bundles to be defined that are capable of detecting a baseline in the chromatogram model. A further flexible extension point, called filters, was also introduced. Bundles can extend the filter extension point to enhance the quality of the chromatographic data. They work in a way comparable to filters in image processing software. One filter extension can, for instance, offer a set of methods to eliminate background signals from the chromatogram. Another filter can implement a routine to mean-normalize the chromatogram. The filters offer editing steps that are especially useful before peak detection and integration routines.
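
The separation of peak detection from peak integration can be sketched with two independent interfaces. Everything in the example is hypothetical: the interfaces <code>PeakDetector</code> and <code>PeakIntegrator</code>, the threshold-based toy detector and the trapezoidal toy integrator are assumptions chosen for brevity and are not the algorithms shipped with OpenChrom.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical peak: an inclusive range of scan indices.
class Peak {
    final int start;
    final int stop;

    Peak(int start, int stop) {
        this.start = start;
        this.stop = stop;
    }
}

// Two separate contracts, so that any detector can be paired with any integrator.
interface PeakDetector {
    List<Peak> detect(double[] signals);
}

interface PeakIntegrator {
    double integrate(double[] signals, Peak peak);
}

// Toy detector: every contiguous run of signals above a fixed threshold is a peak.
class ThresholdDetector implements PeakDetector {
    private final double threshold;

    ThresholdDetector(double threshold) { this.threshold = threshold; }

    public List<Peak> detect(double[] signals) {
        List<Peak> peaks = new ArrayList<>();
        int start = -1;
        for (int i = 0; i < signals.length; i++) {
            boolean above = signals[i] > threshold;
            if (above && start < 0) start = i;
            if (!above && start >= 0) {
                peaks.add(new Peak(start, i - 1));
                start = -1;
            }
        }
        if (start >= 0) peaks.add(new Peak(start, signals.length - 1));
        return peaks;
    }
}

// Toy integrator: trapezoidal rule with unit spacing between scans.
class TrapezoidIntegrator implements PeakIntegrator {
    public double integrate(double[] signals, Peak peak) {
        double area = 0.0;
        for (int i = peak.start; i < peak.stop; i++) {
            area += (signals[i] + signals[i + 1]) / 2.0;
        }
        return area;
    }
}

class DetectIntegrateSketch {
    public static void main(String[] args) {
        double[] signals = {0, 1, 5, 9, 5, 1, 0, 0, 6, 8, 6, 0};
        PeakDetector detector = new ThresholdDetector(2.0);
        PeakIntegrator integrator = new TrapezoidIntegrator();
        for (Peak peak : detector.detect(signals)) {
            System.out.println(peak.start + "-" + peak.stop + ": " + integrator.integrate(signals, peak));
        }
    }
}
```

Because the integrator receives peaks rather than raw detector state, peaks found by different detectors can be integrated by the same integrator.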
 
{|
| STYLE="vertical-align:top;"|
{| class="wikitable" border="1" cellpadding="5" cellspacing="0" width="60%"
|-
  | style="background-color:white; padding-left:10px; padding-right:10px;" colspan="2"|'''Table 1. Some selected bundles of the OpenChrom software.''' The OpenChrom software offers several extension points. Extension points are declared in bundles. The table shows a selected overview of bundles and suppliers.
|-
  ! style="padding-left:10px; padding-right:10px;"|Bundle
  ! style="padding-left:10px; padding-right:10px;"|Description
|-
  | style="background-color:white; padding-left:10px; padding-right:10px;"|baseline.detector
  | style="background-color:white; padding-left:10px; padding-right:10px;"|Detect baselines
|-
  | style="background-color:white; padding-left:10px; padding-right:10px;"|comparison
  | style="background-color:white; padding-left:10px; padding-right:10px;"|Compare chromatograms and mass spectra
|-
  | style="background-color:white; padding-left:10px; padding-right:10px;"|converter
  | style="background-color:white; padding-left:10px; padding-right:10px;"|Converter to read binary/textual data files
|-
  | style="background-color:white; padding-left:10px; padding-right:10px;"|converter.supplier.agilent
  | style="background-color:white; padding-left:10px; padding-right:10px;"|Read Agilent data files
|-
  | style="background-color:white; padding-left:10px; padding-right:10px;"|converter.supplier.cdf
  | style="background-color:white; padding-left:10px; padding-right:10px;"|Read and write NetCDF data files
|-
  | style="background-color:white; padding-left:10px; padding-right:10px;"|filter
  | style="background-color:white; padding-left:10px; padding-right:10px;"|Modify chromatographic data
|-
  | style="background-color:white; padding-left:10px; padding-right:10px;"|identifier
  | style="background-color:white; padding-left:10px; padding-right:10px;"|Identify chromatograms, mass spectra and peaks
|-
  | style="background-color:white; padding-left:10px; padding-right:10px;"|integrator
  | style="background-color:white; padding-left:10px; padding-right:10px;"|Integrate peaks
|-
  | style="background-color:white; padding-left:10px; padding-right:10px;"|model
  | style="background-color:white; padding-left:10px; padding-right:10px;"|Models (chromatogram, mass spectrum, peak,...)
|-
  | style="background-color:white; padding-left:10px; padding-right:10px;"|peak.detector
  | style="background-color:white; padding-left:10px; padding-right:10px;"|Detect peaks
|-
  | style="background-color:white; padding-left:10px; padding-right:10px;"|logging
  | style="background-color:white; padding-left:10px; padding-right:10px;"|Logging facility
|-
  | style="background-color:white; padding-left:10px; padding-right:10px;"|rcp
  | style="background-color:white; padding-left:10px; padding-right:10px;"|Base application
|-
  | style="background-color:white; padding-left:10px; padding-right:10px;"|thirdpartylibraries.*
  | style="background-color:white; padding-left:10px; padding-right:10px;"|Third party libraries (SWTChart, log4j,...)
|-
|}
|}
 
====Graphical user interface====
The Rich Client Platform offers broad support for building an appropriate graphical user interface. Its concepts include editors, views, perspectives, wizards, menus, cheat sheets, settings and help pages. OpenChrom makes extensive use of these concepts. The editor shows the graphical representation of a chromatogram and several options, for example a page to select or exclude distinct mass fragments. It also supports functions to save, edit and analyze chromatograms. The views are used to show different aspects of the chromatographic model. It is possible to show peaks in different kinds of views. One view could show a peak including the background of the chromatogram. Another could show the peak with its increasing and decreasing tangents and its width at 50% height. A flexible mechanism was introduced to inform all views when the chromatogram selection has changed. The update functionality is also realized by an extension point. Views and editors are composed in a task-specific way using perspectives.
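
The notification of views about a changed chromatogram selection follows the familiar observer idea. The sketch below uses a plain listener list instead of an extension point, and all names (<code>SelectionListener</code>, <code>ChromatogramSelection</code>, <code>PeakListView</code>) are hypothetical; it is meant only to show the direction of the dependency, not how OpenChrom implements it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback that every interested view implements.
interface SelectionListener {
    void selectionChanged(int startScan, int stopScan);
}

// Hypothetical selection that informs all registered listeners about changes.
class ChromatogramSelection {
    private final List<SelectionListener> listeners = new ArrayList<>();
    private int startScan;
    private int stopScan;

    void addListener(SelectionListener listener) { listeners.add(listener); }

    void setRange(int startScan, int stopScan) {
        this.startScan = startScan;
        this.stopScan = stopScan;
        for (SelectionListener listener : listeners) {
            listener.selectionChanged(startScan, stopScan);
        }
    }

    int getStartScan() { return startScan; }

    int getStopScan() { return stopScan; }
}

// Stand-in for a view; it only remembers what it would display.
class PeakListView implements SelectionListener {
    String lastShown = "";

    public void selectionChanged(int startScan, int stopScan) {
        lastShown = "scans " + startScan + "-" + stopScan;
    }
}

class SelectionUpdateSketch {
    public static void main(String[] args) {
        ChromatogramSelection selection = new ChromatogramSelection();
        PeakListView view = new PeakListView();
        selection.addListener(view);
        selection.setRange(100, 250); // e.g. after a zoom action in the editor
        System.out.println(view.lastShown);
    }
}
```

The selection does not know which views exist, so new views can be added without touching the editor code.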
 
==Results and discussion==
The OpenChrom software offers several options to edit and evaluate chromatographic data. It currently implements native converters to import mass spectrometric chromatograms from Agilent Technologies and to import and export NetCDF and mzXML files, as well as a custom [[XML]] format to store the chromatographic data and additional information. The chromatogram file explorer (Figure 3) shows a representation of the local file system and marks those files and directories that contain importable chromatographic data. A chromatogram can be stored in a file, a directory or a set of files; the converter extension point and the import and export converters take care of the details. A chromatogram is opened by a double click on the file. Additionally, a preview of the selected chromatogram file is shown in a specialized view in the user interface. The chromatogram itself is shown in a multi-page editor that is divided into a chromatogram page and an options page. It is possible to save the chromatogram in several file formats. The NetCDF, mzXML and customized OpenChrom XML formats are currently supported. The time needed to import and save a chromatogram depends on its format and size: it takes more time to process XML-based formats like mzXML than binary formats like NetCDF or Agilent's data format. The graphical elements are drawn using SWTChart and SWT. A chromatogram selection can be chosen by applying a "zoom in" or "zoom out" action in the chromatogram editor. All views are updated after a zoom action.
 
[[File:Fig3 Wenig BMCBioinformatics2010 11.jpg|900px]]
{{clear}}
{|
| STYLE="vertical-align:top;"|
{| border="0" cellpadding="5" cellspacing="0" width="900px"
|-
  | style="background-color:white; padding-left:10px; padding-right:10px;"| <blockquote>'''Figure 3. OpenChrom software showing editors, views, menus and menu entries.''' The OpenChrom software uses the editors, views, menus and menu entries shown in the figure.</blockquote>
|-
|}
|}
 
The menu "Chromatogram Edit" provides access to functions that modify or evaluate the chromatographic data. For example, all registered bundles that support filters will be listed in the sub menu "Filter". It is possible to add a filter that implements a Savitzky-Golay<ref name="SavitzkySmooth64">{{cite journal |title=Smoothing and differentiation of data by simplified least squares procedures |journal=Analytical Chemistry |author=Savitzky, A.; Golay, M.J.E. |volume=36 |issue=8 |pages=1627–1639 |year=1964 |doi=10.1021/ac60214a047}}</ref> smoothing operation or to add filters that remove the background of the chromatogram. Each action will be performed on the active chromatogram selection. Depending on the implemented algorithms, actions are generally very fast because the chromatogram is kept in random access memory (RAM). Furthermore, the filter actions are reversible. This editing support is well known from modern IDEs and office suites. However, the support for do/undo and redo operations costs processing time. If reversibility is not needed, it can be deactivated in the application's preference dialog. Another extension point is responsible for registering baseline detectors. Different baseline detectors can be implemented in separate bundles and will be offered in the "Baseline Detectors" sub menu. Peak detection and integration are commonly done in one run. One improvement achieved through OpenChrom is the division of the detection and the integration of peaks into two separate actions. The peak detectors can be applied by calling an appropriate detector in the sub menu "Peak Detectors" and the peak integration can be performed by using a listed integrator from the sub menu "Integrators". The separation of detector and integrator methods makes it possible to detect peaks in a chromatogram using several algorithms and methods. 
The chosen peak detectors could be of different types, for example detectors using deconvolution techniques like AMDIS or CODA. All detected peaks can afterwards be integrated by a single integrator, which leads to comparable results. This feature offers high flexibility in using different kinds of detectors and integrators.
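
Reversible filter actions, and the cost they incur, can be illustrated with a snapshot-based undo/redo stack. This is a minimal sketch under stated assumptions: the class <code>EditableChromatogram</code>, the <code>reversible</code> switch and the constant background filter are hypothetical, and the article does not state that OpenChrom stores full snapshots.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.function.UnaryOperator;

// Hypothetical editable signal trace with snapshot-based undo and redo.
class EditableChromatogram {
    private double[] signals;
    private final Deque<double[]> undoStack = new ArrayDeque<>();
    private final Deque<double[]> redoStack = new ArrayDeque<>();
    private final boolean reversible;

    EditableChromatogram(double[] signals, boolean reversible) {
        this.signals = signals.clone();
        this.reversible = reversible;
    }

    void applyFilter(UnaryOperator<double[]> filter) {
        if (reversible) {
            // The extra copy is the price of reversibility.
            undoStack.push(signals.clone());
            redoStack.clear();
        }
        signals = filter.apply(signals.clone());
    }

    boolean undo() {
        if (undoStack.isEmpty()) return false;
        redoStack.push(signals);
        signals = undoStack.pop();
        return true;
    }

    boolean redo() {
        if (redoStack.isEmpty()) return false;
        undoStack.push(signals);
        signals = redoStack.pop();
        return true;
    }

    double[] getSignals() { return signals.clone(); }
}

class UndoRedoSketch {
    // Toy background filter: subtract a constant level and clip at zero.
    static double[] subtractBackground(double[] signals, double level) {
        double[] result = new double[signals.length];
        for (int i = 0; i < signals.length; i++) {
            result[i] = Math.max(0.0, signals[i] - level);
        }
        return result;
    }

    public static void main(String[] args) {
        EditableChromatogram chromatogram = new EditableChromatogram(new double[]{5.0, 3.0, 1.0}, true);
        chromatogram.applyFilter(s -> subtractBackground(s, 2.0));
        System.out.println(Arrays.toString(chromatogram.getSignals()));
        chromatogram.undo();
        System.out.println(Arrays.toString(chromatogram.getSignals()));
        chromatogram.redo();
        System.out.println(Arrays.toString(chromatogram.getSignals()));
    }
}
```

With <code>reversible</code> set to <code>false</code>, no snapshot is taken and <code>undo()</code> simply reports that nothing can be undone, which mirrors the trade-off between convenience and processing time described above.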
 
The view mechanism of the Eclipse Rich Client Platform makes it possible to show chromatographic data in different kinds of views. A peak can be displayed in multiple ways, for example by its area (Figure 4), its increasing and decreasing tangents and its width at 50% of peak height. Thus, the system provides additional graphical information, which is especially useful for educational purposes. Each view can be shown in a small (Figure 3) and an extended format (Figures 4 and 5), which allows appropriate user interaction even on small displays.
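
The width at 50% of peak height that such a view displays can be computed by interpolating the signal on both flanks of the peak. The sketch below is one straightforward way to do this and is not taken from OpenChrom; it assumes a baseline-corrected peak with a single apex, unit spacing between scans and a fraction between 0 and 1, and the method name <code>widthAtFraction</code> is hypothetical.

```java
class PeakWidthSketch {

    // Width of the peak at the given fraction of its apex height, in scan units.
    static double widthAtFraction(double[] signals, double fraction) {
        int apex = 0;
        for (int i = 1; i < signals.length; i++) {
            if (signals[i] > signals[apex]) apex = i;
        }
        double level = signals[apex] * fraction;
        double left = crossing(signals, apex, -1, level);
        double right = crossing(signals, apex, 1, level);
        return right - left;
    }

    // Walks from the apex in direction 'step' and interpolates where the signal reaches 'level'.
    private static double crossing(double[] signals, int apex, int step, double level) {
        int i = apex;
        while (i + step >= 0 && i + step < signals.length && signals[i + step] > level) {
            i += step;
        }
        int j = i + step;
        if (j < 0 || j >= signals.length) return i;   // peak is cut off at the edge
        if (signals[i] == signals[j]) return i;       // flat segment, nothing to interpolate
        double t = (signals[i] - level) / (signals[i] - signals[j]);
        return i + step * t;
    }

    public static void main(String[] args) {
        double[] triangle = {0, 2, 4, 2, 0};
        System.out.println(widthAtFraction(triangle, 0.5));
    }
}
```

Multiplying the result by the scan interval converts the width from scan units into retention time.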
 
[[File:Fig4 Wenig BMCBioinformatics2010 11.jpg|900px]]
{{clear}}
{|
| STYLE="vertical-align:top;"|
{| border="0" cellpadding="5" cellspacing="0" width="900px"
|-
  | style="background-color:white; padding-left:10px; padding-right:10px;"| <blockquote>'''Figure 4. Peak with increasing and decreasing tangents and its width at 50% height in extended format.''' The view shows a maximized version of a selected peak.</blockquote>
|-
|}
|}
 
[[File:Fig5 Wenig BMCBioinformatics2010 11.jpg|900px]]
{{clear}}
{|
| STYLE="vertical-align:top;"|
{| border="0" cellpadding="5" cellspacing="0" width="900px"
|-
  | style="background-color:white; padding-left:10px; padding-right:10px;"| <blockquote>'''Figure 5. Graphical representation of a mass spectrum in extended format.''' The view shows a maximized version of a selected mass spectrum.</blockquote>
|-
|}
|}
 
Further on, property views show miscellaneous values of the selected chromatogram. Due to the chromatogram object model, different values will be shown if different chromatogram files have been loaded. Chromatograms from Agilent Technologies and NetCDF differ in their information content. Hence, the properties view helps to inspect the files. There are additional extension points implemented that enable adding bundles to compare mass spectra using different methods<ref name="AlfassiOnThe04" /><ref name="McLaffertyComp98">{{cite journal |title=Comparison of algorithms and databases for matching unknown mass spectra |journal=Journal of the American Society for Mass Spectrometry |author=McLafferty, F.W.; Zhang, M.Y.; Stauffer, D.B.; Loh, S.Y. |volume=9 |issue=1 |pages=92-95 |year=1998 |doi=10.1016/S1044-0305(97)00235-3 |pmid=9679594}}</ref><ref name="LohExact91">{{cite journal |title=Exact mass probability based matching of high-resolution unknown mass spectra |journal=Analytical Chemistry |author=Loh, S.Y.; McLafferty, F.W. |volume=63 |issue=6 |pages=546–550 |year=1991 |doi=10.1021/ac00006a002}}</ref><ref name="DamenSiscom78">{{cite journal |title=Siscom — a new library search system for mass spectra |journal=Analytica Chimica Acta |author=Damen, H.; Henneberg, D.; Weimann, B. |volume=103 |issue=4 |pages=289-302 |year=1978 |doi=10.1016/S0003-2670(01)83095-6}}</ref><ref name="AlfassiVector05">{{cite journal |title=Vector analysis of multi-measurements identification |journal=Journal of Radioanalytical and Nuclear Chemistry |author=Alfassi, Z.B. |volume=266 |issue=2 |pages=245–250 |year=2005 |doi=10.1007/s10967-005-0899-y}}</ref> or to identify peaks or chromatograms. A method similar to the one implemented in the software F-Search<ref name="FrontLab">{{cite web |url=http://www.frontier-lab.com/ |title=Frontier Lab |publisher=Frontier Laboratories Ltd}}</ref> from Frontier Laboratories Ltd. could be used to identify chromatograms, for example.
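
As an illustration of what a mass spectrum comparison bundle might compute, the following sketch calculates the cosine of the angle between two nominal mass spectra treated as vectors. This is one widely used similarity measure and is shown as an assumption; the article does not specify which comparison methods OpenChrom implements, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class SpectrumComparisonSketch {

    // Spectra are given as nominal m/z -> abundance; returns a value between 0 and 1.
    static double cosineSimilarity(Map<Integer, Double> a, Map<Integer, Double> b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (Map.Entry<Integer, Double> entry : a.entrySet()) {
            double abundance = entry.getValue();
            normA += abundance * abundance;
            Double other = b.get(entry.getKey());
            if (other != null) dot += abundance * other;
        }
        for (double abundance : b.values()) {
            normB += abundance * abundance;
        }
        if (normA == 0.0 || normB == 0.0) return 0.0; // an empty spectrum matches nothing
        return dot / Math.sqrt(normA * normB);
    }

    public static void main(String[] args) {
        Map<Integer, Double> unknown = new HashMap<>();
        unknown.put(43, 100.0);
        unknown.put(58, 50.0);
        Map<Integer, Double> reference = new HashMap<>();
        reference.put(43, 50.0);
        reference.put(58, 25.0);
        // Same fragment pattern at half the intensity: the measure ignores overall scale.
        System.out.println(cosineSimilarity(unknown, reference));
    }
}
```

Because the measure is scale-independent, spectra recorded at different concentrations can be compared without prior normalization.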
 
Moreover, the OpenChrom platform provides bundles with a built-in logging mechanism that extends the Apache project log4j. Each module can use the logging mechanism, which makes it easier to detect problems and failures. Bundles are further separated into fragments, which allows the separation of concerns. Each OpenChrom bundle supports an internationalization (i18n) fragment and a JUnit test fragment. At the moment, approximately 3000 unit tests have been written and can be executed to ensure the quality of the software.
 
If necessary, the extension point mechanism gives the flexibility to add functions needed by users at any time. Thus, OpenChrom can be connected to other systems, for example to a LIMS ([[Laboratory information management system|Laboratory Information Management System]]), databases, existing software tools or workflow systems. The object model of OpenChrom offers convenient access to values and results from the edited chromatograms. Specialized modules take care of specific concerns, for example how to store results in an information management system. Furthermore, it is possible to implement bundles for specific analyses or for automated experimentation.
 
OpenChrom offers several ways to edit and analyze chromatographic data. Its flexibility and abstract architecture, although advantageous, make it somewhat difficult to get started with the platform, even though the functionality is spread over different bundles to decrease complexity and to focus on special tasks. The intention behind publishing the software under an open source license is to support code contributions and to open the project for individual solutions. Moreover, the separation into bundles makes it easier for others to contribute new functionality. Further improvements will be made to optimize the current algorithms and to develop new and better filters, peak detectors and integrators.
 
==Conclusions==
OpenChrom has been designed to become an extensible cross-platform open source software for the mass spectrometric analysis of chromatographic data. It provides extension points to enable built-in import capabilities for binary or textual instrument vendors' data formats. In addition to its custom XML format, it supports the Agilent Technologies, mzXML and NetCDF mass spectrometric data formats. Further development is planned to support more data formats. The open source concept has been chosen to encourage contributions from third parties, as extending the capabilities of the presented concept depends on the ideas and needs of the community. OpenChrom offers extension points that enable the implementation of different baseline detectors as well as peak detectors and integrators. Furthermore, there is an option to implement filters, which are used to increase the chromatographic quality. The framework offers full support of do/undo and redo operations. The examples Bioclipse and BioSunMS show how to use the Eclipse Rich Client Platform in a specific way, but no software has been published until now that is capable of importing binary chromatographic files natively, offers support to edit and analyze chromatograms, and makes it possible to implement new algorithms and methods. As it is open source, everybody has the possibility to inspect the implemented algorithms and methods, especially for verification. OpenChrom is a software with a special focus on the editing and evaluation of mass spectrometric chromatographic data. OpenChrom will hopefully be extended by contributing developers, scientists and companies in the future.
 
==Availability and requirements==
'''Project name''': OpenChrom
 
'''Project homepage''': [http://www.openchrom.net http://www.openchrom.net]
 
'''Operating systems''': Platform independent
 
'''Programming language''': Java
 
'''Java Runtime Environment''': Sun/Oracle JVM 1.6.0, OpenJDK
 
'''Minimum RAM''': 500 MB
 
'''Minimum Processor''': 1 GHz
 
'''Commercial restrictions''': None
 
OpenChrom is available for download free of charge from the project home page.
 
The Agilent data file input converter must be installed separately using the OpenChrom update mechanism. Instructions on how to install the converter can be found at the following website: https://marketplace.openchrom.net/.
 
OpenChrom is licensed under the Eclipse Public License 1.0 (EPL). The EPL is an OSI-approved open source license that ensures that the source code will remain open source. OpenChrom uses some third party libraries that are partly published under different open source licenses. All third party libraries are available in separate bundles to ensure that no license conflicts occur. The third party library bundles are published under the Apache, LGPL, AGPL and EPL licenses, depending on the bundle. The GPL licenses are viral, which means that derivative works must be published under the GPL license too. The EPL and the Eclipse Rich Client Platform enable different licensing for the bundles, as a bundle using methods of another bundle cannot be seen as a derivative work, since it only uses its interfaces.
 
==Declarations==
===Acknowledgements===
The authors thank all participants at the Department of Wood Science (University of Hamburg, Germany) for their support and their helpful suggestions.
 
===Authors’ original submitted files for images===
Below are the links to the authors’ original submitted files for images.
 
* [https://static-content.springer.com/esm/art%3A10.1186%2F1471-2105-11-405/MediaObjects/12859_2010_3862_MOESM1_ESM.png 12859_2010_3862_MOESM1_ESM.png] Authors’ original file for figure 1
* [https://static-content.springer.com/esm/art%3A10.1186%2F1471-2105-11-405/MediaObjects/12859_2010_3862_MOESM2_ESM.png 12859_2010_3862_MOESM2_ESM.png] Authors’ original file for figure 2
* [https://static-content.springer.com/esm/art%3A10.1186%2F1471-2105-11-405/MediaObjects/12859_2010_3862_MOESM3_ESM.png 12859_2010_3862_MOESM3_ESM.png] Authors’ original file for figure 3
* [https://static-content.springer.com/esm/art%3A10.1186%2F1471-2105-11-405/MediaObjects/12859_2010_3862_MOESM4_ESM.png 12859_2010_3862_MOESM4_ESM.png] Authors’ original file for figure 4
* [https://static-content.springer.com/esm/art%3A10.1186%2F1471-2105-11-405/MediaObjects/12859_2010_3862_MOESM5_ESM.png 12859_2010_3862_MOESM5_ESM.png] Authors’ original file for figure 5
 
===Authors' contributions===
PW designed and implemented the core API (Application Programming Interface), the software and its extension points. PW drafted most of the manuscript. JO gave feedback and corrected the manuscript. All authors performed extensive testing of the software and approved the final manuscript.


==References==


==Notes==
This presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. In the "Conclusion" section of the abstract, "software" was changed to "chromatography software" to encourage internal linking to the [[CDMS]] entry on the wiki. Two blank rows were removed from Table 1. A few URLS have changed since this was published in 2010, and they have been updated as needed.


<!--Place all category tags here-->
[[Category:LIMSwiki journal articles (added in 2016)‎]]
[[Category:LIMSwiki journal articles (all)‎]]
[[Category:LIMSwiki journal articles on bioinformatics‎‎]]
[[Category:LIMSwiki journal articles on chromatography]]
[[Category:LIMSwiki journal articles on software‎‎]]

Latest revision as of 18:58, 11 April 2024

Full article title OpenChrom: A cross-platform open source software for the mass spectrometric analysis of chromatographic data
Journal BMC Bioinformatics
Author(s) Wenig, Philip; Odermatt, Juergen
Author affiliation(s) University of Hamburg
Primary contact Email: philip.wenig@gmx.net
Year published 2010
Volume and issue 11
Page(s) 405
DOI 10.1186/1471-2105-11-405
ISSN 1471-2105
Distribution license Creative Commons Attribution 2.0 Generic
Website http://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-11-405
Download http://bmcbioinformatics.biomedcentral.com/track/pdf/10.1186/1471-2105-11-405 (PDF)

Abstract

Background

Today, data evaluation has become a bottleneck in chromatographic science. Analytical instruments equipped with automated samplers yield large amounts of measurement data, which needs to be verified and analyzed. Since nearly every GC/MS instrument vendor offers its own data format and software tools, the consequences are problems with data exchange and a lack of comparability between the analytical results. To challenge this situation a number of either commercial or non-profit software applications have been developed. These applications provide functionalities to import and analyze several data formats but have shortcomings in terms of the transparency of the implemented analytical algorithms and/or are restricted to a specific computer platform.

Results

This work describes a native approach to handle chromatographic data files. The approach can be extended in its functionality such as facilities to detect baselines, to detect, integrate and identify peaks and to compare mass spectra, as well as the ability to internationalize the application. Additionally, filters can be applied on the chromatographic data to enhance its quality, for example to remove background and noise. Extended operations like do, undo and redo are supported.

Conclusions

OpenChrom is a chromatography software application to edit and analyze mass spectrometric chromatographic data. It is extensible in many different ways, depending on the demands of the users or the analytical procedures and algorithms. It offers a customizable graphical user interface. The software is independent of the operating system, due to the fact that the Rich Client Platform is written in Java. OpenChrom is released under the Eclipse Public License 1.0 (EPL). There are no license constraints regarding extensions. They can be published using open source as well as proprietary licenses. OpenChrom is available free of charge at http://www.openchrom.net.

Background

Software has become an integral part of analysis techniques. Especially in the area of gas chromatography/mass spectrometry, automatic samplers enable high throughput analyses. Software assists handling large amounts of data generated by automated and fast operating analytical instruments. Modern computer systems are inexpensive, powerful and allow analysis techniques that could not have been applied in the past. Deconvolution, a chromatographic quality enhancing technique, demonstrates for instance that increasing processor power makes new analysis techniques applicable. The technique of deconvolution has been described by Biller and Biemann[1][2], Dromey et al.[3], Colby[4], Hindmarch et al.[5], Halket et al.[6], Kong et al.[7], Taylor et al.[8], Pool et al.[9][10] and Davies[11] in various ways. Stein[12] published an enhanced deconvolution algorithm that has been implemented in the software AMDIS (Automated Mass Spectral Deconvolution and Identification System).[13] AMDIS is available free of charge from the National Institute of Standards and Technology (NIST). Windig et al.[14][15] described another approach to enhance chromatographic quality by a deconvolution method called CODA (Component Detection Algorithm). The commercially available software ACD/MS Manager[16] offers an implementation of this approach.

Increasing computational power enables new applications, but there is still a lack of interoperability. Instrument vendors such as Agilent Technologies, Shimadzu, Thermo Fisher Scientific and Waters Corporation have created their own software and data formats. Usually, the mass spectral data formats are binary and can only be accessed by the instrument vendors' proprietary software. Some commercial tools exist to convert the mass spectral data files into other formats, such as MASS Transit from PALISADE Corporation.[17] To avoid these limitations, some efforts have been made to design and implement interoperable data formats and software libraries, for example NetCDF[18] or mzXML.[19][20] But even if it is possible to convert the data files to other formats, there are drawbacks in data processing, as each software package implements specific functions, has its own graphical user interface and is in most cases only commercially available, for example ChemStation, Xcalibur or MassLynx. Hence, users are forced to become familiar with different software systems, user interfaces and methods. Moreover, the software tools primarily target only specific operating systems, such as Microsoft Windows and Mac OSX. The number of software applications that are independent of the operating system and can also be run under Unix or Linux is limited. Linux systems are open source and available at no cost, and their usage is increasing in scientific research (see Scientific Linux[21]) as well as in the public sector.[22][23] Software applications such as AMDIS have been published to be used free of charge, but their source code is not available. Thus, it is not possible to evaluate the algorithms implemented in the software. Especially in the case of scientific research, it is not possible to examine and extend them. 
Even if algorithms are described in published papers[2][4][9][12][24], it is often impossible to validate them manually due to the complexity of chromatographic data. Other applications like ChemStation, Xcalibur, and ACD/MS Manager are proprietary and closed source. They are only commercially available. There is no means of revealing the correctness of their utilized algorithms. Efforts have been made to solve the problems of missing interoperability and restricted access to source codes and algorithms.[25] Bioclipse is a sophisticated project that is open source and is focused with its algorithms on metabolism analysis and gene sequencing. Its techniques are state-of-the-art. Some other projects are mMass[26], COMSPARI[27] and fityk[28], but they do have some restrictions regarding their interoperability and extensibility. BioSunMS[29] is a tool to read TOF (Time of Flight) mass spectral data files, but it is not able to read instrument vendors' native data files. The Chemistry Development Kit (CDK)[30] implements convenient features to edit chemical data and structures, but it has no appropriate user interface. The open source tool OpenMS[31] aims to edit mass spectrometric data, but it is not completely platform independent, as it is written in C++ programming language.

Projects like Bioclipse, Sashimi[32] or TPP (Trans-Proteomic Pipeline)[33] are focused on the evaluation of metabolism products and gene sequencing and make extensive use of accurate mass resolution techniques. But there is still a lack of software systems that are capable of enhancing nominal mass spectral data files, that are flexible and extensible, and that offer an easy-to-use graphical user interface. To the authors' knowledge, no application offers functions to import vendor systems' chromatographic data files and has the ability to edit and analyze chromatograms in the way ChemStation and AMDIS do. No application combines flexibility in analyses with easy extensibility, open source availability, platform independence and a configurable graphical user interface.

Implementation

Architecture

OpenChrom is an open source software that aims to overcome the aforementioned restrictions. It is based on the Eclipse Rich Client Platform (RCP)[34], which is an OSGi (Open Service Gateway Initiative) based application environment that allows building modular and flexible software systems. With the OSGi platform it is possible to extend the functionality of an application by dividing its components into different bundles. OpenChrom is written in Java, a language whose compiled bytecode is executed by the Java Virtual Machine (JVM), which allows the execution of the software on several operating systems (Microsoft Windows, Mac OS X, Unix, Linux) and processor platforms (x86, PPC, AMD64, IA64, SPARC). It utilizes SWT (Standard Widget Toolkit) to render its graphical user interface by using the native resources of the underlying operating system. The Rich Client Platform is state-of-the-art in today's software development. Due to the chosen concepts, the platform remains open to later extension, which means that it does not need to be full-fledged at the beginning; further methods and implementations can be developed separately. Nonetheless, some effort is still necessary to design a platform that covers all needs of a software application to edit, evaluate and modify chromatographic data. In contrast to Bioclipse, Sashimi or TPP, OpenChrom has a slightly different scope, as it is focused primarily on nominal mass resolution data. Mass spectrometers for nominal mass resolution, for example quadrupole or ion trap instruments, are inexpensive, but their data acquisition limits the range of possible applications. Software has the potential to enhance the quality of the recorded data despite these limitations. Hence, the Rich Client Platform and the Java programming language were chosen, as they offer excellent support for a highly extensible and abstract base framework.

Equinox, the OSGi implementation underlying the Rich Client Platform, supports the definition of extension points. The use of different class paths makes it possible to execute code from separate bundles (Figure 1). New functionality, e.g. to export a given chromatogram to a PDF file, can be implemented in a separate bundle making use of the extension point mechanism to import and export chromatographic data.
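
The idea of contributing functionality from a separate bundle can be illustrated with a minimal sketch in plain Java. It deliberately does not use the Equinox extension registry; the names `ChromatogramExportConverter`, `ConverterExtensionPoint` and `PdfExportConverter` are hypothetical and are not taken from the OpenChrom source. The registry stands in for an extension point: the core looks up contributions by identifier without compile-time knowledge of them.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

final class ExtensionPointSketch {

    // Hypothetical contract that an export converter bundle would fulfil.
    interface ChromatogramExportConverter {
        String fileExtension();
        String export(String chromatogramName);
    }

    // Stand-in for an extension point: contributions are registered by id
    // and looked up later by the core application.
    static final class ConverterExtensionPoint {
        private final Map<String, ChromatogramExportConverter> contributions = new LinkedHashMap<>();

        void register(String id, ChromatogramExportConverter converter) {
            contributions.put(id, converter);
        }

        Optional<ChromatogramExportConverter> find(String id) {
            return Optional.ofNullable(contributions.get(id));
        }
    }

    // A contribution that would live in its own bundle (assumption).
    static final class PdfExportConverter implements ChromatogramExportConverter {
        @Override public String fileExtension() { return ".pdf"; }

        // Returns only the target file name; real export logic is omitted.
        @Override public String export(String chromatogramName) {
            return chromatogramName + fileExtension();
        }
    }

    public static void main(String[] args) {
        ConverterExtensionPoint point = new ConverterExtensionPoint();
        point.register("pdf", new PdfExportConverter());
        point.find("pdf").ifPresent(c -> System.out.println(c.export("sample")));
    }
}
```

In a real Equinox application the registration would be declared in the bundle's metadata rather than in code; the sketch only shows the decoupling between the core and its contributions.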

Figure 1. RCP/OSGi and OpenChrom architecture. The RCP/OSGi and OpenChrom architecture shows the supported processor platforms and operating systems.

Tools in different areas have been implemented based on the Rich Client Platform, such as the Eclipse IDE (Integrated Development Environment), Lotus Notes, Bioclipse, BioSunMS, XMind, Apache Directory Studio and several more. It is part of the OpenChrom architecture to define useful extension points and to build a suitable object model.

Object model

OpenChrom provides a purpose-designed object model to define chromatograms, scans, mass spectra, peaks and baselines. It is important to abstract the base model, as this reduces dependencies in code and allows for the implementation of further extensions. Therefore, the decision was to support an enhanced chromatogram, mass spectrum and peak model, written in Java. No preliminary compilation is necessary on different operating systems. Furthermore, it is possible to cover special needs regarding the import of instrument vendors' binary chromatographic files. An excerpt of the OpenChrom object model is shown in a simplified UML (Unified Modeling Language) diagram (Figure 2). Java, as an object-oriented language, supports the four basic principles of object orientation: abstraction, encapsulation, polymorphism and inheritance.[35] OpenChrom makes extensive use of the object-oriented concept. The interface "IChromatogram" and the abstract class "AbstractChromatogram" define and implement methods that are common to all types of chromatograms, independent of the instrument vendors' data format. Therefore, it is not necessary to implement them repeatedly in each vendor-specific chromatogram class. The base framework and extension points, like peak detectors and integrators, work only with instances of the type "IChromatogram" instead of taking, for example, the differences between an Agilent and a NetCDF chromatogram into account. The object model for mass spectra and mass fragments, peaks and baselines is implemented in a similar way.
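
The relationship between the interface, the abstract class and vendor-specific classes can be sketched as follows. Only the names "IChromatogram" and "AbstractChromatogram" come from the text; the methods, the two subclasses and the `describe` helper are assumptions made for illustration and are much simpler than the real model.

```java
import java.util.ArrayList;
import java.util.List;

final class ObjectModelSketch {

    // Vendor-independent view of a chromatogram (method names are illustrative).
    interface IChromatogram {
        int getNumberOfScans();
        double getTotalSignal();
        String getVendor();
    }

    // Behaviour common to all chromatograms is implemented once here.
    abstract static class AbstractChromatogram implements IChromatogram {
        private final List<Double> scanSignals = new ArrayList<>();

        protected void addScan(double totalIonSignal) {
            scanSignals.add(totalIonSignal);
        }

        @Override public int getNumberOfScans() {
            return scanSignals.size();
        }

        @Override public double getTotalSignal() {
            double sum = 0.0;
            for (double s : scanSignals) {
                sum += s;
            }
            return sum;
        }
    }

    // Vendor-specific classes only add what differs between formats.
    static final class AgilentChromatogram extends AbstractChromatogram {
        AgilentChromatogram(double... signals) {
            for (double s : signals) addScan(s);
        }
        @Override public String getVendor() { return "Agilent"; }
    }

    static final class CdfChromatogram extends AbstractChromatogram {
        CdfChromatogram(double... signals) {
            for (double s : signals) addScan(s);
        }
        @Override public String getVendor() { return "NetCDF"; }
    }

    // Framework code depends on the interface only, never on a vendor class.
    static String describe(IChromatogram chromatogram) {
        return chromatogram.getVendor() + ": " + chromatogram.getNumberOfScans()
                + " scans, total signal " + chromatogram.getTotalSignal();
    }

    public static void main(String[] args) {
        System.out.println(describe(new AgilentChromatogram(10.0, 20.0, 30.0)));
        System.out.println(describe(new CdfChromatogram(5.0, 5.0)));
    }
}
```

Because `describe` accepts any `IChromatogram`, supporting a further vendor format would require one new subclass and no change to the framework code.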

Figure 2. OpenChrom chromatogram object model. The OpenChrom chromatogram object model shows a simplified UML diagram of the chromatographic model OpenChrom uses.

Extension points

The OpenChrom framework offers several bundles (Table 1). The most important one defines methods to implement specialized bundles that handle the import of chromatographic mass spectral data. It is possible to supply a bundle that is able to read the binary chromatogram files of a specific instrument vendor. The bundle takes care of how to read a given file or directory. Furthermore, the framework offers extension points to detect and integrate peaks. Peak detection and integration have been separated to make it possible to detect peaks with several peak detector methods and to integrate them with a specified integrator. This results in a more complex but also more flexible system. Another extension point allows defining bundles that are capable of detecting a baseline in the chromatogram model. A further flexible extension point, called filters, was introduced. Bundles can extend the filter extension point to enhance the quality of the chromatographic data. They work comparably to filters in image processing software. One filter extension can, for instance, offer a set of methods to eliminate background signals from the chromatogram. Another filter can implement a routine to mean-normalize the chromatogram. The filters offer editing steps that are especially useful before peak detection and integration routines.

Table 1. Some selected bundles of the OpenChrom software. The OpenChrom software offers several extension points. Extension points are declared in bundles. The table shows a selected overview of bundles and suppliers.
Bundle                        Description
baseline.detector             Detect baselines
comparison                    Compare chromatograms and mass spectra
converter                     Converter to read binary/textual data files
converter.supplier.agilent    Read Agilent data files
converter.supplier.cdf        Read and write NetCDF data files
filter                        Modify chromatographic data
identifier                    Identify chromatograms, mass spectra and peaks
integrator                    Integrate peaks
model                         Models (chromatogram, mass spectrum, peak,...)
peak.detector                 Detect peaks
logging                       Logging facility
rcp                           Base application
thirdpartylibraries.*         Third party libraries (SWTChart, log4j,...)
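
The filter extension point described before Table 1 can be illustrated with a small sketch. The interface `ChromatogramFilter` and both implementations are hypothetical. The sketch assumes that mean normalization means dividing every signal by the mean signal, and it uses subtraction of the minimum signal as a deliberately crude stand-in for background removal; neither is claimed to be the algorithm OpenChrom ships.

```java
final class FilterSketch {

    // Hypothetical filter contract: takes the signals, returns modified copies.
    interface ChromatogramFilter {
        double[] apply(double[] signals);
    }

    // Scales the signals so that their mean becomes 1 (assumed definition).
    static final class MeanNormalizer implements ChromatogramFilter {
        @Override public double[] apply(double[] signals) {
            if (signals.length == 0) {
                return new double[0];
            }
            double sum = 0.0;
            for (double s : signals) {
                sum += s;
            }
            double mean = sum / signals.length;
            if (mean == 0.0) {
                return signals.clone(); // nothing to scale
            }
            double[] result = new double[signals.length];
            for (int i = 0; i < signals.length; i++) {
                result[i] = signals[i] / mean;
            }
            return result;
        }
    }

    // Crude background removal: subtracts the smallest signal from all scans.
    static final class MinimumBackgroundRemover implements ChromatogramFilter {
        @Override public double[] apply(double[] signals) {
            if (signals.length == 0) {
                return new double[0];
            }
            double min = signals[0];
            for (double s : signals) {
                min = Math.min(min, s);
            }
            double[] result = new double[signals.length];
            for (int i = 0; i < signals.length; i++) {
                result[i] = signals[i] - min;
            }
            return result;
        }
    }

    public static void main(String[] args) {
        double[] raw = {3.0, 5.0, 4.0};
        // Filters can be chained before peak detection and integration.
        double[] cleaned = new MinimumBackgroundRemover().apply(raw);
        double[] normalized = new MeanNormalizer().apply(cleaned);
        System.out.println(java.util.Arrays.toString(normalized));
    }
}
```

Both filters return new arrays instead of modifying their input, which keeps the original data available, for example for an undo step.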

Graphical user interface

The Rich Client Platform offers broad support for presenting an appropriate graphical user interface. Its concepts include editors, views, perspectives, wizards, menus, cheat sheets, settings and help pages, and OpenChrom makes extensive use of them. The editor shows the graphical representation of a chromatogram and several options, for example a page to select or exclude distinct mass fragments. It also supports functions to save, edit and analyze chromatograms. The views are used to show different aspects of the chromatographic model. It is possible to show peaks in different kinds of views. One view could show a peak including the background of the chromatogram. Another could show the peak with its increasing and decreasing tangents and its width at 50% height. A flexible mechanism was introduced to inform all views when the chromatogram selection has changed. This update functionality is also realized by an extension point. Views and editors are composed in a task-specific way using perspectives.
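
The mechanism that informs all views about a changed chromatogram selection resembles the classic listener (observer) pattern. The following sketch shows that pattern under stated assumptions: every class and method name is hypothetical, and the actual OpenChrom mechanism is realized through an extension point rather than through direct registration as shown here.

```java
import java.util.ArrayList;
import java.util.List;

final class SelectionUpdateSketch {

    // Immutable description of the selected scan range (hypothetical).
    static final class ChromatogramSelection {
        final int startScan;
        final int stopScan;

        ChromatogramSelection(int startScan, int stopScan) {
            this.startScan = startScan;
            this.stopScan = stopScan;
        }
    }

    // Contract every interested view implements.
    interface SelectionListener {
        void selectionChanged(ChromatogramSelection selection);
    }

    // Keeps track of the listeners and informs all of them about a change.
    static final class SelectionNotifier {
        private final List<SelectionListener> listeners = new ArrayList<>();

        void addListener(SelectionListener listener) {
            listeners.add(listener);
        }

        void fireSelectionChanged(ChromatogramSelection selection) {
            for (SelectionListener listener : listeners) {
                listener.selectionChanged(selection);
            }
        }
    }

    // A view that remembers what it currently displays.
    static final class RangeView implements SelectionListener {
        private final String name;
        private String content = "empty";

        RangeView(String name) {
            this.name = name;
        }

        @Override public void selectionChanged(ChromatogramSelection selection) {
            content = name + " shows scans " + selection.startScan + "-" + selection.stopScan;
        }

        String content() {
            return content;
        }
    }

    public static void main(String[] args) {
        SelectionNotifier notifier = new SelectionNotifier();
        RangeView peakView = new RangeView("Peak view");
        RangeView spectrumView = new RangeView("Mass spectrum view");
        notifier.addListener(peakView);
        notifier.addListener(spectrumView);

        // A zoom action in the editor would trigger a call like this one.
        notifier.fireSelectionChanged(new ChromatogramSelection(100, 250));
        System.out.println(peakView.content());
        System.out.println(spectrumView.content());
    }
}
```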

Results and discussion

The OpenChrom software offers several options to edit and evaluate chromatographic data. It currently implements native converters to import mass spectrometric chromatograms from Agilent Technologies and to import and export NetCDF and mzXML files, as well as a custom XML format to store the chromatographic data and additional information. The chromatogram file explorer (Figure 3) shows a representation of the local file system and marks those files and directories that contain importable chromatographic data. A chromatogram can be stored in a file, a directory or a set of files; the converter extension point and the import and export converters take care of these differences. A chromatogram is opened by double-clicking the file. Additionally, a preview of the selected chromatogram file is shown in a specialized view in the user interface. The chromatogram itself is shown in a multi-page editor that is divided into a chromatogram page and an options page. It is possible to save the chromatogram in several file formats; NetCDF, mzXML and the custom OpenChrom XML format are currently supported. The time needed to import and to save a chromatogram depends on its format and size: it takes more time to process XML-based formats like mzXML than binary formats like NetCDF or Agilent's data format. The graphical elements are drawn using SWTChart and SWT. Chromatogram selections can be chosen by applying a "zoom in" or "zoom out" action in the chromatogram editor. All views are updated after a zoom action.

Figure 3. OpenChrom software showing editors, views, menus and menu entries. The OpenChrom software uses the editors, views, menus and menu entries shown in the figure.

The menu "Chromatogram Edit" gives access to functions that modify or evaluate the chromatographic data. For example, all registered bundles that support filters are listed in the sub menu "Filter". It is possible to add a filter that implements a Savitzky-Golay[36] smoothing operation or to add filters that remove the background of the chromatogram. Each action is performed on the active chromatogram selection. Because the chromatogram is kept in random access memory (RAM), actions are usually very fast, although their speed depends on the implemented algorithms. Furthermore, the filter actions are reversible. This editing support is well known from modern IDEs and office suites, but the support for undo and redo operations costs processing time. If reversibility is not needed, it can be deactivated in the application's preference dialog. Another extension point is responsible for registering baseline detectors. Different baseline detectors can be implemented in separate bundles and are offered in the "Baseline Detectors" sub menu. Peak detection and integration are commonly done in one run. One improvement achieved through OpenChrom is the division of the detection and the integration of peaks into two separate actions. A peak detector can be applied by calling it from the sub menu "Peak Detectors", and the peak integration can be performed by using a listed integrator from the sub menu "Integrators". This separation makes it possible to detect peaks in a chromatogram using several algorithms and methods. The chosen peak detectors could be of different types, for example detectors using deconvolution techniques like AMDIS or CODA. All detected peaks can afterwards be integrated by a single integrator, which leads to comparable results. This feature offers high flexibility in using different kinds of detectors and integrators.
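
The separation of peak detection and integration can be sketched with two interchangeable detectors and one shared integrator. All names are hypothetical. The detectors are intentionally naive (a fixed threshold and a strict local maximum) and are not the deconvolution techniques mentioned above; the integrator applies the trapezoidal rule and assumes a unit spacing between scans.

```java
import java.util.ArrayList;
import java.util.List;

final class DetectIntegrateSketch {

    // A detected peak, described only by its inclusive scan index range.
    static final class PeakRange {
        final int start;
        final int stop;

        PeakRange(int start, int stop) {
            this.start = start;
            this.stop = stop;
        }
    }

    interface PeakDetector {
        List<PeakRange> detect(double[] signal);
    }

    interface PeakIntegrator {
        double integrate(double[] signal, PeakRange peak);
    }

    // Detector 1: contiguous runs of scans above a fixed threshold.
    static final class ThresholdDetector implements PeakDetector {
        private final double threshold;

        ThresholdDetector(double threshold) {
            this.threshold = threshold;
        }

        @Override public List<PeakRange> detect(double[] signal) {
            List<PeakRange> peaks = new ArrayList<>();
            int start = -1;
            for (int i = 0; i < signal.length; i++) {
                boolean above = signal[i] > threshold;
                if (above && start < 0) {
                    start = i;
                } else if (!above && start >= 0) {
                    peaks.add(new PeakRange(start, i - 1));
                    start = -1;
                }
            }
            if (start >= 0) {
                peaks.add(new PeakRange(start, signal.length - 1));
            }
            return peaks;
        }
    }

    // Detector 2: strict local maxima plus one neighbouring scan on each side.
    static final class LocalMaximumDetector implements PeakDetector {
        @Override public List<PeakRange> detect(double[] signal) {
            List<PeakRange> peaks = new ArrayList<>();
            for (int i = 1; i < signal.length - 1; i++) {
                if (signal[i] > signal[i - 1] && signal[i] > signal[i + 1]) {
                    peaks.add(new PeakRange(i - 1, i + 1));
                }
            }
            return peaks;
        }
    }

    // One integrator for all detectors: trapezoidal rule, unit scan spacing.
    static final class TrapezoidIntegrator implements PeakIntegrator {
        @Override public double integrate(double[] signal, PeakRange peak) {
            double area = 0.0;
            for (int i = peak.start; i < peak.stop; i++) {
                area += (signal[i] + signal[i + 1]) / 2.0;
            }
            return area;
        }
    }

    public static void main(String[] args) {
        double[] signal = {0, 0, 1, 4, 1, 0, 0, 2, 6, 2, 0};
        PeakIntegrator integrator = new TrapezoidIntegrator();
        PeakDetector[] detectors = {new ThresholdDetector(0.5), new LocalMaximumDetector()};
        for (PeakDetector detector : detectors) {
            for (PeakRange peak : detector.detect(signal)) {
                System.out.println(detector.getClass().getSimpleName() + ": scans "
                        + peak.start + "-" + peak.stop
                        + ", area " + integrator.integrate(signal, peak));
            }
        }
    }
}
```

Since every detector returns the same `PeakRange` type, the areas produced by the single integrator remain comparable regardless of which detector found the peak.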

The view mechanism of the Eclipse Rich Client Platform makes it possible to show chromatographic data in different kinds of views. A peak can be displayed in multiple ways, for example by its area (Figure 4), its increasing and decreasing tangents and its width at 50% of peak height. Thus, the system provides additional graphical information, which is especially useful for educational purposes. Each view can be shown in a small (Figure 3) and an extended format (Figures 4 and 5), which allows appropriate user interaction even on small displays.
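
As a worked example of one of the displayed peak properties, the width at 50% of peak height can be estimated by linear interpolation between neighbouring scans. The sketch below is an assumption about how such a value could be computed, not a description of OpenChrom's implementation. It assumes a baseline at zero, a single peak in the array and a unit spacing between scans, so the result is expressed in scans.

```java
final class PeakWidthSketch {

    // Returns the peak width at half of the apex height, in scan units.
    static double widthAtHalfHeight(double[] signal) {
        if (signal.length == 0) {
            return 0.0;
        }
        // Locate the apex (first occurrence of the highest signal).
        int apex = 0;
        for (int i = 1; i < signal.length; i++) {
            if (signal[i] > signal[apex]) {
                apex = i;
            }
        }
        if (signal[apex] <= 0.0) {
            return 0.0; // no peak above the assumed zero baseline
        }
        double half = signal[apex] / 2.0;

        // Walk left until the signal drops to or below half height.
        int l = apex;
        while (l > 0 && signal[l] > half) {
            l--;
        }
        // Interpolate the crossing, or clamp to the array border.
        double left = signal[l] > half
                ? l
                : l + (half - signal[l]) / (signal[l + 1] - signal[l]);

        // Walk right in the same way.
        int r = apex;
        while (r < signal.length - 1 && signal[r] > half) {
            r++;
        }
        double right = signal[r] > half
                ? r
                : r - (half - signal[r]) / (signal[r - 1] - signal[r]);

        return right - left;
    }

    public static void main(String[] args) {
        // Apex 4 at scan 2, half height 2 is reached exactly at scans 1 and 3.
        System.out.println(widthAtHalfHeight(new double[] {0, 2, 4, 2, 0}));
    }
}
```

For the symmetric input in `main`, half height is reached exactly at scans 1 and 3, so the width is two scans; multiplying by the scan interval would convert it to a retention time width.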

Figure 4. Peak with increasing and decreasing tangents and its width at 50% height in extended format. The view shows a maximized version of a selected peak.

Figure 5. Graphical representation of a mass spectrum in extended format. The view shows a maximized version of a selected mass spectrum.

Furthermore, property views show miscellaneous values of the selected chromatogram. Due to the chromatogram object model, different values are shown depending on which kind of chromatogram file has been loaded. Chromatograms from Agilent Technologies and NetCDF differ in their information content. Hence, the properties view helps to inspect the files. Additional extension points make it possible to add bundles that compare mass spectra using different methods[24][37][38][39][40] or that identify peaks or chromatograms. A method similar to the one implemented in the software F-Search[41] from Frontier Laboratories Ltd. could be used to identify chromatograms, for example.
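
To make the idea of a mass spectrum comparison bundle more concrete, the following sketch compares two nominal mass spectra with the cosine similarity, a widely used generic measure. It is shown only as an example of what such a bundle could compute and is not claimed to be one of the cited methods or OpenChrom's implementation. Spectra are represented as maps from nominal m/z to abundance, which is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

final class SpectrumComparisonSketch {

    // Cosine of the angle between two spectra treated as abundance vectors.
    // Returns a value between 0 (no common ions) and 1 (same relative pattern)
    // for non-negative abundances.
    static double cosineSimilarity(Map<Integer, Double> a, Map<Integer, Double> b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (Map.Entry<Integer, Double> entry : a.entrySet()) {
            double abundance = entry.getValue();
            normA += abundance * abundance;
            Double other = b.get(entry.getKey());
            if (other != null) {
                dot += abundance * other;
            }
        }
        for (double abundance : b.values()) {
            normB += abundance * abundance;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // an empty spectrum matches nothing
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        Map<Integer, Double> unknown = new HashMap<>();
        unknown.put(43, 100.0);
        unknown.put(57, 50.0);

        Map<Integer, Double> reference = new HashMap<>();
        reference.put(43, 200.0);
        reference.put(57, 100.0);

        // Same relative pattern at a different absolute intensity.
        System.out.println(cosineSimilarity(unknown, reference));
    }
}
```

The measure depends only on relative abundances, so two spectra recorded at different concentrations but with the same fragmentation pattern yield a value close to 1.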

Moreover, the OpenChrom platform provides bundles with a built-in logging mechanism that is based on the Apache project log4j. Each module can use the logging mechanism, which makes it easier to detect problems and failures. Bundles are further separated into fragments, which allows a separation of concerns. Each OpenChrom bundle supports an internationalization (i18n) fragment and a JUnit test fragment. At the moment, approximately 3000 unit tests are written and can be executed to ensure the quality of the software.

If necessary, the extension point mechanism gives the flexibility to add functions needed by users at any time. Thus, OpenChrom can be connected to other systems, for example to a LIMS (Laboratory Information Management System), databases, existing software tools or workflow systems. The object model of OpenChrom offers convenient access to values and results from the edited chromatograms. Specialized modules take care of specific concerns, for example how to store results in an information management system. Furthermore, it is possible to implement bundles for specific analyses or for automated experimentation.

OpenChrom enables several ways to edit and analyze chromatographic data. Its flexibility and abstract architecture are an advantage, but they also make it somewhat difficult to get started with the platform, even though the functionality is distributed over different bundles to decrease complexity and to focus on special tasks. The intention of publishing the software under an open source license is to support code contributions and to open the project for individual solutions. Moreover, the separation into bundles makes it easier for others to contribute new functionality. Further improvements will be made to optimize the current algorithms and to develop new and better filters, peak detectors and integrators.

Conclusions

OpenChrom has been designed to become an extensible cross-platform open source software for the mass spectrometric analysis of chromatographic data. It provides extension points to enable built-in import capabilities for binary or textual instrument vendors' data formats. In addition to its custom XML format, it supports the Agilent Technologies, mzXML and NetCDF mass spectrometric data formats. Further development is planned to support more data formats. The open source concept has been chosen to encourage contributions from third parties, as extending the capabilities of the presented concept depends on the ideas and needs of the community. OpenChrom offers extension points that enable the implementation of different baseline detectors as well as peak detectors and integrators. Furthermore, there is an option to implement filters, which are used to increase the chromatographic quality. The framework offers full support for undo and redo operations. The examples Bioclipse and BioSunMS show how to use the Eclipse Rich Client Platform in a specific way, but no software has been published until now that is capable of importing binary chromatographic files natively, offers support to edit and analyze chromatograms, and makes it possible to implement new algorithms and methods. As it is open source, everybody has the possibility to inspect the implemented algorithms and methods, especially for verification. OpenChrom is a software with a special focus on the editing and evaluation of mass spectrometric chromatographic data. It is hoped that OpenChrom will be extended by contributing developers, scientists and companies in the future.

Availability and requirements

Project name: OpenChrom

Project homepage: http://www.openchrom.net

Operating systems: Platform independent

Programming language: Java

Java Runtime Environment: Sun/Oracle JVM 1.6.0, OpenJDK

Minimum RAM: 500 MB

Minimum Processor: 1 GHz

Commercial restrictions: None

OpenChrom is available for download free of charge from the project home page.

The Agilent data file input converter must be installed separately using the OpenChrom update mechanism. Instructions on how to install the converter can be found at the following website: https://marketplace.openchrom.net/.

OpenChrom is licensed under the Eclipse Public License 1.0 (EPL). The EPL is an OSI approved open source license that ensures that the source code will remain open source. OpenChrom uses some third party libraries that are partly published under different open source licenses. All third party libraries are available in separate bundles to ensure that no license conflicts occur. The third party library bundles are published under the Apache, LGPL, AGPL and EPL licenses, depending on the bundle. The GPL licenses are viral, meaning that derivative works must be published under the GPL license too. The EPL and the Eclipse Rich Client Platform enable different licensing for the bundles, because a bundle that uses methods of another bundle cannot be seen as a derivative work when it only uses that bundle's interfaces.

Declarations

Acknowledgements

The authors thank all participants at the Department of Wood Science (University of Hamburg, Germany) for their support and their helpful suggestions.

Authors' contributions

PW designed and implemented the core API (Application Programming Interface), the software and its extension points. PW drafted most of the manuscript. JO gave feedback and corrected the manuscript. All authors performed extensive testing of the software and approved the final manuscript.

References

  1. Biller, J.E.; Herlihy, W.C.; Biemann, K. (1977). "Identification of the components of complex mixtures by GC-MS". Abstracts Of Papers Of The American Chemical Society 173 (MAR20): 23–23. http://pubs.acs.org/doi/abs/10.1021/bk-1977-0054.ch002. 
  2. Biller, J.E.; Biemann, K. (1974). "Reconstructed Mass Spectra, A Novel Approach for the Utilization of Gas Chromatograph—Mass Spectrometer Data". Analytical Letters 7 (7): 515–528. doi:10.1080/00032717408058783. 
  3. Dromey, R.G.; Stefik, M.J.; Rindfleisch, T.C.; Duffield, A.M. (1976). "Extraction of mass spectra free of background and neighboring component contributions from gas chromatography/mass spectrometry data". Analytical Chemistry 48 (9): 1368–1375. doi:10.1021/ac50003a027. 
  4. Colby, B.N. (1992). "Spectral deconvolution for overlapping GC/MS components". Journal of the American Society for Mass Spectrometry 3 (5): 558–562. doi:10.1016/1044-0305(92)85033-G. PMID 24234499. 
  5. Hindmarch, P.; Demir, C.; Brereton, R.G. (1996). "Deconvolution and spectral clean-up of two-component mixtures by factor analysis of gas chromatographic–mass spectrometric data". Analyst 121 (8): 993–1001. doi:10.1039/AN9962100993. 
  6. Halket, J.M.; Przyborowska, A.; Stein, S.E. et al. (1999). "Deconvolution gas chromatography/mass spectrometry of urinary organic acids – potential for pattern recognition and automated identification of metabolic disorders". Rapid Communications In Mass Spectrometry 13 (4): 279–284. doi:10.1002/(SICI)1097-0231(19990228)13:4<279::AID-RCM478>3.0.CO;2-I. PMID 10097403. 
  7. Kong, H.W.; Ye, F.; Lu, X.; Guo, L.; Tian, J.; Xu, G.W. (2005). "Deconvolution of overlapped peaks based on the exponentially modified Gaussian model in comprehensive two-dimensional gas chromatography". Journal of Chromatography A 1086 (1–2): 160–164. doi:10.1016/j.chroma.2005.05.103. PMID 16130668. 
  8. Taylor, J.; Goodacre, R.; Wade, W.G. (1998). "The deconvolution of pyrolysis mass spectra using genetic programming: Application to the identification of some Eubacterium species". FEMS Microbiology Letters 160 (2): 237–246. doi:10.1111/j.1574-6968.1998.tb12917.x. PMID 9532743. 
  9. Pool, W.G.; deLeeuw, J.W.; vandeGraaf, B. (1996). "Backfolding applied to differential gas chromatography/mass spectrometry as a mathematical enhancement of chromatographic resolution". Journal Of Mass Spectrometry 31 (5): 509–516. doi:10.1002/(SICI)1096-9888(199605)31:5<509::AID-JMS323>3.0.CO;2-B. 
  10. Pool, W.G.; deLeeuw, J.W.; vandeGraaf, B. (1997). "Automated extraction of pure mass spectra from gas chromatographic/mass spectrometric data". Journal Of Mass Spectrometry 32 (4): 438–443. doi:10.1002/(SICI)1096-9888(199704)32:4<438::AID-JMS499>3.0.CO;2-N. 
  11. Davies, A. (1998). "The new Automated Mass Spectrometry Deconvolution and Identification System (AMDIS)". Spectroscopy Europe 10 (3): 22–26. 
  12. Stein, S.E. (1999). "An integrated method for spectrum extraction and compound identification from gas chromatography/mass spectrometry data". Journal Of the American Society for Mass Spectrometry 10 (8): 770–781. doi:10.1016/S1044-0305(99)00047-1. 
  13. "AMDIS". The National Institute of Standards and Technology. http://chemdata.nist.gov/dokuwiki/doku.php?id=chemdata:amdis. 
  14. Windig, W.; Smith, W.F. (2007). "Chemometric analysis of complex hyphenated data: Improvements of the component detection algorithm". Journal of Chromatography A 1158 (1–2): 251–257. doi:10.1016/j.chroma.2007.03.081. PMID 17418223. 
  15. Windig, W.; Phalp, J.M.; Payne, A.W. (1996). "A noise and background reduction method for component detection in liquid chromatography/mass spectrometry". Analytical Chemistry 68 (20): 3602–3606. doi:10.1021/ac960435y. 
  16. "ACD/Labs". Advanced Chemistry Development, Inc. http://www.acdlabs.com/. 
  17. "PALISADE". Palisade Corporation. http://www.palisade.com/. 
  18. "Network Common Data Form (NetCDF)". University Corporation for Atmospheric Research. http://www.unidata.ucar.edu/software/netcdf/. 
  19. Pedrioli, P.G.A.; Eng, J.K.; Hubley, R. (2004). "A common open representation of mass spectrometry data and its application to proteomics research". Nature Biotechnology 22 (11): 1459–1466. doi:10.1038/nbt1031. PMID 15529173. 
  20. Falkner, J.A.; Falkner, J.W.; Andrews, P.C. (2007). "ProteomeCommons.org IO Framework: Reading and writing multiple proteomics data formats". Bioinformatics 23 (2): 262–263. doi:10.1093/bioinformatics/btl573. PMID 17121776. 
  21. "Scientific Linux". Wikipedia. Wikimedia Foundation, Inc. https://en.wikipedia.org/wiki/Scientific_Linux. 
  22. "Wienux". Wikipedia. Wikimedia Foundation, Inc. https://en.wikipedia.org/wiki/Wienux. 
  23. "Das Projekt LiMux". Portal München Betriebs-GmbH & Co. KG. http://www.muenchen.de/rathaus/Stadtverwaltung/Direktorium/LiMux.html. 
  24. Alfassi, Z.B. (2004). "On the normalization of a mass spectrum for comparison of two spectra". Journal of the American Society for Mass Spectrometry 15 (3): 385–387. doi:10.1016/j.jasms.2003.11.008. PMID 14998540. 
  25. Spjuth, O.; Helmus, T.; Willighagen, E.L. et al. (2007). "Bioclipse: An open source workbench for chemo- and bioinformatics". BMC Bioinformatics 8: 59. doi:10.1186/1471-2105-8-59. PMC 1808478. PMID 17316423. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1808478. 
  26. "mMass - Open Source Mass Spectrometry Tool". Martin Strohalm. Archived from the original on 27 August 2009. https://web.archive.org/web/20090827071924/http://mmass.biographics.cz/. 
  27. "The COMSPARI Homepage". J. Katz and J. Hau. http://www.biomechanic.org/comspari/. 
  28. "Fityk home". Institute of High Pressure Physics of the Polish Academy of Sciences. Archived from the original on 04 March 2010. https://web.archive.org/web/20100304192315/http://www.unipress.waw.pl/fityk. 
  29. Cao, Y.; Wang, N.; Ying, X.M. et al. (2009). "BioSunMS: A plug-in-based software for the management of patients information and the analysis of peptide profiles from mass spectrometry". BMC Medical Informatics and Decision Making 9: 13. doi:10.1186/1472-6947-9-13. 
  30. Steinbeck, C.; Han, Y.Q.; Kuhn, S. et al. (2003). "The Chemistry Development Kit (CDK): An open-source Java library for Chemo- and Bioinformatics". Journal of Chemical Information and Computer Sciences 43 (2): 493–500. PMID 12653513. 
  31. Sturm, M.; Bertsch, A.; Gropl, C. et al. (2008). "OpenMS – An open-source software framework for mass spectrometry". BMC Bioinformatics 9: 163. doi:10.1186/1471-2105-9-163. PMC 2311306. PMID 18366760. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2311306. 
  32. "Sashimi". SourceForge. http://sourceforge.net/projects/sashimi/. 
  33. "Seattle Proteome Center (SPC) - Proteomics Tools". Institute for Systems Biology. http://tools.proteomecenter.org/. 
  34. "Rich Client Platform". The Eclipse Foundation. http://wiki.eclipse.org/Rich_Client_Platform. 
  35. Horstmann, C.S.; Cornell, G. (2001). Core Java 2: Fundamentals. Upper Saddle River, NJ: Prentice Hall Professional. pp. 806. ISBN 9780130894687. https://books.google.com/books?id=W6bomXWB-TYC. 
  36. Savitzky, A.; Golay, M.J.E. (1964). "Smoothing and differentiation of data by simplified least squares procedures". Analytical Chemistry 36 (8): 1627–1639. doi:10.1021/ac60214a047. 
  37. McLafferty, F.W.; Zhang, M.Y.; Stauffer, D.B.; Loh, S.Y. (1998). "Comparison of algorithms and databases for matching unknown mass spectra". Journal of the American Society for Mass Spectrometry 9 (1): 92–95. doi:10.1016/S1044-0305(97)00235-3. PMID 9679594. 
  38. Loh, S.Y.; McLafferty, F.W. (1991). "Exact mass probability based matching of high-resolution unknown mass spectra". Analytical Chemistry 63 (6): 546–550. doi:10.1021/ac00006a002. 
  39. Damen, H.; Henneberg, D.; Weimann, B. (1978). "Siscom — a new library search system for mass spectra". Analytica Chimica Acta 103 (4): 289–302. doi:10.1016/S0003-2670(01)83095-6. 
  40. Alfassi, Z.B. (2005). "Vector analysis of multi-measurements identification". Journal of Radioanalytical and Nuclear Chemistry 266 (2): 245–250. doi:10.1007/s10967-005-0899-y. 
  41. "Frontier Lab". Frontier Laboratories Ltd. http://www.frontier-lab.com/. 

Notes

This presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. In the "Conclusion" section of the abstract, "software" was changed to "chromatography software" to encourage internal linking to the CDMS entry on the wiki. Two blank rows were removed from Table 1. A few URLs have changed since this was published in 2010, and they have been updated as needed.