Journal:Geochemical biodegraded oil classification using a machine learning approach
Full article title | Geochemical biodegraded oil classification using a machine learning approach |
---|---|
Journal | Geosciences |
Author(s) | Bispo-Silva, Sizenando; Ferreira de Oliveira, Cleverson J.; de Alemar Barberes, Gabriel |
Author affiliation(s) | Centro de Pesquisas Leopoldo Américo Miguez de Mello, University of Coimbra |
Primary contact | Email: sizenando at petrobras dot com dot br |
Editors | Malvić, Tomislav; Martinez-Frias, Jesus |
Year published | 2023 |
Volume and issue | 13(11) |
Article # | 321 |
DOI | 10.3390/geosciences13110321 |
ISSN | 2076-3263 |
Distribution license | Creative Commons Attribution 4.0 International |
Website | https://www.mdpi.com/2076-3263/13/11/321 |
Download | https://www.mdpi.com/2076-3263/13/11/321/pdf?version=1698160370 (PDF) |
This article should be considered a work in progress and incomplete. Consider this article incomplete until this notice is removed. |
Abstract
Chromatographic oil analysis is an important step in identifying biodegraded petroleum: peaks are visualized and interpreted to explain the geochemistry of the oil. However, geochemists analyze chromatogram components comparatively and visually, which makes the process slow. This article aims to improve chromatogram analysis during geochemical interpretation by proposing the use of convolutional neural networks (CNNs), a deep learning technique widely used by large technology companies. Two hundred and twenty-one (221) chromatographic oil images from basins in Brazil, the USA, Portugal, Angola, and Venezuela were used. The open-source software Orange Data Mining was used to process the images with a CNN. The CNN algorithm extracts recurring features from the images, pixel by pixel, through convolutional operations; these features are then grouped into common feature groups. Training yielded a classification accuracy (CA) of 96.7% and an area under the receiver operating characteristic (ROC) curve (AUC) of 99.7%, and testing yielded a CA of 97.6% and an AUC of 99.7%. This work suggests that processing petroleum chromatographic images with CNNs can become a new tool for the study of petroleum geochemistry, since chromatograms can be loaded, read, grouped, and classified more quickly and efficiently than with classical evaluation methods.
Keywords: convolutional neural network, biodegradation, organic geochemistry, Orange Data Mining, chromatogram image
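The convolutional operation and the two metrics named in the abstract (CA and AUC) can be illustrated with a minimal sketch. This is not the Orange Data Mining workflow used in the article. The toy image, the kernel, the labels, the scores, and all function names below are hypothetical and chosen only for illustration. As in most CNN libraries, the "convolution" here is implemented as a sliding dot product (cross-correlation) without flipping the kernel.

```python
def convolve2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and
    return the resulting feature map as a list of rows."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def classification_accuracy(y_true, y_pred):
    """CA: fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """AUC: probability that a randomly chosen positive sample receives a
    higher score than a randomly chosen negative one (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical 4x4 grayscale "image" with a vertical edge in the middle.
image = [[0, 0, 1, 1] for _ in range(4)]
# Hypothetical 2x2 kernel that responds to left-to-right intensity increases.
kernel = [[-1, 1], [-1, 1]]
feature_map = convolve2d_valid(image, kernel)

# Hypothetical binary labels (1 = biodegraded) and classifier scores.
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.3, 0.6, 0.4, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

ca = classification_accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(feature_map, ca, auc)
```

The feature map is nonzero only where the kernel overlaps the edge, which is the sense in which a convolution "extracts" a recurring local pattern. CA depends on the chosen decision threshold (0.5 here), whereas AUC depends only on how the scores rank positive samples against negative ones.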
Introduction
References
Notes
This version is faithful to the original, with only a few minor changes to presentation, spelling, and grammar. In some cases, important information was missing from the references, and that information was added. The footnote at the end of the original version was turned into a formal citation for this version.