Journal:An integrated data analytics platform
Full article title | An integrated data analytics platform |
---|---|
Journal | Frontiers in Marine Science |
Author(s) |
Armstrong, Edward M.; Bourassa, Mark A.; Cram, Thomas A.; DeBellis, Maya; Elya, Jocelyn; Greguska III, Frank R.; Huang, Thomas; Jacob, Joseph C.; Ji, Zaihua; Jiang, Yongyao; Li, Yun; Quach, Nga; McGibbney, Lewis; Smith, Shawn; Tsontos, Vardis M.; Wilson, Brian; Worley, Steven J.; Yang, Chaowei; Yam, Elizabeth |
Author affiliation(s) |
NASA Jet Propulsion Laboratory, Center for Ocean-Atmospheric Prediction Studies, National Center for Atmospheric Research, George Mason University |
Primary contact | Email: thomas dot huang at jpl dot nasa dot gov |
Year published | 2019 |
Volume and issue | 6 |
Page(s) | 354 |
DOI | 10.3389/fmars.2019.00354 |
ISSN | 2296-7745 |
Distribution license | Creative Commons Attribution 4.0 International |
Website | https://www.frontiersin.org/articles/10.3389/fmars.2019.00354/full |
Download | https://www.frontiersin.org/articles/10.3389/fmars.2019.00354/pdf (PDF) |
This article should not be considered complete until this message box has been removed. This is a work in progress. |
Abstract
An integrated science data analytics platform (ISDAP) is an environment that enables the confluence of resources for scientific investigation. It harmonizes data, tools, and computational resources to enable the research community to focus on the investigation rather than spending time on security, data preparation, management, etc. OceanWorks is a National Aeronautics and Space Administration (NASA) technology integration project to establish a cloud-based integrated ocean science data analytics platform for managing ocean science research data at NASA’s Physical Oceanography Distributed Active Archive Center (PO.DAAC). The platform focuses on advancement and maturity by bringing together several NASA open-source, big data projects for parallel analytics, anomaly detection, in situ-to-satellite data matching, quality-screened data subsetting, search relevancy, and data discovery. Our communities are relying on data available through distributed data centers to conduct their research. In typical investigations, scientists would (1) search for data, (2) evaluate the relevance of that data, (3) download it, and (4) then apply algorithms to identify trends, anomalies, or other attributes of the data. Such a workflow cannot scale if the research involves a massive amount of data or multi-variate measurements. With the upcoming NASA Surface Water and Ocean Topography (SWOT) mission expected to produce over 20 petabytes (PB) of observational data during its three-year nominal mission, the volume of data will challenge all existing earth science data archival, distribution, and analysis paradigms. This paper discusses how OceanWorks enhances the analysis of physical ocean data where the computation is done on an elastic cloud platform next to the archive to deliver fast, web-accessible services for working with oceanographic measurements.
Keywords: big data, cloud computing, ocean science, data analysis, matchup, anomaly detection, open source
References
Notes
This presentation is faithful to the original, with only a few minor changes to presentation, grammar, and punctuation. In some cases important information was missing from the references, and that information was added.