Journal:Principles of metadata organization at the ENCODE data coordination center
Full article title | Principles of metadata organization at the ENCODE data coordination center |
---|---|
Journal | Database |
Author(s) | Hong, Eurie L.; Sloan, Cricket A.; Chan, Esther T.; Davidson, Jean M.; Malladi, Venkat S.; Strattan, J. Seth; Hitz, Benjamin C.; Gabdank, Idan; Narayanan, Aditi K.; Ho, Marcus; Lee, Brian T.; Rowe, Laurence D.; Dreszer, Timothy R.; Roe, Greg R.; Podduturi, Nikhil R.; Tanaka, Forrest; Hilton, Jason A.; Cherry, J. Michael |
Author affiliation(s) | Stanford University, University of California - Santa Cruz |
Primary contact | Email: cherry at stanford dot edu |
Year published | 2016 |
Page(s) | baw001 |
DOI | 10.1093/database/baw001 |
ISSN | 1758-0463 |
Distribution license | Creative Commons Attribution 4.0 International |
Website | http://database.oxfordjournals.org/content/2016/baw001 |
Download | http://database.oxfordjournals.org/content/2016/baw001.full.pdf+html (PDF) |
Abstract
The Encyclopedia of DNA Elements (ENCODE) Data Coordinating Center (DCC) is responsible for organizing, describing and providing access to the diverse data generated by the ENCODE project. The description of these data, known as metadata, includes the biological sample used as input, the protocols and assays performed on these samples, the data files generated from the results and the computational methods used to analyze the data. Here, we outline the principles and philosophy used to define the ENCODE metadata in order to create a metadata standard that can be applied to diverse assays and multiple genomic projects. In addition, we present how the data are validated and used by the ENCODE DCC in creating the ENCODE Portal (https://www.encodeproject.org/).
Database URL: www.encodeproject.org
Introduction
The goal of the Encyclopedia of DNA Elements (ENCODE) project is to annotate functional regions in the human and mouse genomes. Functional regions include those that encode protein-coding or non-coding RNA gene products, as well as regions that could have a regulatory role.[1][2] To this end, the project has surveyed the landscape of the human genome using over 35 high-throughput experimental methods in more than 250 different cell and tissue types, resulting in over 4000 experiments.[1][3] These datasets are submitted to a Data Coordinating Center (DCC), whose role is to describe, organize and provide access to these diverse datasets.[4]
References
- ENCODE Project Consortium (2012). "An integrated encyclopedia of DNA elements in the human genome". Nature 489 (7414): 57–74. doi:10.1038/nature11247. PMC3439153. PMID 22955616. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3439153.
- Yue, F.; Cheng, Y.; Breschi, A. et al. (2014). "A comparative encyclopedia of DNA elements in the mouse genome". Nature 515 (7527): 355–64. doi:10.1038/nature13992. PMC4266106. PMID 25409824. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4266106.
- ENCODE Project Consortium et al. (2007). "Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project". Nature 447 (7146): 799–816. doi:10.1038/nature05874. PMC2212820. PMID 17571346. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2212820.
- Sloan, C.A.; Chan, E.T.; Davidson, J.M. et al. (2016). "ENCODE data at the ENCODE portal". Nucleic Acids Research 44 (D1): D726–32. doi:10.1093/nar/gkv1160. PMC4702836. PMID 26527727. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4702836.
Notes
This presentation is faithful to the original, with only a few minor changes to formatting. In some cases, important information was missing from the references and has been added.