Difference between revisions of "Journal:VennDiagramWeb: A web application for the generation of highly customizable Venn and Euler diagrams"

Revision as of 13:15, 12 October 2016

Full article title VennDiagramWeb: A web application for the generation of highly customizable Venn and Euler diagrams
Journal BMC Bioinformatics
Author(s) Lam, F.; Lalansingh, C.M.; Babaran, H.E.; Wang, Z.; Prokopec, S.D.; Fox, N.S.; Boutros, P.C.
Author affiliation(s) Ontario Institute for Cancer Research, University of Toronto
Primary contact Email: Paul dot Boutros at oicr dot on dot ca
Year published 2016
Volume and issue 17
Page(s) 401
DOI 10.1186/s12859-016-1281-5
ISSN 1471-2105
Distribution license Creative Commons Attribution 4.0 International
Website http://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1281-5
Download http://bmcbioinformatics.biomedcentral.com/track/pdf/10.1186/s12859-016-1281-5 (PDF)

Abstract

Background

Visualization of data generated by high-throughput, high-dimensionality experiments is rapidly becoming a rate-limiting step in computational biology. There is an ongoing need to quickly develop high-quality visualizations that can be easily customized or incorporated into automated pipelines. This often requires an interface for manual plot modification, rapid cycles of tweaking visualization parameters, and the generation of graphics code. To facilitate this process for the generation of highly-customizable, high-resolution Venn and Euler diagrams, we introduce VennDiagramWeb: a web application for the widely used VennDiagram R package. VennDiagramWeb is hosted at http://venndiagram.res.oicr.on.ca/.

Results

VennDiagramWeb allows real-time modification of Venn and Euler diagrams, with parameter setting through a web interface and immediate visualization of results. It allows customization of essentially all aspects of figures, but also supports integration into computational pipelines via download of R code. Users can upload data and download figures in a range of formats, and there is exhaustive support documentation.

Conclusions

VennDiagramWeb allows the easy creation of Venn and Euler diagrams for computational biologists, and indeed many other fields. Its ability to support real-time graphics changes that are linked to downloadable code that can be integrated into automated pipelines will greatly facilitate the improved visualization of complex datasets. For application support please contact Paul dot Boutros at oicr dot on dot ca.

Background

Data visualization is a growing and important area of computational biology that demands high quality images which highlight the critical aspects of data. To elucidate all essential features of the data, one must perform a wide range of adjustments to various aspects of the plot, which can be a time-consuming process. Having fine-grained control over the parameters which define colors, fonts, label placements, element sizes, overall resolution, etc. leads to more effective plots which can convey necessary details in a publication-ready manner.

Pipelines facilitate automated, robust and reproducible data generation and analysis. Plotting is an important tool for both validation and reporting of results. Incorporating effective plots into these pipelines requires code that has been written specifically for each plot, as there is no single approach which can be applied to varied datasets. As a result, bioinformaticians often engage in long cycles of sequentially modifying plotting code, executing it, and observing the ultimate figure, until an optimum is reached. This process is inefficient and time-consuming.

Venn and Euler diagrams are used frequently in computational biology to visualize the interactions between multiple sets of data. In genomics especially, a common assay is to compare gene lists arising from separate analyses[1], such as contrasting lists of differentially abundant RNAs following drug treatments or lists of mutated genes across disease types. Venn diagrams are typically depicted as partially intersecting circles or other closed curves such that n sets yield 2^n distinct regions.[2] While Venn diagrams always depict all 2^n possible regions, Euler diagrams can omit regions whose corresponding subset contains no values. This allows Euler diagrams to be less visually complex, by depicting only a subset of all possible regions.[3] VennDiagramWeb facilitates the creation of both Venn and Euler diagrams, using the arguments euler.d and scaled (both default TRUE). By default, VennDiagramWeb will create an Euler diagram where possible, displaying only regions containing one or more values. Users can force a Venn diagram only by setting both euler.d = FALSE and scaled = FALSE (Fig. 1).


Figure 1. Euler and Venn diagrams produced by VennDiagramWeb each depicting three sets: x1 = {7,8}, x2 = {4,6,7}, x3 = {4,7,8,10}. a. An Euler diagram, produced with euler.d = TRUE and scaled = TRUE. b. A Venn diagram, produced with euler.d = FALSE and scaled = FALSE.
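The distinction between the 2^n regions of a Venn diagram and the non-empty regions an Euler diagram keeps can be illustrated with the three sets of Figure 1. The following is a minimal base R sketch, not code from the VennDiagram package; the helper name region_count is hypothetical.

```r
# The three sets from Figure 1.
sets <- list(x1 = c(7, 8), x2 = c(4, 6, 7), x3 = c(4, 7, 8, 10))

# Hypothetical helper: count the elements that belong to exactly the sets
# flagged TRUE in `member` and to none of the sets flagged FALSE.
region_count <- function(sets, member) {
  universe <- unique(unlist(sets))
  in_region <- vapply(universe, function(el) {
    all(vapply(sets, function(s) el %in% s, logical(1)) == member)
  }, logical(1))
  sum(in_region)
}

# Enumerate all 2^n membership patterns for n sets
# (including the pattern "in no set", which is the area outside all curves).
n <- length(sets)
patterns <- as.matrix(expand.grid(rep(list(c(FALSE, TRUE)), n)))
colnames(patterns) <- names(sets)

counts <- apply(patterns, 1, function(m) region_count(sets, m))
regions <- data.frame(patterns, count = counts)
print(regions)

# Regions an Euler diagram would need to draw: those with at least one element.
non_empty <- regions[regions$count > 0, ]
```

For these sets, the regions "x1 only" and "x1 and x2 only" are empty, which is why the Euler diagram in panel a can omit them while the Venn diagram in panel b still draws them.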

The R statistical programming language has widespread use in the bioinformatics field, and so we developed VennDiagram to generate plots in this language.[4] The initial release has proven to be robust and useful, and has garnered 186 citations. As of June 8, 2016 the package has been downloaded from the Comprehensive R Archive Network (CRAN) over 75,000 times since its release in March 2011. Over half of these (>40,000) occurred in 2015 alone, highlighting growing popularity.[5]

We believe a graphic user interface for VennDiagram could bring the package to a wider audience and enhance workflows for pipeline developers by providing a real-time framework for plotting optimization. There are many existing web interfaces for creating Venn diagrams, including Venny[6], BioVenn[7], GeneVenn[8], and those from the CRP-Sante Microarray Centre[9] and the Universiteit Gent.[10] These tools perform the necessities of creating a Venn diagram, but are missing many features required to create completely customized publication-quality plots, and have no means of exporting code for integration in large scale analysis pipelines.

Implementation

Our first step was to improve upon the existing VennDiagram R package.[4] A series of changes was made to enhance code quality, including significant refactoring and documentation and exposure of several helper functions. Major feature additions included the ability to create quintuple Venn diagrams. These are highly complex figures, but maintain symmetry and are still easily interpretable (Fig. 2). A parameter was added that lets users set a scale by which the areas and labels of the categories are adjusted. The ability to display proportions of the total population contained within the areas as percentages was also introduced. Many users requested a feature to display a text table of the partitions of the Venn diagram, which is now supported by the package. Users can also now specify an argument that forces the Venn diagram to consider only unique elements in each category when tabulating the sets. To provide more comprehensive logging that can be integrated with pipelines wrapping the Venn diagram code, we now use Futile Logger to log the parameters and sets of the Venn diagrams that are generated at runtime.[11] Finally, users can now choose file types of tiff, png or svg, or can choose not to output a file and instead receive a list of R graphical objects which compose the entirety of the plot. The user can then modify and re-render the plot as desired.
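Several of the package features listed above can be exercised together in a short script. This is a sketch under the assumption that the VennDiagram package is installed; the argument values are illustrative and are not taken from the article.

```r
library(VennDiagram)
library(grid)

# The three sets from Figure 1, reused here as toy data.
sets <- list(x1 = c(7, 8), x2 = c(4, 6, 7), x3 = c(4, 7, 8, 10))

# filename = NULL asks venn.diagram to return the grid graphical objects
# instead of writing an image file.
venn_grobs <- venn.diagram(
  x = sets,
  filename = NULL,
  euler.d = FALSE,                    # force a Venn diagram ...
  scaled = FALSE,                     # ... together with scaled = FALSE
  print.mode = c("raw", "percent"),   # show counts and percentages
  force.unique = TRUE                 # count each element once per set
)

# Render the returned objects on the current graphics device.
grid.newpage()
grid.draw(venn_grobs)

# Text table of the partitions of the diagram.
partitions <- get.venn.partitions(sets)
print(partitions)
```

Because the plot is returned as a list of grid objects, individual elements can be modified before calling grid.draw again.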


Figure 2. A quintuple set Venn diagram

VennDiagramWeb was written using the R statistical programming language and leverages the VennDiagram R package.[4] It uses the web application server Shiny to create a robust graphical user interface which can execute R code on data and parameters dynamically as they change.[12] Using the Shiny web application server enabled us to create a solution that is written almost entirely in R from end to end. Using a single language allows for very tight integration of security and error handling functionality. We are able to parse any inputs provided by the user and, using functionality built into the language, inspect those inputs to ensure there is no attempt to inject malicious code. The architecture of the web application is divided into the code for the user interface and the code for the server. The user interface is defined by a series of widgets which accept the parameters and data files from the user, and display the rendered plot reactively as elements are changed. The server handles all arguments and data, ensures that they are safe and valid, and performs generation of figures.
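One way to inspect user input with functionality built into R is to parse the text without ever evaluating it, and to accept it only when it is a plain literal. The sketch below is an assumption about how such a check could look, not the authors' implementation; the function name safe_literal is hypothetical.

```r
# Hypothetical helper: accept a user-supplied parameter value only if it
# parses to a single literal constant (number, string, or logical).
# The parsed expression is inspected but never evaluated.
safe_literal <- function(text) {
  parsed <- tryCatch(
    parse(text = text, keep.source = FALSE),
    error = function(e) NULL
  )
  if (is.null(parsed) || length(parsed) != 1) {
    stop("input is not a single valid R expression")
  }
  value <- parsed[[1]]
  if (is.numeric(value) || is.character(value) || is.logical(value)) {
    return(value)
  }
  # Function calls and symbols end up here. Note that this strict check
  # also rejects negative numbers, which R parses as a call to `-`.
  stop("input must be a literal constant, not code")
}

safe_literal("0.5")      # accepted as a number
safe_literal("\"red\"")  # accepted as a string
rejected <- tryCatch(safe_literal("system('ls')"), error = function(e) e)
```

A stricter or more permissive whitelist could be substituted; the point of the sketch is that the decision is made on the parsed structure rather than on the result of running the input.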

Results

User interface

VennDiagramWeb is a graphical user interface for the venn.diagram function.[13] The application starts with a simple example loaded (Fig. 3). Users can also choose to load an example configuration using the drop-down menu in the top right area of the sidebar. Users can modify the parameters of the venn.diagram function using the sidebar, and the resultant plot is generated instantly in the center panel (Fig. 4). The parameters for venn.diagram are divided into eleven sections, allowing the user to quickly find parameters of interest. If the user is familiar with the R package VennDiagram, they can also search for a parameter by name. At the bottom of the sidebar, the user can download the plot displayed as an image. On the bottom bar, the user can choose the datasets plotted, preview the datasets and data partitions, view the R code used to generate the plot, and access proper citation information for VennDiagramWeb.
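The sidebar-plus-plot layout described above follows a common Shiny pattern. The following is a much reduced sketch of that pattern, assuming the shiny and VennDiagram packages are installed; the widget identifiers, labels and colours are illustrative and do not come from the VennDiagramWeb source.

```r
library(shiny)
library(VennDiagram)
library(grid)

# Toy data: the three sets from Figure 1.
example_sets <- list(x1 = c(7, 8), x2 = c(4, 6, 7), x3 = c(4, 7, 8, 10))

# User interface: parameter widgets in the sidebar, plot in the main panel.
ui <- fluidPage(
  titlePanel("Venn diagram sketch"),
  sidebarLayout(
    sidebarPanel(
      checkboxInput("euler_d", "euler.d", value = TRUE),
      checkboxInput("scaled", "scaled", value = TRUE),
      sliderInput("alpha", "alpha", min = 0, max = 1, value = 0.5)
    ),
    mainPanel(plotOutput("venn_plot"))
  )
)

# Server: validate the inputs, then redraw whenever a widget changes.
server <- function(input, output) {
  output$venn_plot <- renderPlot({
    # The text states that a Venn diagram needs both arguments set to FALSE,
    # so the mixed combination is refused here before plotting.
    validate(need(
      input$euler_d || !input$scaled,
      "Set both euler.d and scaled to FALSE to force a Venn diagram."
    ))
    grobs <- venn.diagram(
      x = example_sets,
      filename = NULL,
      euler.d = input$euler_d,
      scaled = input$scaled,
      fill = c("steelblue", "tomato", "gold"),
      alpha = rep(input$alpha, 3)
    )
    grid.newpage()
    grid.draw(grobs)
  })
}

app <- shinyApp(ui = ui, server = server)
# runApp(app)  # uncomment to start the app in an interactive session
```

The reactive renderPlot block is what gives the immediate redraw: Shiny re-runs it whenever one of the referenced inputs changes.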


Figure 3. The VennDiagramWeb user interface


Figure 4. A Venn diagram generated using VennDiagramWeb, annotated to indicate the parameters corresponding to features of the plot. The panels in green highlight the parameters of the tool, with arrows showing which elements of the Venn diagram are directly affected. None of the green elements are generated as part of the Venn diagram.

Features

VennDiagramWeb is meant to integrate seamlessly into scientific plot generation workflows. To this end we have included several key features: data file uploads, access to underlying R code, image downloads, and multiple workspaces.

Data file uploads

Users can upload up to five data files for use in generating their diagrams. Each file can be up to two megabytes, which is quite large for Venn diagram input. These datasets are made available to the editor as dataframes titled data1, data2, etc. through to data5. This feature is found in the first tab on the bottom bar. The file uploading system is designed to accept tables output by R using the write.table function, but can accept any simple text table format, such as csv, tsv, and txt. Some options are available for file format customization, generally pertaining to the options of write.table. These options allow the user to specify the column separator, the presence of headers and/or row names, and whether or not values are contained in quotes. The user is able to preview the first four lines of the datasets available. As an example, we have provided sample data for the reader to upload (Additional file 1) and visualize (Additional file 2).[14]
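A file in the expected format can be produced and read back with base R alone. The sketch below uses made-up gene identifiers and column names; only write.table, read.table and the options named above (separator, header, quoting) come from the text.

```r
# Made-up example data: two gene lists of equal length.
genes <- data.frame(
  treatment_a = c("g1", "g2", "g3", "g4", "g5"),
  treatment_b = c("g2", "g3", "g6", "g7", "g8"),
  stringsAsFactors = FALSE
)

# Write a tab-separated table with a header, quoted values and no row names.
path <- tempfile(fileext = ".tsv")
write.table(genes, file = path, sep = "\t", quote = TRUE,
            row.names = FALSE, col.names = TRUE)

# Check the size against the two-megabyte limit mentioned above.
within_limit <- file.size(path) <= 2 * 1024^2

# Read the file back with matching options, as an upload form would need to.
data1 <- read.table(path, sep = "\t", header = TRUE, quote = "\"",
                    stringsAsFactors = FALSE)

# Preview the first four lines.
preview <- head(data1, n = 4)
print(preview)

# Each column becomes one set; the overlap is what the diagram displays.
sets <- lapply(data1, unique)
shared <- intersect(sets$treatment_a, sets$treatment_b)
```

If the reading options do not match the writing options (for example a comma separator on a tab-separated file), the table collapses into a single column, which is why the preview step is useful.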

R code access

As VennDiagramWeb is a graphic overlay for the R package VennDiagram, the user can access the generated R code to reproduce the plot as it appears on screen. This is useful because it allows the user to rapidly prototype the appearance of their desired Venn diagram, and then download the corresponding code for integration into their own pipelines. It is also a necessary feature because, although a web application is user-friendly, many users may wish to avoid uploading their data to an external website, or may be restricted from doing so by privacy laws.[15] For this reason, users can experiment with VennDiagramWeb for diagram formatting and customization on toy datasets, and then download the resulting code for use with their real data.

Image downloads

VennDiagramWeb allows users to download publication-quality images of the Venn diagram displayed on screen. Users can choose the image format as tiff, png or svg, as well as resolution and physical size in inches.
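The same export choices are available when running the VennDiagram package locally. The following sketch assumes the package is installed and that the R build has png support; the file name, size and resolution are illustrative values.

```r
library(VennDiagram)

# Toy data: the three sets from Figure 1.
sets <- list(x1 = c(7, 8), x2 = c(4, 6, 7), x3 = c(4, 7, 8, 10))

# Write a 4 x 4 inch PNG at 300 dpi into the session's temporary directory.
out_file <- file.path(tempdir(), "venn_example.png")
venn.diagram(
  x = sets,
  filename = out_file,
  imagetype = "png",   # "tiff" and "svg" are the other formats named above
  height = 4,
  width = 4,
  units = "in",
  resolution = 300
)

file.exists(out_file)
```

Depending on the package version, a log file may also be written alongside the image.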

Multiple workspaces

The web application has an interface for creating and switching between tabs. This allows users to create several different plots simultaneously. Each workspace tab is distinct and does not share data or parameters to avoid unintentional effects on diagrams which are being generated concurrently.

Discussion

Benefits

The datasets analyzed in biology and particularly in genomics can be enormous in scope and complexity, and we can only expect them to grow.[16] The rise of big data has led to increasing attention to the field of data visualization.[17] As our datasets increase in size and our analyses increase in complexity, data visualization becomes crucial in allowing us to gain insight, see patterns and elucidate further areas of study in our experiments.[17][18][19] These visualizations have to be meaningful representations of the data.[18] Unsurprisingly, the figures we use to communicate our insights can become more convoluted with bigger and more complex data and, if poorly designed, risk confusing what they were meant to make clear. Indeed, the difficulty of achieving visual clarity in a diagram often increases in tandem with the necessity to do so.

References

  1. Bardou, P.; Mariette, J.; Escudié, F. et al. (2014). "jvenn: An interactive Venn diagram viewer". BMC Bioinformatics 15: 293. doi:10.1186/1471-2105-15-293. PMC PMC4261873. PMID 25176396. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4261873. 
  2. "Venn Diagram". MathWorld. Wolfram Research, Inc. http://mathworld.wolfram.com/VennDiagram.html. Retrieved 22 August 2016. 
  3. Rodger, P. (22 September 2004). "Venn Diagrams, Euler Diagrams and Leibniz". Euler Diagrams 2004. https://www.cs.kent.ac.uk/events/conf/2004/euler/eulerdiagrams.html. Retrieved 22 August 2016. 
  4. 4.0 4.1 4.2 Chen, H.; Boutros, P.C. (2011). "VennDiagram: A package for the generation of highly-customizable Venn and Euler diagrams in R". BMC Bioinformatics 12: 35. doi:10.1186/1471-2105-12-35. PMC PMC3041657. PMID 21269502. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3041657. 
  5. "cranlogs: Download Logs from the RStudio CRAN Mirror". GitHub. GitHub, Inc. https://github.com/metacran/cranlogs. Retrieved 08 June 2016. 
  6. Oliveros, J.C.. "Venny". BioinfoGP. Spanish National Biotechnology Centre. http://bioinfogp.cnb.csic.es/tools/venny/index.html. Retrieved 08 June 2016. 
  7. Hulsen, T.. "BioVenn - A web application for the comparison and visualization of biological lists using area-proportional Venn diagrams". Centre for Molecular and Biomolecular Informatics. http://www.cmbi.ru.nl/cdd/biovenn/. Retrieved 08 June 2016. 
  8. Pirooznia, M. (October 2006). "GeneVenn". SourceForge. http://genevenn.sourceforge.net/. Retrieved 08 June 2016. 
  9. Microarray Center. "Venn Diagram". Centre de Recherche Public Santé. http://www.bioinformatics.lu/venn.php. Retrieved 08 June 2016. 
  10. VIB / UGent. "Calculate and draw custom Venn diagrams". Bioinformatics & Evolutionary Genomics. http://bioinformatics.psb.ugent.be/webtools/Venn/. Retrieved 08 June 2016. 
  11. Rowe, B.L.Y.. "futile.logger: A Logging Utility for R". Comprehensive R Archive Network. https://cran.r-project.org/web/packages/futile.logger/index.html. Retrieved 08 June 2016. 
  12. "Shiny". RStudio, Inc. 2016. http://shiny.rstudio.com/. Retrieved 08 June 2016. 
  13. Chen, H. (18 April 2016). "Package 'VennDiagram'" (PDF). Comprehensive R Archive Network. https://cran.r-project.org/web/packages/VennDiagram/VennDiagram.pdf. Retrieved 08 June 2016. 
  14. Sun, R.X.; Chong, L.C.; Simmons, T.T. et al. (2014). "Cross-species transcriptomic analysis elucidates constitutive aryl hydrocarbon receptor activity". BMC Genomics 15: 1053. doi:10.1186/1471-2164-15-1053. PMC PMC4301818. PMID 25467400. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4301818. 
  15. 104th Congress (21 August 1996). "Health Insurance Portability and Accountability Act of 1996". U.S. Government Printing Office. https://www.congress.gov/104/plaws/publ191/PLAW-104publ191.htm. 
  16. Stephens, Z.D.; Lee, S.Y.; Faghri, F. et al. (2015). "Big Data: Astronomical or Genomical?". PLoS Biology 13 (7): e1002195. doi:10.1371/journal.pbio.1002195. PMC PMC4494865. PMID 26151137. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4494865. 
  17. 17.0 17.1 Sridharan, M. (20 April 2015). "Data Visualization - The Rosetta Stone of Data Science". iCrunchData. REDN Enterprises, LLC. https://icrunchdata.com/data-visualization-rosetta-stone-data-science/. Retrieved 08 June 2016. 
  18. 18.0 18.1 Wong, B.. "Data Visualization Initiative". Broad Institute. Archived from the original on 06 September 2015. https://web.archive.org/web/20150906045214/http://www.broadinstitute.org/vis. Retrieved 08 June 2016. 
  19. "Why you should start by learning data visualization and manipulation". Sharp Sight Labs. 10 February 2015. http://sharpsightlabs.com/blog/2015/02/10/start-with-data-visualization-manipulation/. Retrieved 08 June 2016. 

Notes

This presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. In one case, a URL was dead and an archived version of the page was included. In another, the live URL had changed and was updated in the text.