10.6084/m9.figshare.3840102.v1
Benj Petre
Benj
Petre
Aurore Coince
Aurore
Coince
Sophien Kamoun
Sophien
Kamoun
Petre_Slide_CategoricalScatterplotFigShare.pptx
figshare
2016
Categorical scatterplot
boxplot
ggplot2
R script
Scientific communication
2016-09-19 16:00:13
Dataset
https://figshare.com/articles/dataset/Petre_Slide_CategoricalScatterplotFigShare_pptx/3840102
<p><b>Categorical scatterplots with R for biologists: a
step-by-step guide</b></p>
<p> </p>
<p>Benjamin
Petre<sup>1</sup>, Aurore Coince<sup>2</sup>, Sophien Kamoun<sup>1</sup></p>
<p><sup>1 </sup>The Sainsbury Laboratory, Norwich, UK; <sup>2</sup>
Earlham Institute, Norwich, UK</p>
<p> </p>
<p>Weissgerber and colleagues (2015) recently stated that ‘as scientists, we urgently need to change our
practices for presenting continuous data in small sample size studies’. They
called for more scatterplot and boxplot representations in scientific papers,
which ‘allow readers to critically evaluate continuous data’ (Weissgerber <i>et al</i>.,
2015). In the Kamoun Lab at The Sainsbury Laboratory, we recently
implemented a protocol to generate categorical scatterplots (Petre <i>et al</i>.,
2016; Dagdas <i>et al</i>., 2016).
Here we describe the three steps of this protocol: 1) formatting of the data
set in a .csv file, 2) execution of the R script to generate the graph, and 3)
export of the graph as a .pdf file. </p>
<p> </p>
<p><b>Protocol</b></p>
<p><b>• Step 1: format the data set as a .csv
file.</b> Store the data in a three-column
Excel file as shown in the PowerPoint slide. The first
column ‘Replicate’ indicates the biological replicates. In the example, the
month and year during which the replicate was performed are indicated. The second
column ‘Condition’ indicates the conditions of the experiment (in the example,
a wild type and two mutants called A and B). The third column ‘Value’ contains
continuous values. Use these exact column headers, since the script refers to
columns by name (for example, ‘Replicate’ in command line #7). Save the Excel file
as a .csv file (File -> Save as ->
in ‘File Format’, select .csv). This .csv file is the input file to import into
R. </p>
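<p>As an illustration of the layout described in Step 1, the sketch below builds a small three-column table directly in R and writes it to a .csv file. The replicate labels, condition names, values, and file name are invented for the example and are not taken from the PowerPoint slide.</p>

```r
# Hypothetical example data in the three-column layout of Step 1.
# All labels and numbers below are made up for illustration.
example_data <- data.frame(
  Replicate = rep(c("Jan2016", "Feb2016", "Mar2016"), each = 6),
  Condition = rep(rep(c("WT", "MutantA", "MutantB"), each = 2), times = 3),
  Value     = c(10.2, 11.5,  4.1,  3.8, 20.3, 18.9,
                 9.8, 12.1,  5.0,  4.4, 22.7, 19.5,
                10.9, 11.0,  3.5,  4.9, 21.1, 20.0)
)

# Write the table to a temporary .csv file (one header row, no row names).
csv_path <- file.path(tempdir(), "example_input.csv")
write.csv(example_data, csv_path, row.names = FALSE)

# Read the file back and check that it has the expected layout.
check <- read.csv(csv_path, stringsAsFactors = FALSE)
stopifnot(identical(names(check), c("Replicate", "Condition", "Value")))
stopifnot(is.numeric(check$Value))
```

<p>Opening the resulting file in a text editor is a quick way to confirm that a .csv exported from Excel has the same shape: one header row, then one row per measurement.</p>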
<p><b>• Step 2: execute the R script</b> <i>(see Notes 1 and 2)</i><b>.</b> Copy the script shown in
the PowerPoint slide and paste it into the R console.
Execute the script. In the dialog box, select the input .csv file from step 1.
The categorical scatterplot will appear in a separate window. Dots represent
the values for each sample; colors indicate replicates. Boxplots are
superimposed; black dots indicate outliers. </p>
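<p>The full script is given only in the PowerPoint slide. The sketch below is a reconstruction of what Step 2 describes, based on the plotting command quoted in Note 2; it is not the authors' script. The function name ‘plot_categorical_scatter’ and the demo data are hypothetical, and the x and y mappings to ‘Condition’ and ‘Value’ are assumed from the column descriptions in Step 1.</p>

```r
library(ggplot2)

# Hypothetical helper (not from the slide): read a .csv file in the
# Step 1 layout and build the categorical scatterplot.
plot_categorical_scatter <- function(csv_path) {
  data <- read.csv(csv_path)
  # Assumed mapping: conditions on the x-axis, continuous values on the y-axis.
  ggplot(data, aes(x = Condition, y = Value)) +
    geom_boxplot(outlier.colour = "black", colour = "black") +
    geom_jitter(aes(col = Replicate)) +  # dot colors indicate replicates
    theme_bw()
}

# Made-up demo data so the sketch runs without a dialog box.
demo <- data.frame(
  Replicate = rep(c("Jan2016", "Feb2016"), each = 6),
  Condition = rep(c("WT", "MutantA", "MutantB"), times = 4),
  Value     = c(10, 4, 20, 11, 5, 22, 9, 3, 19, 12, 4, 21)
)
demo_path <- file.path(tempdir(), "demo_input.csv")
write.csv(demo, demo_path, row.names = FALSE)

p <- plot_categorical_scatter(demo_path)

# Interactive use: file.choose() opens a file-selection dialog box.
# print(plot_categorical_scatter(file.choose()))
```

<p>Typing ‘p’ at the console (or calling ‘print(p)’) draws the graph in the plot window.</p>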
<p><b>• Step 3: save the graph as a .pdf file.</b> Resize the window as needed and save the
graph as a .pdf file (File -> Save as). See the PowerPoint
slide for an example. </p>
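<p>As an alternative to the File -> Save as menu, the graph can also be written to a .pdf file from the console with the ggplot2 function ‘ggsave’. This is a sketch of that alternative, not part of the protocol; the demo data, output file name, and page dimensions are arbitrary choices for the example.</p>

```r
library(ggplot2)

# Made-up demo data in the Step 1 layout.
demo <- data.frame(
  Replicate = rep(c("Jan2016", "Feb2016"), each = 6),
  Condition = rep(c("WT", "MutantA", "MutantB"), times = 4),
  Value     = c(10, 4, 20, 11, 5, 22, 9, 3, 19, 12, 4, 21)
)

# Build a plot with the same layers as the command quoted in Note 2,
# without the log scale.
p <- ggplot(demo, aes(x = Condition, y = Value)) +
  geom_boxplot(outlier.colour = "black", colour = "black") +
  geom_jitter(aes(col = Replicate)) +
  theme_bw()

# Save the plot as a PDF; width and height are in inches by default.
pdf_path <- file.path(tempdir(), "categorical_scatterplot.pdf")
ggsave(pdf_path, plot = p, width = 6, height = 4)
```

<p>Setting the width and height explicitly gives the same figure size on every run, which resizing a window by hand does not guarantee.</p>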
<p><b> </b></p>
<p><b>Notes</b></p>
<p><b>• Note 1: install the ggplot2
package.</b> The R script requires the
package ‘ggplot2’ to be installed. To install it, go to Packages & Data ->
Package Installer, enter ‘ggplot2’ in the Package Search field, and click
‘Get List’. Select ‘ggplot2’ in the Package column and click ‘Install
Selected’. Install all dependencies as well. </p>
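<p>The package can also be installed from the R console rather than through the menus. The sketch below shows one way to do this, assuming an internet connection and a configured CRAN mirror; it installs ‘ggplot2’ only if it is not already present.</p>

```r
# Install ggplot2 only when it is missing. By default, install.packages()
# also installs the packages that ggplot2 needs in order to load.
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}

# Load the package so its functions are available to the script.
library(ggplot2)
```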
<p><b>• Note 2: use a log scale for the
y-axis.</b> To use a log scale for the
y-axis of the graph, use the command line below in place of command line #7 in
the script. </p>
<p>#7 Display the graph in a separate window. Dot colors indicate replicates</p>
<p>graph + geom_boxplot(outlier.colour='black', colour='black') + geom_jitter(aes(col=Replicate)) + scale_y_log10() + theme_bw()</p>
<p> </p>
<p><b>References</b></p>
<p><b>Dagdas YF, Belhaj K, Maqbool A,
Chaparro-Garcia A, Pandey P, Petre B, <i>et
al</i>.</b> (2016) An effector of the Irish potato
famine pathogen antagonizes a host autophagy cargo receptor. <i>eLife</i> 5:e10856.</p>
<p><b>Petre B, Saunders DGO, Sklenar J,
Lorrain C, Krasileva KV, Win J, <i>et al</i>.</b> (2016) Heterologous Expression Screens in <i>Nicotiana
benthamiana</i> Identify a Candidate Effector of the Wheat Yellow Rust Pathogen
that Associates with Processing Bodies. <i>PLoS
ONE</i> 11(2):e0149035.</p>
<p><b>Weissgerber TL, Milic NM, Winham SJ,
Garovic VD</b> (2015) Beyond Bar
and Line Graphs: Time for a New Data Presentation Paradigm. <i>PLoS Biol</i> 13(4):e1002128.</p>
<p><a href="https://cran.r-project.org/">https://cran.r-project.org/</a></p>
<p><a href="http://ggplot2.org/">http://ggplot2.org/</a></p>