Completed exercises for the ninth lab
This document is meant to be used to practice after you have completed the tutorial for today’s lab. Make sure to put your name as the author of the document, above!
If you intend to work on these exercises while referring to the tutorial, there are instructions on the wiki on how to do so. You may also want to refer to past labs. Don’t forget that previous labs are linked to on the main labs website.
Important reminder: as with every time you open RStudio, don’t forget to load the libraries, below.
In the tutorial, we talked about how to run ANOVAs “manually” and how to do them using R’s built-in functions. Now you’ll practice doing so in R, and learn to use ggplot2 functions to plot the results.
As always, a version of these exercises with my answers will be posted at the end of the week to the lab website: https://faculty.bard.edu/~jdainerbest/psy-203/labslist.html
I encourage you to do two things as you work through this document: (1) Save it frequently! Hit the disk image above to save, hit Command/Ctrl and s, or go to the File menu and click save. When the document is saved, the title of it will go from red to black. (2) Practice Knitting the document by hitting the Knit button. You can do it now—it’ll create an HTML file in the folder where this file lives. Knitting will both help to identify problems and provide a complete document for me to review.
If you would like feedback on your answers, you may also upload the final version of this document (knitted, ideally) to Brightspace.
As always, you must load packages if you intend to use their functions. (“Turn on the lights.”) Run the following code chunk to load necessary packages for these exercises.
library(tidyverse)
# tidyverse loads the following packages that we'll use today (no need to load them individually):
# library(readr)
# library(dplyr)
# library(ggplot2)
# library(tidyr)
As discussed in the tutorial, we’re using data from James and colleagues (2015).
James, E. L., Bonsall, M. B., Hoppitt, L., Tunbridge, E. M., Geddes, J. R., Milton, A. L., & Holmes, E. A. (2015). Computer game play reduces intrusive memories of experimental trauma via reconsolidation-update mechanisms. Psychological Science, 26, 1201–1215. https://doi.org/10.1177/0956797615583071
Make sure you read the description of the study in the tutorial—it’s important for thinking about what we’re doing in these exercises.
The data was downloaded with this file. Load it using the read_csv()
command:
tetris <- read_csv("data/James.csv")
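If you'd like to confirm that the file loaded correctly, a quick peek at the data doesn't hurt. This check isn't part of the exercises; glimpse() comes from dplyr, which tidyverse has already loaded.

# Optional check: list each column, its type, and the first few values
glimpse(tetris)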
For the questions below, create your own code chunks (how?: instructions here) and insert all code into them.
We spent the tutorial talking about how to use the anova()
function after creating a model with lm()
. You’ll do the same thing with the tetris
data here.
First, use factor() to do the same thing we did in the tutorial, renaming the levels of Condition from 1, 2, 3, and 4 to “Control”, etc.
tetris$Condition <- factor(tetris$Condition,
levels = 1:4,
labels = c("Control", "Reactivation+Tetris",
"Tetris", "Reactivation"))
Then, run two ANOVAs: one asking whether Condition
determines the number of intrusive thoughts in the days after the intervention (Days_One_to_Seven_Number_of_Intrusions
), and the second asking whether Condition
impacts the responses on the task that was designed to measure intrusive thoughts (Number_of_Provocation_Task_Intrusions
).
For each ANOVA, you should also plan to report the results as we did in the tutorial, complete with means and the \(F(df_1, df_2) = \text{value}\), \(p < .05\) or \(p = \text{value}\) format.
intrusions <- lm(Days_One_to_Seven_Number_of_Intrusions ~ Condition,
data = tetris)
anova(intrusions)
Analysis of Variance Table
Response: Days_One_to_Seven_Number_of_Intrusions
Df Sum Sq Mean Sq F value Pr(>F)
Condition 3 114.82 38.273 3.7948 0.01409 *
Residuals 68 685.83 10.086
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
tetris %>%
group_by(Condition) %>%
summarize(mean_intrusions = mean(Days_One_to_Seven_Number_of_Intrusions))
# A tibble: 4 x 2
Condition mean_intrusions
<fct> <dbl>
1 Control 5.11
2 Reactivation+Tetris 1.89
3 Tetris 3.89
4 Reactivation 4.83
There was a significant effect of condition predicting the number of intrusions in the seven days following the intervention, \(F(3,68)=3.79,p<.05,\eta^2=0.14\), a large effect. Intrusions were much lower in the Reactivation+Tetris condition (\(M=1.89\)) than in the control condition (\(M=5.11\)), the Tetris-only condition (\(M=3.89\)), or the reactivation-only condition (\(M=4.83\)).
prov.task <- lm(Number_of_Provocation_Task_Intrusions ~ Condition,
data = tetris)
anova(prov.task)
Analysis of Variance Table
Response: Number_of_Provocation_Task_Intrusions
Df Sum Sq Mean Sq F value Pr(>F)
Condition 3 91.44 30.4815 5.5669 0.001782 **
Residuals 68 372.33 5.4755
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
tetris %>%
group_by(Condition) %>%
summarize(mean_intrusions = mean(Number_of_Provocation_Task_Intrusions))
# A tibble: 4 x 2
Condition mean_intrusions
<fct> <dbl>
1 Control 3.39
2 Reactivation+Tetris 1.11
3 Tetris 3.11
4 Reactivation 4.17
There was a significant effect of condition predicting the number of intrusions during the provocation task, \(F(3,68)=5.57,p<.05,\eta^2=0.20\), a large effect. Participants in the Reactivation+Tetris condition reported very few intrusions (\(M=1.11\)), while intrusions were higher in the other conditions, including control (\(M=3.39\)), Tetris-only (\(M=3.11\)), and reactivation-only (\(M=4.17\)).
Next, calculate the effect size (\(\eta^2\)) for each ANOVA, either with library(lsr) and the etaSquared() function, or by hand. Your choice. Then edit the above paragraphs to include the effect size.
library(lsr)
etaSquared(intrusions)
eta.sq eta.sq.part
Condition 0.1434073 0.1434073
etaSquared(prov.task)
eta.sq eta.sq.part
Condition 0.197173 0.197173
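If you'd rather compute \(\eta^2\) by hand, it is just the between-groups sum of squares divided by the total sum of squares. Here's a sketch that pulls those values out of the ANOVA table produced above (the name ss is just a placeholder):

# Eta-squared by hand: SS_between / SS_total, from the Sum Sq column of the ANOVA table
ss <- anova(intrusions)$"Sum Sq"
ss[1] / sum(ss)  # 114.82 / (114.82 + 685.83), about 0.14, matching etaSquared(intrusions)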
I told you in class that you shouldn’t run t-tests repeatedly, because doing so increases your Type I error rate (i.e., the risk of false positives). However, you can do what are called planned comparisons: after a significant result on an ANOVA, you may compare the condition of interest against the others, provided you intended to do so in advance. (This is one more reason why preregistration is so important!) Even when comparisons are planned, you should usually still apply a correction for multiple comparisons. In this case, we want to compare the Reactivation+Tetris condition to the others.
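To see why the error rate balloons, consider a quick back-of-the-envelope calculation. With four conditions there are six possible pairwise comparisons; if (as a simplification) we treat them as independent tests each run at \(\alpha = .05\), the chance of at least one false positive is already about one in four:

# Familywise error rate for six independent tests, each at alpha = .05 (a simplification)
1 - (1 - 0.05)^6  # roughly 0.26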
R has a built-in function called pairwise.t.test()
which takes three arguments: x
(the dependent variable), g
(the grouping variable), and p.adj
(the kind of correction; here, “bonf” for Bonferroni). Here’s an example of how we’d do this with the provocation-task data.
pairwise.t.test(x = tetris$Number_of_Provocation_Task_Intrusions,
g = tetris$Condition,
p.adj = "bonf")
Pairwise comparisons using t tests with pooled SD
data: tetris$Number_of_Provocation_Task_Intrusions and tetris$Condition
Control Reactivation+Tetris Tetris
Reactivation+Tetris 0.0284 - -
Tetris 1.0000 0.0753 -
Reactivation 1.0000 0.0013 1.0000
P value adjustment method: bonferroni
The numbers it lists are the p-values for independent-samples t-tests between each pair of groups, adjusted with the Bonferroni correction. You’ll notice that several of them are listed as 1: adjusted p-values are capped at 1, and those comparisons provide no evidence of a difference between the groups. Try running the pairwise.t.test() function on either of the two tests you ran in #2.
pairwise.t.test(x = tetris$Days_One_to_Seven_Number_of_Intrusions,
g = tetris$Condition,
p.adj = "BH")
Pairwise comparisons using t tests with pooled SD
data: tetris$Days_One_to_Seven_Number_of_Intrusions and tetris$Condition
Control Reactivation+Tetris Tetris
Reactivation+Tetris 0.020 - -
Tetris 0.378 0.126 -
Reactivation 0.794 0.021 0.451
P value adjustment method: BH
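One small note: the "BH" adjustment used here is the Benjamini-Hochberg (false-discovery-rate) method, which is not the same as the Holm or Bonferroni corrections. If you want to see every adjustment that pairwise.t.test() will accept, R stores the names in a built-in vector:

# All adjustment methods accepted by p.adj (from the stats package)
p.adjust.methods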
crt <- tetris %>%
filter(Condition == "Control" | Condition == "Reactivation+Tetris")
t.test(crt$Days_One_to_Seven_Number_of_Intrusions ~ crt$Condition)
Welch Two Sample t-test
data: crt$Days_One_to_Seven_Number_of_Intrusions by crt$Condition
t = 2.9893, df = 22.632, p-value = 0.006627
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.990331 5.454113
sample estimates:
mean in group Control mean in group Reactivation+Tetris
5.111111 1.888889
rrt <- tetris %>%
filter(Condition == "Reactivation" | Condition == "Reactivation+Tetris")
t.test(rrt$Days_One_to_Seven_Number_of_Intrusions ~ rrt$Condition)
Welch Two Sample t-test
data: rrt$Days_One_to_Seven_Number_of_Intrusions by rrt$Condition
t = -3.3228, df = 25.684, p-value = 0.002681
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-4.766996 -1.121893
sample estimates:
mean in group Reactivation+Tetris mean in group Reactivation
1.888889 4.833333
Post-hoc t-tests with Benjamini-Hochberg ("BH") adjusted p-values demonstrated a significant difference between Reactivation+Tetris and control participants, \(t(22.6)=2.99,p<.05\), and between Reactivation+Tetris and reactivation-only participants, \(t(25.7)=3.32,p<.05\), but not between Reactivation+Tetris and Tetris-only participants.
For whichever dependent variable (Days_One_to_Seven_Number_of_Intrusions or Number_of_Provocation_Task_Intrusions) you’ve been focused on, use group_by() and summarize() to calculate means and standard errors per Condition group, and then plot them using geom_col() and geom_errorbar(). Include labels and a theme.
graph <- tetris %>%
group_by(Condition) %>%
summarize(
mean_intrusions = mean(Days_One_to_Seven_Number_of_Intrusions),
n = n(),
sd = sd(Days_One_to_Seven_Number_of_Intrusions),
sem = sd / sqrt(n)
)
ggplot(graph, aes(x = Condition, y = mean_intrusions)) +
geom_col() +
geom_errorbar(aes(ymin = mean_intrusions - sem,
ymax = mean_intrusions + sem),
width = .4) +
theme_classic() +
labs(x = "Condition", y = "Mean Number of Intrusions,\nfrom Days 1 to 7",
title = "Number of intrusions per condition after intervention")
qf(.95, df1 = 3, df2 = 68)
[1] 2.739502
graph <- graph %>%
mutate(ci = sem * qf(.95, df1 = 3, df2 = 68))
ggplot(graph, aes(x = Condition, y = mean_intrusions)) +
geom_col() +
geom_errorbar(aes(ymin = mean_intrusions - ci,
ymax = mean_intrusions + ci),
width = .4) +
theme_classic() +
labs(x = "Condition", y = "Mean Number of Intrusions,\nfrom Days 1 to 7",
title = "Number of intrusions per condition after intervention")
Have any feedback about the exercises? Let me know at the exit survey (select Lab 9).
For attribution, please cite this work as
Dainer-Best (2020, Oct. 30). PSY 203: Statistics for Psychology: One-way ANOVA (Lab 09) Exercises, Completed. Retrieved from https://faculty.bard.edu/jdainerbest/psy-203/posts/2020-10-30-lab-9-anova-completed/
BibTeX citation
@misc{dainer-best2020one-way,
  author = {Dainer-Best, Justin},
  title = {PSY 203: Statistics for Psychology: One-way ANOVA (Lab 09) Exercises, Completed},
  url = {https://faculty.bard.edu/jdainerbest/psy-203/posts/2020-10-30-lab-9-anova-completed/},
  year = {2020}
}