# Load the tidyverse and mosaic packages
library(tidyverse)
library(mosaic)
In 1972, 57 high school students were randomly assigned to one of three groups: one group saw a context-providing picture before reading a passage, one group saw the picture only after reading, and one group never saw the picture.
The students were asked to read the following passage and remember details from it:
If the balloons popped, the sound wouldn’t be able to carry since everything would be too far away from the correct floor. A closed window would also prevent the sound from carrying, since most buildings tend to be well insulated. Since the whole operation depends on a steady flow of electricity, a break in the middle of the wire would also cause problems. Of course, the fellow could shout, but the human voice is not loud enough to carry that far. An additional problem is that a string could break on the instrument. Then there could be no accompaniment to the message. It is clear that the best situation would involve less distance. Then there would be fewer potential problems. With face to face contact, the least number of things could go wrong.
Students were asked to rate how well they comprehended the passage (on a seven-point scale) and to recall as many ideas from it as they could. Our research question:
Does seeing the picture influence a student’s ability to comprehend and/or recall the passage?
For now, let’s focus on the comprehension scores. The code below shows how the data were imported.
# We'll download the data from my website
ambiguous <- read.csv("http://www.bradthiessen.com/html5/data/ambiguousprose.csv")
# Look at the first several rows to get a sense of the data
head(ambiguous)
# # A tibble: 6 × 2
# Condition Comprehension
# * <fctr> <int>
# 1 After 6
# 2 After 5
# 3 After 4
# 4 After 2
# 5 After 1
# 6 After 3
# Get summary statistics
ambiguous %>%
group_by(Condition) %>%
summarize(n = n(),
mean = mean(Comprehension),
sd = sd(Comprehension))
# # A tibble: 3 × 4
# Condition n mean sd
# <fctr> <int> <dbl> <dbl>
# 1 After 19 3.210526 1.397575
# 2 Before 19 4.947368 1.311220
# 3 None 19 3.368421 1.256562
ambiguous %>%
summarize(n = n(),
mean = mean(Comprehension),
sd = sd(Comprehension))
# # A tibble: 1 × 3
# n mean sd
# <int> <dbl> <dbl>
# 1 57 3.842105 1.521154
Condition | n | mean | sd |
---|---|---|---|
After | 19 | 3.211 | 1.398 |
Before | 19 | 4.947 | 1.311 |
None | 19 | 3.368 | 1.257 |
Combined | N = 57 | M = 3.842 | s = 1.521 |
ambiguous %>%
ggplot(aes(x = Condition, y = Comprehension)) +
geom_dotplot(aes(fill=Condition), binaxis = "y", binpositions="all", color = "white") +
scale_y_continuous(breaks=seq(0, 7, 1), minor_breaks=NULL) +
theme(axis.title.x = element_text(size = 12, color = "#777777")) +
theme(axis.text.x = element_text(size = 12)) +
theme(axis.title.y = element_text(size = 12, color="#777777")) +
theme(axis.text.y = element_text(size = 12)) +
theme(legend.position="none") +
labs(
title = "Comprehension by Condition"
) +
coord_flip(xlim = NULL, ylim = NULL, expand = TRUE)
ambiguous %>%
ggplot(aes(x = Condition, y = Comprehension)) +
geom_dotplot(aes(fill=Condition), binaxis = "y", binpositions="all", color = "white", alpha=0.5) +
geom_boxplot(fill = "white", color = "black", alpha = 0.6) +
scale_y_continuous(breaks=seq(0, 7, 1), minor_breaks=NULL) +
theme(axis.title.x = element_text(size = 12, color = "#777777")) +
theme(axis.text.x = element_text(size = 12)) +
theme(axis.title.y = element_text(size = 12, color="#777777")) +
theme(axis.text.y = element_text(size = 12)) +
theme(legend.position="none") +
labs(
title = "Comprehension by Condition"
) +
coord_flip(xlim = NULL, ylim = NULL, expand = TRUE)
ambiguous %>%
ggplot(aes(x = Condition, y = Comprehension)) +
stat_summary(fun.y = mean, geom = "point") +
stat_summary(fun.y = mean, geom = "line", aes(group = 1),
colour = "Blue", linetype = "dashed") +
stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.2) +
labs(x = "Treatment", y = "Mean Comprehension Score")
If an independent samples t-test compares two group means, could we use multiple t-tests to compare all possible pairs of our three group means?
Running several t-tests inflates our overall chance of making a Type I error: each test carries its own error rate, so the familywise error rate grows with the number of tests. Despite this problem, let’s conduct multiple t-tests using the pairwise.t.test() command to run all possible t-tests for our dataset.
# Wait!!!!!!!!!!
# Why would R have a pairwise.t.test function if we're NOT
# supposed to conduct multiple t-tests? Good question.
# We'll come back to this when we learn about post hoc tests.
# To use this function, we need to identify the variables within
# the dataframe using the $ identifier.
# p.adjust.method="none" just means I don't want R to modify the p-value
# In the next lesson, we'll see why we might want to modify the p-value
pairwise.t.test(ambiguous$Comprehension, ambiguous$Condition, p.adjust.method="none")
#
# Pairwise comparisons using t tests with pooled SD
#
# data: ambiguous$Comprehension and ambiguous$Condition
#
# After Before
# Before 0.00017 -
# None 0.71444 0.00054
#
# P value adjustment method: none
If running multiple t-tests is problematic, could we conduct a single test of our three group means using randomization-based methods?
Sure! If we can identify a single test statistic that compares all three group means, we can use randomization- or theory-based methods.
To compare two means, we use: \(\overline{X}_{1}-\overline{X}_{2}\). If the two means are similar, the test statistic is near zero; as the means differ by a greater amount, the magnitude of the test statistic increases.
Using the means of our three groups (After = 3.211, Before = 4.947, None = 3.368), we can calculate:
\(SAD = |\overline{X}_{a}-\overline{X}_{b}|+|\overline{X}_{a}-\overline{X}_{n}|+|\overline{X}_{b}-\overline{X}_{n}|=3.474\)
\(MAD = \frac{|\overline{X}_{a}-\overline{X}_{b}|+|\overline{X}_{a}-\overline{X}_{n}|+|\overline{X}_{b}-\overline{X}_{n}|}{3}=1.158\)
The code below shows how to calculate these statistics in R, using the SAD() and MAD() functions from the mosaic package.
SAD( mean(Comprehension ~ Condition, data=ambiguous) )
# [1] 3.473684
MAD( mean(Comprehension ~ Condition, data=ambiguous) )
# [1] 1.157895
test_stat <- SAD( mean(Comprehension ~ Condition, data=ambiguous) )
We’ll store SAD = 3.4736842 as our test_stat.
# Eliminate scientific notation
options(scipen=999)
# Calculate number of randomizations (group ID does not matter)
ways <- ( choose(n=57, k=19) * choose(38, 19) * choose(19, 19) ) / factorial(3)
# Put comma separators into the number
format(ways, big.mark=",", scientific=FALSE)
# [1] "3,752,394,405,341,099,206,901,760"
We’re never going to list all those randomizations, but we can get a random, representative sample of them. For each randomization, we’ll calculate SAD. We’ll then plot our distribution of randomized SADs and determine the likelihood that our test statistic came from this distribution.
# Randomize our data 10,000 times (shuffling condition)
SADrand <- Do(10000) * SAD( mean(Comprehension ~ shuffle(Condition), data=ambiguous) )
# Here's one way to do this in dplyr
# SADrand <- Do(10000) * ambiguous %>%
# mutate(shuffled = sample(Comprehension)) %>%
# group_by(Condition) %>%
# summarize(shuffled_means = mean(shuffled)) %>%
# mutate(shuffled_SAD = SAD(shuffled_means)) %>%
# filter(Condition == "After") %>% # Eliminates triplicate results
# select(shuffled_SAD)
# Plot the distribution
# Here's the most straight-forward way to plot this
# histogram(~SAD, data=SADrand,
# xlim=c(0,5),
# width=.2,
# xlab="Possible mean differences assuming null hypothesis is true")
# ladd(panel.abline(v=test_stat)) # Add vertical line at test statistic
# Using ggplot2 with a density plot
ggplot(data = SADrand, aes(x = SAD)) +
geom_density(fill="lightblue", color="white", alpha = 0.8) +
annotate("segment", x = test_stat, xend = test_stat, y = 0, yend = .4, color = "red") +
annotate("text", x = test_stat, y = .45, label = "observed SAD = 3.474", color = "red") +
labs(
title = "Randomized SAD",
x = "SAD"
) +
scale_x_continuous(limits = c(0,5), breaks=seq(0, 5, 1), minor_breaks=NULL) +
theme(
axis.text.x = element_text(size = 11, color="grey10"),
legend.position = "none",
panel.grid.major.y = element_line(colour = "white"),
panel.grid.major.x = element_line(colour = "white", size=.15),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = "grey93")
)
# Estimate p-value
pvalue <- prop(SADrand$SAD >= test_stat)
pvalue
# TRUE
# 0.0008
The SAD statistic has a limitation: it ignores the variability within each group. To see why that matters, let’s compare our actual data (B) to simulated datasets with the same group means but smaller (A) and larger (C) standard deviations.
# A = Simulate data with same means, smaller standard deviations
dataA <- tibble(
Condition = as.factor(c(rep("After", 19), rep("Before", 19), rep("None",19))),
Comprehension = c(rnorm(19,3.211,0.5), rnorm(19,4.947,0.5), rnorm(19,3.368,0.5)),
dataset = c(rep("A", 57))
)
# B = actual data from study
dataB <- ambiguous
dataB$dataset = c(rep("B", 57))
# C = simulate data with same means, larger standard deviations
dataC <- tibble(
Condition = as.factor(c(rep("After", 19), rep("Before", 19), rep("None",19))),
Comprehension = c(rnorm(19,3.211,3), rnorm(19,4.947,3), rnorm(19,3.368,3)),
dataset = c(rep("C", 57))
)
# merge into a single data frame
data_sim <- bind_rows(dataA, dataB, dataC)
# Plot with dataset as facet
data_sim %>%
ggplot(aes(x = Condition, y = Comprehension)) +
geom_dotplot(aes(fill=Condition), binwidth=.5, binaxis = "y", binpositions="all", color = "white", alpha=0.9) +
scale_y_continuous(limits = c(-5, 9), breaks=c(3, 5), minor_breaks=NULL) +
stat_summary(fun.y = mean, geom = "point", size=3, aes(group=1)) +
theme(axis.title.x = element_text(size = 12, color = "#777777")) +
theme(axis.text.x = element_text(size = 12)) +
theme(axis.title.y = element_text(size = 12, color="#777777")) +
theme(axis.text.y = element_text(size = 12)) +
theme(legend.position="none") +
coord_flip(xlim = NULL, ylim = NULL, expand = FALSE) +
facet_grid(. ~ dataset)
# Warning: Removed 2 rows containing non-finite values (stat_bindot).
# Warning: Removed 2 rows containing non-finite values (stat_summary).
To address the limitations of SAD, our test statistic should consider the variability within the groups along with the variability among the group means.
Our test statistic, then, will be: \(F=\frac{\textrm{variance between group means}}{\textrm{variance within groups}}\)
We’ll call this process analysis of variance, or ANOVA. Despite its name, the goal of ANOVA is to compare means (by analyzing the variance between and within the groups).
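Before we formalize each piece, here’s a quick preview (a rough sketch using mosaic’s formula interface) showing that this ratio can be computed directly from the group means and group variances we already have:
# Rough preview of the F-statistic for our data
# Variance between the group means (scaled by the common group size, n = 19)
group_means <- mean(Comprehension ~ Condition, data = ambiguous)
var_between <- 19 * var(group_means)
# Variance within groups: pooled variance of the three groups
var_within <- sum( (19 - 1) * var(Comprehension ~ Condition, data = ambiguous) ) / (57 - 3)
# The ratio we'll soon call F
var_between / var_within
# [1] 10.01225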
# Null hypothesis is true
ggplot(data.frame(x = c(0, 8.5)), aes(x)) +
stat_function(fun = dnorm, args = list(mean = 3.6, sd = 1.25), color="red", size=1) +
stat_function(fun = dnorm, args = list(mean = 3.7, sd = 1.25), color="skyblue", size=1) +
stat_function(fun = dnorm, args = list(mean = 3.8, sd = 1.25), color="267", size=1) +
geom_dotplot(data = ambiguous, aes(x = Comprehension, y = 1, fill=Condition, color=Condition),
binwidth=.1, binaxis="x", stackgroups=TRUE, binpositions="all") +
scale_y_continuous(limits = c(0, .4), breaks=NULL, minor_breaks=NULL) +
scale_x_continuous(limits = c(-.5, 9), breaks=seq(1,7,1), minor_breaks=NULL) +
theme(axis.title.x = element_text(size = 12, color = "#777777")) +
theme(axis.text.x = element_text(size = 12)) +
theme(legend.position="none") +
labs(
title = "Null Hypothesis",
x = "comprehension score",
y="")
# Alternative hypothesis is true
ggplot(data.frame(x = c(0, 8.5)), aes(x)) +
stat_function(fun = dnorm, args = list(mean = 2, sd = .6), color="red", size=1) +
stat_function(fun = dnorm, args = list(mean = 3.7, sd = .6), color="skyblue", size=1) +
stat_function(fun = dnorm, args = list(mean = 5.4, sd = .6), color="267", size=1) +
geom_dotplot(data = ambiguous, aes(x = Comprehension, y = 1, fill=Condition, color=Condition),
binwidth=.1, binaxis="x", stackgroups=TRUE, binpositions="all") +
scale_y_continuous(limits = c(0, .75), breaks=NULL, minor_breaks=NULL) +
scale_x_continuous(limits = c(-.5, 9), breaks=seq(1,7,1), minor_breaks=NULL) +
theme(axis.title.x = element_text(size = 12, color = "#777777")) +
theme(axis.text.x = element_text(size = 12)) +
theme(legend.position="none") +
labs(
title = "Alternative Hypothesis",
x = "comprehension score",
y="") +
annotate("text", x = 2.0, y = .7, label = "after", color = "red") +
annotate("text", x = 3.7, y = .7, label = "none", color = "steelblue") +
annotate("text", x = 5.4, y = .7, label = "before", color = "267")
Let’s construct a model for this ambiguous prose study. Think about 3 subjects in this study, one from each condition: each subject’s comprehension score differs from the grand mean partly because of the condition they were assigned to and partly because of individual differences.
With ANOVA, we partition the total variation in our data (the differences among comprehension ratings from all subjects in the study) into two components: the variation between the groups (differences among the group means) and the variation within the groups (differences among subjects receiving the same treatment).
We can (roughly) visualize these sources of variation as:
# Null hypothesis is true
ggplot(data.frame(x = c(0, 8.5)), aes(x)) +
stat_function(fun = dnorm, args = list(mean = 3.6, sd = 1.25), color="red", size=1, alpha=0.5) +
stat_function(fun = dnorm, args = list(mean = 3.7, sd = 1.25), color="skyblue", size=1, alpha=0.5) +
stat_function(fun = dnorm, args = list(mean = 3.8, sd = 1.25), color="267", size=1, alpha=0.5) +
scale_y_continuous(limits = c(0, .4), breaks=NULL, minor_breaks=NULL) +
scale_x_continuous(limits = c(-.5, 9), breaks=seq(1,7,1), minor_breaks=NULL) +
theme(axis.title.x = element_text(size = 12, color = "#777777")) +
theme(axis.text.x = element_text(size = 12)) +
theme(legend.position="none") +
labs(
title = "Null Hypothesis",
x = "comprehension score",
y="") +
annotate("segment", x = 3.6, xend = 3.8, y = .333, yend = .333, color = "black", size=1.5) +
annotate("text", x = 3.7, y = .35, label = "between groups variation", color = "black") +
annotate("segment", x = 2.1, xend = 5.3, y = .12, yend = .12, color = "black", size=1.5) +
annotate("text", x = 3.7, y = .14, label = "within groups variation", color = "black") +
annotate("segment", x = 0, xend = 8, y = .0, yend = .0, color = "black", size=1.5) +
annotate("text", x = 3.7, y = .02, label = "total variation", color = "black")
# Alternative hypothesis is true
ggplot(data.frame(x = c(0, 8.5)), aes(x)) +
stat_function(fun = dnorm, args = list(mean = 2, sd = .6), color="red", size=1, alpha=0.5) +
stat_function(fun = dnorm, args = list(mean = 3.7, sd = .6), color="skyblue", size=1, alpha=0.5) +
stat_function(fun = dnorm, args = list(mean = 5.4, sd = .6), color="267", size=1, alpha=0.5) +
scale_y_continuous(limits = c(0, .75), breaks=NULL, minor_breaks=NULL) +
scale_x_continuous(limits = c(-.5, 9), breaks=seq(1,7,1), minor_breaks=NULL) +
theme(axis.title.x = element_text(size = 12, color = "#777777")) +
theme(axis.text.x = element_text(size = 12)) +
theme(legend.position="none") +
labs(
title = "Alternative Hypothesis",
x = "comprehension score",
y="") +
annotate("segment", x = 2, xend = 5.4, y = .69, yend = .69, color = "black", size=1.5) +
annotate("text", x = 3.7, y = .73, label = "between groups variation", color = "black") +
annotate("segment", x = 3, xend = 4.4, y = .3, yend = .3, color = "black", size=1.5) +
annotate("text", x = 3.7, y = .33, label = "within", color = "black") +
annotate("segment", x = 0, xend = 8, y = .0, yend = .0, color = "black", size=1.5) +
annotate("text", x = 3.7, y = .02, label = "total variation", color = "black")
Recall the formula for the unbiased estimate of the population variance:
\(s^{2}=\frac{\sum_{i=1}^{n}(x_{i}-\overline{X})^{2}}{n-1}=\frac{\textrm{sum of squared deviations from the mean}}{\textrm{degrees of freedom}}=\frac{SS}{df}=\textrm{mean square}\)
\(\mathbf{\textrm{MS}_{\textrm{Total}}=\frac{\textrm{SS}_{\textrm{total}}}{\textrm{df}_{\textrm{total}}}=\frac{\sum_{i=1}^{n}(\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ -\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ )^{2}}{(\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ -\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ )}}=\)
Calculation:
SStotal <- sum( ( ambiguous$Comprehension - mean(ambiguous$Comprehension) )^2 )
SStotal
# [1] 129.5789
dftotal <- length(ambiguous$Comprehension) - 1
dftotal
# [1] 56
MStotal <- SStotal / dftotal
MStotal
# [1] 2.31391
# Notice MStotal is just the variance of all the data
var(ambiguous$Comprehension)
# [1] 2.31391
\(\mathbf{\textrm{MS}_{\textrm{Error}}=\frac{\textrm{SS}_{\textrm{error}}}{\textrm{df}_{\textrm{error}}}=\frac{\sum(\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ -\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ )^{2}}{(\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ -\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ )}}=\)
Calculation:
SSerror <- sum( (19 - 1) * var(Comprehension ~ Condition, data=ambiguous) )
SSerror
# [1] 94.52632
dferror <- length(ambiguous$Comprehension) - 3
dferror
# [1] 54
MSerror <- SSerror / dferror
MSerror
# [1] 1.750487
\(\mathbf{\textrm{MS}_{\textrm{A}}=\mathbf{\textrm{MS}_{\textrm{treatment}}}=\frac{\textrm{SS}_{\textrm{A}}}{\textrm{df}_{\textrm{A}}}=\frac{\sum\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ -\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ )^{2}}{(\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ -\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ )}}\)
Calculation:
# This is not at all the easiest way to calculate this...
SSphoto <- as.numeric(
ambiguous %>%
group_by(Condition) %>%
summarize(squaredmeandiff = (mean(Comprehension)-mean(ambiguous$Comprehension))^2) %>%
mutate(ssphoto = 19 * sum(squaredmeandiff)) %>%
filter(Condition=="After") %>%
select(ssphoto)
)
SSphoto
# [1] 35.05263
# Check to see if it's the same as SStotal - SSerror
SStotal - SSerror
# [1] 35.05263
dfphoto <- n_distinct(ambiguous$Condition) - 1
dfphoto
# [1] 2
MSphoto <- SSphoto / dfphoto
MSphoto
# [1] 17.52632
For our data:
Source | SS | df | MS | MSR |
---|---|---|---|---|
Treatment | 35.053 | 2 | 17.526 | ? |
Error | 94.526 | 54 | 1.75 | ? |
Total | 129.579 | 56 | 2.314 | ? |
\(\mathbf{\textrm{SS}_{\textrm{total}}}=\sum \sum (x_{i}-M)^{2}=\sum \sum \left [ \left ( \overline{X}_{a}- M\right )+\left ( x_{i}-\overline{X}_{a} \right ) \right ]^{2}\)
\(\mathbf{\textrm{SS}_{\textrm{total}}}=\sum \sum\left ( \overline{X}_{a}- M\right )^{2}+\sum \sum\left ( x_{i}-\overline{X}_{a} \right )^{2}+2\sum \sum\left ( \overline{X}_{a}- M\right )\left ( x_{i}-\overline{X}_{a} \right )\)
Since \(\sum \left ( x_{i}-\overline{X}_{a} \right )=0\) within each group, the cross-product term vanishes:
\(\mathbf{\textrm{SS}_{\textrm{total}}}=\sum n_{a}\left ( \overline{X}_{a}- M\right )^{2}+\sum \sum\left ( x_{i}-\overline{X}_{a} \right )^{2}\)
\(\mathbf{\textrm{SS}_{\textrm{total}}}=\mathbf{\textrm{SS}_{\textrm{A}}}+\mathbf{\textrm{SS}_{\textrm{E}}}\)
If mean square is a fancy term for variance, MSerror must represent the variance within a single group.
If we have 3 groups in our dataset, how can MSE represent the variance of a single group? Which group do we “choose” when we calculate MSE?
If we assume our data come from groups with equal population variances, we don’t have to choose a group. Instead, we can calculate a pooled variance across all our groups. In this sense, MSE represents the average of the group variances.
The concept of an average, or pooled, variance should sound familiar. We’ve used \(s_{\textrm{pooled}}\) to conduct independent samples t-tests (assuming equal variances). If we take the formula for \(s_{\textrm{pooled}}^{2}\) and extend it to 3 groups, we have:
\(s_{\textrm{pooled}}^{2}=\frac{\left ( n_{1}-1 \right )s_{1}^{2}+\left ( n_{2}-1 \right )s_{2}^{2}+\left ( n_{3}-1 \right )s_{3}^{2}}{\left ( n_{1}-1 \right )+\left ( n_{2}-1 \right )+\left ( n_{3}-1 \right )}\)
\(s_{\textrm{pooled}}^{2}=\frac{\left ( n_{1}-1 \right )\frac{\sum\left ( x_{i1}-\overline{X}_{1} \right )^{2}}{n_{1}-1} + \left ( n_{2}-1 \right )\frac{\sum\left ( x_{i2}-\overline{X}_{2} \right )^{2}}{n_{2}-1} + \left ( n_{3}-1 \right )\frac{\sum\left ( x_{i3}-\overline{X}_{3} \right )^{2}}{n_{3}-1}}{\left ( n_{1}-1 \right )+\left ( n_{2}-1 \right )+\left ( n_{3}-1 \right )}\)
\(s_{\textrm{pooled}}^{2}=\frac{\sum \left ( x_{i1}-\overline{X}_{1} \right )^{2} + \sum \left ( x_{i2}-\overline{X}_{2} \right )^{2} + \sum \left ( x_{i3}-\overline{X}_{3} \right )^{2}}{\left ( n_{1}-1 \right )+\left ( n_{2}-1 \right )+\left ( n_{3}-1 \right )}\)
\(s_{\textrm{pooled}}^{2}=\frac{\sum \sum \left ( x_{ia}-\overline{X}_{a} \right )^{2}}{N-a}=\textrm{MS}_{\textrm{E}}\)
This is the logic behind ANOVA. Under a true null hypothesis, both MSphoto and MSerror represent unbiased estimates of the variance within a group. When the null hypothesis is false, MSphoto becomes larger.
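Since our three groups are the same size, we can verify this in R with one line: the simple average of the three group variances equals the MSerror we computed earlier.
# With equal group sizes, the pooled variance is just the
# average of the three group variances
mean( var(Comprehension ~ Condition, data = ambiguous) )
# [1] 1.750487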
Complete the following ANOVA summary table and estimate the p-value. We can estimate the p-value directly in R or use the StatKey F-distribution applet.
Source | SS | df | MS | MSR |
---|---|---|---|---|
Treatment | 35.053 | 2 | 17.53 | MSR = |
Error | 94.526 | 54 | 1.75 | p = |
Total | 129.579 | 56 | 2.31 | \(\eta^2=\) |
A quick look at an F-distribution table will give you a sense of what values of the MSR will yield low p-values.
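To check your table, you can compute the mean square ratio and its p-value directly in R, reusing the MSphoto and MSerror objects we created above:
# Mean square ratio (our F-statistic)
MSR <- MSphoto / MSerror
MSR
# [1] 10.01225
# p-value = area to the right of MSR under an F-distribution
# with df1 = 2 (treatment) and df2 = 54 (error)
pf(MSR, df1 = 2, df2 = 54, lower.tail = FALSE)
# [1] 0.0002002136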
Conducting an ANOVA in R is simple, as the code below shows.
# Our data frame is "ambiguous"
# Our model is: Comprehension ~ Condition
# We use aov() to conduct the analysis of variance
# I'll store the results in a list called "model"
model <- aov( Comprehension ~ Condition, data = ambiguous )
# The default output isn't too helpful
# It only shows SS and df, with a residual standard error
model
# Call:
# aov(formula = Comprehension ~ Condition, data = ambiguous)
#
# Terms:
# Condition Residuals
# Sum of Squares 35.05263 94.52632
# Deg. of Freedom 2 54
#
# Residual standard error: 1.32306
# Estimated effects may be unbalanced
# We can use the anova() function to produce more useful output
anova(model)
# # A tibble: 2 × 5
# Df `Sum Sq` `Mean Sq` `F value` `Pr(>F)`
# * <int> <dbl> <dbl> <dbl> <dbl>
# 1 2 35.05263 17.526316 10.01225 0.0002002136
# 2 54 94.52632 1.750487 NA NA
# Verify that these values match our calculations
# The aov() function produces much more output than we're seeing
# We can glance at some of this output with the broom package
# The following code will install the broom package (if necessary)
list.of.packages <- c("broom")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)
library(broom)
tidy(model)
# # A tibble: 2 × 6
# term df sumsq meansq statistic p.value
# <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 Condition 2 35.05263 17.526316 10.01225 0.0002002136
# 2 Residuals 54 94.52632 1.750487 NA NA
glance(model)
# # A tibble: 1 × 11
# r.squared adj.r.squared sigma statistic p.value df logLik
# <dbl> <dbl> <dbl> <dbl> <dbl> <int> <dbl>
# 1 0.2705118 0.2434937 1.32306 10.01225 0.0002002136 3 -95.29557
# # ... with 4 more variables: AIC <dbl>, BIC <dbl>, deviance <dbl>,
# # df.residual <int>
# By the end of this course, you'll understand what all those
# terms mean: r.squared, logLik, AIC, deviance
We can define an effect size (eta-squared) as: \(\eta ^{2}=\frac{\textrm{SS}_\textrm{A}}{\textrm{SS}_\textrm{T}}=\frac{35.05}{129.58}=0.27\)
Eta-squared is based entirely on sums of squares from the sample and no adjustment is made to estimate the effect size for the population. As we’ll learn later in this course, our models typically overfit our sample data.
As an adjusted effect size, we can calculate omega-squared: \(\omega ^{2}=\frac{\textrm{SS}_\textrm{A}-\left (\textrm{df}_{\textrm{A}} \right )\left ( \textrm{MS}_\textrm{E} \right )}{\textrm{SS}_\textrm{T}+\textrm{MS}_\textrm{E}}\)
For our ambiguous prose data: \(\omega ^{2}=\frac{35.05- \left (2 \right )\left ( 1.75 \right )}{129.58+1.75}=0.24\)
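Both effect sizes are easy to compute from the sums of squares we stored earlier:
# Eta-squared: proportion of total variation explained by Condition
SSphoto / SStotal
# [1] 0.2705118
# Omega-squared: adjusted estimate of the population effect size
(SSphoto - dfphoto * MSerror) / (SStotal + MSerror)
# [1] 0.2402482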
To assess homogeneity of variances, we can use the Fmax test or Levene’s test.
The Fmax test, which should only be applied if data for each group are normally distributed, is something you can quickly estimate in your head. To do so, you calculate the following statistic and compare it to the Fmax distribution.
\(F_{max}=\frac{s^2_\mathrm{{biggest }}}{s^2_\mathrm{{smallest }}}=\frac{1.398^2}{1.257^2}=1.237\sim F_{max}\)
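A quick way to compute this in R, using the group variances from mosaic’s formula interface:
# Fmax: largest group variance divided by smallest group variance
group_vars <- var(Comprehension ~ Condition, data = ambiguous)
max(group_vars) / min(group_vars)
# [1] 1.237038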
Levene’s test, which is more appropriate if your data do not follow a normal distribution, is calculated by first converting every score to:
\(z_{ai}=\left |x_{ai}-\overline{X}_a \right |\)
The test statistic is calculated as:
\(W=\frac{\left ( N-a \right )}{a-1}\frac{\sum n_{a}\left ( \overline{z}_{a}-\overline{\overline{z}} \, \right )^2}{\sum \sum \left ( z_{ia}- \overline{z}_{a}\right )^2}\sim{F}^{a-1}_{N-a}\)
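Despite the intimidating formula, W is just the one-way ANOVA F-statistic computed on those absolute deviations. Here’s a minimal sketch of the calculation by hand; its F value matches the leveneTest() output below:
# Levene's test by hand: ANOVA on absolute deviations from group means
ambiguous_z <- ambiguous %>%
group_by(Condition) %>%
mutate(z = abs(Comprehension - mean(Comprehension))) %>%
ungroup()
anova( aov(z ~ Condition, data = ambiguous_z) )
# The F value (0.0847) matches leveneTest() with center = mean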
The code below shows how to conduct Levene’s test in R.
# We need the 'car' package
list.of.packages <- c("car")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)
library(car)
# Levene Test
leveneTest(Comprehension ~ Condition, data = ambiguous, center = mean)
# # A tibble: 2 × 3
# Df `F value` `Pr(>F)`
# * <int> <dbl> <dbl>
# 1 2 0.08474171 0.9188715
# 2 54 NA NA
# Brown-Forsythe Test (uses medians instead of means)
leveneTest(Comprehension ~ Condition, data = ambiguous, center = median)
# # A tibble: 2 × 3
# Df `F value` `Pr(>F)`
# * <int> <dbl> <dbl>
# 1 2 0.02432432 0.9759798
# 2 54 NA NA
There are several ways to test for normality, including visual methods (such as Q-Q plots), the Shapiro-Wilk test, and the energy test. These methods (except for the energy test) are easy to understand. The code below shows how to conduct the Shapiro-Wilk test in R.
# The function is called shapiro.test(). Applied to a full column, it
# tests for normality of all the data in our data frame (ignoring the groups)
shapiro.test(ambiguous$Comprehension)
#
# Shapiro-Wilk normality test
#
# data: ambiguous$Comprehension
# W = 0.94333, p-value = 0.009909
# If we want to apply (tapply) the shapiro.test to all our groups, we use:
tapply(ambiguous$Comprehension, ambiguous$Condition, shapiro.test)
# $After
#
# Shapiro-Wilk normality test
#
# data: X[[i]]
# W = 0.94309, p-value = 0.2995
#
#
# $Before
#
# Shapiro-Wilk normality test
#
# data: X[[i]]
# W = 0.90423, p-value = 0.05805
#
#
# $None
#
# Shapiro-Wilk normality test
#
# data: X[[i]]
# W = 0.94886, p-value = 0.3779
# We could then look at a Q-Q plot for that "before" group
# qqnorm(ambiguous$Comprehension[ambiguous$Condition=="Before"], ylab = "Essay scores"); qqline(ambiguous$Comprehension[ambiguous$Condition=="Before"])
What can we do if the conditions aren’t satisfied (or if the assumptions aren’t reasonable)? We’ll deal with this issue all semester. Some options include transforming the data, using randomization-based methods, or using tests that relax the assumptions.
Recall from the first lesson that if we want to conduct a t-test without assuming equal variances, we could use the Welch-Satterthwaite approximation. We can do the same in ANOVA, as the code below shows.
# Notice the degrees of freedom differ from our "regular" ANOVA
oneway.test(Comprehension ~ Condition, data=ambiguous, var.equal=FALSE)
#
# One-way analysis of means (not assuming equal variances)
#
# data: Comprehension and Condition
# F = 9.8552, num df = 2.000, denom df = 35.933, p-value = 0.0003871
We’ve seen how to conduct randomization-based tests of the SAD statistic (which do not account for variance within each group). We’ve also seen how to conduct an ANOVA (which requires conditions of normality and equal variances). Can we get the best of both worlds?
Let’s see if we can conduct a randomization-based test using the F-statistic (mean square ratio) as our test statistic. The code below shows how this is done.
# We'll store our observed mean square ratio of 10.012 as MSRobserved
# I could have used: MSRobserved <- 10.012
MSRobserved <- tidy(aov(Comprehension~Condition, data=ambiguous))$statistic[1]
# 10,000 replications, shuffling the Condition each time
MSRrandomized <- Do(10000) * tidy(aov(Comprehension ~ shuffle(Condition), data=ambiguous))$statistic[1]
# Plot results
ggplot(data = MSRrandomized, aes(x = result)) +
geom_density(fill="lightblue", color="white", alpha = 0.8) +
annotate("segment", x = MSRobserved, xend = MSRobserved, y = 0, yend = .2, color = "red") +
annotate("text", x = MSRobserved, y = .25, label = "observed MSR = 10.012", color = "red") +
labs(
title = "Randomized mean square ratios",
x = "mean square ratios"
) +
scale_x_continuous(limits = c(0,12), breaks=seq(0, 12, 2), minor_breaks=NULL) +
theme(
axis.text.x = element_text(size = 11, color="grey10"),
legend.position = "none",
panel.grid.major.y = element_line(colour = "white"),
panel.grid.major.x = element_line(colour = "white", size=.15),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = "grey93")
)
# Estimate p-value
pvalue <- prop(MSRrandomized$result >= MSRobserved)
pvalue
# TRUE
# 0.0002
Let’s (finally) finish things up by looking at the relationship between the F-statistic (mean square ratio) in an ANOVA and the t-statistic in a t-test:
\(t_{\textrm{n}_1+\textrm{n}_2-2}=\frac{\overline{X}_{1}-\overline{X}_{2}}{\sqrt{\frac{1}{\textrm{n}_1}+\frac{1}{\textrm{n}_2}}\sqrt{\frac{\left ( \textrm{n}_1-1 \right )s_{1}^{2}+\left ( \textrm{n}_2-1 \right )s_{2}^{2}}{\textrm{n}_1+\textrm{n}_2-2}}}=\frac{\overline{X}_{1}-\overline{X}_{2}}{\sqrt{\frac{1}{\textrm{n}_1}+\frac{1}{\textrm{n}_2}}\sqrt{s_{pooled}^{2}}}\)
\(t_{{\textrm{n}_1+\textrm{n}_2-2}}^{2}=\frac{\left (\overline{X}_{1}-\overline{X}_{2} \right )^{2}}{\left ( \frac{1}{n_{1}}+ \frac{1}{n_{2}}\right )s_{pooled}^{2}}=\frac{\frac{n_{1}n_{2}}{n_{1}+n_{2}}\left (\overline{X}_{1}-\overline{X}_{2} \right )^{2}}{s_{pooled}^{2}}=\frac{n_{1}\left (\overline{X}_{1}-M \right )^{2}+n_{2}\left (\overline{X}_{2}-M \right )^{2}}{s_{pooled}^{2}}\)
\(t_{{\textrm{n}_1+\textrm{n}_2-2}}^{2}=\frac{\sum n_{a}\left (\overline{X}_{a}-M \right )^{2}/1}{\textrm{MS}_{\textrm{E}}}=\frac{\textrm{SS}_{\textrm{A}}/\textrm{df}_{\textrm{A}}}{\textrm{MS}_{\textrm{E}}}=\frac{\textrm{MS}_{\textrm{A}}}{\textrm{MS}_{\textrm{E}}}=F\)
# Recreate the spider dataset from lesson #1
spider <- tibble(
group = c( rep("photo", 12), rep("real", 12) ),
anxiety = c(30, 35, 45, 40, 50, 35, 55, 25, 30, 45, 40, 50,
40, 35, 50, 55, 65, 55, 50, 35, 30, 50, 60, 39))
# Run a t-test
t.test(anxiety ~ group, data=spider, alternative = c("less"), var.equal = TRUE, conf.level = 0.95)
#
# Two Sample t-test
#
# data: anxiety by group
# t = -1.6813, df = 22, p-value = 0.05342
# alternative hypothesis: true difference in means is less than 0
# 95 percent confidence interval:
# -Inf 0.1490421
# sample estimates:
# mean in group photo mean in group real
# 40 47
# Store the value of the t-statistic (-1.68)
tstat <- t.test(anxiety ~ group, data=spider, alternative = c("less"), var.equal = TRUE, conf.level = 0.95)$statistic
# Run an ANOVA on the same data
anova(aov(anxiety ~ group, data=spider))
# # A tibble: 2 × 5
# Df `Sum Sq` `Mean Sq` `F value` `Pr(>F)`
# * <int> <dbl> <dbl> <dbl> <dbl>
# 1 1 294 294 2.826923 0.1068392
# 2 22 2288 104 NA NA
# Notice the F-statistic = 2.8269
# Is that the same as the tstat squared? Let's see...
tstat^2
# t
# 2.826923
anova( aov(Comprehension ~ Condition, data = ambiguous) )
For the homework exercises, the diet dataframe has been loaded. It contains two variables: Diet (a factor variable identifying which diet each subject is assigned to) and Weightloss (the number of pounds lost by each subject by the end of the study). Once you’re satisfied with your solutions, email that html file to thiessenbradleya@sau.edu or give me a printed copy.
If you run into any problems, email me or work with other students in the class.
This document is released under a Creative Commons Attribution-ShareAlike 3.0 Unported license.