Does a drug designed to lower cholesterol levels actually work?
To study this, researchers randomly select 40 adult males with high cholesterol and randomly assign them to one of two groups:
• 20 subjects receive the drug
• 20 subjects receive a placebo
After a 6-month regimen of taking the drug or placebo, the subjects’ cholesterol levels were measured.
Here’s a summary of the cholesterol levels of the 40 subjects:
With this data, you could calculate a test statistic for an independent samples t-test (assuming equal variances:
with
and a critical value of \(t_{df=38}^{\alpha =0.025}=2.024\).
You publish the results of this study and receive the following feedback:
It is widely known that obese individuals have higher cholesterol than those who are non-obese. Because your drug group could have simply had more obese subjects than your placebo group, I question the results of your study.
We’re going to learn how to conduct an ANOVA when we have 2 independent variables (drug and obesity) and a single dependent variable (cholesterol). When we’re able to randomly assign subjects to levels of each independent variable, we have an AxB ANOVA (a.k.a. a two-way ANOVA or factorial design).
Before we get into the concepts and calculations, let’s look at some hypothetical results from our cholesterol study.
In an AxB ANOVA, we’ll estimate two classes of effects: main effects and interaction effects.
A main effect for a factor (one independent variable) is the overall effect it has on the dependent variable consistently across the levels of the other factor (other independent variable).
In this scenario, for example: The main effect of the drug is the effect it has on cholesterol consistently for obese and non-obese individuals.
Scenario | Drug effect | Obesity effect | Sum of effects |
---|---|---|---|
A : | |||
B : | |||
C : | |||
D : |
The other type of effect we’ll estimate is the interaction effect. Interaction is present when the two independent variables in combination produce effects that are different from the sum of their main effects.
Interaction means the effect of one independent variable depends on the other independent variable. In this study, an interaction effect would mean the effect of the drug on cholesterol depends on the obesity of the subject.
In Scenario B, we calculated the main effects for the drug and obesity to each be 5. The sum of those effects is 10. Suppose we take a subject with an average cholesterol level of 93 (the grand mean). If we have no interaction, then we’d expect that subject to have a cholesterol level of 83 (10 points lower) if that subject was given the drug and was non-obese. This is what happens in this scenario with no interaction (and we can see this in the parallel lines in our means plot).
In Scenario D, the main effects were both zero. We’d expect a non-obese subject given the drug would have the same cholesterol level as the grand mean. That’s not what the results in the scenario show. The combination of the effects of our independent variables produces an effect different from what we expect from each independent variable separately. We can tell we have interaction because the lines in our means plot are not parallel.
We can write the formal model as: \(y_{\textrm{ijk}}=\mu +\alpha _{j}+\beta _{k}+\alpha \beta _{jk}+\epsilon _{ijk}\),
where: \(\alpha _{j}=\mu_{Aj}-\mu, \ \ \ \beta _{k}=\mu_{Bk}-\mu, \ \ \ \alpha \beta _{jk}=\mu_{AjBk}-\mu_{Aj}-\mu_{Bk}+\mu, \ \ \ \epsilon _{ijk}=y_{ijk}-\mu_{AjBk}\)
The following table displays the data from this cholesterol study.
Let’s partition the total variation in cholesterol levels of the subjects in this study.
Once we calculate the SS values, we’ll convert them to MS values by dividing by their degrees of freedom. Then, we’ll compare the mean squares by taking ratios.
Here’s a visual display of how the total variation is partitioned in a (one-way) ANOVA and an AxB ANOVA:
I used R to get the following calculations:
Here’s how I calculated this in R:
# Load data as "chol1"
chol1 <- read.csv("http://www.bradthiessen.com/html5/data/cholesterol1.csv")
# Construct the model: y ~ x1 + x2 + x1x2
# In R, we use: y ~ x1 * x2 (or: y ~ x1 + x2 + x1:x2)
model1 <- aov(cholesterol ~ drug * obese, data=chol1)
# Summarize results (results differ because of rounding errors)
anova(model1)
## Analysis of Variance Table
##
## Response: cholesterol
## Df Sum Sq Mean Sq F value Pr(>F)
## drug 1 1030.63 1030.63 14.0653 0.0006203 ***
## obese 1 1607.06 1607.06 21.9320 3.94e-05 ***
## drug:obese 1 38.49 38.49 0.5253 0.4732558
## Residuals 36 2637.89 73.27
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
After conducting the AxB ANOVA, we should check our assumptions/conditions:
# Construct the model again, this time with "lm" instead of "aov"
# There's really no difference except how R stores the results
model1 <- lm(cholesterol ~ drug * obese, data=chol1)
# Construct some plots of the model
mplot(model1)
## [[1]]
##
## [[2]]
##
## [[3]]
##
## [[4]]
# Construct one more plot of the model
plotModel(model1)
## Warning in draw.key(simpleKey(...), draw = FALSE): not enough rows for
## columns
Consider the same study with slightly different results. I changed the data for the non-obese, drug group:
Let’s load this data into R:
# Load the data as chol2
chol2 <- read.csv("http://www.bradthiessen.com/html5/data/cholesterol.csv")
# Convert the treatment variables into factors
chol2$obese <- as.factor(chol2$obese)
chol2$drug <- as.factor(chol2$drug)
chol2$Group <- as.factor(chol2$Group)
# Let's look at the structure of our dataset
str(chol2)
## 'data.frame': 40 obs. of 4 variables:
## $ obese : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
## $ drug : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
## $ cholest: num 92.7 85.1 76.9 107.2 77.6 ...
## $ Group : Factor w/ 4 levels "1","2","3","4": 1 1 1 1 1 1 1 1 1 1 ...
# Let's look at the distribution of cholesterol in each group
densityplot(~cholest | drug * obese, data=chol2,
main="Cholesterol by drug and obesity",
xlab="Cholesterol",
layout=c(2,2))
# Here's the syntax for an interaction plot
# x.factor = the variable to be used as the x-axis
# trace.factor = the variable to represent the lines
# response = the dependent variable
interaction.plot(x.factor = chol2$drug, trace.factor = chol2$obese,
response = chol2$cholest, trace.label = c("Obesity"),
xlab = "Treatment (0 = placebo, 1 = drug)",
ylab = "Mean cholesterol level",
col = "steelblue", lw=4)
# Levene's test for homogeneity of variances
leveneTest(cholest ~ drug * obese, data=chol2)
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 3 0.4721 0.7036
## 36
# Shapiro-Wilk's test for normality (one for each of our 4 groups)
grp1 <- round(shapiro.test(chol2$cholest[chol2$Group=="1"])$p.value,3)
grp2 <- round(shapiro.test(chol2$cholest[chol2$Group=="2"])$p.value,3)
grp3 <- round(shapiro.test(chol2$cholest[chol2$Group=="3"])$p.value,3)
grp4 <- round(shapiro.test(chol2$cholest[chol2$Group=="4"])$p.value,3)
# Paste the p-values from the Shapiro-Wilk tests
paste("Group 1: p =", grp1, ". Group 2: p =", grp2, ". Group 3: p =", grp3, ". Group 4: p =", grp4)
## [1] "Group 1: p = 0.109 . Group 2: p = 0.59 . Group 3: p = 0.261 . Group 4: p = 0.192"
Let’s run the AxB ANOVA and take a look at the summary table:
# Run the ANOVA model: y ~ a * b
model2 <- aov(cholest ~ drug * obese, data=chol2)
# Print the summary table
anova(model2)
## Analysis of Variance Table
##
## Response: cholest
## Df Sum Sq Mean Sq F value Pr(>F)
## drug 1 46.34 46.34 0.6322 0.431743
## obese 1 752.82 752.82 10.2704 0.002831 **
## drug:obese 1 355.24 355.24 4.8465 0.034197 *
## Residuals 36 2638.77 73.30
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Let’s see some of the model plots:
# Print model plots
model2 <- lm(cholest ~ drug * obese, data=chol2)
mplot(model2)
## [[1]]
##
## [[2]]
##
## [[3]]
##
## [[4]]
A significant interaction means the effect of the drug on cholesterol depends on obesity. Because of this, it doesn’t make sense to think about the “drug effect” without considering obesity at the same time.
When we find a significant interaction, we can split our data and investigate the simple effects.
Because we’re most interested in the effect of the drug, let’s split our data into obese and non-obese groups. To do this in R, we can use:
# Select all data where obese == 0 in the chol2 data.frame
# Store the results in a data.frame called "nonobese"
# The blank space to the right of the comma in "0, ]"
# tells R to select ALL the rows of data
nonobese <- chol2[chol2$obese==0, ]
# Select all data where obese == 1
obese <- chol2[chol2$obese==1, ]
# Note that we could separate our data using dplyr
# The syntax is a little easier to read from left-to-right
# nonobese <- chol2 %>% filter(obese==0)
# obese <- chol2 %>% filter(obese==1)
Here’s a summary of those split datasets:
Let’s conduct t-tests to compare the drug and placebo group means. First, we’ll run the test for the non-obese subjects; then we’ll run the test for obese subjects:
# I'll run a t-test without the equal variance assumption
# Here's the test for nonobese subjects
t.test(cholest ~ drug, data=nonobese, var.equal=FALSE, alternative = c("less"))
##
## Welch Two Sample t-test
##
## data: cholest by drug
## t = -0.94313, df = 17.766, p-value = 0.1791
## alternative hypothesis: true difference in means is less than 0
## 95 percent confidence interval:
## -Inf 3.198074
## sample estimates:
## mean in group 0 mean in group 1
## 89.17777 92.98526
# t-test for obese subjects
# Note that we could run an ANOVA to compare these 2 groups, too.
t.test(cholest ~ drug, data=obese, var.equal=FALSE, alternative = c("greater"))
##
## Welch Two Sample t-test
##
## data: cholest by drug
## t = 2.2483, df = 17.747, p-value = 0.01876
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
## 1.850717 Inf
## sample estimates:
## mean in group 0 mean in group 1
## 103.81451 95.70151
In conducting the AxB ANOVA (and the t-tests of the simple effects), we had to assume normality, independence, and equal variances. If these assumptions seem unreasonable, we can use randomization-based methods to test for interaction and main effects.
Remember the logic behind randomization-based methods:
- First, we calculate a test statistic of interest.
- We then assume a randomization hypothesis is true.
- Assume the drug, obesity, and their interaction have no effect on cholesterol.
- We can then pretend to go back in time and randomly assign subjects to drug and obesity groups.
- Randomly assigning subjects to obesity groups is impossible, so we should generalize carefully.
- For each randomization of our data, we calculate our test statistic.
- Once we've run 10,000 randomizations, we can determine the likelihood of observing our actual test statistic if the randomization hypothesis were true.
We’re going to need to conduct randomization-based tests separately for the interaction and main effects. We’ll test for interaction first.
Let’s look one more time at our AxB ANOVA output:
anova(model2)
## Analysis of Variance Table
##
## Response: cholest
## Df Sum Sq Mean Sq F value Pr(>F)
## drug 1 46.34 46.34 0.6322 0.431743
## obese 1 752.82 752.82 10.2704 0.002831 **
## drug:obese 1 355.24 355.24 4.8465 0.034197 *
## Residuals 36 2638.77 73.30
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We’ll now randomize the interaction effect by randomly assigning subjects to drug and obesity groups:
# I'm going to load the "broom" package. It allows us to easily store
# and access results from a model.
library(broom)
# First, we'll re-run the model
aovmodel <- aov(cholest ~ drug * obese, data=chol2)
# Now we'll use the "tidy" command to store those results
tidy(aovmodel)
## term df sumsq meansq statistic p.value
## 1 drug 1 46.34342 46.34342 0.6322496 0.43174279
## 2 obese 1 752.81561 752.81561 10.2704401 0.00283078
## 3 drug:obese 1 355.24497 355.24497 4.8465018 0.03419678
## 4 Residuals 36 2638.77320 73.29926 NA NA
# We want to store the 3rd statistic (F-value for interaction = 4.8465)
# We'll store this value as "obs_F_interaction"
obs_F_interaction<- tidy(aovmodel)$statistic[3]
# Now let's run 10,000 randomizations, each time storing the F-value
# We will shuffle the interaction term
random_interact <- do(10000) *
tidy(aov(cholest ~ drug + obese + shuffle(drug):shuffle(obese),
data=chol2))$statistic[3]
# We'll plot a histogram of the F values from these 10,000 randomizations
histogram(~random_interact, data = random_interact,
xlim=c(-.1, 6), xlab = "F-statistic for interaction", width=.05,
col="steelblue", border="white")
# Now, let's get the p-value by counting the proportion of randomization
# F-values greater than our observed test statistic
tally(~ (random_interact >= obs_F_interaction), format="prop", data=random_interact)
##
## TRUE FALSE
## 0.0072 0.9928
# We could have done this without the broom package
# All the extra brackets to the right are there only to
# extract the F statistic for the interaction effect.
# Notice that the F for interaction is the 3rd F in the summary table.
# Therefore, to extract it, I need [[1]][["F value"]][[3]]
# We'll first get our test statistic (F for interaction)
## observed_model <- do(1) *
## summary(aov(cholest ~ drug*obese, data=chol2))[[1]][["F value"]][[3]]
## obs_F_interaction <- observed_model$"[["
# Now let's take 10,000 randomizations
# We'll randomly shuffle the interaction term and get the F value
## random_interact <- do(10000) * summary(aov(cholest ~ drug + obese + shuffle(drug):shuffle(obese),
## data=chol2))[[1]][["F value"]][[3]]
Now that we’ve found a significant interaction effect, we could split our data and conduct randomization-based tests separately for the obese and non-obese groups.
Let’s, instead, pretend as though our interaction effect was not significant. If that were the case, we could examine the main effects of the drug and obesity. We can do this via randomization-based tests:
# We want to store the 1st statistic (F-value for interaction = 0.632)
# We'll store this value as "obs_F_drug"
obs_F_drug<- tidy(aovmodel)$statistic[1]
# Now let's run 10,000 randomizations, each time storing the F-value
# We will shuffle the interaction term
random_drug <- do(10000) *
tidy(aov(cholest ~ shuffle(drug) + obese + drug:obese,
data=chol2))$statistic[1]
# We'll plot a histogram of the F values from these 10,000 randomizations
histogram(~random_drug, data = random_interact,
xlim=c(-.1, 10), xlab = "F-statistic for interaction", width=.05,
col="steelblue", border="white")
# Now, let's get the p-value by counting the proportion of randomization
# F-values greater than our observed test statistic
tally(~ (random_drug >= obs_F_drug), format="prop", data=random_interact)
##
## TRUE FALSE
## 0.4951 0.5049
I won’t run the following code, but it would test the obesity main effect:
# Obesity effect
observed_model <- do(1) *
summary(aov(cholest ~ drug*obese, data=chol2))[[1]][["F value"]][[2]]
obs_F_obesity <- observed_model$"[["
# Shuffle assignment to obesity groups
random_obesity <- do(1000) * summary(aov(cholest ~ drug + shuffle(obese) + drug:obese, data=chol2))[[1]][["F value"]][[2]]
# Histogram
histogram(~random_obesity, data = random_obesity, xlim=c(-.1, 15),
xlab = "F-statistic for obesity effect", width=.1, col="steelblue", border="white")
# p-value
tally(~ (random_obesity >= obs_F_obesity), format="prop", data=random_interact)
We’ve calculated an effect size of eta-squared = 0.304
. Suppose we wanted to construct confidence interval around that eta-squared. We can do that with bootstrap methods:
# Take 10,000 bootstrap samples (with replacement) and calculate the AxB ANOVA
bootmodel <- do(10000) * lm( cholest ~ drug * obese, data=resample(chol2))
# Store the eta-squared values as "bootetasquared"
bootetasquared <- bootmodel$r.squared
# Plot the eta-squared values from our bootstrap samples
densityplot(~bootetasquared, plot.points = FALSE, col="steelblue", lwd=4)
# Calculate the 95% confidence interval
confint(bootetasquared, level = 0.95, method = "quantile")
## 2.5% 97.5%
## quantile 0.1437238 0.5604518
ToothGrowth
built-into R. It’s results from a study on the effect of vitamin C on tooth growth in guinea pigs. The researchers were interested in the effects of two variables:supp
: type of Vitamin C given to the guinea pigs (OJ = orange juice; VC = a liquid supplement) dose
: the amount of Vitamin C given (low, med, or high). dose
and test for differences between the OJ and VC treatments.# Load the dataset
data(ToothGrowth)
# Change the dose variable to a factor with levels: low, med, high
ToothGrowth$dose = factor(ToothGrowth$dose, levels=c(0.5,1.0,2.0),
labels=c("low","med","high"))
This table shows the means (and standard deviations) of the 10 guinea pigs in each combination of treatments.
. | Orange Juice | AbsorbAcid | Total |
---|---|---|---|
Low | 13.23 (4.460) | 07.98 (2.747) | 10.605 (4.4998) |
Med | 22.70 (3.911) | 16.77 (2.515) | 19.735 (4.4154) |
High | 26.06 (2.655) | 26.14 (4.798) | 26.100 (3.7742) |
Total | 20.66 (6.606) | 16.96 (8.266) | N=60, 18.813 (7.6493) |
# Density plot of each group
densityplot(~len | dose*supp, data=ToothGrowth,
main="Tooth Growth by Dose and Supplement",
xlab="Tooth Growth", layout=c(3,2))
# Side-by-side boxplots
bwplot(supp ~ len | dose, data=ToothGrowth,
ylab="supplement", xlab="Tooth Growth",
main="Tooth Growth by Dose and Supplement",
layout=(c(1,3)))
# Interaction plot
with(ToothGrowth, interaction.plot(x.factor=dose, trace.factor=supp,
response=len, fun=mean, type="b", legend=T,
ylab="mean tooth length", main="Interaction Plot",
pch=c(1,19)))
# Levene's test
leveneTest(len ~ supp * dose, data=ToothGrowth)
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 5 1.7086 0.1484
## 54
ToothGrowth
dataset using a randomization-based equivalent to the AxB ANOVA. Test the interaction effect; then, split the data by dose and run randomization-based tests to compare pairs of means.
When you’re ready to try the Your Turn exercises:
Sources:
This document is released under a Creative Commons Attribution-ShareAlike 3.0 Unported license.