Your Turn #1

Author(s): [Enter names of students working on these solutions]

A 1999 study investigated the relationship between breastfeeding and intelligence. Researchers calculated an intelligence score for 322 four-year old children (237 of whom were breastfed) using the McCarthy Scales of Children’s Abilities.

The data have been loaded in a data.frame named bfeed with variables:
group: identifies whether each child was breastfed or not
score: each child’s score on the intelligence test.

a) Plot the data to visualize the intelligence scores for the two groups of children.

b) Calculate and interpret one effect size for the difference in mean intelligence scores between the two groups.

c) Conduct an independent samples t-test to compare the groups. State your hypotheses and the assumptions you’re making. Estimate and interpret the p-value.

d) Calculate and interpret a 90% confidence interval for the difference in means between the two groups.

Some summary statistics for each group are provided below.

# The data is loaded as bfeed
bfeed <- read.csv("http://www.bradthiessen.com/html5/stats/m300/bfeed.csv")

# Rename variables and groups
library(forcats)
bfeed <- bfeed %>%
  rename(group = Feeding, score = GCI) %>%
  mutate(group = fct_recode(group, "breastfed" = "Breastfed", "not" = "NotBreastfed"))

# Report means and standard deviations
bfeed %>%
  group_by(group) %>%
  summarize(n = n(),
            mean = mean(score),
            sd = sd(score))

## # A tibble: 2 × 4
##       group     n  mean       sd
##      <fctr> <int> <dbl>    <dbl>
## 1 breastfed   237 105.3 14.49998
## 2       not    85 100.9 13.99997

# a) Plot the data to visualize the intelligence scores
# You can modify the code I used at the beginning of the activity
# to create a dotplot or boxplot.  You'll need to change the name
# of the data frame (from "spider" to "bfeed") and the names of the
# variables (from "anxiety" and "group" to "score" and "group").
# You don't need any of the lines of code that start with: theme(.
# Type your code right below this line.
bfeed %>%
  ggplot(aes(y = score, x = group)) +
  geom_boxplot(fill = "gold", color = "black", alpha = 0.5) +
  geom_dotplot(binaxis = "y", stackdir = "center", fill = "steelblue", color = "white", alpha = 0.5) +
  scale_y_continuous(breaks=seq(50, 150, 10), minor_breaks=NULL) +
  theme(axis.title.x = element_text(size = 12, color = "#777777")) +
  theme(axis.text.x = element_text(size = 12)) +
  theme(axis.title.y = element_text(size = 12, color="#777777")) +
  theme(axis.text.y = element_text(size = 12)) +
  labs(
    title = "Anxiety by group"
  )

## `stat_bindot()` using `bins = 30`. Pick better value with `binwidth`.

# b) Calculate either cohen's d or D.  You can use the functions
# "cohens_d" or "D_effect."  Type your code right below this line.
cohens_d(score ~ group, data = bfeed)

## [1] "Cohen's d =  0.306"

D_effect(score ~ group, data = bfeed)

## [1] "D =  0.586"

# In this shaded box, type your interpretation of the effect size you
# just calculated.

Cohen's d:  The groups differ by 0.3 standard deviation units.  That's a low/medium effect.

D:  If we randomly select one person from each group, there is a 59% chance the individual who was breastfed has the higher intelligence test score.

# c) Conduct a t-test using code similar to this:
# t.test(YYY ~ XXX, data = DDD, alternative = "greater")

t.test(score ~ group, data = bfeed, alternative = "greater")

## 
##  Welch Two Sample t-test
## 
## data:  score by group
## t = 2.4624, df = 153.01, p-value = 0.007455
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  1.442962      Inf
## sample estimates:
## mean in group breastfed       mean in group not 
##                   105.3                   100.9

# Write your hypotheses here:

Null hypothesis:  U-breastfeed = U-not

Alternative hypothesis:  U-bf > U-not

Assumptions I am making:  Independence, normality, equal variances, H0 is true
  
Interpretation of the p-value:  Likelihood of observing results as or more extreme if the null hypothesis were true.

# d) Make sure you construct a 90% interval (not 95%):
confint(t.test(score ~ group, data=bfeed, conf.level = 0.90, var.equal=TRUE))

## # A tibble: 1 × 5
##   `mean in group breastfed` `mean in group not`   lower   upper level
##                       <dbl>               <dbl>   <dbl>   <dbl> <dbl>
## 1                     105.3               100.9 1.40296 7.39714   0.9

confint(t.test(score ~ group, data=bfeed, conf.level = 0.90, var.equal=FALSE))

## # A tibble: 1 × 5
##   `mean in group breastfed` `mean in group not`    lower    upper level
##                       <dbl>               <dbl>    <dbl>    <dbl> <dbl>
## 1                     105.3               100.9 1.442962 7.357138   0.9

If we repeated this experiment a large number of times, 95% of the mean differences would fall between (1.4, 7.4)

The simdata data frame contains simulated data from a fictitious study on weightloss. Subjects were randomly assigned to one of two diet groups (Diet A or Diet B). The number of pounds each subject lost (weightloss) was then recorded.

Summary statistics, two effect sizes, and the results of an independent samples t-test are reported below. From all of this, are you willing to conclude Diet A results in greater weightloss than Diet B? What caused the p-value and effect sizes to be so low?

# You'll type your answers in the next shaded box

# Simulate the data
set.seed(1234)  # Set random number generator to replicate this data
simdata <- tibble(
  diet = c(rep("A", 20000), rep("B", 20000)),
  weightloss = c(rnorm(20000, 2, 20), rnorm(20000, 1, 20))
)

# Summary statistics
simdata %>%
  group_by(diet) %>%
  summarize(n = n(),
            mean = mean(weightloss),
            sd = sd(weightloss))

## # A tibble: 2 × 4
##    diet     n     mean       sd
##   <chr> <int>    <dbl>    <dbl>
## 1     A 20000 1.983689 19.93599
## 2     B 20000 1.150867 20.03774

# Effect sizes
cohens_d(weightloss ~ diet, data = simdata)

## [1] "Cohen's d =  0.042"

D_effect(weightloss ~ diet, data = simdata)

## [1] "D =  0.512"

# t-test
t.test(weightloss ~ diet, data = simdata, alternative="greater")

## 
##  Welch Two Sample t-test
## 
## data:  weightloss by diet
## t = 4.1668, df = 39997, p-value = 1.548e-05
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  0.5040582       Inf
## sample estimates:
## mean in group A mean in group B 
##        1.983689        1.150867

# Write your conclusion here
The means differ, but the difference is not meaningful or important.

# Explain what caused both the p-value and effect sizes to be low.
The large sample size

Let’s go back to our spider data frame. Use bootstrap methods to construct a 95% confidence interval for Cohen’s d (the effect size of the difference in anxiety levels). To do this, you’ll want to use the cohens_d_2 function I wrote (instead of the diffmeans function we used to construct a confidence interval for the mean difference in the activity). Plot the distribution of bootstrap estimates and report the confidence interval limits. From this confidence interval, what can you conclude about the effect size?

# You'll calculate the effect size using
# cohens_d_2(anxiety ~ group, data = spider)
cohens_d_2(anxiety ~ group, data = spider)

##      real 
## 0.6864065

# You'll want to calculate the effect size
# as you repeatedly sample your data with replacement
# Your code will look something like this:
# bootsamples <- Do(###) * cohens_d_2(anxiety ~ group, data = spider)
# But you'll need to make sure you are *resampling* your data
bootsamples <- Do(5000) * cohens_d_2(anxiety ~ group, data = resample(spider))


# To plot those bootstrap estimates of the effect size, you can simply use:
# histogram(~real, data=bootsamples)
histogram(~real, data=bootsamples)

# To calculate the confidence interval, you'll use something like:
# confint(bootsamples, level = 0.95, method = "quantile")
confint(bootsamples, level = 0.95, method = "quantile")

## # A tibble: 1 × 6
##    name     lower    upper level     method  estimate
## * <chr>     <dbl>    <dbl> <dbl>      <chr>     <dbl>
## 1  real 0.0687742 1.652369  0.95 percentile 0.6864065

Let’s try something new with our spider data frame. Use randomization-based NHST methods to compare the medians of the real spider vs. photo groups. Plot the randomized sampling distribution and estimate the p-value. To calculate the difference in medians, you can use a diffmedian() function I wrote for this lesson.

# To calculate the difference in medians for our spider data
# you can use:  diffmedian(anxiety ~ group, data=spider)
random_medians <- Do(5000) * diffmedian(anxiety ~ shuffle(group), data = spider)

ggplot(data = random_medians, aes(x = real)) +
  geom_histogram(binwidth = 1, fill="lightblue", color="white", alpha = 0.8) +
  annotate("segment", x = 7, xend = 7, y = 0, yend = 1200, color = "red") +
  labs(
      title = "Randomized median differences",
      x = "difference between group medians"
      ) +
  scale_x_continuous(breaks=seq(-25, 25, 5), minor_breaks=NULL) +
    theme(
    axis.text.x = element_text(size = 11, color="grey10"),
    legend.position = "none",
    panel.grid.major.y = element_line(colour = "white"),
    panel.grid.major.x = element_line(colour = "white", size=.15),
    panel.grid.minor = element_blank(),
    panel.background = element_rect(fill = "grey93")
  )

prop(~ (real >= 7), data=random_medians)

##   TRUE 
## 0.2872

Suppose our spider data frame didn’t contain anxiety levels for two different groups of subjects. Instead, suppose 12 subjects first entered the room with the real spider for 5 minutes. After recording each subject’s anxiety level, suppose these same subjects then spent 5 minutes in the room with the photo of the spider. In this repeated measures study, the anxiety levels of the same 12 subjects were measured twice.

Conduct a test to determine if the real spider elicited more anxiety than the photo for these 12 subjects.

# We no longer have two independent groups.  We needed that independence
# assumption to calculate either SEpooled or the Satterthwaite approximation.
# What can we do with two dependent groups (the same group measured twice)?

# Let's separate our data into the two repeated measures
# First, let's get anxiety scores related to the real spider
real <- spider %>%
  filter(group == "real") %>%
  transmute(real = anxiety)
# Now, we'll get anxiety scores for those same people with the photo
photo <- spider %>%
  filter(group == "photo") %>%
  transmute(photo = anxiety)
# We can combine these into a single data frame
spider2 <- data.frame(real, photo)
# Let's look at this data
spider2

## # A tibble: 12 × 2
##     real photo
##    <dbl> <dbl>
## 1     40    30
## 2     35    35
## 3     50    45
## 4     55    40
## 5     65    50
## 6     55    35
## 7     50    55
## 8     35    25
## 9     30    30
## 10    50    45
## 11    60    40
## 12    39    50

# This shows that instead of having 24 independent anxiety scores
# we have a pair of anxiety scores for each of 12 subjects.
# The first subject had an anxiety of 40 with the real spider
#  and 30 with the photo.
# Now, let's calculate the difference in anxiety scores for each subject
spider2 <- spider2 %>%
  mutate(difference = real - photo)
# We can see this difference column now...
spider2

## # A tibble: 12 × 3
##     real photo difference
##    <dbl> <dbl>      <dbl>
## 1     40    30         10
## 2     35    35          0
## 3     50    45          5
## 4     55    40         15
## 5     65    50         15
## 6     55    35         20
## 7     50    55         -5
## 8     35    25         10
## 9     30    30          0
## 10    50    45          5
## 11    60    40         20
## 12    39    50        -11

# Under a true null hypothesis, we would expect this difference to be zero
# Let's see what our sample mean difference equals
mean(spider2$difference)

## [1] 7

# How likely were we to get an average difference score of 7
# if the null hypothesis were true?  Let's use a theory-based
# null hypothesis significance test

# Run the following code:
# t_test(anxiety ~ group, data=spider, paired=TRUE, alternative="less")
t_test(anxiety ~ group, data=spider, paired=TRUE, alternative="less")

## 
##  Paired t-test
## 
## data:  anxiety by group
## t = -2.4725, df = 11, p-value = 0.01549
## alternative hypothesis: true difference in means is less than 0
## 95 percent confidence interval:
##       -Inf -1.915663
## sample estimates:
## mean of the differences 
##                      -7

You’re done!

Your Turn #1

Comparing two means