Are you scared of spiders?
If you have arachnophobia, you might even be scared of that picture! Suppose I’m interested in seeing if arachnophobia is specific to real spiders or if photos of spiders can evoke similar anxiety levels.
To study this, I find 24 volunteers with arachnophobia. I randomly split them into two groups and send them into separate rooms. In the center of one room is a real spider in a cage; the other room only has a photo of that same spider. After 5 minutes, I measure the anxiety of each subject in this study (using some errorless measure of anxiety that doesn’t really exist).
Our question of interest is: Do the subjects in the room with the real spider experience more anxiety than subjects in the other room? To answer this question, we’ve decided to compare the mean anxiety level of each group.
Since this study only dealt with 24 people, I’ll enter the data manually:
# Load the mosaic package (used throughout for favstats, diffmean, do, and friends)
library(mosaic)
# Create a vector to contain group membership (photo vs real groups)
group <- c(rep("photo", 12), rep("real", 12))
# Create a vector of the 24 anxiety scores
anxiety <- c(30, 35, 45, 40, 50, 35, 55, 25, 30, 45, 40, 50,
             40, 35, 50, 55, 65, 55, 50, 35, 30, 50, 60, 39)
# Combine the vectors into a data.frame called "spider"
spider <- data.frame(group, anxiety)
# Look at the first several rows of our data
head(spider)
## group anxiety
## 1 photo 30
## 2 photo 35
## 3 photo 45
## 4 photo 40
## 5 photo 50
## 6 photo 35
In this study, we’re going to model anxiety as a function of the group (real spider vs. photo): anxiety ~ group.
Before we begin testing our model, we’ll want to explore our data with summary statistics and visualizations. First, let’s calculate some favorite summary statistics:
favstats(anxiety ~ group, data=spider)
## group min Q1 median Q3 max mean sd n missing
## 1 photo 25 33.75 40 46.25 55 40 9.293204 12 0
## 2 real 30 38.00 50 55.00 65 47 11.028888 12 0
Based on this, I notice: \(\overline{X}_{1}-\overline{X}_{2}=\) 47 – 40 = 7, where group 1 is the real-spider group and group 2 is the photo group.
Now let’s visualize our data:
# Side-by-side densityplots
densityplot(~anxiety | group, data=spider, lwd=4, bw=4, col="steelblue", layout=c(1,2), cex=1)
# Strip plot
stripplot(anxiety ~ group, data=spider, jitter.data = TRUE,
          ylab = "anxiety", cex=1.3, col="steelblue", aspect=1, pch=20)
# Boxplot
bwplot(anxiety ~ group, data=spider, col="steelblue", xlab="Group", ylab="scores",
par.settings=list(box.rectangle=list(lwd=3), box.umbrella=list(lwd=3)))
Let’s construct a model that could have generated the data in this study. First, we’ll define:
• \(y_{ig}=\) the anxiety of subject i in group g
• \(\mu=\) a constant, overall mean anxiety level all humans possess
• \(\alpha _{g}=(\mu _{g}-\mu )=\) the change in anxiety associated with being assigned to group g
• \(\epsilon_{ig}=(y_{ig}-\mu _{g})=\) errors or deviations in anxiety among individuals in a group (due to other factors not analyzed in this study).
With this notation, we can write our full model: \(y_{ig}=\mu +\alpha _{g}+\epsilon_{ig}\).
The null model omits the group effect: \(y_{ig}=\mu +\epsilon_{ig}\) (equivalently, it sets \(\alpha _{g}=0\) for both groups). Our goal is to determine which model (the null or full) best fits our data. Additionally, we want to estimate \(\alpha _{g}=(\mu _{g}-\mu )\): the effect of the treatment (real spider vs. photo) on anxiety.
We’re already familiar with several methods we can use to compare the means of two groups. We can classify these methods into two general frameworks:
• Randomization-based framework
•• Randomization-based null hypothesis significance tests
•• Bootstrap confidence intervals
• Theory-based framework
•• t-tests (with or without assuming equal variances)
•• Confidence intervals
•• Effect sizes
Regardless of which framework we choose, we will proceed by:
• Hypothesizing the two population means are equal: \(H_{0}: \mu _{1}-\mu _{2}=0\)
• Calculating a test statistic of interest: \(\overline{X}_{1}-\overline{X}_{2}=\) 47 – 40 = 7
• Estimating the sampling distribution of our test-statistic under our null hypothesis.
Recall that in our study, subjects were randomly assigned to groups. Under our null hypothesis, the real spider and photo evoke identical anxiety levels. In other words, the “real” and “photo” groups are only labels – they do not have any effect on anxiety.
Using the actual data from this study, our test statistic is: \(\overline{X}_{1}-\overline{X}_{2}=7\)
We want to know the distribution of test statistics we’d get if we could go back in time and repeatedly assign our subjects at random to the two groups. To do this, we’ll:
• Randomly shuffle the groups assigned to the 24 subjects in this study
• Calculate our test statistic, \(\overline{X}_{1}-\overline{X}_{2}\)
• Repeat
This is easy to do in R. First, let’s define our test statistic as test.stat:
# Define the test statistic as a difference in means
test.stat <- diffmean(anxiety ~ group, data=spider)
test.stat
## diffmean
## 7
Let’s randomize our data by randomly assigning 12 subjects to the “real” group and 12 subjects to the “photo” group. Since we’re hypothesizing that the treatment does not affect anxiety, a subject’s score won’t change if he or she is moved from one group to the other.
Let’s try 3 randomizations of our data to see what test statistics we get:
# Randomize our data 3 times and calculate the difference in means each time
do(3) * diffmean(anxiety ~ shuffle(group), data=spider)
## diffmean
## 1 -7.000000
## 2 1.333333
## 3 -2.000000
Due to the randomization, we get a different value of the test statistic each time. If we calculate our test statistic for 10,000 randomizations of our data, we’ll have a good sense of the sampling distribution of \(\overline{X}_{1}-\overline{X}_{2}\).
# Randomize our data 10,000 times and calculate the test statistic for each
randomizedMEANdiffs <- do(10000) * diffmean(anxiety ~ shuffle(group), data=spider)
# Histogram of the randomized sampling distribution
histogram(~diffmean, data=randomizedMEANdiffs,
xlim=c(-20,20), # Set the x-axis range
width=2.5, # Set the width of each bar
xlab="Possible mean differences assuming null hypothesis is true",
groups=diffmean >= test.stat) # Highlight values >= test statistic
ladd(panel.abline(v=test.stat)) # Add vertical line at test statistic
The histogram shows the randomized distribution of our test statistic (under a true null hypothesis). The vertical line shows the value of our observed test statistic.
We can estimate the likelihood of observing a test statistic as or more extreme than our observed test statistic:
p-value = proportion of randomizations resulting in a test statistic ≥ our observed test statistic
# We want the proportion of our 10,000 values in which the mean difference
# is greater than or equal to our observed test.stat
randomizedPVALUE <- prop(~ (diffmean >= test.stat), data=randomizedMEANdiffs)
randomizedPVALUE
## TRUE
## 0.0628
Instead of calculating the likelihood of our test statistic under the null hypothesis, we may be interested in a range of likely values for our test statistic.
To construct a bootstrap confidence interval, we need to repeatedly resample our data with replacement and calculate our test statistic. In effect, we’re treating our data as the population and taking random samples from it.
Let’s resample our data 3 times and see what values of the test statistic we get:
# Generate 3 bootstrap samples and calculate the test statistic for each
do(3) * diffmean(anxiety ~ group, data=resample(spider))
## diffmean
## 1 3.728571
## 2 10.168067
## 3 13.279720
Notice we got a different value of our test statistic each time. That’s due to our resampling – each time we sample with replacement, we may choose one subject multiple times and ignore other subjects.
Now let’s generate 10,000 bootstrap samples and calculate the test statistic for each one.
# Generate 10000 bootstrap samples and calculate the test statistic for each
bootMEANdiffs <- do(10000) * diffmean(anxiety ~ group, data=resample(spider))
# Create a plot of the bootstrap estimates of our test statistic
densityplot(~diffmean, data=bootMEANdiffs, plot.points = FALSE, col="steelblue", lwd=4)
To get our bootstrap confidence interval, we simply need to find the values of the test statistic that cut off the top and bottom 2.5% of the bootstrap distribution:
bootstrapCI <- confint(bootMEANdiffs, level = 0.95, method = "quantile")
bootstrapCI
## name lower upper level method estimate
## 1 diffmean -0.8951049 14.91667 0.95 quantile 7
Our 95% confidence interval is: (-0.895, 14.917).
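If you’d like to see what confint() is doing behind the scenes, you can pull the same interval directly from the bootstrap distribution with base R’s quantile() function (a quick sketch; tiny discrepancies are possible depending on the quantile algorithm used):
# Find the 2.5th and 97.5th percentiles of the 10,000 bootstrap statistics
quantile(bootMEANdiffs$diffmean, probs = c(0.025, 0.975))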
Remember, we’ve defined our test statistic (\(\overline{X}_{1}-\overline{X}_{2}\)) as test.stat. We want to estimate the distribution of test statistics we’d get if we were to repeatedly take samples from our populations (and calculate the test statistic for each sample).
In the previous sections, we estimated this sampling distribution via resampling (randomization or bootstrap samples).
If you look at the randomized and bootstrap distributions we constructed, you’ll notice they have the same unimodal, symmetric shape. Rather than take the time to resample our data, we may be able to use theory-based methods to approximate these unimodal, symmetric sampling distributions. That could speed up our calculations (and let us estimate p-values and confidence intervals without the use of a computer).
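As a quick visual check of that claim, mosaic’s histogram() can overlay a fitted normal curve on the randomization distribution via its fit argument (a rough eyeball comparison, not a formal test):
# Overlay a fitted normal curve on the randomization distribution
histogram(~diffmean, data=randomizedMEANdiffs, width=2.5, fit="normal",
xlab="Possible mean differences assuming null hypothesis is true")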
Assumptions
Recall the statistical model we constructed: \(y_{ig}=\mu +\alpha _{g}+\epsilon_{ig}\).
Using randomization-based methods, we did not make any assumptions about the distribution of those model components. If we’re willing to make some assumptions, we can use theory-based methods.
Our assumptions can be summarized as: \(y_{ig}\sim \mathrm{N}(\mu _{g},\sigma ^{2})\).
This implies that while each treatment group may have its own mean, all the data are independent and normally distributed with a constant variance.
Another way to write this model would be to add an extra statement to our original model:
\(y_{ig}=\mu +\alpha _{g}+\epsilon_{ig}\), with \(\epsilon_{ig}\sim\mathrm{N(0,\sigma ^{2})}\)
This implies our errors follow a normal distribution centered at zero.
Let’s restate our assumptions one more time. In order to use these theory-based methods, we’re going to assume:
• a null hypothesis that states, \(H_{0}: \mu _{1}-\mu _{2}=0\)
• anxiety scores within and across our treatment groups are independent
• the sampling distribution of our test-statistic approximates a normal distribution
Let’s see how these assumptions allow us to estimate the sampling distribution of \(\overline{X}_{1}-\overline{X}_{2}\).
• If we assume \(\mu _{1}-\mu _{2}=0\), then: \(E[\overline{X}_{1}-\overline{X}_{2}]=E[\overline{X}_{1}]-E[\overline{X}_{2}]=\mu _{1}-\mu _{2}=0\).
This tells us our sampling distribution will be centered at zero.
• If we assume normality, we know the shape of the sampling distribution.
• Now that we know the center and shape of our sampling distribution, we only need to figure out its standard error.
When you first learned about the Central Limit Theorem, you learned that the sampling distribution of the sample mean has a standard error of \(\sigma _{\overline{x}}=\frac{\sigma _{x}}{\sqrt{n}}\).
In MATH 300, we assumed independence to derive the standard error of our test statistic: \(\mathrm{SE}_{(\overline{X}_{1}-\overline{X}_{2})}=\sqrt{\frac{s_{1}^{2}}{n_{1}}+\frac{s_{2}^{2}}{n_{2}}}\). This is called the Satterthwaite approximation to the standard error. It does not require the variances of our two groups to be equal.
If we’re willing to assume the population variances of our two groups are equal, we can calculate the standard error with a different formula. This formula pools (finds a weighted average of) the two group variances:
\(SE_{pooled}=s_{pooled}\sqrt{\frac{1}{n_{1}}+\frac{1}{n_{2}}}=\sqrt{\frac{(n_{1}-1)s_{1}^{2}+(n_{2}-1)s_{2}^{2}}{n_{1}+n_{2}-2}}\sqrt{\frac{1}{n_{1}}+\frac{1}{n_{2}}}\)
Checking assumptions
Before we choose a method (Satterthwaite or SEpooled) and calculate our standard error, we should evaluate the assumptions we’ll need to make.
Regardless of the method we choose, we need to assume our errors will follow a normal distribution. Later on in this course, we’ll learn how to check our residuals. For now, we’ll check for normality by examining plots of the sample data (shown at the beginning of this example), by looking at Q-Q plots, or by conducting a normality test.
# Q-Q plot for real spider group
qqnorm(spider$anxiety[spider$group=="real"], ylab = "Anxiety scores", main="Real spider")
qqline(spider$anxiety[spider$group=="real"])
# Q-Q plot for photo group
qqnorm(spider$anxiety[spider$group=="photo"], ylab = "Anxiety scores", main="Photo")
qqline(spider$anxiety[spider$group=="photo"])
We could also conduct a test of normality, like the Shapiro-Wilk test.
# Shapiro-Wilk test
shapiro.test(spider$anxiety[group=="real"])
##
## Shapiro-Wilk normality test
##
## data: spider$anxiety[group == "real"]
## W = 0.94887, p-value = 0.6206
shapiro.test(spider$anxiety[group=="photo"])
##
## Shapiro-Wilk normality test
##
## data: spider$anxiety[group == "photo"]
## W = 0.96502, p-value = 0.8523
How can we check the reasonableness of our equal variances assumption? We’ll figure that out in the next lesson. For now, let’s calculate the standard error using both the Satterthwaite approximation and the pooled estimate (assuming equal variances):
Satterthwaite: \(\sqrt{\frac{s_{1}^{2}}{n_{1}}+\frac{s_{2}^{2}}{n_{2}}}=\sqrt{\frac{11.0289^{2}}{12}+\frac{9.2932^{2}}{12}}=4.163\)
SEpooled: \(s_{pooled}\sqrt{\frac{1}{n_{1}}+\frac{1}{n_{2}}}=\sqrt{\frac{(12-1)11.0289^{2}+(12-1)9.2932^{2}}{12+12-2}}\sqrt{\frac{1}{12}+\frac{1}{12}}=4.163\)
In this case, the two methods gave virtually identical estimates of the standard error. Here’s the code I used to calculate these standard errors:
# First let's define the terms in the formulas
# Calculate the standard deviation of each group
SD1 <- sd(spider$anxiety[spider$group=="real"])
SD2 <- sd(spider$anxiety[spider$group=="photo"])
# Calculate sample size in each group
n1 <- length(spider$anxiety[spider$group=="real"])
n2 <- length(spider$anxiety[spider$group=="photo"])
# Now to get Satterthwaite's approximation
Satt <- sqrt( ((SD1^2)/(n1)) + ((SD2^2)/(n2)) )
Satt
## [1] 4.163332
# SEpooled
Spooled <- sqrt( (((n1 - 1) * (SD1^2)) + ((n2 - 1) * (SD2^2))) / (n1 + n2 - 2) )
SEpooled <- sqrt( (1/n1) + (1/n2) ) * Spooled
SEpooled
## [1] 4.163332
Now that we have a sampling distribution, we can calculate a p-value. Remember, a p-value estimates the answer to the following question:
How likely were we to observe a test statistic of at least 7 if this sampling distribution (based on the null hypothesis) were true?
We can estimate this by seeing where our observed test statistic falls on this distribution.
If we want a more accurate estimate of this p-value, we need to understand even more about the sampling distribution. Since you learned this method in a previous statistics class, you know we’re conducting an independent samples t-test and the sampling distribution follows a t-distribution with \(n_{1}+n_{2}-2\) degrees-of-freedom.
To more formally conduct this test, we first convert our observed test statistic into a t-statistic:
\(\mathrm{t_{observed}} = \frac{\overline{X}_{1}-\overline{X}_{2}}{\mathrm{SE_{pooled}}}\).
Since our Satterthwaite and SEpooled methods yielded virtually identical values for the standard error, we can calculate our t-statistic as:
\(\mathrm{t_{observed}} = \frac{7}{4.163}=1.6813\)
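As a quick check, we can compute this from the objects we defined earlier (test.stat and SEpooled):
# Convert the observed difference in means to a t-statistic
t.observed <- test.stat / SEpooled
t.observed # approximately 1.6813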
Now that we have an observed t-statistic, we can see where it falls on a t-distribution. Because we have 24 observations (and we’re comparing 2 group means), we have 22 degrees of freedom:
# Let's first plot a t-distribution with df = 22
plotDist("t", df=22, lwd=5, col="steelblue")
# xpt() plots the t-distribution and returns the probability of
# observing a t-statistic less than or equal to our observed t of 1.6813
xpt(c(1.6813), df=(n1+n2-2))
## [1] 0.9465759
Since xpt() returns the area to the left of our observed t-statistic, our one-sided p-value is 1 – 0.9466 = 0.0534. Thankfully, we don’t have to go through all these steps every time we conduct a t-test. Instead, we can use the built-in t.test function. One note before we do: t.test orders the groups alphabetically, so it works with the difference (photo – real); that’s why the output below shows a t-statistic of –1.6813 and we specify the alternative "less". Let’s verify the p-value we just estimated:
# t-test with no equal variance assumption
# Welch-Satterthwaite method
# Notice the weird degrees of freedom in the output
t.test(anxiety ~ group, data=spider, alternative = c("less"), var.equal = FALSE, conf.level = 0.95)
##
## Welch Two Sample t-test
##
## data: anxiety by group
## t = -1.6813, df = 21.385, p-value = 0.05362
## alternative hypothesis: true difference in means is less than 0
## 95 percent confidence interval:
## -Inf 0.1580825
## sample estimates:
## mean in group photo mean in group real
## 40 47
# t-test with equal variance assumption
t.test(anxiety ~ group, data=spider, alternative = c("less"), var.equal = TRUE, conf.level = 0.95)
##
## Two Sample t-test
##
## data: anxiety by group
## t = -1.6813, df = 22, p-value = 0.05342
## alternative hypothesis: true difference in means is less than 0
## 95 percent confidence interval:
## -Inf 0.1490421
## sample estimates:
## mean in group photo mean in group real
## 40 47
If you look at the p-values from the output of those commands, you’ll find they replicate what we calculated earlier.
Having already calculated the standard error of our test statistic, constructing a 95% confidence interval is simple:
\(95\%\ \mathrm{ CI}: (\overline{X}_{1}-\overline{X}_{2})\pm t_{n_{1}+n_{2}-2}^{\alpha /2}(\mathrm{SE})\)
We can plug in the Satterthwaite approximation or the pooled standard error into that formula:
# Satterthwaite - no equal variance assumption
confint(t.test(anxiety ~ group, data=spider, conf.level = 0.95))
## mean in group photo mean in group real lower
## 40.000000 47.000000 -15.648641
## upper level
## 1.648641 0.950000
# SEpooled with equal variance assumption
confint(t.test(anxiety ~ group, data=spider, conf.level = 0.95, var.equal=TRUE))
## mean in group photo mean in group real lower
## 40.000000 47.000000 -15.634222
## upper level
## 1.634222 0.950000
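We can also verify the equal-variance interval by hand with qt() and the pooled standard error we computed earlier. One caution: because t.test orders the groups alphabetically, the intervals above are for (photo – real); the interval below, built from our test statistic, is for (real – photo) and is simply the mirror image:
# 95% CI by hand: estimate +/- (t critical value) * (pooled SE)
# This should reproduce (-1.634, 15.634), mirroring the output above
test.stat + c(-1, 1) * qt(0.975, df = n1 + n2 - 2) * SEpooled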
Instead of conducting a null hypothesis significance test or constructing a confidence interval, we may be interested in estimating the magnitude of the difference between the group means. One such effect size is Cohen’s d: \(\mathrm{Effect \ size}=\frac{\overline{X}_{1}-\overline{X}_{2}}{\sigma}\), where \(\sigma\) is a standard deviation representing one or both groups (here, we’ll use the pooled standard deviation).
We already know the difference in means is 7. Before we calculated SEpooled, we first calculated Spooled (the pooled standard deviation) to be 10.198. Our effect size is, then:
\(\mathrm{Effect \ size}=\frac{\overline{X}_{1}-\overline{X}_{2}}{\sigma }=\frac{7}{10.198}=0.686\).
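Since we stored the pooled standard deviation as Spooled, this is a one-line calculation in R:
# Cohen's d: difference in means divided by the pooled standard deviation
effect.size <- test.stat / Spooled
effect.size # approximately 0.686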
Let’s summarize the p-values and confidence intervals from the randomization- and theory-based methods:
Randomized p-value: 0.063
t-test (equal variances assumed) p-value: 0.053
t-test (equal variances not assumed) p-value: 0.054
Bootstrap 95% confidence interval: (-0.895 , 14.917)
Theory-based 95% confidence interval (equal variances assumed): (-1.634, 15.634)
Theory-based 95% confidence interval (equal variances not assumed): (-1.649, 15.649)
Cohen’s d (effect size): 0.686
The results from the randomization- and theory-based methods were similar in this example. This is because the assumptions we make in the theory-based methods (the normality assumption, especially) were fairly reasonable assumptions to make in this situation.
Suppose that instead of finding 24 volunteers for the arachnophobia study, I only find 12. With such a small sample size, I decide to have each subject go into both the room with the real spider and the room with the photo. I flip a coin to randomly determine which room each subject enters first.
Further suppose that miraculously, we get the exact same data we had before.
# Create a vector of the 12 anxiety scores from the real spider room
real <- c(40, 35, 50, 55, 65, 55, 50, 35, 30, 50, 60, 39)
# Create a vector of the 12 anxiety scores from the photo room
photo <- c(30, 35, 45, 40, 50, 35, 55, 25, 30, 45, 40, 50)
# Combine the vectors into a data.frame called "spider"
spideranxiety <- data.frame(real, photo)
# Look at the data
spideranxiety
## real photo
## 1 40 30
## 2 35 35
## 3 50 45
## 4 55 40
## 5 65 50
## 6 55 35
## 7 50 55
## 8 35 25
## 9 30 30
## 10 50 45
## 11 60 40
## 12 39 50
The only difference is that we no longer have 24 independent anxiety scores. Instead, we have a pair of anxiety scores for each of 12 subjects. Higher-anxiety subjects are probably high-anxiety with both the real spider and the photo (and lower-anxiety subjects probably have lower anxiety in both rooms), so our data are now matched (or dependent).
What we’re going to do is calculate the difference in anxiety levels for each subject (real - photo).
# Calculate the difference and store it in a variable called "difference"
spideranxiety$difference <- spideranxiety$real - spideranxiety$photo
# Examine the first several rows of data
head(spideranxiety)
## real photo difference
## 1 40 30 10
## 2 35 35 0
## 3 50 45 5
## 4 55 40 15
## 5 65 50 15
## 6 55 35 20
Before we run any test, remember we should explore our data. Here are some descriptive statistics and a density plot:
# You can see the mean of our difference scores = 7
favstats(~difference, data=spideranxiety)
## min Q1 median Q3 max mean sd n missing
## -11 0 7.5 15 20 7 9.807233 12 0
# Density plot
densityplot(~difference, data=spideranxiety, lwd=4, bw=4, col="steelblue", cex=1.1)
Consider the first subject in this study, who had anxiety scores of 40 and 30. The randomization null hypothesis is that the real spider and the photo of the spider are completely equivalent. This means the first subject would have had scores of 40 and 30 no matter what order he or she visited the rooms with the real spider or the photo.
Thus, under the randomization null hypothesis, we had an equal chance of observing a difference of +10 or -10 for this subject.
This is the key to our randomization-based method. We’re simply going to randomize whether each subject’s difference score is positive or negative. The magnitudes of the differences won’t change.
To do this, I’ll have R randomly multiply each subject’s difference score by +1 or -1. Then, I’ll find the mean of all the difference scores. I’ll repeat this many times.
# Randomly flip the sign of each subject's difference score
# (each difference is multiplied by +1 or -1, with the sign
# chosen independently, like a coin flip).
# Find the mean of those randomized difference scores.
# Do this 10,000 times and store the result as depRand
depRand <- do(10000) * mean( sample(c(-1, 1), 12, replace = TRUE) * spideranxiety$difference )
# Histogram of the randomized mean difference scores
histogram(~mean, data=depRand,
xlim=c(-11,11), # Set the x-axis range
xlab="Possible mean differences assuming null hypothesis is true",
groups=mean >= test.stat) # Highlight values >= test statistic
ladd(panel.abline(v=test.stat)) # Add vertical line at test statistic
# Calculate p-value
# Proportion of 10,000 randomizations yielding mean difference >= 7
prop(~ (mean >= test.stat), data=depRand)
## TRUE
## 0.0058
In a previous statistics course, you learned the formula for a dependent samples t-test:
\(t_{\mathrm{n-1}}=\frac{\overline{X}_{\mathrm{diff}}-\mu _{0}}{\frac{s}{\sqrt{n}}}\)
Our data yields a t-statistic of:
\(t_{\mathrm{11}}=\frac{{7}}{\frac{9.807}{\sqrt{12}}}=2.47\)
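Here’s that same calculation in R, using the difference scores we stored earlier:
# Paired t-statistic: mean difference divided by its standard error
t.paired <- mean(spideranxiety$difference) / ( sd(spideranxiety$difference) / sqrt(12) )
t.paired # approximately 2.4725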
We can visualize this t-statistic within its sampling distribution:
# xpt() plots the t-distribution with 11 degrees-of-freedom and returns
# the probability of a t-statistic less than or equal to 2.4725
xpt(c(2.4725), df=(11))
## [1] 0.9845082
Since xpt() returns the area to the left of our observed t-statistic, our one-sided p-value is 1 – 0.9845 = 0.0155. We can also conduct a dependent samples t-test in a single command (by telling R that we have paired data):
# Dependent samples t-test
t.test(real, photo, paired=TRUE, alternative="greater")
##
## Paired t-test
##
## data: real and photo
## t = 2.4725, df = 11, p-value = 0.01549
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
## 1.915663 Inf
## sample estimates:
## mean of the differences
## 7
Use randomization-based methods to compare the medians of the real spider vs. photo groups. To calculate the difference in medians, you can use a diffmedian() function I wrote for this lesson.
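That diffmedian() function isn’t reproduced in this excerpt, so here’s a minimal sketch of what such a helper might look like (my guess at its behavior, modeled on how diffmean() handles a two-level grouping variable):
# One possible diffmedian(): the difference in group medians
# (second group's median minus the first's, e.g., real - photo)
diffmedian <- function(formula, data) {
  meds <- aggregate(formula, data = data, FUN = median)
  meds[2, 2] - meds[1, 2]
}
# Example usage, mirroring the earlier randomization test:
# do(10000) * diffmedian(anxiety ~ shuffle(group), data=spider)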
Let’s pretend our spider dataset simply contains anxiety scores for 24 individuals. We’ll forget the fact that the individuals were randomly assigned to two groups. Calculate the standard deviation of the anxiety scores with the command: sd(~anxiety, data=spider). Then, use bootstrap methods to construct a 95% confidence interval for the population standard deviation.
Let’s compare results from all our methods to compare two group means. Given a dataset called d with scores for groups A and B, do the following: (a) calculate summary statistics using the favstats command, (b) create a visualization of the group distributions, (c) use shapiro.test to test for normal distributions, (d) run t.test with and without the equal-variance assumption, (e) run a randomization-based test to compare the means and report a p-value, (f) calculate an effect size for the difference in mean scores between groups A and B, (g) write out any conclusion(s) you can make about the difference in means (be sure to identify the results you’re using to make these conclusions).
When you’re ready to try the Your Turn exercises:
Spider dataset: Field, Andy (2012). Discovering Statistics Using R. ISBN: 978-1446200469
This document is released under a Creative Commons Attribution-ShareAlike 3.0 Unported license.