t.test(x, y=NULL, alternative="two.sided", mu=0, paired=F,
var.equal=F, conf.level=.95, treatment)
saddlepoint.test(x, y=NULL, alternative="two.sided", mu=0, paired=F,
var.equal=F, conf.level=.95, treatment)
NAs and Infs are allowed but will be removed.
NAs and Infs are allowed but will be removed. If paired=TRUE, then x and y must have the same length, and observation pairs (x[i], y[i]) with at least one NA or Inf will be removed.
"greater",
"less" or
"two.sided", or just the
initial letter of each, indicating
the specification of the alternative hypothesis. For the
one-sample and paired t-tests,
alternative refers to the true
mean of the parent population in relation to the hypothesized value
mu.
For two-sample t-tests,
alternative refers
to the difference between the true population mean for
x and that for
y,
in relation to
mu.
TRUE, x and y are considered as paired vectors.
TRUE, the variances of the parent populations of x and y are assumed equal. Argument var.equal should be supplied only for the two-sample (i.e., unpaired) tests.
x, with two unique values. This is a grouping variable used to split x into two samples. If supplied, then y should not be used.
"htest") (for
t.test) or
c("htest.saddlepoint", "htest") (for
saddlepoint.test),
containing the following components:
names attribute
"t".
statistic. Component
parameters has
names attribute
"df".
conf.level.
When alternative is not "two.sided", the confidence interval will be half-infinite, to reflect the interpretation of a confidence interval as the set of all values k for which one would not reject the null hypothesis that the true mean or difference in means is k. Here infinity will be represented by NA.
estimate has a names attribute describing its elements.
mu. Component null.value has a names attribute describing its elements.
alternative: "greater", "less" or "two.sided".
x and y.
In addition, in one-sample or paired applications, saddlepoint.test adds a "saddlepoint" attribute containing a saddlepoint approximation to bootstrap tilting confidence intervals. The attribute has components:
For the one-sample t-test, the null hypothesis is that the mean of the population from which x is drawn is mu. For the paired t-test, the null hypothesis is that the population mean of the difference x - y is equal to mu. For the two-sample t-tests, the null hypothesis is that the population mean for x minus that for y is mu. The alternative hypothesis in each case indicates the direction of divergence of the population mean for x (or difference of means for x and y) from mu (i.e., "greater", "less", "two.sided").
A t-statistic has a t distribution if the underlying populations are normal, the variances are equal, and you set var.equal=TRUE. These conditions are never satisfied exactly in practice. More importantly, the actual distribution is approximately a t distribution if the sample sizes are reasonably large, the distributions are not skewed, and you set var.equal=FALSE. You should set var.equal=TRUE only if you have good reason to believe the variances are equal. We recommend that you not perform a hypothesis test for equal variances -- while t-tests are robust against non-normality, variance tests are not: "To make a preliminary test on variances is rather like putting to sea in a rowing boat to find out whether conditions are sufficiently calm for an ocean liner to leave port!" (Box, p. 333).
The effect of skewness cancels out in a two-sample problem with equal sample sizes where the underlying populations have the same variance and skewness. In one-sample problems, or when sample sizes differ, the effect of skewness on the distribution of the t-statistic disappears very slowly as the sample size increases, at the rate O(1/sqrt(n)).
The t-test and the associated confidence interval are quite robust with respect to level toward heavy-tailed non-Gaussian distributions (e.g., data with outliers). However, the t-test is quite non-robust with respect to power, and the confidence interval is quite non-robust with respect to average length, toward these same types of distributions.
The usual t-test results are not very robust against skewed distributions (except in large samples). If the distribution is skewed, these procedures have errors of order O(1/sqrt(n)). The bootstrap tilting intervals do not assume symmetry, and have errors of order O(1/n).
The saddlepoint intervals are a topic of current research.
(a) One-Sample t-Test.
The arguments y, paired and var.equal determine the type of test. If y is NULL, a one-sample t-test is carried out with x. Here statistic is given by:
t = (mean(x) - mu) / ( sqrt(var(x)) / sqrt(length(x)) )
If x was drawn from a normal population, t has a t-distribution with length(x) - 1 degrees of freedom under the null hypothesis.
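As a cross-check on the formula, the same arithmetic can be sketched in Python (purely illustrative; the helper name t_one_sample is hypothetical, and statistics.variance uses the same n-1 divisor as var):

```python
from math import sqrt
from statistics import mean, variance  # variance() uses the n-1 divisor, like var()

def t_one_sample(x, mu=0.0):
    """One-sample t statistic: t = (mean(x) - mu) / (sd(x) / sqrt(n)).

    Compare the result against a t distribution with len(x) - 1
    degrees of freedom (assuming x is drawn from a normal population).
    """
    n = len(x)
    return (mean(x) - mu) / (sqrt(variance(x)) / sqrt(n))

# For x = [1, 2, 3] and mu = 0: mean = 2, var = 1, n = 3,
# so t = 2 / (1 / sqrt(3)) = 2 * sqrt(3).
print(t_one_sample([1, 2, 3]))
```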
(b) Paired t-Test.
If y is not NULL and paired=TRUE, a paired t-test is performed; here statistic is defined through
t = (mean(d) - mu) / ( sqrt(var(d)) / sqrt(length(d)) )
where d is the vector of differences x - y. Under the null hypothesis, t follows a t-distribution with length(d) - 1 degrees of freedom, assuming normality of the differences d.
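The paired statistic is simply the one-sample statistic applied to the differences, which a minimal Python sketch makes explicit (illustrative only; t_paired is a hypothetical helper):

```python
from math import sqrt
from statistics import mean, variance  # variance() uses the n-1 divisor, like var()

def t_paired(x, y, mu=0.0):
    """Paired t statistic: the one-sample statistic applied to d = x - y."""
    d = [xi - yi for xi, yi in zip(x, y)]
    n = len(d)
    return (mean(d) - mu) / (sqrt(variance(d)) / sqrt(n))

# For d = [1, 2, 4]: mean = 7/3, var = 7/3, n = 3, so t = sqrt(7).
print(t_paired([1, 2, 4], [0, 0, 0]))
```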
(c) Pooled-Variance Two-Sample t-Test.
If y is not NULL and paired=FALSE, either a pooled-variance or Welch modified two-sample t-test is performed, depending on whether var.equal is TRUE or FALSE. For the pooled-variance t-test, statistic is
t = (mean(x) - mean(y) - mu) / s1
with
s1 = sp * sqrt(1/nx + 1/ny)
sp = sqrt( ( (nx-1)*var(x) + (ny-1)*var(y) ) / (nx + ny - 2) )
nx = length(x), ny = length(y)
Assuming that x and y come from normal populations with equal variances, t has a t-distribution with nx + ny - 2 degrees of freedom under the null hypothesis.
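The pooled-variance formulas above translate directly into a short Python sketch (illustrative only; t_pooled is a hypothetical helper):

```python
from math import sqrt
from statistics import mean, variance  # variance() uses the n-1 divisor, like var()

def t_pooled(x, y, mu=0.0):
    """Pooled-variance two-sample t statistic, nx + ny - 2 degrees of freedom."""
    nx, ny = len(x), len(y)
    sp = sqrt(((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2))
    s1 = sp * sqrt(1 / nx + 1 / ny)
    return (mean(x) - mean(y) - mu) / s1

# For x = [1, 2, 3], y = [4, 5, 6]: sp = 1 and s1 = sqrt(2/3),
# so t = -3 / sqrt(2/3) = -3 * sqrt(3/2).
print(t_pooled([1, 2, 3], [4, 5, 6]))
```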
(d) Welch Modified Two-Sample t-Test.
If y is not NULL, paired=FALSE and var.equal=FALSE, the Welch modified two-sample t-test is performed. In this case statistic is
t = (mean(x) - mean(y) - mu) / s2
with
s2 = sqrt( var(x)/nx + var(y)/ny )
nx = length(x), ny = length(y)
If x and y come from normal populations, the distribution of t under the null hypothesis can be approximated by a t-distribution with (non-integral) degrees of freedom
1 / ( (c^2)/(nx-1) + ((1-c)^2)/(ny-1) )
where
c = var(x) / (nx * s2^2).
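Both the Welch statistic and its approximate degrees of freedom can be sketched in Python (illustrative only; welch_t is a hypothetical helper):

```python
from math import sqrt
from statistics import mean, variance  # variance() uses the n-1 divisor, like var()

def welch_t(x, y, mu=0.0):
    """Welch t statistic and its approximate (non-integral) degrees of freedom."""
    nx, ny = len(x), len(y)
    s2_sq = variance(x) / nx + variance(y) / ny   # s2^2
    t = (mean(x) - mean(y) - mu) / sqrt(s2_sq)
    c = variance(x) / (nx * s2_sq)
    df = 1 / (c**2 / (nx - 1) + (1 - c)**2 / (ny - 1))
    return t, df

# With equal sample variances and equal sample sizes, c = 1/2 and the
# degrees of freedom reduce to nx + ny - 2, matching the pooled test.
t, df = welch_t([1, 2, 3], [4, 5, 6])
print(t, df)
```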
For each of the above tests, an expression for the related confidence interval (returned component conf.int) can be obtained in the usual way by inverting the expression for the test statistic. Note, however, that as explained under the description of conf.int, the confidence interval will be half-infinite when alternative is not "two.sided"; infinity will be represented by NA.
Box, G. E. P. (1953), "Non-normality and Tests on Variances," Biometrika, pp. 318-335.
Hogg, R. V. and Craig, A. T. (1970), Introduction to Mathematical Statistics, 3rd ed. Toronto, Canada: Macmillan.
Mood, A. M., Graybill, F. A. and Boes, D. C. (1974), Introduction to the Theory of Statistics, 3rd ed. New York: McGraw-Hill.
Snedecor, G. W. and Cochran, W. G. (1980), Statistical Methods, 7th ed. Ames, Iowa: Iowa State University Press.
x <- rnorm(12)
t.test(x)
# Two-sided one-sample t-test. The null hypothesis is
# that the population mean for 'x' is zero. The
# alternative hypothesis states that it is either greater
# or less than zero. A confidence interval for the
# population mean will be computed.
data.before <- c(31, 20, 18, 17, 9, 8, 10, 7)
data.after <- c(18, 17, 14, 11, 10, 7, 5, 6)
t.test(data.after, data.before,
alternative="less", paired=T)
# One-sided paired t-test. The null hypothesis is that
# the population mean "before" and the one "after" are
# the same, or equivalently that the mean change ("after"
# minus "before") is zero. The alternative hypothesis is
# that the mean "after" is less than the one "before",
# or equivalently that the mean change is negative. A
# confidence interval for the mean change will be
# computed.
x <- c(7.8, 6.6, 6.5, 7.4, 7.3, 7., 6.4, 7.1, 6.7, 7.6, 6.8)
y <- c(4.5, 5.4, 6.1, 6.1, 5.4, 5., 4.1, 5.5)
t.test(x, y, mu=2)
# Two-sided pooled-variance two-sample t-test.
# This assumes that the two population variances are equal.
# The null hypothesis is that the population mean for 'x'
# minus that for 'y' is 2.
# The alternative hypothesis is that this difference
# is not 2. A confidence interval for the true difference
# will be computed.
t.test(x, y, var.equal=F, conf.level=0.90)
# Two-sided Welch modified two-sample t-test. The null
# hypothesis is that the population means for 'x' and 'y'
# are the same. The alternative hypothesis is that they
# are not. The confidence interval for the difference in
# true means ('x' minus 'y') will have a confidence level
# of 0.90.