t.test(x, y=NULL, alternative="two.sided", mu=0, paired=F,
       var.equal=F, conf.level=.95, treatment)
saddlepoint.test(x, y=NULL, alternative="two.sided", mu=0, paired=F,
       var.equal=F, conf.level=.95, treatment)
x: NAs and Infs are allowed but will be removed.
y: NAs and Infs are allowed but will be removed. If paired=TRUE, then x and y must have the same length, and observation pairs (x[i], y[i]) with at least one NA or Inf will be removed.
alternative: one of "greater", "less" or "two.sided", or just the initial letter of each, indicating the specification of the alternative hypothesis. For the one-sample and paired t-tests, alternative refers to the true mean of the parent population in relation to the hypothesized value mu. For two-sample t-tests, alternative refers to the difference between the true population mean for x and that for y, in relation to mu.
paired: if TRUE, x and y are considered as paired vectors.
var.equal: if TRUE, the variances of the parent populations of x and y are assumed equal. Argument var.equal should be supplied only for the two-sample (i.e., unpaired) tests.
treatment: a vector the same length as x, with two unique values. This is a grouping variable used to split x into two samples. If supplied then y should not be used.
a list of class c("htest") (for t.test) or c("htest.saddlepoint", "htest") (for saddlepoint.test), containing the following components:
statistic: the t-statistic, with names attribute "t".
parameters: the degrees of freedom of the t-distribution associated with statistic. Component parameters has names attribute "df".
conf.int: a confidence interval for the true mean or difference in means, with confidence level given by conf.level. When alternative is not "two.sided", the confidence interval will be half-infinite, to reflect the interpretation of a confidence interval as the set of all values k for which one would not reject the null hypothesis that the true mean or difference in means is k. Here infinity will be represented by NA.
estimate: the sample mean(s) or mean of differences. Component estimate has a names attribute describing its elements.
null.value: the hypothesized mean or difference in means, i.e., argument mu. Component null.value has a names attribute describing its elements.
alternative: records the value of the input argument alternative: "greater", "less" or "two.sided".
data.name: a character string giving the names of the input vectors x and y.
In addition, in one-sample or paired applications, saddlepoint.test adds a "saddlepoint" attribute containing a saddlepoint approximation to bootstrap tilting confidence intervals. The attribute has components:
For the one-sample t-test, the null hypothesis is that the mean of the population from which x is drawn is mu. For the paired t-test, the null hypothesis is that the population mean of the difference x - y is equal to mu. For the two-sample t-tests, the null hypothesis is that the population mean for x minus that for y is mu. The alternative hypothesis in each case indicates the direction of divergence of the population mean for x (or difference of means for x and y) from mu (i.e., "greater", "less", "two.sided").
A t-statistic has a t distribution if the underlying populations are normal, the variances are equal, and you set var.equal=TRUE. These conditions are never satisfied in practice. More importantly, the actual distribution is approximately a t distribution if the sample sizes are reasonably large, the distributions are not skewed, and you set var.equal=FALSE. You should set var.equal=TRUE only if you have good reason to believe the variances are equal. We recommend that you not perform a hypothesis test for equal variances -- while t-tests are robust against non-normality, variance tests are not; "To make a preliminary test on variances is rather like putting to sea in a rowing boat to find out whether conditions are sufficiently calm for an ocean liner to leave port!" (Box, p. 333).
The effect of skewness cancels out in a two-sample problem with equal sample sizes where the underlying populations have the same variance and skewness. In one-sample problems, or when sample sizes differ, the effect of skewness on the distribution of the t-statistic disappears very slowly as the sample size increases, at the rate O(1/sqrt(n)).
The t-test and the associated confidence interval are quite robust with respect to level toward heavy-tailed non-Gaussian distributions (e.g., data with outliers). However, the t-test is quite non-robust with respect to power, and the confidence interval is quite non-robust with respect to average length, toward these same types of distributions.
The usual t-test results are not very robust against skewed distributions (except in large samples). If the distribution is skewed, these procedures have errors of order O(1/sqrt(n)). The bootstrap tilting intervals do not assume symmetry, and have errors of order O(1/n).
The saddlepoint intervals are a topic of current research.
(a) One-Sample t-Test.
The arguments y, paired and var.equal determine the type of test. If y is NULL, a one-sample t-test is carried out with x. Here statistic is given by:

  t = (mean(x) - mu) / ( sqrt(var(x)) / sqrt(length(x)) )

If x was drawn from a normal population, t has a t-distribution with length(x) - 1 degrees of freedom under the null hypothesis.
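As a numerical cross-check of this formula (a sketch in Python with numpy/scipy rather than S-PLUS, reusing the 'x' data from the EXAMPLES section), the hand-computed statistic can be compared with scipy's one-sample t-test:

```python
import numpy as np
from scipy import stats

# The 'x' data from the EXAMPLES section; mu = 7.0 is an arbitrary choice.
x = np.array([7.8, 6.6, 6.5, 7.4, 7.3, 7.0, 6.4, 7.1, 6.7, 7.6, 6.8])
mu = 7.0

# t = (mean(x) - mu) / ( sqrt(var(x)) / sqrt(length(x)) )
t = (x.mean() - mu) / (x.std(ddof=1) / np.sqrt(len(x)))

# Reference values from scipy's implementation.
t_ref, p_ref = stats.ttest_1samp(x, popmean=mu)
```

The second assertion below also confirms the length(x) - 1 degrees of freedom stated above.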
(b) Paired t-Test.
If y is not NULL and paired=TRUE, a paired t-test is performed; here statistic is defined through

  t = (mean(d) - mu) / ( sqrt(var(d)) / sqrt(length(d)) )

where d is the vector of differences x - y. Under the null hypothesis, t follows a t-distribution with length(d) - 1 degrees of freedom, assuming normality of the differences d.
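The same kind of cross-check works for the paired formula (again a Python/scipy sketch, not S-PLUS), using the "before"/"after" data from the EXAMPLES section:

```python
import numpy as np
from scipy import stats

# "Before"/"after" data from the EXAMPLES section.
before = np.array([31, 20, 18, 17, 9, 8, 10, 7], dtype=float)
after = np.array([18, 17, 14, 11, 10, 7, 5, 6], dtype=float)

d = after - before  # the vector of differences x - y
# t = (mean(d) - mu) / ( sqrt(var(d)) / sqrt(length(d)) ), with mu = 0
t = d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))

# Reference values from scipy's paired t-test.
t_ref, p_ref = stats.ttest_rel(after, before)
```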
(c) Pooled-Variance Two-Sample t-Test.
If y is not NULL and paired=FALSE, either a pooled-variance or Welch modified two-sample t-test is performed, depending on whether var.equal is TRUE or FALSE. For the pooled-variance t-test, statistic is

  t = (mean(x) - mean(y) - mu) / s1,

with

  s1 = sp * sqrt(1/nx + 1/ny),
  sp = sqrt( ( (nx-1)*var(x) + (ny-1)*var(y) ) / (nx + ny - 2) ),
  nx = length(x), ny = length(y).

Assuming that x and y come from normal populations with equal variances, t has a t-distribution with nx + ny - 2 degrees of freedom under the null hypothesis.
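The pooled-variance formulas can be checked the same way (a Python/scipy sketch, with the two-sample data from the EXAMPLES section and mu = 0):

```python
import numpy as np
from scipy import stats

# Two-sample data from the EXAMPLES section.
x = np.array([7.8, 6.6, 6.5, 7.4, 7.3, 7.0, 6.4, 7.1, 6.7, 7.6, 6.8])
y = np.array([4.5, 5.4, 6.1, 6.1, 5.4, 5.0, 4.1, 5.5])
nx, ny = len(x), len(y)

# sp and s1 exactly as defined above.
sp = np.sqrt(((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1))
             / (nx + ny - 2))
s1 = sp * np.sqrt(1 / nx + 1 / ny)
t = (x.mean() - y.mean() - 0.0) / s1  # mu = 0

# Reference values from scipy's pooled-variance test.
t_ref, p_ref = stats.ttest_ind(x, y, equal_var=True)
```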
(d) Welch Modified Two-Sample t-Test.
If y is not NULL, paired=FALSE and var.equal=FALSE, the Welch modified two-sample t-test is performed. In this case statistic is

  t = (mean(x) - mean(y) - mu) / s2

with

  s2 = sqrt( var(x)/nx + var(y)/ny ), nx = length(x), ny = length(y).

If x and y come from normal populations, the distribution of t under the null hypothesis can be approximated by a t-distribution with (non-integral) degrees of freedom

  1 / ( (c^2)/(nx-1) + ((1-c)^2)/(ny-1) ),

where

  c = var(x) / (nx * s2^2).
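The Welch statistic and its approximate degrees of freedom can likewise be verified numerically (a Python/scipy sketch with the same EXAMPLES data; the second assertion confirms that the non-integral degrees of freedom above reproduce scipy's p-value):

```python
import numpy as np
from scipy import stats

# Two-sample data from the EXAMPLES section.
x = np.array([7.8, 6.6, 6.5, 7.4, 7.3, 7.0, 6.4, 7.1, 6.7, 7.6, 6.8])
y = np.array([4.5, 5.4, 6.1, 6.1, 5.4, 5.0, 4.1, 5.5])
nx, ny = len(x), len(y)

# s2, c and the degrees of freedom exactly as defined above.
s2 = np.sqrt(x.var(ddof=1) / nx + y.var(ddof=1) / ny)
t = (x.mean() - y.mean()) / s2  # mu = 0
c = x.var(ddof=1) / (nx * s2 ** 2)
df = 1 / (c ** 2 / (nx - 1) + (1 - c) ** 2 / (ny - 1))

# Reference values from scipy's Welch test.
t_ref, p_ref = stats.ttest_ind(x, y, equal_var=False)
```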
For each of the above tests, an expression for the related confidence interval (returned component conf.int) can be obtained in the usual way by inverting the expression for the test statistic. Note however that, as explained under the description of conf.int, the confidence interval will be half-infinite when alternative is not "two.sided"; infinity will be represented by NA.
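A minimal sketch of this inversion for the one-sample case (Python/scipy rather than S-PLUS): the two-sided interval endpoints are mean(x) +/- t-quantile * standard error, and at either endpoint the test's two-sided p-value equals exactly 1 - conf.level.

```python
import numpy as np
from scipy import stats

# The 'x' data from the EXAMPLES section; conf.level = 0.95 as in USAGE.
x = np.array([7.8, 6.6, 6.5, 7.4, 7.3, 7.0, 6.4, 7.1, 6.7, 7.6, 6.8])
n = len(x)
conf_level = 0.95

se = x.std(ddof=1) / np.sqrt(n)
tcrit = stats.t.ppf(1 - (1 - conf_level) / 2, df=n - 1)
lower, upper = x.mean() - tcrit * se, x.mean() + tcrit * se
```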
Box, G. E. P. (1953), "Non-normality and Tests on Variances," Biometrika, pp. 318-335.
Hogg, R. V. and Craig, A. T. (1970), Introduction to Mathematical Statistics, 3rd ed. Toronto, Canada: Macmillan.
Mood, A. M., Graybill, F. A. and Boes, D. C. (1974), Introduction to the Theory of Statistics, 3rd ed. New York: McGraw-Hill.
Snedecor, G. W. and Cochran, W. G. (1980), Statistical Methods, 7th ed. Ames, Iowa: Iowa State University Press.
x <- rnorm(12)
t.test(x)
# Two-sided one-sample t-test. The null hypothesis is
# that the population mean for 'x' is zero. The
# alternative hypothesis states that it is either greater
# or less than zero. A confidence interval for the
# population mean will be computed.

data.before <- c(31, 20, 18, 17, 9, 8, 10, 7)
data.after <- c(18, 17, 14, 11, 10, 7, 5, 6)
t.test(data.after, data.before, alternative="less", paired=T)
# One-sided paired t-test. The null hypothesis is that
# the population mean "before" and the one "after" are
# the same, or equivalently that the mean change ("after"
# minus "before") is zero. The alternative hypothesis is
# that the mean "after" is less than the one "before",
# or equivalently that the mean change is negative. A
# confidence interval for the mean change will be
# computed.

x <- c(7.8, 6.6, 6.5, 7.4, 7.3, 7., 6.4, 7.1, 6.7, 7.6, 6.8)
y <- c(4.5, 5.4, 6.1, 6.1, 5.4, 5., 4.1, 5.5)
t.test(x, y, mu=2, var.equal=T)
# Two-sided pooled-variance two-sample t-test.
# This assumes that the two population variances are equal.
# The null hypothesis is that the population mean for 'x'
# minus that for 'y' is 2.
# The alternative hypothesis is that this difference
# is not 2. A confidence interval for the true difference
# will be computed.

t.test(x, y, var.equal=F, conf.level=0.90)
# Two-sided Welch modified two-sample t-test. The null
# hypothesis is that the population means for 'x' and 'y'
# are the same. The alternative hypothesis is that they
# are not. The confidence interval for the difference in
# true means ('x' minus 'y') will have a confidence level
# of 0.90.