COMPARING THE OVERLAPPING OF TWO INDEPENDENT CONFIDENCE
INTERVALS WITH A SINGLE CONFIDENCE INTERVAL FOR
TWO NORMAL POPULATION PARAMETERS
Except where reference is made to the work of others, the work described in this
dissertation is my own or was done in collaboration with my advisory
committee. This dissertation does not include proprietary or
classified information.
_________________________________________________
Ching Ying Huang
Certificate of Approval:
_____________________________
Saeed Maghsoodloo, Chair
Professor
Industrial and Systems Engineering

_____________________________
Alice E. Smith
Professor
Industrial and Systems Engineering

_____________________________
Kevin T. Phelps
Professor
Mathematics and Statistics

_____________________________
George T. Flowers
Dean
Graduate School
COMPARING THE OVERLAPPING OF TWO INDEPENDENT CONFIDENCE
INTERVALS WITH A SINGLE CONFIDENCE INTERVAL FOR
TWO NORMAL POPULATION PARAMETERS
Ching-Ying Huang
A Dissertation
Submitted to
the Graduate Faculty of
Auburn University
in Partial Fulfillment of the
Requirements for the
Degree of
Doctor of Philosophy
Auburn, Alabama
December 19, 2008
COMPARING THE OVERLAPPING OF TWO INDEPENDENT CONFIDENCE
INTERVALS WITH A SINGLE CONFIDENCE INTERVAL FOR
TWO NORMAL POPULATION PARAMETERS
Ching Ying Huang
Permission is granted to Auburn University to make copies of this dissertation at its
discretion, upon request of individuals or institutions and at their expense. The
author reserves all publication rights.
__________________________________
Signature of Author
___________________________________
Date of Graduation
VITA
Ching-Ying Huang, daughter of Shuh-Peir Huang and Yueh-E Lin, was born
April 14, 1979, in Taipei, Taiwan. She entered Chang Gung University at Taoyuan,
Taiwan, in September 1998 and received the Bachelor of Science degree in Business
Administration in June 2002. She began her graduate program in Industrial and Systems
Engineering at Auburn University in August 2003. She married Chih-Wei Chiang on
December 30, 2006.
DISSERTATION ABSTRACT
COMPARING THE OVERLAPPING OF TWO INDEPENDENT CONFIDENCE
INTERVALS WITH A SINGLE CONFIDENCE INTERVAL FOR
TWO NORMAL POPULATION PARAMETERS
Ching-Ying Huang
Doctor of Philosophy, December 19, 2008
(B.S., Chang Gung University, 2002)
156 Typed Pages
Directed by Saeed Maghsoodloo
Two overlapping confidence intervals have been used in many sources in the past
30 years to conduct statistical inferences about two normal population means (μx and μy).
Several authors have examined the shortcomings of the Overlap procedure in the past 13
years and have determined that such a method completely distorts the significance level
of testing the null hypothesis H0: μx = μy and reduces the statistical power of the test.
Nearly all results for small sample sizes in the Overlap literature have been obtained either
by simulation or by somewhat inaccurate formulas, and only large-sample (or known-variance)
exact information has been provided. Nevertheless, there are many aspects of
Overlap that have not yet been presented in the literature and compared against the
standard statistical procedure. This dissertation will present exact formulas for the % overlap,
ranging in the interval (0, 61.3626%] for a 0.05-level test, that two independent
confidence intervals (CIs) can have, but the null hypothesis of equality of two population
means must still be rejected at a preassigned level of significance α for sample sizes ≥ 2.
The exact impact of Overlap on the α-level and the power of the pooled-t test will
also be presented. Further, the impact of Overlap on the power of the F-statistic in testing
the null hypothesis of equality of two normal process variances will be assessed. Finally,
we will use the noncentral t distribution, which has never been applied in the Overlap
literature, to assess the Overlap impact on type II error probability when testing H0: μx =
μy for sample sizes nx and ny ≥ 2.
ACKNOWLEDGEMENT
The author would like to thank her parents, Shuh-Peir Huang and Yueh-E Lin, and
her two sisters, Yu-Hsin and Yu-Li Huang, for their positive attitude and encouragement.
Especially, she dedicates this work to her husband, Chih-Wei Chiang, whose patience
and moral support have made it possible.
The author expresses her gratitude to Professor Saeed Maghsoodloo for his
experienced knowledge of and assistance with this research, and to Professor Alice E.
Smith for her support during the author's graduate program. The author is also thankful
to Professor Kevin Phelps for helpful suggestions.
Computer software used: MS Excel
MS Word
MathType
Matlab
TABLE OF CONTENTS
List of Tables .......................................................... xi
List of Notation ........................................................ xiii
1.0 Introduction ........................................................ 1
2.0 Literature Review ................................................... 6
3.0 Comparing Two Normal Population Means for the Known-Variances Case, or the
    Limiting Unknown-Variances Case Where Both Sample Sizes Approach Infinity .. 12
    3.1 The Case of σx = σy = σ ......................................... 13
    3.2 The Case of Known but Unequal Variances ......................... 27
4.0 Bonferroni Intervals for Comparing Two Sample Means ................. 44
5.0 Comparing the Overlap of Two Independent CIs with a Single CI for the Ratio of
    Two Normal Population Variances .................................... 52
6.0 The Impact of Overlap on Type I Error Probability of H0: μx = μy for Unknown
    Normal Process Variances and Sample Sizes .......................... 71
    6.1 The Case of H0: σx = σy = σ Not Rejected, Leading to the Pooled t-Test .. 72
    6.2 The Case of H0: σx = σy Rejected, Leading to the Two-Independent-Sample
        t-Test ......................................................... 77
    6.3 Comparing the Paired t-CI with Two Independent t-CIs ............ 85
7.0 The Percent Overlap that Leads to the Rejection of H0: μx = μy
    7.1 The Case of Unknown σx = σy = σ ................................ 92
    7.2 The Case of H0: σx = σy Rejected, Leading to the Two-Independent-Sample
        t-Test ......................................................... 98
    7.3 Comparing the Paired t-CI with Two Independent t-CIs ............ 97
8.0 The Impact of Overlap on Type II Error Probability for the Case of Unknown
    Process Variances σx², σy² and Small to Moderate Sample Sizes ...... 108
    8.1 The Case of H0: σx = σy = σ Not Rejected, Leading to the Pooled t-Test .. 109
    8.2 The Case of H0: σx = σy Rejected, Leading to the Two-Independent-Sample
        t-Test (or the t-Prime Test) ................................... 117
    8.3 The Impact of Overlap on Type II Error Probability for the Paired t-Test (i.e.,
        the Randomized Block Design) when Process Variances are Unknown . 121
9.0 Conclusions and Future Research ..................................... 127
10.0 References ......................................................... 132
Appendices .............................................................. 136
    Appendix A .......................................................... 137
    Appendix B .......................................................... 139
LIST OF TABLES
Table 1.  The Relative Power of Overlap as Compared to the Standard Method for
          Different Sample Sizes n and Δ/(σ√2) Combinations ............. 24
Table 2.  Summary Conclusion of α and α′ ................................ 26
Table 3.  Type I Error Prs at α = 0.05 from the Standard Method and at α = 0.16578
          from the Overlap Method ....................................... 26
Table 4.  The Type I Error Pr of Two Individual CIs with Different k at α = 0.05 and
          0.01 .......................................................... 30
Table 5.  Values of α′ Versus k at α = 0.05 and α = 0.01 ................ 34
Table 6A. The Relative Power of Overlap with the Standard Method for Different
          Sample Sizes n and Δ/(σ√2) Combinations for the Case of Known but
          Unequal Variances ............................................. 38
Table 6B. RELEFF of Overlap to the Standard Method at α = 0.05 and K = 1  42
Table 7.  Type I Errors for the Overlap and Bonferroni Methods at α = 0.05 .. 47
Table 8.  The Impact of Bonferroni on Percent Overlap at Different k .... 49
Table 9.  Type I Error Pr for the Standard, Overlap, and Bonferroni Methods with
          Different k and d Combinations ................................ 50
Table 10. The Values of α and α′ for Various Values of νx and νy ........ 56
Table 11. The Impact of Overlap on Type I Error Pr for the Equal-Sample-Size Case
          When Testing the Ratio σx²/σy² Against 1 ...................... 58
Table 12. The % Overlap for Different Combinations of Degrees of Freedom at
          α = 0.05 ...................................................... 60
Table 13. The % Overlap for the Case of α = 0.05 and nx = ny = n ........ 61
Table 14. The Overlap Significance Level, α, that Yields the Same 5%-Level Test
          or 1%-Level Test by the Standard Method ....................... 64
Table 15. The Overlap Significance Level, α, That Yields the Same 5%-Level Test
          or 1%-Level Test by the Standard Method at Fixed νy and Changing νx .. 65
Table 16. The Relative Power of the Overlap to the Standard Method for Different
          df Combinations at σx²/σy² = 1.2 .............................. 67
Table 17. Type II Error for Different Degrees of Freedom ................ 67
Table 18. Type II Error Pr for the Overlap Method at Different α and σx²/σy²
          Combinations .................................................. 69
Table 19. Comparison of the Exact Type II Error Pr with That of the Overlap Method
          for Different df and σx²/σy² Combinations ..................... 70
Table 20. The Pooled α′ Values for Different nx, ny and F0 Combinations . 78
Table 21. Verifying the Inequality that min(νx, νy) < ν < νx + νy for Different
          νx and νy Combinations ........................................ 80
Table 22. The Value of γr for Different F0 and Rn Combinations .......... 96
Table 23. The γ Value for Different Combinations of nx, ny and Rn at Either
          F0 = F(0.90; νx, νy) or F0 = F(0.05; νx, νy) .................. 104
LIST OF NOTATION
SMD       sampling distribution
μx        mean of population X
μy        mean of population Y
x̄         mean of sample X
ȳ         mean of sample Y
Sx²       variance of sample X = Σ(xi − x̄)²/(nx − 1), summed over i = 1, …, nx
Sy²       variance of sample Y
σx²       variance of population X
σy²       variance of population Y
H0        null hypothesis
H1        alternative hypothesis
CI        confidence interval
CIL       confidence interval length
L(θ)      lower (1 − α)% CI limit for θ
U(θ)      upper (1 − α)% CI limit for θ
A_L       lower bound for acceptance interval
A_U       upper bound for acceptance interval
Pr        probability
LOS       level of significance
α         type I error Pr = Pr(reject H0 | H0 is true) by the Standard Method
α′        type I error Pr from two overlapping CIs
α1′       type I error Pr from two overlapping CIs for the one-sided alternative
αB′       type I error Pr using the Bonferroni procedure
β         type II error Pr = Pr(not rejecting H0 | H0 is false)
β′        type II error Pr from overlapping CIs
β1        type II error Pr for the one-sided alternative
βB′       Bonferroni type II error Pr
SE        population standard error
se        sample standard error
K         standard error ratio for populations, K = (σx/√nx)/(σy/√ny)
k         standard error ratio for samples, k = (Sx/√nx)/(Sy/√ny)
df        degrees of freedom; the symbol ν will denote df
OC Curve  operating characteristic curve
ω         amount of overlap length between the two individual CIs
ωr        borderline value of ω at which H0 is barely rejected at the LOS α
γ         the exact percentage of the overlap
γr        maximum percent overlap below which H0 must be rejected at or below α
N(μ, σ²)  a normal Pr density function (pdf) with population mean μ and
          population variance σ²
Φ(z)      the cumulative distribution function (cdf) of the standard normal
          density at point z
PWF       power function (the graph of 1 − β versus the parameter under H0)
F0        the ratio of two sample variances, Sx²/Sy²
Rn        the ratio of two sample sizes, ny/nx
Δ         μx − μy
λ         equals (μx − μy)/(σ√(1/nx + 1/ny)), which represents the
          noncentrality parameter of the pooled-t statistic
δ         studentized Δ = μx − μy when σx² ≠ σy², i.e.,
          δ = (μx − μy)/√(Sx²/nx + Sy²/ny)
LUB       least upper bound
GLB       greatest lower bound
RELEFF    relative efficiency
ARE       asymptotic RELEFF
1.0 Introduction
When testing the equality of means of two processes, the sampling distribution
(SMD) of the difference of two sample means must be used to conduct statistical
inference (confidence intervals and tests of hypothesis) about the corresponding processes'
mean difference μx − μy. An interesting problem arises as to whether the same
conclusions will be reached if the SMDs of the individual sample means are used to construct
separate confidence intervals for μx and μy and the amount of overlap of the
individual confidence intervals is examined in order to make statistical inferences about μx − μy. If
the underlying distributions are normal with known variances, exact relationships are
given by Schenker and Gentleman (2001) about the changes in the type I and II error
probabilities if the overlapping of individual confidence intervals is used to make
inferences about μx − μy at the 5% level. Because there is no mention of proof in the
above article, we will use normal theory to generalize their formulas in chapter 3 for
any LOS α and will verify that in order to attain a nominal type I error rate of 5%, the
corresponding two confidence levels must be set exactly at 83.42237%, which is nearly
consistent with the 85% reported by Payton et al. (2000).
When the process variances are unknown and sample sizes are small (i.e., the
cases encountered in real life), this dissertation will obtain exact formulas for type I and II
error probabilities whose values can be obtained once the unbiased estimators, Sx² and Sy²,
of the process variances are realized. However, this dissertation will verify that, in general,
using individual confidence intervals diminishes the type I error rate, depending on sample
sizes nx and ny, and increases the type II error probability. Assessment of type II error
probability (β) for the general unknown-variances case has not been investigated in the
literature because the computation of type II error probability requires the use of the
noncentral t-distribution, although Schenker and Gentleman (2001) provide the impact of
Overlap on the power function (PWF = 1 − β) only for the limiting case in terms of nx
and ny (or the known-variances case). The noncentral t-distribution has widespread
applications when testing a hypothesis about one or two normal means. Specifically, both
the OC (operating characteristic) and power function (PWF = 1 − β) for testing H0: μ =
μ0 and H0: μx − μy = Δ0 (in the unknown-variance cases) are constructed using the
noncentral t-distribution. We will use the noncentral t-distribution to obtain the PWF of
testing H0: μx − μy = 0 (in the unknown-variance cases) both using the SMD of x̄ − ȳ
(i.e., the Standard method, which has been available in the statistical literature for well over
50 years) and also the Overlap for sample sizes ≥ 2. It will be determined that the type II
error rate always increases if individual confidence intervals are used to make inferences
about μx − μy. Even if the underlying distributions are not Laplace-Gaussian*, the t-distribution
can still be used for statistical inferences about two process means for
moderate and large sample sizes because the application of the t-distribution requires only the
assumption that the sample means be approximately normally distributed (due to the
Central Limit Theorem).

* Kendall and Stuart (1963, Vol. 1, p. 135) report that "The description of the distribution as the 'normal,'
due to Karl Pearson (who is known for the definition of the product-moment correlation coefficient and the
Pearson system of statistical distributions), is now almost universal among English writers. Continental
writers refer to it variously as the second law of Laplace, the Laplace distribution, the Gauss distribution,
the Laplace-Gauss distribution and the Gauss-Laplace distribution. As an approximation to the binomial it
was reached by DeMoivre in 1738 but he did not discuss its properties."
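The role of the noncentral t-distribution in computing the PWF can be illustrated numerically. The sketch below uses Python with SciPy (not among the software packages listed for this dissertation), and the sample sizes, σ, and mean difference are illustrative assumptions only; it evaluates the power of the two-sided pooled-t test of H0: μx − μy = 0 from the noncentrality parameter λ = (μx − μy)/(σ√(1/nx + 1/ny)):

```python
# Power of the two-sided pooled-t test via the noncentral t-distribution.
# Illustrative inputs (assumptions): n_x = n_y = 10, sigma = 1, mu_x - mu_y = 1.
from math import sqrt
from scipy.stats import t, nct

alpha, n_x, n_y, sigma, delta = 0.05, 10, 10, 1.0, 1.0
nu = n_x + n_y - 2                           # pooled-t degrees of freedom
lam = delta / (sigma * sqrt(1/n_x + 1/n_y))  # noncentrality parameter
t_crit = t.ppf(1 - alpha/2, nu)              # two-sided critical value
# PWF = Pr(|T'| > t_crit), where T' is noncentral t with df nu and noncentrality lam
power = 1 - nct.cdf(t_crit, nu, lam) + nct.cdf(-t_crit, nu, lam)
print(round(power, 4))
```

For these inputs the computed power is roughly 0.56, the familiar textbook value for a standardized difference of 1 with 10 observations per group.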
Investigation of the overlapping CIs is worthwhile because, as Schenker and
Gentleman (2001) mentioned in the article "On Judging the Significance of Differences
by Examining the Overlap Between Confidence Intervals," there are many articles,
such as Mancuso (2001), that still use the Overlap method for testing the equality of two
population quantities. Although we found some articles, such as Payton et al. (2000),
entitled "Testing Statistical Hypotheses Using the Standard Error Bars and Confidence
Intervals," that have somewhat rectified the Overlap problem and have pointed out the
misconceptions therein, there are still some details to be worked out. Thus, the objective
is to investigate the exact differences between the Overlap method and the Standard [a
term coined by Schenker and Gentleman (2001)] method for testing the null hypotheses
H0: σx = σy and H0: μx = μy under different assumptions. The former hypothesis has
never been investigated with the Overlap method. The statistical literature reports results
for the impact of Overlap on type I and II error probabilities in testing H0: μx = μy only
for the case of large sample sizes (i.e., the limiting case where nx and ny → ∞). Therefore,
this work will investigate the same and other aspects of Overlap but for small sample
sizes (i.e., n ≤ 20, which also will hold true for moderate and large sample sizes). To be
on the conservative side, we refer to n ≤ 20 as small, 20 < n ≤ 50 as moderate and n > 50
as large in this dissertation, although some statisticians prefer n > 60 as large because for
n > 60, t(α, ν) ≈ Z(α) to one decimal place, where Z(α) represents the (1 − α) quantile of the
standard normal deviate.
The contents of the different chapters are as follows: In chapter 2, an extensive
literature survey and the results thus far are provided. In chapters 3.1 and 3.2, the known-variance
case is discussed and compared with what has been reported without proof in the
literature for the limiting case. In chapter 4, the Bonferroni method is compared against
the Overlap. In chapter 5, the statistical inference on the ratio of two process variances
(σx²/σy²) from the Overlap is compared against the Standard method. Chapter 6
discusses the impact of Overlap on type I error probability. Chapter 7 discusses the
amount and % overlap required to reject H0: μx = μy at the α-level of significance when
process variances are unknown and sample sizes are small and moderate. Similarly,
chapter 8 considers the impact of Overlap on type II error Pr when process variances are
unknown for nx and ny ≤ 50. Finally, chapter 9 summarizes the dissertation findings.
In summary, the primary objectives of this dissertation are: (1) To examine the
impact of the Overlap procedure on type I error probability (Pr) when testing the equality of two
process variances or two population means for unknown process variances and sample
sizes ≥ 2. Payton et al. (2000) obtained results for the latter objective, but there are
inaccuracies (for n < 50) in their development; further, the former objective has not been
investigated. Moreover, the Overlap literature has not considered the case of the pooled t-test,
and little has been mentioned by Schenker and Gentleman (2001) about the paired t-test.
(2) To determine the maximum % overlap of two individual confidence intervals
(CIs) below which the null hypothesis (either H0: σx = σy or H0: μx = μy) must still be
rejected at a given level of significance (LOS) α. This objective has not yet been
investigated. (3) To examine the impact of the Overlap procedure on type II error Pr for
sample sizes nx and ny ≥ 2. Schenker and Gentleman (2001) carried out this last objective
only for the limiting case (i.e., as nx and ny → ∞, and/or known σx and σy).
The above objectives are worthy of further investigation because there are many
researchers who still use overlapping CIs to test hypotheses, especially in biology
and medical papers (see the references mentioned in chapter 3.1). Furthermore, some
statistical software, such as Minitab, still exhibits overlapping CIs that may lead users to
wrong conclusions. Although the CI for two population quantities is the common method
for making decisions regarding H0: σx = σy or H0: μx = μy, our objective is to ascertain the
exact relationship between the overlapping of two individual CIs and the corresponding
single CI. Most former research has only discussed the limiting case (i.e., as nx and ny
→ ∞). In real-life situations, sufficient resources may not be available to gather very
large samples. Thus, the case of small (n ≤ 20) to moderate sample sizes (20 < n ≤ 50) is
a major contribution of this dissertation.
The reader should bear in mind that all primed symbols in this dissertation pertain
to the Overlap method.
2.0 Literature Review
It has been well known that when the two underlying populations are normal, the null
hypothesis H0: μx = μy is tested, in the case of known variances, using the sampling
distribution of X̄ − Ȳ, which is also the Laplace-Gaussian N(μx − μy, σx²/nx + σy²/ny).
However, in practice, the population variances are rarely known. Thus, the equality of
two process variances should first be tested with an F-statistic. If H0: σx²/σy² = 1 is not
rejected (and the P-value > 0.20), the two-sample pooled-t procedure will be applied for
testing H0: μx = μy. Otherwise, the two-independent-sample t-statistic has to be used to
perform statistical inferences about μx − μy. In the case of related samples (or paired
observations), the paired t-statistic has to be used to conduct statistical inferences about
μx − μy.
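The Standard decision procedure just described can be sketched in code. The following Python fragment is only an illustration (SciPy is not among the software packages listed for this dissertation, and the helper name standard_test is an invented label); it first tests H0: σx²/σy² = 1 with the F-statistic, applies the P-value screen of 0.20 stated above, and then uses either the pooled-t or the two-independent-sample (Welch) t-statistic:

```python
# Standard procedure: F-test on the variance ratio, then the appropriate t-test.
import numpy as np
from scipy import stats

def standard_test(x, y, alpha=0.05):
    x, y = np.asarray(x, float), np.asarray(y, float)
    nx, ny = len(x), len(y)
    F0 = np.var(x, ddof=1) / np.var(y, ddof=1)       # ratio of sample variances
    # Two-sided P-value for H0: sigma_x^2 / sigma_y^2 = 1
    p_F = 2 * min(stats.f.cdf(F0, nx - 1, ny - 1),
                  1 - stats.f.cdf(F0, nx - 1, ny - 1))
    if p_F > 0.20:
        # Equality of variances not rejected: pooled-t test
        t0, p_t = stats.ttest_ind(x, y, equal_var=True)
    else:
        # Equality of variances rejected: two-independent-sample (Welch) t-test
        t0, p_t = stats.ttest_ind(x, y, equal_var=False)
    return p_F, p_t, bool(p_t < alpha)

rng = np.random.default_rng(1)
p_F, p_t, reject = standard_test(rng.normal(0, 1, 15), rng.normal(0, 1, 15))
```

This mirrors the rule above: the F-test merely selects which t-statistic is used for the inference about μx − μy.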
The above rules are the formal (or Standard) procedures for testing H0: μx = μy.
What if we address this question with two individual relevant intervals? Cole et al. (1999)
mentioned that using two individual CIs to test the null hypothesis H0: μx = μy would
lead to a smaller type I error and a larger type II error rate than the formal procedures.
Payton et al. (2000) pointed out that many researchers use the standard error bars (sample
mean ± standard error of the mean) to test the equality of two population means.
Therefore, if the two individual standard error bars fail to overlap, they will conclude
that the two sample means are significantly different. Actually, these researchers are making
a test of hypothesis with an approximate Pr(type I error) = α = 0.16, not α = 0.05. Payton
et al. (2000) also derived a formula for the probability of overlap from two individual
CIs. Payton et al. (2000) defined A to be the event that the confidence intervals computed
individually for the two population means overlap. Thus, if the sample sizes are equal
(n1 = n2 = n) and the population variances are unknown, they deduced that

Pr(A) = Pr[ n(Ȳ1 − Ȳ2)²/(S1² + S2²) < F(α; 1, n−1)·(S1 + S2)²/(S1² + S2²) ].

They state that the random variable n(Ȳ1 − Ȳ2)²/(S1² + S2²) has
the F-distribution with numerator degrees of freedom (df) ν1 = 1 and denominator df ν2 =
(n − 1) if the two samples are from the same normal population. It will be shown in
Chapter 6 that their above statement is inaccurate. The two samples need not originate
from the same population, and the denominator df of the F-distribution is not (n − 1)
but rather, in the case of n1 = n2 = n, it is given by

ν = (n − 1)(S1² + S2²)²/(S1⁴ + S2⁴), where (n − 1) < ν < 2(n − 1).

Further, they state that if the two samples are from two different normal
populations with the same mean but unequal variances, the quantity n(Ȳ1 − Ȳ2)²/(S1² + S2²) is still
approximately F-distributed with ν1 = 1 and ν2 = (n − 1) df, where their value of ν2 = (n
− 1) df is accurate only in the limiting case. Therefore, they conclude that

Pr(A) = Pr(Intervals_overlap) = Pr[ F(1, n−1) < F(α; 1, n−1)·(1 + 2S1S2/(S1² + S2²)) ].

Payton et al. (2000) further state that for the 95% CIs, nx = ny = n = 10, S1 (= Sx) = 0.80
and S2 = 1.60, 1 − Pr(A) = 1 − Pr(Intervals_overlap) = 1 − Pr(F(1, 9) < 1.8·F(0.05; 1, 9)) = 1 −
0.9859 = 0.0141 (which was misprinted as 0.0149). It will be shown in Chapter 6 that
this last Overlap Pr should be revised to α′ = 0.00608057, i.e., their result has a relative
error of 56.8754%.
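Payton et al.'s (2000) numerical example can be checked directly. A short computation (a sketch in Python with SciPy, using the inputs quoted above) reproduces their corrected value 0.0141 for the probability that the two 95% CIs fail to overlap under their approximation:

```python
# Check of Payton et al. (2000): Pr(no overlap) under their F approximation.
from scipy.stats import f

n, S1, S2, alpha = 10, 0.80, 1.60, 0.05
factor = (S1 + S2) ** 2 / (S1 ** 2 + S2 ** 2)    # = 1 + 2*S1*S2/(S1^2 + S2^2) = 1.8
F_crit = f.ppf(1 - alpha, 1, n - 1)              # F(0.05; 1, 9)
p_no_overlap = 1 - f.cdf(factor * F_crit, 1, n - 1)
print(round(factor, 4), round(p_no_overlap, 4))
```

The multiplier evaluates to exactly 1.8 and the resulting probability to about 0.0141, matching the corrected (not misprinted) figure in the text.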
Moreover, Payton et al. (2000) used SAS Version 6.11 to simulate from a N(0, 1)
when sample sizes varied from n = 5 to n = 50 in order to ascertain the accuracy of the
above formula. In this article, the authors do not give information about the known-variances
case. For the unknown-variances case, they only consider the case when the
sample sizes are equal. The largest sample size Payton et al. (2000) considered was n =
50. Furthermore, Schenker and Gentleman (2001) found more than 60 articles in the
health sciences testing the equality of two population means by using the Overlap
method. Schenker and Gentleman (2001) state that the Overlap method will fail to reject
H0 when the Standard method would reject it. In other words, the Overlap will lead to
less statistical power than the Standard method. The authors considered three population
quantities Q1, Q2 and Q1 − Q2. They state that Brownlee (1965) provided the 95%
confidence intervals for the three quantities as Q̂1 ± 1.96·SÊ1, Q̂2 ± 1.96·SÊ2 and
(Q̂1 − Q̂2) ± 1.96·√(SÊ1² + SÊ2²). However, using the Overlap method, the null hypothesis
will not be rejected if and only if (Q̂1 − Q̂2) ± 1.96·(SÊ1 + SÊ2) contains zero. Schenker
and Gentleman (2001) defined k as the limiting SE (standard error) ratio, i.e., either
SE1/SE2 or SE2/SE1, and considered only ratios that are greater than or equal to 1. For a
limiting SE ratio of k and a standardized difference of d = (Q1 − Q2)/√(SE1² + SE2²), they
reported that the asymptotic power for the Standard method is Φ(−1.96 + d) +
Φ(−1.96 − d), where Φ represents the cdf of N(0, 1), and the asymptotic power for the
Overlap method is Φ(−1.96(1 + k)/√(1 + k²) + d) + Φ(−1.96(1 + k)/√(1 + k²) − d). Note here that in this
dissertation we use a different definition for k from Schenker's definition. Schenker and
Gentleman (2001) use the lowercase k as the standard error ratio for the limiting sample
sizes or known-variances case, but we use k as the standard error ratio for small to
moderate sample sizes or unknown-variances cases; that is, k = (Sx/√nx)/(Sy/√ny) in
this dissertation. Therefore, for distinction, let K = (σx/√nx)/(σy/√ny) represent
the limiting sample sizes or known-variances case. In this chapter, we still use
Schenker's symbols to represent their work. Schenker and Gentleman (2001) also stated
that for the Pr of type I error, one simply lets d = 0 in the above formulas. Then, the
authors concluded that the Overlap method will lead to a smaller α and a larger β.
Furthermore, Schenker and Gentleman (2001) state that when SE1 is nearly equal
to SE2, the Overlap method is expected to be more deficient (i.e., smaller type I error Pr
and larger type II error Pr) relative to the Standard method. In this article, the authors did
not give specific values of type I error and type II error probabilities for different k and d
values. Their results pertain only to large sample sizes, so the need for using the t-distribution
in the case of small and moderate sample sizes was not discussed.
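Schenker and Gentleman's (2001) asymptotic expressions can be evaluated directly. The sketch below (Python with SciPy; the chosen d and k values are illustrative) codes the large-sample power of the Standard and Overlap methods and, by setting d = 0, recovers the type I error probabilities, about 0.05 for the Standard method and about 0.0056 for the Overlap method at k = 1:

```python
# Asymptotic power of the Standard and Overlap methods (Schenker & Gentleman 2001).
from math import sqrt
from scipy.stats import norm

def power_standard(d):
    # Phi(-1.96 + d) + Phi(-1.96 - d)
    return norm.cdf(-1.96 + d) + norm.cdf(-1.96 - d)

def power_overlap(d, k):
    # The Overlap method replaces 1.96 by 1.96*(1 + k)/sqrt(1 + k^2)
    c = 1.96 * (1 + k) / sqrt(1 + k * k)
    return norm.cdf(-c + d) + norm.cdf(-c - d)

# d = 0 gives the type I error Pr; k = 1 is the equal-SE case.
print(round(power_standard(0.0), 4), round(power_overlap(0.0, 1.0), 4))
```

Note that at k = 1 the Overlap critical point becomes 1.96·√2 ≈ 2.772, the same shift that appears in the large-sample overlap probability Pr(|Z| < 2.77) ≈ 0.994 quoted below.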
Payton et al. (2003) continued to provide the formula Pr(Intervals_overlap),
which they had also obtained in the year 2000, as follows:

Pr(A) = Pr(Intervals_overlap) = Pr[ F(1, n−1) < F(α; 1, n−1)·(1 + 2S1S2/(S1² + S2²)) ].

Payton et al. (2003) state that a large-sample version of the above statement can be
derived (assuming the two populations are identical):

Pr(A) = Pr(Intervals_overlap) = Pr[ |Z| < √2·Z(α/2) ] = Φ(√2·Z(α/2)) − Φ(−√2·Z(α/2)),

where Φ(z) (almost) universally represents the cumulative of the standardized normal density
function at point z. The authors set α at the nominal value of 5%, generated 95%
confidence intervals, and gave the approximate probability of overlap as
Pr(Intervals_overlap) ≈ Pr[−2.77 < Z < 2.77] = 0.994.
Thus, the authors concluded that "the 95% CIs will overlap over 99% of the time." They
also mentioned that Schenker and Gentleman (2001) showed, for large sample sizes, that
the probability of type I error when comparing the overlap of 100(1 − α)% confidence
intervals is 2·Pr[Z > Z(α/2)·(1 + k)/√(1 + k²)] … > 60, and σx and σy are replaced by their
biased estimates Sx and Sy, respectively.
3.1 The Case of σx = σy = σ
Statistical theory suggests that the total resources N = nx + ny be allocated
according to nx = N·σx/(σx + σy), and hence the allocation nx = ny = n = N/2 is
recommended. Suppose that the two CIs for μx and μy are disjoint; then it follows that
either L(μx) > U(μy), or L(μy) > U(μx). These two possibilities lead to the condition
either x̄ − Z(α/2)·σx/√nx > ȳ + Z(α/2)·σy/√ny, or ȳ − Z(α/2)·σy/√ny > x̄ +
Z(α/2)·σx/√nx, respectively. Combining the two conditions leads to rejecting H0: μx = μy
iff |x̄ − ȳ| > Z(α/2)·(σx/√nx + σy/√ny); for the case of σx = σy = σ and thus nx = ny =
n, this last condition reduces to |x̄ − ȳ| > 2Z(α/2)·σ/√n at the level of significance α
based on the Overlap method. If α is set at the nominal value of 5%, this last inequality
will lead to the same condition as that of Schenker and Gentleman (2001), who stated that the two
intervals overlap if and only if the interval (Q̂1 − Q̂2) ± 1.96·(SÊ1 + SÊ2) contains 0.
Sometimes it is then concluded that the null hypothesis H0: μx = μy must be
rejected in favor of H1: μx ≠ μy at the LOS α; for example, Djordjevic et al. (2000), Tersmette
et al. (2001) and Sont et al. (2001) used this concept to test H0: μx = μy. In fact,
Schenker and Gentleman (2001) state that they found more than 60 articles where the
Overlap method was used either formally or informally to demonstrate a visually significant
difference between x̄ and ȳ. This procedure is not accurate because the correct (1 −
α)×100% CI for the difference in means of two independent normal universes must be
obtained from the SMD (sampling distribution) of the statistic x̄ − ȳ, which is also
Gaussian with E(x̄ − ȳ) = μx − μy and V(x̄ − ȳ) = V(x̄) + V(ȳ) = σx²/nx + σy²/ny =
2σ²/n, assuming σx = σy = σ and nx = ny = n. Thus, the correct (1 − α)×100% CI on μx − μy is given
by

x̄ − ȳ − Z(α/2)·√(σx²/nx + σy²/ny) ≤ μx − μy ≤ x̄ − ȳ + Z(α/2)·√(σx²/nx + σy²/ny)   (1a)

For the balanced-design case and σx = σy = σ, Eq. (1a) reduces to

x̄ − ȳ − √2·Z(α/2)·σ/√n ≤ μx − μy ≤ x̄ − ȳ + √2·Z(α/2)·σ/√n   (1b)

The length of the above exact (1 − α)×100% CI for a balanced design is 2√2·Z(α/2)·σ/√n.
Thus, H0: μx − μy = 0 must be rejected at the LOS α iff (i.e., it is necessary and sufficient)
that

|x̄ − ȳ| > Z(α/2)·√((σx² + σy²)/n) = √2·Z(α/2)·σ/√n.   (1c)

However, requiring the two separate CIs to be disjoint leads to the rejection of H0
iff |x̄ − ȳ| > Z(α/2)·(σx/√nx + σy/√ny) = 2Z(α/2)·σ/√n. It is clear that the
requirement for rejecting H0 with two disjoint CIs is more stringent (or more conservative)
than that of the Standard method because, in the case of σx = σy = σ and nx = ny = n,
2Z(α/2)·σ/√n > Z(α/2)·√(σx²/nx + σy²/ny) = √2·Z(α/2)·σ/√n. Further, the more stringent
requirement to reject H0 (based on two independent separate CIs) leads to a smaller type I
error Pr than the specified α. The correct value of α using the Standard method is given
by

α = Pr( x̄ − ȳ < A_L, or x̄ − ȳ > A_U | μx − μy = 0 )
  = Pr[ |x̄ − ȳ| > Z(α/2)·√(σx²/nx + σy²/ny) | μx − μy = 0 ] = Pr( |Z| > Z(α/2) ) = α,

where A_L and A_U denote the lower and upper acceptance limits, respectively.
On the other hand, if we require that the two individual CIs be disjoint in
order to reject H0: μx − μy = 0, then the type I error Pr from the Overlap is given by

α′ = Pr( ȳ − Z(α/2)·σy/√n > x̄ + Z(α/2)·σx/√n ) + Pr( x̄ − Z(α/2)·σx/√n > ȳ + Z(α/2)·σy/√n )
   = 2·Pr[ Z > Z(α/2)·(σx + σy)/√(σx² + σy²) ]
   = 2·Pr( Z > √2·Z(α/2) ) = 2·Φ(−√2·Z(α/2)), assuming σx = σy = σ   (2)
• Setting α at 0.01 leads to the Overlap LOS of α′ = 0.00026971696 << 0.01. The % relative error, [(α − α′)/α]×100%, in the LOS α = 0.01 is [(0.01 − 0.00026971696)/0.01]×100% = 97.303%.
• For the nominal value of α = 0.05, Eq. (2) gives α′ = 0.00557459668 << 0.05. The value α′ = 0.00557459668 is consistent with the limiting value of 0.006 provided by Payton et al. (2003, p. 36) in their equation (6). The % relative error is 88.851%. As a result, the larger the LOS α is, the smaller the % relative error becomes. Payton et al. (2000) provide simulation results of run sizes 10,000 from two independent N(0, 1) populations in column 3 of their TABLE 1, p. 551, which claim that the value of α′ ranges from 0.0039 at n = 5 to 0.0055 at n = 50 (n incremented by 5). Our Eq. (2) shows that in the case of known equal variances and sample sizes the value of the Overlap type I error Pr does not depend on n at all. However, their simulation inaccuracies were rectified by Payton et al. (2003, Table 4), again through simulation run sizes of 10,000 independent pairs from N(0, 1).
• Setting α at the maximum widely accepted LOS of 10%, Eq. (2) shows that α′ = 0.020009254 << 0.10 and the % relative error is [(0.10 − 0.020009254)/0.10]×100% = 79.99%.
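The three bullet values follow directly from Eq. (2). The following is a minimal numerical sketch (the helper names `phi` and `overlap_alpha` are mine, not the dissertation's), using only the Python standard library:

```python
# Sketch of Eq. (2): with known equal variances and equal n, the Overlap
# type I error is alpha' = 2*Phi(-sqrt(2)*Z_{alpha/2}), independent of n.
from math import erf, sqrt
from statistics import NormalDist

def phi(z):
    """Standard normal cdf."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def overlap_alpha(alpha):
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # Z_{alpha/2}
    return 2.0 * phi(-sqrt(2.0) * z)

for a in (0.01, 0.05, 0.10):
    ap = overlap_alpha(a)
    print(f"alpha = {a:.2f}: alpha' = {ap:.11f}, % relative error = {100 * (a - ap) / a:.3f}%")
```

Note that `overlap_alpha` takes no sample size argument at all, which makes the independence from n explicit.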
Regardless of the value of the LOS α, the same conclusion made by Cole et al. (1999) will be reached: the Overlap method leads to a much smaller type I error rate.
If the alternative H1 is one-sided, say H1: μx − μy > 0, then from the Overlap standpoint H0 should be rejected only if both conditions x̄ − ȳ > 0 and L(μx) − U(μy) > 0 [or L(μx) > U(μy)] hold, and as a result the Overlap type I error Pr reduces to
α₁′ = Pr[x̄ − Z_{0.025} σx/√n > ȳ + Z_{0.025} σy/√n]
    = Pr[x̄ − ȳ > Z_{0.025} σx/√n + Z_{0.025} σy/√n] = Pr[x̄ − ȳ > Z_{0.025} (σx + σy)/√n]
    = Pr(Z > Z_{0.025} (σx + σy)/√(σx² + σy²)) = Pr(Z > √2 Z_{0.025});
assuming σx = σy = σ, α₁′ = 0.0027873 << 0.05. Thus the impact of Overlap on the type I error Pr is even greater for a one-sided alternative than for the 2-sided one. Note that when L(μy) > U(μx), the two CIs are disjoint, but such an occurrence is congruent with H0: μx − μy ≤ 0 rather than H1: μx − μy > 0. Thus for the one-sided alternative the type I error Pr from the Overlap is exactly half of that for the 2-sided alternative, which was equal to 0.00557459668. Henceforth, unless specified otherwise, the alternative is two-sided.
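The halving relation above can be confirmed numerically; this short sketch (names are my own) checks that the one-sided Overlap type I error is exactly half of the two-sided value from Eq. (2):

```python
# One-sided Overlap type I error: Pr(Z > sqrt(2)*Z_{0.025}),
# which is exactly half of the two-sided value 2*Phi(-sqrt(2)*Z_{0.025}).
from math import erf, sqrt
from statistics import NormalDist

phi = lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0)))
z025 = NormalDist().inv_cdf(0.975)        # Z_{0.025} = 1.959964
one_sided = phi(-sqrt(2.0) * z025)        # Pr(Z > sqrt(2) Z_{0.025})
two_sided = 2.0 * one_sided
print(round(one_sided, 7), round(two_sided, 11))
```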
Now, let ω represent the amount of overlap length between the two individual CIs, a variable that has not been considered in the Overlap literature. From Figures (1a & b), ω will be zero if either L(μx) > U(μy) or L(μy) > U(μx), in which case H0: μx = μy is rejected at a LOS < α. Thus, ω is larger than 0 when U(μx) > U(μy) > L(μx) or U(μy) > U(μx) > L(μy). The overlap is 100% if U(μx) ≥ U(μy) > L(μy) ≥ L(μx), or if U(μy) ≥ U(μx) > L(μx) ≥ L(μy). Because both conditions U(μx) > U(μy) > L(μx) and U(μy) > U(μx) > L(μy) will lead to the same result, only the case of U(μx) > U(μy) > L(μx) [Figure 2(a)], for which x̄ − ȳ ≥ 0, is discussed here. See the illustration in Figures 2(a & b).
Figure 2(a) Figure 2(b)
That is, for the known-variance case, the larger sample mean will be denoted by x̄. Thus, for the equal-sample-size & equal-variance case,
ω = U(μy) − L(μx) = (ȳ + Z_{α/2} σ/√n) − (x̄ − Z_{α/2} σ/√n)
  = 2 Z_{α/2} σ/√n − (x̄ − ȳ)    (3a)
On the other hand, the span of the two individual CIs (assuming x̄ > ȳ) is given by
U(μx) − L(μy) = (x̄ + Z_{α/2} σ/√n) − (ȳ − Z_{α/2} σ/√n)
  = 2 Z_{α/2} σ/√n + (x̄ − ȳ)    (3b)
Combining equations (3a & 3b) gives the exact % overlap as
ω = [2 Z_{α/2} σ/√n − (x̄ − ȳ)] / [2 Z_{α/2} σ/√n + (x̄ − ȳ)] × 100%    (3c)
Let O_r be the borderline value of ω at which H0 is barely rejected at the LOS α. From Eq. (1c), H0: μx = μy should be rejected iff x̄ − ȳ ≥ √2 Z_{α/2} σ/√n. Therefore, from Eq. (3a) the value of ω at which H0 should be rejected at the α-level or less is given by ω ≤ 2 Z_{α/2} σ/√n − √2 Z_{α/2} σ/√n, and the exact amount of overlap that leads to an α-level test is given by
O_r = (2 − √2) Z_{α/2} σ/√n    (3d)
Eq. (3d) implies that H0 must be rejected at the LOS α or less iff ω ≤ (2 − √2) Z_{α/2} σ/√n. Inserting the borderline rejection condition, x̄ − ȳ = √2 Z_{α/2} σ/√n, into Eq. (3b) yields
U(μx) − L(μy) = √2 Z_{α/2} σ/√n + 2 Z_{α/2} σ/√n = (2 + √2) Z_{α/2} σ/√n.    (3e)
Eq. (3e) implies that if the two CIs span more than (2 + √2) Z_{α/2} σ/√n, then H0 must be rejected at a LOS less than α. The percent overlap in Eq. (3c) ranges from zero (occurring when x̄ − ȳ = 2 Z_{α/2} σ/√n) to 100% (occurring when x̄ − ȳ = 0). Inserting the borderline value x̄ − ȳ = √2 Z_{α/2} σ/√n, at which H0 must be rejected, into Eq. (3c) results in
ω_r = [(2 − √2) Z_{α/2} σ/√n] / [(2 + √2) Z_{α/2} σ/√n] × 100% = [(2 − √2)/(2 + √2)] × 100% = 17.1573%    (3f)
which means that H0: μx = μy must be rejected at the LOS α or less if the percent overlap between the two individual CIs is less than or equal to 17.1573%. Thus the percent overlap at which H0 is barely rejected is 17.1573% regardless of the LOS α, but the amount of overlap from (3a) does depend on α. Further, as x̄ − ȳ increases,
the P-value of testing H0: μx = μy decreases, and so does the value of the % overlap in Eq. (3c). As x̄ − ȳ → 2 Z_{α/2} σ/√n, ω → 0. Thus, in the case of known σx = σy = σ, once the % overlap exceeds 17.1573%, H0: μx = μy must not be rejected at any α-level.
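The 17.1573% threshold in Eq. (3f) is purely algebraic, as this one-line sketch shows; it involves neither α, σ, nor n:

```python
# The borderline percent overlap of Eq. (3f): (2 - sqrt(2))/(2 + sqrt(2)),
# which simplifies to 3 - 2*sqrt(2), about 0.171573.
from math import sqrt

omega_r = (2 - sqrt(2)) / (2 + sqrt(2)) * 100
print(f"{omega_r:.4f}%")
```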
If the alternative is one-sided, H1: μx − μy > 0, it can be argued that the maximum percent overlap is given by
ω_r = [U(μY) − L(μX)] / [U(μX) − L(μY)] = (2 Z_{α/2} − √2 Z_α) / (2 Z_{α/2} + √2 Z_α)    (3g)
and for a 5%-level test Eq. (3g) reduces to (2 Z_{0.025} − √2 Z_{0.05}) / (2 Z_{0.025} + √2 Z_{0.05}) = 25.51597%, which implies that H0 can be rejected at less than the 5% level if the percent overlap between the two individual CIs is smaller than 25.51597%. Thus, the impact of overlap on ω_r is greater for a one-sided alternative, because for the 2-sided alternative the value of ω_r = 17.15729%. Further, for the one-sided alternative the % overlap does depend on α. As an example, for a 10%-level one-sided test the value of ω_r increases to 28.96%.
The question now is: what individual confidence levels, (1 − γ), should be used that will lead to an exact α-level test? Clearly, the overlap amount for a (1 − γ)×100% CI is given by
U_γ(μy) − L_γ(μx) = (ȳ + Z_{γ/2} σ/√n) − (x̄ − Z_{γ/2} σ/√n)
  = 2 Z_{γ/2} σ/√n − (x̄ − ȳ)    (4)
Because H0: μx = μy must be rejected iff x̄ − ȳ ≥ √2 Z_{α/2} σ/√n, and the overlap must become zero or less in order to reject H0, Eq. (4) shows that
2 Z_{γ/2} σ/√n = √2 Z_{α/2} σ/√n  →  Z_{γ/2} = Z_{α/2}/√2  →  γ/2 = Φ(−Z_{α/2}/√2)    (5)
Eq. (5) shows that the confidence level for each individual interval must be set at (1 − γ) = 1 − 2Φ(−Z_{α/2}/√2) in order to reject H0 at the LOS α iff the two CIs are disjoint. The value of (1 − γ) can also be obtained by equating the span of the two independent CIs, 2 Z_{γ/2} σ/√n + (x̄ − ȳ), to the length of the CI from the Standard method, given by 2√2 Z_{α/2} σ/√n, and invoking the rejection condition x̄ − ȳ = √2 Z_{α/2} σ/√n.
• If α is set at 0.01 in Eq. (5), then γ = 0.068548146 and 1 − γ = 0.931451854, which implies that the confidence level of each individual interval must be set at 0.931451854 in order to reject H0 at the 1% level iff the two CIs are disjoint.
• If α = 0.05 is substituted into Eq. (5), then γ = 0.165776273 and 1 − γ = 0.834223727, which implies that the confidence level of each individual interval must be set at 83.4223727% in order to reject H0 at the 5% level iff the two CIs are disjoint. This assertion is in fair agreement with the simulation results given in TABLE 1 of Payton et al. (2000, p. 551) for 15 ≤ n ≤ 50. Their TABLE 1, although inaccurate at n = 5 & 10, clearly shows that as n increases toward n = 50, the size of the adjusted CIs approaches 83.835%, which is very close to the exact 1 − γ = 83.422372710%.
• Further, when the confidence level 1 − α = 0.90, then 1 − γ = 0.755205856. The first and third 1 − γ values have not been reported in the Overlap literature.
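The three calibrated levels can be recomputed from Eq. (5); the sketch below (helper names are mine) assumes equal known variances and equal sample sizes:

```python
# Sketch of Eq. (5): each individual CI must use level 1 - 2*Phi(-Z_{alpha/2}/sqrt(2))
# so that "reject H0 iff the two CIs are disjoint" is an exact alpha-level test.
from math import erf, sqrt
from statistics import NormalDist

phi = lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0)))

def individual_level(alpha):
    z = NormalDist().inv_cdf(1 - alpha / 2)   # Z_{alpha/2}
    return 1 - 2 * phi(-z / sqrt(2))

for a in (0.01, 0.05, 0.10):
    print(f"alpha = {a:.2f}: 1 - gamma = {individual_level(a):.9f}")
```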
If the alternative is one-sided, H1: μx − μy > 0, it can be argued that the value of 1 − γ is given by 1 − γ = 1 − 2Φ(−Z_α/√2). If α = 0.05 is substituted into this last equation, then the one-sided 1 − γ = 0.75520585634665, which implies that the confidence level of each individual interval must be set at 0.75520585634665 in order to reject H0 at the 5% level iff the two CIs are disjoint, while for the 2-sided alternative 1 − γ was equal to 0.83422372710. Again, the impact of Overlap on individual confidence levels is greater for the one-sided alternative than for the 2-sided one.
Lastly, since rejecting H0: μx = μy using the two independent CIs is more stringent than using the SMD of x̄ − ȳ, it will lead to many more type II errors (or much less statistical power) in testing H0: μx − μy = 0, as shown below.
In Figure 3, the solid line represents the null distribution of x̄ − ȳ, and the dotted-line curve represents the distribution of x̄ − ȳ under H1, where δ = μx − μy > 0 is the amount of specified shift in μx − μy = δ from zero, which in Figure 3 exceeds one standard error of x̄ − ȳ.
[Figure 3: the null and alternative distributions of x̄ − ȳ, with standard error √((σx² + σy²)/n), tail areas α/2, and shift δ]
Figure 3 clearly shows that the acceptance interval (AI) for the sample mean difference, x̄ − ȳ, when testing H0: μx − μy = 0 at the LOS α is given by AI = (A_L, A_U) = [−Z_{α/2} √((σx² + σy²)/n), Z_{α/2} √((σx² + σy²)/n)]; i.e., in the case of σx = σy = σ and nx = ny = n, we cannot reject H0 at the significance level α if our test statistic x̄ − ȳ falls inside the AI = (A_L, A_U) = (−√2 Z_{α/2} σ/√n, √2 Z_{α/2} σ/√n). Thus, the Pr of committing a type II error, as shown in Figure 3, is given by
β = Pr[A_L ≤ x̄ − ȳ ≤ A_U | μx − μy = δ]
  = Pr[−Z_{α/2} √((σx² + σy²)/n) ≤ x̄ − ȳ ≤ Z_{α/2} √((σx² + σy²)/n) | μx − μy = δ]
  = Pr[x̄ − ȳ ≤ √2 Z_{α/2} σ/√n | δ > 0] − Pr[x̄ − ȳ ≤ −√2 Z_{α/2} σ/√n | δ]    (6a)
  = Φ(Z_{α/2} − (δ/σ)√(n/2)) − Φ(−Z_{α/2} − (δ/σ)√(n/2)), where δ = μx − μy.    (6b)
At α = 0.05, if the specified value of μx − μy = δ exceeds 0.5√(σx² + σy²), then the value of the standard normal cdf Φ(−Z_{0.025} − 0.5√n) < 0.001 for sample sizes n ≥ 6; i.e., the last term on the RHS of equation (6a) becomes less than 0.001 once n ≥ 6. Hence, Eq. (6b) for the nominal value of α = 5% approximately reduces to
β ≅ Φ(Z_{0.025} − (δ/σ)√(n/2))    (6c)
where (6c) is accurate to at least 3 decimals for n ≥ 6 and δ > 0.5√(σx² + σy²) = 0.5√2 σ.
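The accuracy claim for (6c) is easy to check numerically; this sketch (function names are my own) compares the exact type II error of Eq. (6b) with the one-term approximation of Eq. (6c):

```python
# Sketch comparing Eq. (6b) with the approximation (6c): for n >= 6 and
# delta/sigma > 0.5*sqrt(2), the dropped term Phi(-Z_{0.025} - (delta/sigma)*sqrt(n/2))
# stays below 0.001.
from math import erf, sqrt
from statistics import NormalDist

phi = lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0)))
z = NormalDist().inv_cdf(0.975)          # Z_{0.025}

def beta_exact(n, shift):                # shift = delta/sigma; Eq. (6b)
    c = shift * sqrt(n / 2)
    return phi(z - c) - phi(-z - c)

def beta_approx(n, shift):               # Eq. (6c)
    return phi(z - shift * sqrt(n / 2))

for n in (6, 10, 20):
    print(n, round(beta_approx(n, 0.8) - beta_exact(n, 0.8), 6))
```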
When the null hypothesis H0: μx − μy = 0 is not rejected at the LOS α iff the two individual CIs (x̄ − Z_{α/2} σx/√n ≤ μx ≤ x̄ + Z_{α/2} σx/√n) and (ȳ − Z_{α/2} σy/√n ≤ μy ≤ ȳ + Z_{α/2} σy/√n) are overlapping, then the Pr of a type II error (assuming μx > μy)
from the Overlap method is given by
β′ = Pr(Overlap | δ > 0) = Pr{[L(μx) ≤ U(μy)] ∩ [L(μy) ≤ U(μx)] | δ > 0}
   = Pr{[x̄ − Z_{α/2} σx/√n ≤ ȳ + Z_{α/2} σy/√n] ∩ [ȳ − Z_{α/2} σy/√n ≤ x̄ + Z_{α/2} σx/√n] | δ > 0}
   = Pr{[x̄ − ȳ ≤ Z_{α/2} σx/√n + Z_{α/2} σy/√n] ∩ [−Z_{α/2} σy/√n − Z_{α/2} σx/√n ≤ x̄ − ȳ] | δ > 0}
When σx = σy = σ and nx = ny = n, SE(x̄ − ȳ) = σ√(2/n), and as a result
β′ = Pr{[x̄ − ȳ ≤ 2 Z_{α/2} σ/√n] ∩ [−2 Z_{α/2} σ/√n ≤ x̄ − ȳ] | δ > 0}
   = Pr{−2 Z_{α/2} σ/√n ≤ x̄ − ȳ ≤ 2 Z_{α/2} σ/√n | δ > 0}    (7a)
   = Φ(√2 Z_{α/2} − (δ/σ)√(n/2)) − Φ(−√2 Z_{α/2} − (δ/σ)√(n/2))    (7b)
Since the cdf of the standard normal density, Φ(z), is a monotonically increasing function of z, comparing Eq. (6b) with Eq. (7b) shows that
Φ(√2 Z_{α/2} − (δ/σ)√(n/2)) > Φ(Z_{α/2} − (δ/σ)√(n/2))  and  Φ(−√2 Z_{α/2} − (δ/σ)√(n/2)) < Φ(−Z_{α/2} − (δ/σ)√(n/2)).
The above two conditions lead to
β′ = Φ(√2 Z_{α/2} − (δ/σ)√(n/2)) − Φ(−√2 Z_{α/2} − (δ/σ)√(n/2)) > Φ(Z_{α/2} − (δ/σ)√(n/2)) − Φ(−Z_{α/2} − (δ/σ)√(n/2)) = β,
and as a result 1 − β′ < 1 − β; i.e., using individual CIs loses statistical power, as illustrated in Table 1 (for n = 10, 20, 40, 60 and 80 at α = 0.05). Table 1 clearly shows that the Pr
Table 1. The Relative Power of Overlap as Compared to the Standard Method for Different Sample Sizes n and δ/(√2 σ) Combinations
n   δ/(√2σ)   1−β   1−β′   [(1−β)−(1−β′)]/(1−β)×100%   n   δ/(√2σ)   1−β   1−β′   [(1−β)−(1−β′)]/(1−β)×100%
10 0 0.050000 0.005575 88.850807 10 0.2 0.096935 0.016535 82.941952
10 0.2 0.096935 0.016535 82.941952 30 0.2 0.194775 0.046889 75.926797
10 0.4 0.244141 0.065946 72.988712 50 0.2 0.292989 0.087310 70.200085
10 0.6 0.475101 0.190941 59.810521 70 0.2 0.387332 0.136000 64.887986
10 0.8 0.715617 0.404396 43.489888 90 0.2 0.475101 0.190941 59.810521
10 1 0.885379 0.651905 26.369907 110 0.2 0.554768 0.250096 54.918821
10 1.2 0.966730 0.846828 12.402799 130 0.2 0.625674 0.311552 50.205361
10 1.4 0.993192 0.951076 4.240407 150 0.2 0.687770 0.373606 45.678672
10 1.6 0.999031 0.988926 1.011467 170 0.2 0.741418 0.434816 41.353531
10 1.8 0.999905 0.998251 0.165374 190 0.2 0.787231 0.494017 37.246245
10 2 0.999994 0.999809 0.018425 210 0.2 0.825958 0.550319 33.372049
20 0 0.050000 0.005575 88.850807 230 0.2 0.858407 0.603086 29.743564
20 0.2 0.145473 0.030356 79.132788 250 0.2 0.885379 0.651905 26.369907
20 0.4 0.432158 0.162818 62.324449 270 0.2 0.907642 0.696558 23.256232
20 0.6 0.765259 0.464729 39.271657 290 0.2 0.925899 0.736982 20.403620
20 0.8 0.947141 0.789850 16.606937 310 0.2 0.940785 0.773239 17.809209
20 1 0.994000 0.955465 3.876765 330 0.2 0.952858 0.805484 15.466533
20 1.2 0.999671 0.995267 0.440547 350 0.2 0.962600 0.833939 13.365991
20 1.4 0.999991 0.999758 0.023375 400 0.2 0.979327 0.890313 9.089308
20 1.6 1.000000 0.999994 0.000573 450 0.2 0.988775 0.929332 6.011824
20 1.8 1.000000 1.000000 0.000006 500 0.2 0.994000 0.955465 3.876765
20 2 1.000000 1.000000 0.000000 600 0.2 0.998354 0.983297 1.508145
40 0 0.050000 0.005575 88.850807 700 0.2 0.999568 0.994127 0.544334
40 0.2 0.244141 0.065946 72.988712 800 0.2 0.999891 0.998043 0.184785
40 0.4 0.715617 0.404396 43.489888 900 0.2 0.999973 0.999377 0.059617
40 0.6 0.966730 0.846828 12.402799 1100 0.2 0.999999 0.999944 0.005488
40 0.8 0.999031 0.988926 1.011467 1300 0.2 1.000000 0.999995 0.000444
40 1 0.999994 0.999809 0.018425 1500 0.2 1.000000 1.000000 0.000032
40 1.2 1.000000 0.999999 0.000072 20 0.5 0.608779 0.296070 51.366706
60 0 0.050000 0.005575 88.850807 40 0.5 0.885379 0.651905 26.369907
60 0.2 0.340845 0.110745 67.508566 60 0.5 0.972127 0.864590 11.062062
60 0.4 0.872528 0.628007 28.024464 80 0.5 0.994000 0.955465 3.876765
60 0.6 0.996402 0.969657 2.684165 100 0.5 0.998817 0.987066 1.176501
60 0.8 0.999989 0.999693 0.029611 120 0.5 0.999782 0.996589 0.319361
60 1 1.000000 1.000000 0.000032 140 0.5 0.999962 0.999167 0.079444
80 0 0.050000 0.005575 88.850807 160 0.5 0.999994 0.999809 0.018425
80 0.2 0.432158 0.162818 62.324449 180 0.5 0.999999 0.999959 0.004033
80 0.4 0.947141 0.789850 16.606937 200 0.5 1.000000 0.999991 0.000841
80 0.6 0.999671 0.995267 0.440547 220 0.5 1.000000 0.999998 0.000168
80 0.8 1.000000 0.999994 0.000573 240 0.5 1.000000 1.000000 0.000032
80 1 1.000000 1.000000 0.000000 260 0.5 1.000000 1.000000 0.000006
of type II error from two individual CIs is always larger than the Pr of type II error from the Standard method (i.e., β′ > β). Thus, the statistical power of the Overlap method is less than that of the Standard method (1 − β′ < 1 − β). If the alternative is one-sided, H1: μx − μy > 0, clearly the expression for β′ given in Eq. (7b) stays intact, but the Standard method type II error Pr becomes β₁ = Φ(Z_α − (δ/σ)√(n/2)), where δ = μx − μy. Because for δ > 0,
β′ = Φ(√2 Z_{α/2} − (δ/σ)√(n/2)) − Φ(−√2 Z_{α/2} − (δ/σ)√(n/2)) > β = Φ(Z_{α/2} − (δ/σ)√(n/2)) − Φ(−Z_{α/2} − (δ/σ)√(n/2)) > β₁ = Φ(Z_α − (δ/σ)√(n/2)),
it follows that the impact of Overlap on the type II error Pr for the one-sided alternative is greater than that of the two-sided alternative. Note that β₁ becomes equal to β only at δ = 0.
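The two power functions behind Table 1 can be sketched as follows (the helper names are mine; the first nonzero-shift row of Table 1 is used as a spot check):

```python
# Power functions for the balanced, equal-known-variance case at alpha = 0.05:
#   Standard: 1 - beta  = Phi(-Z_{a/2} + c) + Phi(-Z_{a/2} - c)
#   Overlap : 1 - beta' = Phi(-sqrt(2)*Z_{a/2} + c) + Phi(-sqrt(2)*Z_{a/2} - c)
# where c = sqrt(n) * delta/(sigma*sqrt(2)), the tabulated shift parameter.
from math import erf, sqrt
from statistics import NormalDist

phi = lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0)))
z = NormalDist().inv_cdf(0.975)

def powers(n, ratio):                    # ratio = delta/(sigma*sqrt(2))
    c = sqrt(n) * ratio
    p_std = phi(-z + c) + phi(-z - c)
    p_ovl = phi(-sqrt(2) * z + c) + phi(-sqrt(2) * z - c)
    return p_std, p_ovl

p_std, p_ovl = powers(10, 0.2)           # n = 10, delta/(sigma*sqrt(2)) = 0.2
loss = (p_std - p_ovl) / p_std * 100
print(round(p_std, 6), round(p_ovl, 6), round(loss, 6))
```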
3.2 The Case of Known but Unequal Variances
If the variances of the two independent processes are known but not equal, then statistical theory dictates that the two sample sizes should be allocated according to
nx = σx N/(σx + σy),  ny = σy N/(σx + σy),    (8)
where N = nx + ny = the total resources available to the experimenter. The sample-size allocations given in equations (8) lead to the minimum SE(x̄ − ȳ) = (σx + σy)/√N.
Schenker and Gentleman (2001) make a similar statement but did not use equation (8) to set the values of nx and ny. They use a notational procedure by letting k = (σx/√nx)/(σy/√ny). Note that Schenker and Gentleman (2001) refer to the limiting value of the small-letter k as the SE ratio because they investigated the impact of Overlap on type I and II error rates only when nx and ny → ∞. Since we discuss both the limiting case (populations) and small-to-moderate sample-size cases in this dissertation, the small k refers to the standard error ratio for samples, i.e., k = (Sx/√nx)/(Sy/√ny), and K refers to the SE ratio for populations, i.e., K = (σx/√nx)/(σy/√ny) = SE(x̄)/SE(ȳ). Clearly,
SE(x̄ − ȳ) = √(σx²/nx + σy²/ny) = √(K² σy²/ny + σy²/ny) = σy √(1 + K²)/√ny = SE(ȳ) √(1 + K²)    (9a)
Substituting equation (9a) into the Standard (1 − α)×100% CI, x̄ − ȳ ± Z_{α/2} SE(x̄ − ȳ), leads to
x̄ − ȳ − Z_{α/2} σy √(1 + K²)/√ny ≤ μx − μy ≤ x̄ − ȳ + Z_{α/2} σy √(1 + K²)/√ny    (9b)
Thus, the Standard CIL equals 2 Z_{α/2} σy √(1 + K²)/√ny. Equation (9b) shows that the null hypothesis H0: μx − μy = 0 must be rejected at the LOS α iff the CI in equation (9b) excludes zero, or iff |x̄ − ȳ| > Z_{α/2} σy √(1 + K²)/√ny.    (9c)
However, requiring that the two independent CIs must not overlap in order to reject H0: μx − μy = 0 at the LOS α is equivalent to requiring that either L(μx) > U(μy) or L(μy) > U(μx). These two inequalities lead to the Overlap rejection of H0: μx − μy = 0 iff
|x̄ − ȳ| > Z_{α/2} σx/√nx + Z_{α/2} σy/√ny = Z_{α/2} (1 + K) σy/√ny    (10)
Therefore, if the exact Pr of type I error is α but we reject H0 when the two independent CIs are disjoint, the Overlap type I error Pr reduces to
α′ = Pr[ȳ − Z_{α/2} σy/√ny > x̄ + Z_{α/2} σx/√nx] + Pr[x̄ − Z_{α/2} σx/√nx > ȳ + Z_{α/2} σy/√ny]
   = 2 Pr(x̄ − ȳ > Z_{α/2} (1 + K) σy/√ny) = 2 Pr(Z > Z_{α/2} (1 + K)/√(1 + K²))    (11)
which is identical to that of equation (7) provided by Schenker and Gentleman (2001, p. 184) when their standardized difference, d, is set equal to 0. Eq. (11) shows that as K → 0 or ∞, the value of α′ slowly approaches the exact type I error probability α [consistent with Table 3 on p. 3 of Payton et al. (2003)]. Further, since (1 + K)/√(1 + K²) > 1 and Z_{α/2} (1 + K)/√(1 + K²) > Z_{α/2}, then α′ = 2 Pr[Z > Z_{α/2} (1 + K)/√(1 + K²)] is smaller than α = 2 Pr(Z > Z_{α/2}), which means that the Overlap always leads to a smaller type I error Pr than that of the Standard method, consistent with Figure 3 of Schenker and Gentleman (2001, p. 184). Table 4 shows the value of α′ at α = 0.01 and 0.05 for different K values. Note that the Table 4 values are valid either for Gaussian underlying distributions or for the limiting values of nx and ny. Figures 5(a) and 5(b) show that as K increases, the value of α′ slowly approaches the exact type I error probability α.
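Eq. (11) can be sketched directly (function names are my own); the checks against Table 4 and the K ↔ 1/K symmetry follow from (1 + K)/√(1 + K²) being invariant under K → 1/K:

```python
# Sketch of Eq. (11): Overlap type I error for known, unequal variances,
# as a function of the population SE ratio K = (sigma_x/sqrt(n_x))/(sigma_y/sqrt(n_y)).
from math import erf, sqrt
from statistics import NormalDist

phi = lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0)))

def overlap_alpha(alpha, K):
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return 2 * phi(-z * (1 + K) / sqrt(1 + K * K))

for K in (1, 2, 5, 26):
    print(K, round(overlap_alpha(0.05, K), 9))
```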
To determine the minimum value of α′ from Eq. (11), let g(K) = Z_{α/2} (1 + K)/√(1 + K²). The first derivative of g(K) is
g′(K) = Z_{α/2} [1/√(1 + K²) − K(1 + K)/(1 + K²)^{3/2}] = Z_{α/2} (1 − K)/(1 + K²)^{3/2}.
Setting g′(K) = 0 leads to K = 1. To ascertain whether K = 1 is a point of minimum or maximum, the second differentiation at K = 1 with Z_{0.025} = 1.959964 yields
g″(1) = −Z_{0.025}/(2√2) = −0.353553391 × Z_{0.025} = −0.692956 < 0,
which shows that K = 1 maximizes g(K). Thus, α′ has its minimum value at K = 1, as shown in Table 4.
Table 4. The Type I Error Pr of Two Individual CIs for Different K at α = 0.05 and 0.01
K   α′ (α = 0.05)   K   α′ (α = 0.05)   K   α′ (α = 0.01)   K   α′ (α = 0.01)
1 0.005574597 6 0.024101169 1 0.000269717 6 0.003034255
1.2 0.005772632 7 0.026592621 1.2 0.000285833 7 0.003565806
1.4 0.006255214 8 0.028674519 1.4 0.000326631 8 0.004034767
1.6 0.006916773 9 0.030432273 1.6 0.000385984 9 0.004447733
1.8 0.007695183 10 0.031932004 1.8 0.000460718 10 0.004812093
2 0.008549353 11 0.033224353 2 0.000548586 11 0.005134764
2.2 0.009450168 12 0.034348214 2.2 0.000647644 12 0.005421799
2.4 0.010376313 13 0.035333699 2.4 0.000756080 13 0.005678348
2.6 0.011312004 14 0.036204361 2.6 0.000872186 14 0.005908733
2.8 0.012245574 15 0.036978819 2.8 0.000994382 15 0.006116572
3 0.013168478 16 0.037671953 3 0.001121233 16 0.006304887
3.2 0.014074567 17 0.038295773 3.2 0.001251466 17 0.006476214
3.4 0.014959516 18 0.038860068 3.4 0.001383967 18 0.006632687
3.6 0.015820399 19 0.039372889 3.6 0.001517779 19 0.006776110
3.8 0.016655348 20 0.039840912 3.8 0.001652089 20 0.006908016
4 0.017463296 21 0.040269717 4 0.001786217 21 0.007029710
4.2 0.018243773 22 0.040664002 4.2 0.001919599 22 0.007142316
4.4 0.018996754 23 0.041027747 4.4 0.002051774 23 0.007246798
4.6 0.019722537 24 0.041364351 4.6 0.002182369 24 0.007343994
4.8 0.020421655 25 0.041676725 4.8 0.002311086 25 0.007434630
5 0.021094804 26 0.041967382 5 0.002437694 26 0.007519342
[Figure 5(a): exact type I error α′ versus K at α = 0.01.  Figure 5(b): exact type I error α′ versus K at α = 0.05.]
As before, let ω represent the amount of overlap length between the two individual CIs. A similar procedure as in the previous section yields
ω = U(μy) − L(μx) = (ȳ + Z_{α/2} σy/√ny) − (x̄ − Z_{α/2} σx/√nx)
  = Z_{α/2} (σx/√nx + σy/√ny) − (x̄ − ȳ)    (12a)
Let
r
? be the borderline value of ? at which H
0
is barely rejected at an ?level. From
Eq. (9c), H
0
: ?
x
= ?
y
must be rejected iff xy? >
2
/2
1K/
?
? +
y y
Z n , which upon
substitution into (12a) results in
?
r
=
/2
(/ / )
x xy y
Z nn
?
??+? ( x y? )
=
/2
(/ / )
x xy y
Z nn
?
??+?
2
/2
1K/
?
? +
y y
Z n .
Substituting /K/??=
x xyy
nn in the above equation yields
r
? =
2
/2
( / )[1K 1K]
?
? ?+? +
yy
Zn (12b)
Eq. (12b) indicates that H0 must be rejected at the LOS α or less iff ω ≤ (Z_{α/2} σy/√ny) [1 + K − √(1 + K²)]. Further, the span of the two individual CIs is
U(μx) − L(μy) = (x̄ + Z_{α/2} σx/√nx) − (ȳ − Z_{α/2} σy/√ny)
  = Z_{α/2} (σx/√nx + σy/√ny) + (x̄ − ȳ)    (12c)
Thus, the exact percent overlap is given by
ω = [Z_{α/2} (σx/√nx + σy/√ny) − (x̄ − ȳ)] / [Z_{α/2} (σx/√nx + σy/√ny) + (x̄ − ȳ)] × 100%
  = [Z_{α/2} (1 + K) σy/√ny − (x̄ − ȳ)] / [Z_{α/2} (1 + K) σy/√ny + (x̄ − ȳ)] × 100%    (12d)
As before, ω lies in the closed interval [0, 100%]. The % overlap in Eq. (12d) clearly shows that as x̄ − ȳ > 0 increases, the P-value of the test decreases, and Eq. (12d) shows that the % overlap also decreases. Because H0 must be rejected at the LOS α or less iff x̄ − ȳ ≥ Z_{α/2} σy √(1 + K²)/√ny, the maximum % overlap above which H0 cannot be rejected at an α-level is given by
ω_r(K) = [(1 + K) Z_{α/2} σy/√ny − √(1 + K²) Z_{α/2} σy/√ny] / [(1 + K) Z_{α/2} σy/√ny + √(1 + K²) Z_{α/2} σy/√ny] × 100%
       = [1 + K − √(1 + K²)] / [1 + K + √(1 + K²)] × 100%    (12e)
Eq. (12e) shows that the maximum percent overlap does not depend on α and reduces to 17.1573% when K = (σx/√nx)/(σy/√ny) = 1. It can be verified that the 1st derivative of ω_r(K) is
ω_r′(K) = (2 − 2K) / [√(1 + K²) (1 + K + √(1 + K²))²],
whose root is K = 1. Moreover, the value of the 2nd derivative of ω_r(K) at K = 1 is −0.121320344, which means that K = 1 maximizes the % overlap, and the null hypothesis H0: μx = μy must be rejected at the LOS α or less if the overlap does not exceed 17.1573%. The farther K is from 1, the smaller the amount of allowable overlap becomes (i.e., the Overlap procedure becomes less deficient). For example, at K = 2 or 0.50, the % overlap reduces to 14.5898%. This implies that when the limiting SE ratio is K = 2 or 0.50, the two individual CIs can overlap up to 14.5898% and H0: μx = μy must still be rejected at the LOS α or less. At K = 3 or 1/3, the % overlap reduces to 11.696312%, below which H0 must be rejected at the α or less level; at K = 10, it reduces to 4.513682%. As K → 0 or ∞, ω_r → 0, so that the Overlap procedure very gradually approaches an exact α-level test [consistent with Table 3 of Payton et al. (2003, p. 3)].
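Eq. (12e) is a one-line computation; the sketch below (function name is my own) reproduces the quoted values and confirms the K ↔ 1/K symmetry:

```python
# Sketch of Eq. (12e): the maximum allowable percent overlap as a function of K,
# peaking at 17.1573% for K = 1 and symmetric under K <-> 1/K.
from math import sqrt

def max_overlap_pct(K):
    t = sqrt(1 + K * K)
    return (1 + K - t) / (1 + K + t) * 100

for K in (1, 2, 3, 10):
    print(K, round(max_overlap_pct(K), 6))
```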
Furthermore, what should the individual confidence level, (1 − γ), be so that comparisons of individual CIs lead to an exact α-level test? From Eq. (12c), the corresponding span of the two individual CIs at confidence level (1 − γ) is
U_γ(μx) − L_γ(μy) = Z_{γ/2} (1 + K) σy/√ny + (x̄ − ȳ).
From Eq. (9c), H0: μx − μy = 0 must be rejected at the LOS α iff |x̄ − ȳ| > Z_{α/2} σy √(1 + K²)/√ny. Substituting the critical limit x̄ − ȳ = Z_{α/2} (σy/√ny) √(1 + K²) into U_γ(μx) − L_γ(μy) results in
U_γ(μx) − L_γ(μy) = Z_{γ/2} (1 + K) σy/√ny + Z_{α/2} √(1 + K²) σy/√ny.
Furthermore, the (1 − α)×100% CIL from the Standard method is equal to 2 Z_{α/2} √(1 + K²) σy/√ny. Thus, the individual confidence levels, (1 − γ), should be set as follows, which in turn leads individual CIs to an exact α-level test.
⇒ Z_{γ/2} (1 + K) σy/√ny + Z_{α/2} √(1 + K²) σy/√ny = 2 Z_{α/2} √(1 + K²) σy/√ny
⇒ Z_{γ/2} (1 + K) σy/√ny = Z_{α/2} √(1 + K²) σy/√ny
⇒ γ = 2 − 2Φ[Z_{α/2} √(1 + K²)/(1 + K)]    (13)
Eq. (13) shows that the level of each CI must be set at (1 − γ) = 1 − 2[1 − Φ(Z_{α/2} √(1 + K²)/(1 + K))] in order to reject H0 at the α LOS iff the two CIs are disjoint, which is in agreement with Eq. (8) of Payton et al. (2003, p. 2). To verify this assertion, let q(K) = Z_{α/2} √(1 + K²)/(1 + K). The 1st derivative of q(K) is given by
q′(K) = Z_{α/2} (K − 1) / [√(1 + K²) (1 + K)²].
Setting q′(K) = 0 results in K = 1. Moreover, the 2nd derivative of q(K) at K = 1 is d²q(K)/dK²|_{K=1} = Z_{0.025}/(4√2) = 0.346476 > 0, which implies that K = 1 minimizes q(K); since γ = 2 − 2Φ[q(K)] is a decreasing function of q, K = 1 in turn maximizes γ. Table 5 shows that as K increases toward 1, γ also increases to reach its maximum and then, for the fixed α, decreases as K departs from 1.
Table 5. Values of γ Versus K at α = 0.05 and α = 0.01
K   γ (α = 0.05)   K   γ (α = 0.05)   K   γ (α = 0.01)   K   γ (α = 0.01)
0.2 0.095783 3.5 0.112872 0.2 0.028594 3.5 0.037197
0.4 0.131601 4 0.106045 0.4 0.047523 4 0.033663
0.6 0.153132 4.5 0.100440 0.6 0.060458 4.5 0.030857
0.8 0.163187 5 0.095783 0.8 0.066863 5 0.028594
1 0.165776 6 0.088541 1 0.068548 6 0.025201
1.2 0.164038 7 0.083206 1.2 0.067415 7 0.022802
1.4 0.160015 8 0.079131 1.4 0.064818 8 0.021030
1.6 0.154931 9 0.075927 1.6 0.061587 9 0.019674
1.8 0.149483 10 0.073346 1.8 0.058189 10 0.018606
2 0.144051 20 0.061628 2 0.054869 20 0.014040
2.2 0.138834 30 0.057723 2.2 0.051746 30 0.012627
2.5 0.131601 40 0.055779 2.5 0.047523 40 0.011944
3 0.121265 50 0.054616 3 0.041713 50 0.011543
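The Table 5 entries follow from Eq. (13); this sketch (function name is my own) reproduces the peak at K = 1 and the K ↔ 1/K symmetry visible in the table:

```python
# Sketch of Eq. (13): the individual-CI miss rate gamma as a function of K,
# maximized at K = 1 and symmetric under K <-> 1/K.
from math import erf, sqrt
from statistics import NormalDist

phi = lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0)))

def gamma(alpha, K):
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return 2 - 2 * phi(z * sqrt(1 + K * K) / (1 + K))

print(round(gamma(0.05, 1), 6))
print(round(gamma(0.05, 0.2), 6), round(gamma(0.05, 5), 6))  # equal by symmetry
```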
Lastly, the impact of Overlap on type II error probabilities for the known-variance normal case is investigated. Comparing Eq. (9c) with Eq. (10) clearly shows that the RHS of Eq. (10) is larger than that of Eq. (9c):
Z_{α/2} (1 + K) σy/√ny − Z_{α/2} √(1 + K²) σy/√ny = (Z_{α/2} σy/√ny)(1 + K − √(1 + K²)) > 0
because 1 + K > √(1 + K²). Thus, rejecting H0 when the two separate CIs are disjoint is more stringent than using the SMD of x̄ − ȳ and will always lead to much less statistical power.
The Standard method Pr of committing a type II error (assuming μx > μy), using Figure 3, is given by
β = Pr[−Z_{α/2} √(σx²/nx + σy²/ny) ≤ x̄ − ȳ ≤ Z_{α/2} √(σx²/nx + σy²/ny) | μx − μy = δ]    (14a)
  = Pr[(−Z_{α/2} √(σx²/nx + σy²/ny) − δ)/√(σx²/nx + σy²/ny) ≤ Z ≤ (Z_{α/2} √(σx²/nx + σy²/ny) − δ)/√(σx²/nx + σy²/ny)]
  = Pr[−Z_{α/2} − δ/√(σx²/nx + σy²/ny) ≤ Z ≤ Z_{α/2} − δ/√(σx²/nx + σy²/ny)]    (14b)
  = Pr[−Z_{α/2} − δ/((σy/√ny)√(1 + K²)) ≤ Z ≤ Z_{α/2} − δ/((σy/√ny)√(1 + K²))]    (14c)
As in Schenker and Gentleman (2001), let d represent the standardized difference, i.e., d = δ/√(σx²/nx + σy²/ny) = δ/((σy/√ny)√(1 + K²)). Thus, the above equation takes the following form:
β = Φ(Z_{α/2} − d) − Φ(−Z_{α/2} − d)    (14d)
Eq. (14d) is the same result as Schenker and Gentleman (2001, p. 184) in their formula (6), except that they provide the equation for 1 − β. Furthermore, when the null hypothesis H0: μx − μy = 0 is not rejected at the LOS α iff the two independent CIs (x̄ − Z_{α/2} σx/√nx ≤ μx ≤ x̄ + Z_{α/2} σx/√nx) and (ȳ − Z_{α/2} σy/√ny ≤ μy ≤ ȳ + Z_{α/2} σy/√ny) are overlapping, the Pr of a type II error (assuming μx > μy) from the Overlap method is given by
β′ = Pr(Overlap | μx − μy > 0) = Pr{[L(μx) ≤ U(μy)] ∩ [L(μy) ≤ U(μx)] | μx − μy > 0}
   = Pr{[x̄ − Z_{α/2} σx/√nx ≤ ȳ + Z_{α/2} σy/√ny] ∩ [ȳ − Z_{α/2} σy/√ny ≤ x̄ + Z_{α/2} σx/√nx] | δ > 0}
   = Pr{[x̄ − ȳ ≤ Z_{α/2} σx/√nx + Z_{α/2} σy/√ny] ∩ [−Z_{α/2} σy/√ny − Z_{α/2} σx/√nx ≤ x̄ − ȳ] | δ > 0}
   = Pr{−Z_{α/2} σy/√ny − Z_{α/2} σx/√nx ≤ x̄ − ȳ ≤ Z_{α/2} σx/√nx + Z_{α/2} σy/√ny | δ > 0}    (15a)
   = Pr[(−Z_{α/2}(σx/√nx + σy/√ny) − δ)/√(σx²/nx + σy²/ny) ≤ Z ≤ (Z_{α/2}(σx/√nx + σy/√ny) − δ)/√(σx²/nx + σy²/ny)]
   = Pr[−Z_{α/2} (1 + K)/√(1 + K²) − d ≤ Z ≤ Z_{α/2} (1 + K)/√(1 + K²) − d]
   = Φ(Z_{α/2} (1 + K)/√(1 + K²) − d) − Φ(−Z_{α/2} (1 + K)/√(1 + K²) − d)    (15b)
where K² = V(x̄)/V(ȳ), and (1 + K)/√(1 + K²) = (σx/√nx + σy/√ny)/√(σx²/nx + σy²/ny). Thus, the PWF (power function) of the Overlap procedure in the case of known variances is
1 − β′ = Φ(d − Z_{α/2} (1 + K)/√(1 + K²)) + Φ(−Z_{α/2} (1 + K)/√(1 + K²) − d)    (15c)
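Eqs. (14d) and (15b) can be sketched side by side (helper names are mine; the inflated critical value Z_{α/2}(1+K)/√(1+K²) is what drives the power loss):

```python
# Sketch of Eqs. (14d) and (15b): type II error of the Standard and Overlap
# procedures at a standardized difference d and population SE ratio K.
from math import erf, sqrt
from statistics import NormalDist

phi = lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0)))

def beta_standard(alpha, d):
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return phi(z - d) - phi(-z - d)

def beta_overlap(alpha, d, K):
    z = NormalDist().inv_cdf(1 - alpha / 2)
    a = z * (1 + K) / sqrt(1 + K * K)     # inflated critical value > z
    return phi(a - d) - phi(-a - d)

d, K = 1.0, 1.5
print(round(beta_standard(0.05, d), 6), round(beta_overlap(0.05, d, K), 6))
```

At d = 0 the two functions reduce to 1 − α and 1 − α′, respectively, tying this back to Eq. (11).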
The result in Eq. (15c) is the same as that of Schenker and Gentleman (2001, p. 184) in their Eq. (7), as they also provide the expression for 1 − β′. Schenker and Gentleman (2001) just provide both power functions without any explanation; the step-by-step derivations provided above have not been presented in the statistical literature. For the comparison of β and β′, see Table 6A, where the comparison is done for both α = 0.05 and α = 0.01. As the table shows, if d is fixed, the type II error Pr increases as k increases; if k is fixed, the type II error rate decreases as d increases. Thus, as Table 6A shows, the probability of type II error based on Overlap is larger than that of the Standard method; i.e., the Overlap method will lead to smaller statistical power. Secondly, when k is fixed, as d increases, β′ − β is not necessarily increasing or decreasing; this is consistent with Figure 4 of Schenker and Gentleman (2001). Furthermore, Table 6A shows that at a fixed k the difference in percent relative power decreases as the standardized difference d increases for both α = 0.05 and α = 0.01.
By definition, for an α-level test the relative efficiency of the Overlap to the Standard method, assuming the same statistical power, is given by
RELEFF(Overlap to Standard) = RELEFF(O, ST) = (nx + ny)/(nx′ + ny′)    (15d)
where the type II error Pr of the Standard method is given by Eq. (14) and n′ is the Overlap sample size for which β′ = β. The exact solution for n′ is obtained by setting the first argument in Eq. (14b) equal to that of (15b), i.e.,
Z_{0.025} (1 + K)/√(1 + K²) − δ/√(σx²/nx′ + σy²/ny′) = Z_{0.025} − δ/√(σx²/nx + σy²/ny)    (15e)
Schenker and Gentleman (2001) state in their section 3 that the minimum ARE is ½, which clearly occurs at their limiting SE ratio of k = 1. We could obtain their value if we equate the argument of Φ on the RHS of Eq. (14a), Z_{α/2} √(σx²/nx + σy²/ny), to that of
Table 6A. The Relative Power of Overlap Compared with the Standard Method for Different Standardized Differences d and SE Ratios k for the Case of Known but Unequal Variances
α = 0.05 (left panel); α = 0.01 (right panel)
δ√ny/σy   d   k   1−β   1−β′   [(1−β)−(1−β′)]/(1−β)×100%   δ√ny/σy   d   k   1−β   1−β′   [(1−β)−(1−β′)]/(1−β)×100%
0.2 0.141 1 0.06898 0.00853 87.6361 0.1 0.071 1 0.01224 0.00035 97.10660
0.4 0.283 1 0.09352 0.01281 86.3005 0.3 0.212 1 0.01809 0.00060 96.67198
0.8 0.566 1 0.16323 0.02738 83.2293 0.5 0.354 1 0.02626 0.00100 96.17487
1 0.707 1 0.21026 0.03895 81.4745 1 0.707 1 0.06166 0.00333 94.60226
1.5 1.061 1 0.36849 0.08705 76.3756 1.5 1.061 1 0.12973 0.00982 92.43060
2 1.414 1 0.58524 0.17459 70.1672 2 1.414 1 0.24539 0.02584 89.46857
0.2 0.111 1.5 0.06445 0.00913 85.8305 0.1 0.055 1.5 0.01172 0.00044 96.27097
0.4 0.222 1.5 0.08220 0.01256 84.7235 0.3 0.166 1.5 0.01598 0.00066 95.86846
0.8 0.444 1.5 0.12947 0.02295 82.2715 0.5 0.277 1.5 0.02153 0.00099 95.42442
1 0.555 1.5 0.15994 0.03052 80.9184 1 0.555 1.5 0.04327 0.00255 94.10605
1.5 0.832 1.5 0.25936 0.05930 77.1341 1.5 0.832 1.5 0.08120 0.00614 92.43297
2 1.109 1.5 0.39501 0.10771 72.7329 2 1.109 1.5 0.14253 0.01379 90.32345
0.2 0.089 2 0.06141 0.01108 81.9557 0.1 0.045 2 0.01137 0.00065 94.30995
0.4 0.179 2 0.07490 0.01426 80.9631 0.3 0.134 2 0.01462 0.00089 93.87954
0.8 0.358 2 0.10911 0.02310 78.8304 0.5 0.224 2 0.01866 0.00123 93.41816
1 0.447 2 0.13034 0.02908 77.6870 1 0.447 2 0.03329 0.00262 92.11581
1.5 0.671 2 0.19735 0.05014 74.5919 1.5 0.671 2 0.05678 0.00535 90.57311
2 0.894 2 0.28663 0.08272 71.1422 2 0.894 2 0.09268 0.01042 88.75241
0.2 0.074 2.5 0.05934 0.01338 77.4461 0.1 0.037 2.5 0.01113 0.00093 91.64804
0.4 0.149 2.5 0.07008 0.01643 76.5492 0.3 0.111 2.5 0.01372 0.00121 91.19269
0.8 0.297 2.5 0.09634 0.02441 74.6610 0.5 0.186 2.5 0.01684 0.00156 90.71390
1 0.371 2.5 0.11216 0.02953 73.6684 1 0.371 2.5 0.02749 0.00291 89.40730
1.5 0.557 2.5 0.16065 0.04652 71.0407 1.5 0.557 2.5 0.04351 0.00525 87.93005
2 0.743 2.5 0.22353 0.07109 68.1980 2 0.743 2.5 0.06680 0.00918 86.26369
0.2 0.063 3 0.05787 0.01569 72.8768 0.1 0.032 3 0.01095 0.00125 88.56141
0.4 0.126 3 0.06673 0.01864 72.0702 0.3 0.095 3 0.01310 0.00156 88.09595
0.8 0.253 3 0.08783 0.02600 70.3948 0.5 0.158 3 0.01562 0.00193 87.61275
1 0.316 3 0.10023 0.03054 69.5255 1 0.316 3 0.02385 0.00326 86.32330
1.5 0.474 3 0.13738 0.04498 67.2582 1.5 0.474 3 0.03560 0.00537 84.91009
2 0.632 3 0.18434 0.06479 64.8547 2 0.632 3 0.05197 0.00865 83.36362
0.2 0.055 3.5 0.05678 0.01788 68.5051 0.1 0.027 3.5 0.01082 0.00159 85.26635
0.4 0.110 3.5 0.06430 0.02072 67.7825 0.3 0.082 3.5 0.01265 0.00192 84.80442
0.8 0.220 3.5 0.08183 0.02758 66.2952 0.5 0.137 3.5 0.01475 0.00231 84.32904
1 0.275 3.5 0.09194 0.03169 65.5304 1 0.275 3.5 0.02139 0.00362 83.07963
1.5 0.412 3.5 0.12165 0.04433 63.5559 1.5 0.412 3.5 0.03048 0.00557 81.73909
2 0.549 3.5 0.15839 0.06099 61.4915 2 0.549 3.5 0.04273 0.00842 80.30232
(15a), namely Z_α/2 σ_x/√n_x + Z_α/2 σ_y/√n_y, and letting σ_x = σ_y and n_x = n_y = n; but this seems to ignore the true mean difference δ = μ_x − μ_y.
There are numerous solutions for the Overlap sample sizes n′_x and n′_y from Eq. (15e) that must be at least as large as n_x and n_y in order to make the Overlap attain the same statistical power as the Standard method. Fortunately, an exact solution can be obtained only when σ_x = σ_y and n_x = n_y, because it will be shown below that optimum efficiency is achieved when n′_x = n′_y, and as a result the above equation reduces to

√2 Z_0.025 − (δ/σ)√(n′/2) = Z_0.025 − (δ/σ)√(n/2).

The solution to this last equation is

n′ = [(2 − √2) Z_0.025 (σ/δ) + √n]²   (15f)

Eq. (15f) clearly shows that as δ/σ increases, the value of n′ decreases. Further, as n increases, the RELEFF of Overlap to the Standard method (n/n′) increases. In fact, the larger δ/σ is, the faster RELEFF(O, ST) approaches 100% as n → ∞.
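As a numerical check, Eq. (15f) and the resulting RELEFF are easy to compute; the sketch below is an illustration of mine (function names are invented, not part of the dissertation) and reproduces Table 6B entries:

```python
from math import sqrt
from statistics import NormalDist

Z_025 = NormalDist().inv_cdf(0.975)  # Z_0.025 ~ 1.959964

def overlap_sample_size(delta_over_sigma: float, n: int) -> float:
    """Eq. (15f): n' = [(2 - sqrt(2)) * Z_0.025 * (sigma/delta) + sqrt(n)]^2."""
    return ((2 - sqrt(2)) * Z_025 / delta_over_sigma + sqrt(n)) ** 2

def releff(delta_over_sigma: float, n: int) -> float:
    """Relative efficiency n/n' of the Overlap to the Standard method."""
    return n / overlap_sample_size(delta_over_sigma, n)

# Reproduces the Table 6B entry at delta/sigma = 0.2, n = 4
print(round(releff(0.2, 4), 5))  # → 0.06676
```

As the text states, releff grows toward 1 as n increases, and faster for larger δ/σ.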
To obtain a rough approximation to (15e), we compare the 1st statement for β with the 4th statement for β′ and equate √(σ_x²/n_x + σ_y²/n_y) to σ_x/√n′_x + σ_y/√n′_y. Dividing both sides of this last equality by σ_x/√n_x yields

√(1 + K²) = √(n_x/n′_x) + K√(n_y/n′_y),

where K = (σ_y/√n_y)/(σ_x/√n_x) is called the SE ratio. This equation shows that the solutions n′_x and n′_y do not depend on the specific values of σ_x and σ_y but rather only on their ratio σ_x/σ_y. Unfortunately, the same cannot be said about the ratio
R_n = n_y/n_x; i.e., n′_x and n′_y do depend on the specific values of n_x and n_y and not just on their ratio R_n. Further, the equation √(1 + K²) = √(n_x/n′_x) + K√(n_y/n′_y) clearly shows that when n′_x = n_x and n′_y = n_y, the RHS reduces to 1 + K, which obviously exceeds the LHS √(1 + K²) for all K. As K → ∞, this last equation also shows that n′_x → n_x and n′_y → n_y, so that the Overlap becomes an exact α-level test. When K > 1, the minimum n′_x + n′_y occurs (i.e., the Overlap achieves its maximum relative efficiency) when n′_x < n′_y, and vice versa when K < 1. We thus have a constrained optimization problem in which (n_x + n_y)/(n′_x + n′_y) is to be maximized subject to the nonlinear constraint √(n_x/n′_x) + K√(n_y/n′_y) = √(1 + K²). The solution to this optimization can be obtained through the use of Lagrange multipliers, as shown below.
use of Lagrangian multipliers as shown below.
The objective is to maximize f(n?
x
,
y
n? ) = (n
x
+n
y
)/( n?
x
+
y
n? ) subject to
2
1K+ ?
/
x x
nn? ?K/
yy
nn? = 0 and hence it is sufficient to maximize f(n?
x
,
y
n? ) = N/( n?
x
+
y
n? )
+ ?(
2
1K+ ? /
x x
nn? ?K/
yy
nn? ), where N = n
x
+ n
y
and ? is an arbitrary constant.
Taking the partial derivatives of f(n?
x
,
y
n? ) with respective to n?
x
&
y
n? and setting them
equal to zero yields:
∂f/∂n′_x = −N(n′_x + n′_y)^(−2) + λ(√n_x/2)(n′_x)^(−3/2) = 0

∂f/∂n′_y = −N(n′_x + n′_y)^(−2) + λ(K√n_y/2)(n′_y)^(−3/2) = 0
Because λ is a completely arbitrary constant, the above system is satisfied as soon as we equate √n_x (n′_x)^(−3/2) to K√n_y (n′_y)^(−3/2), i.e., √n_x (n′_x)^(−3/2) = K√n_y (n′_y)^(−3/2) ⇒
n_x/(n′_x)³ = K² n_y/(n′_y)³ ⇒ K² n_y/n_x = (n′_y)³/(n′_x)³ = (n′_y/n′_x)³ ⇒ n′_y/n′_x = (K² n_y/n_x)^(1/3);

thus the optimum solution is obtained if we select n′_x and n′_y in such a manner that their ratio n′_y/n′_x is close to (K² n_y/n_x)^(1/3). Table 6B provides the RELEFF of Overlap to the Standard for various values of δ/σ only for the case of n_x = n_y = n and σ_x = σ_y = σ, for which K = 1. When σ_x ≠ σ_y, there are uncountably many ways that K can equal 1, and therefore the procedure is to solve for n′_x and n′_y from (15f) and to compute the RELEFF from the ratio (n_x + n_y)/(n′_x + n′_y).
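The optimality condition n′_y/n′_x ≈ (K² n_y/n_x)^(1/3) can be checked by a brute-force scan over the constraint. The sketch below is my own illustration with invented inputs (n_x = n_y = 20, K = 2), not the dissertation's code:

```python
from math import sqrt

def optimal_allocation(nx: float, ny: float, K: float, step: float = 0.01):
    """Minimize n'_x + n'_y subject to sqrt(nx/n'_x) + K*sqrt(ny/n'_y) = sqrt(1 + K^2),
    by scanning n'_x and solving the constraint for n'_y."""
    c = sqrt(1 + K * K)
    best = None
    npx = nx / c ** 2 + step            # need sqrt(nx/n'_x) < c, so n'_x > nx/c^2
    while npx < 10 * (nx + ny):
        s = sqrt(nx / npx)
        npy = K * K * ny / (c - s) ** 2  # from K*sqrt(ny/n'_y) = c - s
        if best is None or npx + npy < best[0] + best[1]:
            best = (npx, npy)
        npx += step
    return best

npx, npy = optimal_allocation(20, 20, 2)
print(npy / npx, (2 ** 2 * 20 / 20) ** (1 / 3))  # both ≈ 1.587
```

The scanned optimum ratio agrees with the closed-form cube-root rule, and the optimal total n′_x + n′_y exceeds n_x + n_y, as the text requires.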
The results of this chapter verify what has been reported in the Overlap literature for the limiting case (i.e., large sample sizes) by Goldstein, H. & Healy, M. J. R. (1995), Payton et al. (2000), Schenker, N. & Gentleman, J. F. (2001), and Payton et al. (2003). Payton et al. (2000) report some approximate Overlap results for smaller sample sizes (n_x = n_y = n = 5(5)50) but used simulation to obtain them instead of the exact normal theory applied here in Chapter 3. Further, it must be emphasized that the results reported in this chapter will also apply to nonnormal underlying populations only if both n_x and n_y > 60. This is due to the Central Limit Theorem (CLT), which states that the distribution of the sample mean from a nonnormal population approaches normality as n → ∞. In practice, the rate of approach to normality depends only on the skewness and kurtosis of the underlying distributions. It is well known that both the skewness and kurtosis of a normal universe are zero. The closer the skewness and kurtosis of the parent populations are to zero, the more rapidly the means (x̄ and ȳ) approach normality. For example, because the skewness of a uniform distribution is zero and its kurtosis is −1.20, only samples of size at least 6 are needed for the corresponding sample mean to be approximately normally
Table 6B. RELEFF of Overlap to the Standard Method at α = 0.05 and K = 1

δ/σ = 0.2 (first two column pairs), δ/σ = 0.4 (next two), δ/σ = 0.6 (last two)
n RELEFF n RELEFF n RELEFF n RELEFF n RELEFF n RELEFF
4 0.06676 25 0.21671 4 0.16864 25 0.40361 4 0.26117 25 0.52305
5 0.07858 30 0.23840 5 0.19175 30 0.43053 5 0.29037 30 0.54922
6 0.08945 35 0.25758 6 0.21201 35 0.45337 6 0.31519 35 0.57094
8 0.10895 40 0.27479 8 0.24634 40 0.47312 8 0.35577 40 0.58940
10 0.12617 50 0.30462 10 0.27479 50 0.50592 10 0.38814 50 0.61940
12 0.14163 60 0.32987 12 0.29907 60 0.53236 12 0.41495 60 0.64305
14 0.15571 70 0.35174 14 0.32023 70 0.55438 14 0.43776 70 0.66237
16 0.16864 80 0.37098 16 0.33898 80 0.57313 16 0.45754 80 0.67859
18 0.18060 90 0.38814 18 0.35577 90 0.58940 18 0.47495 90 0.69248
20 0.19175 100 0.40361 20 0.37098 100 0.60370 20 0.49048 100 0.70456
δ/σ = 0.8 (first two column pairs), δ/σ = 1 (next two), δ/σ = 1.5 (last two)
n RELEFF n RELEFF n RELEFF n RELEFF n RELEFF n RELEFF
4 0.33898 25 0.60370 4 0.40361 25 0.66139 4 0.52305 25 0.75211
5 0.37098 30 0.62787 5 0.43658 30 0.68345 5 0.55501 30 0.76981
6 0.39760 35 0.64766 6 0.46358 35 0.70136 6 0.58052 35 0.78401
8 0.44009 40 0.66431 8 0.50592 40 0.71632 8 0.61940 40 0.79574
10 0.47312 50 0.69103 10 0.53823 50 0.74014 10 0.64822 50 0.81419
12 0.49994 60 0.71180 12 0.56411 60 0.75849 12 0.67081 60 0.82823
14 0.52240 70 0.72860 14 0.58553 70 0.77323 14 0.68919 70 0.83939
16 0.54162 80 0.74258 16 0.60370 80 0.78542 16 0.70456 80 0.84855
18 0.55836 90 0.75447 18 0.61940 90 0.79574 18 0.71769 90 0.85626
20 0.57313 100 0.76474 20 0.63317 100 0.80463 20 0.72908 100 0.86286
δ/σ = 2 (first two column pairs), δ/σ = 2.5 (next two), δ/σ = 3 (last two)
n RELEFF n RELEFF n RELEFF n RELEFF n RELEFF n RELEFF
4 0.60370 25 0.80463 4 0.66139 25 0.83883 4 0.70456 25 0.86286
5 0.63317 30 0.81927 5 0.68826 30 0.85126 5 0.72908 30 0.87365
6 0.65632 35 0.83092 6 0.70916 35 0.86112 6 0.74800 35 0.88217
8 0.69103 40 0.84050 8 0.90471 40 0.86919 8 0.77584 40 0.88914
10 0.71632 50 0.85546 10 0.76246 50 0.88175 10 0.79574 50 0.89995
12 0.73589 60 0.86677 12 0.77959 60 0.89119 12 0.81092 60 0.90805
14 0.75166 70 0.87571 14 0.79331 70 0.89864 14 0.82303 70 0.91443
16 0.76474 80 0.88302 16 0.80463 80 0.90471 16 0.83298 80 0.91962
18 0.77584 90 0.88914 18 0.81419 90 0.90978 18 0.84136 90 0.92395
20 0.78542 100 0.89437 20 0.82242 100 0.91411 20 0.84855 100 0.92764
distributed. This is due to the fact that the skewness of the 6-fold convolution of a U(0, 1) is zero (due to symmetry) while its kurtosis is −1.20/6 = −0.20. It can be shown that the kurtosis of an n-fold convolution of the U(0, 1) is exactly equal to −1.20/n (see Appendix A). Further, our experience [Hool, J. N. and Maghsoodloo, S. (1980) and Maghsoodloo, S. and Hool, J. N. (1981)] indicates that the 3rd moment (skewness) plays a more important role in the normal approximation of a linear combination than the 4th moment (kurtosis).
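The kurtosis claim follows from cumulant additivity: the excess kurtosis of a sum of n i.i.d. variates is 1/n times that of one variate. The sketch below (my own, stdlib only) computes the theoretical value −1.2/n for means of U(0, 1) variates and checks it with a seeded Monte Carlo sample:

```python
import random
from statistics import fmean

def excess_kurtosis(xs):
    """Sample excess kurtosis: m4/m2^2 - 3."""
    m = fmean(xs)
    m2 = fmean((x - m) ** 2 for x in xs)
    m4 = fmean((x - m) ** 4 for x in xs)
    return m4 / m2 ** 2 - 3

def theory(n):
    """Excess kurtosis of the mean of n i.i.d. U(0,1) variates: -1.2/n."""
    return -1.2 / n

random.seed(0)
n = 6
means = [fmean(random.random() for _ in range(n)) for _ in range(200_000)]
print(theory(n), excess_kurtosis(means))
```

With n = 6 the theoretical value is −0.20, matching the text, and the simulated estimate lands close to it.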
4.0 Bonferroni Intervals for Comparing Two Sample Means

The two independent 95% confidence intervals for the two population means have a joint Pr of 0.95² of containing μ_x and μ_y. Although this concept of joint Pr has not been considered in the Overlap literature, we consider it here to investigate its impact on the type I and II error rates of the Overlap method. In order to compare two 95% CIs against a single 95% CI for μ_x − μ_y, it may be best to use the Bonferroni concept so that the overall confidence Pr (regardless of the correlation structure) of the two CIs is raised from 0.95² = 0.9025 to 0.95. This is accomplished by setting the individual CI coefficient at 1 − α = √0.95 = 0.9746794345 so that the joint confidence level will equal (√0.95)² = 0.95. To this end, let 1 − α_B = √0.95 = 0.9746794345 (the subscript B stands for Bonferroni); thus α_B = 0.02532056552, which results in α_B/2 = 0.01266028276 and Z_0.0126603 = 2.23647664456. Thus, the 97.468% confidence Pr statement for μ_x is Pr(x̄ − Z_0.0126603 σ_x/√n_x ≤ μ_x ≤ x̄ + Z_0.0126603 σ_x/√n_x) = 0.97468. As a result, the lower 97.468% Bonferroni CI limit for μ_x is L(μ_x) = x̄ − Z_0.0126603 σ_x/√n_x and the corresponding upper limit is U(μ_x) = x̄ + Z_0.0126603 σ_x/√n_x, resulting in the Bonferroni CIL (confidence interval length) CIL(μ_x) = 2 Z_0.0126603 σ_x/√n_x. Following the same procedure, the 97.468% CI limits for μ_y are L(μ_y) = ȳ − Z_0.0126603 σ_y/√n_y and U(μ_y) = ȳ + Z_0.0126603 σ_y/√n_y, with the corresponding Bonferroni CIL(μ_y) = 2 Z_0.0126603 σ_y/√n_y.
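The Bonferroni constants quoted above can be reproduced in a few lines of stdlib Python (an illustrative sketch; the variable names are mine):

```python
from math import sqrt
from statistics import NormalDist

joint = 0.95
individual = sqrt(joint)        # 1 - alpha_B = 0.95**0.5 = 0.9746794345
alpha_B = 1 - individual        # 0.02532056552
z_B = NormalDist().inv_cdf(1 - alpha_B / 2)   # Z_0.0126603 ~ 2.23647664

def half_width(sigma, n):
    """Bonferroni CI half-width for one mean: Z_0.0126603 * sigma / sqrt(n)."""
    return z_B * sigma / sqrt(n)

print(round(alpha_B, 6), round(z_B, 4))  # → 0.025321 2.2365
```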
The Bonferroni confidence intervals for μ_x and μ_y do not change the 95% CI for μ_x − μ_y; i.e., the 95% CI for μ_x − μ_y is still the same as in Eq. (9b), as shown below:

x̄ − ȳ − Z_0.025 (σ_x/√n_x)√(1 + K²) ≤ μ_x − μ_y ≤ x̄ − ȳ + Z_0.025 (σ_x/√n_x)√(1 + K²).

The 95% CI in Eq. (9c) shows that H_0: μ_x − μ_y = 0 must be rejected at the 5% level of significance iff |x̄ − ȳ| > Z_0.025 (σ_x/√n_x)√(1 + K²). However, requiring that the two separate independent CIs be disjoint in order to reject H_0: μ_x − μ_y = 0 at the 5% level is equivalent to either L(μ_x) > U(μ_y), or L(μ_y) > U(μ_x). These two possibilities lead to either x̄ − Z_0.0126603 σ_x/√n_x > ȳ + Z_0.0126603 σ_y/√n_y, or ȳ − Z_0.0126603 σ_y/√n_y > x̄ + Z_0.0126603 σ_x/√n_x, respectively. Inserting σ_y/√n_y = K σ_x/√n_x into this last inequality leads to the rejection of H_0 iff

|x̄ − ȳ| > Z_0.0126603 σ_x/√n_x + Z_0.0126603 σ_y/√n_y = Z_0.0126603 (1 + K) σ_x/√n_x   (16a)

Thus the Bonferroni CIL corresponding to Eq. (16a) is 2 Z_0.0126603 (1 + K) σ_x/√n_x.   (16b)
Using the same procedures as in Chapter 3, if we set the exact type I error at 5% and reject H_0 when the two independent CIs do not overlap, then the Bonferroni type I error Pr reduces to

α′_B = Pr(x̄ + Z_0.0126603 σ_x/√n_x < ȳ − Z_0.0126603 σ_y/√n_y) + Pr(x̄ − Z_0.0126603 σ_x/√n_x > ȳ + Z_0.0126603 σ_y/√n_y)
= 2 Pr[|x̄ − ȳ| > Z_0.0126603 (1 + K) σ_x/√n_x]
= 2 Pr[Z > Z_0.0126603 (1 + K)(σ_x/√n_x)/√(σ_x²/n_x + σ_y²/n_y)]
= 2 Pr[Z > Z_0.0126603 (1 + K)(σ_x/√n_x)/√(σ_x²/n_x + K² σ_x²/n_x)]
= 2 Pr[Z > Z_0.0126603 (1 + K)/√(1 + K²)]   (17)
Eq. (11) gives α′ = 2 Pr[Z > Z_0.025 (1 + K)/√(1 + K²)]. Comparing Eq. (17) with Eq. (11): since Z_0.0126603 > Z_0.025, clearly Z_0.0126603 (1 + K)/√(1 + K²) > Z_0.025 (1 + K)/√(1 + K²), and therefore α′_B < α′.
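Eqs. (17) and (11) can be evaluated directly; the sketch below (my own illustration, not the dissertation's code) confirms the ordering α′_B < α′ < α over a range of K values:

```python
from math import sqrt
from statistics import NormalDist

nd = NormalDist()
Z_STD = nd.inv_cdf(1 - 0.025)        # Z_0.025
Z_BON = nd.inv_cdf(1 - 0.0126602828) # Z_0.0126603

def alpha_prime(K, zval):
    """2*Pr[Z > zval*(1+K)/sqrt(1+K^2)] -- Eq. (11) with Z_0.025, Eq. (17) with Z_0.0126603."""
    return 2 * (1 - nd.cdf(zval * (1 + K) / sqrt(1 + K * K)))

for K in (1, 1.5, 2, 4):
    assert alpha_prime(K, Z_BON) < alpha_prime(K, Z_STD) < 0.05

print(alpha_prime(1, Z_STD))  # ≈ 0.00557, the limiting Overlap type I error Pr at K = 1
```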
Thus, the Bonferroni intervals lead to an even smaller type I error Pr than both α and α′, i.e., α′_B < α′ < α. At the borderline of the Standard rejection region, where |x̄ − ȳ| = Z_0.025 √(1 + K²) σ_x/√n_x, the value of the borderline Bonferroni overlap is

Δ_r = Z_0.0126603 (σ_x/√n_x + σ_y/√n_y) − Z_0.025 √(1 + K²) σ_x/√n_x
= (σ_x/√n_x)[Z_0.0126603 (1 + K) − Z_0.025 √(1 + K²)]   (18b)
Eq. (18b) indicates that H_0 must be rejected at the 5% or smaller level iff the overlap Δ ≤ (σ_x/√n_x)[Z_0.0126603 (1 + K) − Z_0.025 √(1 + K²)]. Further, the span of the two individual CIs is

U(μ_x) − L(μ_y) = (x̄ + Z_0.0126603 σ_x/√n_x) − (ȳ − Z_0.0126603 σ_y/√n_y)
= (σ_x/√n_x)[Z_0.0126603 (1 + K) + Z_0.025 √(1 + K²)]   (18c)

(at the borderline |x̄ − ȳ| = Z_0.025 √(1 + K²) σ_x/√n_x).
Thus, the percentage of the overlap length at the borderline condition for the Bonferroni case is given by

[U(μ_y) − L(μ_x)]/[U(μ_x) − L(μ_y)] × 100% = {[Z_0.0126603 (1 + K) − Z_0.025 √(1 + K²)]/[Z_0.0126603 (1 + K) + Z_0.025 √(1 + K²)]} × 100%   (18d)
Let h(K) = [Z_0.0126603 (1 + K) − Z_0.025 √(1 + K²)]/[Z_0.0126603 (1 + K) + Z_0.025 √(1 + K²)]. From Maple,

h′(K) = {[Z_0.0126603 − Z_0.025 K/√(1 + K²)][Z_0.0126603 (1 + K) + Z_0.025 √(1 + K²)] − [Z_0.0126603 (1 + K) − Z_0.025 √(1 + K²)][Z_0.0126603 + Z_0.025 K/√(1 + K²)]}/[Z_0.0126603 (1 + K) + Z_0.025 √(1 + K²)]²   (18e)

Plugging K = 1 into Eq. (18e) results in h′(K)|_{K=1} = 0.117405174 − 0.117405174 = 0, and h″(K)|_{K=1} = −0.095648707 − 0.425286147 + 0.117405174 − 0.022459306 = −0.425988985 < 0, which implies that K = 1 maximizes h(K). Thus, the maximum overlap occurs at K = 1, as before. Table 8 shows that, at the same K, the amount of overlap based on the Bonferroni concept is larger than that of two individual CIs at the LOS of 0.05. As K increases, the difference between the Bonferroni overlap and the ordinary overlap monotonically and slowly
Table 8. The Impact of Bonferroni on Percent Overlap at Different K

K  Bonferroni Overlap (%)  Overlap (%)  Difference (%)  K  Bonferroni Overlap (%)  Overlap (%)  Difference (%)
1 23.481035 17.157288 6.323747 3.1 17.907989 11.453938 6.454051
1.1 23.427525 17.102324 6.325201 3.2 17.678328 11.219816 6.458512
1.2 23.286523 16.957512 6.329012 3.3 17.456447 10.993694 6.462753
1.3 23.082143 16.747656 6.334487 3.4 17.242089 10.775302 6.466787
1.4 22.832795 16.491705 6.341089 3.5 17.034990 10.564364 6.470625
1.5 22.552471 16.204060 6.348410 3.6 16.834880 10.360601 6.474279
1.6 22.251757 15.895613 6.356143 3.7 16.641492 10.163735 6.477758
1.7 21.938629 15.574565 6.364064 3.8 16.454562 9.973490 6.481072
1.8 21.619071 15.247062 6.372009 3.9 16.273829 9.789598 6.484232
1.9 21.297543 14.917681 6.379862 4 16.099042 9.611797 6.487245
2 20.977344 14.589803 6.387541 4.1 15.929955 9.439834 6.490121
2.1 20.660894 14.265901 6.394993 4.2 15.766332 9.273465 6.492867
2.2 20.349935 13.947753 6.402182 4.3 15.607946 9.112454 6.495491
2.3 20.045701 13.636613 6.409088 4.4 15.454576 8.956577 6.497999
2.4 19.749033 13.333333 6.415700 4.5 15.306015 8.805616 6.500399
2.5 19.460480 13.038464 6.422016 4.6 15.162061 8.659366 6.502695
2.6 19.180364 12.752325 6.428039 4.7 15.022524 8.517630 6.504894
2.7 18.908840 12.475065 6.433775 4.8 14.887219 8.380218 6.507001
2.8 18.645939 12.206705 6.439233 4.9 14.755973 8.246951 6.509022
2.9 18.391594 11.947169 6.444424 5 14.628619 8.117658 6.510960
3 18.145672 11.696312 6.449360 5.1 14.504998 7.992177 6.512821
increases toward the limit of 0.065892072.
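The borderline percent overlaps behind Table 8 follow from h(K) and its Standard-method analogue; this sketch (mine, not the author's Maple code) reproduces the K = 1 entries and the limiting difference 0.065892:

```python
from math import sqrt
from statistics import NormalDist

nd = NormalDist()
Z_STD = nd.inv_cdf(1 - 0.025)         # Z_0.025
Z_BON = nd.inv_cdf(1 - 0.0126602828)  # Z_0.0126603

def pct_overlap(K, z_num):
    """Borderline % overlap: 100*[z_num(1+K) - Z_0.025*sqrt(1+K^2)] / [z_num(1+K) + Z_0.025*sqrt(1+K^2)].
    z_num = Z_0.025 gives the ordinary overlap; z_num = Z_0.0126603 gives Eq. (18d)."""
    a, b = z_num * (1 + K), Z_STD * sqrt(1 + K * K)
    return 100 * (a - b) / (a + b)

print(pct_overlap(1, Z_BON))  # ≈ 23.4810, Table 8 Bonferroni overlap at K = 1
print(pct_overlap(1, Z_STD))  # ≈ 17.1573, Table 8 ordinary overlap at K = 1

# As K -> infinity, the Bonferroni overlap tends to (Z_B - Z)/(Z_B + Z) while the
# ordinary overlap tends to 0, so their difference tends to (Z_B - Z)/(Z_B + Z).
limit = (Z_BON - Z_STD) / (Z_BON + Z_STD)
print(limit)  # ≈ 0.065892
```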
Finally, the same conclusion as before can be reached: separate CIs lead to a larger type II error when the Bonferroni concept is applied. The exact Pr of type II error is the same as in Eq. (14):

β = Φ(Z_α/2 − d) − Φ(−Z_α/2 − d).
For the Bonferroni case (B stands for Bonferroni), β′ changes to

β′_B = Pr(Overlap | μ_x − μ_y = δ) = Pr{[L(μ_x) ≤ U(μ_y)] ∪ [L(μ_y) ≤ U(μ_x)] | δ}
= Φ(Z_0.0126603 (1 + K)/√(1 + K²) − d) − Φ(−Z_0.0126603 (1 + K)/√(1 + K²) − d)   (19)
Table 9 clearly shows that the Bonferroni concept leads to the largest type II error Pr of the three methods, i.e., β′_B > β′ > β. Because the Bonferroni CIs always have larger
Table 9. Type II Error Pr for the Standard, Overlap, and Bonferroni Methods with Different K and d Combinations

K  d  β  β′  β′_B  K  d  β  β′  β′_B
1 0 0.95 0.994425 0.998438 1.8 0 0.95 0.992305 0.997643
1 0.2 0.921586 0.993461 0.998090 1.8 0.2 0.921586 0.991068 0.997157
1 0.4 0.881232 0.990392 0.996952 1.8 0.4 0.881232 0.987161 0.995579
1 0.6 0.826159 0.984692 0.994725 1.8 0.6 0.826159 0.979999 0.992544
1 0.8 0.753937 0.975507 0.990896 1.8 0.8 0.753937 0.968656 0.987431
1 1 0.662927 0.961706 0.984708 1.8 1 0.662927 0.951936 0.979356
1.2 0 0.95 0.994227 0.998367 2 0 0.95 0.991451 0.997305
1.2 0.2 0.921586 0.993237 0.998006 2 0.2 0.921586 0.990111 0.996763
1.2 0.4 0.881232 0.990085 0.996826 2 0.4 0.881232 0.985887 0.995010
1.2 0.6 0.826159 0.984241 0.994523 2 0.6 0.826159 0.97818 0.991656
1.2 0.8 0.753937 0.974842 0.990571 2 0.8 0.753937 0.96604 0.986044
1.2 1 0.662927 0.960747 0.984200 2 1 0.662927 0.948262 0.977248
1.4 0 0.95 0.993745 0.998190 2.5 0 0.95 0.989156 0.996352
1.4 0.2 0.921586 0.992690 0.997798 2.5 0.2 0.921586 0.987554 0.995662
1.4 0.4 0.881232 0.989343 0.996518 2.5 0.4 0.881232 0.98253 0.993443
1.4 0.6 0.826159 0.983155 0.994030 2.5 0.6 0.826159 0.973451 0.989250
1.4 0.8 0.753937 0.973245 0.989780 2.5 0.8 0.753937 0.959334 0.982342
1.4 1 0.662927 0.958455 0.982970 2.5 1 0.662927 0.938958 0.971701
1.6 0 0.95 0.993083 0.997943 3 0 0.95 0.986832 0.995330
1.6 0.2 0.921586 0.991944 0.997508 3 0.2 0.921586 0.984982 0.994490
1.6 0.4 0.881232 0.988334 0.996090 3 0.4 0.881232 0.979206 0.991807
1.6 0.6 0.826159 0.981690 0.993349 3 0.6 0.826159 0.968852 0.986788
1.6 0.8 0.753937 0.971106 0.988699 3 0.8 0.753937 0.952921 0.978626
1.6 1 0.662927 0.955405 0.981300 3 1 0.662927 0.930202 0.966232
confidence bands, they will always lead to a larger % overlap, a smaller type I error rate, and a larger type II error rate; hence they will not be considered henceforth. Moreover, Figure 7 shows the three type II errors (β, β′, β′_B) at K = 1, 1.2, 1.5, and 2. These four panels clearly show the relation β′_B > β′ > β. In other words, the Bonferroni method leads to the largest type II error.
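The ordering displayed in Figure 7 can be verified numerically; the sketch below (my own, with d denoting the standardized true mean difference) evaluates the three type II error probabilities from Eqs. (14) and (19):

```python
from math import sqrt
from statistics import NormalDist

nd = NormalDist()
Z_STD = nd.inv_cdf(1 - 0.025)         # Z_0.025
Z_BON = nd.inv_cdf(1 - 0.0126602828)  # Z_0.0126603

def beta_interval(threshold, d):
    """Phi(threshold - d) - Phi(-threshold - d): Pr the standardized statistic
    falls inside (-threshold, threshold) when the true standardized difference is d."""
    return nd.cdf(threshold - d) - nd.cdf(-threshold - d)

def betas(K, d):
    t = (1 + K) / sqrt(1 + K * K)     # > 1 for all K > 0
    beta = beta_interval(Z_STD, d)         # Standard, Eq. (14)
    beta_p = beta_interval(Z_STD * t, d)   # Overlap
    beta_B = beta_interval(Z_BON * t, d)   # Bonferroni Overlap, Eq. (19)
    return beta, beta_p, beta_B

for K in (1, 1.2, 1.5, 2):
    for d in (0.5, 1, 2, 3):
        b, bp, bB = betas(K, d)
        assert b < bp < bB  # the ordering shown in Figure 7
```

The ordering holds because the three acceptance thresholds satisfy Z_0.025 < Z_0.025(1+K)/√(1+K²) < Z_0.0126603(1+K)/√(1+K²), and the interval probability is increasing in the threshold.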
Figure 7. Pr of type II error (β, β′, β′_B) versus d, in four panels at K = 1, 1.2, 1.5, and 2.
5.0 Comparing the Overlap of Two Independent CIs with a Single CI for the Ratio of Two Normal Population Variances

Because there are two different t-tests (the pooled t-test and the two-sample t-test) for comparing independent normal means when variances are unknown, it is prudent to pretest H_0: σ_x² = σ_y² at an α level. Because the statistical literature cautions against using the pooled t-test unless there is convincing evidence in favor of H_0: σ_x² = σ_y², when testing H_0: σ_x² = σ_y² just to ascertain whether or not to pool, the LOS α will be set much higher than 5%.
Consider a random sample of size n_x from the normal universe N(μ_x, σ_x²). Using the fact that the rv (n_x − 1)S_x²/σ_x² has a chi-square distribution with ν_x = n_x − 1 degrees of freedom, it follows that Pr[χ²_{1−α/2, ν_x} < (n_x − 1)S_x²/σ_x² < χ²_{α/2, ν_x}] = 1 − α, where all chi-square percentage points are upper-tail values. Rearranging this last Pr statement results in the (1 − α)100% CI for σ_x²:

(n_x − 1)S_x²/χ²_{α/2, ν_x} < σ_x² < (n_x − 1)S_x²/χ²_{1−α/2, ν_x}   (20a)
Hence, the lower CI limit for σ_x² is L(σ_x²) = ν_x S_x²/χ²_{α/2, ν_x} and the upper CI limit is U(σ_x²) = ν_x S_x²/χ²_{1−α/2, ν_x}. These lower and upper limits result in the confidence interval length

CIL(σ_x²) = U(σ_x²) − L(σ_x²) = ν_x S_x² (1/χ²_{1−α/2, ν_x} − 1/χ²_{α/2, ν_x})   (20b)
The same procedure as above leads to the (1 − α)100% lower and upper CI limits for σ_y²: L(σ_y²) = (n_y − 1)S_y²/χ²_{α/2, ν_y}, U(σ_y²) = (n_y − 1)S_y²/χ²_{1−α/2, ν_y}, and

CIL(σ_y²) = ν_y S_y² (1/χ²_{1−α/2, ν_y} − 1/χ²_{α/2, ν_y}).   (20c)
With the above information, requiring that the two independent CIs be disjoint in order to reject H_0: σ_x² = σ_y² at the α·100% level is equivalent to either L(σ_x²) > U(σ_y²) or L(σ_y²) > U(σ_x²); i.e., L(σ_x²) > U(σ_y²) ⇔ (n_x − 1)S_x²/χ²_{α/2, ν_x} > (n_y − 1)S_y²/χ²_{1−α/2, ν_y}. Thus, based on the Overlap procedure, reject H_0 if

F_0 = S_x²/S_y² > (ν_y/ν_x)·(χ²_{α/2, ν_x}/χ²_{1−α/2, ν_y}), i.e., F_0 = S_x²/S_y² > (ν_y/ν_x) C_{α/2, ν_x, ν_y},   (21a)

where C_{α/2, ν_x, ν_y} = χ²_{α/2, ν_x}/χ²_{1−α/2, ν_y}.
Or, L(σ_y²) > U(σ_x²) ⇔ (n_y − 1)S_y²/χ²_{α/2, ν_y} > (n_x − 1)S_x²/χ²_{1−α/2, ν_x} ⇔ S_x²/S_y² < (ν_y/ν_x)·(χ²_{1−α/2, ν_x}/χ²_{α/2, ν_y}) ⇒ reject H_0 if

F_0 = S_x²/S_y² < (ν_y/ν_x) C_{1−α/2, ν_x, ν_y}.   (21b)
However, the exact (1 − α)100% CI for the ratio of two independent normal variances must be obtained from Fisher's F distribution, as follows. Consider two samples from N(μ_x, σ_x²) and N(μ_y, σ_y²), respectively. Then (n_x − 1)S_x²/σ_x² ~ χ²_{n_x−1}, (n_y − 1)S_y²/σ_y² ~ χ²_{n_y−1}, and

F_{n_x−1, n_y−1} = {[(n_x − 1)S_x²/σ_x²]/(n_x − 1)}/{[(n_y − 1)S_y²/σ_y²]/(n_y − 1)} = (S_x²/σ_x²)/(S_y²/σ_y²)

⇒ Pr[F_{1−α/2, ν_x, ν_y} ≤ (S_x² σ_y²)/(S_y² σ_x²) ≤ F_{α/2, ν_x, ν_y}] = 1 − α

⇒ (S_x²/S_y²) F_{1−α/2, ν_y, ν_x} ≤ σ_x²/σ_y² ≤ (S_x²/S_y²) F_{α/2, ν_y, ν_x}   (22a)

⇒ CIL = F_0 (F_{α/2, ν_y, ν_x} − F_{1−α/2, ν_y, ν_x}), where ν_x = n_x − 1 and ν_y = n_y − 1.   (22b)
Then H_0: σ_x² = σ_y², or H_0: σ_x²/σ_y² = 1, must be rejected at the α·100% level of significance if the CI in Eq. (22a) excludes one; otherwise, H_0 must not be rejected at the α·100% level. Thus, based on the Standard procedure, H_0: σ_x² = σ_y² (or σ_x²/σ_y² = 1) must be rejected at the α level iff either

F_0 = S_x²/S_y² < F_{1−α/2, ν_x, ν_y} or F_0 = S_x²/S_y² > F_{α/2, ν_x, ν_y}.   (22c)
The Pr of type I error for the exact procedure (i.e., using the Standard method based on the null SMD of the ratio S_x²/S_y², which is Fisher's F_{ν_x, ν_y}) is α. This implies that H_0: σ_x² = σ_y² will be rejected at the α level iff F_0 = S_x²/S_y² < F_{1−α/2, ν_x, ν_y}, or F_0 = S_x²/S_y² > F_{α/2, ν_x, ν_y}. Therefore,
the type I error Pr for the two disjoint CIs (α′) is given by

α′(two disjoint CIs) = Pr[U(σ_x²) < L(σ_y²)] + Pr[L(σ_x²) > U(σ_y²)]
= Pr[(n_x − 1)S_x²/χ²_{1−α/2, ν_x} < (n_y − 1)S_y²/χ²_{α/2, ν_y}] + Pr[(n_x − 1)S_x²/χ²_{α/2, ν_x} > (n_y − 1)S_y²/χ²_{1−α/2, ν_y}]
= Pr[S_x²/S_y² < (ν_y/ν_x)·(χ²_{1−α/2, ν_x}/χ²_{α/2, ν_y})] + Pr[S_x²/S_y² > (ν_y/ν_x)·(χ²_{α/2, ν_x}/χ²_{1−α/2, ν_y})]
= Pr[S_x²/S_y² < (ν_y/ν_x) C_{1−α/2, ν_x, ν_y}] + Pr[S_x²/S_y² > (ν_y/ν_x) C_{α/2, ν_x, ν_y}]
= Pr[F_{ν_x, ν_y} < (ν_y/ν_x) C_{1−α/2, ν_x, ν_y}] + Pr[F_{ν_x, ν_y} > (ν_y/ν_x) C_{α/2, ν_x, ν_y}]   (23)
Table 10 gives the values of α and α′ (where α′ represents the type I error Pr from the Overlap procedure) for various values of ν_x and ν_y, verifying the same conclusion as before: the Overlap method always leads to a smaller type I error Pr than that of the null sampling distribution of S_x²/S_y², which is Fisher's F. Moreover, we have verified that the value of α′ depends on the sizes of ν_x and ν_y and not much on their ratio ν_y/ν_x. Eq. (23) easily verifies that at α = 0.01, as ν_x and ν_y increase, the Overlap type I error Pr α′ decreases toward 0.000269717, while at α = 0.05 the value of α′ decreases (from 0.017800531 at ν_x = ν_y = 1) toward 0.0055746, similar to the overlapping of CIs for population means. For the special case n_x = n_y = n, the rejection of H_0 from the Overlap method given by Eqs. (21a) and (21b) reduces to
⇒ Reject H_0 if either F_0 = S_x²/S_y² < χ²_{1−α/2, n−1}/χ²_{α/2, n−1} = C_{1−α/2, n−1}, or F_0 = S_x²/S_y² > χ²_{α/2, n−1}/χ²_{1−α/2, n−1} = C_{α/2, n−1}   (24)
Table 10. The Values of α and α′ for Various Values of ν_x and ν_y

α = 0.05: ν_x  ν_y  ν_y/ν_x  α′   α = 0.01: ν_x  ν_y  ν_y/ν_x  α′
10 20 2 0.007839 10 20 2 0.000624
10 40 4 0.009969 10 40 4 0.000912
10 60 6 0.011917 10 60 6 0.001182
10 80 8 0.013579 10 80 8 0.001420
30 20 0.666667 0.006640 30 20 0.666667 0.000425
30 40 1.333333 0.006262 30 40 1.333333 0.000368
30 60 2 0.006818 30 60 2 0.000424
30 80 2.666667 0.007546 30 80 2.666667 0.000502
50 20 0.4 0.007611 50 20 0.4 0.000534
50 50 1 0.005954 50 50 1 0.000326
50 80 1.6 0.006228 50 80 1.6 0.000349
50 100 2 0.006611 50 100 2 0.000386
100 60 0.6 0.006233 100 60 0.6 0.000346
100 80 0.8 0.005865 100 80 0.8 0.000308
500 500 1 0.005612 100 100 1 0.000297
1000 1000 1 0.005594 100 120 1.2 0.000300
20 30 1.5 0.006640 20 30 1.5 0.000425
40 60 1.5 0.006230 40 60 1.5 0.000355
80 120 1.5 0.006025 80 120 1.5 0.000322
100 150 1.5 0.005984 100 150 1.5 0.000315
20 40 2 0.007077 20 40 2 0.000473
40 80 2 0.006689 40 80 2 0.000400
80 160 2 0.006493 80 160 2 0.000365
1000 2000 2 0.006313 1000 2000 2 0.000333
40 100 2.5 0.007232 40 100 2.5 0.000456
60 150 2.5 0.007105 60 150 2.5 0.000430
100 250 2.5 0.007003 100 250 2.5 0.000410
1000 2500 2.5 0.006864 1000 2500 2.5 0.000383
30 150 5 0.010087 30 150 5 0.000798
40 200 5 0.009970 40 200 5 0.000765
50 250 5 0.009900 50 250 5 0.000746
100 500 5 0.009759 100 500 5 0.000706
1000 5000 5 0.009630 1000 5000 5 0.000671
From Eqs. (22a & b), the rejection of H_0 using Fisher's F simplifies as follows. Since F_{n−1, n−1} = (S_x²/σ_x²)/(S_y²/σ_y²), the (1 − α)100% CI for σ_x²/σ_y² is

(S_x²/S_y²) F_{1−α/2, n−1, n−1} ≤ σ_x²/σ_y² ≤ (S_x²/S_y²) F_{α/2, n−1, n−1}   (25a)
From Eq. (25a), the length of the exact (1 − α)100% CI is given by F_0 (F_{α/2, n−1, n−1} − F_{1−α/2, n−1, n−1}). Thus, H_0 should be rejected if

F_0 = S_x²/S_y² < F_{1−α/2, n−1, n−1} or F_0 = S_x²/S_y² > F_{α/2, n−1, n−1}.   (25b)
Comparing Eq. (24) with Eq. (25b), it can be verified (see Figure 8) that at the same α level,

χ²_{1−α/2, n−1}/χ²_{α/2, n−1} = C_{1−α/2, n−1} < F_{1−α/2, n−1, n−1}   (26a)

and

χ²_{α/2, n−1}/χ²_{1−α/2, n−1} = C_{α/2, n−1} > F_{α/2, n−1, n−1} for all n.   (26b)
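Inequalities (26a) and (26b) can be confirmed numerically for a few sample sizes (SciPy assumed; a sketch of mine, not the author's code):

```python
from scipy.stats import chi2, f

def cutoffs(alpha, n):
    """Overlap cutoff C_{alpha/2,n-1} and Standard cutoff F_{alpha/2,n-1,n-1} (upper-tail)."""
    nu = n - 1
    C_hi = chi2.ppf(1 - alpha / 2, nu) / chi2.ppf(alpha / 2, nu)
    F_hi = f.ppf(1 - alpha / 2, nu, nu)
    return C_hi, F_hi

for n in (3, 5, 11, 21, 51):
    C_hi, F_hi = cutoffs(0.05, n)
    assert C_hi > F_hi          # Eq. (26b)
    assert 1 / C_hi < 1 / F_hi  # Eq. (26a), using F_{1-p,nu,nu} = 1/F_{p,nu,nu}
```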
Figure 8. The rejection values of the chi-square-based cutoffs C and the corresponding F-distribution percentage points versus n − 1 (df), illustrating Eqs. (26a) and (26b).
Furthermore, for the balanced case n_x = n_y = n, if the type I error Pr for the Standard method (Fisher's F distribution) is α, the type I error Pr from the two disjoint CIs (α′), Eq. (23), reduces to

α′ = Pr[(n − 1)S_x²/χ²_{1−α/2, n−1} < (n − 1)S_y²/χ²_{α/2, n−1}] + Pr[(n − 1)S_x²/χ²_{α/2, n−1} > (n − 1)S_y²/χ²_{1−α/2, n−1}]
= Pr(S_x²/S_y² < C_{1−α/2, n−1}) + Pr(S_x²/S_y² > C_{α/2, n−1})
= Pr(F_{n−1, n−1} < C_{1−α/2, n−1}) + Pr(F_{n−1, n−1} > C_{α/2, n−1})
= 2 Pr(F_{n−1, n−1} > C_{α/2, n−1}), where C_{α/2, n−1} = χ²_{α/2, n−1}/χ²_{1−α/2, n−1}.   (27)
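Eq. (27) can be evaluated directly, assuming SciPy is available (a sketch of mine, not the author's code); it reproduces the Table 11 entry at n − 1 = 10, α = 0.05:

```python
from scipy.stats import chi2, f

def overlap_type1(alpha, nu):
    """Eq. (27): alpha' = 2*Pr(F_{nu,nu} > C), with C = chi2_{alpha/2,nu}/chi2_{1-alpha/2,nu}
    (chi-square percentage points taken as upper-tail values)."""
    C = chi2.ppf(1 - alpha / 2, nu) / chi2.ppf(alpha / 2, nu)
    return 2 * f.sf(C, nu, nu)

print(overlap_type1(0.05, 10))  # Table 11 gives 0.007468
print(overlap_type1(0.01, 10))  # Table 11 gives 0.000585
```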
Table 11 shows that α′ is much smaller than α for the special case n_x = n_y = n. As in the case of testing H_0: μ_x = μ_y, at α = 0.05 the value of the Overlap type I error Pr seems to approach 0.0055746 slowly as n → ∞, and at α = 0.01, α′ approaches 0.0002697.

Table 11. The Impact of Overlap on Type I Error Pr for the Equal-Sample-Size Case When Testing the Ratio σ_x²/σ_y² Against 1

n − 1  α  α′  n − 1  α  α′
10 0.01 0.000585 10 0.05 0.007468
20 0.01 0.000418 20 0.05 0.006525
50 0.01 0.000326 50 0.05 0.005954
80 0.01 0.000304 80 0.05 0.005812
100 0.01 0.000297 100 0.05 0.005764
130 0.01 0.000291 130 0.05 0.005720
150 0.01 0.000288 150 0.05 0.005701
200 0.01 0.000283 200 0.05 0.005669
250 0.01 0.000281 250 0.05 0.005650
500 0.01 0.000275 500 0.05 0.005612
1000 0.01 0.000272 1000 0.05 0.005594
2000 0.01 0.000271 2000 0.05 0.005584
3000 0.01 0.000271 3000 0.05 0.005581
As before, let Δ represent the length of overlap between the CIs for σ_x² and σ_y². Thus, Δ is larger than 0 only if U(σ_x²) > U(σ_y²) > L(σ_x²) or U(σ_y²) > U(σ_x²) > L(σ_y²). Because both conditions lead to the same result, only the case U(σ_x²) > U(σ_y²) > L(σ_x²) is considered, and without loss of generality the X-sample is the one with the larger variance, so that S_x²/S_y² ≥ 1.
Because of symmetry,

Δ = U(σ_y²) − L(σ_x²) = ν_y S_y²/χ²_{1−α/2, ν_y} − ν_x S_x²/χ²_{α/2, ν_x}   (28a)

Let O_r be the maximum value of Δ at which H_0 is barely rejected at the α level. From Eq. (22c), H_0 must be rejected iff F_0 = S_x²/S_y² > F_{α/2, ν_x, ν_y}. Therefore, the borderline value of Δ will occur when S_x² = S_y² F_{α/2, ν_x, ν_y}. Inserting this into Eq. (28a) results in:
O_r = ν_y S_y²/χ²_{1−α/2, ν_y} − ν_x S_y² F_{α/2, ν_x, ν_y}/χ²_{α/2, ν_x} = S_y² [ν_y/χ²_{1−α/2, ν_y} − ν_x F_{α/2, ν_x, ν_y}/χ²_{α/2, ν_x}]   (28b)
The span of the two individual CIs is

U(σ_x²) − L(σ_y²) = ν_x S_x²/χ²_{1−α/2, ν_x} − ν_y S_y²/χ²_{α/2, ν_y} = ν_x S_y² F_{α/2, ν_x, ν_y}/χ²_{1−α/2, ν_x} − ν_y S_y²/χ²_{α/2, ν_y}
= S_y² [ν_x F_{α/2, ν_x, ν_y}/χ²_{1−α/2, ν_x} − ν_y/χ²_{α/2, ν_y}]   (28c)
Thus, the percent overlap at the critical limits is

Δ_r = [U(σ_y²) − L(σ_x²)]/[U(σ_x²) − L(σ_y²)] × 100%
= {[ν_y/χ²_{1−α/2, ν_y} − ν_x F_{α/2, ν_x, ν_y}/χ²_{α/2, ν_x}]/[ν_x F_{α/2, ν_x, ν_y}/χ²_{1−α/2, ν_x} − ν_y/χ²_{α/2, ν_y}]} × 100%   (28d)
Table 12 shows that as ν_x and ν_y increase, the percentage of the overlap approaches 17.1573% (although not monotonically). Further, once the % overlap exceeds Eq. (28d), H_0 must not be rejected at the α·100% level. Further, it is the size of ν_x and ν_y that

Table 12. The % Overlap for Different Combinations of Degrees of Freedom at α = 0.05

ν_x  ν_y  ν_y/ν_x  Overlap (%)  ν_x  ν_y  ν_y/ν_x  Overlap (%)
10 5 0.5 13.92184 10 12 1.2 10.91515
10 10 1 11.54543 20 24 1.2 13.50590
10 15 1.5 10.15131 40 48 1.2 15.04864
10 20 2 9.18956 60 72 1.2 15.62065
10 25 2.5 8.46961 80 96 1.2 15.92389
20 10 0.5 16.24648 100 120 1.2 16.11365
20 20 1 14.11596 150 180 1.2 16.37981
20 30 1.5 12.73993 300 360 1.2 16.67183
20 40 2 11.73402 500 600 1.2 16.80378
20 50 2.5 10.95112 700 840 1.2 16.86607
40 20 0.5 17.20430 800 960 1.2 16.88670
40 40 1 15.57376 900 1080 1.2 16.90327
40 60 1.5 14.35904 1000 1200 1.2 16.91691
40 80 2 13.40830 2000 2400 1.2 16.98483
40 100 2.5 12.63712 3000 3600 1.2 17.01169
60 30 0.5 17.40395 10 15 1.5 10.15131
60 60 1 16.08722 20 30 1.5 12.73993
60 90 1.5 14.98840 40 60 1.5 14.35904
60 120 2 14.08884 60 90 1.5 14.98840
60 150 2.5 13.34073 80 120 1.5 15.33298
80 40 0.5 17.45225 100 150 1.5 15.55403
80 80 1 16.34928 150 225 1.5 15.87382
80 120 1.5 15.33298 300 450 1.5 16.24474
80 160 2 14.47217 500 750 1.5 16.42384
80 200 2.5 13.74343 700 1050 1.5 16.51241
100 50 0.5 17.45418 800 1200 1.5 17.76296
100 100 1 16.50824 900 1350 1.5 16.56697
100 150 1.5 15.55403 1000 1500 1.5 16.58737
100 200 2 14.72328 2000 3000 1.5 16.69266
100 250 2.5 14.01021 3000 4500 1.5 16.73654
determines the % overlap and not the ratio ν_y/ν_x. For the case n_x = n_y = n, the percent overlap in Eq. (28d) reduces to
Δ_r = [U(σ_y²) − L(σ_x²)]/[U(σ_x²) − L(σ_y²)] × 100%
= {[1/χ²_{1−α/2, n−1} − F_{α/2, n−1, n−1}/χ²_{α/2, n−1}]/[F_{α/2, n−1, n−1}/χ²_{1−α/2, n−1} − 1/χ²_{α/2, n−1}]} × 100%
= {[C_{α/2, n−1} − F_{α/2, n−1, n−1}]/[C_{α/2, n−1} F_{α/2, n−1, n−1} − 1]} × 100%   (29)
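Eq. (29) can likewise be checked numerically (SciPy assumed; an illustrative sketch of mine) against the n − 1 = 10 entry of Table 13:

```python
from scipy.stats import chi2, f

def pct_overlap_var(alpha, nu):
    """Eq. (29): 100*(C - F)/(C*F - 1), with C = chi2_{alpha/2,nu}/chi2_{1-alpha/2,nu}
    and F = F_{alpha/2,nu,nu} (upper-tail percentage points)."""
    C = chi2.ppf(1 - alpha / 2, nu) / chi2.ppf(alpha / 2, nu)
    F = f.ppf(1 - alpha / 2, nu, nu)
    return 100 * (C - F) / (C * F - 1)

print(pct_overlap_var(0.05, 10))  # Table 13 gives 11.54543
```

As the text notes, the value increases with n toward the 17.1573% limit found for the means case.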
Eq. (29) shows that the rejection percent overlap between the two CIs for the ratio of variances will increase as n increases. Further, the percent overlap in Eq. (12b) is also a monotonically increasing function of ν. For example, at α = 0.05 and n = 2 the overlap is 0.1348%; at n = 3 it is 1.8781%; at n = 5 it is 6.0921%; and at n = 20 and α = 0.05 it is 13.9695%, while at n = 20 and α = 0.01 it is 12.0224%. Matlab shows that at ν = n − 1 = 7,819,285 df [note that Matlab 7.6 (R2008a) loses accuracy in inverting F at the 7th decimal place beyond 7,819,285 df], the 0.05-level overlap is 17.157261356%, which is very close to the overlap for two independent normal population means discussed in Section 2 (which was 17.157287525%). Further, for very small sample sizes within the interval [2, 4], the Variance-Overlap method is almost an α-level test, like the case of CIs for normal population means when σ_x = σ_y and K is far away from 1. See the illustration in Table 13.
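The closed form in Eq. (29) is a one-line computation once the F and χ² quantiles are available. The sketch below (scipy is assumed here; the dissertation's own computations were done in Matlab) reproduces the n − 1 = 20 row of Table 13:

```python
from scipy import stats

def percent_overlap(alpha: float, nu: int) -> float:
    """Percent overlap of Eq. (29) for n_x = n_y = n, with nu = n - 1.

    Subscripts in the text denote upper-tail areas, so chi2_{alpha/2,nu}
    is the upper alpha/2 point, i.e. ppf(1 - alpha/2).
    """
    F = stats.f.ppf(1 - alpha / 2, nu, nu)                              # F_{alpha/2,nu,nu}
    C = stats.chi2.ppf(1 - alpha / 2, nu) / stats.chi2.ppf(alpha / 2, nu)  # C_{alpha/2,nu}
    return (C - F) / (C * F - 1) * 100.0

print(round(percent_overlap(0.05, 20), 3))   # Table 13 lists 14.11596 at nu = 20
```

As the table illustrates, the value grows with ν toward the limiting 17.157% overlap of the known-variance means case.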
Table 13. The % Overlap for the Case of α = 0.05 and n_x = n_y = n

n−1 | Numerator of Eq.(31) | Denominator of Eq.(31) | Overlap (%) | n−1 | Numerator of Eq.(31) | Denominator of Eq.(31) | Overlap (%)
10 0.12652 1.09587 11.54543 200 0.00067 0.00397 16.83011
20 0.03214 0.22770 14.11596 400 0.00023 0.00133 16.99303
30 0.01541 0.10223 15.07427 600 0.00012 0.00070 17.04763
40 0.00933 0.05990 15.57376 800 0.00008 0.00045 17.07499
50 0.00637 0.04014 15.88014 1000 0.00005 0.00032 17.09142
60 0.00469 0.02917 16.08722 1200 0.00004 0.00024 17.10239
70 0.00363 0.02237 16.23652 1300 0.00004 0.00021 17.10661
80 0.00291 0.01783 16.34928 1400 0.00003 0.00019 17.11022
90 0.00240 0.01462 16.43743 1500 0.00003 0.00017 17.11336
100 0.00202 0.01227 16.50824 2000 0.00002 0.00011 17.12433
Now, what should the individual confidence level 1 − γ be so that the two independent CIs lead to the exact α-level test on H0: σ_x² = σ_y²? The expressions for the two (1 − γ) independent CIs are given by

(n_x − 1)S_x²/χ²_{γ/2,ν_x} ≤ σ_x² ≤ (n_x − 1)S_x²/χ²_{1−γ/2,ν_x},   (30a)

and

(n_y − 1)S_y²/χ²_{γ/2,ν_y} ≤ σ_y² ≤ (n_y − 1)S_y²/χ²_{1−γ/2,ν_y}.   (30b)
From Eq. (30a) and Eq. (30b), the overlap amount of the two individual CIs at confidence level (1 − γ) is U′(σ_y²) − L′(σ_x²). Therefore, we deduce from (30a & b) that

U′(σ_y²) − L′(σ_x²) = ν_y S_y²/χ²_{1−γ/2,ν_y} − ν_x S_x²/χ²_{γ/2,ν_x}.   (30c)
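The CIs of Eqs. (30a & b) and the overlap amount of Eq. (30c) are straightforward to compute. A minimal sketch follows (scipy is assumed, and the sample values S_x² = 4.0, S_y² = 2.5 are hypothetical, chosen only for illustration):

```python
from scipy import stats

def var_ci(s2: float, n: int, gamma: float):
    """Eq. (30a/b): nu*s2/chi2_{gamma/2,nu} <= sigma^2 <= nu*s2/chi2_{1-gamma/2,nu}.

    Subscripts denote upper-tail areas, so chi2_{gamma/2,nu} = ppf(1 - gamma/2).
    """
    nu = n - 1
    lower = nu * s2 / stats.chi2.ppf(1 - gamma / 2, nu)
    upper = nu * s2 / stats.chi2.ppf(gamma / 2, nu)
    return lower, upper

# hypothetical samples: S_x^2 = 4.0 (n_x = 21) and S_y^2 = 2.5 (n_y = 16)
Lx, Ux = var_ci(4.0, 21, 0.05)
Ly, Uy = var_ci(2.5, 16, 0.05)
overlap_amount = Uy - Lx          # Eq. (30c): U'(sigma_y^2) - L'(sigma_x^2)
print(round(Lx, 4), round(Ux, 4), round(overlap_amount, 4))
```

A positive overlap amount means the two CIs intersect, so the overlap rule would not reject H0: σ_x² = σ_y² for these data.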
Because H0: σ_x² = σ_y² must be rejected at the α×100% level as soon as Eq. (30c) becomes zero or smaller, we thus impose the rejection criterion S_x²/S_y² ≥ F_{α/2} (where, for the sake of convenience, F_{α/2} = F_{α/2,ν_x,ν_y}) into Eq. (30c). In short, we are rejecting H0: σ_x² = σ_y² as soon as the two independent CIs in (30a) and (30b) become disjoint. This leads to rejecting H0: σ_x² = σ_y² iff

U′(σ_y²) − L′(σ_x²) = ν_y S_y²/χ²_{1−γ/2,ν_y} − ν_x F_{α/2} S_y²/χ²_{γ/2,ν_x} ≤ 0.   (31a)
At the borderline value, we set the overlap amount at LOS α in inequality (31a) to 0 in order to solve for γ:

ν_y S_y²/χ²_{1−γ/2,ν_y} − ν_x F_{α/2} S_y²/χ²_{γ/2,ν_x} = 0 ⇒ ν_x F_{α/2}/χ²_{γ/2,ν_x} − ν_y/χ²_{1−γ/2,ν_y} = 0

⇒ (ν_x/ν_y) F_{α/2} = χ²_{γ/2,ν_x}/χ²_{1−γ/2,ν_y} = C_{γ/2,ν_x,ν_y},   (31b)

where F_{α/2} = F_{α/2,ν_x,ν_y} and C_{γ/2,ν_x,ν_y} = χ²_{γ/2,ν_x}/χ²_{1−γ/2,ν_y}. Eq. (31b) clearly shows that
the value of γ depends on the LOS α of testing H0: σ_x² = σ_y² and the sample sizes n_x and n_y. For example, when α = 0.05, n_x = 21 & n_y = 11, Eq. (31b) reduces to 2F_{0.025,20,10} = C_{γ/2,20,10} = χ²_{γ/2,20}/χ²_{1−γ/2,10} ⇒ 2×3.4185 = χ²_{γ/2,20}/χ²_{1−γ/2,10} ⇒ 6.8371 = χ²_{γ/2,20}/χ²_{1−γ/2,10}. Through trial & error, the solution to this last equation is γ/2 = 0.0712 (γ = 0.1424). It turns out that as long as ν_x = 2ν_y, the required confidence level for the two independent CIs on σ_x² and σ_y² must be set approximately equal to 1 − 2×0.0712 = 85.76%. Further, if ν_y = 2ν_x, the required confidence level for the two independent CIs on σ_x² and σ_y² must be set approximately equal to 1 − 2×0.083 = 83.40%. In the case of balanced design (i.e., when ν_x = ν_y), Eq. (31b) reduces to

F_{α/2,n−1,n−1} = C_{γ/2,n−1}.   (31c)

It can be verified through trial & error that the approximate solution to Eq. (31c) when α = 0.05 and n − 1 = 10 is γ/2 = 0.079. Therefore, the individual CIs have to be set at 84.20%. For moderate sample sizes 10 ≤ n ≤ 30, the approximate solution is 0.08. Because MS Excel 2003 cannot invert χ²_ν for df beyond ν = 1119, we used Matlab to determine that in the limit (as n → 7,819,286 at 7-decimal accuracy), γ/2 → 0.08288800. Table 14 shows the value of γ that makes the two sides of Eq. (31b) equal for different ν_x and ν_y combinations. Table 15 shows the cases where ν_y is kept fixed at 20 but the ratio ν_x/ν_y changes from 0.5 to 50, causing γ to become smaller and smaller.
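The trial-and-error solution of Eq. (31b) for γ is a one-dimensional root find. A sketch (scipy-based; the author's computations used Matlab/Excel) that recovers γ/2 = 0.0712 for α = 0.05, ν_x = 20, ν_y = 10:

```python
from scipy import stats
from scipy.optimize import brentq

def gamma_half(alpha: float, nu_x: int, nu_y: int) -> float:
    """Solve Eq. (31b): (nu_x/nu_y) F_{alpha/2,nu_x,nu_y} = C_{g,nu_x,nu_y}
    for g = gamma/2, where C_{g,...} = chi2_{g,nu_x}/chi2_{1-g,nu_y} and
    subscripts are upper-tail areas."""
    lhs = (nu_x / nu_y) * stats.f.ppf(1 - alpha / 2, nu_x, nu_y)

    def resid(g: float) -> float:
        C = stats.chi2.ppf(1 - g, nu_x) / stats.chi2.ppf(g, nu_y)
        return C - lhs  # decreasing in g, so a sign change brackets the root

    return brentq(resid, 1e-6, 0.49)

print(round(gamma_half(0.05, 20, 10), 4))   # text: gamma/2 = 0.0712
```

The same routine reproduces the γ columns of Tables 14 and 15 row by row.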
Table 14. The Overlap Significance Level, γ, that Yields the Same 5%-Level Test or 1%-Level Test by the Standard Method

α = 0.05: ν_x | ν_y | ν_x/ν_y | (ν_x/ν_y)F_{α/2} | γ | C_{γ/2,ν_x,ν_y} || α = 0.01: ν_x | ν_y | ν_x/ν_y | (ν_x/ν_y)F_{α/2} | γ | C_{γ/2,ν_x,ν_y}
10 20 0.5 1.38684 0.16658 1.38684 10 20 0.5 1.92350 0.06875 1.92350
20 40 0.5 1.03386 0.16560 1.03386 20 40 0.5 1.29921 0.06878 1.29921
30 60 0.5 0.90760 0.16489 0.90760 30 60 0.5 1.09372 0.06847 1.09372
40 80 0.5 0.83952 0.16440 0.83952 40 80 0.5 0.98697 0.06819 0.98697
50 100 0.5 0.79585 0.16402 0.79585 50 100 0.5 0.92002 0.06796 0.92002
60 120 0.5 0.76497 0.16373 0.76497 60 120 0.5 0.87343 0.06776 0.87343
80 160 0.5 0.72348 0.16329 0.72348 80 160 0.5 0.81179 0.06745 0.81179
100 200 0.5 0.69635 0.16297 0.69635 100 200 0.5 0.77209 0.06722 0.77209
200 400 0.5 0.63291 0.16211 0.63291 200 400 0.5 0.68121 0.06658 0.68121
500 1000 0.5 0.58092 0.16128 0.58092 500 1000 0.5 0.60881 0.06592 0.60881
1000 2000 0.5 0.55615 0.16117 0.55611 1000 2000 0.5 0.57498 0.06552 0.57499
2000 4000 0.5 0.53918 0.16075 0.53915 2000 4000 0.5 0.55207 0.06551 0.55203
10 10 1 3.71679 0.15810 3.71679 10 10 1 5.84668 0.05981 5.84668
20 20 1 2.46448 0.16189 2.46448 20 20 1 3.31779 0.06400 3.31779
30 30 1 2.07394 0.16317 2.07394 30 30 1 2.62778 0.06548 2.62778
40 40 1 1.87520 0.16382 1.87520 40 40 1 2.29584 0.06623 2.29584
50 50 1 1.75195 0.16421 1.75195 50 50 1 2.09671 0.06669 2.09671
60 60 1 1.66679 0.16447 1.66679 60 60 1 1.96217 0.06699 1.96217
80 80 1 1.55488 0.16480 1.55488 80 80 1 1.78924 0.06738 1.78924
100 100 1 1.48325 0.16499 1.48325 100 100 1 1.68089 0.06761 1.68089
200 200 1 1.32045 0.16538 1.32045 200 200 1 1.44159 0.06808 1.44159
500 500 1 1.19185 0.16562 1.19185 500 500 1 1.25956 0.06836 1.25956
1000 1000 1 1.13205 0.16570 1.13205 1000 1000 1 1.17708 0.06846 1.17708
2000 2000 1 1.09164 0.16568 1.09164 2000 2000 1 1.12214 0.06854 1.12213
10 5 2 13.23831 0.13278 13.23831 10 5 2 27.23636 0.04228 27.23640
20 10 2 6.83709 0.14239 6.83709 20 10 2 10.54803 0.04970 10.54800
30 15 2 5.28747 0.14629 5.28747 30 15 2 7.37349 0.05296 7.37349
40 20 2 4.57464 0.14848 4.57464 40 20 2 6.04306 0.05483 6.04306
50 25 2 4.15744 0.14990 4.15744 50 25 2 5.30448 0.05607 5.30448
60 30 2 3.88002 0.15092 3.88002 60 30 2 4.83030 0.05696 4.83030
80 40 2 3.52875 0.15230 3.52875 80 40 2 4.24979 0.05817 4.24979
100 50 2 3.31170 0.15320 3.31170 100 50 2 3.90249 0.05896 3.90249
200 100 2 2.84057 0.15532 2.84057 200 100 2 3.17944 0.06083 3.17944
500 250 2 2.48968 0.15705 2.48968 500 250 2 2.66931 0.06234 2.66931
1000 500 2 2.33277 0.15786 2.33277 1000 500 2 2.44932 0.06305 2.44932
2000 1000 2 2.22893 0.15825 2.22900 2000 1000 2 2.30657 0.06352 2.30658
Table 15. The Overlap Significance Level, γ, That Yields the Same 5%-Level Test or 1%-Level Test by the Standard Method at Fixed ν_y and Changing ν_x

α = 0.05: ν_x | ν_y | ν_x/ν_y | (ν_x/ν_y)F_{α/2} | γ | C_{γ/2,ν_x,ν_y} || α = 0.01: ν_x | ν_y | ν_x/ν_y | (ν_x/ν_y)F_{α/2} | γ | C_{γ/2,ν_x,ν_y}
10 20 0.5 1.38684 0.16658 1.38684 10 20 0.5 1.92350 0.06875 1.92350
12 20 0.6 1.60550 0.16628 1.60550 12 20 0.6 2.20674 0.06803 2.20674
16 20 0.8 2.03723 0.16445 2.03723 16 20 0.8 2.76540 0.06611 2.76540
20 20 1 2.46448 0.16189 2.46448 20 20 1 3.31779 0.06400 3.31779
24 20 1.2 2.88907 0.15910 2.88907 24 20 1.2 3.86643 0.06192 3.86643
28 20 1.4 3.31194 0.15629 3.31194 28 20 1.4 4.41266 0.05995 4.41266
32 20 1.6 3.73362 0.15356 3.73362 32 20 1.6 4.95722 0.05811 4.95722
36 20 1.8 4.15445 0.15095 4.15445 36 20 1.8 5.50059 0.05641 5.50059
40 20 2 4.57464 0.14848 4.57464 40 20 2 6.04306 0.05483 6.04306
50 20 2.5 5.62323 0.14287 5.62323 50 20 2.5 7.39659 0.05139 7.39659
60 20 3 6.67008 0.13802 6.67008 60 20 3 8.74765 0.04853 8.74765
70 20 3.5 7.71585 0.13380 7.71585 70 20 3.5 10.09722 0.04612 10.09722
80 20 4 8.76092 0.13010 8.76092 80 20 4 11.44579 0.04407 11.44579
90 20 4.5 9.80550 0.12683 9.80550 90 20 4.5 12.79368 0.04229 12.79368
100 20 5 10.84972 0.12392 10.84972 100 20 5 14.14107 0.04074 14.14107
110 20 5.5 11.89369 0.12130 11.89369 110 20 5.5 15.48809 0.03936 15.48809
120 20 6 12.93745 0.11894 12.93745 120 20 6 16.83483 0.03814 16.83482
130 20 6.5 13.98105 0.11679 13.98105 130 20 6.5 18.18134 0.03705 18.18134
140 20 7 15.02453 0.11482 15.02453 140 20 7 19.52768 0.03606 19.52768
1000 20 50 104.70358 0.07610 104.70360 1000 20 50 135.22825 0.01898 135.22158
Next, the Type II error Pr for both the F-distribution and separate-CIs cases is discussed. Comparing equations (21 a & b) with Eq. (22c), because (ν_y/ν_x)·C_{α/2,ν_x,ν_y} > F_{α/2,ν_x,ν_y} and (ν_y/ν_x)·C_{1−α/2,ν_x,ν_y} < F_{1−α/2,ν_x,ν_y} (see the illustration in Figure 7), it follows that the disjoint CIs provide a more stringent requirement for rejecting H0. Thus, the rejection rule from two disjoint CIs will always lead to a larger Type II error Pr (or much less statistical power), as illustrated below. By definition, β = Pr(Type II error) = Pr(not rejecting H0 | H0 is false). Since H0 is assumed false, it follows that σ_x² ≠ σ_y². Let λ = σ_x/σ_y ⇒ σ_x² = λ²σ_y². Thus, β(λ) = Pr(F_{1−α/2,ν_x,ν_y} ≤ F_0 ≤ F_{α/2,ν_x,ν_y} | λ = σ_x/σ_y)
⇒ β(λ) = cdfF_{ν_x,ν_y}(F_{α/2,ν_x,ν_y}/λ²) − cdfF_{ν_x,ν_y}(F_{1−α/2,ν_x,ν_y}/λ²)   (32a)
And the Type II error Pr of the two independent CIs is given by

β′(λ) = Pr{[L(σ_x²) ≤ U(σ_y²)] ∩ [L(σ_y²) ≤ U(σ_x²)] | λ = σ_x/σ_y}

= Pr{[ν_x S_x²/χ²_{α/2,ν_x} ≤ ν_y S_y²/χ²_{1−α/2,ν_y}] ∩ [ν_y S_y²/χ²_{α/2,ν_y} ≤ ν_x S_x²/χ²_{1−α/2,ν_x}] | λ = σ_x/σ_y}

= Pr{(ν_y/ν_x)·(χ²_{1−α/2,ν_x}/χ²_{α/2,ν_y}) ≤ S_x²/S_y² ≤ (ν_y/ν_x)·(χ²_{α/2,ν_x}/χ²_{1−α/2,ν_y}) | λ = σ_x/σ_y}

= Pr[(1/λ²)(ν_y/ν_x)·C_{1−α/2,ν_x,ν_y} ≤ F_{ν_x,ν_y} ≤ (1/λ²)(ν_y/ν_x)·C_{α/2,ν_x,ν_y}]

= cdfF_{ν_x,ν_y}((1/λ²)(ν_y/ν_x)·C_{α/2,ν_x,ν_y}) − cdfF_{ν_x,ν_y}((1/λ²)(ν_y/ν_x)·C_{1−α/2,ν_x,ν_y})   (32b)
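Eqs. (32a) and (32b) can be evaluated directly from F and χ² quantiles. A sketch (scipy is assumed) that reproduces the first row of Table 16 (ν_x = ν_y = 10, λ = 1.2, α = 0.05):

```python
from scipy import stats

def beta_exact(lam, alpha, nu_x, nu_y):
    """Type II error of the standard F-test, Eq. (32a); lam = sigma_x/sigma_y."""
    Fu = stats.f.ppf(1 - alpha / 2, nu_x, nu_y)   # F_{alpha/2,nu_x,nu_y}
    Fl = stats.f.ppf(alpha / 2, nu_x, nu_y)       # F_{1-alpha/2,nu_x,nu_y}
    return stats.f.cdf(Fu / lam**2, nu_x, nu_y) - stats.f.cdf(Fl / lam**2, nu_x, nu_y)

def beta_overlap(lam, alpha, nu_x, nu_y):
    """Type II error of the two-overlapping-CIs rule, Eq. (32b)."""
    r = nu_y / nu_x
    Cu = stats.chi2.ppf(1 - alpha / 2, nu_x) / stats.chi2.ppf(alpha / 2, nu_y)
    Cl = stats.chi2.ppf(alpha / 2, nu_x) / stats.chi2.ppf(1 - alpha / 2, nu_y)
    return (stats.f.cdf(r * Cu / lam**2, nu_x, nu_y)
            - stats.f.cdf(r * Cl / lam**2, nu_x, nu_y))

print(round(beta_exact(1.2, 0.05, 10, 10), 5),     # Table 16 lists 0.91766
      round(beta_overlap(1.2, 0.05, 10, 10), 5))   # Table 16 lists 0.98481
```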
Table 16 illustrates that the Type II error Pr from the two overlapping CIs (Eq. (32b)) is larger than the corresponding exact value from the F distribution (Eq. (32a)). For the case n_x = n_y = n, the Type II error Pr, β(λ) in Eq. (32a), reduces to

β(λ) = cdfF_{n−1,n−1}(F_{α/2,n−1,n−1}/λ²) − cdfF_{n−1,n−1}(F_{1−α/2,n−1,n−1}/λ²)   (33a)

As n increases, the second term on the RHS of Eq. (33a), cdfF_{n−1,n−1}(F_{1−α/2,n−1,n−1}/λ²), becomes smaller. For example, at λ = 1.6, Table 17 shows that when n ≥ 10, the 2nd term is less
Table 16. The Relative Power of the Overlap to the Standard Method for Different df Combinations at λ = 1.2

ν_x | ν_y | β | β′ | [1 − (1−β′)/(1−β)]×100% | ν_x | ν_y | β | β′ | [1 − (1−β′)/(1−β)]×100%
10 10 0.91766 0.98481 81.55134 100 10 0.91207 0.96043 54.99654
10 20 0.89163 0.98041 81.92535 100 40 0.74692 0.91585 66.74959
10 30 0.87733 0.97509 79.69653 100 70 0.63397 0.87179 64.97326
10 40 0.86842 0.96990 77.12143 100 100 0.55858 0.83125 61.77049
10 50 0.86236 0.96505 74.61099 100 150 0.47954 0.75978 53.84462
20 10 0.91548 0.97991 76.22917 120 20 0.84992 0.94123 60.84118
20 20 0.87759 0.97518 79.72614 120 50 0.69250 0.89059 64.41798
20 30 0.85247 0.96921 79.12977 120 80 0.58356 0.84178 62.00594
20 40 0.83499 0.96306 77.61297 120 110 0.50857 0.79724 58.74004
20 50 0.82223 0.95707 75.85031 120 150 0.44078 0.74501 54.40270
40 20 0.86430 0.96531 74.43748 150 30 0.78694 0.91608 60.61311
40 30 0.82554 0.95716 75.44521 150 70 0.59415 0.83886 60.29563
40 40 0.79525 0.94876 74.97338 150 100 0.49836 0.78518 57.17595
40 50 0.77124 0.94041 73.95230 150 130 0.43027 0.73690 53.81970
40 60 0.75187 0.93229 72.71069 150 160 0.38045 0.69411 50.62713
70 20 0.85581 0.95410 68.16592 200 50 0.66561 0.85840 57.65517
70 40 0.76416 0.93059 70.56976 200 80 0.52932 0.79054 55.49954
70 60 0.69858 0.90724 69.22532 200 120 0.40562 0.70859 50.97174
70 80 0.65095 0.88509 67.07999 200 150 0.34220 0.65484 47.52817
70 100 0.61528 0.86454 64.78992 200 180 0.29514 0.60752 44.31852
Table 17. Type II Error for Different Degrees of Freedom (The Case of n_x = n_y = n)

ν | λ | Eq.(33a) 1st term | Eq.(33a) 2nd term | β(λ) | ν | λ | Eq.(33a) 1st term | Eq.(33a) 2nd term | β(λ)
5 1 0.975 0.025 0.95 21 1.6 0.4451093 5.2063E-05 0.4450573
5 1.2 0.9482959 0.0115 0.9367963 22 1.6 0.4243906 4.2048E-05 0.4243486
5 1.4 0.9090074 0.005788 0.9032193 23 1.6 0.4043861 3.4049E-05 0.4043521
5 1.6 0.8578487 0.003139 0.8547093 24 1.6 0.3850938 2.7639E-05 0.3850662
5 1.8 0.7971565 0.001811 0.7953453 25 1.6 0.3665092 2.2487E-05 0.3664867
5 2 0.7301745 0.0011 0.7290745 30 1.6 0.2838905 8.2591E-06 0.2838822
5 2.1 0.6954029 0.000872 0.6945313 35 1.6 0.2172298 3.1597E-06 0.2172266
10 1.2 0.9246265 0.006969 0.9176572 40 1.6 0.1644551 1.2481E-06 0.1644538
10 1.4 0.8361705 0.002117 0.8340535 45 1.6 0.123329 5.0597E-07 0.1233284
10 1.6 0.7168104 0.000705 0.716105 50 1.6 0.0917083 2.0959E-07 0.0917081
15 1.2 0.9024968 0.004697 0.8977997 55 1.6 0.067677 8.8419E-08 0.0676769
15 1.4 0.7639183 0.000929 0.7629895 60 1.6 0.0495987 3.7894E-08 0.0495986
15 1.6 0.5840969 0.000201 0.5838961 65 1.6 0.0361208 1.6465E-08 0.0361207
20 1.1 0.9400273 0.009206 0.9308214 70 1.6 0.0261534 7.2419E-09 0.0261534
20 1.2 0.880935 0.003341 0.877594 75 1.6 0.0188356 3.2198E-09 0.0188356
20 1.3 0.7969205 0.001215 0.7957056 80 1.6 0.0134984 1.4455E-09 0.0134984
than 0.001, so that the 1st term on the RHS of (33a) gives the approximate value of β(λ = 1.6) to 3-decimal accuracy once n ≥ 10 and λ ≥ 1.6, where ν = n − 1 = degrees of freedom. Thus, for the n_x = n_y = n case, if the acceptance criterion is based on overlapping of the two CIs, then Eq. (32b) changes to

β′(λ) = Pr[(σ_y²/σ_x²)·(χ²_{1−α/2,n−1}/χ²_{α/2,n−1}) ≤ F_{n−1,n−1} ≤ (σ_y²/σ_x²)·(χ²_{α/2,n−1}/χ²_{1−α/2,n−1}) | λ = σ_x/σ_y]

= Pr[(1/λ²)·(χ²_{1−α/2,n−1}/χ²_{α/2,n−1}) ≤ F_{n−1,n−1} ≤ (1/λ²)·(χ²_{α/2,n−1}/χ²_{1−α/2,n−1})]

= cdfF(C_{α/2,n−1}/λ²) − cdfF(C_{1−α/2,n−1}/λ²), where C_{α/2,n−1} = χ²_{α/2,n−1}/χ²_{1−α/2,n−1}.   (33b)
As the degrees of freedom ν (= n − 1) or λ increases, the second term of Eq. (33b), cdfF(C_{1−α/2,n−1}/λ²) = cdfF(1/(λ²C_{α/2,n−1})), becomes smaller. For example, if ν is fixed at 10, the cumulative probability of the second term on the RHS of Eq. (33b) will be less than 0.001 when λ = 1.2. Conversely, if λ is kept at 1.2, the 2nd term is less than 0.001 if ν ≥ 9 (see the illustration in Table 18). Based on the above discussion, Eqs. (33) can be approximated as

β(λ) ≅ cdfF(F_{α/2,n−1,n−1}/λ²) and β′(λ) ≅ cdfF(C_{α/2,n−1}/λ²).   (33c)
In Eqs. (26), χ²_{α/2,n−1}/χ²_{1−α/2,n−1} = C_{α/2,n−1} > F_{α/2,n−1,n−1} for all n, so C_{α/2,n−1}/λ² > F_{α/2,n−1,n−1}/λ², and as a result β′(λ) = cdfF(C_{α/2,n−1}/λ²) > cdfF(F_{α/2,n−1,n−1}/λ²) = β(λ), i.e., β′ is larger than β for all n. This conclusion is the same as that of testing the difference in population means. Thus, the disjoint confidence intervals always lead to less statistical power (1 − β′ < 1 − β).

For the pooled t-test of H0: μ_x = μ_y, the rejection condition is

|x̄ − ȳ| > t_{α/2,ν} S_p √(1/n_x + 1/n_y).   (36c)
But, for the individual two t-CIs, the rejection condition is either L(μ_x) > U(μ_y) or L(μ_y) > U(μ_x). Using the definition of Type I error Pr, and bearing in mind that t²_ν = F_{1,ν}, leads to

α′ = Pr(reject H0 | μ_x − μ_y = 0) = Pr[L(μ_x) > U(μ_y)] + Pr[L(μ_y) > U(μ_x)]
= Pr[x̄ − t_{α/2,ν_x}S_x/√n_x > ȳ + t_{α/2,ν_y}S_y/√n_y] + Pr[ȳ − t_{α/2,ν_y}S_y/√n_y > x̄ + t_{α/2,ν_x}S_x/√n_x]

= Pr[x̄ − ȳ > t_{α/2,ν_x}S_x/√n_x + t_{α/2,ν_y}S_y/√n_y] + Pr[x̄ − ȳ < −(t_{α/2,ν_x}S_x/√n_x + t_{α/2,ν_y}S_y/√n_y)]

= Pr[|x̄ − ȳ| > t_{α/2,ν_x}S_x/√n_x + t_{α/2,ν_y}S_y/√n_y]   (37a)
= Pr[|t_ν| > (t_{α/2,ν_x}S_x/√n_x + t_{α/2,ν_y}S_y/√n_y)/(S_p√(1/n_x + 1/n_y))]

= Pr[F_{1,ν} > (t_{α/2,ν_x}S_x/√n_x + t_{α/2,ν_y}S_y/√n_y)²/(S_p²(1/n_x + 1/n_y))]   (37b)
Without loss of generality, we name the sample with the larger variance as X and let F_0 = S_x²/S_y² ≥ 1. Multiplying the numerator and denominator of the argument on the RHS of Eq. (37b) by n_x·n_y and substituting F_0 = S_x²/S_y² ≥ 1 into (37b) results in

α′ = Pr[F_{1,ν} > ν(t_{α/2,ν_x}√(F_0 n_y) + t_{α/2,ν_y}√n_x)²/((ν_x F_0 + ν_y)(n_x + n_y))]

⇒ α′ = Pr[F_{1,ν} > ν(t_{α/2,ν_x}√(F_0 R_n) + t_{α/2,ν_y})²/((ν_x F_0 + ν_y)(1 + R_n))]   (37c)

= Pr[F_{1,ν} > ν(k·t_{α/2,ν_x} + t_{α/2,ν_y})²/((ν_x F_0 + ν_y)(1 + R_n))]   (37d)
where ν = n_x + n_y − 2, R_n = n_y/n_x, and k = √(R_n F_0) = (S_x√n_y)/(S_y√n_x) is the sample se ratio. Eq. (37c) clearly shows that, besides α, the value of α′ depends only on n_x, n_y and F_0 = S_x²/S_y², and not on the specific values of S_x and S_y
. For the pooled t-test, in the most common case of balanced design (i.e., n_x = n_y = n), Eq. (37c) reduces to

α′ = Pr[F_{1,ν} > F_{α,1,n−1}(1 + √F_0)²/(1 + F_0)]   (37e)
where the pretest statistic F_0 = S_x²/S_y² must range within the interval (F_{0.90,n−1,n−1}, F_{0.10,n−1,n−1}). The random function (1 + √F_0)²/(1 + F_0) inside the argument of the RHS of (37e) attains its maximum at F_0 = 1 and its minimum at F_{0.10,n−1,n−1} or at F_{0.90,n−1,n−1}. As a result, the minimum value of α′ occurs at F_0 = 1 and its maximum occurs at either F_{0.10,n−1,n−1} or F_{0.90,n−1,n−1}. At the same F_0, α′ in (37e) is a monotonically increasing function of n.
Further, Matlab has verified that the limiting value of α′ in Eq. (37e), as n → 7,819,286, lies in the interval [0.005574595835 (at F_0 = F_{0.10}), 0.005574597084 (at F_0 = 1)], both of which are very close to the known-variance case of testing H0: μ_x = μ_y. Eq. (37e) for the Overlap Type I error Pr is different from 1 − Pr(A) atop p. 549 of Payton et al. (2000) because theirs pertains to the general two-independent-sample t-statistic, discussed in the next section, while (37e) pertains to the pooled t-test. Further, it will be shown that the denominator df of the F statistic for their general case will not equal n − 1 as stated by Payton et al. (2000).
The next objective is to show that α′ < α for all n_x, n_y, S_x and S_y for which F_{0.90,n_x−1,n_y−1} < F_0 = S_x²/S_y² < F_{0.10,n_x−1,n_y−1}.
First, comparing inequality (36c) with Eq. (37a), it follows that if

t_{α/2,ν_x}S_x/√n_x + t_{α/2,ν_y}S_y/√n_y > t_{α/2,ν} S_p√(1/n_x + 1/n_y)   (38a)

then the same conclusion as in the case of known and equal variances will be reached, i.e., α′ < α. In the balanced case n_x = n_y = n, ν = 2(n − 1) and S_p√(1/n_x + 1/n_y) = √((S_x² + S_y²)/2)·√(2/n), so inequality (38a) becomes: Is

t_{α/2,n−1}(S_x + S_y) > t_{α/2,2(n−1)}·√(S_x² + S_y²)?   (38b)
Substituting F_0 = S_x²/S_y² into Eq. (38b) results in

t_{α/2,n−1}(√F_0 + 1) > t_{α/2,2(n−1)}·√(F_0 + 1).   (38c)

It is clear that inequality (38c) easily holds because t_{α/2,n−1} > t_{α/2,2(n−1)} for all finite n and (√F_0 + 1) > √(F_0 + 1) for all values of F_0 because F_0 is never negative.
Therefore, inequality (38a) is true for the case of equal sample sizes, but it is not always so in the unequal-sample-sizes case. In the unbalanced case, the difficulty with inequality (38a) occurs when the larger sample (which will be denoted by n_x) also has a much larger variance, for which inequality (38a) will not be true. For example, if n_y = 20, S_y² = 0.30, n_x = 60 and S_x² = 1.8, the LHS of inequality (38a) becomes 0.6029 and its RHS becomes 0.6157, so that the inequality is violated. However, in such a case the value of the F-statistic is F_0 = S_x²/S_y² = 6, whose P-value for pretesting H0: σ_x = σ_y equals 0.00007736, i.e., this last hypothesis is easily rejected so that pooling is disallowed. Again, to be on the conservative side, we allow pooling iff the P-value of the variance pretest exceeds 20%. Otherwise, the two-independent-sample t-statistic will be used for testing H0: μ_x = μ_y. This is consistent with Devore's (2004, p. 377) assertion of "using the two-sample t procedure unless there is compelling evidence for doing otherwise, particularly when the two sample sizes are different". Further, unlike the case of balanced design, when n_x > n_y the value of α′ is an increasing (but not monotonically) function of F_0, while when n_x < n_y, the value of α′ is almost always a decreasing function of F_0. Thus, for fixed n_x > n_y, the maximum occurs at F_{0.10,n_x−1,n_y−1}, and when n_x < n_y the maximum occurs at F_{0.90,n_x−1,n_y−1}. As n_x and n_y both increase at the same F_0,
so does the value of α′ in (37c). When one sample size is twice the other, the limiting value of α′ (in terms of n_x and n_y) is 0.006286690. When one sample size is three times the other, the limiting value of α′ is 0.00733793. When one sample size is four times the other, the limiting value of α′ is 0.008390775. When one sample size is 5 times the other, the limiting value of α′ is 0.0093831123. When one sample size is 10 times the other, the limiting value of α′ is 0.01336332. Finally, in the limit as R_n = n_y/n_x → ∞ or 0, the Overlap Type I Pr approaches that of an exact α-level test. Table 20 gives the exact α′ from Eq. (37c) for different n_x and n_y combinations. The values of F_0 in Table 20 are restricted such that the P-value of the pretest H0: σ_x = σ_y exceeds 20%.
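The pooled-case Overlap Type I error of Eq. (37c) is easy to tabulate. A sketch (scipy is assumed; the dissertation's values were computed in Matlab) checked against the n_x = 20, n_y = 40, F_0 = 0.8 entry of Table 20:

```python
from scipy import stats

def alpha_prime_pooled(alpha: float, n_x: int, n_y: int, F0: float) -> float:
    """Overlap Type I error Pr for the pooled t-test, Eq. (37c)."""
    nu_x, nu_y, nu = n_x - 1, n_y - 1, n_x + n_y - 2
    t_x = stats.t.ppf(1 - alpha / 2, nu_x)        # t_{alpha/2, nu_x}
    t_y = stats.t.ppf(1 - alpha / 2, nu_y)        # t_{alpha/2, nu_y}
    arg = (nu * (t_x * (F0 * n_y) ** 0.5 + t_y * n_x ** 0.5) ** 2
           / ((nu_x * F0 + nu_y) * (n_x + n_y)))
    return stats.f.sf(arg, 1, nu)

print(round(alpha_prime_pooled(0.05, 20, 40, 0.8), 7))   # Table 20 lists 0.0071355
```

Looping this function over the (n_x, n_y, F_0) grid regenerates Table 20, and every value stays below the nominal α = 0.05.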
6.2 The Case of H0: σ_x = σ_y Rejected, Leading to the Two-Independent-Sample t-Test
Assuming that X ~ N(μ_x, σ_x²) and Y ~ N(μ_y, σ_y²), then X̄ − Ȳ is also N(μ_x − μ_y, σ_x²/n_x + σ_y²/n_y), but now the null hypothesis H0: σ_x = σ_y is rejected at the 20% level, leading to the assumption that the F-statistic F_0 = S_x²/S_y² > 2 for all sample sizes 16 ≤ n_x & n_y. Note that for larger sample sizes such as n_x & n_y = 41, F_0 can be as small as 1.510 and H0: σ_x = σ_y can still be rejected at the 20% level because F_{0.10,40,40} = 1.5056, while for n_x & n_y = 11, an F_0 as large as 2.323 is needed because F_{0.10,10,10} = 2.3226. Note that an F_0 = 2 is significant at the 20% level once n_x & n_y ≥ 16 because F_{0.10,15,15} = 1.9722. It has been shown in statistical theory that if the assumption σ_x = σ_y is not
Table 20. The Pooled α′ Values for Different n_x, n_y and F_0 Combinations at α = 0.05

n_x | n_y | F_0 | α′ | n_x | n_y | F_0 | α′
20 40 0.8 0.0071355 20 40 1.4 0.0039812
20 60 0.8 0.0087831 20 60 1.4 0.0035984
20 80 0.8 0.0103096 20 80 1.4 0.0035130
30 10 0.8 0.0034279 30 10 1.4 0.0085406
30 20 0.8 0.0047292 30 20 1.4 0.0068050
30 30 0.8 0.0054425 30 30 1.4 0.0055284
40 20 0.8 0.0044541 40 20 1.4 0.0080709
40 40 0.8 0.0054943 40 40 1.4 0.0055823
40 80 0.8 0.0075641 40 80 1.4 0.0042383
40 100 1 0.0063762 40 100 1.4 0.0040584
20 40 1 0.0056135 20 40 1.5 0.0037263
20 60 1 0.0062056 20 60 1.5 0.0032137
20 80 1 0.0068563 20 80 1.5 0.0030428
30 10 1 0.0049815 30 10 1.5 0.0094908
30 20 1 0.0053938 30 20 1.5 0.0071647
30 30 1 0.0053753 30 30 1.5 0.0055980
40 20 1 0.0056135 40 10 1.5 0.0111374
40 40 1 0.0054254 40 20 1.5 0.0086996
40 80 1 0.0059601 40 30 1.5 0.0067838
40 100 1 0.0063762 40 40 1.5 0.0056536
20 40 1.2 0.0046424 1000 2000 F_{0.90,ν_x,ν_y} 0.0067706
20 60 1.2 0.0046283 100000 200000 F_{0.90,ν_x,ν_y} 0.0063436
20 80 1.2 0.0048067 10000000 20000000 F_{0.90,ν_x,ν_y} 0.0063019
30 10 1.2 0.0067025 1000000000 2000000000 F_{0.90,ν_x,ν_y} 0.0062871
30 30 1.2 0.0054202 1000 3000 F_{0.90,ν_x,ν_y} 0.0081935
40 20 1.2 0.0068266 100000 300000 F_{0.90,ν_x,ν_y} 0.0074960
40 40 1.2 0.0054714 10000000 30000000 F_{0.90,ν_x,ν_y} 0.0074280
40 80 1.2 0.0049359 1000000000 3000000000 F_{0.90,ν_x,ν_y} 0.0073386
20 40 1.3 0.0042824 1000 4000 F_{0.90,ν_x,ν_y} 0.0095652
20 60 1.3 0.0040623 100000 400000 F_{0.90,ν_x,ν_y} 0.0086485
20 80 1.3 0.0040901 10000000 40000000 F_{0.90,ν_x,ν_y} 0.0085592
30 10 1.3 0.0076096 1000000000 4000000000 F_{0.90,ν_x,ν_y} 0.0083917
30 30 1.3 0.0054683 1000 5000 F_{0.90,ν_x,ν_y} 0.0108398
40 20 1.3 0.0074460 100000 500000 F_{0.90,ν_x,ν_y} 0.0097352
40 40 1.3 0.0055207 10000000 50000000 F_{0.90,ν_x,ν_y} 0.0096277
40 80 1.3 0.0045561 1000000000 5000000000 F_{0.90,ν_x,ν_y} 0.0093842
tenable, the statistic [(x̄ − ȳ) − (μ_x − μ_y)]/√(S_x²/n_x + S_y²/n_y) has the approximate Student's t-distribution with degrees of freedom

ν = (S_x²/n_x + S_y²/n_y)² / [(S_x²/n_x)²/(n_x − 1) + (S_y²/n_y)²/(n_y − 1)]

= [V(x̄) + V(ȳ)]² / [V(x̄)²/ν_x + V(ȳ)²/ν_y] = ν_x ν_y [V(x̄) + V(ȳ)]² / [ν_y V(x̄)² + ν_x V(ȳ)²]   (39a)

⇒ ν = ν_x ν_y (F_0 R_n + 1)² / [ν_y (F_0 R_n)² + ν_x] = ν_x ν_y (k² + 1)² / (ν_y k⁴ + ν_x)   (39b)
where V(x̄) = S_x²/n_x, R_n = n_y/n_x, k = (S_x/√n_x)/(S_y/√n_y), and F_0 = S_x²/S_y². Eq. (39b) shows that ν depends only on n_x, n_y, and the se ratio √(F_0 R_n). The formulas for degrees of freedom in (39a & b) rarely lead to an integer, and ν is generally rounded down to make the test of H0: μ_x − μ_y = 0 conservative, i.e., rounding down ν increases the P-value of this last test. However, programs like Matlab and Minitab will provide the cdf and percentage points of the t-distribution for non-integer values of ν in Eqs. (39). It has been verified using a spreadsheet that Min(ν_x, ν_y) < ν < ν_x + ν_y is a certainty, and hence this t-test is less powerful than the pooled t-test. In fact, it is easy to prove algebraically that for the case of n_x = n_y = n, the value of ν always exceeds (n − 1) and is always less than 2(n − 1). It can also be verified that the maximum of ν in Eqs. (39) occurs when the larger sample also has the much larger variance, but its value can never exceed the df, n_x + n_y − 2, of the pooled t-test, as illustrated in Table 21.
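The df formula of Eq. (39a) is pure arithmetic and can be checked against Table 21 directly; a minimal sketch (the helper name `welch_df` is ours, not the dissertation's) reproducing the ν_x = 11, ν_y = 1 entry, where Var(x) = 20 and Var(y) = 0.331:

```python
def welch_df(s2_x: float, n_x: int, s2_y: float, n_y: int) -> float:
    """Satterthwaite degrees of freedom, Eq. (39a)."""
    vx, vy = s2_x / n_x, s2_y / n_y                 # V(x-bar) and V(y-bar)
    return (vx + vy) ** 2 / (vx ** 2 / (n_x - 1) + vy ** 2 / (n_y - 1))

nu = welch_df(20.0, 12, 0.331, 2)                   # nu_x = 11, nu_y = 1
print(round(nu, 3))                                 # Table 21 lists 11.992
```

Note that ν stays inside (min(ν_x, ν_y), ν_x + ν_y) = (1, 12) here, and for equal sample sizes and equal variances it equals the pooled df 2(n − 1) exactly.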
When H0: σ_x = σ_y is rejected at the 20% level (i.e., the P-value of the test < 0.20), the approximate (1 − α)×100% CI for μ_x − μ_y is given by

x̄ − ȳ − t_{α/2,ν}√(S_x²/n_x + S_y²/n_y) ≤ μ_x − μ_y ≤ x̄ − ȳ + t_{α/2,ν}√(S_x²/n_x + S_y²/n_y)   (40a)

resulting in the approximate CIL of 2t_{α/2,ν}√(S_x²/n_x + S_y²/n_y), and H0: μ_x − μ_y = 0 can
Table 21. Verifying the Inequality min(ν_x, ν_y) < ν < ν_x + ν_y for Different ν_x and ν_y Combinations

ν_x | ν_y | Var(x) | Var(y) (at F_0 = F_{0.90,ν_x,ν_y}) | ν | ν_x | ν_y | Var(x) | Var(y) (at F_0 = F_{0.90,ν_x,ν_y}) | ν
1 11 20 6.201 1.106 11 1 20 0.331 11.992
6 16 20 9.181 8.371 16 6 20 6.987 18.725
11 21 20 10.550 17.483 21 11 20 9.449 30.068
16 26 20 11.451 27.422 26 16 20 10.779 40.883
26 31 20 12.364 49.013 31 26 20 12.174 56.686
36 41 20 13.220 69.455 41 36 20 13.110 76.486
46 51 20 13.830 89.823 51 46 20 13.758 96.317
66 71 20 14.664 130.370 71 66 20 14.626 136.052
86 91 20 15.222 170.752 91 86 20 15.198 175.855
106 111 20 15.631 211.034 111 106 20 15.615 215.702
126 131 20 15.947 251.252 131 126 20 15.935 255.579
176 181 20 16.505 351.633 181 176 20 16.498 355.351
226 231 20 16.878 451.882 231 226 20 16.874 455.192
326 331 20 17.361 652.198 331 326 20 17.359 654.979
426 431 20 17.669 852.395 431 426 20 17.668 854.840
526 531 20 17.889 1052.532 531 526 20 17.888 1054.739
1200 3000 20 18.811 2151.446 3000 1200 20 18.788 2275.082
1500 3500 20 18.921 2768.370 3500 1500 20 18.903 2912.635
2000 4000 20 19.037 3913.944 4000 2000 20 19.026 4090.453
2500 4500 20 19.120 5067.119 4500 2500 20 19.112 5263.906
2800 5000 20 19.166 5693.779 5000 2800 20 19.159 5901.618
be rejected at LOS = α if |x̄ − ȳ| > t_{α/2,ν}·√(S_x²/n_x + S_y²/n_y), i.e.,

α ≅ Pr(|x̄ − ȳ| > t_{α/2,ν}·√(S_x²/n_x + S_y²/n_y) | μ_x − μ_y = δ = 0)

≅ Pr(F_{1,ν} > t²_{α/2,ν} | δ = 0) = Pr(F_{1,ν} > F_{α,1,ν} | δ = 0)   (40b)
As in the case of the pooled t-test, for the individual two t-CIs, the rejection requirement is either L(μ_x) > U(μ_y) or L(μ_y) > U(μ_x), leading to the same condition as before in Eq. (37a). That is,

α′ = Pr(reject H0 | δ = 0) = Pr[|x̄ − ȳ| > t_{α/2,ν_x}S_x/√n_x + t_{α/2,ν_y}S_y/√n_y]   (37a)
It is impossible to studentize the argument of Eq. (37a) because when σ_x ≠ σ_y, the expression for t_ν = Z/√(χ²_ν/ν) will show that (x̄ − ȳ)/√(S_x²/n_x + S_y²/n_y) is not central-t distributed with n_x + n_y − 2 df. In other words, when σ_x ≠ σ_y there does not exist a central χ²_ν rv that reduces t_ν = Z/√(χ²_ν/ν) to the form (x̄ − ȳ)/√(S_x²/n_x + S_y²/n_y). However, (x̄ − ȳ)/√(S_x²/n_x + S_y²/n_y) is approximately t-distributed with the df ν given in Eqs. (39).
Therefore, Eq. (37a) can approximately be written as

α′ ≅ Pr{|t_ν| > (t_{α/2,ν_x}S_x/√n_x + t_{α/2,ν_y}S_y/√n_y)/√(S_x²/n_x + S_y²/n_y)}

= Pr{F_{1,ν} > (t_{α/2,ν_x}S_x/√n_x + t_{α/2,ν_y}S_y/√n_y)²/(S_x²/n_x + S_y²/n_y)}

or α′ ≅ Pr{F_{1,ν} > (t_{α/2,ν_x}S_x√n_y + t_{α/2,ν_y}S_y√n_x)²/(n_y S_x² + n_x S_y²)}

= Pr[F_{1,ν} > (k·t_{α/2,ν_x} + t_{α/2,ν_y})²/(1 + k²)]   (41a)
Let R_n = n_y/n_x (or n_y = R_n n_x) and F_0 = S_x²/S_y². Substituting R_n and F_0 into Eq. (41a) results in

α′ ≅ Pr[F_{1,ν} > (t_{α/2,ν_x}√(R_n F_0) + t_{α/2,ν_y})²/(R_n F_0 + 1)]   (41b)
α′ can also be represented as α′ ≅ Pr[F_{1,ν} > (k·t_{α/2,ν_x} + t_{α/2,ν_y})²/(k² + 1)], where k = (S_x/√n_x)/(S_y/√n_y). When n_x = n_y = n, R_n = 1, and the above formula for α′ reduces to

α′ ≅ Pr[F_{1,ν} > F_{α,1,n−1}(√F_0 + 1)²/(F_0 + 1)]   (41c)
which is similar to (37e), but ν is given by Eqs. (39) instead of by n_x + n_y − 2 as in the case of the pooled t-test, or instead of by n − 1 as stated by Payton et al. (2000). For equal sample sizes, Eq. (39b) simplifies to ν = (n − 1)(1 + F_0)²/(1 + F_0²), and not (n − 1) as reported by Payton et al. (2000). Note that this last formula for ν reduces to 2(n − 1) at F_0 = 1, which is the df of the pooled t-test, as it should be because the unlikely realization F_0 = 1 is in perfect agreement with H0: σ_x² = σ_y². Further, Eq. (41b) shows that α′ does not depend on the specific values of S_x² and S_y² but only on their ratio through k = √(F_0 R_n). For Payton et al.'s (2000) reported example of n_1 = n_2 = 10, S_1 = 0.80 and S_2 = 1.60, Eq. (41c) shows that at n = 10 and F_0 = 0.25, ν = 13.2353, resulting in the value of α′ ≅ 0.00940573, which is different from the 0.0149 reported by Payton et al. (2000, p. 549). The df used by them was 9, which caused the % relative error in their reported α′ = 0.0149 to equal 54.414%.
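The re-computation of Payton et al.'s example takes only a few lines; a sketch (scipy is assumed, and it accepts the non-integer df of Eq. (39b) directly) of Eqs. (39b) and (41c) at n = 10, F_0 = 0.25:

```python
from scipy import stats

def welch_overlap_alpha(alpha: float, n: int, F0: float):
    """Eq. (41c) with the equal-n df of Eq. (39b); returns (nu, alpha')."""
    nu = (n - 1) * (1 + F0) ** 2 / (1 + F0 ** 2)                  # Eq. (39b), n_x = n_y = n
    arg = stats.f.ppf(1 - alpha, 1, n - 1) * (1 + F0 ** 0.5) ** 2 / (1 + F0)
    return nu, stats.f.sf(arg, 1, nu)

nu, a = welch_overlap_alpha(0.05, 10, 0.25)
print(round(nu, 4), round(a, 8))   # text: nu = 13.2353 and alpha' ≈ 0.00940573
```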
Payton et al. (2000, p. 549) also make the following statement about 1/3 of the way from the top of their p. 549: "If the samples are collected from the same normal population, the quantity n(Ȳ_1 − Ȳ_2)²/(S_1² + S_2²) is F-distributed with 1 and n − 1 degrees of freedom." The statement should go as follows: If the samples are collected from two normal populations with identical means and variances, [our Eq. (37e) shows that] the statistic n(Ȳ_1 − Ȳ_2)²/(S_1² + S_2²) is F-distributed with 1 and 2(n − 1) degrees of freedom (not n − 1 as stated). Payton et al. (2000, p. 550) also make the following statement in the second paragraph leading to their Eq. [1]: "If the researcher is willing to assume that S_1 and S_2 are estimating the same parameter value (i.e., homogeneous variances), then the above equation simplifies to 0.95 = Pr[F_{1,9} < 2F_{α,1,9}]. [1]"
Their above quote should be stated as follows: In the unlikely event that F_0 = S_1²/S_2² is realized to equal 1, then the above equation simplifies to 0.95 = Pr(F_{1,13.2353} < 2F_{α,1,9}). Note that they are using (1 − α) also as the Overlap confidence level, and, secondly, just because two independent population variances are equal, it does not imply that the corresponding point estimates S_1² and S_2² will be the same. Further, Payton et al. limit their sample means, Ȳ_1 and Ȳ_2, to originating from the same normal population on their p. 548. Our work herein is not limited to the same normal population but applies to any two distinct normal populations.
We now proceed to obtain the LUB (least upper bound) and the GLB (greatest lower bound) for α′ in Eq. (41b). The LUB occurs when the argument on the RHS of the Pr in Eq. (41b) is smallest. To this end, let ν_2 = Max(ν_x, ν_y), and thus

α′ ≤ Pr{F_{1,ν} > [t_{α/2,ν_2}√(R_n F_0) + t_{α/2,ν_2}]²/(R_n F_0 + 1)}

⇒ LUB(α′) = Pr{F_{1,ν} > F_{α,1,ν_2}(√(R_n F_0) + 1)²/(R_n F_0 + 1)}.
Conversely, the greatest lower bound occurs when the argument on the RHS of (41b) is largest. Letting ν_1 = Min(ν_x, ν_y) in (41b) results in

α′ ≥ Pr{F_{1,ν} > [t_{α/2,ν_1}√(R_n F_0) + t_{α/2,ν_1}]²/(R_n F_0 + 1)}

⇒ GLB(α′) = Pr{F_{1,ν} > F_{α,1,ν_1}(√(R_n F_0) + 1)²/(R_n F_0 + 1)}, or

Pr{F_{1,ν} > F_{α,1,ν_1}(√(R_n F_0) + 1)²/(R_n F_0 + 1)} ≤ α′ ≤ Pr{F_{1,ν} > F_{α,1,ν_2}(√(R_n F_0) + 1)²/(R_n F_0 + 1)},

while the expression for the exact Type I Pr from (40b) is α ≅ Pr(F_{1,ν} > F_{α,1,ν} | μ_x − μ_y = 0).
The function (√(R_nF₀) + 1)² / (1 + R_nF₀) clearly always exceeds 1 because √(R_nF₀) = √[(n_y/n_x)(S_x²/S_y²)] = √[V̂(x̄)/V̂(ȳ)], which is also a se ratio, can never equal zero, and the function is bounded by 1 < (√(R_nF₀) + 1)² / (1 + R_nF₀) ≤ 2, the maximum occurring when R_nF₀ = 1. Because we are seeking to establish that α′ in (41b) is always smaller than α = Pr(F_{1,ν} > F_{α,1,ν} | μ_x − μ_y = 0), we consider the very worst-case scenario where the smallest sample has the largest variance. For example, at ν_x = 1 and ν_y = 11 (so that R_n = n_y/n_x = 12/2 = 6), S_x² = 8.00, S_y² = 2.4805, F₀ = F_{0.10,1,11} = 3.2252, t_{0.025,1} = 12.7062, and t_{0.025,11} = 2.2010, the value of ν is 1.105755. Substituting these into (41b) results in α′ = Pr{F_{1,1.105755} > 165.842616} = 0.0386 < 0.05. It has also been verified that S_y² can be as small as S_x²/F_{0.0001,ν_x,ν_y} and still α′ < α. Note that if F_{0.90,n_x−1,n_y−1} < F₀ = S_x²/S_y² < F_{0.10,n_x−1,n_y−1}, then we are recommending the use of the pooled t-test, so that the value of F₀ must lie outside the interval (F_{0.90,n_x−1,n_y−1}, F_{0.10,n_x−1,n_y−1}) in order to apply the two-independent-sample t-test. This is consistent with statistical literature (see J. L. Devore, p. 377), which suggests not to use the pooled t-test unless there is compelling evidence in favor of H₀: σ_x = σ_y.
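The bound 1 < (√u + 1)²/(1 + u) ≤ 2 on the multiplier of the F quantile, with u = R_nF₀, is easy to confirm numerically. The sketch below (Python rather than the Matlab of Appendix B; the function name is ours) evaluates the multiplier over a wide grid of u values:

```python
import math

def multiplier(u):
    """The factor (sqrt(u) + 1)^2 / (1 + u) applied to the F quantile,
    where u = Rn * F0 is the squared se ratio (u > 0)."""
    return (math.sqrt(u) + 1.0) ** 2 / (1.0 + u)

# The multiplier peaks at 2 when u = 1 and stays in (1, 2] for all u > 0;
# u = 6 * 3.2252 corresponds to the worst-case example above.
assert multiplier(1.0) == 2.0
for u in [1e-6, 0.01, 0.5, 1.0, 2.0, 6 * 3.2252, 1e6]:
    assert 1.0 < multiplier(u) <= 2.0
```

The assertions pass because (√u + 1)² − (1 + u) = 2√u > 0 and 2(1 + u) − (√u + 1)² = (√u − 1)² ≥ 0.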
Keeping F₀ ≥ F_{0.10,n_x−1,n_y−1} fixed, α′ in Eq. (41a) attains its minimum at R_n = n_y/n_x = 1, and the limit of α′ as R_n → 0 or ∞ is α; similarly, if F₀ ≤ F_{0.90,n_x−1,n_y−1} is kept fixed, α′ is minimum at R_n = 1 and its limit approaches α as R_n → 0 or ∞. As F₀ → ∞, α′ approaches the value of α, i.e., the Overlap converges to an α-level test; however, the farther R_n is above 1, the faster is the limiting approach of α′ to α as F₀ → ∞. As F₀ → 0, α′ also approaches the value of α, and the farther R_n is below 1, the faster is the limiting approach of α′ to α as F₀ → 0.
For example, if n_x = n_y = 50 (i.e., R_n = 1) and F₀ = 10⁶, then α′ = 0.04978024 (nearly 5%). If n_x = 50, n_y = 100, R_n = 2, F₀ = 10⁶, then α′ = 0.04984645882. However, if n_x = 50, n_y = 25, R_n = 0.5, F₀ = 10⁶, then α′ = 0.0496811387; but if F₀ = 10⁻⁶, then α′ = 0.0498546652.
Further, if R_n = 1, then the limiting value of α′ as n_x → ∞ is equal to 0.0055751 as long as F₀ is increasing toward F_{0.10,ν_x,ν_y} (or decreasing toward F_{0.90,ν_x,ν_y}).

For the paired t-test, H₀: μ_x = μ_y is rejected at the α level iff |x̄ − ȳ| ≥ t_{α/2,n−1}·S_d/√n, i.e., α = Pr(|x̄ − ȳ|/(S_d/√n) > t_{α/2,n−1}) = Pr(|t_{n−1}| ≥ t_{α/2,n−1}). For the two separate CIs, the rejection requirement is either L(μ_x) > U(μ_y) or L(μ_y) > U(μ_x), leading to the same condition as in Eq. (37a).
α′ = Pr[|x̄ − ȳ| > t_{α/2,n−1}S_x/√n + t_{α/2,n−1}S_y/√n]

= Pr[|d̄|√n > t_{α/2,n−1}(S_x + S_y)] = Pr[|d̄|√n/S_d > t_{α/2,n−1}(S_x + S_y)/S_d]   (44a)

Because the null SMD of d̄√n/S_d is the Student's t with (n − 1) df, (44a) can be written as

α′ = Pr[|t_{n−1}| > t_{α/2,n−1}(S_x + S_y)/S_d] = Pr[F_{1,n−1} > F_{α,1,n−1}(S_x + S_y)²/S_d²]

= Pr[F_{1,n−1} > F_{α,1,n−1}(√F₀ + 1)²/(1 + F₀ − 2r√F₀)]   (44b)
We now proceed to show that (S_x + S_y)² ≥ S_d² = S_x² + S_y² − 2Ŝ_xy for all values of S_x and S_y. There are two possibilities: (1) r > 0 ⇒ Ŝ_xy > 0, in which case it is obvious that (S_x + S_y)² > S_d². (2) r < 0 ⇒ Ŝ_xy < 0, and S_d² attains its maximum when r = −1 ⇒ Max(S_d²) = S_x² + S_y² + 2|Ŝ_xy|. In this worst-scenario case, it is clear that (S_x + S_y)² = S_x² + S_y² + 2S_xS_y ≥ S_x² + S_y² + 2|Ŝ_xy| because |Ŝ_xy| ≤ S_xS_y. Thus, as before, α′ < α. How much smaller α′ is than α depends both on the sign and the magnitude of the sample correlation coefficient r. The glb occurs when (S_x + S_y)² is largest relative to S_d², i.e., when S_d² attains its minimum value. This minimum occurs when X and Y are highly positively correlated, and in the limit Ŝ_xy → S_xS_y. From Eq. (44b) we obtain
α′ = Pr[F_{1,n−1} > F_{α,1,n−1}(S_x + S_y)²/S_d²] ≥ Pr[F_{1,n−1} > F_{α,1,n−1}(S_x + S_y)²/(S_x² + S_y² − 2S_xS_y)]

⇒ α′ > Pr[F_{1,n−1} > F_{α,1,n−1}(S_x + S_y)²/(S_x − S_y)²]

⇒ GLB(α′) = Pr[F_{1,n−1} > F_{α,1,n−1}(S_x + S_y)²/(S_x − S_y)²]

⇒ GLB(α′) = Pr[F_{1,n−1} > F_{α,1,n−1}(√F₀ + 1)²/(√F₀ − 1)²]   (44c)
There seems to exist a problem in the inequality (44c), i.e., when S_x ≅ S_y, the expression on the RHS of the Pr is not defined. However, this occurs iff the correlation coefficient r = 1, which occurs only if the values of the rv Y are precisely a linear function of X, i.e., Y must equal aX + b, where the constant a > 0 and b can be any real number. Because a > 0, the variance of Y cannot equal that of X. Secondly, the largest value of α′ occurs when F_{α,1,n−1}(S_x + S_y)²/S_d² attains its minimum value, which in turn occurs when S_d attains its maximum value of √(S_x² + S_y² + 2|Ŝ_xy|). Thus,
α′ = Pr[F_{1,n−1} > F_{α,1,n−1}(S_x + S_y)²/S_d²] ≥ Pr[F_{1,n−1} > F_{α,1,n−1}(S_x + S_y)²/(S_x² + S_y² + 2|Ŝ_xy|)]

The quantity S_x² + S_y² + 2|Ŝ_xy| attains its maximum when the sample correlation coefficient r = −1, in which case |Ŝ_xy| = S_xS_y, so the maximum value of α′ reduces to

α′ ≤ Pr[F_{1,n−1} > F_{α,1,n−1}(S_x + S_y)²/(S_x² + S_y² + 2S_xS_y)] = Pr[F_{1,n−1} > F_{α,1,n−1}] = α.

Hence, we have the result

Pr[F_{1,n−1} > F_{α,1,n−1}(S_x + S_y)²/(S_x − S_y)²] ≤ α′ ≤ Pr[|T_{n−1}| > t_{α/2,n−1}] = α
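The key inequality behind this sandwich result, (S_x + S_y)² ≥ S_d² = S_x² + S_y² − 2rS_xS_y for every −1 ≤ r ≤ 1, can be spot-checked numerically; the following minimal sketch (not part of the dissertation's Appendix B code) samples random standard deviations and correlations:

```python
import random

random.seed(1)
for _ in range(10_000):
    sx = random.uniform(0.01, 100.0)   # sample standard deviations
    sy = random.uniform(0.01, 100.0)
    r = random.uniform(-1.0, 1.0)      # sample correlation coefficient
    sd2 = sx**2 + sy**2 - 2.0 * r * sx * sy   # paired-difference variance
    assert (sx + sy) ** 2 >= sd2       # equality only at r = -1
```

The check never fails because (S_x + S_y)² − S_d² = 2S_xS_y(1 + r) ≥ 0, with equality only at r = −1, exactly as argued above.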
For any r, the value of α′ can never exceed α. For example, for α = 0.05, F₀ = 1, r = −0.25, the value of α′ ranges in the interval 0.01369406 (at n = 101) ≤ α′ ≤ 0.0321416 (at n = 3). For α = 0.05, F₀ = 1.5, r = −0.25, it ranges in the interval 0.013975172 (at n = 101) ≤ α′ ≤ 0.032328946 (at n = 3); at F₀ = 2.0, 0.014512 (at n = 101) ≤ α′ ≤ 0.032681748 (at n = 3). As F₀ → ∞, α′ → α.
For α = 0.05, F₀ = 1, r = −0.5, the value of α′ ranges in the interval 0.02406825415 (at n = 101) ≤ α′ ≤ 0.03820582 (at n = 3). For α = 0.05, F₀ = 1.5, r = −0.5, it ranges in the interval 0.02430291 (at n = 101) ≤ α′ ≤ 0.03832841 (at n = 3); at F₀ = 2.0, 0.0247473226 (at n = 101) ≤ α′ ≤ 0.038559305 (at n = 3). As F₀ → ∞, α′ → α.

For α = 0.05, F₀ = 1, r = −0.75, the value of α′ ranges in the interval 0.0363984432 (at n = 101) ≤ α′ ≤ 0.0441575 (at n = 3). For α = 0.05, F₀ = 1.5, r = −0.75, it ranges in the interval 0.03653187 (at n = 101) ≤ α′ ≤ 0.04421765 (at n = 3); at F₀ = 2.0, 0.036783673 (at n = 101) ≤ α′ ≤ 0.04433101 (at n = 3). As F₀ → ∞, α′ → α.
Moreover, the effect of negative correlation is to increase α′ toward α as n − 1 goes toward 1. For example, when n − 1 = 1, F₀ = 2 and r = −0.90, then α′ = 0.0487767; at n − 1 = 1, F₀ = 2 and r = −0.95, then α′ = 0.0493921346; while at r = −0.99 and F₀ = 2, α′ = 0.04987903. For fixed F₀ and −1 ≤ r < 0, the limiting behavior of α′ as n → ∞ is difficult to investigate because as n → ∞, then perforce F₀ = S_x²/S_y² → σ_x²/σ_y² (an unknown parameter), and r → ρ_xy (the population correlation coefficient, which is another unknown parameter). Most importantly, Matlab loses accuracy in inverting F_{1,n−1} once n − 1 far exceeds 1,000,000. For example, Matlab gave α′(at n = 1,000,000, F₀ = 1, r = −0.50) = 0.0236254 and α′(at n = 10,000,000, F₀ = 1, r = −0.50) = 0.0237475, but α′(at n = 100,000,000, F₀ = 1, r = −0.50) = 0.0938949. We are fairly certain that α′(as n → ∞, F₀ = 1, r = −0.50) = 0.0236254 is the correct answer and that the last value is inaccurate, because Matlab gave finv(0.95,1,100000000) = 2.10472249984741 instead of the correct value of 3.841458914. For negative correlation we have verified that as F₀ and n → ∞, α′ → α.
For α = 0.05, F₀ = 1.00, and r = 0.25, the value of α′ ranges in the interval 0.0016243942 (at n = 101) ≤ α′ ≤ 0.01966083 (at n = 3). For α = 0.05, F₀ = 1.5, and r = 0.25, the value of α′ ranges in the interval 0.0017703 (at n = 101) ≤ α′ ≤ 0.0199853 (at n = 3), while for α = 0.05, F₀ = 2.00, and r = 0.25, the value of α′ ranges in the interval 0.00206727 (at n = 101) ≤ α′ ≤ 0.02059583 (at n = 3).

For α = 0.05, F₀ = 1, r = 0.50, the value of α′ lies in the interval 0.000136523 (at n = 101) ≤ α′ ≤ 0.0132366264 (at n = 3); at F₀ = 1.5, α′ lies in the interval 0.000169103 (at n = 101) ≤ α′ ≤ 0.013633621 (at n = 3); while at F₀ = 2.00, α′ lies in the interval 0.0002456622 (at n = 101) ≤ α′ ≤ 0.01438047707496 (at n = 3).

For α = 0.05, F₀ = 1, r = 0.75, the value of α′ lies in the interval 0.0000001795166 (at n = 101) ≤ α′ ≤ 0.00668445233 (at n = 3); at F₀ = 1.5, α′ lies in the interval 0.00000041140283 (at n = 101) ≤ α′ ≤ 0.00715685 (at n = 3); while at F₀ = 2.00, α′ lies in the interval 0.000001550553 (at n = 101) ≤ α′ ≤ 0.0080453 (at n = 3).
The impact of positive correlation is to reduce α′ toward zero as n → ∞. For example, at α = 0.05, F₀ = 1.5, r = 0.75 and n = 500, α′ reduces to 0.00000012177 from its value of 0.00000041140283 at n = 101; and at n = 101, r = 0.90, α′ reduces to 1.25244259407×10⁻¹². Because we have coded Matlab functions (see Appendix B) to compute the α′ values for all three above cases (the pooled t-test, the two-independent-sample t-test, and the paired t-test), no extra tables are provided.
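As an illustration of what such a function computes, the α′ value quoted above for α = 0.05, F₀ = 1, r = −0.25 and n = 3 can be reproduced without any statistical library, because for ν = 2 the t tail has the closed form Pr(T₂ > t) = ½[1 − t/√(2 + t²)]. The Python sketch below (an assumed translation of the Matlab logic; t_{0.025,2} = 4.302653 is the standard table value) evaluates Eq. (44b):

```python
import math

T_025_2 = 4.302653            # t_{0.025,2}, standard table value (assumed constant)
F_05_1_2 = T_025_2 ** 2       # F_{0.05,1,2} = t_{0.025,2}^2

def alpha_prime_n3(f0, r):
    """Eq. (44b) at n = 3:
    alpha' = Pr{F_{1,2} > F_{0.05,1,2} * (sqrt(F0)+1)^2 / (1 + F0 - 2 r sqrt(F0))},
    evaluated with the exact t_2 tail Pr(|T_2| > t) = 1 - t / sqrt(2 + t^2)."""
    arg = F_05_1_2 * (math.sqrt(f0) + 1.0) ** 2 / (1.0 + f0 - 2.0 * r * math.sqrt(f0))
    t = math.sqrt(arg)
    return 1.0 - t / math.sqrt(2.0 + t * t)

print(alpha_prime_n3(1.0, -0.25))   # about 0.03214, matching 0.0321416 above
```

At r = −1 the same function returns exactly 0.05, confirming that the Overlap attains the full α level in that worst case.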
7.0 The Percent Overlap that Leads to the Rejection of H₀: μ_x = μ_y

7.1 The Case of Unknown σ_x = σ_y = σ

Throughout this section, it is understood that a pretest on H₀: σ_x = σ_y = σ has yielded a P-value > 0.20, so that the null hypothesis H₀: σ_x = σ_y = σ is tenable, leading to a pooled t-test.
As before, let ω represent the amount of overlap length between the two individual CIs on the process means. Then ω will be 0 iff either L(μ_x) > U(μ_y) or L(μ_y) > U(μ_x), in which case H₀: μ_x = μ_y is rejected at the LOS < α. Thus, the overlap amount ω is larger than 0 when U(μ_x) > U(μ_y) > L(μ_x) or U(μ_y) > U(μ_x) > L(μ_y). In these two cases, both U(μ_x) > U(μ_y) > L(μ_x) and U(μ_y) > U(μ_x) > L(μ_y) will lead to the same result. Therefore, only U(μ_x) > U(μ_y) > L(μ_x) is discussed here, so that we are making the assumption that x̄ − ȳ ≥ 0.

ω = U(μ_y) − L(μ_x) = (ȳ + t_{α/2,ν_y}S_y/√n_y) − (x̄ − t_{α/2,ν_x}S_x/√n_x)

= (t_{α/2,ν_x}S_x/√n_x + t_{α/2,ν_y}S_y/√n_y) − (x̄ − ȳ)   (45a)
Further, the span of the two individual CIs is

U(μ_x) − L(μ_y) = (x̄ + t_{α/2,ν_x}S_x/√n_x) − (ȳ − t_{α/2,ν_y}S_y/√n_y)

= (t_{α/2,ν_x}S_x/√n_x + t_{α/2,ν_y}S_y/√n_y) + (x̄ − ȳ)   (45b)

From equations (45a & b) the % overlap is given by

ω = [(t_{α/2,ν_x}S_x/√n_x + t_{α/2,ν_y}S_y/√n_y) − (x̄ − ȳ)] / [(t_{α/2,ν_x}S_x/√n_x + t_{α/2,ν_y}S_y/√n_y) + (x̄ − ȳ)] × 100%   (45c)
As x̄ − ȳ increases, the P-value of the test decreases (i.e., H₀: μ_x = μ_y must be rejected more strongly) and ω in Eq. (45c) decreases. Because H₀: μ_x = μ_y must be rejected at the α level if x̄ − ȳ ≥ t_{α/2,ν}·S_p√(1/n_x + 1/n_y), where ν = n_x + n_y − 2, then from (45c) H₀ must be barely rejected at the α×100% level or less iff

ω ≤ [(t_{α/2,ν_x}S_x/√n_x + t_{α/2,ν_y}S_y/√n_y) − t_{α/2,ν}S_p√(1/n_x + 1/n_y)] / [(t_{α/2,ν_x}S_x/√n_x + t_{α/2,ν_y}S_y/√n_y) + t_{α/2,ν}S_p√(1/n_x + 1/n_y)] × 100%   (46a)
Putting R_n = n_y/n_x into Eq. (46a) and multiplying the numerator and denominator by √n_y leads to the result

ω ≤ [(t_{α/2,ν_x}S_x√R_n + t_{α/2,ν_y}S_y) − t_{α/2,ν}S_p√(1 + R_n)] / [(t_{α/2,ν_x}S_x√R_n + t_{α/2,ν_y}S_y) + t_{α/2,ν}S_p√(1 + R_n)] × 100%   (46b)
As defined before, letting F₀ = S_x²/S_y² in Eq. (46b), dividing the numerator and denominator by S_y (noting that S_p/S_y = √((ν_xF₀ + ν_y)/ν)), and recalling ν = n_x + n_y − 2 results in

ω ≤ [√(F₀R_n)·t_{α/2,ν_x} + t_{α/2,ν_y} − t_{α/2,ν}√((1 + R_n)(ν_xF₀ + ν_y)/ν)] / [√(F₀R_n)·t_{α/2,ν_x} + t_{α/2,ν_y} + t_{α/2,ν}√((1 + R_n)(ν_xF₀ + ν_y)/ν)] × 100%   (46c)

where F₀·R_n = (S_x²/S_y²)·(n_y/n_x) = (S_x²/n_x)/(S_y²/n_y) = V̂(x̄)/V̂(ȳ) = (se ratio)². Thus, the percent overlap at which H₀ should be rejected exactly at the α-level is given by
ω_r = [√(F₀R_n)·t_{α/2,ν_x} + t_{α/2,ν_y} − t_{α/2,ν}√((1 + R_n)(ν_xF₀ + ν_y)/ν)] / [√(F₀R_n)·t_{α/2,ν_x} + t_{α/2,ν_y} + t_{α/2,ν}√((1 + R_n)(ν_xF₀ + ν_y)/ν)] × 100%   (46d)

= [k·t_{α/2,ν_x} + t_{α/2,ν_y} − t_{α/2,ν}√((1 + R_n)(ν_xF₀ + ν_y)/ν)] / [k·t_{α/2,ν_x} + t_{α/2,ν_y} + t_{α/2,ν}√((1 + R_n)(ν_xF₀ + ν_y)/ν)] × 100%,   (46e)

where k = √(F₀R_n).
Eq. (46d) shows that the % overlap at which H₀ must be rejected at the α-level depends only on α, n_x, n_y and F₀, and not on the specific values of S_x and S_y. For larger values of n_x and n_y > 30, the dependency on α is negligible because t_{α/2,ν_x}, t_{α/2,ν_y} and t_{α/2,ν} are close in value and are almost equal once n_x and n_y > 60.
For the case of a balanced completely randomized design (i.e., n = n_x = n_y ⇒ R_n = 1), the inequality in (46d) reduces to

ω_r = [t_{α/2,n−1}(1 + √F₀) − t_{α/2,2(n−1)}√(1 + F₀)] / [t_{α/2,n−1}(1 + √F₀) + t_{α/2,2(n−1)}√(1 + F₀)] × 100%   (46f)

We first discuss the limiting property of Eq. (46f). Because this is the case of the pooled t-test, F₀ = (S_x/S_y)² (the ratio of the two sample variances) must lie within the acceptance interval (F_{0.90,n−1,n−1}, F_{0.10,n−1,n−1}); otherwise H₀: σ_x = σ_y must be rejected at the 20% level. Further, the first derivative of ω_r vanishes at F₀ = 1, so that the maximum of ω_r occurs at F₀ = 1; it is equal to 0.613625686 at n = 2, and this maximum approaches (2 − √2)/(2 + √2) = 0.171573 as n → ∞. Note that as n → ∞, the value of F₀, which must lie within F_{0.90,n−1,n−1} ≤ F₀ ≤ F_{0.10,n−1,n−1}, must perforce also go toward 1 because the limiting values of both F_{0.90,n−1,n−1} and F_{0.10,n−1,n−1} are nearly 1. That is, for all F₀ values where H₀: μ_x = μ_y cannot be rejected, the limiting value of ω_r in terms of n cannot be less than 0.171573. At n = 61, ω_r = 0.17603 if F₀ = 1.2, so that the approach of ω_r toward 0.171573 occurs fairly rapidly in terms of n as long as F_{0.90,n−1,n−1} ≤ F₀ ≤ F_{0.10,n−1,n−1}. At n = 31 and F₀ = 1.30, the value of ω_r = 0.1806, so that H₀: μ_x = μ_y must be rejected at the 5% level or less if the % overlap is less than or equal to 18.06%.
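The 18.06% figure can be reproduced directly from the balanced-case formula; the sketch below (Python; the function name is ours) uses the standard table values t_{0.025,30} = 2.042272 and t_{0.025,60} = 2.000298 as assumed constants, so no distribution library is needed:

```python
import math

def pct_overlap_balanced(t_half_n1, t_half_2n2, f0):
    """Balanced-design rejection threshold (as a fraction): the % overlap
    at which H0: mu_x = mu_y is barely rejected in the pooled-t case.
    t_half_n1 = t_{alpha/2, n-1}, t_half_2n2 = t_{alpha/2, 2(n-1)}."""
    a = t_half_n1 * (1.0 + math.sqrt(f0))   # sum of the CI half-width terms
    b = t_half_2n2 * math.sqrt(1.0 + f0)    # pooled-t borderline rejection term
    return (a - b) / (a + b)

# alpha = 0.05, n = 31, F0 = 1.30  ->  about 0.1806 (18.06%)
w = pct_overlap_balanced(2.042272, 2.000298, 1.30)
print(round(w, 4))   # 0.1806
```

With equal t quantiles (the large-n limit) and F₀ = 1, the same function returns (2 − √2)/(2 + √2) = 0.171573, as stated above.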
In the unbalanced case, if R_n = 0.50 or 2, the limiting value of ω_r at F₀ = 1 is equal to (1 + √2 − √3)/(1 + √2 + √3) = 0.164525. Further, as R_n deviates farther from 1, the limiting value of ω_r decreases for a fixed F₀ as long as F_{0.90,n_x−1,n_y−1} ≤ F₀ ≤ F_{0.10,n_x−1,n_y−1}. For example, at R_n = 3 (or 1/3), the limiting value of ω_r is 0.15470; at R_n = 4 (or 0.25), its limiting value is 0.14590; at R_n = 5 (or 0.20), its limiting value is 0.13835; at R_n = 10 (or 0.10), the limiting value of ω_r is 0.11307; while at R_n = 20 (or 0.05) the limiting value of ω_r is equal to 0.088472. Clearly, as R_n deviates farther from 1, the limiting value of ω_r decreases, implying that the Overlap approaches an α-level test. See the illustration in Table 22. Finally, it must be noted that as n_x and n_y become very large, the limit of Eq. (46d) becomes identical to

ω_r = [1 + k − √(1 + k²)] / [1 + k + √(1 + k²)] × 100%

given in Eq. (12e), where k = √(F₀R_n).
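All of the large-sample limits quoted above come from this single function of k = √(F₀R_n); a quick numerical confirmation (Python, stdlib only) is:

```python
import math

def limiting_pct_overlap(f0, rn):
    """Limit of Eq. (46d) as nx, ny -> infinity, i.e., Eq. (12e):
    (1 + k - sqrt(1 + k^2)) / (1 + k + sqrt(1 + k^2)) with k = sqrt(F0 * Rn)."""
    k = math.sqrt(f0 * rn)
    s = math.sqrt(1.0 + k * k)
    return (1.0 + k - s) / (1.0 + k + s)

assert abs(limiting_pct_overlap(1.0, 1.0) - 0.171573) < 1e-5    # balanced case
assert abs(limiting_pct_overlap(1.0, 2.0) - 0.164525) < 1e-5    # Rn = 2
assert abs(limiting_pct_overlap(1.0, 20.0) - 0.088472) < 1e-5   # Rn = 20
```

Because the function depends on F₀ and R_n only through the product k², the same limits apply whether the imbalance comes from the sample sizes or from the variance ratio.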
Next, what should each individual confidence level 1 − γ be so that the two independent CIs lead to an exact α×100%-level test on H₀: μ_x = μ_y? The expressions for the two 1 − γ independent CIs are given by

x̄ − t_{γ/2,ν_x}S_x/√n_x ≤ μ_x ≤ x̄ + t_{γ/2,ν_x}S_x/√n_x   (47a)

ȳ − t_{γ/2,ν_y}S_y/√n_y ≤ μ_y ≤ ȳ + t_{γ/2,ν_y}S_y/√n_y   (47b)
It is clear that H₀: μ_x = μ_y must be rejected at the α-level iff the amount of overlap
Table 22. The Value of ω_r (%) for Different F₀ and R_n Combinations

F₀   ν_x   ν_y   R_n   ω_r        F₀   ν_x   ν_y   R_n   ω_r
0.5 1 1 1 60.90835 0.8 11 5 0.5 24.33944
0.5 11 11 1 19.33138 0.8 21 10 0.5 20.81321
0.5 21 21 1 17.90985 0.8 51 25 0.5 18.92389
0.5 31 31 1 17.42757 0.8 61 30 0.5 18.72412
0.5 41 41 1 17.18504 0.8 1001 500 0.5 17.81175
0.5 51 51 1 17.03911 0.8 11 23 2.0 17.54347
0.5 100 100 1 16.74930 0.8 31 63 2.0 15.87565
0.5 150 150 1 16.64982 0.8 51 103 2.0 15.53555
0.5 200 200 1 16.60028 0.8 1001 2003 2.0 15.04753
0.8 1 1 1 61.31421 1 11 5 0.5 22.76531
0.8 11 11 1 19.95417 1 21 10 0.5 19.37801
0.8 21 21 1 18.53612 1 41 20 0.5 17.85949
0.8 31 31 1 18.05497 1 81 40 0.5 17.14235
0.8 41 41 1 17.81299 1 151 75 0.5 16.81705
0.8 51 51 1 17.66738 1 501 250 0.5 16.56104
0.8 100 100 1 17.37823 1 1001 500 0.5 16.50667
0.8 500 500 1 17.14089 1 10001 5000 0.5 16.45788
0.8 1000 1000 1 17.11144 1 500 1001 2.0 16.50667
1 1 1 1 61.36257 1 2 5 2.0 35.76007
1 10 10 1 20.33789 1 10 21 2.0 19.37801
1 50 50 1 17.75437 1 20 41 2.0 17.85949
1 100 100 1 17.45340 1 50 101 2.0 17.00221
1 10000 10000 1 17.16022 1 100 201 2.0 16.72519
1 500000 500000 1 17.15735 1 10000 20001 2.0 16.45517
1 10000000 10000000 1 17.15729 1 10000000 20000001 2.0 16.45247
1.2 1 1 1 61.33025 1 100 302 3.0 15.77155
1.2 10 10 1 20.28821 1 10000 30002 3.0 15.47304
1.2 20 20 1 18.63655 1 1000000 3000002 3.0 15.47008
1.2 60 60 1 17.60330 1 100 403 4.0 14.91845
1.3 10 10 1 20.23527 1 10000 40003 4.0 14.59306
1.3 20 20 1 18.58326 1 1000000 4000003 4.0 14.58984
1.3 30 30 1 18.05974 1 10000 50004 5.0 13.83815
1.6 5 5 1 23.67919 1 1000000 5000004 5.0 13.83471
1.6 10 10 1 20.01202 1 10000 100009 10.0 11.31132
1.6 13 13 1 19.23368 1 10000000 100000009 10.0 11.30718
between (47a) and (47b) barely becomes zero or less. Without loss of generality, the x sample will be denoted such that x̄ − ȳ ≥ 0. Therefore, we deduce from (47a & b) that

U′(μ_y) − L′(μ_x) = (ȳ + t_{γ/2,ν_y}S_y/√n_y) − (x̄ − t_{γ/2,ν_x}S_x/√n_x)

= t_{γ/2,ν_y}S_y/√n_y + t_{γ/2,ν_x}S_x/√n_x − (x̄ − ȳ)   (48a)
Because H₀: μ_x = μ_y must be rejected at the α-level as soon as the RHS of Eq. (48a) becomes 0 or smaller, we impose the borderline rejection criterion x̄ − ȳ = t_{α/2,ν}·S_p√(1/n_x + 1/n_y) in Eq. (48a). In short, we are rejecting H₀: μ_x = μ_y as soon as the two independent CIs in (47a) and (47b) become disjoint. This leads to rejecting H₀: μ_x = μ_y iff

t_{γ/2,ν_y}S_y/√n_y + t_{γ/2,ν_x}S_x/√n_x − (x̄ − ȳ) ≤ 0.   (48b)
At the borderline value, we set x̄ − ȳ = t_{α/2,ν}·S_p√(1/n_x + 1/n_y) and set the LHS of inequality (48b) to 0 in order to solve for γ:

⇒ t_{γ/2,ν_y}S_y/√n_y + t_{γ/2,ν_x}S_x/√n_x − t_{α/2,ν}·S_p√(1/n_x + 1/n_y) = 0

Multiplying through by √n_y gives t_{γ/2,ν_y}S_y + t_{γ/2,ν_x}S_x√R_n − t_{α/2,ν}·S_p√(1 + R_n) = 0.

Dividing through by S_y (using S_p/S_y = √((F₀ν_x + ν_y)/ν)) gives t_{γ/2,ν_y} + t_{γ/2,ν_x}√(F₀R_n) − t_{α/2,ν}·√((1 + R_n)(F₀ν_x + ν_y)/ν) = 0.

⇒ t_{γ/2,ν_y} + t_{γ/2,ν_x}√(F₀R_n) = t_{α/2,ν}·√((1 + R_n)(F₀ν_x + ν_y)/ν)

or t_{γ/2,ν_y} + k·t_{γ/2,ν_x} = t_{α/2,ν}·√((1 + R_n)(F₀ν_x + ν_y)/ν)   (49a)
22
0
/
x y
FSS= , R
n
= n
y
/n
x
, ? = n
x
+ n
y
?2 and k=
0 n
FR = se ratio of samples. Eq.
(49a) clearly shows that the value of ? depends on the LOS ? of testing H
0
: ?
x
= ?
y
, also
on
22
0
/
x y
FSS= , and the sample sizes n
x
, n
y
. For the case of balanced design (n
x
= n
y
= n),
(49a) reduces to
/2, 1ny
tS
? ?
?
/2, 1nx
tS
? ?
+??
/2,2( 1)n
t
? ?
? 2
p
S = 0
98
?
/2, 1
()
nxy
tSS
? ?
+ ?
/2,2( 1)n
t
? ?
?
22
x y
SS+ = 0
?
/2, 1n
t
? ?
=
/2,2( 1)n
t
? ?
?
22
x y
SS+ / ()
x y
SS+
?
/2, 1n
t
? ?
=
/2,2( 1)n
t
? ?
?
0
1F + /
0
(1)F +
?
,1, 1n
F
? ?
=
,1,2( 1)n
F
? ?
?(
0
1 F+ )/(1+
0
F )
2
(49b)
For example, when α = 0.05 and n_x = n_y = 21, Eq. (49b) gives γ = 0.16807, so that the two independent CIs have to be set at the confidence level 1 − γ = 0.83193 in order for the Overlap to provide an exact 5%-level test. The values of γ range from 0.2020062 at n − 1 = 1 down to 0.16596 at n − 1 = 100. In order to obtain the limiting value of γ, we let n → ∞ in (49b), resulting in Lim t_{γ/2,n−1} (as n → ∞) = 1.96·√(1 + 1)/(1 + 1) = 1.96/√2 = 1.38593 ⇒ Limit of γ (as n → ∞) = 2·Pr(Z ≥ 1.38593) = 0.16578, which is identical to the known-and-equal-variances case from Eq. (13) at K = 1.
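The limiting value 0.16578 can be confirmed with only the standard normal CDF, built here from math.erf; z_{0.025} = 1.959964 is the usual table constant (an assumed input, not from the dissertation's code):

```python
import math

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

z_alpha_half = 1.959964                        # z_{0.025}
z_gamma_half = z_alpha_half / math.sqrt(2.0)   # limit of Eq. (49b) at F0 = 1
gamma_lim = 2.0 * (1.0 - phi(z_gamma_half))
print(round(gamma_lim, 5))   # 0.16578
```

The factor 1/√2 is the F₀ = 1 value of √(F₀ + 1)/(1 + √F₀), so the same two lines give the limiting γ for any admissible F₀.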
7.2 The Case of H₀: σ_x = σ_y Rejected, Leading to the Two-Independent-Sample t-Test

Assuming that X ~ N(μ_x, σ_x²) and Y ~ N(μ_y, σ_y²), then X̄ − Ȳ is N(μ_x − μ_y, σ_x²/n_x + σ_y²/n_y), where now the null hypothesis H₀: σ_x = σ_y is rejected at the 20% level (i.e., the P-value of the pretest is less than 20%), leading to the assumption that the F statistic F₀ = S_x²/S_y² is outside the interval (F_{0.90,n_x−1,n_y−1}, F_{0.10,n_x−1,n_y−1}), where without loss of generality the sample with the larger mean will be called X. It has been shown in statistical theory that if the assumption σ_x = σ_y is not tenable, the statistic [(x̄ − ȳ) − (μ_x − μ_y)]/√(S_x²/n_x + S_y²/n_y) has approximately the Student's t-distribution with degrees of freedom given by Eq. (39).
ν = (S_x²/n_x + S_y²/n_y)² / [ (S_x²/n_x)²/(n_x − 1) + (S_y²/n_y)²/(n_y − 1) ] = [V̂(x̄) + V̂(ȳ)]² / [ (V̂(x̄))²/ν_x + (V̂(ȳ))²/ν_y ] = ν_xν_y[V̂(x̄) + V̂(ȳ)]² / [ ν_y(V̂(x̄))² + ν_x(V̂(ȳ))² ]   (39a)

⇒ ν = ν_xν_y(F₀R_n + 1)² / [ ν_y(F₀R_n)² + ν_x ] = ν_xν_y(1 + k²)² / (ν_yk⁴ + ν_x)   (39b)
As before, the amount of overlap between the two individual CIs is given by

ω = U(μ_y) − L(μ_x) = (ȳ + t_{α/2,ν_y}S_y/√n_y) − (x̄ − t_{α/2,ν_x}S_x/√n_x)

= (t_{α/2,ν_x}S_x/√n_x + t_{α/2,ν_y}S_y/√n_y) − (x̄ − ȳ)   (50a)

Further, the span of the two individual CIs is

U(μ_x) − L(μ_y) = (x̄ + t_{α/2,ν_x}S_x/√n_x) − (ȳ − t_{α/2,ν_y}S_y/√n_y)

= (t_{α/2,ν_x}S_x/√n_x + t_{α/2,ν_y}S_y/√n_y) + (x̄ − ȳ)   (50b)

From equations (50a & b) the % overlap is given by

ω = [(t_{α/2,ν_x}S_x/√n_x + t_{α/2,ν_y}S_y/√n_y) − (x̄ − ȳ)] / [(t_{α/2,ν_x}S_x/√n_x + t_{α/2,ν_y}S_y/√n_y) + (x̄ − ȳ)] × 100%   (50c)
As x̄ − ȳ increases, the P-value of the test decreases (i.e., H₀: μ_x = μ_y must be rejected more strongly) and ω in Eq. (50c) decreases. Because H₀: μ_x = μ_y must be barely rejected at the α level if x̄ − ȳ = t_{α/2,ν}·√(S_x²/n_x + S_y²/n_y), where ν is given in Eq. (39), then from (50c)

ω_r = [(t_{α/2,ν_x}S_x/√n_x + t_{α/2,ν_y}S_y/√n_y) − t_{α/2,ν}√(S_x²/n_x + S_y²/n_y)] / [(t_{α/2,ν_x}S_x/√n_x + t_{α/2,ν_y}S_y/√n_y) + t_{α/2,ν}√(S_x²/n_x + S_y²/n_y)] × 100%   (51a)
In order to simplify (51a), we multiply throughout by √n_y, divide throughout by S_y, and replace n_y/n_x by R_n and S_x²/S_y² by F₀, resulting in

ω_r = [t_{α/2,ν_x}√(R_nF₀) + t_{α/2,ν_y} − t_{α/2,ν}√(R_nF₀ + 1)] / [t_{α/2,ν_x}√(R_nF₀) + t_{α/2,ν_y} + t_{α/2,ν}√(R_nF₀ + 1)] × 100%

= [k·t_{α/2,ν_x} + t_{α/2,ν_y} − t_{α/2,ν}√(1 + k²)] / [k·t_{α/2,ν_x} + t_{α/2,ν_y} + t_{α/2,ν}√(1 + k²)] × 100%   (51b)
where F₀ lies outside the 20% acceptance interval (F_{0.90,n_x−1,n_y−1}, F_{0.10,n_x−1,n_y−1}). The % overlap in Eq. (51b) changes very little as α changes, increasing a bit as α decreases while the other parameters n_x, n_y and F₀ are kept fixed. As F₀ increases, the value of ω_r decreases, such that as F₀ → ∞, ω_r → 0, so that the Overlap becomes an exact α-level test. The limiting (in terms of n_x and n_y) values of ω_r at R_n = 2, 3, 4, 5, 10 and 20 are independent of α (because for large n_x and n_y all three inverse-t functions in (51b) are almost equal) and are almost identical to those of the pooled t-test, namely 0.164509, 0.154679, 0.1458744, 0.138322, 0.11305, and 0.08845, respectively.
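Eq. (39b) makes the Welch degrees of freedom a function of ν_x, ν_y and k² = F₀R_n only; a small Python check (stdlib only; the function name is ours) reproduces, for example, ν = 13.2353 at n_x = n_y = 10 and F₀ = 4:

```python
def welch_df(nu_x, nu_y, f0, rn):
    """Eq. (39b): nu = nu_x*nu_y*(1 + k^2)^2 / (nu_y*k^4 + nu_x), k^2 = F0*Rn."""
    k2 = f0 * rn
    return nu_x * nu_y * (1.0 + k2) ** 2 / (nu_y * k2 * k2 + nu_x)

# nx = ny = 10 -> nu_x = nu_y = 9, Rn = 1; F0 = 4 gives nu = 13.2353
print(round(welch_df(9, 9, 4.0, 1.0), 4))   # 13.2353
```

Note that at F₀R_n = 1 the formula collapses to ν = ν_x + ν_y (e.g., 18 for n_x = n_y = 10), the pooled-t degrees of freedom.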
When the design is balanced (n_x = n_y = n), the % overlap in Eq. (51b) that still leads to the rejection of H₀: μ_x = μ_y at the α-level reduces to

ω_r = [t_{α/2,n−1}(1 + √F₀) − t_{α/2,ν}√(1 + F₀)] / [t_{α/2,n−1}(1 + √F₀) + t_{α/2,ν}√(1 + F₀)] × 100%   (51c)

where in the balanced case ν = (n − 1)(S_x² + S_y²)²/(S_x⁴ + S_y⁴) = (n − 1)(F₀ + 1)²/(F₀² + 1). If the % overlap exceeds Eq. (51c), then H₀: μ_x = μ_y can no longer be rejected at the α×100% level of significance. For values of F₀ outside the range (F_{0.90,n−1,n−1}, F_{0.10,n−1,n−1}), the limiting value of ω_r (at any α) as F₀ → 1 from Eq. (51c) is, as before, equal to (2 − √2)/(2 + √2) = 0.171573. Again, as F₀ → ∞, ω_r → 0, which is consistent with the results in Chapter 3 for the known-but-unequal-variances case as the SE ratio k → ∞.
Now, what should each individual confidence level 1 − γ be so that the two independent CIs lead to an exact α×100%-level test on H₀: μ_x = μ_y? As before, the expressions for the two 1 − γ independent CIs are given by

x̄ − t_{γ/2,ν_x}S_x/√n_x ≤ μ_x ≤ x̄ + t_{γ/2,ν_x}S_x/√n_x   (52a)

ȳ − t_{γ/2,ν_y}S_y/√n_y ≤ μ_y ≤ ȳ + t_{γ/2,ν_y}S_y/√n_y   (52b)
It is clear that H₀: μ_x = μ_y must be rejected at the α-level iff the amount of overlap between (52a) and (52b) barely becomes zero or less. Without loss of generality, the x sample will be denoted such that x̄ − ȳ ≥ 0. Therefore, we deduce from (52a & b) that

U′(μ_y) − L′(μ_x) = (ȳ + t_{γ/2,ν_y}S_y/√n_y) − (x̄ − t_{γ/2,ν_x}S_x/√n_x)

= t_{γ/2,ν_y}S_y/√n_y + t_{γ/2,ν_x}S_x/√n_x − (x̄ − ȳ)   (53a)
Because H₀: μ_x = μ_y must be rejected at the α-level as soon as the RHS of (53a) becomes 0 or smaller, we impose the critical limit of rejection x̄ − ȳ = t_{α/2,ν}·√(S_x²/n_x + S_y²/n_y) in Eq. (53a), where ν is given in Eq. (39). In short, we are rejecting H₀: μ_x = μ_y as soon as the two independent CIs in Eq. (52a) and Eq. (52b) become disjoint. This leads to rejecting H₀: μ_x = μ_y iff

t_{γ/2,ν_y}S_y/√n_y + t_{γ/2,ν_x}S_x/√n_x − (x̄ − ȳ) ≤ 0.   (53b)
At the borderline value, we set x̄ − ȳ = t_{α/2,ν}·√(S_x²/n_x + S_y²/n_y) and set the LHS of inequality (53b) to 0 in order to solve for γ:

⇒ t_{γ/2,ν_y}S_y/√n_y + t_{γ/2,ν_x}S_x/√n_x − t_{α/2,ν}·√(S_x²/n_x + S_y²/n_y) = 0.

Multiplying through by √n_y gives t_{γ/2,ν_y}S_y + t_{γ/2,ν_x}S_x√R_n − t_{α/2,ν}·√(S_x²R_n + S_y²) = 0.

Dividing through by S_y gives t_{γ/2,ν_y} + t_{γ/2,ν_x}√(F₀R_n) − t_{α/2,ν}·√(F₀R_n + 1) = 0.

Or: t_{γ/2,ν_y} + t_{γ/2,ν_x}√(F₀R_n) = t_{α/2,ν}·√(F₀R_n + 1)

⇒ t_{γ/2,ν_y} + k·t_{γ/2,ν_x} = t_{α/2,ν}·√(1 + k²)   (54a)
where F₀ = S_x²/S_y², R_n = n_y/n_x, and ν is given in Eq. (39). Eq. (54a) clearly shows that the value of γ depends on the LOS α of testing H₀: μ_x = μ_y, on F₀, and on the sample sizes n_x and n_y. For the case of a balanced design (n_x = n_y = n), (54a) reduces to

t_{γ/2,n−1} = t_{α/2,ν}·√(F₀ + 1)/(1 + √F₀)

or F_{γ,1,n−1} = F_{α,1,ν}·(1 + F₀)/(1 + √F₀)²   (54b)
where ν = (n − 1)(F₀ + 1)²/(F₀² + 1). The limiting value of γ in Eq. (54b), as n → ∞, can easily be obtained from Z_{γ/2} = Z_{α/2}·√(F₀ + 1)/(1 + √F₀). The results will be the same for Eq. (54a). For example, using (54b) at α = 0.05, n = 10 and F₀ = 4.0, ν = 13.2353, resulting in γ = 0.1424. Payton et al. (2000) report this value as 0.1262 because the denominator df of F_{α,1,ν} in the formula atop their page 550 is inaccurate. For n_x & n_y > 100, as F₀ → ∞, γ → α, so that the Overlap approaches an α-level test. See the illustration in Table 23.
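As an order-of-magnitude check, the normal-limit form Z_{γ/2} = Z_{α/2}·√(F₀ + 1)/(1 + √F₀) gives, at α = 0.05 and F₀ = 4, a limiting γ of about 0.144, reasonably close to the exact γ = 0.1424 computed above at n = 10. A Python sketch (stdlib only; z_{0.025} = 1.959964 is an assumed table constant):

```python
import math

def gamma_limit(z_alpha_half, f0):
    """Limiting individual-CI risk gamma obtained from
    Z_{gamma/2} = Z_{alpha/2} * sqrt(F0 + 1) / (1 + sqrt(F0))."""
    z = z_alpha_half * math.sqrt(f0 + 1.0) / (1.0 + math.sqrt(f0))
    phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))   # standard normal CDF
    return 2.0 * (1.0 - phi)

print(gamma_limit(1.959964, 4.0))   # about 0.144
```

The gap between 0.144 and 0.1424 reflects the finite-sample t quantiles at ν = 13.2353 and n − 1 = 9, which the normal limit ignores.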
7.3 Comparing the Paired t-CI with Two Independent t-CIs

As before, let O represent the amount of overlap length between the two individual CIs. Then O will be 0 iff either L(μ_x) > U(μ_y) or L(μ_y) > U(μ_x), in which case H₀: μ_x = μ_y is rejected at the LOS < α. Thus, O is larger than 0 when U(μ_x) > U(μ_y) > L(μ_x) or U(μ_y) > U(μ_x) > L(μ_y). In these two cases, both U(μ_x) > U(μ_y) > L(μ_x) and U(μ_y) > U(μ_x) > L(μ_y) will lead to the same result. Therefore, only U(μ_x) > U(μ_y) > L(μ_x) is discussed here, so that we are making the assumption that x̄ − ȳ ≥ 0.

O = U(μ_y) − L(μ_x) = (ȳ + t_{α/2,n−1}S_y/√n) − (x̄ − t_{α/2,n−1}S_x/√n)

= (t_{α/2,n−1}S_x/√n + t_{α/2,n−1}S_y/√n) − (x̄ − ȳ)   (55a)
Further, the span of the two individual CIs is

U(μ_x) − L(μ_y) = (x̄ + t_{α/2,n−1}S_x/√n) − (ȳ − t_{α/2,n−1}S_y/√n)

= (t_{α/2,n−1}S_x/√n + t_{α/2,n−1}S_y/√n) + (x̄ − ȳ)   (55b)

From equations (55a & b) the % overlap is given by

ω = [(t_{α/2,n−1}S_x/√n + t_{α/2,n−1}S_y/√n) − (x̄ − ȳ)] / [(t_{α/2,n−1}S_x/√n + t_{α/2,n−1}S_y/√n) + (x̄ − ȳ)] × 100%   (55c)
As x̄ − ȳ increases, the P-value of the test decreases (i.e., H₀: μ_x = μ_y must be rejected more strongly) and ω in Eq. (55c) decreases. Because H₀: μ_x = μ_y must be rejected at the
Table 23. The γ Value for Different Combinations of n_x, n_y and R_n at Either F₀ = F_{0.90,ν_x,ν_y} or F₀ = F_{0.10,ν_x,ν_y}

n_x   n_y   R_n   F₀ = F_{0.90,ν_x,ν_y}   γ      F₀ = F_{0.10,ν_x,ν_y}   γ
5 5 1 0.24347 0.13970 4.10725 0.08485
20 20 1 0.54873 0.16301 1.82240 0.13999
60 60 1 0.71470 0.16510 1.39918 0.15333
500 500 1 0.89152 0.16571 1.12168 0.16206
1000 1000 1 0.92208 0.16574 1.08451 0.16320
100000 100000 1 0.99193 0.16762 1.00814 0.16553
5 6 1.2 0.24688 0.15190 3.52020 0.08565
20 24 1.2 0.55765 0.16613 1.75251 0.13711
60 72 1.2 0.72272 0.16635 1.37397 0.15122
500 600 1.2 0.89554 0.16580 1.11574 0.16098
1000 1200 1.2 0.92509 0.16568 1.08054 0.16231
100000 120000 1.2 0.99227 0.16537 1.00779 0.16505
10 15 1.5 0.42534 0.16860 2.12195 0.11484
20 30 1.5 0.56715 0.16794 1.68491 0.13282
80 120 1.5 0.76388 0.16603 1.29555 0.15011
500 750 1.5 0.89977 0.16466 1.10959 0.15858
1000 1500 1.5 0.92825 0.16437 1.07642 0.16011
100000 150000 1.5 0.99262 0.16371 1.00742 0.16329
5 10 2 0.25409 0.17390 2.69268 0.08157
20 40 2 0.57733 0.16722 1.61932 0.12632
80 160 2 0.77239 0.16349 1.27469 0.14452
500 1000 2 0.90426 0.16127 1.10318 0.15392
1000 2000 2 0.93160 0.16082 1.07212 0.15564
100000 200000 2 0.99300 0.15980 1.00704 0.15928
10 50 5 0.45063 0.15367 1.76252 0.08712
50 250 5 0.73697 0.14463 1.30352 0.11609
500 2500 5 0.91313 0.14027 1.09087 0.13122
1000 5000 5 0.93819 0.13961 1.06381 0.13320
100000 500000 5 0.99373 0.13810 1.00629 0.13746
10 100 10 0.45673 0.13019 1.69556 0.07319
50 500 10 0.74397 0.12427 1.28494 0.09796
500 5000 10 0.91637 0.12052 1.08649 0.11195
1000 10000 10 0.94059 0.11992 1.06084 0.11383
100000 1000000 10 0.99400 0.11851 1.00602 0.11790
100000 2000000 20 0.99413 0.10086 1.00588 0.10032
100000 3000000 30 0.99418 0.09215 1.00583 0.09166
100000 5000000 50 0.99422 0.08297 1.00579 0.08254
100000 10000000 100 0.99424 0.07341 1.00576 0.07305
100000 20000000 200 0.99426 0.06654 1.00575 0.06622
100000 50000000 500 0.99427 0.06042 1.00574 0.06015
α level if x̄ − ȳ ≥ t_{α/2,n−1}·S_d/√n, then from (55c) H₀ must be barely rejected at the α×100% level or less iff

ω ≤ [(t_{α/2,n−1}S_x/√n + t_{α/2,n−1}S_y/√n) − t_{α/2,n−1}S_d/√n] / [(t_{α/2,n−1}S_x/√n + t_{α/2,n−1}S_y/√n) + t_{α/2,n−1}S_d/√n] × 100%

= (S_x + S_y − S_d) / (S_x + S_y + S_d) × 100%

= [S_x + S_y − √(S_x² + S_y² − 2rS_xS_y)] / [S_x + S_y + √(S_x² + S_y² − 2rS_xS_y)] × 100%

= [√F₀ + 1 − √(F₀ + 1 − 2r√F₀)] / [√F₀ + 1 + √(F₀ + 1 − 2r√F₀)] × 100%   (56a)

Or

ω_r = [√F₀ + 1 − √(F₀ + 1 − 2r√F₀)] / [√F₀ + 1 + √(F₀ + 1 − 2r√F₀)] × 100%   (56b)

where F₀ = S_x²/S_y², S_d² = S_x² + S_y² − 2Ŝ_xy, and Ŝ_xy = Σⁿᵢ₌₁(xᵢ − x̄)(yᵢ − ȳ)/(n − 1)
. Just like the case of known variances, the % overlap ω_r in (56b) depends only on the sample correlation coefficient r and the ratio of the two sample variances, i.e., it does not depend on α or on the specific values of S_x and S_y. It is interesting to note that when r = 0 (i.e., the two samples are independent) and F₀ = 1, Eq. (56b) reduces to (2 − √2)/(2 + √2) × 100% = 17.1573%, which was the % overlap for the case of independent samples and equal variances given in Eq. (3f). When r = 1 and F₀ ≥ 1, ω_r in Eq. (56b) reduces to 1/√F₀, while if r = 1 and F₀ ≤ 1, ω_r in Eq. (56b) reduces to √F₀. On the other hand, when r = −1, as expected, ω_r in Eq. (56b) reduces to zero regardless of the value of F₀, so that the Overlap becomes an exact α-level test.
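These special cases of Eq. (56b) are easy to verify numerically; the following Python sketch (stdlib only; the function name is ours) implements the formula as a fraction:

```python
import math

def pct_overlap_paired(f0, r):
    """Eq. (56b): rejection-threshold % overlap (as a fraction) for the
    paired-t comparison; f0 = Sx^2/Sy^2, r = sample correlation coefficient."""
    root = math.sqrt(f0 + 1.0 - 2.0 * r * math.sqrt(f0))   # Sd / Sy
    num = math.sqrt(f0) + 1.0 - root
    den = math.sqrt(f0) + 1.0 + root
    return num / den

assert abs(pct_overlap_paired(1.0, 0.0) - 0.171573) < 1e-5   # independent, F0 = 1
assert abs(pct_overlap_paired(4.0, 1.0) - 0.5) < 1e-12       # r = 1, F0 >= 1: 1/sqrt(F0)
assert abs(pct_overlap_paired(2.5, -1.0)) < 1e-12            # r = -1: threshold is zero
```

At r = −1 the inner square root equals √F₀ + 1 exactly, so the numerator vanishes for every F₀, which is why the Overlap becomes an exact α-level test there.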
Finally, what should the individual confidence levels 1 − γ be so that the two independent CIs lead to an exact α-level test on H₀: μ_x = μ_y? As before, the expressions for the two 1 − γ independent CIs are given by

x̄ − t_{γ/2,n−1}S_x/√n ≤ μ_x ≤ x̄ + t_{γ/2,n−1}S_x/√n   (52c)

ȳ − t_{γ/2,n−1}S_y/√n ≤ μ_y ≤ ȳ + t_{γ/2,n−1}S_y/√n   (52d)
It is clear that H₀: μ_x = μ_y must be rejected at the α-level iff the amount of overlap between (52c) and (52d) barely becomes zero or less. Without loss of generality, the x sample will be denoted such that x̄ − ȳ ≥ 0. Therefore, we deduce from (52c & d) that

U′(μ_y) − L′(μ_x) = (ȳ + t_{γ/2,n−1}S_y/√n) − (x̄ − t_{γ/2,n−1}S_x/√n)

= t_{γ/2,n−1}S_y/√n + t_{γ/2,n−1}S_x/√n − (x̄ − ȳ)   (53c)
Because H₀: μ_x = μ_y must be rejected at the α-level as soon as the RHS of (53c) becomes 0 or smaller, we impose the rejection criterion x̄ − ȳ ≥ t_{α/2,n−1}·S_d/√n in Eq. (53c). This leads to rejecting H₀: μ_x = μ_y iff

t_{γ/2,n−1}S_y/√n + t_{γ/2,n−1}S_x/√n − t_{α/2,n−1}·S_d/√n ≤ 0.   (53d)
At the borderline value, we set the LHS of inequality (53d) equal to 0 in order to solve for γ:

⇒ t_{γ/2,n−1}S_y + t_{γ/2,n−1}S_x − t_{α/2,n−1}·S_d = 0

⇒ t_{γ/2,n−1}(S_y + S_x) = t_{α/2,n−1}·S_d

⇒ t_{γ/2,n−1} = t_{α/2,n−1}·√(S_x² + S_y² − 2rS_xS_y)/(S_x + S_y)

⇒ t_{γ/2,n−1} = t_{α/2,n−1}·√(1 + F₀ − 2r√F₀)/(1 + √F₀)   (54c)
where F₀ = S_x²/S_y². Eq. (54c) clearly shows that the value of γ depends on the LOS α of testing H₀: μ_x = μ_y, on F₀, and on the sample size n. When r = −1, Eq. (54c) shows that γ = α, so that the Overlap becomes an exact α-level test; while if r = 1 the RHS attains its minimum value, leading to the maximum value of γ. When r = 0 (i.e., uncorrelated X & Y), Eq. (54c) shows that for very large or very small values of F₀, the Overlap in the limit becomes an α-level test.
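This behavior is driven entirely by the multiplier m(r, F₀) = √(1 + F₀ − 2r√F₀)/(1 + √F₀) of t_{α/2,n−1} in Eq. (54c): m = 1 at r = −1 (so γ = α), and m is smallest at r = 1 (largest γ). A brief numerical check (Python, stdlib only):

```python
import math

def m(r, f0):
    """Multiplier of t_{alpha/2, n-1} in Eq. (54c)."""
    return math.sqrt(1.0 + f0 - 2.0 * r * math.sqrt(f0)) / (1.0 + math.sqrt(f0))

for f0 in (0.25, 1.0, 4.0):
    assert abs(m(-1.0, f0) - 1.0) < 1e-12           # r = -1: gamma = alpha
    assert m(1.0, f0) <= m(0.0, f0) <= m(-1.0, f0)  # m decreases as r increases
```

The monotonicity holds because the quantity under the square root decreases linearly in r, at the rate 2√F₀.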
108
8.0 The Impact of Overlap on Type II Error Probability for the Case of Unknown
Process Variances
2
x
? ,
2
y
? and Small to Moderate Sample Sizes
Since the population variances σ_x² and σ_y² are unknown, their unbiased point estimators S_x² and S_y², respectively, must be used for the purpose of statistical inference. As mentioned in Chapter 6, the rv (x̄ − μ_x)/(S_x/√n_x) is not normally distributed; its sampling distribution follows that of W. S. Gosset's t-distribution with (n_x − 1) degrees of freedom. As a result, the acceptance interval of the test statistic (x̄ − μ_{x0})/(S_x/√n_x) at the LOS α is (−t_{α/2, n_x−1}, t_{α/2, n_x−1}), where t_{α/2, ν} > 0 for all 0 < α < 0.50, and it also follows that

Pr(x̄ − t_{α/2, n_x−1}·S_x/√n_x ≤ μ_x ≤ x̄ + t_{α/2, n_x−1}·S_x/√n_x) = 1 − α (57a)
Hence, the lower limit of the (1 − α)100% CI for μ_x is L(μ_x) = x̄ − t_{α/2, n_x−1}·S_x/√n_x, the corresponding upper limit is U(μ_x) = x̄ + t_{α/2, n_x−1}·S_x/√n_x, and the

CIL(μ_x) = 2·t_{α/2, n_x−1}·S_x/√n_x (57b)

Similarly, L(μ_y) = ȳ − t_{α/2, n_y−1}·S_y/√n_y, U(μ_y) = ȳ + t_{α/2, n_y−1}·S_y/√n_y, and

CIL(μ_y) = 2·t_{α/2, n_y−1}·S_y/√n_y. (57c)
8.1 The Case of H0: σ_x = σ_y = σ Not Rejected, Leading to the Pooled t-Test
Assuming that X ~ N(μ_x, σ²) and Y ~ N(μ_y, σ²), then X̄ − Ȳ has the N(μ_x − μ_y, σ²/n_x + σ²/n_y) distribution, where it is assumed that σ² is the common value of the unknown σ_x² = σ_y² = σ². With the above assumptions, x̄ − ȳ is an unbiased estimator of μ_x − μ_y with Var(x̄ − ȳ) = σ²·(1/n_x + 1/n_y). In practice, a pretest on H0: σ_x² = σ_y² = σ² is required before deciding to use either the pooled t-test or the two-independent-sample t-test. If the assumption σ_x = σ_y = σ is tenable, and because statistical theory dictates that the total resources be allocated according to n_x = N·σ_x/(σ_x + σ_y) = N/2 = n_y, then the most common application of the pooled t-test occurs under equal sample sizes. Henceforth, the pooled t-test will be used iff the P-value of the pretest H0: σ_x = σ_y exceeds 20%, and for very small sample sizes n_x & n_y < 10, a P-value of at least 40% for the pretest is recommended. Since the common value of the process variances σ² is unknown, its unbiased estimators S_x² and S_y² should be pooled to obtain one unbiased estimator of σ², which as before is given by their weighted average based on their degrees of freedom, i.e.,

S_p² = (ν_x·S_x² + ν_y·S_y²)/(ν_x + ν_y) = [(n_x − 1)·S_x² + (n_y − 1)·S_y²]/(n_x + n_y − 2) (35)
Note that E(S_p²) = σ². Therefore, se(x̄ − ȳ) = S_p·√(1/n_x + 1/n_y), and as a result the rv [(x̄ − ȳ) − (μ_x − μ_y)]/(S_p·√(1/n_x + 1/n_y)) has a central "Student's" sampling distribution with ν = n_x + n_y − 2. Accordingly, the AI (acceptance interval) for a 5%-level test of H0: μ_x − μ_y = 0 is given by

−t_{0.025, ν}·S_p·√(1/n_x + 1/n_y) ≤ x̄ − ȳ ≤ t_{0.025, ν}·S_p·√(1/n_x + 1/n_y) (58a)
where ν = ν_x + ν_y = n_x + n_y − 2. Henceforth, in this section we let t_{0.025} represent t_{0.025, n_x+n_y−2} only for notational convenience. Thus the AI in (58a) reduces to

−t_{0.025}·S_p·√(1/n_x + 1/n_y) ≤ x̄ − ȳ ≤ t_{0.025}·S_p·√(1/n_x + 1/n_y) (58b)
Under the null hypothesis H0: μ_x − μ_y = 0, the SMD of t₀ = (x̄ − ȳ)/(S_p·√(1/n_x + 1/n_y)) is that of the central t with ν = ν_x + ν_y = n_x + n_y − 2. Put differently, the null distribution of (x̄ − ȳ)/(S_p·√(1/n_x + 1/n_y)) is T_{n_x+n_y−2}. Thus, the AI in (58b) reduces to

AI: −t_{0.025} ≤ t₀ ≤ t_{0.025} (58c)

where t₀ = (x̄ − ȳ)/(S_p·√(1/n_x + 1/n_y)) and t_{0.025} = t_{0.025, n_x+n_y−2}. However, if H0: μ_x − μ_y = 0 is false (so that a type II error can occur), the SMD of (x̄ − ȳ)/(S_p·√(1/n_x + 1/n_y)) is no longer the central T_{n_x+n_y−2}. Thus, we next derive the SMD of the test statistic t₀ = (x̄ − ȳ)/(S_p·√(1/n_x + 1/n_y)) under the alternative H1: μ_x − μ_y = δ ≠ 0.
From statistical theory, the SMD of the rv U/√(χ²_ν/ν) is that of the central Student's t with df equal to that of χ²_ν, where U ~ N(0, 1), i.e., U is a unit normal rv, and χ²_ν is a chi-squared distributed rv with ν df, independent of U. However, if E(U) ≠ 0, then U/√(χ²_ν/ν) is no longer central t distributed; rather, the rv (Z + λ)/√(χ²_ν/ν), where Z ~ N(0, 1), has the noncentral t distribution with ν df and noncentrality parameter λ, and the distribution is almost universally denoted by t′_ν(λ), i.e., (Z + λ)/√(χ²_ν/ν) ~ t′_ν(λ). We will now illustrate how the above noncentral t distribution is used to compute the type II error Pr when testing the equality of two normal means with unknown but equal process variances. (This result has been known in statistical literature for over 35 years.)
By definition,

β = Pr(Accepting H0: μ_x − μ_y = 0 if H0 is false)
= Pr(−t_{0.025} ≤ t₀ ≤ t_{0.025} | μ_x − μ_y = δ)
= Pr(−t_{0.025} ≤ t₀ = (x̄ − ȳ)/(S_p·√(1/n_x + 1/n_y)) ≤ t_{0.025} | μ_x − μ_y = δ) (59)
If H0 is assumed false so that E(x̄ − ȳ) = μ_x − μ_y = δ ≠ 0, the SMD of (x̄ − ȳ)/(S_p·√(1/n_x + 1/n_y)) is no longer t′_ν(λ = 0), which is the central t with ν = ν_x + ν_y = n_x + n_y − 2 degrees of freedom. Thus, we first standardize x̄ − ȳ in Eq. (59) as shown below, assuming σ_x = σ_y = σ:

β = Pr(−t_{0.025} ≤ {[(x̄ − ȳ) − (μ_x − μ_y) + (μ_x − μ_y)]/√(σ²/n_x + σ²/n_y)} / {S_p·√(1/n_x + 1/n_y)/√(σ²/n_x + σ²/n_y)} ≤ t_{0.025} | μ_x − μ_y = δ)
= Pr(−t_{0.025} ≤ [Z + (μ_x − μ_y)/√(σ²/n_x + σ²/n_y)]/(S_p/σ) ≤ t_{0.025} | μ_x − μ_y = δ)
= Pr(−t_{0.025} ≤ [Z + δ/√(σ²/n_x + σ²/n_y)]/√{[(n_x + n_y − 2)·S_p²/σ²]/(n_x + n_y − 2)} ≤ t_{0.025} | μ_x − μ_y = δ)
= Pr(−t_{0.025} ≤ [Z + δ/√(σ²/n_x + σ²/n_y)]/√(χ²_{n_x+n_y−2}/(n_x + n_y − 2)) ≤ t_{0.025})
= Pr(−t_{0.025} ≤ (Z + λ)/√(χ²_ν/ν) ≤ t_{0.025}) (60a)

where λ = (μ_x − μ_y)/√(σ²/n_x + σ²/n_y) = δ/(σ·√(1/n_x + 1/n_y)) and ν = n_x + n_y − 2.
However, as stated above, the SMD of (Z + λ)/√(χ²_ν/ν) is the noncentral t with ν = n_x + n_y − 2 and noncentrality parameter λ = δ/(σ·√(1/n_x + 1/n_y)) = δ/SE(x̄ − ȳ), i.e.,

(Z + λ)/√(χ²_ν/ν) ~ t′_{n_x+n_y−2}(δ/(σ·√(1/n_x + 1/n_y))) = t′_{n_x+n_y−2}((δ/σ)·√(n_x·n_y/(n_x + n_y))).

Thus, β = Pr(−t_{0.025} ≤ (Z + λ)/√(χ²_ν/ν) ≤ t_{0.025})
= Pr(−t_{0.025} ≤ t′_{n_x+n_y−2}((δ/σ)·√(n_x·n_y/(n_x + n_y))) ≤ t_{0.025}) (60b)
Note that when λ = 0, the argument in (60b) becomes the central t and β becomes equal to 1 − α.

When the design is balanced, the SMD of the test statistic t₀ under H1 reduces to t′_{2(n−1)}((δ/σ)·√(n/2)), i.e., when n_x = n_y = n,

β = Pr(−t_{0.025} ≤ (Z + λ)/√(χ²_ν/ν) ≤ t_{0.025}) = Pr(−t_{0.025} ≤ t′_{2(n−1)}((δ/σ)·√(n/2)) ≤ t_{0.025}) (60c)
As an example, suppose we draw a random sample of size n_x = 7 from a N(μ_x, σ²) and one of size n_y = 11 from another N(μ_y, σ²), with the objective of testing H0: μ_x − μ_y = 0 at the nominal significance level α = 5% versus the 2-sided alternative H1: μ_x − μ_y ≠ 0. We wish to answer the question: what is the Pr of accepting H0 if the true mean difference μ_x − μ_y were not zero but equal to 0.50σ? That is, we wish to compute the type II error Pr at δ = 0.50σ. Then the corresponding value of the noncentrality parameter is λ = (δ/σ)·√(n_x·n_y/(n_x + n_y)) = 0.50·√(77/18) = 1.03414, and the type II error Pr from Eq. (60b) is

β = Pr(−t_{0.025,16} ≤ t′_16(1.03414) ≤ t_{0.025,16}) = Pr(−2.119905 ≤ t′_16(1.03414) ≤ 2.119905)
= cdf[of t′_16(1.03414) at 2.119905] − cdf[of t′_16(1.03414) at −2.119905].

Fortunately, both Minitab and Matlab provide the cdf of the noncentral t distribution. Using Minitab, we obtain cdf[of t′_16(1.03414) at 2.119905] = 0.838156 and cdf[of t′_16(1.03414) at −2.119905] = 0.0016652; thus, β = 0.838156 − 0.0016652 = 0.836491, so that the power of the test at δ = 0.50σ is 1 − β = 1 − 0.836491 = 0.163509. Clearly, as δ = μ_x − μ_y departs further from zero, the power of the test must increase, which is illustrated next. Suppose now δ = 0.80σ; then λ = 0.80·√(77/18) = 1.65462315 and (at δ = 0.80σ, n_x = 7, n_y = 11)

β = Pr(−2.119905 ≤ t′_16(1.65462315) ≤ 2.119905)
= cdf[of t′_16(1.65462315) at 2.119905] − cdf[of t′_16(1.65462315) at −2.119905]
= 0.656987 − 0.0002142 = 0.656773,

and hence the power of the test increases from 0.163509 to 1 − 0.656773 = 0.343227. It is interesting to note that if the design is balanced, then the power of the test always increases for the same parameter values. For example, if n_x = n_y = 9 so that ν = 16 stays intact, then at δ = 0.80σ the noncentrality parameter λ = 0.80·√(81/18) = 1.6970563 and

β(at δ = 0.80σ, n_x = n_y = 9) = Pr(−2.119905 ≤ t′_16(1.6970563) ≤ 2.119905)
= cdf[of t′_16(1.6970563) at 2.119905] − cdf[of t′_16(1.6970563) at −2.119905]
= 0.642235 − 0.0001839 = 0.6420511,

so that Power(at δ = 0.80σ) = 0.357949, which exceeds the value of 0.343227 for the unbalanced case. The syntax for the Matlab noncentral t cdf is nctcdf(t, ν, λ).
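The same computations can be done in Python, assuming SciPy is available; `scipy.stats.nct.cdf` plays the role of Matlab's nctcdf, and the helper name `pooled_t_beta` is ours, not the dissertation's:

```python
from scipy import stats

def pooled_t_beta(delta_over_sigma, nx, ny, alpha=0.05):
    """Type II error Pr of the pooled t-test from Eq. (60b):
    beta = P(-t_{alpha/2, nu} <= t'_nu(lambda) <= t_{alpha/2, nu}),
    with nu = nx + ny - 2 and lambda = (delta/sigma) * sqrt(nx*ny/(nx + ny))."""
    nu = nx + ny - 2
    lam = delta_over_sigma * (nx * ny / (nx + ny)) ** 0.5
    t_crit = stats.t.ppf(1 - alpha / 2, nu)
    return stats.nct.cdf(t_crit, nu, lam) - stats.nct.cdf(-t_crit, nu, lam)

print(pooled_t_beta(0.50, 7, 11))  # ≈ 0.836491
print(pooled_t_beta(0.80, 7, 11))  # ≈ 0.656773
print(pooled_t_beta(0.80, 9, 9))   # ≈ 0.642051
```

These reproduce, to the displayed rounding, the Minitab values quoted above for the unbalanced and balanced designs.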
As in the case of known variances, the type II error Pr from the Overlap is computed similarly to Eq. (7), as shown below.

β′ = Pr(Overlap > 0) = Pr{[L(μ_x) ≤ U(μ_y)] ∩ [L(μ_y) ≤ U(μ_x)] | μ_x − μ_y = δ}
= Pr{[x̄ − t_{α/2, ν_x}·S_x/√n_x ≤ ȳ + t_{α/2, ν_y}·S_y/√n_y] ∩ [ȳ − t_{α/2, ν_y}·S_y/√n_y ≤ x̄ + t_{α/2, ν_x}·S_x/√n_x] | δ}
= Pr{[x̄ − ȳ ≤ t_{α/2, ν_x}·S_x/√n_x + t_{α/2, ν_y}·S_y/√n_y] ∩ [−t_{α/2, ν_y}·S_y/√n_y − t_{α/2, ν_x}·S_x/√n_x ≤ x̄ − ȳ] | δ}
= Pr{−t_{α/2, ν_y}·S_y/√n_y − t_{α/2, ν_x}·S_x/√n_x ≤ x̄ − ȳ ≤ t_{α/2, ν_x}·S_x/√n_x + t_{α/2, ν_y}·S_y/√n_y | δ}
= Pr{−A ≤ x̄ − ȳ ≤ +A | δ} (61)

where A = t_{α/2, ν_x}·S_x/√n_x + t_{α/2, ν_y}·S_y/√n_y.
In order to apply the noncentral t-distribution to compute the Pr in Eq. (61), not available in statistical literature, we must first divide throughout inside the brackets by S_p·√(1/n_x + 1/n_y) and then standardize x̄ − ȳ, as illustrated below.

β′ = Pr{−A/(S_p·√(1/n_x + 1/n_y)) ≤ (x̄ − ȳ)/(S_p·√(1/n_x + 1/n_y)) ≤ A/(S_p·√(1/n_x + 1/n_y)) | δ}
= Pr[−A_p ≤ t′_{n_x+n_y−2}((δ/σ)·√(n_x·n_y/(n_x + n_y))) ≤ A_p] (62a)
where A_p = [t_{α/2, ν_x}·S_x/√n_x + t_{α/2, ν_y}·S_y/√n_y]/(S_p·√(1/n_x + 1/n_y)). Note that if δ = 0, Eq. (62a) reduces to 1 − α′ as was shown in Eq. (37a). In the case of a balanced design, Eq. (62a) reduces to

β′ = Pr[−t_{α/2, n−1}·(S_x + S_y)/√(S_x² + S_y²) ≤ t′_{2(n−1)}((δ/σ)·√(n/2)) ≤ t_{α/2, n−1}·(S_x + S_y)/√(S_x² + S_y²)]
= Pr[−t_{α/2, n−1}·(√F₀ + 1)/√(F₀ + 1) ≤ t′_{2(n−1)}((δ/σ)·√(n/2)) ≤ t_{α/2, n−1}·(√F₀ + 1)/√(F₀ + 1)] (62b)
As an example, suppose samples of sizes n_x = n_y = 9 are drawn from two independent normal universes with unknown but equal variances. We wish to compute the Pr of accepting H0: μ_x − μ_y = 0 at α = 0.05 if μ_x − μ_y = 0.80σ and the sample statistics are S_x = 0.65 and S_y = 0.54. Note that it is sufficient to provide the ratio F₀ = S_x²/S_y² instead of the specific values of S_x and S_y. Because t_{α/2, n−1} = t_{0.025, 8} = 2.306004 and λ = 0.80·√(81/18) = 1.6970563, Eq. (62b) yields

β′(at δ = μ_x − μ_y = 0.80σ, n_x = n_y = 9) = Pr[−3.247338 ≤ t′_16(1.6970563) ≤ 3.247338] = 0.904239 − 0.0000078 = 0.904231.
The above value of β′ is much larger than β = 0.6420511 using the Standard method. It can easily be verified that the random function (S_x + S_y)/√(S_x² + S_y²) lies within the interval

1 < (S_x + S_y)/√(S_x² + S_y²) = (√F₀ + 1)/√(F₀ + 1) ≤ √2.

However, we are using the pooled t-test only if F_{0.90, n−1, n−1} ≤ F₀ = S_x²/S_y² ≤ F_{0.10, n−1, n−1}, and hence

(√F_{0.90, n−1, n−1} + 1)/√(F_{0.90, n−1, n−1} + 1) = (√F_{0.10, n−1, n−1} + 1)/√(F_{0.10, n−1, n−1} + 1) ≤ (S_x + S_y)/√(S_x² + S_y²) = (√F₀ + 1)/√(F₀ + 1) ≤ √2.

Note that the equality on the far LHS of this last relation follows from the fact that F_{0.10, n−1, n−1} = 1/F_{0.90, n−1, n−1} for all n. Therefore, for a balanced design the GLB of β′ for a 5%-level test is given by

GLB(β′) = Pr[−t_{0.025, n−1}·(√F_{0.10} + 1)/√(F_{0.10} + 1) ≤ t′_{2(n−1)}((δ/σ)·√(n/2)) ≤ t_{0.025, n−1}·(√F_{0.10} + 1)/√(F_{0.10} + 1)] (63a)

where F_{0.10} = F_{0.10, n−1, n−1}, and the LUB is given by

LUB(β′) = Pr[−t_{0.025, n−1}·√2 ≤ t′_{2(n−1)}((δ/σ)·√(n/2)) ≤ t_{0.025, n−1}·√2] (63b)

i.e.,

Pr[−t_{0.025, n−1}·(√F_{0.10} + 1)/√(F_{0.10} + 1) ≤ t′_{2(n−1)}((δ/σ)·√(n/2)) ≤ t_{0.025, n−1}·(√F_{0.10} + 1)/√(F_{0.10} + 1)] ≤ β′ ≤ Pr[−t_{0.025, n−1}·√2 ≤ t′_{2(n−1)}((δ/σ)·√(n/2)) ≤ t_{0.025, n−1}·√2] (63c)
Thus, for the example with n_x = n_y = 9, the GLB that the overlap type II error Pr can attain is given by Eq. (63a) and is computed below:

GLB(β′ at δ = μ_x − μ_y = 0.80σ)
= Pr[−2.306004·(√2.589349 + 1)/√(2.589349 + 1) ≤ t′_16(1.6970563) ≤ 3.175781]
= Pr[−3.175781 ≤ t′_16(1.6970563) ≤ 3.175781]
= 0.89451811 − 0.00000954 = 0.89450857.

Thus, the smallest % relative error for the power of the test from Overlap is [(0.357949 − 0.105492)/0.357949]×100% = 70.53%.
Furthermore, the LUB that the overlap type II error Pr can become is given by Eq. (63b) and is calculated as follows:

LUB(β′ at δ = μ_x − μ_y = 0.80σ)
= Pr[−2.306004·√2 ≤ t′_16(1.6970563) ≤ 3.26118232]
= Pr[−3.26118232 ≤ t′_16(1.6970563) ≤ 3.26118232]
= 0.90602841 − 0.00000756 = 0.90602085.

Therefore, the worst % relative error for the power of the test from Overlap is [(0.357949 − 0.093979)/0.357949]×100% = 73.75%.
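Eqs. (63a, b) can be checked numerically. A sketch assuming SciPy is available (`overlap_beta_bounds` is our name; `scipy.stats.f.ppf(0.90, n-1, n-1)` supplies F_{0.10, n−1, n−1}, the upper 10% point of Fisher's F):

```python
from scipy import stats

def overlap_beta_bounds(delta_over_sigma, n, alpha=0.05):
    """GLB and LUB of the Overlap type II error Pr from Eqs. (63a, b),
    balanced design, pooled-t pretest passed at the 20% level."""
    nu = 2 * (n - 1)
    lam = delta_over_sigma * (n / 2) ** 0.5
    t_crit = stats.t.ppf(1 - alpha / 2, n - 1)
    F10 = stats.f.ppf(0.90, n - 1, n - 1)            # F_{0.10, n-1, n-1}
    a_glb = t_crit * (F10 ** 0.5 + 1) / (F10 + 1) ** 0.5
    a_lub = t_crit * 2 ** 0.5
    glb = stats.nct.cdf(a_glb, nu, lam) - stats.nct.cdf(-a_glb, nu, lam)
    lub = stats.nct.cdf(a_lub, nu, lam) - stats.nct.cdf(-a_lub, nu, lam)
    return glb, lub

glb, lub = overlap_beta_bounds(0.80, 9)
print(glb, lub)  # ≈ 0.894509, 0.906021
```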
8.2 The Case of H0: σ_x = σ_y Rejected, Leading to the Two-Independent-Sample t-Test (or the t-Prime Test)

Assuming that X ~ N(μ_x, σ_x²) and Y ~ N(μ_y, σ_y²), then X̄ − Ȳ is N(μ_x − μ_y, σ_x²/n_x + σ_y²/n_y), but now the null hypothesis H0: σ_x = σ_y is rejected at the 20% level, leading to the assumption that the F-statistic F₀ = S_x²/S_y² > 2 for all sample sizes 16 ≤ n_x & n_y.
It has been shown in statistical theory that, if the assumption σ_x = σ_y is not tenable, the statistic [(x̄ − ȳ) − (μ_x − μ_y)]/√((S_x²/n_x) + (S_y²/n_y)) has the approximate central Student's t-distribution with degrees of freedom

ν = [v(x̄) + v(ȳ)]²/[(v(x̄))²/ν_x + (v(ȳ))²/ν_y] = ν_x·ν_y·[v(x̄) + v(ȳ)]²/[ν_y·(v(x̄))² + ν_x·(v(ȳ))²] = ν_x·ν_y·(F₀·R_n + 1)²/[ν_y·(F₀·R_n)² + ν_x] (39)

where v(x̄) = S_x²/n_x, R_n = n_y/n_x, and F₀ = S_x²/S_y². The formula for degrees of freedom in (39) rarely leads to an integer and is generally rounded down to make the test of H0: μ_x − μ_y = 0 conservative, i.e., rounding ν down increases the P-value of the test. However, programs like Matlab and Minitab will provide probabilities of the t-distribution for non-integer values of ν in Eq. (39). It has been verified by the authors that ν in Eq. (39) attains its maximum when the larger sample also has a much larger variance than the sample whose size is much smaller. Even then, it is certain that Min(ν_x, ν_y) < ν < ν_x + ν_y, and hence the two-sample t-test is less powerful than the pooled t-test. When H0: σ_x = σ_y is rejected at the 20% level (i.e., P-value < 0.20), the type II error Pr of a 5%-level test is given by

β = Pr(Accepting H0: μ_x − μ_y = 0 if H0 is false)
≅ Pr(−t_{0.025, ν} ≤ t₀ ≤ t_{0.025, ν} | μ_x − μ_y = δ) (64)
where t₀ = (x̄ − ȳ)/√((S_x²/n_x) + (S_y²/n_y)) is approximately central t distributed when H0 is true, with the df ν given in Eq. (39). Henceforth in this section we let t_{0.025} represent t_{0.025, ν} only for notational convenience. When H0 is false, the authors have also verified that the exact SMD of the statistic t₀ = (x̄ − ȳ)/√((S_x²/n_x) + (S_y²/n_y)) under the alternative H1: μ_x − μ_y = δ ≠ 0, unlike the case of σ_x = σ_y, is intractable using the central χ². As far as we know, the exact power of the t-Prime test (or the two-independent-sample t-test) has not yet been obtained in statistical literature. That is, the SMD of t₀ is not the noncentral t with some noncentrality parameter λ. The development that follows, the results of which already exist in statistical literature, is only an approximation, because there does not exist an exact solution for the type II error Pr of testing H0: μ_x − μ_y = 0 when the variances are unknown and unequal. We first approximately studentize the expression for β in Eq. (64):

β ≅ Pr(−t_{0.025, ν} ≤ t₀ ≤ t_{0.025, ν} | μ_x − μ_y = δ)
= Pr(−t_{0.025, ν} ≤ (x̄ − ȳ)/√((S_x²/n_x) + (S_y²/n_y)) ≤ t_{0.025, ν} | μ_x − μ_y = δ)
≅ Pr{−t_{0.025} − δ/√((S_x²/n_x) + (S_y²/n_y)) ≤ [(x̄ − ȳ) − δ]/√((S_x²/n_x) + (S_y²/n_y)) ≤ t_{0.025} − δ/√((S_x²/n_x) + (S_y²/n_y))}
≅ Pr{−t_{0.025} − δ̃ ≤ t_ν ≤ t_{0.025} − δ̃} (65)

where the studentized mean difference δ̃ = (μ_x − μ_y)/√((S_x²/n_x) + (S_y²/n_y)) = δ/√((S_x²/n_x) + (S_y²/n_y)). Unfortunately, the approximate expression for β in Eq. (65) still depends on the sample se(x̄ − ȳ) = √((S_x²/n_x) + (S_y²/n_y)), and therefore the approximation in Eq. (65) can be carried out iff δ is specified in units of se(x̄ − ȳ), or in units of μ_x − μ_y, in which case the realized values of S_x² and S_y² have to be used a posteriori in order to approximate a priori the type II error probability.
For example, suppose samples of sizes n_x = n_y = 9 are drawn from two independent normal populations with unknown but unequal variances. We wish to compute the Pr of accepting H0: μ_x − μ_y = 0 at α = 0.05 if μ_x − μ_y = δ = 0.4 and the sample statistics are S_x = 0.65 and S_y = 0.54. Eq. (39) gives ν = [v(x̄) + v(ȳ)]²/[(v(x̄))²/ν_x + (v(ȳ))²/ν_y] = 15.48, t_{0.025} = t_{0.025, 15.48} = 2.1257, and δ̃ = δ/√((S_x²/n_x) + (S_y²/n_y)) = 0.40/0.281681 = 1.420044, so that −t_{0.025} − δ̃ = −3.545744 and t_{0.025} − δ̃ = 2.1257 − 1.420044 = 0.7057, and

β(at δ = 0.40) ≅ Pr(−3.5457 ≤ t_{15.48} ≤ 0.7057) = 0.75454016 − 0.00140620 = 0.75313396.

If μ_x − μ_y = 0.60, similar calculations will show that β(at μ_x − μ_y = 0.6) ≅ Pr(−4.25577 ≤ t_{15.48} ≤ −0.00437) = 0.4982845 − 0.0003233 = 0.4979612. Note that the above approximate type II error Prs would be in exact agreement with what UCLA's Statistics Department Power Calculator lists on their website (www.stat.ucla.edu). If n_x = 7, S_x = 0.65, n_y = 11 and S_y = 0.54, the type II error Pr increases a bit from 0.7532 to β(at μ_x − μ_y = 0.4) = 0.79083377 − 0.0022143 = 0.7886.
Again, the type II error Pr from the Overlap is computed similarly to Eq. (7), just like the case of the pooled t-test, as shown below.

β′ = Pr(Overlap > 0) = Pr{[L(μ_x) ≤ U(μ_y)] ∩ [L(μ_y) ≤ U(μ_x)] | μ_x − μ_y = δ}

Note that the event [L(μ_x) ≤ U(μ_y)] ∩ [L(μ_y) ≤ U(μ_x)] is equivalent to either L(μ_x) ≤ U(μ_y) ≤ U(μ_x) or L(μ_y) ≤ U(μ_x) ≤ U(μ_y). Thus,
β′ = Pr{[x̄ − t_{α/2, ν_x}·S_x/√n_x ≤ ȳ + t_{α/2, ν_y}·S_y/√n_y] ∩ [ȳ − t_{α/2, ν_y}·S_y/√n_y ≤ x̄ + t_{α/2, ν_x}·S_x/√n_x] | δ}
= Pr{[x̄ − ȳ ≤ t_{α/2, ν_x}·S_x/√n_x + t_{α/2, ν_y}·S_y/√n_y] ∩ [−t_{α/2, ν_y}·S_y/√n_y − t_{α/2, ν_x}·S_x/√n_x ≤ x̄ − ȳ] | δ}
= Pr{−t_{α/2, ν_y}·S_y/√n_y − t_{α/2, ν_x}·S_x/√n_x ≤ x̄ − ȳ ≤ t_{α/2, ν_x}·S_x/√n_x + t_{α/2, ν_y}·S_y/√n_y | δ}
= Pr{−A ≤ x̄ − ȳ ≤ +A | δ} (66)

where A = t_{α/2, ν_x}·S_x/√n_x + t_{α/2, ν_y}·S_y/√n_y. Studentizing inside the brackets of Eq. (66) results in:
β′ = Pr{−A − (μ_x − μ_y) ≤ (x̄ − ȳ) − (μ_x − μ_y) ≤ +A − (μ_x − μ_y) | δ}
= Pr{[−A − (μ_x − μ_y)]/se(x̄ − ȳ) ≤ [(x̄ − ȳ) − (μ_x − μ_y)]/se(x̄ − ȳ) ≤ (A − δ)/se(x̄ − ȳ)}

where se(x̄ − ȳ) = √((S_x²/n_x) + (S_y²/n_y)). Thus,

β′ ≅ Pr{(−A − δ)/se(x̄ − ȳ) ≤ t_ν ≤ (A − δ)/se(x̄ − ȳ)} (67)
For the example, if n_x = n_y = 9, S_x = 0.65 and S_y = 0.54, then A = 0.914715 and ν = [v(x̄) + v(ȳ)]²/[(v(x̄))²/ν_x + (v(ȳ))²/ν_y] = 15.48 as before, and Eq. (67) now gives β′(at δ = 0.40) ≅ Pr[−4.66738 ≤ t_{15.48} ≤ 1.827295] = 0.95650 − 0.00014 = 0.9564, as compared to the value of β(at δ = 0.40) = 0.7532 and a % relative error in power ([(β′ − β)/(1 − β)]×100%) equal to 82.33%.
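A companion sketch of Eq. (67), under the same SciPy assumption (`welch_overlap_beta` is our name, not the dissertation's):

```python
from scipy import stats

def welch_overlap_beta(delta, sx, sy, nx, ny, alpha=0.05):
    """Approximate Overlap type II error Pr from Eq. (67):
    beta' ≈ P{(-A - delta)/se <= t_nu <= (A - delta)/se}, with
    A = t_{alpha/2, nx-1}*sx/sqrt(nx) + t_{alpha/2, ny-1}*sy/sqrt(ny)."""
    vx, vy = sx ** 2 / nx, sy ** 2 / ny
    nu = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))  # Eq. (39)
    se = (vx + vy) ** 0.5
    A = (stats.t.ppf(1 - alpha / 2, nx - 1) * sx / nx ** 0.5
         + stats.t.ppf(1 - alpha / 2, ny - 1) * sy / ny ** 0.5)
    return stats.t.cdf((A - delta) / se, nu) - stats.t.cdf((-A - delta) / se, nu)

print(welch_overlap_beta(0.40, 0.65, 0.54, 9, 9))  # ≈ 0.9564
```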
8.3 The Impact of Overlap on Type II Error Probability for the Paired t-Test (i.e., the Randomized Block Design) when Process Variances are Unknown

Consider the 5%-level test of H0: μ_x − μ_y = μ_d = 0 versus the 2-sided alternative H1: μ_d ≠ 0, where the paired response (x, y) comes from a bivariate normal universe so that X and Y are correlated random variables with unknown correlation coefficient ρ. The appropriate test statistic for testing H0: μ_d = 0 is t₀ = d̄/(S_d/√n), where S_d = √(S_x² + S_y² − 2·σ̂_xy) and σ̂_xy = Σ_{i=1}^{n} (x_i − x̄)(y_i − ȳ)/(n − 1). The decision rule is to reject H0 at the 0.05 level iff |d̄/(S_d/√n)| > t_{0.025, n−1}.
Thus, for a 5%-level test, by definition

β = Pr(Accepting H0: μ_x − μ_y = 0 if H0 is false)
= Pr(−t_{0.025, n−1} ≤ t₀ ≤ t_{0.025, n−1} | μ_d = δ)

Fortunately, just as in the case of the pooled t-test, the exact SMD of t₀ under the alternative H1: μ_d ≠ 0 has been known for well over 35 years. That is to say, an exact expression for the OC curve of the paired t-test already exists in statistical literature, as illustrated below.

β = Pr(−t_{0.025, n−1} ≤ d̄/(S_d/√n) ≤ t_{0.025, n−1} | μ_d = δ) (68)

where for notational convenience we will let t_{0.025} = t_{0.025, n−1} in this section.
Standardizing d̄/(S_d/√n) in Eq. (68) under the alternative H1: μ_d ≠ 0 leads to

β = Pr(−t_{0.025, n−1} ≤ d̄/(S_d/√n) ≤ t_{0.025, n−1} | μ_d = δ)
= Pr(−t_{0.025, n−1} ≤ {[(d̄ − μ_d) + μ_d]·√n/σ_d}/(S_d/σ_d) ≤ t_{0.025, n−1} | μ_d = δ)
= Pr(−t_{0.025, n−1} ≤ (Z + √n·μ_d/σ_d)/(S_d/σ_d) ≤ t_{0.025, n−1} | μ_d = δ)
= Pr(−t_{0.025, n−1} ≤ (Z + λ)/√{[(n − 1)·S_d²/σ_d²]/(n − 1)} ≤ t_{0.025, n−1} | μ_d = δ)
= Pr(−t_{0.025, n−1} ≤ (Z + λ)/√(χ²_{n−1}/(n − 1)) ≤ t_{0.025, n−1} | μ_d = δ)
= Pr(−t_{0.025, n−1} ≤ t′_{n−1}(λ) ≤ t_{0.025, n−1}) (69)
Eq. (69) shows that the exact SMD of t₀ = d̄/(S_d/√n) under the alternative H1: μ_d ≠ 0 is the noncentral t with noncentrality parameter λ = √n·μ_d/σ_d and ν = n − 1 df, while the null SMD of t₀ is the central t_{n−1} = t′_{n−1}(0). For example, suppose we wish to compute the type II error Pr when testing H0: μ_x − μ_y = μ_d = 0 at the 5% level with a random sample of size n = 10 blocks from a bivariate normal distribution versus the alternative H1: μ_d = 0.50σ_d. Thus from Eq. (69), β(at μ_d = 0.50σ_d) = Pr(−t_{0.025, 9} ≤ t′_{n−1}(λ) ≤ t_{0.025, 9}), where λ = 0.50·√10 = 1.581139. Consequently, using Matlab we obtain

β(at μ_d = 0.50σ_d, n = 10) = Pr(−2.262157 ≤ t′_9(1.581139) ≤ 2.262157)
= nctcdf(2.262157, 9, 1.581139) − nctcdf(−2.262157, 9, 1.581139)
= 0.70717140 − 0.00034704 = 0.70682435,
so that the power of the test is given by PWF(at 0.50σ_d, n = 10) = 0.29317565. It is common knowledge in the field of Statistics that the power of a test should increase with increasing sample size; a statistical test for which the limit of its PWF does not approach 1 as n → ∞ is said to be inconsistent. It is also estimated that, in order to double the power of a test, roughly more than twice the sample size is needed. For this reason, consider this last example where PWF(at 0.50σ_d) was equal to 0.29317565 with n = 10, but now we set the value of n at 20. Then, at n = 20, β(at μ_d = 0.50σ_d, n = 20) = Pr(−t_{0.025, 19} ≤ t′_{n−1}(λ) ≤ t_{0.025, 19}), where λ = 0.50·√20 = 2.236068 and t_{0.025, 19} = 2.093024:

β(at μ_d = 0.50σ_d, n = 20)
= nctcdf(2.093024, 19, 2.236068) − nctcdf(−2.093024, 19, 2.236068)
= 0.43551707475811 − 2.152176224301527×10⁻⁵ = 0.43549555299587
⟹ PWF(at 0.50σ_d, n = 20) = 0.564504447 < 2×PWF(at 0.50σ_d, n = 10).

On the other hand, in order to have the same value of PWF at μ_d = (1/2)×0.50σ_d, roughly four times the sample size is needed.
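Both paired-t computations can be reproduced with SciPy's noncentral t (the helper name `paired_t_beta` is ours, not the dissertation's):

```python
from scipy import stats

def paired_t_beta(delta_over_sigma_d, n, alpha=0.05):
    """Exact type II error Pr of the paired t-test from Eq. (69):
    beta = P(-t_{alpha/2, n-1} <= t'_{n-1}(lambda) <= t_{alpha/2, n-1}),
    with lambda = sqrt(n) * mu_d / sigma_d."""
    lam = delta_over_sigma_d * n ** 0.5
    t_crit = stats.t.ppf(1 - alpha / 2, n - 1)
    return stats.nct.cdf(t_crit, n - 1, lam) - stats.nct.cdf(-t_crit, n - 1, lam)

beta10 = paired_t_beta(0.50, 10)  # ≈ 0.706824, power ≈ 0.293176
beta20 = paired_t_beta(0.50, 20)  # ≈ 0.435496, power ≈ 0.564504
print(1 - beta10, 1 - beta20)     # doubling n less than doubles the power here
```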
As in Sections 8.1 and 8.2, the type II error Pr using the Overlap is given by

β′ = Pr{−t_{α/2, ν_y}·S_y/√n_y − t_{α/2, ν_x}·S_x/√n_x ≤ x̄ − ȳ ≤ t_{α/2, ν_x}·S_x/√n_x + t_{α/2, ν_y}·S_y/√n_y | δ ≠ 0}
= Pr{−A ≤ d̄ ≤ +A | δ ≠ 0} (70a)

where, as before, A = t_{α/2, ν_x}·S_x/√n_x + t_{α/2, ν_y}·S_y/√n_y and d̄ = x̄ − ȳ. However, because this is a block design, per force n_x = n_y = n, and as a result A = t_{α/2, n−1}·(S_x + S_y)/√n.
Following the exact same development that leads to Eq. (69), we obtain

β′ = Pr[−t_{α/2, n−1}·(S_x + S_y)/√n ≤ d̄ ≤ t_{α/2, n−1}·(S_x + S_y)/√n]
= Pr[−t_{α/2, n−1}·(S_x + S_y)/S_d ≤ d̄·√n/S_d ≤ t_{α/2, n−1}·(S_x + S_y)/S_d]
= Pr[−t_{α/2, n−1}·(S_x + S_y)/S_d ≤ {[(d̄ − μ_d) + μ_d]·√n/σ_d}/(S_d/σ_d) ≤ t_{α/2, n−1}·(S_x + S_y)/S_d]
= Pr[−t_{α/2, n−1}·(S_x + S_y)/S_d ≤ (Z + λ)/√{[(n − 1)·S_d²/σ_d²]/(n − 1)} ≤ t_{α/2, n−1}·(S_x + S_y)/S_d]
= Pr[−t_{α/2, n−1}·(S_x + S_y)/S_d ≤ t′_{n−1}(λ) ≤ t_{α/2, n−1}·(S_x + S_y)/S_d] (70b)

where λ = √n·μ_d/σ_d. Because S_d = √(S_x² + S_y² − 2r·S_x·S_y), it follows that S_d ≤ S_x + S_y, with equality occurring iff the sample correlation coefficient r = −1. On comparing the expression for β in Eq. (69) with that of β′ in (70b), it is clear that β ≤ β′ because t_{α/2, n−1}·(S_x + S_y)/S_d ≥ t_{α/2, n−1}. Further, dividing the numerator and denominator of (S_x + S_y)/S_d by S_y, we obtain (S_x + S_y)/S_d = (√F₀ + 1)/√(F₀ + 1 − 2r·√F₀). Substituting this last into (70b) results in

β′ = Pr[−t_{α/2, n−1}·(√F₀ + 1)/√(F₀ + 1 − 2r·√F₀) ≤ t′_{n−1}(λ) ≤ t_{α/2, n−1}·(√F₀ + 1)/√(F₀ + 1 − 2r·√F₀)] (70c)
The final expression for the Overlap type II error Pr in Eq. (70c) clearly shows that the value of β′ depends only on the sample size n, the noncentrality parameter λ = √n·μ_d/σ_d, the sample correlation coefficient r, and the sample variance ratio F₀ = S_x²/S_y², but does not depend on the specific values of S_x² and S_y². Only when r = −1 does the value of β′ equal β; otherwise β′ > β. Further, the limit of β′ as F₀ → 0 or as F₀ → ∞ is also equal to β. It can easily be shown using calculus that the function (√F₀ + 1)/√(F₀ + 1 − 2r·√F₀) in the argument of β′ in Eq. (70c) attains its maximum at F₀ = 1, with its value equal to √(2/(1 − r)). Thus, for a 5%-level test, the least upper bound for β′ is given by

LUB(β′) = Pr[−t_{0.025, n−1}·√(2/(1 − r)) ≤ t′_{n−1}(λ) ≤ t_{0.025, n−1}·√(2/(1 − r))] (71)
As r → −1, LUB(β′) → β, but as r → +1, LUB(β′) → 1. Thus the impact of negative correlation is to reduce the Overlap type II error Pr, while the impact of positive correlation is to increase β′. As an example, for a random bivariate sample of n = 10 pairs, a 5%-level test, and r = −0.50, Eq. (71) at μ_d = 0.50σ_d yields

LUB(β′) = Pr[−2.262157·√(2/1.5) ≤ t′_9(1.581139) ≤ 2.262157·√(2/1.5)]
= Pr[−2.612114 ≤ t′_9(1.581139) ≤ 2.612114]
= 0.793446 − 0.0001628 = 0.793283,

as compared to β = 0.70682435 from the Standard method. However, if r were equal to +0.50, then

LUB(β′) = Pr[−2.262157·√(2/0.5) ≤ t′_9(1.581139) ≤ 2.262157×2]
= Pr[−4.524314 ≤ t′_9(1.581139) ≤ 4.524314] = 0.976860.
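A numeric check of Eq. (71), again assuming SciPy is available (`paired_overlap_lub` is a hypothetical helper name):

```python
from scipy import stats

def paired_overlap_lub(delta_over_sigma_d, n, r, alpha=0.05):
    """LUB of the paired-design Overlap type II error Pr, Eq. (71):
    the multiplier sqrt(2/(1 - r)) on t_{alpha/2, n-1} is the maximum of
    (sqrt(F0) + 1)/sqrt(F0 + 1 - 2r*sqrt(F0)), attained at F0 = 1."""
    lam = delta_over_sigma_d * n ** 0.5
    a = stats.t.ppf(1 - alpha / 2, n - 1) * (2 / (1 - r)) ** 0.5
    return stats.nct.cdf(a, n - 1, lam) - stats.nct.cdf(-a, n - 1, lam)

print(paired_overlap_lub(0.50, 10, -0.50))  # ≈ 0.793283
print(paired_overlap_lub(0.50, 10, +0.50))  # ≈ 0.976860
```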
9.0 Conclusions and Future Research
Chapter 3 used normal theory with known variances to prove results that already existed in the Overlap literature, some of which had been obtained through simulation. It was proved that for a nominal significance level α = 0.05, the corresponding 95% overlapping CIs provide a much smaller LOS α′ = 0.0055746, which fully agrees with the value computed from Eq. (7) on p. 184 of Schenker et al. (2001). Schenker et al. provide their results without any proof. Further, Chapter 3 proved that for a LOS of 0.01, the corresponding Overlap LOS was α′ = 0.0002697, while the literature provides results only for the nominal LOS of 5%. Moreover, the smaller the LOS of the Standard method becomes, the larger is the % relative error of the Overlap LOS. Although the Overlap literature has never considered the one-sided alternative, Chapter 3 showed that the one-sided Overlap LOS is half that of the corresponding two-sided alternative (i.e., the Overlap procedure becomes even more conservative for a one-sided alternative).
Second, a concept that has not been discussed in the Overlap literature is the maximum % overlap that the two independent CIs can have such that H0: μ_x = μ_y still cannot be rejected at a preassigned LOS α. It was proven that this maximum % overlap depends only on the SE ratio [k = (σ_y/√n_y)/(σ_x/√n_x) or k = (σ_x/√n_x)/(σ_y/√n_y)]; it is equal to 17.1573% at k = 1 and diminishes to zero as k → ∞ or zero. At k = 10, it was shown that the maximum % overlap reduces to 4.5137%, so that the Overlap procedure converges to an exact α-level test for limiting values of k.
Third, the chapter showed that the two independent CIs must each have a confidence level of 1 − γ = 1 − 2Φ(−Z_{α/2}/√2) in order to provide an exact α-level test. This last formula gives a confidence level of 0.931452 for both independent intervals at α = 0.01, and 1 − γ = 0.83422373 at α = 0.05. This latter value is in perfect agreement with the Overlap literature, while the former value of 1 − γ = 0.931452 has not been reported.
Finally, the Overlap procedure leads to less statistical power compared to the Standard method, and its RELEFF for small sample sizes is poor and depends heavily on δ/σ, but its asymptotic RELEFF is 100% as n → ∞. For the simplest case of σ_x = σ_y and n_x = n_y, an exact formula (15e) was obtained for the RELEFF of Overlap relative to the Standard method.

Chapter 4 investigated the Bonferroni Overlap CIs against the Standard procedure and determined that the Bonferroni concept makes the Overlap even more conservative and lose even more statistical power.
Chapter 5 examined the overlapping CIs for two process variances against the Standard method, which uses Fisher's F distribution; the Overlap literature has not investigated the Overlap procedure for variance ratios. As in the case of process means, the Overlap reduces the LOS of the test, and the limiting value of α′ at α = 0.05 and k = 1 is roughly 0.0055746, while as k → ∞ or zero, the Overlap approaches an exact α-level test.

Second, the limiting value of the maximum % overlap that does not reject H0: σ_x = σ_y is exactly 17.15726%, as was the case for two process means.
Third, the individual confidence levels have to be set at the value γ obtained from Eq. (31b), which equates a ratio of the χ²_{γ/2} and χ²_{1−γ/2} quantiles to F_{α/2, ν_x, ν_y}; the limiting value of γ is 0.165766 at k = 1, just like the case of means, and as k → ∞ or zero, γ → 0.

Last, the power of the Overlap procedure is always less than that of the Standard method, but approaches the Standard's as k → ∞ or zero. The asymptotic RELEFF of Overlap relative to the Standard method is 100% as n_x & n_y → ∞.
Chapter 6 examined the impact of Overlap on type I error Pr in the normal case with unknown variances but sample sizes ≤ 50, using the pooled-t and two-independent-sample t statistics, and also the effect of positive and negative correlations on the Overlap procedure. Specific formulas for α′ of the pooled t-test (37c), the two-independent-sample t-test (41b), and the paired t-test (44a) were derived and documented. The Overlap literature has not considered the pooled t-test.
Chapter 7 used the pooled t-statistic to derive an expression for the % overlap, ω_r, below which H0: μ_x = μ_y cannot be rejected at the α level. Unlike the simple case of known variances, where ω_r depends only on the SE ratio k, when the process variances are unknown and sample sizes are not large, ω_r depends on n_x, n_y, F₀ = S_x²/S_y², and α. For the case of the two-independent-sample t-statistic, ω_r depends on n_x, n_y, k, and α, while for the paired t-test it depends only on the correlation coefficient between X and Y and F₀ = S_x²/S_y². For all 3 cases, Chapter 7 also derived expressions for the individual confidence levels, 1 − γ, that provide an α-level test by the Overlap method. In the case of the pooled t-test, γ depends only on n_x, n_y, and F₀. For the two-independent-sample t-test, γ depends on n_x, n_y, and F₀, while for the paired t-test it depends only on n, r, and F₀.
130
Chapter 8 used the noncentral tdistribution to derive formulas for the OC curves
(and also power functions) for the case of underlying normal distribution with unknown
variances and moderate to small sample sizes n ? 50, the results of which have been
available in statistical literature for more than 35 years. However, the chapter also
derived formulas for type II error Pr of Overlap (??) using the noncentral t. The exact
results obtained for this latter case have not been available in statistical literature.
As further research, one could consider the Overlap problem for other normal parameters, such as the coefficient of variation σ/μ [see Vangel (1996) and Payton (1996)] and the quantiles μ + Z_γ σ, 0 < γ < 1. Further, we suspect that the SMD of S (= the standard deviation of a sample) from a nonnormal population approaches normality, toward N[σ, σ²/(2n)], but very agonizingly slowly (n > 100). The exact SMD of S from a N(μ, σ²) was documented in the statistical literature more than 50 years ago. For an underlying normal population, it is also widely known that an n > 75 is needed in order for S to be roughly normally distributed according to N[c₄σ, (1 − c₄²)σ²], where c₄ = √(2/(n − 1)) × Γ(n/2)/Γ[(n − 1)/2] < 1 is a well-known QC constant. Note that the approximate V(S) generally reported in the statistical literature is σ²/(2n), but we know for a fact that σ²/[2(n − 0.745)] is a better approximation to the exact variance of S from a N(μ, σ²), which is given by V(S) = (1 − c₄²)σ². Unfortunately, the farther the skewness and kurtosis of an underlying nonnormal distribution are from zero, the larger the sample size needed for the SMD of S to exhibit normality. Thus, if the underlying distribution is nonnormal, only the limiting comparison of the Standard CI to Overlap may be accomplished based on CIs of σ_x and σ_y. Also, we have not yet seen the impact of overlapping CIs on parameters of other underlying distributions such as the Uniform, Weibull, and Beta.
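The comparison between the two variance approximations for S can be checked numerically. Below is a minimal Python sketch (pure standard library; the helper name c4 is illustrative, not from the dissertation) that computes the exact c₄ constant via log-gamma and confirms that σ²/[2(n − 0.745)] tracks the exact V(S) = (1 − c₄²)σ² more closely than σ²/(2n) does:

```python
import math

def c4(n):
    """Exact QC constant c4 = sqrt(2/(n-1)) * Gamma(n/2) / Gamma((n-1)/2),
    computed with log-gamma to stay numerically stable for large n."""
    return math.sqrt(2.0 / (n - 1)) * math.exp(math.lgamma(n / 2) - math.lgamma((n - 1) / 2))

n = 75
sigma2 = 1.0                          # take sigma = 1 without loss of generality
exact = (1 - c4(n) ** 2) * sigma2     # exact V(S) for a normal sample of size n
crude = sigma2 / (2 * n)              # the commonly reported approximation
better = sigma2 / (2 * (n - 0.745))  # the refined approximation from the text

assert c4(n) < 1
# the refined denominator 2(n - 0.745) lies closer to the exact variance
assert abs(better - exact) < abs(crude - exact)
```

For n = 75 the refined approximation agrees with the exact variance to several more digits than σ²/(2n), consistent with the claim above.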
10.0 References
[ 1 ] Brownlee, K. A. (1965), Statistical Theory and Methodology in Science and
Engineering, Wiley, NY.
[ 2 ] Cole, S. R. and Blair, R. C. (1999), Overlapping Confidence Intervals. Journal
of the American Academy of Dermatology, 41(6), pp. 1051–1052.
[ 3 ] Devore, J. L. (2008), Probability and Statistics, Thomson Brooks/Cole, Canada.
[ 4 ] Djordjevic, M. V., Stellman, S. D. and Zang, E. (2000), Doses of Nicotine and Lung
Carcinogens Delivered to Cigarette Smokers. Journal of the National Cancer
Institute, 92(2), pp. 106–111.
[ 5 ] Goldstein, H. and Healy, M. J. R. (1995), The Graphical Presentation of a Collection
of Means. Journal of the Royal Statistical Society A, 158, pp. 175–177.
[ 6 ] Hool, J. N. and Maghsoodloo, S. (1980), Normal Approximation to Linear
Combinations of Independently Distributed Random Variables. AIIE Transactions,
12, pp. 140–144.
[ 7 ] Johnson, N. L., Kotz, S. and Balakrishnan, N. (1995), Continuous Univariate
Distributions, 2nd edition, John Wiley & Sons, Inc.
[ 8 ] Kelton, W. D., Sadowski, R. P. and Sturrock, D. T. (2004), Simulation with Arena,
pp. 265–268, McGraw-Hill Companies, Inc., NY.
[ 9 ] Kendall, M. G. and Stuart, A. (1963), The Advanced Theory of Statistics, Charles
Griffin & Company Limited, London.
[10 ] Maghsoodloo, S. and Hool, J. N. (1981), On Normal Approximation of Simple
Linear Combinations. The Journal of the Alabama Academy of Sciences, 52(4),
pp. 207–219.
[11 ] Mancuso, C. A., Peterson, M. G. E. and Charlson, M. E. (2001), Comparing
Discriminative Validity Between a Disease-Specific and a General Health Scale in
Patients with Moderate Asthma. Journal of Clinical Epidemiology, 54, pp. 263–274.
[12 ] Montgomery, D. C. and Runger, G. C. (1994), Applied Statistics and Probability for
Engineers, John Wiley & Sons, Inc., p. 411.
[13 ] Payton, M. E. (1996), "Confidence intervals for the coefficient of variation," Proc.
Kansas State Univ. Conf. on Applied Statistics in Agriculture, 8, pp. 82–87.
[14 ] Payton, M. E., Miller, A. E. and Raun, W. R. (2000), Testing Statistical
Hypotheses Using Standard Error Bars and Confidence Intervals. Communications
in Soil Science and Plant Analysis, 31, pp. 547–552.
[15 ] Payton, M. E., Greenstone, M. H. and Schenker, N. (2003), Overlapping confidence
intervals or standard error intervals: What do they mean in terms of statistical
significance? The Journal of Insect Science, 3, pp. 34–39.
[16 ] Schenker, N. and Gentleman, J. F. (2001), On Judging the Significance of
Differences by Examining the Overlap Between Confidence Intervals. The
American Statistician, 55, pp. 182–186.
[17 ] Sont, W. N., Zielinski, J. M., Ashmore, J. P., Jiang, H., Krewski, D., Fair, M. E.,
Band, P. R. and Letourneau, E. G. (2001), First Analysis of Cancer Incidence and
Occupational Radiation Exposure Based on the National Dose Registry of Canada.
American Journal of Epidemiology, 153(4), pp. 309–318.
[18 ] Tersmette, A. C., Petersen, G. M., Offerhaus, G. J. A., Falatko, F. C., Brune, K. A.,
Goggins, M., Rozenblum, E., Wilentz, R. E., Yeo, C. J., Cameron, J. L., Kern, S. E.
and Hruban, R. H. (2001), Increased Risk of Incident Pancreatic Cancer Among First-
degree Relatives of Patients with Familial Pancreatic Cancer. Clinical Cancer
Research, 7, pp. 738–744.
[19 ] Vangel, M. G. (1996), "Confidence intervals for a normal coefficient of variation,"
The American Statistician, 50, pp. 21–26.
APPENDICES
Appendix A: The Kurtosis of the sum of n independent Uniform, U(0, 1), distributions.
Appendix B: Matlab functions
Appendix A: The Kurtosis of the sum of n independent uniform, U(0, 1), distributions
Suppose x₁, x₂, …, xₙ are independent and each uniformly distributed over the
real interval [0, 1]. It is well known that the first four moments of each xᵢ are given by
μ = 1/2, μ₂ = V(Xᵢ) = 1/12 = σ², μ₃ = 0 (by symmetry), and μ₄ = 1/80, so that
β₄ = μ₄/σ⁴ = 144/80 = 1.80 and the kurtosis of each xᵢ is equal to α₄ = β₄ − 3 = −1.20.
Now consider the sum Yₙ = Σᵢ₌₁ⁿ xᵢ; our objective is to compute the first four
moments of Yₙ from the known moments of each xᵢ, i = 1, 2, …, n. Clearly, the mean of Yₙ
is given by E(Yₙ) = n/2, the variance is given by V(Yₙ) = nV(Xᵢ) = n/12, μ₃(Yₙ) = 0 by
symmetry, and μ₄(Yₙ) is computed below.
μ₄(Yₙ) = E[Σᵢ₌₁ⁿ xᵢ − n/2]⁴ = E[Σᵢ₌₁ⁿ (xᵢ − 1/2)]⁴
       = E[Σᵢ₌₁ⁿ (xᵢ − 1/2)⁴ + ₄C₂ ΣΣ(i>j) (xᵢ − 1/2)²(xⱼ − 1/2)²]
       = nμ₄(xᵢ) + 6 × ₙC₂ × V(Xᵢ)V(Xⱼ)
Note that in the multinomial expansion of [Σᵢ₌₁ⁿ (xᵢ − 1/2)]⁴ the expectations of odd products
such as E[(x₁ − 1/2)(x₂ − 1/2)³] vanish due to the mutual independence of xᵢ and xⱼ for all i ≠ j.
Hence, μ₄(Yₙ) = n/80 + 3n(n − 1)σ⁴ = n/80 + 3n(n − 1)/144 = n/80 + n(n − 1)/48.
Thus, β₄(Yₙ) = μ₄(Yₙ)/[V(Yₙ)]² = [n/80 + n(n − 1)/48]/(n/12)²
            = [144/80 + 144(n − 1)/48]/n = [1.80 + 3(n − 1)]/n
⇒ β₄(Yₙ) = 3 − 1.20/n
⇒ α₄(Yₙ) = β₄(Yₙ) − 3 = −1.20/n
Thus for a 2-fold convolution of U(0, 1) the kurtosis is −1.20/n = −0.60, while for a
6-fold convolution the kurtosis of Σᵢ₌₁⁶ xᵢ is equal to −1.20/n = −0.20.
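The multinomial bookkeeping above is easy to get wrong, so the identities μ₄(Yₙ) = n/80 + n(n − 1)/48 and α₄(Yₙ) = −1.20/n can be verified by brute force. A small Python sketch using exact rational arithmetic (the names are illustrative, not part of the dissertation's code):

```python
from fractions import Fraction
from itertools import product

# Exact central moments of a single U(0,1): E[(x - 1/2)^k], k = 0..4
m = [Fraction(1), Fraction(0), Fraction(1, 12), Fraction(0), Fraction(1, 80)]

def mu4_sum(n):
    """Exact 4th central moment of Y_n = x_1 + ... + x_n by brute-force
    expansion of E[(sum (x_i - 1/2))^4], using independence to factor
    each expectation across distinct indices."""
    total = Fraction(0)
    for idx in product(range(n), repeat=4):
        term = Fraction(1)
        for i in set(idx):
            # each distinct index contributes the central moment of
            # order equal to its multiplicity in the 4-tuple
            term *= m[idx.count(i)]
        total += term
    return total

for n in (2, 3, 6):
    mu4 = mu4_sum(n)
    assert mu4 == Fraction(n, 80) + Fraction(n * (n - 1), 48)
    var = Fraction(n, 12)
    kurtosis = mu4 / var**2 - 3
    assert kurtosis == Fraction(-6, 5) / n   # = -1.20/n exactly
```

Odd-multiplicity terms vanish automatically here because μ₁ = μ₃ = 0, mirroring the argument in the text.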
Appendix B: Matlab functions
(a) The following three Matlab functions compute the Overlap significance level, α′, for
the pooled t-test, the two-independent-sample t-test, and the paired t-test, respectively, at a
given significance level α = a, sample sizes n_x & n_y, and sample variance ratio F₀ = Sx²/Sy².
1. function y = aprP(a,nx,ny,F0)
tx = tinv(1-a/2,nx-1); ty = tinv(1-a/2,ny-1); nu = nx+ny-2;
RHS = nu*(tx*sqrt(F0*ny)+ty*sqrt(nx))^2/((ny-1+F0*(nx-1))*(nx+ny));
y = 1-fcdf(RHS,1,nu);
2. function y = apr(a,nx,ny,F0)
tx = tinv(1-a/2,nx-1); ty = tinv(1-a/2,ny-1); Rn = ny/nx;
nu = (nx-1)*(ny-1)*(1+F0*Rn)^2/(nx-1+(ny-1)*(F0*Rn)^2);
RHS = (tx*sqrt(F0*Rn)+ty)^2/(1+F0*Rn);
y = 1-fcdf(RHS,1,nu);
3. function y = aprc(a,n,F0,r)
F1 = finv(1-a,1,n-1);
RHS = F1*(sqrt(F0)+1)^2/(1+F0-2*r*sqrt(F0));
y = 1-fcdf(RHS,1,n-1);
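As an independent cross-check of the conservatism these functions quantify, the effective level of the Overlap rule can also be estimated by simulation. The following Python sketch (not part of the dissertation's Matlab code; the quantiles t(0.975, 3) = 3.182 and t(0.975, 7) = 2.365 are hard-coded, and the other names are illustrative) draws both samples under H₀ and counts how often two individual 95% CIs fail to overlap:

```python
import math
import random
import statistics

# Fixed two-sided 97.5th-percentile t quantiles for df = 3 and df = 7
# (i.e., sample sizes nx = 4 and ny = 8)
T975 = {3: 3.182, 7: 2.365}

def overlap_rejects(nx=4, ny=8, reps=20000, seed=1):
    """Monte Carlo estimate of the effective significance level of the
    Overlap rule: declare mu_x != mu_y only when the two individual 95%
    CIs fail to overlap, with both samples drawn under H0 from N(0, 1)."""
    rng = random.Random(seed)
    count = 0
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(nx)]
        y = [rng.gauss(0, 1) for _ in range(ny)]
        hx = T975[nx - 1] * statistics.stdev(x) / math.sqrt(nx)
        hy = T975[ny - 1] * statistics.stdev(y) / math.sqrt(ny)
        mx, my = statistics.fmean(x), statistics.fmean(y)
        # the CIs fail to overlap iff one lower limit exceeds the other upper limit
        if mx - hx > my + hy or my - hy > mx + hx:
            count += 1
    return count / reps

rate = overlap_rejects()
# the Overlap rule at individual 95% confidence is far more conservative than 0.05
assert rate < 0.03
```

The estimated rate should fall well below the nominal 0.05, consistent with the conservatism of the Overlap method documented in the body of the dissertation.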
(b) The following Matlab functions compute the overlap proportion for the pooled t-test,
the two-independent-sample t-test, and the paired t-test, respectively, at a given
significance level α = a, sample sizes n_x & n_y, and sample variance ratio F₀ = Sx²/Sy².
1. function y = OmegaP(a,nx,ny,F0)
Rn = ny/nx; nu = nx+ny-2; n1 = nx-1; n2 = ny-1;
NUM = tinv(1-a/2,n1)*sqrt(F0*Rn)+tinv(1-a/2,n2)-tinv(1-a/2,nu)*sqrt((1+Rn)*(n1*F0+n2)/nu);
DEN = tinv(1-a/2,n1)*sqrt(F0*Rn)+tinv(1-a/2,n2)+tinv(1-a/2,nu)*sqrt((1+Rn)*(n1*F0+n2)/nu);
y = NUM./DEN;
2. function y = Omega(a,nx,ny,F0)
Rn = ny/nx; n1 = nx-1; n2 = ny-1;
nu = (n1*n2*(F0*Rn+1)^2)/(n2*(Rn*F0)^2+n1);
NUM = tinv(1-a/2,n1)*sqrt(Rn*F0)+tinv(1-a/2,n2)-tinv(1-a/2,nu)*sqrt(F0*Rn+1);
DEN = tinv(1-a/2,n1)*sqrt(Rn*F0)+tinv(1-a/2,n2)+tinv(1-a/2,nu)*sqrt(F0*Rn+1);
y = NUM./DEN;
3. function y = OmegaC(F0,r)
NUM = sqrt(F0)+1-sqrt(1+F0-2*r*sqrt(F0));
DEN = sqrt(F0)+1+sqrt(1+F0-2*r*sqrt(F0));
y = NUM./DEN;
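Since OmegaC depends only on F₀ and r, it ports directly to other languages. A Python equivalent (an illustrative sketch, not part of the dissertation's code) with two sanity checks:

```python
import math

def omega_c(F0, r):
    """Python port of the paired-test overlap proportion OmegaC(F0, r),
    which depends only on the variance ratio F0 = Sx^2/Sy^2 and the
    sample correlation r between the paired observations."""
    root = math.sqrt(1 + F0 - 2 * r * math.sqrt(F0))
    return (math.sqrt(F0) + 1 - root) / (math.sqrt(F0) + 1 + root)

# With equal variances (F0 = 1) and uncorrelated pairs (r = 0), the
# proportion reduces to (2 - sqrt(2)) / (2 + sqrt(2))
assert abs(omega_c(1.0, 0.0) - (2 - math.sqrt(2)) / (2 + math.sqrt(2))) < 1e-12
# positive correlation sharpens the paired test, so a larger overlap
# proportion can still yield significance
assert omega_c(1.0, 0.9) > omega_c(1.0, 0.0)
```

As r → 1 with F₀ = 1, the square-root term vanishes and the proportion tends to 1.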
(c) The following Matlab code computes the value of γ that provides an α-level test for
the two-independent-sample t-test.
a=0.05;
nx=4;
ny=8;
F0=1.5;
Rn=ny/nx; n1=nx-1;
n2=ny-1;
nu=(n1*n2*(F0*Rn+1)^2)/(n2*(Rn*F0)^2+n1);
RHS=tinv(1-a/2,nu)*sqrt(Rn*F0+1);
c(1)=a;
for i=2:25
    c(i)=c(i-1)+0.005;
    LHS(i)=tinv(1-c(i)/2,n2)+tinv(1-c(i)/2,n1)*sqrt(Rn*F0);
end
for i=2:25
    if RHS-0.005 <= LHS(i) && LHS(i) <= RHS+0.005
        g=c(i)  % display the matching level gamma
        break   % stop once a match is found
    end
end