COMPARING THE OVERLAPPING OF TWO INDEPENDENT CONFIDENCE INTERVALS WITH A SINGLE CONFIDENCE INTERVAL FOR TWO NORMAL POPULATION PARAMETERS Except where reference is made to the work of others, the work described in this dissertation is my own or was done in collaboration with my advisory committee. This dissertation does not include proprietary or classified information. Ching Ying Huang Certificate of Approval: Alice E. Smith, Professor, Industrial and Systems Engineering; Saeed Maghsoodloo (Chair), Professor, Industrial and Systems Engineering; Kevin T. Phelps, Professor, Mathematics and Statistics; George T. Flowers, Dean, Graduate School. COMPARING THE OVERLAPPING OF TWO INDEPENDENT CONFIDENCE INTERVALS WITH A SINGLE CONFIDENCE INTERVAL FOR TWO NORMAL POPULATION PARAMETERS Ching-Ying Huang A Dissertation Submitted to the Graduate Faculty of Auburn University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy Auburn, Alabama December 19, 2008 COMPARING THE OVERLAPPING OF TWO INDEPENDENT CONFIDENCE INTERVALS WITH A SINGLE CONFIDENCE INTERVAL FOR TWO NORMAL POPULATION PARAMETERS Ching Ying Huang Permission is granted to Auburn University to make copies of this dissertation at its discretion, upon request of individuals or institutions and at their expense. The author reserves all publication rights. Signature of Author. Date of Graduation. VITA Ching-Ying Huang, daughter of Shuh-Peir Huang and Yueh-E Lin, was born April 14, 1979, in Taipei, Taiwan. She entered Chang Gung University at Taoyuan, Taiwan, in September 1998 and received the Bachelor of Science degree in Business Administration in June 2002. She began her graduate program in Industrial and Systems Engineering at Auburn University in August 2003. She married Chih-Wei Chiang on December 30, 2006. DISSERTATION ABSTRACT COMPARING THE OVERLAPPING OF TWO INDEPENDENT CONFIDENCE INTERVALS WITH A SINGLE CONFIDENCE INTERVAL FOR TWO NORMAL POPULATION PARAMETERS Ching-Ying Huang Doctor of Philosophy, December 19, 2008 (B.S., Chang Gung University, 2002) 156 Typed Pages Directed by Saeed Maghsoodloo Two overlapping confidence intervals have been used in many sources over the past 30 years to conduct statistical inferences about two normal population means (μ_x and μ_y). Several authors have examined the shortcomings of the Overlap procedure in the past 13 years and have determined that the method completely distorts the significance level of testing the null hypothesis H0: μ_x = μ_y and reduces the statistical power of the test. Nearly all results for small sample sizes in the Overlap literature have been obtained either by simulation or by somewhat inaccurate formulas, and only large-sample (or known-variance) exact information has been provided. Nevertheless, there are many aspects of Overlap that have not yet been presented in the literature and compared against the standard statistical procedure. This dissertation will present exact formulas for the percent overlap, ranging in the interval (0, 61.3626%] for a 0.05-level test, that two independent confidence intervals (CIs) can have while the null hypothesis of equality of two population means must still be rejected at a pre-assigned level of significance α, for sample sizes ≥ 2.
The exact impact of Overlap on the ?-level and the power of pooled-t test will also be presented. Further, the impact of Overlap on the power of the F-statistic in testing the null hypothesis of equality of two normal process variances will be assessed. Finally, we will use the noncentral t distribution, which has never been applied in Overlap literature, to assess the Overlap impact on type II error probability when testing H 0 : ? x = ? y for sample sizes n x and n y ? 2. vii ACKNOWLEDGEMENT The author would like to thank her parents, Shuh-Peir Huang and Yueh E Lin, and two sisters, Yu-Hsin and Yu-Li Huang, for their positive attitude and encouragement. Especially, she dedicates this work to her husband, Chih-Wei Chiang, whose patience and moral support have made it possible. The author expresses her gratitude to Professor Saeed Maghsoodloo for his experienced knowledge and assistance on this research and Professor Alice E. Smith for her support during the author?s graduate program. The author is also thankful to Professor Kevin Phelps for helpful suggestions. viii Computer software used: MS Excel MS Word MathType Matlab ix TABLE OF CONTENTS List of Tables ??????????????????????????? xi List of Notation ??????????????????????................ xiii 1.0 Introduction ?????????????????????????.. 1 2.0 Literature Review ??????????????????????? 6 3.0 Comparing Two Normal Population Means for the Known Variances Case, or the Limiting Unknown Variances Case Where Both Sample Sizes Approach Infinity ..??????????????????????????????. 12 3.1 The Case of xy ? ??== ?.?????????????????. 13 3.2 The Case of Known but Unequal Variances ???????????. 27 4.0 Bonferroni Intervals for Comparing Two Sample Means ???????... 44 5.0 Comparing the Overlap of Two Independent CIs with a Single CI for the Ratio of Two Normal Population Variances ???? ????????????. 52 6.0 The Impact of Overlap on Type I Error Probability of H 0 : ? x = ? y for Unknown Normal Process Variances and Sample Sizes ????????????... 71 6.1 The case of H 0 : xy ? ??= = Not Rejected Leading to the Pooled t-test .. 72 6.2 The Case of H 0 : x y ? ?= Rejected Leading to the Two- Independent Sample t-Test ?????????????????????????? 77 6.3 Comparing the Paired t-CI with Two Independent t-CIs ??????. 85 x 7.0 The Percent Overlap that Leads to the Rejection of H 0 : x y ? ?= 7.1 The case of Unknown xy ? ??= = ??????????????. 92 7.2 The Case of H 0 : x y ? ?= Rejected Leading to the Two- Independent Sample t-Test ?????????????????????????? 98 7.3 Comparing the Paired t-CI with Two Independent t-CIs ?????? 97 8.0 The Impact of Overlap on Type II Error Probability for the Case of Unknown Process Variances 2 x ? , 2 y ? and Small to Moderate Sample Sizes ????. 108 8.1 The case of H 0 : xy ? ??= = Not Rejected Leading to the Pooled t- Test ????????????????????????????. 109 8.2 The Case of H 0 : x y ? ?= Rejected Leading to the Two- Independent Sample t-Test (or the t-Prime Test) ????????????........................ 117 8.3 The Impact of Overlap on Type II Error Probability for the Paired t-Test (i.e., the Randomized Block Design) when Process Variances are Unknown ????????????????????????????. 121 9.0 Conclusions and Future Research ????????????????.. 127 10.0 References ?????????????????????????.... 132 Appendices ???????????????????????????? 136 Appendix A ?????????????????????????.. 137 Appendix B ?????????????????????????.. 139 xi LIST OF TABLES Table 1. 
The Relative Power of Overlap as Compared to the Standard Method for Different Sample Sizes n and /( 2)?? Combinations ??????? 24 Table 2. Summary Conclusion of ? and ?? ???????????????. 26 Table 3. Type ? Error Prs at ? = 0.05 from the Standard Method and at ? =0.16578 from the Overlap Method ??????????????????.. 26 Table 4. The Type I Error Pr of Two Individual CIs with Different k at ? = 0.05 and 0.01 ???????????????????????????. 30 Table 5. Values of ? Versus k at ? = 0.05 and ? = 0.01 ??????????. 34 Table 6A. The Relative Power of Overlap with the Standard Method for Different Sample Sizes n and /( 2)?? Combinations for the Case of Known but Unequal Variances ????????????????????? 38 Table 6B. RELEFF of Overlap to the Standard Method at ? = 0.05 and K=1 ??.. 42 Table 7. Type I Errors for Overlap and Bonferroni Methods at ? =0.05 ???? 47 Table 8. The Impact of Bonferroni on Percent Overlap at Different k ????? 49 Table 9. Type ? Error Pr for the Standard, Overlap, and Bonferroni Methods with Different k and d Combinations ????????????????. 50 Table 10. The Values of ? and ?' for Various Values of ? x and ? y ??????.. 56 Table 11. The Impact of Overlap on Type I error Pr for the Equal-Sample Size Case xii When Testing the Ratio 22 xy /? ? Against 1 ???????????? 58 Table 12. The % Overlap for the Different Combinations of Degree of Freedom at ? =0.05. ?????????????????????????.. 60 Table 13. The % Overlap for the Case of ? =0.05 and n x = x y = n ??????? 61 Table 14. The Overlap Significance Level,? , that Yields the Same 5%-Level Test or 1%-Level Test by the Standard Method ???????????? 64 Table 15. The Overlap Significance Level, ? , That Yields the Same 5%-Level Test or 1%-Level Test by the Standard Method at Fixed y ? and Changing x ? .. 65 Table 16. The Relative Power of the Overlap to the Standard Method for Different df Combinations at ?=1.2 ?????????????????? 67 Table 17. Type II Error for Different Degrees of Freedom ?????????.. 67 Table 18. Type II Error Pr for Overlap Method at Different ? and ? Combinations.. 69 Table 19. Comparison of Exact Type II Error Pr with That of the Overlap Method for Different df and ? Combinations ??????????????. 70 Table 20. The Pooled ?? Values for Different n x , n y and F 0 Combinations ???. 78 Table 21. Verifying the Inequality that min( ? x , ? y ) < ? < ? x + ? y for Different x ? and y ? Combinations ?????????????????????.. 80 Table 22. The Value of r ? for Different F 0 and R n Combinations ??????? 96 Table 23. The? Value for Different Combinations of x n , y n and n R at Either 0 F = 0.90, , x y F ? ? or 0 F = 0.05, , x y F ? ? ?????????????????. 104 xiii LIST OF NOTATION SMD sampling distribution x ? mean of population X y ? mean of population Y x mean of sample X y mean of sample Y 2 x S variance of sample X = 2 ()/(1) 1 n x xx n x i i ? ? ? = 2 y S variance of sample Y 2 x ? variance of population X 2 y ? variance of population Y 0 H null hypothesis 1 H alternative hypothesis CI confidence interval CIL confidence interval length ()L ? lower (1?? )% CI limit for? ()U ? upper (1?? )% CI limit for? L A lower bound for acceptance interval xiv U A upper bound for acceptance interval Pr probability LOS level of significance ? type ? error Pr = Pr(reject H 0 | H 0 is true) by the Standard Method ?? type I error Pr from two overlapping CIs 1 ?? type I error Pr from two overlapping CIs for the one-sided alternative B ?? type I error Pr using the Bonferroni procedure ? type ? error Pr = Pr(not rejecting H 0 | H 0 is false) ?? type ? 
error Pr from overlapping CIs 1 ? type ? error Pr for the one-sided alternative B ?? Bonferroni type ? error Pr SE population standard error se sample standard error K Standard Error ratio for populations, K= (/ )/(/ )nn x xyy ?? k standard error ratio for samples, k = (/ )/(/ ) x xyy SnSn df degrees of freedom; the symbol ? will denote df OC Curve operating characteristic curve ? amount of overlap length between the two individual CIs ? r borderline value of ? at which H 0 is barely rejected at the LOS ? . ? the exact percentage of the overlap r ? maximum percent overlap below which H 0 must be rejected at or below ? xv N(?, ? 2 ) a normal Pr density function (pdf) with population mean ? and population variance ? 2 ?(z) the cumulative distribution function (cdf) of the standard normal density at point z. PWF power Function (The graph of 1?? versus the parameter under H 0 ) F 0 the ratio of two sample variances, 22 / x y SS R n the ratio of two sample sizes, / y x nn ? x y ? ?? ? equals 22 ()// / x yxy nn?? ? ??+, which represents the noncentrality parameter of the pooled-t statistic ? studentized ? = ? x ?? y when 22 x y ? ?? , i.e., ? = 22 (/)(/)()/ x yxxyy Sn Sn?? +? LUB least upper bound GLB greatest lower bound RELEFF relative efficiency ARE asymptotic RELEFF 1 1.0 Introduction When testing the equality of means of two processes, the sampling distribution (SMD) of the difference of two sample means must be used to conduct statistical inference (confidence intervals and test of hypothesis) about the corresponding processes? mean difference ? x ? ? y . An interesting problem arises as to whether the same conclusions will be reached if the SMD of individual sample means are used to construct separate confidence intervals for ? x and ? y and examine the amount of overlap of the individual confidence intervals in order to make statistical inferences about ? x ? ? y . If the underlying distributions are normal with known variances, exact relationships are given by Schenker and Gentleman, (2001) about the changes in the type I and II error probabilities if the overlapping of individual confidence intervals are used to make inferences about ? x ? ? y at the 5% level. Because there is no mention of proof in the above article, we will use the normal theory to generalize their formulas in chapter 3 for any LOS ? and will verify that in order to attain a nominal type I error rate of 5%, the corresponding two confidence levels must be set exactly at 83.42237%, which is nearly consistent with the 85% reported by Payton et al. (2000). When the process variances are unknown and sample sizes are small (i.e., the real-life encountered cases), this dissertation will obtain exact formulas for type ? and II error probabilities whose values can be obtained once the unbiased estimators, 2 x S and 2 y S , of process variances are realized. However, this dissertation will verify that in general 2 Using individual confidence intervals diminishes type I error rate, depending on sample sizes n x and n y , and increases type II error probability. Assessment of type ? error probability (?) for the general unknown variances case has not been investigated in the literature because the computation of type II error probability requires the use of noncentral t-distribution, although Schenker and Gentleman (2001) provide the impact of Overlap on the Power Function (PWF =1 ? ?) only for the limiting case in terms of n x and n y (or the known-variances case). 
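As a quick numerical illustration of the 83.42237% figure quoted earlier in this introduction (the individual confidence level that makes a disjoint-CI rule an exact 5%-level test in the known- and equal-variance case), the following minimal Python sketch evaluates the closed form 1 − γ = 1 − 2Φ(−z_{α/2}/√2) that is derived in chapter 3. This is our own illustrative sketch, not code from the dissertation; scipy is assumed to be available and the function name matched_level is ours.

```python
from scipy.stats import norm

def matched_level(alpha: float) -> float:
    """Individual confidence level 1 - gamma = 1 - 2*Phi(-z_{alpha/2}/sqrt(2)).

    Under equal, known standard errors, two individual (1 - gamma) CIs are
    disjoint exactly when the Standard alpha-level test rejects H0: mu_x = mu_y.
    """
    z = norm.ppf(1.0 - alpha / 2.0)          # z_{alpha/2}
    gamma = 2.0 * norm.cdf(-z / 2**0.5)      # gamma = 2*Phi(-z_{alpha/2}/sqrt(2))
    return 1.0 - gamma

for a in (0.01, 0.05, 0.10):
    print(f"alpha = {a:.2f}  ->  individual confidence level = {matched_level(a):.6%}")
# alpha = 0.05 gives about 83.4224%, matching the 83.42237% quoted above.
```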
The noncentral t-distribution has widespread applications when testing a hypothesis about one or two normal means. Specifically, both the OC (operating characteristic) and power function (PWF = 1 − β) for testing H0: μ = μ0 and H0: μ_x − μ_y = Δ0 (in the unknown-variance cases) are constructed using the noncentral t-distribution. We will use the noncentral t-distribution to obtain the PWF of testing H0: μ_x − μ_y = 0 (in the unknown-variance cases) both using the SMD of x̄ − ȳ (i.e., the Standard method, which has been available in the statistical literature for well over 50 years) and also using the Overlap, for sample sizes ≥ 2. It will be determined that the type II error rate always increases if individual confidence intervals are used to make inferences about μ_x − μ_y. Even if the underlying distributions are not Laplace-Gaussian*, the t-distribution can still be used for statistical inferences about two process means for moderate and large sample sizes, because the application of the t-distribution requires only the assumption that the sample means be approximately normally distributed (due to the Central Limit Theorem).
-----------------------------------------------------------------------------------------------------------
* Kendall and Stuart (1963, Vol. 1, p. 135) report that "The description of the distribution as the 'normal,' due to Karl Pearson (who is known for the definition of the product-moment correlation coefficient and the Pearson system of statistical distributions), is now almost universal among English writers. Continental writers refer to it variously as the second law of Laplace, the Laplace distribution, the Gauss distribution, the Laplace-Gauss distribution and the Gauss-Laplace distribution. As an approximation to the binomial it was reached by DeMoivre in 1738 but he did not discuss its properties."
Investigation of the overlapping CIs is worthy because, as Schenker and Gentleman (2001) mention in the article "On Judging the Significance of Differences by Examining the Overlap Between Confidence Intervals," there are many articles, such as Mancuso (2001), that still use the Overlap method for testing the equality of two population quantities. Although we found some articles, such as Payton et al. (2000), entitled "Testing Statistical Hypotheses Using the Standard Error Bars and Confidence Intervals," that have somewhat rectified the Overlap problem and have pointed out the misconceptions therein, there are still some details to be worked out. Thus, the objective is to investigate the exact differences between the Overlap method and the Standard [a term coined by Schenker and Gentleman (2001)] method for testing the null hypotheses H0: σ_x = σ_y and H0: μ_x = μ_y under different assumptions. The former hypothesis has never been investigated with the Overlap method. The statistical literature reports results for the impact of Overlap on type I and II error probabilities in testing H0: μ_x = μ_y only for the case of large sample sizes (i.e., the limiting case where n_x and n_y → ∞). Therefore, this work will investigate the same and other aspects of Overlap but for small sample sizes (i.e., n ≤ 20; the results also hold for moderate and large sample sizes). To be on the conservative side, we refer to n ≤ 20 as small, 20 < n ≤ 50 as moderate, and n > 50 as large in this dissertation, although some statisticians prefer n > 60 as large because for n > 60, t_{α,ν} ≈ Z_α to one decimal place, where Z_α represents the (1 − α) quantile of a standard normal deviate.
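To make the two claims above concrete, here is a small Python sketch (the dissertation's own computations were done with Matlab/Excel per the front matter; scipy is assumed here, and the function name is ours) that evaluates the power of the Standard two-sided pooled-t test of H0: μ_x − μ_y = 0 via the noncentral t-distribution, and checks the t_{α,ν} ≈ Z_α claim for ν > 60. The noncentrality parameter follows the definition given in the List of Notation, δ = (μ_x − μ_y)/√(σ_x²/n_x + σ_y²/n_y), specialized to σ_x = σ_y = σ.

```python
from scipy.stats import nct, norm, t

def pooled_t_power(delta_over_sigma: float, n_x: int, n_y: int, alpha: float = 0.05) -> float:
    """Power of the two-sided pooled-t test of H0: mu_x = mu_y when sigma_x = sigma_y.

    delta_over_sigma = (mu_x - mu_y)/sigma; the noncentrality parameter is
    delta = (mu_x - mu_y)/(sigma*sqrt(1/n_x + 1/n_y)).
    """
    df = n_x + n_y - 2
    ncp = delta_over_sigma / (1.0 / n_x + 1.0 / n_y) ** 0.5
    t_crit = t.ppf(1.0 - alpha / 2.0, df)
    # Pr(reject H0) = Pr(T' > t_crit) + Pr(T' < -t_crit), T' ~ noncentral t(df, ncp)
    return nct.sf(t_crit, df, ncp) + nct.cdf(-t_crit, df, ncp)

print(f"power at (mu_x - mu_y)/sigma = 1, n_x = n_y = 10: {pooled_t_power(1.0, 10, 10):.4f}")
# t ~= Z for nu > 60: the upper 2.5% points agree to about one decimal place.
print(f"t(0.025, 61) = {t.ppf(0.975, 61):.3f}   vs   Z(0.025) = {norm.ppf(0.975):.3f}")
```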
The contents of the different chapters are as follows. In chapter 2, an extensive literature survey and the results obtained thus far are provided. In chapters 3.1 and 3.2, the known-variance case is discussed and compared with what has been reported, without proof, in the literature for the limiting case. In chapter 4, the Bonferroni method is compared against the Overlap. In chapter 5, the statistical inference on the ratio of two process variances (σ_x²/σ_y²) from the Overlap is compared against the Standard method. Chapter 6 discusses the impact of Overlap on type I error probability. Chapter 7 discusses the amount and percent overlap required to reject H0: μ_x = μ_y at the α level of significance when process variances are unknown and sample sizes are small and moderate. Similarly, chapter 8 considers the impact of Overlap on type II error Pr when process variances are unknown, for n_x and n_y ≤ 50. Finally, chapter 9 summarizes the dissertation findings. In summary, the primary objectives of this dissertation are: (1) To examine the impact of the Overlap procedure on type I error probability (Pr) when testing equality of two process variances or two population means for unknown process variances and sample sizes ≥ 2. Payton et al. (2000) obtained results for the latter objective, but there are inaccuracies (for n < 50) in their development; further, the former objective has not been investigated. Moreover, the Overlap literature has not considered the case of the pooled t-test, and little has been mentioned by Schenker and Gentleman (2001) about the paired t-test. (2) To determine the maximum percent overlap of two individual confidence intervals (CIs) below which the null hypothesis (either H0: μ_x = μ_y or H0: σ_x = σ_y) must still be rejected at a given level of significance (LOS) α. This objective has not yet been investigated. (3) To examine the impact of the Overlap procedure on type II error Pr for sample sizes n_x and n_y ≥ 2. Schenker and Gentleman (2001) carried out this last objective only for the limiting case (i.e., as n_x and n_y → ∞, and/or known σ_x and σ_y). The above objectives are worthy of further investigation because many researchers still use overlapping CIs to test hypotheses, especially in biology and medical papers (see the references mentioned in chapter 3.1). Furthermore, some statistical software, such as Minitab, still exhibit overlapping CIs that may lead users to wrong conclusions. Although the CI for two population quantities is the common method to make decisions regarding H0: μ_x = μ_y or H0: σ_x = σ_y, our objective is to ascertain the exact relationship between the overlapping of two individual CIs and the corresponding single CI. Most former research has discussed only the limiting case (i.e., as n_x and n_y → ∞). In real-life situations, sufficient resources may not be available to gather very large samples. Thus, the case of small (n ≤ 20) to moderate (20 < n ≤ 50) sample sizes is a major contribution of this dissertation. The reader should bear in mind that all primed symbols in this dissertation pertain to the Overlap method. 2.0 Literature Review It is well known that when the two underlying populations are normal, the null hypothesis H0: μ_x = μ_y is tested, in the case of known variances, using the sampling distribution of X̄ − Ȳ, which is also Laplace-Gaussian, N(μ_x − μ_y, σ_x²/n_x + σ_y²/n_y). However, in practice, the population variances are rarely known.
Thus, the equality of two process variances should first be tested with an F-statistic. If H0: σ_x²/σ_y² = 1 is not rejected (and the P-value > 0.20), the two-sample pooled-t procedure will be applied for testing H0: μ_x = μ_y. Otherwise, the two-independent-sample t-statistic has to be used to perform statistical inferences about μ_x − μ_y. In the case of related samples (or paired observations), the paired t-statistic has to be used to conduct statistical inferences about μ_x − μ_y. The above rules are the formal (or Standard) procedures for testing H0: μ_x = μ_y. What happens if we instead address this question with two individual confidence intervals? Cole et al. (1999) mentioned that using two individual CIs to test the null hypothesis H0: μ_x = μ_y would lead to a smaller type I error rate and a larger type II error rate than the formal procedures. Payton et al. (2000) pointed out that many researchers use the standard error bars (sample mean ± standard error of the mean) to test the equality of two population means: if the two individual standard error bars fail to overlap, they conclude that the two sample means are significantly different. Actually, these researchers are making a test of hypothesis with an approximate Pr(type I error) = α ≈ 0.16, not α = 0.05. Payton et al. (2000) also derived a formula for the probability of overlap of two individual CIs. They defined A to be the event that the confidence intervals computed individually for the two population means overlap. Thus, if the sample sizes are equal (n_1 = n_2 = n) and the population variances are unknown, they deduced that Pr(A) = Pr[ n(Ȳ_1 − Ȳ_2)²/(S_1² + S_2²) < ((S_1 + S_2)²/(S_1² + S_2²)) F_{α,1,n−1} ]. They state that the random variable n(Ȳ_1 − Ȳ_2)²/(S_1² + S_2²) has the F-distribution with numerator degrees of freedom (df) ν_1 = 1 and denominator df ν_2 = (n − 1) if the two samples are from the same normal population. It will be shown in Chapter 6 that this statement is inaccurate: the two samples need not originate from the same population, and the denominator df of the F-distribution is not (n − 1) but rather, in the case of n_1 = n_2 = n, is given by ν = (n − 1)(S_1² + S_2²)²/(S_1⁴ + S_2⁴), where (n − 1) < ν < 2(n − 1). Further, they state that if the two samples are from two different normal populations with the same mean but unequal variances, the quantity n(Ȳ_1 − Ȳ_2)²/(S_1² + S_2²) is still approximately F-distributed with ν_1 = 1 and ν_2 = (n − 1) df, where their value ν_2 = (n − 1) is accurate only in the limiting case. Therefore, they conclude that Pr(A) = Pr(Intervals overlap) = Pr[ F_{1,n−1} < (1 + 2S_1S_2/(S_1² + S_2²)) F_{α,1,n−1} ]. Payton et al. (2000) further state that for 95% CIs with n_x = n_y = n = 10, S_1 (= S_x) = 0.80 and S_2 = 1.60, 1 − Pr(A) = 1 − Pr(Intervals overlap) = 1 − Pr(F_{1,9} < 1.8 F_{0.05,1,9}) = 1 − 0.9859 = 0.0141 (which was misprinted as 0.0149). It will be shown in Chapter 6 that this last Overlap Pr should be revised to α′ = 0.00608057, i.e., their result has a relative error of 56.8754%. Moreover, Payton et al. (2000) used SAS Version 6.11 to simulate from a N(0, 1) with sample sizes varying from n = 5 to n = 50 in order to ascertain the accuracy of the above formula. In that article, the authors do not give information about the known-variances case, and for the unknown-variances case they consider only equal sample sizes. The largest sample size Payton et al. (2000) considered was n = 50.
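For reference, the Payton et al. (2000) overlap probability just described can be evaluated directly. The short Python sketch below is an illustration under the stated assumptions, not code from the dissertation or from Payton et al.; scipy is assumed. It reproduces their worked example (95% CIs, n = 10, S_1 = 0.80, S_2 = 1.60, giving 1 − Pr(overlap) ≈ 0.0141) and also evaluates the corrected denominator df ν = (n − 1)(S_1² + S_2²)²/(S_1⁴ + S_2⁴) noted above.

```python
from scipy.stats import f

def payton_overlap_pr(s1: float, s2: float, n: int, alpha: float = 0.05) -> float:
    """Payton et al. (2000) approximation: Pr(two individual (1-alpha) CIs overlap)
    = Pr[ F(1, n-1) < (1 + 2*s1*s2/(s1**2 + s2**2)) * F_{alpha,1,n-1} ]."""
    scale = 1.0 + 2.0 * s1 * s2 / (s1**2 + s2**2)
    f_crit = f.ppf(1.0 - alpha, 1, n - 1)   # upper-alpha critical value F_{alpha,1,n-1}
    return f.cdf(scale * f_crit, 1, n - 1)

s1, s2, n = 0.80, 1.60, 10
pr_overlap = payton_overlap_pr(s1, s2, n)
print(f"1 - Pr(overlap) = {1.0 - pr_overlap:.4f}")   # ~0.0141, as in their example
# Corrected denominator df claimed in the text above:
nu = (n - 1) * (s1**2 + s2**2) ** 2 / (s1**4 + s2**4)
print(f"nu = {nu:.2f}  (between n-1 = {n - 1} and 2(n-1) = {2 * (n - 1)})")
```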
Furthermore, Schenker and Gentleman (2001) found more than 60 articles in the health sciences for testing the equality of two population means by using the Overlap method. Schenker and Gentleman (2001) state that the Overlap method will fail to reject H 0 when the Standard method would reject it. In other words, the Overlap will lead to less statistical power than the Standard method. The authors considered three population quantities Q 1 , Q 2 and 12 QQ? . They state that Brownlee (1965) provided the 95% confidence intervals for the three quantities as l m 11 1.96QSE? , m n 22 1.96QSE? and l m mm 22 12 12 ( ) 1.96Q Q SE SE?? + . However, using the Overlap method, the null hypothesis will not be rejected if and only if l m m n 12 1 2 ( ) 1.96( )Q Q SE SE?? + contains zero. Schenker and Gentleman (2001) defined k as the limiting SE (standard error) ratio, i.e., either SE 1 /SE 2 or SE 2 /SE 1 , and considered only ratios that are greater than or equal to 1. For a limiting SE ratio of k and a standardized difference of d = 22 12 1 2 ()/Q Q SE SE?+, they reported that the asymptotic power for the standard method is (1.96 )d? ?++ (1.96 )d?? ? , where ? represents the cdf of N(0, 1), and the asymptotic power for the 9 Overlap method is 2 1.96(1 ) () 1 k d k ?+ ?++ + 2 1.96(1 ) () 1 k d k ? + ? ? + . Note here, in this dissertation, we use different definition for k from Schenker?s definition. Schenker and Gentleman (2001) use small case k as the standard error ratio for the limiting sample sizes or known variances cases but we use k as the standard error ratio for small to moderate sample sizes or unknown variances cases. It means k= (/ )/(/ ) x xyy SnSn in this dissertation. Therefore, for distinguishing, let K = (/ )/(/ )nn x xyy ?? represent the limiting sample sizes or known variances cases. In this chapter, we still use Schenker?s symbol to represent their work. Schenker and Gentleman (2001) also stated that for the Pr of type ? error, just simply let d = 0 in the above formulas. Then, the authors concluded that the Overlap method will lead to smaller ? and larger ?. Furthermore, Schenker and Gentleman (2001) state that when SE 1 is nearly equal to SE 2 , the Overlap method is expected to be more deficient (i.e., smaller type I error Pr and larger type II error Pr) relative to the Standard method. In this article, the authors did not give specific values of type ? error and type ? error probabilities for different k and d values. Their results pertain only to large sample sizes so that the need for using the t- distribution in the case of small and moderate sample sizes was not discussed. Payton et al. (2003) continued to provide the formula Pr(Intervals_Overlap), which they had also obtained in the year 2000 as follows: 12 1, 1 ,1, 1 22 12 2 Pr( ) Pr( _ ) Pr 1 nn SS A Intervals overlap F F SS ??? ? ? ?? =?<+? ? ?? + ? ? ?? ? ? . Payton et al. (2003) state that a large-sample version of the above statement can be derived (assuming the two populations are identical): 10 /2 Pr( ) Pr( _ ) Pr | | 2A Intervals overlap Z z ? ? ? =?< ? ? = ?( /2 2 Z ? ) ??( ? /2 2 Z ? ), where ?(z) (almost) universally represents the cumulative of the standardized normal density function at point z. The authors set ? at the nominal value of 5%, generated 95% confidence intervals and gave the approximate probability of overlap as Pr( _ )Intervals overlap [ ] Pr 2.77 2.77 0.994Z?? << = . Thus, the authors concluded that ?the 95% CIs will overlap over 99% of the time?. 
They also mentioned that Schenker and Gentleman (2001) showed, for large sample sizes, that the probability of type ? error when comparing the overlap of 100(1-?)% confidence intervals is 2 /2 2Pr[ (1 )/ 1 ]Z zkk ? 60 and ? x and ? y are replaced by their biased estimates S x and S y , respectively. 3.1 The Case of ? x = ? y = ? Statistical theory suggests that the total resources N = n x +n y be allocated according to n x = /( ) x xy N? ??+ , and hence the allocation n x = n y = n = N/2 is recommended. Suppose that the two CIs for ? x and ? y are disjoint; then it follows that either L(? x ) > U(? y ), or L(? y ) > U(? x ). These two possibilities lead to the condition either /2 / x x x Zn ? ?? > /2 / yy yZ n ? ?+ , or /2 / yy yZ n ? ?? > x + /2 Z ? ? / x x n? , respectively. Combining the two conditions leads to rejecting H 0 : ? x = ? y iff?x ?y ?> /2 Z ? xxyy (/n /n)?+? ; for the case of ? x = ? y = ? and thus n x = n y = n, this last condition reduces to ?x ?y ?> 2 /2 Z ? /n? at the level of significance ? based on the Overlap method. If ? is set at the nominal value of 5%, this last inequality will lead to the same condition as that of Schenker et al. (2001) who stated that the two intervals overlap if and only if the interval m m 12 ()QQ? ? n n 12 1.96( )SE SE+ contains 0. Sometimes, it is then concluded that the null hypothesis H 0 : ? x = ? y must be rejected in favor of H 1 : ? x ? ? y at the LOS ?, such as Djordjevic et al. (2000), Tersmettte et al. (2001) and Sont et al. (2001) who used this concept to test H 0 : ? x = ? y . In fact, 14 Schenker and Gentleman (2001) state that they found more than 60 articles where the Overlap method was used either formally or informally to demonstrate visual significant difference between x and y . This procedure is not accurate because the correct (1 ? ?)?100% CI for the difference in means of two independent normal universes must be obtained from the SMD (sampling distribution) of the statistic xy? , which is also Gaussian with E( xy? ) = ? x ? ? y and V( xy? ) = V( x) + V(y) = 22 xx yy /n /n?+? = 2 2/n? , assuming xy ? ??==. Thus, the correct (1??)?100% CI on ? x ? ? y is given by xy? ? 22 /2 x x y y Z/n/n ? ?+? ? ? x ? ? y ? xy? + 22 /2 x x y y Z/n/n ? ?+? (1a) For the balanced design case and ? x = ? y = ?, Eq. (1a) reduces to xy? ? /2 2Z ? ? / n ? ? x ? ? y ? xy? + /2 2Z ? ? / n (1b) The length of the above exact (1 ? ?)?100% CI for a balanced design is /2 22Z ? ? / n . Thus, H 0 : ? x ? ? y = 0 must be rejected at the LOS ? iff (i.e., it is necessary and sufficient) that xy? > 22 /2 x y Z( )/n ? ?+? = /2 2 Z ? /n? . (1c) However, requiring that the two separate CIs to be disjoint leads to rejection of H 0 iff xy? > /2 x x y y Z ( /n /n) ? ?+? = /2 2Z ? ? /n? . It is clear that the requirement for rejecting H 0 of two disjoint CIs is more stringent (or more conservative) than that of the Standard method because, in the case of xy ? ??= = and n x = n y = n, /2 2Z / n ? ? > 22 /2 x x y y Z/n/n ? ?+? = /2 2 Z ? /n? . Further, the more stringent requirement to reject H 0 (based on two independent separate CIs) leads to a smaller type I 15 error Pr than the specified ?. The correct value of ? using the Standard method is given by ? = Pr ( x ?y< A L , or x ?y> A U ? xy 0? ?? = ) = 22 /2 x x y y x y Pr[x y Z /n /n 0] ? ?> ? +? ? ?? = = /2 Pr( Z Z ) ? > = ?, where A L and A U denote the lower and upper ?-acceptance limits, respectively. On the other hand, if we require that the two individual CIs must be disjoint in order to reject H 0 : ? x ? 
μ_y = 0, then the type I error Pr from the Overlap is given by
α′ = Pr( ȳ − z_{α/2} σ_y/√n > x̄ + z_{α/2} σ_x/√n ) + Pr( x̄ − z_{α/2} σ_x/√n > ȳ + z_{α/2} σ_y/√n )
= 2·Pr[ Z > z_{α/2}(σ_x + σ_y)/√(σ_x² + σ_y²) ] = 2·Pr( Z > √2 z_{α/2} ) = 2·Φ(−√2 z_{α/2}), assuming σ_x = σ_y = σ. (2)
• Setting α at 0.01 leads to the Overlap LOS of α′ = 0.00026971696 << 0.01. The % relative error, [(α − α′)/α]·100%, in the LOS α = 0.01 is [(0.01 − 0.00026971696)/0.01]·100% = 97.303%.
• For the nominal value α = 0.05, Eq. (2) gives α′ = 0.00557459668 << 0.05. The value α′ = 0.00557459668 is consistent with the limiting value of 0.006 provided by Payton et al. (2003, p. 36) in their equation (6). The % relative error is 88.851%. As a result, the larger the LOS α is, the smaller the % relative error becomes. Payton et al. (2000) provide simulation results of run sizes 10,000 from two independent N(0, 1) populations in column 3 of their TABLE 1, p. 551, which claim that the value of α′ ranges from 0.0039 at n = 5 to 0.0055 at n = 50 (n incremented by 5). Our Eq. (2) shows that in the case of known equal variances and equal sample sizes the Overlap type I error Pr does not depend on n at all. However, their simulation inaccuracies were rectified by Payton et al. (2003, Table 4), again through simulation run sizes of 10,000 independent pairs from N(0, 1).
• Setting α at the maximum widely accepted LOS of 10%, Eq. (2) shows that α′ = 0.020009254 << 0.10, and the % relative error is [(0.10 − 0.020009254)/0.10]·100% = 79.99%.
Regardless of the value of the LOS α, the same conclusion made by Cole et al. (1999) will be reached, namely that the Overlap method leads to a much smaller type I error rate. If the alternative H1 is one-sided, say H1: μ_x − μ_y > 0, then from the Overlap standpoint H0 should be rejected only if both conditions x̄ − ȳ > 0 and L(μ_x) − U(μ_y) > 0 [or L(μ_x) > U(μ_y)] hold, and as a result the Overlap type I error Pr reduces to
α′_1 = Pr[ x̄ − Z_{0.025} σ_x/√n > ȳ + Z_{0.025} σ_y/√n ] = Pr[ x̄ − ȳ > Z_{0.025}(σ_x + σ_y)/√n ] = Pr( Z > Z_{0.025}(σ_x + σ_y)/√(σ_x² + σ_y²) ) = Pr( Z > √2 Z_{0.025} ); assuming σ_x = σ_y = σ, α′_1 = 0.0027873 << 0.05.
Thus the impact of Overlap on the type I error Pr is even greater for a one-sided alternative than for the two-sided one. Note that when L(μ_y) > U(μ_x), the two CIs are disjoint, but such an occurrence is congruent with H0: μ_x − μ_y ≤ 0 rather than H1: μ_x − μ_y > 0. Thus, for the one-sided alternative the type I error Pr from the Overlap is exactly half of that of the two-sided alternative, which was equal to 0.00557459668. Henceforth, unless specified otherwise, the alternative is two-sided.
Now, let ω represent the amount of overlap length between the two individual CIs, a variable that has not been considered in the Overlap literature. From Figures 1(a & b), ω will be zero if either L(μ_x) > U(μ_y) or L(μ_y) > U(μ_x), in which case H0: μ_x = μ_y is rejected at an LOS < α. Thus, ω is larger than 0 when U(μ_x) > U(μ_y) > L(μ_x) or U(μ_y) > U(μ_x) > L(μ_y). The overlap is 100% if U(μ_x) − U(μ_y) > L(μ_y) − L(μ_x), or if U(μ_y) − U(μ_x) > L(μ_x) − L(μ_y). Because both conditions U(μ_x) > U(μ_y) > L(μ_x) and U(μ_y) > U(μ_x) > L(μ_y) lead to the same result, only the case U(μ_x) > U(μ_y) > L(μ_x) [Figure 2(a)], for which x̄ − ȳ ≥ 0, is discussed here; see the illustration in Figures 2(a & b). [Figures 2(a) and 2(b): the two individual CIs (L(μ_x), U(μ_x)) and (L(μ_y), U(μ_y)) and their overlap.] That is, for the known-variance case, the larger sample mean will be denoted by x̄.
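Before turning to the amount of overlap ω derived next, a quick numeric check of Eq. (2) may be useful. The Python sketch below is ours, not the dissertation's (scipy assumed); it evaluates α′ = 2Φ(−√2 z_{α/2}) and the one-sided version α′_1 = Φ(−√2 Z_{0.025}) quoted above.

```python
from scipy.stats import norm

def overlap_alpha_two_sided(alpha: float) -> float:
    """Eq. (2): alpha' = 2*Phi(-sqrt(2)*z_{alpha/2}), for sigma_x = sigma_y and n_x = n_y."""
    return 2.0 * norm.cdf(-2**0.5 * norm.ppf(1.0 - alpha / 2.0))

for a in (0.01, 0.05, 0.10):
    print(f"alpha = {a:.2f}  ->  alpha' = {overlap_alpha_two_sided(a):.9f}")
# Expected: 0.000269717, 0.005574597, 0.020009254 (the values reported above).
one_sided = norm.cdf(-2**0.5 * norm.ppf(0.975))
print(f"one-sided alpha'_1 at alpha = 0.05: {one_sided:.7f}")   # ~0.0027873
```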
Thus for the equal-sample-size &-variance case, ? = U(? y ) ? L(? x ) = ( /2 /yZ n ? ?+? ) ? ( /2 /x Zn ? ??? ) = 2 /2 /Z n ? ? ? ( x y? ) (3a) On the other hand, the span of the two individual CIs (assuming x y> ) is given by U(? x ) ? L(? y ) = /2 /x Zn ? ?+? ? ( /2 /yZ n ? ??? ) = 2 /2 /Z n ? ? + ( x y? ) (3b) () y U ? () y U ? () x U ? () x U ? () x L ? () x L ? () y L ? () y L ? 18 Combining equations (3a & 3b) gives the exact % overlap as ? = /2 /2 2/() 2/() Z nxy Z nxy ? ? ? ? ? ? + ? ?100% (3c) Let r O be the borderline value of ?at which H 0 is barely rejected at the LOS ?. From Eq. (1c), H 0 : ? x = ? y should be rejected iff x y? ? /2 2/Z n ? ? . Therefore, from Eq.(3a) the value of ?at which H 0 should be rejected at the ?-level or less is given by ? ? 2 /2 /Z n ? ? ? /2 2/Z n ? ? , and the exact amount of overlap that leads to an ?- level test is given by r O = /2 (2 2) /Z n ? ?? (3d) Eq. (3d) implies that H 0 must be rejected at the LOS ? or less iff ? ? /2 (2 2) /Z n ? ?? . Inserting the borderline rejection condition, x y? = /2 2/Z n ? ? , into Eq. (3b) yields U(? x ) ? L(? y ) = /2 2/Z n ? ? + 2 /2 /Z n ? ? = /2 (2 2) /Z n ? ?+ . (3e) Eq. (3e) implies that if the two CIs span larger than /2 (2 2) /Z n ? ?+ , then H 0 must be rejected at the LOS less than ?. The percent overlap in Eq. (3c) ranges from zero (occurring when x y? = 2 /2 /Z n ? ? ) to 100% (occurring when x y? = 0). Inserting the borderline value of x y? = /2 2/Z n ? ? at which H 0 must be rejected into Eq. (3c) results in r ? = /2 /2 (2 2) / (2 2) / ? + Z n Z n ? ? ? ? ?100% = 22 22 ? + ?100% = 17.1573% (3f) which means that H 0 : ? x = ? y must be rejected at the LOS ? or less if the percent overlap between the two individual CIs is less than or equal to 17.1573%. It seems that the percent overlap at which H 0 should barely be rejected is 17.1573% regardless of the LOS ?, but the amount of overlap from (3a) does depend on ?. Further, as ||x y? increases, 19 the P-value of testing H 0 : x y ? ?= decreases and so does the value of % overlap in Eq. (3c). As ||x y? ? /2 2/Z n ? ? , ? ? 0. Thus, in the case of known xy ? ??==, once the % overlap exceeds 17.1573%, then H 0 : x y ? ?= must not be rejected at any ? level. If the alternative is one-sided, H 1 : ? x ? ? y > 0, it can be argued that the maximum percent overlap is given by () ( ) () () YX XY UL UL ? ? ? ? ? ? = /2 /2 22ZZ ZZ ?? ?? ? + (3g) and for a 5%-level test Eq. (3g) reduces to 0.025 0.05 0.025 0.05 (2 2) (2 2) ZZ ZZ ? + = 25.51597%, which implies that H 0 can be rejected at less than 5% level if the percent overlap between the two individual CIs is smaller than 25.51597%. Thus, the impact of overlap on ? r is greater for a one-sided alternative because for the 2-sided alternative the value of ? r = 17.15729%. Further, for the one-sided alternative the % overlap does depend on ?. As an example, for a 10%-level one-sided test the value of ? r increases to 28.96%. The question now is what individual confidence levels, (1? ?), should be used that will lead to an exact ??level test? Clearly, the overlap amount for a (1? ?)?100% CI is given by yx U( ) L( )??? ? ? = ( y+ /2 Z/n ? ? ) ?( x ? /2 Z/n ? ? ) = 2 /2 Z/n ? ?? ( x ?y ) (4) Because H 0 : x y ? ?= must be rejected iff ||x y? ? /2 2/Z n ? ? and the overlap must become zero or less in order to reject H 0 , Eq. (4) shows that 2 /2 Z/n ? ? = /2 2/Z n ? ? ? /2 Z ? = /2 /2Z ? ? ?/2 = ?( ? /2 /2Z ? ) (5) 20 Eq. 
(5) shows that the confidence level for each individual interval must be set at (1 )??= /2 12 ( /2)Z ? ???? in order to reject H 0 at the LOS ? iff the two CIs are disjoint. The value of (1 )?? can also be obtained by equating the span of the two independent CIs, 2 /2 Z/n ? ? + ( x ?y ), to the length of the CI from the Standard method given by 2 /2 2Z ? ? / n , and invoking the rejection condition x ?y= /2 2/Z n ? ? . ? If ? is set at 0.01 in Eq.(5), then ? = 0.068548146, 1 ?? = 0.931451854, which implies that the confidence level of each individual interval must be set at 0.931451854 in order to reject H 0 at the 1% level iff the two CIs are disjoint. ? If ? = 0.05 is substituted in Eq.(5), then ? = 0.165776273, 1 ?? = 0.834223727, which implies that the confidence level of each individual interval must be set at 83.4223727% in order to reject H 0 at the 5% level iff the two CIs are disjoint. This assertion is in fair agreement with the simulation results given in TABLE 1 of Payton et al. (2000, p. 551) for 15 ? n ? 50. Their TABLE 1, although inaccurate at n = 5 & 10, clearly shows that as n increases toward n = 50, the size of adjusted CIs is equal to 83.835%, which is very close to the exact 1 ?? = 83.422372710%. ? Further, when the confidence level 1?? = 0.90, then 1 ?? = 0.755205856. The first and third 1 ?? values have not been reported in Overlap literature. If the alternative is one-sided, H 1 : ? x ? ? y > 0, it can be argued that the value of 1 ?? is given by1??= 1?2(Z/2) ? ?? . If ? = 0.05 is substituted into this last equation, then 21 the one-sided 1 ?? = 0.75520585634665, which implies that the confidence level of each individual interval must be set at 0.75520585634665 in order to reject H 0 at the 5% level iff the two CIs are disjoint, while for the 2-sided alternative 1 ?? was equal to 0.83422372710. Again, the impact of Overlap on individual confidence levels is greater for the one-sided alternative than that of the 2-sided one. Lastly, since rejecting H 0 : x y ? ?= using the two independent CIs is more stringent than the SMD of x y? , therefore, it will lead to many more type II errors (or much less statistical power) in testing H 0 : ? x ? ? y = 0, as shown below. In Figure 3, the solid line represents the null distribution of xy? , and the dotted line curve represents the distribution of xy? under H 1 , where ? = ? x ? ? y > 0 is the amount of specified shift in ? x ? ? y = ? from zero, which in Figure 3 exceeds one standard error of xy? . Figure 3 clearly shows that the acceptance interval (AI) for the 1?? xy? 22 xy n ?+? /2? /2? ? 22 sample mean difference, x ?y , when testing H 0 : 0 xy ? ?? = at the LOS ? is given by AI = (A L , A U ) = [ 22 /2 x y Z( )/n ? ??+? , 22 /2 x y Z( )/n ? ?+? ], i.e., in the case of ? x = ? y = ? and n x = n y = n we cannot reject H 0 at the significance level ? if our test statistic x ?y falls inside the AI = (A L , A U ) = (? /2 2Z ? /n? , /2 2Z ? /n? ). Thus, the Pr of committing a type II error as shown in Figure 3 is given by ? = Pr[A L ? x ?y ? A U ? ? x ? ? y = ?] = Pr[ 22 /2 x y Z( )/n ? ??+? ? x ?y ? + 22 /2 x y Z( )/n ? ?+? ? ? x ? ? y = ?] = Pr[ x ?y ? 2 /2 Z2/n ? ? ?? > 0] ? Pr[ x ?y ? ? 2 /2 Z2/n ? ? ?? ] (6a) = ?( /2 Z ? ? n 2 ? ? ) ? ?( /2 Z ? ?? n 2 ? ? ), where ? = ? x ? ? y . (6b) At ? = 0.05 if the specified value of ? x ? ? y = ? exceeds 0.5 22 xy ?+? , then the value of standard normal cdf ?( ? Z 0.025 ? 0.5 n ) < 0.001 for sample sizes n ? 
6, i.e., the last term on the RHS of equation (6a), becomes less than 0.001 once n ? 6. Hence, Eq. (6b) for the nominal value of ? = 5% approximately reduces to ? ? ?(Z 0.025 ? n/2/? ?) (6c) where (6c) is accurate to at least 3 decimals for n ? 6 and ? > 0.5 22 xy ?+? = 0.5 2? . When the null hypothesis H 0 : ? x ? ? y = 0 is not rejected at the LOS ? iff the two individual CIs ( x ? /2 x Z/n ? ? ? ? x ? x+ /2 Z ? x /n? ) and ( y? /2 Z ? y /n? ? ? y ? y+ /2 Z ? y /n? ) are overlapping, then the Pr of a type II error (assuming ? x > ? y ) 23 from the Overlap Method is given by ?' = Pr(Overlap?? > 0) = Pr{[ ( ) ( ) x y LU? ?? ]?[( ) ( ) yx LU? ?? ]| 0? > } = Pr{[ x ? /2 Z ? / x n? ? y+ /2 Z ? / y n? ]?[ /2 y yZ n ? ? ? ? x+ /2 Z ? / x n? ]| 0? > } = Pr{[ x ?y ? /2 Z ? / x n? + /2 Z ? / y n? ] ? [ /2 y Z n ? ? ? ? /2 Z ? / x n? ? x y? ]| 0? > } When ? x = ? y = ?, and n x = n y = n, the SE( x ?y) = 2/n? and as a result ??= Pr{[ x ?y ? 2 /2 Z ? / n? ] ?[?2 /2 Z ? / n? ? x y? ]| 0? > } = Pr{[?2 /2 Z ? / n? ? x ?y ? 2 /2 Z ? / n? ]| 0? > } (7a) = ?( /2 2 Z ? ? n 2 ? ? ) ? ?( /2 2 Z ? ? ? n 2 ? ? ) (7b) Since the cdf of the standard normal density ?(z) is a monotonically increasing function of z, comparing Eq. (6b) with Eq. (7b) shows that ?( /2 2Z ? ? n 2 ? ? ) > ?( /2 Z ? ? n 2 ? ? ) & ?( /2 2Z ? ? ? n 2 ? ? ) < ?(? /2 Z ? ? n 2 ? ? ). The above two conditions lead to ??= ?( /2 2Z ? ? n 2 ? ? ) ? ?( /2 2Z ? ? ? n 2 ? ? ) > ?( /2 Z ? ? n 2 ? ? ) ? ?(? /2 Z ? ? n 2 ? ? ) = ?. and as a result 1 ??? <1??, i.e., using individual CIs loses statistical power as illustrated in Table 1 (for n = 10, 20, 40, 60 and 80 at ? = 0.05). Table 1 clearly shows that the Pr 24 Table 1. The Relative Power of Overlap as Compared to the Standard Method for Different Sample Sizes n and /( 2)?? Combinations n 2 ? ? 1?? 1 ??? [ ]100% 1 ???? ?? n 2 ? ? 1?? 1 ??? [ ]100% 1 ???? ?? 
10 0 0.050000 0.005575 88.850807 10 0.2 0.096935 0.016535 82.941952 10 0.2 0.096935 0.016535 82.941952 30 0.2 0.194775 0.046889 75.926797 10 0.4 0.244141 0.065946 72.988712 50 0.2 0.292989 0.087310 70.200085 10 0.6 0.475101 0.190941 59.810521 70 0.2 0.387332 0.136000 64.887986 10 0.8 0.715617 0.404396 43.489888 90 0.2 0.475101 0.190941 59.810521 10 1 0.885379 0.651905 26.369907 110 0.2 0.554768 0.250096 54.918821 10 1.2 0.966730 0.846828 12.402799 130 0.2 0.625674 0.311552 50.205361 10 1.4 0.993192 0.951076 4.240407 150 0.2 0.687770 0.373606 45.678672 10 1.6 0.999031 0.988926 1.011467 170 0.2 0.741418 0.434816 41.353531 10 1.8 0.999905 0.998251 0.165374 190 0.2 0.787231 0.494017 37.246245 10 2 0.999994 0.999809 0.018425 210 0.2 0.825958 0.550319 33.372049 20 0 0.050000 0.005575 88.850807 230 0.2 0.858407 0.603086 29.743564 20 0.2 0.145473 0.030356 79.132788 250 0.2 0.885379 0.651905 26.369907 20 0.4 0.432158 0.162818 62.324449 270 0.2 0.907642 0.696558 23.256232 20 0.6 0.765259 0.464729 39.271657 290 0.2 0.925899 0.736982 20.403620 20 0.8 0.947141 0.789850 16.606937 310 0.2 0.940785 0.773239 17.809209 20 1 0.994000 0.955465 3.876765 330 0.2 0.952858 0.805484 15.466533 20 1.2 0.999671 0.995267 0.440547 350 0.2 0.962600 0.833939 13.365991 20 1.4 0.999991 0.999758 0.023375 400 0.2 0.979327 0.890313 9.089308 20 1.6 1.000000 0.999994 0.000573 450 0.2 0.988775 0.929332 6.011824 20 1.8 1.000000 1.000000 0.000006 500 0.2 0.994000 0.955465 3.876765 20 2 1.000000 1.000000 0.000000 600 0.2 0.998354 0.983297 1.508145 40 0 0.050000 0.005575 88.850807 700 0.2 0.999568 0.994127 0.544334 40 0.2 0.244141 0.065946 72.988712 800 0.2 0.999891 0.998043 0.184785 40 0.4 0.715617 0.404396 43.489888 900 0.2 0.999973 0.999377 0.059617 40 0.6 0.966730 0.846828 12.402799 1100 0.2 0.999999 0.999944 0.005488 40 0.8 0.999031 0.988926 1.011467 1300 0.2 1.000000 0.999995 0.000444 40 1 0.999994 0.999809 0.018425 1500 0.2 1.000000 1.000000 0.000032 40 1.2 1.000000 0.999999 0.000072 20 0.5 0.608779 0.296070 51.366706 60 0 0.050000 0.005575 88.850807 40 0.5 0.885379 0.651905 26.369907 60 0.2 0.340845 0.110745 67.508566 60 0.5 0.972127 0.864590 11.062062 60 0.4 0.872528 0.628007 28.024464 80 0.5 0.994000 0.955465 3.876765 60 0.6 0.996402 0.969657 2.684165 100 0.5 0.998817 0.987066 1.176501 60 0.8 0.999989 0.999693 0.029611 120 0.5 0.999782 0.996589 0.319361 60 1 1.000000 1.000000 0.000032 140 0.5 0.999962 0.999167 0.079444 80 0 0.050000 0.005575 88.850807 160 0.5 0.999994 0.999809 0.018425 80 0.2 0.432158 0.162818 62.324449 180 0.5 0.999999 0.999959 0.004033 80 0.4 0.947141 0.789850 16.606937 200 0.5 1.000000 0.999991 0.000841 80 0.6 0.999671 0.995267 0.440547 220 0.5 1.000000 0.999998 0.000168 80 0.8 1.000000 0.999994 0.000573 240 0.5 1.000000 1.000000 0.000032 80 1 1.000000 1.000000 0.000000 260 0.5 1.000000 1.000000 0.000006 25 of type ? error from two individual CIs is always larger than the Pr of type ? error from the Standard method (i.e.,??>?). Thus, the statistical power of Overlap method is less than that of the standard method (1 1??? 0, clearly the expression for ?' given in Eq. (7b) stays in tact but the Standard method type II error Pr becomes ? 1 = ?( Z ? ? n/2/??), where ? = ? x ? ? y . Because for ? > 0, ?' = ?( /2 2 Z ? ? n/2/??) ? ?( /2 2 Z ? ? ? n/2/??) > ? = ?( /2 Z ? ? n/2/? ?) ? ?( /2 Z ? ? ? n/2/??) > ? 1 = ?( Z ? ? n/2/??), it follows that the impact of Overlap on type II error Pr for the one-sided alternative is greater that that of the two-sided alternative. Note that ? 1 becomes equal to ? only at ? 
= 0. 3.2 The Case of Known but Unequal Variances If variances of the two independent processes are known but not equal, then statistical theory dictates that the two sample sizes should be allocated according to n x = x xy N?? ?+? , n y = y xy N? ? ? +? , (8) where N = n x + n y = the total recourses available to the experimenter. The sample size allocations given in equations (8) lead to the minimum SE( xy? ) = xy ()/N?+? . Schenker and Gentleman (2001) make similar statement as above but did not use equation (8) to set the values of n x and n y . They use notational procedure by letting / / x x yy n k n ? ? = . Note that Schenker and Gentleman (2001) refer to the limiting value of small-letter k as the SE ratio because they investigated the impact of Overlap on type I and II error rates only when n x and n y ? ?. Since we discuss both limiting case 28 (populations) and small to moderate sample size cases in this dissertation, the small k refers to the standard error ration for samples, ie, / / x x yy Sn k Sn = and K refers to the SE ratio for populations, ie, K= / / x x yy n n ? ? = SE(x) SE(y)/ . Clearly, SE( xy? ) = 22 // x xyy nn??+ = 22 2 K/ / yy yy nn??+ = 2 1K/ yy n? + = SE( y) 2 1K+ (9a) Substituting equations (9a) into the Standard (1 ? ?)?100% CI: /2 SE( )x yZ xy ? ? ??? leads to xy? ? 2 /2 1K/ yy Z n ? ? +? ? x ? ? y ? xy? + 2 /2 1K/ yy Z n ? ? + (9b) Thus, the Standard CIL equals to 2 /2 21K/ yy Z n ? ? + . Equation (9b) shows that the null hypothesis H 0 : ? x ? ? y = 0 must be rejected at the LOS ? iff the CI in equation (9b) excludes zero, or iff xy? > 2 /2 1K/ yy Z n ? ? + . (9c) However, requiring that the two independent CIs must not overlap in order to reject H 0 : ? x ? ? y = 0 at the LOS ?, is equivalent to requiring that either L(? x ) > U(? y ) or L(? y ) > U(? x ). These two inequalities lead to the Overlap rejection of H 0 : ? x ?? y = 0 iff xy? > /2 /2 //+ x xyY Z nZ n ?? ??= /2 (1 K ) / yy Z n ? ? + (10) Therefore, if the exact Pr of type I error is ? but we reject H 0 when the two independent CIs are disjoint, the Overlap type I error Pr reduces to ??=Pr[ y ? /2 y y Z/n ? ? > x+ /2 x x Z/n ? ? ]+Pr[ x ? /2 x x Z/n ? ? > y+ /2 y y Z/n ? ? ] 29 = 2?Pr( x y? > /2 y y Z(1K)/n ? ?+ ) = 2?Pr(Z > 2 /2 Z(1K)/1K ? ++) (11) which is identical to that of equation (7) provided by Schenker and Gentleman (2001, p. 184) when their standardized difference, d, is set equal to 0. Eq.(11) shows that as K ? 0 or ?, the value of ?? slowly approaches the exact type I error probability ? [consistent with Table 3 on p. 3 of Payton et al. (2003)]. Further, since 2 (1 K)/ 1 K++> 1 and 2 /2 Z (1K)/1K ? ++ > /2 Z ? , then ??= 2?Pr[Z > /2 Z(1K) ? + / 2 1K+ ] is smaller than ? = 2?Pr(Z > /2 Z ? ), which means that the Overlap always leads to a smaller type I error Pr than that of the Standard method, consistent with Figure 3 of Schenker and Gentleman (2001, p. 184). Table 4 shows the value of ??at? = 0.01and 0.05 for different K values. Note that Table 4 values are valid for either Gaussian underlying distributions or for the limiting values of n x and n y . Figure 5(a) and 5(b) show that as K increases, the value of ?? slowly approaches the exact type I error probability ?. To determine the minimum value of ?? from Eq.(11), let g(K)= /2 Z(1K)/ ? + 2 1K+ . The first derivative of g(K) is (K)g? = /2 223 1K(1K) [] 1K (1K) Z ? + ? ++ . Setting (K)g? = 0 will lead to K = 1. To ascertain whether K=1 is a point of minimum or maximum, the second differentiation yields: (K)g?? 
= 2 /2 23/2 25/2 23/2 2K 3(1 K)K 1 K [] (1 K ) (1 K ) (1 K ) Z ? ?+ + +? ++ Substituting K =1 and Z 0.025 = 1.959964 into the above equation results in (1)g?? = 0.353553391 0?<, which shows that K =1 maximize g(K). Thus, ?? has the minimum 30 value at K =1, as shown in Table 4. Table 4. The Type I Error Pr of Two Individual CIs with Different K at ? = 0.05 and 0.01 K (0.05)? ?? = K (0.05)? ?? = K (0.01)? ?? = K (0.01)? ?? = 1 0.005574597 6 0.024101169 1 0.000269717 6 0.003034255 1.2 0.005772632 7 0.026592621 1.2 0.000285833 7 0.003565806 1.4 0.006255214 8 0.028674519 1.4 0.000326631 8 0.004034767 1.6 0.006916773 9 0.030432273 1.6 0.000385984 9 0.004447733 1.8 0.007695183 10 0.031932004 1.8 0.000460718 10 0.004812093 2 0.008549353 11 0.033224353 2 0.000548586 11 0.005134764 2.2 0.009450168 12 0.034348214 2.2 0.000647644 12 0.005421799 2.4 0.010376313 13 0.035333699 2.4 0.000756080 13 0.005678348 2.6 0.011312004 14 0.036204361 2.6 0.000872186 14 0.005908733 2.8 0.012245574 15 0.036978819 2.8 0.000994382 15 0.006116572 3 0.013168478 16 0.037671953 3 0.001121233 16 0.006304887 3.2 0.014074567 17 0.038295773 3.2 0.001251466 17 0.006476214 3.4 0.014959516 18 0.038860068 3.4 0.001383967 18 0.006632687 3.6 0.015820399 19 0.039372889 3.6 0.001517779 19 0.006776110 3.8 0.016655348 20 0.039840912 3.8 0.001652089 20 0.006908016 4 0.017463296 21 0.040269717 4 0.001786217 21 0.007029710 4.2 0.018243773 22 0.040664002 4.2 0.001919599 22 0.007142316 4.4 0.018996754 23 0.041027747 4.4 0.002051774 23 0.007246798 4.6 0.019722537 24 0.041364351 4.6 0.002182369 24 0.007343994 4.8 0.020421655 25 0.041676725 4.8 0.002311086 25 0.007434630 5 0.021094804 26 0.041967382 5 0.002437694 26 0.007519342 Exact Type I Error (? =0.01) 0 0.002 0.004 0.006 0.008 0.01 0.012 K ? ' alpha' al pha Exact Type I Error (? =0.05) 0 0.01 0.02 0.03 0.04 0.05 0.06 1 2.45 3.9 5.35 6.8 8.25 9.7 11.2 12.6 14.1 15.5 17 18.4 19.9 21.3 22.8 24.2 25.7 27.1 28.6 30 K ? ' alpha' alpha Figure 5(a) Figure 5(b) 31 As before, let ? represent the amount of overlap length between the two individual CIs. Similar procedure as in previous section yields ? = U(? y ) ? L(? x ) = ( /2 / yy y Zn ? ?+? ) ? ( /2 / x x x Zn ? ??? ) = /2 (/ / ) x xy y Z nn ? ??+ ?( x y? ) (12a) Let r ? be the borderline value of ? at which H 0 is barely rejected at an ?-level. From Eq. (9c), H 0 : ? x = ? y must be rejected iff xy? > 2 /2 1K/ ? ? + y y Z n , which upon substitution into (12a) results in ? r = /2 (/ / ) x xy y Z nn ? ??+? ( x y? ) = /2 (/ / ) x xy y Z nn ? ??+? 2 /2 1K/ ? ? + y y Z n . Substituting /K/??= x xyy nn in the above equation yields r ? = 2 /2 ( / )[1K 1K] ? ? ?+? + yy Zn (12b) Eq. (12b) indicates that H 0 must be rejected at the LOS ? or less iff ? ? /2 (/ ) ? ? ? yy Zn 2 [1 K 1 K ]+?+ . Further, the span of the two individual CIs is U(? x ) ? L(? y ) = ( /2 / x x x Zn ? ?+? ) ? ( /2 / yy y Zn ? ??? ) = /2 (/ / ) x xy y Z nn ? ??+ + ( x y? ) (12c) Thus, the exact percent ?-overlap is given by ? = /2 /2 (/ / )( ) (/ / )( ) xxyy xxyy Z nnxy Z nnxy ? ? ?? ?? +?? ? ++? 100% 32 ? ? = /2 /2 (1 K ) / ( ) (1 K ) / ( ) ? ? ? ? + ?? + +? yy yy Z nxy Z nxy ?100% (12d) As before, ? lies in the closed interval [0, 100%]. The % overlap in Eq. (12d) clearly shows that as xy0?> increases, the P-value of the test decreases, and Eq.(12d) shows that the % overlap also decreases. Because H 0 must be rejected at the LOS ? or less iff ||x y? ? 2 /2 1K/ ? ? + y y Z n , the maximum % overlap above which H 0 cannot be rejected at an ?-level is given by r (k)? 
= 2 /2 /2 2 /2 /2 (1 K) / 1 K / (1 K) / 1 K / ?? ??+? + ? ++ + yyy y yyy y Z nZ n Z nZ n 100% = 2 2 1K 1K 1K 1K +?+ ? +++ 100% (12e) Eq. (12e) shows that the maximum prevent overlap doesn?t depend on ? and reduces to 17.1573% when / K / ? ? = x x y y n n = 1. It can be verified that the 1 st derivative of r (K)? is r 222 22K (K) 1K (1K 1K) ? ??= +?+++ whose root is K = 1. Moreover, the value of the 2 nd derivative of r (K)? at K = 1 is ? 0.121320344, which means that K = 1 maximizes the % overlap and the null hypothesis H 0 : ? x = ? y must be rejected at any ? if the overlap does not exceed 17.1573%. The farther K is from 1, the smaller the amount of allowable overlap becomes (i.e., the Overlap procedure becomes less deficient). For example, at K = 2 or 0.50, the % overlap reduces to 14.5898%. This implies that when the limiting SE ratio is K = 2 or 0.50, the two individual CIs can overlap up to 14.5898% and H 0 : ? x = ? y must still 33 be rejected at the LOS ? or less. At K = 3 or 1/3, the % overlap reduces to 11.696312% below which H 0 must be rejected at ? or less level; at K = 10, it reduces to 0.04513682. As K ? 0 or ?, r ? ? 0 so that the Overlap procedure very gradually approaches an exact ?- level test [consistent with Table 3 of Payton et al. (2003, p. 3)] Furthermore, what should the individual confidence level, (1? ?), be so that comparisons of individual CIs will lead to the exact ??level test? From Eq.(12c), the corresponding span of two individual CIs at confidence level (1? ?) is U?(? x ) ? L?(? y ) = /2 (1 K ) / ? ? + y y Z n + ( x y? ). From Eq.(9c), H 0 : ? x ? ? y = 0 must be rejected at the LOS ? iff xy? > 2 /2 1K/ ? ? + yy Z n . Substituting the critical limit x y? = /2 (/ ) ? ? yy Z n 2 1K?+ into U?(? x ) ? L?(? y ) results in U?(? x ) ? L?(? y ) = /2 (1 K ) / ? ? + y y Z n + 2 /2 1K/ ? ? + yy Z n . Furthermore, the (1 ? ?)?100% CIL from the Standard method is equal to 2 /2 21K/ ? ? + yy Z n . Thus, the individual confidence levels, (1? ?), should be set as follows which in turn leads individual CIs to an exact ??level test. ? /2 (1 K ) / ? ? + y y Z n + 2 /2 1K/ ? ? + yy Z n = 2 /2 21K/ ? ? + yy Z n ? /2 (1 K ) / ? ? + y y Z n = 2 /2 1K/ ? ? + yy Z n ? ? = 2 /2 2 [ 1K/(1K)]Z ? ??? + + (13) Eq.(13) shows that the level of each CI must be set at (1 ? ?) = 1? 2 /2 2[ 1K ? ??? +Z / (1 K)]+ in order to reject H 0 at ? LOS iff the two CIs are disjoint, which is in agreement with Eq.(8) of Payton et al. (2003, p.2). To verify this assertion, let q(K) = 34 2 /2 1K/(1K)Z ? ?++. The 1 st derivative of q(K) is given by (K)q? = 2 2 2 K1K (1 K ) 1K(1K) + ? + ++ . Setting (K)q? = 0 results in K =1. Moreover, the 2 nd derivative of q(K) is K1 2 2 dq(K) dK = =?0.346476 <0, which implies K =1 maximizes q(k) and in turn also maximizes ?. Table 5 shows that as K increases toward 1, ? also increases to reach its maximum and then decreases for the fixed ? as K departs from 1. Table5. Values of ? Versus K at ? = 0.05 and ? = 0.01 0.05?= 0.01?= K ? K ? K ? K ? 
0.2 0.095783 3.5 0.112872 0.2 0.028594 3.5 0.037197 0.4 0.131601 4 0.106045 0.4 0.047523 4 0.033663 0.6 0.153132 4.5 0.100440 0.6 0.060458 4.5 0.030857 0.8 0.163187 5 0.095783 0.8 0.066863 5 0.028594 1 0.165776 6 0.088541 1 0.068548 6 0.025201 1.2 0.164038 7 0.083206 1.2 0.067415 7 0.022802 1.4 0.160015 8 0.079131 1.4 0.064818 8 0.021030 1.6 0.154931 9 0.075927 1.6 0.061587 9 0.019674 1.8 0.149483 10 0.073346 1.8 0.058189 10 0.018606 2 0.144051 20 0.061628 2 0.054869 20 0.014040 2.2 0.138834 30 0.057723 2.2 0.051746 30 0.012627 2.5 0.131601 40 0.055779 2.5 0.047523 40 0.011944 3 0.121265 50 0.054616 3 0.041713 50 0.011543 Lastly, the impact of Overlap on type II error probabilities for the known variance normal case is investigated. Comparing Eq.(9c) with Eq.(10), it clearly shows that the RHS of Eq. (10) is larger than that of Eq. (9c)? /2 (1 K ) / ? ? + yy Z n ? 2 /2 1K/ ? ? + yy Z n = 2 /2 (/ )(1K 1K) ? ? +?+ yy Zn > 0 35 because 2 1K 1K+>+ . Thus rejecting H 0 when the two separate CIs are disjoint is more stringent than using the SMD of x ?y and will always lead to much less statistical power. The Standard method Pr of committing a type II error (assuming ? x > ? y ), using Figure 3 is given by ? = Pr[ 22 /2 x x y y Z/n/n ? ??+?? x ?y ? 22 /2 x x y y Z/n/n ? ?+? ??] (14a) = 22 22 /2 /2 22 22 22 // // () Pr[ ] // // // xx yy xx yy xx yy xx yy xx yy Znn Znn xy nn nn nn ?? ? ?? ??? ? ?? ?? ?? ?+? +? ?? ?? ++ = 22 22 /2 /2 Pr[ / / / / / / ] x xyy xxyy Z nnZZ nn ?? ?? ? ?? ??? + ??? + (14b) = /2 /2 22 // Pr 1K 1K x xxx ZZZ ?? ? ??? ?? ?? ????? ++?? (14c) As in Schenker et al. (2001), let d represent a standardized difference, i.e., d = 22 // x xyy nn ? ??+ = 2 / 1K ? ? + yy n . Thus, the above equation results in the following form: /2 /2 ()( )Z dZd ?? ?=? ? ??? ? (14d) Eq. (14d) is the same result as Schenker et al. (2001, p.184) in their formula (6), except that they provide the equation for 1??. Furthermore, when the null hypothesis H 0 : ? x ? ? y = 0 is not rejected at LOS ? iff the two independent CIs ( x ?Z ?/2 xx /n? ? ? x ?x+ Z ?/2 xx /n? ) and ( y?Z ?/2 yy /n? ? ? y ? y+ Z ?/2 yy /n? ) are overlapping, the Pr of a type II error (assuming ? x > ? y ) from the Overlap method is given by 36 ??= Pr(Overlap?? > 0) = Pr{[ ( ) ( ) x y LU? ?? ]?[( ) ( ) yx LU? ?? ]|? x ? ? y > 0} = Pr{[ x ? /2 Z ? x x n ? ?y+ /2 Z ? y y n ? ]? [ /2 y y yZ n ? ? ? ? x+ /2 Z ? x x n ? ]| 0? > } = Pr{[ x ?y ? /2 Z ? x x n ? + /2 Z ? y y n ? ]? [ /2 y y Z n ? ? ? ? /2 Z ? x x n ? ? x y? ]| 0? > } = Pr{[ /2 y y Z n ? ? ? ? /2 Z ? x x n ? ? x ?y ? /2 Z ? x x n ? + /2 Z ? y y n ? | 0? > ] (15a) /2 /2 222 () () Pr[] ?+? +? ?? =?? +++ yy xx xy xy yy xx xy xy xy ZZ nn nn xy nn nn nn ?? ? ? ?? ? ? ? ??? = /2 /2 22 22 (1 K ) (1 K ) Pr 1K 1K yy xx xy xy ZZZ nn nn ?? ? ? ? ? ?? ++ ??? ? ++ = ?( /2 2 (1 K ) 1K Z ? + + ?d) ? ?( ? /2 2 (1 K ) 1K Z ? + + ?d) (15b) where K 2 = V(x) V(y)/ , and 2 (1 K ) 1 K/+ += //()/??+ xxyy nn 22 //??+ x xyy nn. Thus, the PWF (power function) of the Overlap procedure in the case of known-Variances is 1? ?? = ?(d ? /2 2 (1 K ) 1K Z ? + + ) + ?( ? /2 2 (1 K ) 1K Z ? + + ?d) (15c) The result in Eq. (15c) is the same as that of Schenker et al. (2001, p. 184) in their Eq.(7), as they also provide the expression for 1 ? ??. Schenker et al. (2001) just provide both 37 power functions without any explanation. The step by step derivations provided above have not been presented in statistical literature. For the comparison of ? and ?? see Table 6A. In Table 6A, the comparison is done for both ? 
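= 0.05 and α = 0.01.

The closed-form quantities appearing in this section are easy to verify numerically. The sketch below (MATLAB with the Statistics Toolbox assumed; the handles rho_r and gamma_K are our own names) evaluates the maximum rejection overlap of Eq. (12e) and the individual confidence coefficient of Eq. (13), and reproduces two entries of Table 5.

```matlab
z       = norminv(0.975);                               % Z_{0.025}
rho_r   = @(K) (1 + K - sqrt(1 + K.^2)) ./ (1 + K + sqrt(1 + K.^2));   % Eq. (12e)
gamma_K = @(K) 2*normcdf(-z*sqrt(1 + K.^2)./(1 + K));                  % Eq. (13)
rho_r(1)                    % 0.171573, i.e., the 17.1573% maximum overlap
[gamma_K(1) gamma_K(5)]     % 0.165776 and 0.095783, matching Table 5 at alpha = 0.05
```

The power functions of Eqs. (14d) and (15c) can be evaluated with the same normcdf calls for any standardized difference d, which is the comparison organized in Table 6A; the table carries out that comparison at both α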
= 0.05 and ? = 0.01. As the table shows, if d is fixed, as k increases, the type II error Pr increases. If k is fixed, the type ? error rate decreases as d increases. Thus, as Table 6A shows, the probability of type II error based-on Overlap is larger than that of the Standard method, i.e., the Overlap method will lead to smaller statistical power. Secondly, when k is fixed, as d increases, ???? is not necessarily increasing or decreasing, this is consistent with figure 4 of Schenker and Gentleman (2001). Furthermore, Table 6A shows that at a fixed K the difference in percent relative power decreases as the standardized difference d increases for both ? = 0.05 and ? = 0.01. By definition, for an ?-level test the relative efficiency of the Overlap to the Standard method, assuming the same statistical power, is given by RELEFF(Overlap to Standard) = RELEFF(O, ST) = xy xy (n n ) / ( )nn??+ + (15d) where the type II error Pr of the Standard method is given by Eq. (14) and n? is the Overlap sample size for which ?? = ?. The exact solution to n? is obtained by setting the first argument in Eq. (14b) to that of (15b), i.e., 2 0.025 Z(1 )/1K??++K ? 22 xx yy //n /n? ??? +? = 0.025 Z ? 22 xx yy //n /n?? +? (15e) Schenker and Gentleman (2001) state in their section 3 that the minimum ARE is ? which clearly occurs at their limiting SE ratio of k = 1. We could obtain their value if we equate the argument of ? on the RHS of Eq. (14a), 22 /2 x x y y Z/n/n ? ?+?, to that of 38 Table 6A. The Relative Power of Overlap with the Standard Method for Different Sample Sizes n and /( 2)?? Combinations for the Case of Known but Unequal Variances 0.05? = 0.01? = x x n? ? d k 1?? 1 ??? ( )100% 1 ???? ?? x x n? ? d k 1?? 1 ??? ( )100% 1 ???? ?? 0.2 0.141 1 0.06898 0.00853 87.6361 0.1 0.071 1 0.01224 0.00035 97.10660 0.4 0.283 1 0.09352 0.01281 86.3005 0.3 0.212 1 0.01809 0.00060 96.67198 0.8 0.566 1 0.16323 0.02738 83.2293 0.5 0.354 1 0.02626 0.00100 96.17487 1 0.707 1 0.21026 0.03895 81.4745 1 0.707 1 0.06166 0.00333 94.60226 1.5 1.061 1 0.36849 0.08705 76.3756 1.5 1.061 1 0.12973 0.00982 92.43060 2 1.414 1 0.58524 0.17459 70.1672 2 1.414 1 0.24539 0.02584 89.46857 0.2 0.111 2 0.06445 0.00913 85.8305 0.1 0.055 1.5 0.01172 0.00044 96.27097 0.4 0.222 2 0.08220 0.01256 84.7235 0.3 0.166 1.5 0.01598 0.00066 95.86846 0.8 0.444 2 0.12947 0.02295 82.2715 0.5 0.277 1.5 0.02153 0.00099 95.42442 1 0.555 2 0.15994 0.03052 80.9184 1 0.555 1.5 0.04327 0.00255 94.10605 1.5 0.832 2 0.25936 0.05930 77.1341 1.5 0.832 1.5 0.08120 0.00614 92.43297 2 1.109 2 0.39501 0.10771 72.7329 2 1.109 1.5 0.14253 0.01379 90.32345 0.2 0.089 2 0.06141 0.01108 81.9557 0.1 0.045 2 0.01137 0.00065 94.30995 0.4 0.179 2 0.07490 0.01426 80.9631 0.3 0.134 2 0.01462 0.00089 93.87954 0.8 0.358 2 0.10911 0.02310 78.8304 0.5 0.224 2 0.01866 0.00123 93.41816 1 0.447 2 0.13034 0.02908 77.6870 1 0.447 2 0.03329 0.00262 92.11581 1.5 0.671 2 0.19735 0.05014 74.5919 1.5 0.671 2 0.05678 0.00535 90.57311 2 0.894 2 0.28663 0.08272 71.1422 2 0.894 2 0.09268 0.01042 88.75241 0.2 0.074 3 0.05934 0.01338 77.4461 0.1 0.037 2.5 0.01113 0.00093 91.64804 0.4 0.149 3 0.07008 0.01643 76.5492 0.3 0.111 2.5 0.01372 0.00121 91.19269 0.8 0.297 3 0.09634 0.02441 74.6610 0.5 0.186 2.5 0.01684 0.00156 90.71390 1 0.371 3 0.11216 0.02953 73.6684 1 0.371 2.5 0.02749 0.00291 89.40730 1.5 0.557 3 0.16065 0.04652 71.0407 1.5 0.557 2.5 0.04351 0.00525 87.93005 2 0.743 3 0.22353 0.07109 68.1980 2 0.743 2.5 0.06680 0.00918 86.26369 0.2 0.063 3 0.05787 0.01569 72.8768 0.1 0.032 3 0.01095 0.00125 
88.56141 0.4 0.126 3 0.06673 0.01864 72.0702 0.3 0.095 3 0.01310 0.00156 88.09595 0.8 0.253 3 0.08783 0.02600 70.3948 0.5 0.158 3 0.01562 0.00193 87.61275 1 0.316 3 0.10023 0.03054 69.5255 1 0.316 3 0.02385 0.00326 86.32330 1.5 0.474 3 0.13738 0.04498 67.2582 1.5 0.474 3 0.03560 0.00537 84.91009 2 0.632 3 0.18434 0.06479 64.8547 2 0.632 3 0.05197 0.00865 83.36362 0.2 0.055 4 0.05678 0.01788 68.5051 0.1 0.027 3.5 0.01082 0.00159 85.26635 0.4 0.110 4 0.06430 0.02072 67.7825 0.3 0.082 3.5 0.01265 0.00192 84.80442 0.8 0.220 4 0.08183 0.02758 66.2952 0.5 0.137 3.5 0.01475 0.00231 84.32904 1 0.275 4 0.09194 0.03169 65.5304 1 0.275 3.5 0.02139 0.00362 83.07963 1.5 0.412 4 0.12165 0.04433 63.5559 1.5 0.412 3.5 0.03048 0.00557 81.73909 2 0.549 4 0.15839 0.06099 61.4915 2 0.549 3.5 0.04273 0.00842 80.30232 39 (15a), namely /2 Z ? ? ? x x n + /2 Z ? ? ? y y n , and letting ? x = ? y and n x = n y = n, but this seems to ignore the true mean difference ? = ? x ? ? y . There are a numerous solutions for the Overlap sample sizes x n? and y n? from Eq. (15e) that must be at least as large as n x and n y in order to make the Overlap attain the same statistical power as the Standard method. Fortunately, an exact solution can be obtained only when ? x = ? y and n x = n y because it will be shown below that optimum efficiency will be achieved if n? x = n? y and as a result the above equation reduces to 0.025 Z2? (/)n/2??? = 0.025 Z ? (/)n/2?? The solution to this last equation is n? = 0.025 Z(22)(/)/? ??+ n (15f) Eq. (15f) clearly shows that as ?/? increases, the value of n? decreases. Further, as n increases, the RELEFF of Overlap to the Standard method (n/n?) increases. In fact, the larger ?/? is, the faster the RELEFF(O, ST) approaches 100% as n ? ?. To obtain a rough approximation to (15e), we compare the 1 st statement for ? with the 4 th statement for ?? and equate 22 xx yy /n /n?+? to / x x n? ? + / yy n? ? . Dividing both sides of the last equality by / x x n? yields 2 1K+ = / x x nn? + K / yy nn? , where K = / / y y x x n n ? ? is called the SE ratio. The equation 2 1K+ = / x x nn? + K / yy nn? shows that the solutions x n? and y n? do not depend on the specific values of ? x and ? y but rather only on their ratio / x y ? ? . Unfortunately, the same cannot be said about the ratio 40 R n = n y /n x , i.e., x n? and y n? do depend on the specific values of n x and n y and not just on their ratio R n . Further, the equation 2 1K+ = / x x nn? + K / yy nn? clearly shows that when x n? = n x and y n? = n y , the RHS reduces to 1+ K which obviously exceeds the LHS 2 1K+ for all k. As K ? ?, this last equation also shows that x n? ? n x and y n? ? n y so that the Overlap becomes an exact ?-level test. When K > 1, the minimum n? x + y n? occurs (i.e., the Overlap achieves its maximum relative efficiency) when n? x ? y n? and vice a versa when K < 1. It seems that we have a constrained optimization problem where (n x +n y )/( n? x + y n? ) is to be maximized subject to the nonlinear constraint / x x nn? + K/ yy nn? = 2 1K+ . The solution to this optimization can be obtained through the use of Lagrangian multipliers as shown below. The objective is to maximize f(n? x , y n? ) = (n x +n y )/( n? x + y n? ) subject to 2 1K+ ? / x x nn? ?K/ yy nn? = 0 and hence it is sufficient to maximize f(n? x , y n? ) = N/( n? x + y n? ) + ?( 2 1K+ ? / x x nn? ?K/ yy nn? ), where N = n x + n y and ? is an arbitrary constant. Taking the partial derivatives of f(n? x , y n? ) with respective to n? x & y n? 
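and setting them to zero will locate the constrained optimum; that algebra follows the brief numerical check below.

For the balanced case n_x = n_y = n with σ_x = σ_y = σ (so K = 1), Eq. (15f) can be checked directly. The sketch reads Eq. (15f) as √n′ = Z_{0.025}(2 − √2)(σ/Δ) + √n, the form consistent with the surrounding derivation; the handles nprime and releff are our own names (MATLAB, Statistics Toolbox assumed).

```matlab
z      = norminv(0.975);
nprime = @(n, dOverSig) ((2 - sqrt(2))*z./dOverSig + sqrt(n)).^2;   % Eq. (15f)
releff = @(n, dOverSig) n ./ nprime(n, dOverSig);                   % n / n'
releff(4, 0.2)       % 0.06676, the first entry of Table 6B
releff(100, 1.0)     % 0.80463, the Delta/sigma = 1, n = 100 entry
```

Both values agree with Table 6B, and the code makes visible how the relative efficiency climbs toward 100% as n and Δ/σ grow. Returning to the constrained optimization, we take the partial derivatives of f(n′_x, n′_y) with respect to n′_x and n′_y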
and set them equal to zero, which yields

$$\frac{\partial f}{\partial n'_x} = -N\,(n'_x + n'_y)^{-2} + \lambda\,\frac{\sqrt{n_x}/2}{(n'_x)^{3/2}} \;\overset{\text{set}}{=}\; 0, \qquad
\frac{\partial f}{\partial n'_y} = -N\,(n'_x + n'_y)^{-2} + \lambda\,\frac{K\sqrt{n_y}/2}{(n'_y)^{3/2}} \;\overset{\text{set}}{=}\; 0.$$

Because $\lambda$ is a completely arbitrary constant, the above system is satisfied as soon as we equate $\sqrt{n_x}/(n'_x)^{3/2}$ to $K\sqrt{n_y}/(n'_y)^{3/2}$, i.e.,

$$\frac{\sqrt{n_x}}{(n'_x)^{3/2}} = \frac{K\sqrt{n_y}}{(n'_y)^{3/2}}
\;\Rightarrow\; \frac{n_x}{(n'_x)^{3}} = \frac{K^2\, n_y}{(n'_y)^{3}}
\;\Rightarrow\; \left(\frac{n'_y}{n'_x}\right)^{3} = \frac{K^2\, n_y}{n_x}
\;\Rightarrow\; \frac{n'_y}{n'_x} = \left(\frac{K^2\, n_y}{n_x}\right)^{1/3}.$$

Thus the optimum solution is obtained by selecting $n'_x$ and $n'_y$ in such a manner that their ratio $n'_y/n'_x$ is close to $(K^2 n_y/n_x)^{1/3}$.

Table 6B provides the RELEFF of the Overlap to the Standard method for various values of $\Delta/\sigma$, but only for the case $n_x = n_y = n$ and $\sigma_x = \sigma_y = \sigma$, for which K = 1. When $\sigma_x \neq \sigma_y$, there are uncountably many ways that K can equal 1; the procedure then is to solve for $n'_x$ and $n'_y$ from Eq. (15f) and to compute the RELEFF from the ratio $(n_x + n_y)/(n'_x + n'_y)$.

The results of this chapter verify what has been reported in the Overlap literature for the limiting case (i.e., large sample sizes) by Goldstein and Healy (1995), Payton et al. (2000), Schenker and Gentleman (2001), and Payton et al. (2003). Payton et al. (2000) report some approximate Overlap results for smaller sample sizes (n_x = n_y = n = 5(5)50), but those were obtained by simulation rather than by the exact normal theory applied here in Chapter 3. Further, it must be emphasized that the results reported in this chapter carry over to non-normal underlying populations only if both n_x and n_y exceed 60. This is due to the Central Limit Theorem (CLT), which states that the sampling distribution of the mean of a non-normal population approaches normality as n → ∞. In practice, the rate of approach to normality depends primarily on the skewness and kurtosis of the underlying distributions. It is well known that both the skewness and kurtosis of a normal universe are zero, and the closer the skewness and kurtosis of the parent populations are to zero, the more rapidly the sample means ($\bar{x}$ and $\bar{y}$) approach normality. For example, because the skewness of a uniform distribution is zero and its kurtosis is $-1.20$, only samples of size at least 6 are needed for the corresponding sample mean to be approximately normally
Table 6B. RELEFF of Overlap to the Standard Method at α
= 0.05 and K=1 0.2 0.4 0.6 n RELEFF n RELEFF n RELEFF n RELEFF n RELEFF n RELEFF 4 0.06676 25 0.21671 4 0.16864 25 0.40361 4 0.26117 25 0.52305 5 0.07858 30 0.23840 5 0.19175 30 0.43053 5 0.29037 30 0.54922 6 0.08945 35 0.25758 6 0.21201 35 0.45337 6 0.31519 35 0.57094 8 0.10895 40 0.27479 8 0.24634 40 0.47312 8 0.35577 40 0.58940 10 0.12617 50 0.30462 10 0.27479 50 0.50592 10 0.38814 50 0.61940 12 0.14163 60 0.32987 12 0.29907 60 0.53236 12 0.41495 60 0.64305 14 0.15571 70 0.35174 14 0.32023 70 0.55438 14 0.43776 70 0.66237 16 0.16864 80 0.37098 16 0.33898 80 0.57313 16 0.45754 80 0.67859 18 0.18060 90 0.38814 18 0.35577 90 0.58940 18 0.47495 90 0.69248 20 0.19175 100 0.40361 20 0.37098 100 0.60370 20 0.49048 100 0.70456 0.8 1 1.5 n RELEFF n RELEFF n RELEFF n RELEFF n RELEFF n RELEFF 4 0.33898 25 0.60370 4 0.40361 25 0.66139 4 0.52305 25 0.75211 5 0.37098 30 0.62787 5 0.43658 30 0.68345 5 0.55501 30 0.76981 6 0.39760 35 0.64766 6 0.46358 35 0.70136 6 0.58052 35 0.78401 8 0.44009 40 0.66431 8 0.50592 40 0.71632 8 0.61940 40 0.79574 10 0.47312 50 0.69103 10 0.53823 50 0.74014 10 0.64822 50 0.81419 12 0.49994 60 0.71180 12 0.56411 60 0.75849 12 0.67081 60 0.82823 14 0.52240 70 0.72860 14 0.58553 70 0.77323 14 0.68919 70 0.83939 16 0.54162 80 0.74258 16 0.60370 80 0.78542 16 0.70456 80 0.84855 18 0.55836 90 0.75447 18 0.61940 90 0.79574 18 0.71769 90 0.85626 20 0.57313 100 0.76474 20 0.63317 100 0.80463 20 0.72908 100 0.86286 2 2.5 3 n RELEFF n RELEFF n RELEFF n RELEFF n RELEFF n RELEFF 4 0.60370 25 0.80463 4 0.66139 25 0.83883 4 0.70456 25 0.86286 5 0.63317 30 0.81927 5 0.68826 30 0.85126 5 0.72908 30 0.87365 6 0.65632 35 0.83092 6 0.70916 35 0.86112 6 0.74800 35 0.88217 8 0.69103 40 0.84050 8 0.90471 40 0.86919 8 0.77584 40 0.88914 10 0.71632 50 0.85546 10 0.76246 50 0.88175 10 0.79574 50 0.89995 12 0.73589 60 0.86677 12 0.77959 60 0.89119 12 0.81092 60 0.90805 14 0.75166 70 0.87571 14 0.79331 70 0.89864 14 0.82303 70 0.91443 16 0.76474 80 0.88302 16 0.80463 80 0.90471 16 0.83298 80 0.91962 18 0.77584 90 0.88914 18 0.81419 90 0.90978 18 0.84136 90 0.92395 20 0.78542 100 0.89437 20 0.82242 100 0.91411 20 0.84855 100 0.92764 distributed. This is due to the fact that the skewness of the 6-fold convolution of a U(0, 1) is zero (due to symmetry) while its kurtosis is ?1.20/6 = ?0.20. It can be shown that 43 the kurtosis of an n-fold convolution of the U(0, 1) is exactly equal to ?1.20/n (see Appendix A). Further, our experience indicates [Hool J. N. and Maghsoodloo S. (1980) and Maghsoodloo S. and Hool J. N. (1981)] that the 3 rd moment (skewness) plays a more important role in normal approximation of a linear combination than the 4 th moment (kurtosis). 44 4.0 Bonferroni Intervals for Comparing Two Sample Means The two independent 95% confidence intervals for each of the two population means have a joint Pr of 0.95 2 of containing ? x and ? y . Although, this concept of joint Pr has not been considered in the Overlap literature, we consider it here to investigate its impact on type I & II error rates from the Overlap method. In order to compare two 95% CIs against a single 95% CI for ? x ? ? y , it may be best to use the Bonferroni concept so that the overall confidence Pr (regardless of the correlation structure) of the two CIs is raised from 0.95 2 = 0.9025 to 0.95. This is accomplished by setting individual CI coefficient at 1 ? ? = 0.95 = 0.9746794345 so that the joint confidence level will equal to ( 0.95 ) 2 . To this end, let (1 ? ? 
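_B) = √0.95 denote the individual Bonferroni confidence level.

The constants used in the next few paragraphs, and the K = 1 borderline overlap that will appear in Eq. (18d) and Table 8, can be computed ahead of time; the sketch below assumes MATLAB's Statistics Toolbox and uses our own variable names.

```matlab
alphaB   = 1 - sqrt(0.95);                 % 0.0253206
zB       = norminv(1 - alphaB/2);          % Z_{0.0126603} = 2.2364766
z        = norminv(0.975);                 % Z_{0.025}     = 1.9599640
K        = 1;
overlapB = (zB*(1 + K) - z*sqrt(1 + K^2)) / (zB*(1 + K) + z*sqrt(1 + K^2));
% overlapB = 0.23481, i.e., the 23.481% Bonferroni overlap at K = 1 in Table 8
```

With these constants in hand, we return to the construction in the text and let (1 − α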
B ) = 0.95 (the subscript B stands for Bonferroni) = 0.9746794345; thus, B ? = 0.02532056552, which results in B ? /2 = 0.01266028276 and Z 0.0126603 = 2.23647664456. Thus, the 97.468% confidence Pr statement for ? x is Pr( x ?Z 0.0126603 xx /n? ? ? x ? x + Z 0.0126603 xx /n? ) = 0.97468. As a result, the lower 97.468% Bonferroni CI limit for ? x is L(? x ) = x ? Z 0.0126603 xx /n? and the corresponding upper limit is U(? x ) = x+ Z 0.0126603 xx /n? resulting in the Bonferroni CIL (confidence interval length) of CIL(? x ) = 2?Z 0.0126603 xx /n? . Following the same procedure, the 97.468% CI for ? y will be: L(? y ) = y ? Z 0.0126603 yy /n? , U(? y ) = y+ Z 0.01266 yy /n? , and the corresponding Bonferroni CIL(? y ) = 2Z 0.0126603 yy /n? . 45 The Bonferroni confidence Intervals for ? x and ? y will not change the 95% CI for x y ? ?? , i.e., the 95% CI for x y ? ?? is still the same as in Eq. (9b), as shown below: xy?? 2 0.025 (/ )1K xx Zn? ?+ ? x y ? ?? ? xy? + 2 0.025 (/ )1K xx Zn? ?+ . The 95% CI in Eq.(9c) shows that the H 0 : ? x ? ? y = 0 must be rejected at the 5% level of significance iff xy? > 2 0.025 (/ )1K xx Zn? ?+ . However, requiring that the two separate independent CIs must be disjoint in order to reject H 0 : ? x ? ? y = 0 at the 5% level, is equivalent to either L(? x ) > U(? y ), or L(? y ) > U(? x ). These two possibilities lead to either x ? Z 0.01266 xx /n? > y+ Z 0.01266 yy /n? , or y ? Z 0.01266 yy /n? > x + Z 0.01266 xx /n? , respectively. Inserting /K/ yy xx nn??= into this last inequality leads to the rejection of H 0 iff xy? > 0.0126603 0.0126603 // x xyy Z nZ n??+ = 0.0126603 (1 K ) / x x Z n?+ (16a) Thus the Bonferroni CIL for the Eq.(16a) is 0.0126603 2(1K)/ x x Z n?+ . (16b) Using the same procedures as in chapter 3, if we set the exact type I error at 5% and reject H 0 when the two independent CIs do not overlap; then the Bonferroni type I error Pr reduces to B ?? = Pr( x+Z 0.0126603 x x n ? < y ?Z 0.0126603 y y n ? )+ Pr( x ?Z 0.0126603 x x n ? > y+Z 0.0126603 y y n ? ) = 0.0126603 2Pr (1K)/ x x x yZ n? ?? ??> + ?? = 0.0126603 22 (1 K ) / 2Pr // x x xx yy Z n Z nn ? ?? ? ? + ? ? ?> ? ? + ? ? 46 = 0.0126603 222 (1 K) / 2Pr /K/ x x xx xx Z n Z nn ? ?? ?? + ?? ?> + ?? = 2?Pr[Z > 2 0.0126603 Z (1K)/1K++] (17) Eq. (11) leads to??=2?Pr[Z > 2 0.025 Z (1K)/1K++]. Comparing Eq.(17) with Eq.(11), clearly, since Z 0.0126603 > Z 0.025 ? 0.0126603 2 Z (1K) 1K + + > 0.025 2 (1 K ) Z 1K + + , then B ? ???< . Thus, the Bonferroni intervals lead to an even smaller type ? error Pr than both ? and??, i.e., B ?? 2 0.025 1K/ x x Z n? + . Therefore, the value of ? r = () 0.0126603 // x xy y Z nn??+ ? 2 0.025 1K/ x x Z n? + 48 2 0.0126603 0.025 (/ ) (1K) 1K xx nZ Z? ?? =? +?+ ?? ?? (18b) Eq. (18b) indicates that H 0 must be rejected at the 5% or less level iff ??(/ ) xx n? ? 2 0.0126603 0.025 (1 K ) 1 KZZ ?? +? + ?? ?? . Further, the span of the two individual CIs is () () xy UL???= ( 0.0126603 / x x x Zn?+?)? ( 0.0126603 / yy y Zn???) 2 0.0126603 0.025 (/ ) (1K) 1K xx nZ Z? ? ? =? +++ ? ? ? ? (18c) Thus, the percentage of the overlap length at the borderline condition for the Bonferroni case is given by () ( ) () () YX XY UL UL ? ? ? ? ? ? ?100% = 2 0.0126603 0.025 2 0.0126603 0.025 (1 K ) 1 K [ ] 100% (1 K ) 1 K ZZ+? + ? ++ + (18d) Let h(K)= 2 0.0126603 0.025 2 0.0126603 0.025 (1 K ) 1 K [] (1 K) 1 K ZZ+? + ++ + . From Maple, (K)h? = 222 0.025 .025 0.025 0.025 0.025 K/1K ( K 1K)( K/1K) K1K (K1K) BBBB BB BB ZZ ZZ Z ZZ ZZ Z ZZ Z ?++?+++ ? 
++ + ++ + (18e) Plugging K = 1 into Eq.(18e), result in K1 '(K) | 0.117405174 0.117405174 0h = = ?= and K1 (K) |h = ?? =-0.095648707-0.425286147+0.117405174-0.022459306= ?0.425988985<0, which implies that K =1 maximizes h(K). Thus, the maximum overlap occurs when K=1 as before. Table 8 shows that, at the same K, the amount of overlap based on Bonferroni concept is larger than that of two individual CIs at LOS of 0.05. As K increases, the difference in overlap and Bonferroni overlap monotonically and slowly 49 Table 8. The Impact of Bonferroni on Percent Overlap at Different K K Bonferroni Overlap(%) Overlap (%) Difference (%) K Bonferroni Overlap(%) Overlap (%) Difference (%) 1 23.481035 17.157288 6.323747 3.1 17.907989 11.453938 6.454051 1.1 23.427525 17.102324 6.325201 3.2 17.678328 11.219816 6.458512 1.2 23.286523 16.957512 6.329012 3.3 17.456447 10.993694 6.462753 1.3 23.082143 16.747656 6.334487 3.4 17.242089 10.775302 6.466787 1.4 22.832795 16.491705 6.341089 3.5 17.034990 10.564364 6.470625 1.5 22.552471 16.204060 6.348410 3.6 16.834880 10.360601 6.474279 1.6 22.251757 15.895613 6.356143 3.7 16.641492 10.163735 6.477758 1.7 21.938629 15.574565 6.364064 3.8 16.454562 9.973490 6.481072 1.8 21.619071 15.247062 6.372009 3.9 16.273829 9.789598 6.484232 1.9 21.297543 14.917681 6.379862 4 16.099042 9.611797 6.487245 2 20.977344 14.589803 6.387541 4.1 15.929955 9.439834 6.490121 2.1 20.660894 14.265901 6.394993 4.2 15.766332 9.273465 6.492867 2.2 20.349935 13.947753 6.402182 4.3 15.607946 9.112454 6.495491 2.3 20.045701 13.636613 6.409088 4.4 15.454576 8.956577 6.497999 2.4 19.749033 13.333333 6.415700 4.5 15.306015 8.805616 6.500399 2.5 19.460480 13.038464 6.422016 4.6 15.162061 8.659366 6.502695 2.6 19.180364 12.752325 6.428039 4.7 15.022524 8.517630 6.504894 2.7 18.908840 12.475065 6.433775 4.8 14.887219 8.380218 6.507001 2.8 18.645939 12.206705 6.439233 4.9 14.755973 8.246951 6.509022 2.9 18.391594 11.947169 6.444424 5 14.628619 8.117658 6.510960 3 18.145672 11.696312 6.449360 5.1 14.504998 7.992177 6.512821 increases toward the limit of 0.065892072. Finally, the same conclusion as before can be reached that separate CIs lead to larger type ? error when Bonferroni concept is applied. The exact Pr of type ? error is the same as Eq.(14) ? /2 /2 ()( )=? ? ??? ?Z dZd ?? ? . For B ?? (B stands for Bonferroni) will be changed to B ?? = Pr(Overlap?? x ? ? y = ?) = Pr{[ () () x y LU? ?? ]?[ () () yx LU? ?? ]| ?} =?( 0.0126603 2 1K 1K Z + + ?d) ? ?( ? 0.0126603 2 1K 1K Z + + ?d) (19) Table 9 clearly shows that the Bonferroni concept leads to the largest type ? error Pr than other two methods, i.e., B ? ???>>?. Because the Bonferroni CIs always have larger 50 Table 9. Type ? Error Pr for the Standard, Overlap, and Bonferroni Methods with Different K and d Combinations K d ? ?? B ?? K d ? ?? B ?? 
1 0 0.95 0.994425 0.998438 1.8 0 0.95 0.992305 0.997643 1 0.2 0.921586 0.993461 0.998090 1.8 0.2 0.921586 0.991068 0.997157 1 0.4 0.881232 0.990392 0.996952 1.8 0.4 0.881232 0.987161 0.995579 1 0.6 0.826159 0.984692 0.994725 1.8 0.6 0.826159 0.979999 0.992544 1 0.8 0.753937 0.975507 0.990896 1.8 0.8 0.753937 0.968656 0.987431 1 1 0.662927 0.961706 0.984708 1.8 1 0.662927 0.951936 0.979356 1.2 0 0.95 0.994227 0.998367 2 0 0.95 0.991451 0.997305 1.2 0.2 0.921586 0.993237 0.998006 2 0.2 0.921586 0.990111 0.996763 1.2 0.4 0.881232 0.990085 0.996826 2 0.4 0.881232 0.985887 0.995010 1.2 0.6 0.826159 0.984241 0.994523 2 0.6 0.826159 0.97818 0.991656 1.2 0.8 0.753937 0.974842 0.990571 2 0.8 0.753937 0.96604 0.986044 1.2 1 0.662927 0.960747 0.984200 2 1 0.662927 0.948262 0.977248 1.4 0 0.95 0.993745 0.998190 2.5 0 0.95 0.989156 0.996352 1.4 0.2 0.921586 0.992690 0.997798 2.5 0.2 0.921586 0.987554 0.995662 1.4 0.4 0.881232 0.989343 0.996518 2.5 0.4 0.881232 0.98253 0.993443 1.4 0.6 0.826159 0.983155 0.994030 2.5 0.6 0.826159 0.973451 0.989250 1.4 0.8 0.753937 0.973245 0.989780 2.5 0.8 0.753937 0.959334 0.982342 1.4 1 0.662927 0.958455 0.982970 2.5 1 0.662927 0.938958 0.971701 1.6 0 0.95 0.993083 0.997943 3 0 0.95 0.986832 0.995330 1.6 0.2 0.921586 0.991944 0.997508 3 0.2 0.921586 0.984982 0.994490 1.6 0.4 0.881232 0.988334 0.996090 3 0.4 0.881232 0.979206 0.991807 1.6 0.6 0.826159 0.981690 0.993349 3 0.6 0.826159 0.968852 0.986788 1.6 0.8 0.753937 0.971106 0.988699 3 0.8 0.753937 0.952921 0.978626 1.6 1 0.662927 0.955405 0.981300 3 1 0.662927 0.930202 0.966232 confidence bands, will always lead to larger % overlap and to smaller type I error and larger type II error rates, they will not be henceforth considered. Moreover, Figure 7 shows the three type II errors ( , , B ? ?? ?? ) at k =1, 1.2, 1.5 and 2. These four figures clearly show the relation that B ? ?? ?>>?. In other words, Bonferroni method will lead to the largest type II error. 51 Type II error (at K=1) 0 0.2 0.4 0.6 0.8 1 1.2 0 0.6 1.2 1.8 2.4 3 3.6 4.2 4.8 d Pr o f t y p e II erro r beta beta' beta(bon) Type II error (at K=1.2) 0 0.2 0.4 0.6 0.8 1 1.2 0 0.4 0.8 1.2 1.6 2 2.4 2.8 3.2 3.6 4 4.4 4.8 d P r o f t y p e II erro r beta beta' beta'(bon) Type II error (at K=1.5) 0 0.2 0.4 0.6 0.8 1 1.2 0 0.4 0.8 1.2 1.6 2 2.4 2.8 3.2 3.6 4 4.4 4.8 d P r o f t y p e II erro r beta beta' beta'(bon) Type II error (at K=2) 0 0.2 0.4 0.6 0.8 1 1.2 0 0. 4 0. 8 1. 2 1. 6 2 2 .4 2 .8 3 .2 3 .6 4 4. 4 4. 8 d Pr of type II erro r beta beta' beta'(bon) Figure 7 52 5.0 Comparing the Overlap of Two Independent CIs with a Single CI for the Ratio of Two Normal Population Variances Because there are two different t-tests (pooled t-test and two-sample t-test) to compare independent normal means when variances are unknown, it is prudent to pretest H 0 : 22 x y ? ?= at an ??level. Because statistical literature cautions against using the pooled t-test unless there is convincing evidence in favor of H 0 : 22 x y ? ?= , then when testing H 0 : 22 x y ? ?= just to ascertain to pool or not, the LOS ? will be set much higher than 5%. Consider a random sample of size n x from the normal universe N(? x , 2 x ? ). Using the fact that the rv 2 2 (1) x x x nS ? ? has a chi-square distribution with ? x = 1 x n ? degrees of freedom, it follows that the Pr [ 1/2, 2 x ? ? ? ? < 2 2 (1) x x nS ? ? < /2, 2 x ? ? ? ] = 1 ? ?. Rearranging this last Pr statement results in the (1 ? ?)100% CI for 2 x ? ? 2 2 /2, (1) x x x nS ?? ? ? < 2 x ? < 2 2 1/2, (1) x x x nS ? ? ? ? ? 
(20a)

Hence, the lower CI limit for $\sigma_x^2$ is $L(\sigma_x^2) = \nu_x S_x^2/\chi^2_{\alpha/2,\nu_x}$ and the upper CI limit is $U(\sigma_x^2) = \nu_x S_x^2/\chi^2_{1-\alpha/2,\nu_x}$, where $\nu_x = n_x - 1$ and $\chi^2_{\gamma,\nu}$ denotes the upper-$\gamma$ critical value. These lower and upper limits result in the confidence interval length

$$\mathrm{CIL}(\sigma_x^2) = U(\sigma_x^2) - L(\sigma_x^2) = \nu_x S_x^2\left[\frac{1}{\chi^2_{1-\alpha/2,\nu_x}} - \frac{1}{\chi^2_{\alpha/2,\nu_x}}\right] \qquad (20b)$$

The same procedure leads to the $(1-\alpha)100\%$ CI limits for $\sigma_y^2$, namely $L(\sigma_y^2) = \nu_y S_y^2/\chi^2_{\alpha/2,\nu_y}$ and $U(\sigma_y^2) = \nu_y S_y^2/\chi^2_{1-\alpha/2,\nu_y}$, with

$$\mathrm{CIL}(\sigma_y^2) = \nu_y S_y^2\left[\frac{1}{\chi^2_{1-\alpha/2,\nu_y}} - \frac{1}{\chi^2_{\alpha/2,\nu_y}}\right] \qquad (20c)$$

With the above information, requiring that the two independent CIs be disjoint in order to reject $H_0$: $\sigma_x^2 = \sigma_y^2$ at the $\alpha\cdot100\%$ level is equivalent to either $L(\sigma_x^2) > U(\sigma_y^2)$ or $L(\sigma_y^2) > U(\sigma_x^2)$. The first possibility gives

$$L(\sigma_x^2) > U(\sigma_y^2) \;\Rightarrow\; \frac{(n_x-1)S_x^2}{\chi^2_{\alpha/2,\nu_x}} > \frac{(n_y-1)S_y^2}{\chi^2_{1-\alpha/2,\nu_y}}.$$

Thus, based on the Overlap procedure, reject $H_0$ if

$$F_0 = \frac{S_x^2}{S_y^2} > \frac{\nu_y}{\nu_x}\cdot\frac{\chi^2_{\alpha/2,\nu_x}}{\chi^2_{1-\alpha/2,\nu_y}} = \frac{\nu_y}{\nu_x}\,C_{\alpha/2,\nu_x,\nu_y}, \qquad (21a)$$

where $C_{\alpha/2,\nu_x,\nu_y} = \chi^2_{\alpha/2,\nu_x}/\chi^2_{1-\alpha/2,\nu_y}$. The second possibility, $L(\sigma_y^2) > U(\sigma_x^2)$, leads in the same manner to rejecting $H_0$ if

$$F_0 = \frac{S_x^2}{S_y^2} < \frac{\nu_y}{\nu_x}\,C_{1-\alpha/2,\nu_x,\nu_y}, \qquad C_{1-\alpha/2,\nu_x,\nu_y} = \frac{\chi^2_{1-\alpha/2,\nu_x}}{\chi^2_{\alpha/2,\nu_y}}. \qquad (21b)$$

However, the exact $(1-\alpha)100\%$ CI for the ratio of the two independent normal variances must be obtained from Fisher's F distribution as follows. Consider two samples from N($\mu_x$, $\sigma_x^2$) and N($\mu_y$, $\sigma_y^2$), respectively. Then $(n_x-1)S_x^2/\sigma_x^2 \sim \chi^2_{n_x-1}$, $(n_y-1)S_y^2/\sigma_y^2 \sim \chi^2_{n_y-1}$, and hence $(S_x^2/\sigma_x^2)/(S_y^2/\sigma_y^2) \sim F_{\nu_x,\nu_y}$, so that

$$\Pr\!\left(F_{1-\alpha/2,\nu_x,\nu_y} \le \frac{S_x^2/\sigma_x^2}{S_y^2/\sigma_y^2} \le F_{\alpha/2,\nu_x,\nu_y}\right) = 1-\alpha
\;\Rightarrow\; \frac{S_x^2}{S_y^2}\,F_{1-\alpha/2,\nu_y,\nu_x} \le \frac{\sigma_x^2}{\sigma_y^2} \le \frac{S_x^2}{S_y^2}\,F_{\alpha/2,\nu_y,\nu_x} \qquad (22a)$$

$$\Rightarrow\; \mathrm{CIL} = F_0\left(F_{\alpha/2,\nu_y,\nu_x} - F_{1-\alpha/2,\nu_y,\nu_x}\right), \qquad \text{where } \nu_x = n_x - 1 \text{ and } \nu_y = n_y - 1. \qquad (22b)$$

Then $H_0$: $\sigma_x^2 = \sigma_y^2$ (or $H_0$: $\sigma_x^2/\sigma_y^2 = 1$) must be rejected at the $\alpha\cdot100\%$ level of significance iff the CI in Eq. (22a) excludes one; otherwise $H_0$ must not be rejected at that level. Thus, based on the Standard procedure, $H_0$ must be rejected at the $\alpha$ level iff either

$$F_0 = \frac{S_x^2}{S_y^2} < F_{1-\alpha/2,\nu_x,\nu_y} \qquad \text{or} \qquad F_0 = \frac{S_x^2}{S_y^2} > F_{\alpha/2,\nu_x,\nu_y}. \qquad (22c)$$

The type I error probability of this exact procedure (i.e., of the Standard method based on the null sampling distribution of $S_x^2/S_y^2$, which is $F_{\nu_x,\nu_y}$) is $\alpha$. Therefore, the type I error probability for the two disjoint CIs ($\alpha'$) is given by

$$\alpha'(\text{two disjoint CIs}) = \Pr[\,U(\sigma_x^2) < L(\sigma_y^2)\,] + \Pr[\,L(\sigma_x^2) > U(\sigma_y^2)\,]$$
$$= \Pr\!\left[\frac{(n_x-1)S_x^2}{\chi^2_{1-\alpha/2,\nu_x}} < \frac{(n_y-1)S_y^2}{\chi^2_{\alpha/2,\nu_y}}\right] + \Pr\!\left[\frac{(n_x-1)S_x^2}{\chi^2_{\alpha/2,\nu_x}} > \frac{(n_y-1)S_y^2}{\chi^2_{1-\alpha/2,\nu_y}}\right]$$
$$= \Pr\!\left(F_{\nu_x,\nu_y} < \frac{\nu_y}{\nu_x}\,C_{1-\alpha/2,\nu_x,\nu_y}\right) + \Pr\!\left(F_{\nu_x,\nu_y} > \frac{\nu_y}{\nu_x}\,C_{\alpha/2,\nu_x,\nu_y}\right) \qquad (23)$$

Table 10 gives the values of $\alpha$ and $\alpha'$ (where $\alpha'$ represents the type I error probability from the Overlap procedure) for various values of $\nu_x$ and $\nu_y$; a numerical spot-check of Eq. (23) is sketched below
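As a concrete illustration of Eq. (23), the first row of Table 10 can be recomputed in a few lines. The sketch assumes MATLAB's Statistics Toolbox; chiU, Cu, Cl and aprime are our own names, with chiU(g, v) returning the upper-g critical value that the text writes as χ²_{g,v}.

```matlab
alpha = 0.05;  nux = 10;  nuy = 20;
chiU  = @(g, v) chi2inv(1 - g, v);                       % upper-tail chi-square point
Cu    = chiU(alpha/2, nux) / chiU(1 - alpha/2, nuy);     % C_{alpha/2,nu_x,nu_y}
Cl    = chiU(1 - alpha/2, nux) / chiU(alpha/2, nuy);     % C_{1-alpha/2,nu_x,nu_y}
aprime = fcdf((nuy/nux)*Cl, nux, nuy) + 1 - fcdf((nuy/nux)*Cu, nux, nuy)
% aprime = 0.007839 for (nu_x, nu_y) = (10, 20), the first row of Table 10
```

Changing nux, nuy and alpha in the same few lines should reproduce the remaining entries of Table 10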
, verifying the same conclusion as before: the Overlap method always leads to a smaller type ? error Pr than that of the null sampling distribution of 22 / x y SS, which is the Fisher?s F. Moreover, we have verified that?? value depends on the sizes of x y and? ? and not much on their ratio / x y ? ? . Eq. (23) can easily verify that at ? = 0.01, as x y and? ? increase, the Overlap type I error Pr, ??, decreases toward 0.000269717, while at ? = 0.05 the value of ?? decreases (from 0.017800531 at x y ? ?= = 1) toward 0.0055746, similar to the overlapping of CIs for population means. For the special case that xy nnn= = , the rejection of H 0 from the Overlap method given by Eq. (21a) and Eq. (21b) is reduced to ? Reject H 0 if either F 0 = 2 2 x y S S < 2 1/2,1 2 /2, 1 n n ? ? ? ? ? ? ? = 1/2,n1 C ??? or F 0 = 2 2 x y S S > 2 /2, 1 2 1/2,1 n n ? ? ? ? ? ? ? = /2,n 1 C ? ? (24) 56 Table 10. The Values of ? and ?' for Various Values of ? x and ? y 0.05?= 0.01?= x ? y ? y x ? ? ?? x ? y ? y x ? ? ?? 10 20 2 0.007839 10 20 2 0.000624 10 40 4 0.009969 10 40 4 0.000912 10 60 6 0.011917 10 60 6 0.001182 10 80 8 0.013579 10 80 8 0.001420 30 20 0.666667 0.006640 30 20 0.666667 0.000425 30 40 1.333333 0.006262 30 40 1.333333 0.000368 30 60 2 0.006818 30 60 2 0.000424 30 80 2.666667 0.007546 30 80 2.666667 0.000502 50 20 0.4 0.007611 50 20 0.4 0.000534 50 50 1 0.005954 50 50 1 0.000326 50 80 1.6 0.006228 50 80 1.6 0.000349 50 100 2 0.006611 50 100 2 0.000386 100 60 0.6 0.006233 100 60 0.6 0.000346 100 80 0.8 0.005865 100 80 0.8 0.000308 500 500 1 0.005612 100 100 1 0.000297 1000 1000 1 0.005594 100 120 1.2 0.000300 20 30 1.5 0.006640 20 30 1.5 0.000425 40 60 1.5 0.006230 40 60 1.5 0.000355 80 120 1.5 0.006025 80 120 1.5 0.000322 100 150 1.5 0.005984 100 150 1.5 0.000315 20 40 2 0.007077 20 40 2 0.000473 40 80 2 0.006689 40 80 2 0.000400 80 160 2 0.006493 80 160 2 0.000365 1000 2000 2 0.006313 1000 2000 2 0.000333 40 100 2.5 0.007232 40 100 2.5 0.000456 60 150 2.5 0.007105 60 150 2.5 0.000430 100 250 2.5 0.007003 100 250 2.5 0.000410 1000 2500 2.5 0.006864 1000 2500 2.5 0.000383 30 150 5 0.010087 30 150 5 0.000798 40 200 5 0.009970 40 200 5 0.000765 50 250 5 0.009900 50 250 5 0.000746 100 500 5 0.009759 100 500 5 0.000706 1000 5000 5 0.009630 1000 5000 5 0.000671 From Eqs. (22a & b), the rejection of H 0 using Fisher?s F will be simplified as follows: 1, 1nn F ?? = 22 22 x x yy S S ? ? , thus, (1 ? ?)100% CIs for 2 2 x y ? ? ? 2 2 x y S S 1 2,1,1nn F ???? ? 2 2 x y ? ? ? 2 2 x y S S /2, 1, 1nn F ? ? ? (25a) 57 From Eq.(25a), the length of the exact (1-?)% CI is given by F 0 ( /2,1,1nn F ? ?? ? 1,1,1nn F ???? ). Thus, H 0 should be rejected if F 0 = 2 2 x y S S < 1/2,1,1nn F ???? or F 0 = 2 2 x y S S > /2, 1, 1nn F ? ? ? . (25b) Comparing Eq.(24) with Eq.(25b), it can be verified (See Figure 8) that at the same ?-level, 2 1/2,1 2 /2, 1 n n ? ? ? ? ?? ? = 1/2,n1 C ?? ? < 1/2,1,1nn F ?? ?? (26a) and 2 /2, 1 2 1/2,1 n n ? ? ? ? ? ?? = /2,n 1 C ?? > 2, 1, 1nn F ? ? ? for all n. (26b) The Reject Values of Chi and F distribution Eq.(26a) 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 2 7 12 17 22 27 32 37 42 47 52 57 n-1 (df) R e ject Valu es ChiSquare- Distribution F-Distribution The Reject Values of Chi and F distribution Eq.(26b) 0 50 100 150 200 2 8 1 4 2 0 26 32 38 44 50 56 n-1 (df) R e ject Val u es ChiSquare- Distribution F-Distribution Figure 8 Furthermore, for the balanced xy nnn= = case, if the type I error Pr for the Standard Method (Fisher?s F distribution) is? 
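, the two rejection limits can first be compared numerically.

The inequality claims in Eqs. (26a) and (26b), which Figure 8 plots, amount to the statement that the chi-square-based Overlap limit C_{α/2,n−1} always lies outside the corresponding Fisher F limit. A minimal MATLAB check (Statistics Toolbox assumed; C_u and F_u are our own names):

```matlab
alpha = 0.05;  nu = 10;                                   % nu = n - 1
C_u = chi2inv(1 - alpha/2, nu) / chi2inv(alpha/2, nu);    % C_{alpha/2,n-1}     = 6.3084
F_u = finv(1 - alpha/2, nu, nu);                          % F_{alpha/2,n-1,n-1} = 3.7168
[C_u  F_u  1/C_u  1/F_u]   % C_u > F_u, and 1/C_u < F_{1-alpha/2,n-1,n-1} = 1/F_u
```

so the disjoint-CI rule demands a far more extreme variance ratio than the Standard F test does. For the balanced case, when the Standard-method type I error is held at α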
, the type I error Pr from the two disjoint CIs (??), Eq.(23), is reduced to ? ??= Pr[ 2 2 1/2, 1 (1) x x x n nS ? ? ?? ? < 2 2 /2, 1 (1) y yy n nS ? ? ? ? ] + Pr[ 2 2 /2, 1 (1) x x x n nS ? ? ? ? > 2 2 1/2, 1 (1) y yy n nS ? ? ? ? ? ] = Pr( 2 2 x y S S < 1/2,1n C ??? ) + Pr( 2 2 x y S S > /2, 1n C ? ? ) = Pr( 1, 1nn F ?? < 1/2,1n C ??? ) + Pr( 1, 1nn F ? ? > /2, 1n C ? ? ) 58 = 1, 1 / 2, 1 2Pr( ) ?? ? ?> nn n FC ? , where /2, 1n C ? ? = 2 /2, 1 2 1/2,1 n n ? ? ? ? ? ? ? (27) Table 11 shows that ??is much smaller than? for the special case that xy nnn==. As in the case of testing H 0 : ? x = ? y at ? = 0.05, the value of Overlap type I error Pr seems to slowly approach 0.0055746 as n ? ? and at ? = 0.01, ?? approaches 0.0002697. Table 11. The Impact of Overlap on Type I Error Pr for the Equal-Sample Size Case When Testing the Ratio 22 xy /? ? Against 1 n-1 ? ? ? n-1 ? ? ? 10 0.01 0.000585 10 0.05 0.007468 20 0.01 0.000418 20 0.05 0.006525 50 0.01 0.000326 50 0.05 0.005954 80 0.01 0.000304 80 0.05 0.005812 100 0.01 0.000297 100 0.05 0.005764 130 0.01 0.000291 130 0.05 0.005720 150 0.01 0.000288 150 0.05 0.005701 200 0.01 0.000283 200 0.05 0.005669 250 0.01 0.000281 250 0.05 0.005650 500 0.01 0.000275 500 0.05 0.005612 1000 0.01 0.000272 1000 0.05 0.005594 2000 0.01 0.000271 2000 0.05 0.005584 3000 0.01 0.000271 3000 0.05 0.005581 As before, let ? represent the length of overlap between the CIs for 2 x ? and 2 y ? . Thus, ? is larger than 0 only if U( 2 x ? ) > U( 2 y ? ) > L( 2 x ? ) or U( 2 y ? ) > U( 2 x ? ) > L( 2 y ? ). Because both conditions lead to the same result only the case U( 2 x ? ) >U( 2 y ? ) > L( 2 x ? ) is considered, and without loss of generality the X-sample is the one with larger variance so that 22 / x y SS ? 1. 59 ? Because of symmetry, ?= U( 2 y ? ) ? L( 2 x ? ) = 2 2 1/2, y yy S ? ? ? ? ? ? 2 2 /2, x x x S ? ? ? ? (28a) Let r O be the maximum value of ? at which H 0 is barely rejected at the ? level. From Eq. (22c), H 0 must be rejected iff F 0 = 2 2 x y S S > /2, , x y F ? ?? . Therefore, the borderline value of ? will occur when 2 x S = 2 y S ? /2, , x y F ? ?? . Inserting this into Eq. (28a) will result in: r O ? 2 2 1/2, y yy S ? ? ? ? ? ? 2 2, , 2 /2, x y x xy SF ? ?? ?? ? ? ? = 2 y S /2, , 22 1/2, /2, () x y yx x y F ??? ?? ?? ? ? ?? ? ?? (28b) The span of the two individual CIs is U( 2 x ? ) ? L(? 2 y ) = 2 2 1/2, 1 (1) x x x n nS ? ? ? ? ? ? 2 2 /2, 1 (1) y yy n nS ? ? ? ? = 2 2, , 2 1/2, x y x xy SF ? ?? ?? ? ? ? ? ? 2 2 /2, y yy S ? ? ? ? = 2 y S 2, , 22 12, 2, () xy x y x y F ??? ?? ?? ? ? ?? ? ?? (28c) Thus, the percent overlap at the critical limits is ? r = 22 22 () ( ) () () YX XY UL UL ? ? ? ? ? ? = [ 2, , 22 12, 2, 2, , 22 12, 2, x y yx xy x y x y x y F F ? ?? ?? ?? ??? ?? ?? ? ? ?? ? ? ?? ? ? ? ? ]?100% = ( /2, , /2, /2, , /2, 22 /2, , /2, , /2, /2, x yy xyy x xxy y x yx xy CF CF ? ?? ?? ??? ?? ? ?? ? ?? ? ? ? ? ?? ?????? ????? )?100% (28d) Table 12 shows that as x ? and y ? increase, the percentage of the overlap approaches 17.1573% (although not monotonically). Further, once the % overlap exceeds Eq. (28d), then H 0 must not be rejected at the ??100% level. Further, it is the size of x ? and y ? that 60 Table 12. The % Overlap for the Different Combinations of Degree of Freedom at ? = 0.05. x ? y ? / y x ? ? Overlap (%) x ? y ? / y x ? ? 
Overlap (%) 10 5 0.5 13.92184 10 12 1.2 10.91515 10 10 1 11.54543 20 24 1.2 13.50590 10 15 1.5 10.15131 40 48 1.2 15.04864 10 20 2 9.18956 60 72 1.2 15.62065 10 25 2.5 8.46961 80 96 1.2 15.92389 20 10 0.5 16.24648 100 120 1.2 16.11365 20 20 1 14.11596 150 180 1.2 16.37981 20 30 1.5 12.73993 300 360 1.2 16.67183 20 40 2 11.73402 500 600 1.2 16.80378 20 50 2.5 10.95112 700 840 1.2 16.86607 40 20 0.5 17.20430 800 960 1.2 16.88670 40 40 1 15.57376 900 1080 1.2 16.90327 40 60 1.5 14.35904 1000 1200 1.2 16.91691 40 80 2 13.40830 2000 2400 1.2 16.98483 40 100 2.5 12.63712 3000 3600 1.2 17.01169 60 30 0.5 17.40395 10 15 1.5 10.15131 60 60 1 16.08722 20 30 1.5 12.73993 60 90 1.5 14.98840 40 60 1.5 14.35904 60 120 2 14.08884 60 90 1.5 14.98840 60 150 2.5 13.34073 80 120 1.5 15.33298 80 40 0.5 17.45225 100 150 1.5 15.55403 80 80 1 16.34928 150 225 1.5 15.87382 80 120 1.5 15.33298 300 450 1.5 16.24474 80 160 2 14.47217 500 750 1.5 16.42384 80 200 2.5 13.74343 700 1050 1.5 16.51241 100 50 0.5 17.45418 800 1200 1.5 17.76296 100 100 1 16.50824 900 1350 1.5 16.56697 100 150 1.5 15.55403 1000 1500 1.5 16.58737 100 200 2 14.72328 2000 3000 1.5 16.69266 100 250 2.5 14.01021 3000 4500 1.5 16.73654 determines the % overlap and not the ratio / y x ? ? . For the case that xy nnn==, the percent overlap in Eq.(28d) reduces to ? ? = 22 22 () ( ) () () YX XY UL UL ? ? ? ? ? ? = [ 2, 1, 1 2 22 12,1 2,1 2, 1, 1 2 22 12,1 2,1 1 (1) [ ] 1 (1) [ ] nn y nn nn y nn F nS F nS ? ?? ? ?? ?? ?? ? ? ?? ? ?? ?? ? ?? ? ?? ? ]100% 61 = [ 2, 1, 1 22 12,1 2,1 2, 1, 1 22 12,1 2,1 1 1 nn nn nn nn F F ? ?? ? ?? ?? ?? ?? ?? ? ?? ?? ? ? ? ]?100% = [ 2, 1 2, 1, 1 2,1 2,1,1 1 ??? ??? ? ? ? nnn nnn CF CF ?? ?? ]?100% (29) Eq.(29) shows that the rejection percent overlap between the two CIs for the ratio of variances will increase as n increases. Further, ? r in Eq. (12b) is also a monotonically increasing function of ?. For example, at ? = 0.05, n = 2, ? r = 0.1348%; at n = 3, ? r = 1.8781%; at n = 5, ? r = 6.0921%; and at n = 20 and ? = 0.05, ? r = 13.9695%, while at n = 20 and ? = 0.01, ? r = 12.0224%. Matlab shows that at ? = n ? 1 = 7,819,285 df [note that Matlab 7.6(R2008a) loses accuracy in inverting F at the 7 th decimal place beyond 7,819,285 df], the 0.05-level overlap is 17.157261356%, which is very close to the overlap for two independent normal population means discussed in section 2 (which was 17.157287525%). Further, for very small sample sizes within the interval [2, 4], the Variance-Overlap method is almost an ?-level test, like the case of CIs for normal population means when ? x = ? y and K is far away from 1. See the illustration in Table 13. Table 13. The % Overlap for the Case of ? = 0.05 and n x = x y = n n-1 Numerator of Eq.(31) Denominator of Eq.(31) Overlap (%) n-1 Numerator of Eq.(31) Denominator of Eq.(31) Overlap (%) 10 0.12652 1.09587 11.54543 200 0.00067 0.00397 16.83011 20 0.03214 0.22770 14.11596 400 0.00023 0.00133 16.99303 30 0.01541 0.10223 15.07427 600 0.00012 0.00070 17.04763 40 0.00933 0.05990 15.57376 800 0.00008 0.00045 17.07499 50 0.00637 0.04014 15.88014 1000 0.00005 0.00032 17.09142 60 0.00469 0.02917 16.08722 1200 0.00004 0.00024 17.10239 70 0.00363 0.02237 16.23652 1300 0.00004 0.00021 17.10661 80 0.00291 0.01783 16.34928 1400 0.00003 0.00019 17.11022 90 0.00240 0.01462 16.43743 1500 0.00003 0.00017 17.11336 100 0.00202 0.01227 16.50824 2000 0.00002 0.00011 17.12433 62 Now, what should the individual confidence level 1 ? ? 
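be in order to yield an exact α-level test?

Before answering this analytically, both the borderline overlap of Eq. (29) and the required γ can be obtained numerically. The sketch below (MATLAB, Statistics Toolbox assumed; chiU, num, den and g are our own names) reproduces the ν = 10 row of Table 13 and then solves the balanced-case condition F_{α/2,ν,ν} = χ²_{γ/2,ν}/χ²_{1−γ/2,ν} for γ.

```matlab
alpha = 0.05;  nu = 10;                           % nu = n - 1
chiU  = @(g) chi2inv(1 - g, nu);                  % upper-g chi-square critical value
Fu    = finv(1 - alpha/2, nu, nu);                % F_{alpha/2,nu,nu} = 3.7168
num   = 1/chiU(1 - alpha/2) - Fu/chiU(alpha/2);   % 0.12652, as in Table 13
den   = Fu/chiU(1 - alpha/2) - 1/chiU(alpha/2);   % 1.09587, as in Table 13
overlap = num/den                                 % 0.11545, i.e., the 11.545% entry
g = fzero(@(g) chiU(g/2)/chiU(1 - g/2) - Fu, [0.10 0.30])   % g/2 is about 0.079
```

The root gives γ/2 ≈ 0.079 at ν = 10, anticipating the trial-and-error value reported shortly. With this numerical preview in mind, we return to the question of what the individual confidence level 1 − γ should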
be so that the two independent CIs lead to the exact ?-level test on H 0 : 2 x ? = 2 y ? . The expressions for the two 1?? independent CIs are given by 2 2 /2, (1) x x x nS ?? ? ? ? 2 x ? ? 2 2 1/2, (1) x x x nS ? ? ? ? ? , (30a) and 2 2 /2, (1) y yy nS ?? ? ? ? 2 y ? ? 2 2 1/2, (1) y yy nS ?? ? ? ? (30b) From Eq.(30a) and Eq.(30b), the overlap amount of two individual CIs at confidence level (1? ?) is U'( 2 y ? ) ?L'( 2 x ? ) . Therefore, we deduce from (30a &b) that U'( 2 y ? ) ? L'( 2 x ? ) = 2 2 1/2, y yy S ? ? ? ? ? ? 2 2 /2, x x x S ? ? ? ? (30c) Because H 0 : 2 x ? = 2 y ? must be rejected at the ??100%-level as soon as Eq.(30c) becomes zero or smaller, we thus impose the rejection criterion 2 x S / 2 y S ? F ?/2 (where for the sake of convenience F ?/2 = xy /2, , F ??? ) into Eq. (30c). In short, we are rejecting H 0 : 2 x ? = 2 y ? as soon as the two independent CIs in (30a) and (30b) become disjoint. This leads to rejecting H 0 : 2 x ? = 2 y ? iff U'( 2 y ? ) ?L'( 2 x ? ) = 2 2 1/2, y yy S ? ? ? ? ? ? 2 /2 2 /2, x x y FS ? ?? ? ? ? 0. (31a) At the borderline value, we set the overlap amount at LOS ? in inequality (31a) to 0 in order to solve for ?? 2 2 1/2, y yy S ? ? ? ? ? ? 2 /2 2 /2, x x y FS ? ?? ? ? = 0 ? 2 1/2, y y ? ? ? ? ? ? /2 2 /2, x x F ? ? ? ? ? = 0 ? /2 2 /2, x x F ? ? ? ? ? = 2 1/2, y y ? ? ? ? ? ? /2x y F ? ? ? = 2 /2, 2 1/2, x y ?? ? ? ? ? ? ? /2x y F ? ? ? = /2, , x y C ? ?? (31b) 63 where F ?/2 = xy /2, , F ??? and /2, , x y C ? ?? = 22 /2, 1 /2, / x y ? ??? ?? ? . Eq. (31b) clearly shows that the value of ? depends on the LOS ? of testing H 0 : 2 x ? = 2 y ? and the sample sizes n x and n y . For example, when ? = 0.05, n x = 21 & n y = 11 Eq. (31b) reduces to 2F 0.025,20,10 = /2,20,10 C ? = 22 /2,20 1 /2,10 / ?? ?? ? ? 2?3.4185 = 22 /2,20 1 /2,10 / ?? ?? ? ? 6.8371 = 2 /2,20 / ? ? 2 1/2,10? ? ? . Through trial & error the solution to this last inequality is ?/2 = 0.0712 ( ? = 0.1424). In turns out that as long as ? x = 2? y , the required confidence level for the two independent CI on 2 x ? and 2 y ? must be set approximately equal to 1?2?0.0712 = 85.76%. Further, if ? y = 2? x the required confidence level for the two independent CI on 2 x ? and 2 y ? must be set approximately equal to 1?2?0.083 = 83.40%. In the case of balanced design (i.e., when ? x = ? y ) Eq. (31b) reduces to /2, 1, 1nn F ? ? ? = /2, 1?n C ? (31c) It can be verified that the approximate solution to Eq. (31c) when ? = 0.05, n-1=10, through trial & error, is ?/2 = 0.079. Therefore, the individual CIs have to be set at 84.20%. For the moderate sample 10 ? n ?30, the approximate solution is 0.08. We used Matlab to determine that un the limit (as n ? 7,819,286 at 7 decimal accuracy), ?/2 ? 0.08288800. because MS Excel 2003 cannot invert 2 ? ? for df beyond ? = 1119. Table14 shows the value of ? to make the two sides of Eq.(31b) equal for different x ? and y ? combinations. Table 15 shows the cases when y ? is kept fixed at 20 but the ratio / x y ? ? changes from 0.5 to 50 causing? to become smaller and smaller. 64 Table 14. The Overlap Significance Level, ? , that Yields the Same 5%-Level Test or 1%-Level Test by the Standard Method 0.05? = 0.01? = x ? y ? x y ? ? /2x y F ? ? ? ? /2, , x y C ? ?? x ? y ? x y ? ? /2x y F ? ? ? ? /2, , x y C ? ?? 
10 20 0.5 1.38684 0.16658 1.38684 10 20 0.5 1.92350 0.06875 1.92350 20 40 0.5 1.03386 0.16560 1.03386 20 40 0.5 1.29921 0.06878 1.29921 30 60 0.5 0.90760 0.16489 0.90760 30 60 0.5 1.09372 0.06847 1.09372 40 80 0.5 0.83952 0.16440 0.83952 40 80 0.5 0.98697 0.06819 0.98697 50 100 0.5 0.79585 0.16402 0.79585 50 100 0.5 0.92002 0.06796 0.92002 60 120 0.5 0.76497 0.16373 0.76497 60 120 0.5 0.87343 0.06776 0.87343 80 160 0.5 0.72348 0.16329 0.72348 80 160 0.5 0.81179 0.06745 0.81179 100 200 0.5 0.69635 0.16297 0.69635 100 200 0.5 0.77209 0.06722 0.77209 200 400 0.5 0.63291 0.16211 0.63291 200 400 0.5 0.68121 0.06658 0.68121 500 1000 0.5 0.58092 0.16128 0.58092 500 1000 0.5 0.60881 0.06592 0.60881 1000 2000 0.5 0.55615 0.16117 0.55611 1000 2000 0.5 0.57498 0.06552 0.57499 2000 4000 0.5 0.53918 0.16075 0.53915 2000 4000 0.5 0.55207 0.06551 0.55203 10 10 1 3.71679 0.15810 3.71679 10 10 1 5.84668 0.05981 5.84668 20 20 1 2.46448 0.16189 2.46448 20 20 1 3.31779 0.06400 3.31779 30 30 1 2.07394 0.16317 2.07394 30 30 1 2.62778 0.06548 2.62778 40 40 1 1.87520 0.16382 1.87520 40 40 1 2.29584 0.06623 2.29584 50 50 1 1.75195 0.16421 1.75195 50 50 1 2.09671 0.06669 2.09671 60 60 1 1.66679 0.16447 1.66679 60 60 1 1.96217 0.06699 1.96217 80 80 1 1.55488 0.16480 1.55488 80 80 1 1.78924 0.06738 1.78924 100 100 1 1.48325 0.16499 1.48325 100 100 1 1.68089 0.06761 1.68089 200 200 1 1.32045 0.16538 1.32045 200 200 1 1.44159 0.06808 1.44159 500 500 1 1.19185 0.16562 1.19185 500 500 1 1.25956 0.06836 1.25956 1000 1000 1 1.13205 0.16570 1.13205 1000 1000 1 1.17708 0.06846 1.17708 2000 2000 1 1.09164 0.16568 1.09164 2000 2000 1 1.12214 0.06854 1.12213 10 5 2 13.23831 0.13278 13.23831 10 5 2 27.23636 0.04228 27.23640 20 10 2 6.83709 0.14239 6.83709 20 10 2 10.54803 0.04970 10.54800 30 15 2 5.28747 0.14629 5.28747 30 15 2 7.37349 0.05296 7.37349 40 20 2 4.57464 0.14848 4.57464 40 20 2 6.04306 0.05483 6.04306 50 25 2 4.15744 0.14990 4.15744 50 25 2 5.30448 0.05607 5.30448 60 30 2 3.88002 0.15092 3.88002 60 30 2 4.83030 0.05696 4.83030 80 40 2 3.52875 0.15230 3.52875 80 40 2 4.24979 0.05817 4.24979 100 50 2 3.31170 0.15320 3.31170 100 50 2 3.90249 0.05896 3.90249 200 100 2 2.84057 0.15532 2.84057 200 100 2 3.17944 0.06083 3.17944 500 250 2 2.48968 0.15705 2.48968 500 250 2 2.66931 0.06234 2.66931 1000 500 2 2.33277 0.15786 2.33277 1000 500 2 2.44932 0.06305 2.44932 2000 1000 2 2.22893 0.15825 2.22900 2000 1000 2 2.30657 0.06352 2.30658 65 Table 15. The Overlap Significance Level, ? , That Yields the Same 5%-Level Test or 1%-Level Test by the Standard Method at Fixed y ? and Changing x ? 0.05? = 0.01? = x ? y ? x y ? ? /2x y F ? ? ? ? /2, , x y C ? ?? x ? y ? x y ? ? /2x y F ? ? ? ? /2, , x y C ? ?? 
10 20 0.5 1.38684 0.16658 1.38684 10 20 0.5 1.92350 0.06875 1.92350 12 20 0.6 1.60550 0.16628 1.60550 12 20 0.6 2.20674 0.06803 2.20674 16 20 0.8 2.03723 0.16445 2.03723 16 20 0.8 2.76540 0.06611 2.76540 20 20 1 2.46448 0.16189 2.46448 20 20 1 3.31779 0.06400 3.31779 24 20 1.2 2.88907 0.15910 2.88907 24 20 1.2 3.86643 0.06192 3.86643 28 20 1.4 3.31194 0.15629 3.31194 28 20 1.4 4.41266 0.05995 4.41266 32 20 1.6 3.73362 0.15356 3.73362 32 20 1.6 4.95722 0.05811 4.95722 36 20 1.8 4.15445 0.15095 4.15445 36 20 1.8 5.50059 0.05641 5.50059 40 20 2 4.57464 0.14848 4.57464 40 20 2 6.04306 0.05483 6.04306 50 20 2.5 5.62323 0.14287 5.62323 50 20 2.5 7.39659 0.05139 7.39659 60 20 3 6.67008 0.13802 6.67008 60 20 3 8.74765 0.04853 8.74765 70 20 3.5 7.71585 0.13380 7.71585 70 20 3.5 10.09722 0.04612 10.09722 80 20 4 8.76092 0.13010 8.76092 80 20 4 11.44579 0.04407 11.44579 90 20 4.5 9.80550 0.12683 9.80550 90 20 4.5 12.79368 0.04229 12.79368 100 20 5 10.84972 0.12392 10.84972 100 20 5 14.14107 0.04074 14.14107 110 20 5.5 11.89369 0.12130 11.89369 110 20 5.5 15.48809 0.03936 15.48809 120 20 6 12.93745 0.11894 12.93745 120 20 6 16.83483 0.03814 16.83482 130 20 6.5 13.98105 0.11679 13.98105 130 20 6.5 18.18134 0.03705 18.18134 140 20 7 15.02453 0.11482 15.02453 140 20 7 19.52768 0.03606 19.52768 1000 20 50 104.70358 0.07610 104.70360 1000 20 50 135.22825 0.01898 135.22158 Next, the type ? error Pr for both the F-distribution and separate CIs cases are discussed. Comparing equations (21 a & b) with Eq. (22c), because ( / yx ? ? )? /2, , x y C ? ?? > /2, , x y F ? ?? , and ( / yx ? ? ) 1/2,, x y C ? ??? ? < 1/2,, x y F ? ??? (see the illustration in the Figure 7), it follows that the disjoint CIs provide more stringent requirement for rejecting H 0 . Thus, the rejecting rule from two disjoint CIs will always lead to a larger Type ? error Pr (or much less statistical power) as illustrated below. By definition, ? =Pr( Type ? error) = Pr(not rejecting H 0 |H 0 is false). Since H 0 is assumed false, it follows that 22 x y ? ?? . Let ? = / x y ? ? ? 222 x y ? ??= . Thus, ?(? ) = Pr ( , 1/2,? x y F ? ?? ? F 0 ? /2, , x y F ? ?? | ? = / x y ? ? ) 66 ? ?(? ) = , x y cdfF ? ? ( 2 /2, , / xy F ??? ? ) ? , x y cdfF ? ? ( , 2 1/2, / xy F ??? ? ? ) (32a) And the Type ? error Pr of the two independent CIs is given by ??(? ) = Pr{[ 22 () () x y LU? ?? ]?[ 22 () () yx LU? ?? ]|? = x y ? ? } = Pr{[ 2 2 /2, x x x S ? ? ? ? ? 2 2 1/2, y yy S ? ? ? ? ? ]?[ 2 2 /2, y yy S ? ? ? ? ? 2 2 1/2, x xx S ? ? ? ? ? ]| ? = x y ? ? } = Pr{[ y x ? ? ? 2 1/2, 2 /2, x y ? ? ?? ? ? ? ? 2 2 x y S S ? y x ? ? ? 2 /2, 2 1/2, x y ?? ? ? ? ? ? ]|? = x y ? ? } = Pr [ y x ? ? ? 12,, x y C ? ??? ? 2 2 x y S S ? y x ? ? ? 2, , x y C ? ?? |? = x y ? ? ] = Pr [ 2 2 y x ? ? ? y x ? ? ? 12,, x y C ? ??? ? , x y F ? ? ? 2 2 y x ? ? ? y x ? ? ? 2, , x y C ? ?? | ? = x y ? ? ] = Pr [ 2 1 ? ? y x ? ? ? 12,, x y C ? ??? ? , x y F ? ? ? 2 1 ? ? y x ? ? ? 2, , x y C ? ?? ] = , 2 1 ( xy cdfF ?? ? ? y x ? ? ? 2, , x y C ? ?? ) ? , x y cdfF ? ? ( 2 1 ? ? y x ? ? ? 12,, x y C ? ??? ) (32b) Table 16 illustrates that the Type ? Error Pr from the two overlapping CIs (Eq.(32b)) is larger than the corresponding exact value from the F distribution (Eq.(32a)). For the case xy nnn==, the type II error Pr, ?(?) in Eq.(32a), is reduced to ? ?(?)= cdfF 2 /2,1,1 (/ ??nn F ? ? ) ? cdfF( 2 1/2,1,1 / nn F ? ? ??? ) (33a) As n increases, the second term on the RHS of Eq.(33a), cdfF( 2 1/2,1,1 / nn F ? ? ??? ) , becomes smaller. For example at ? =1.6, Table 17 shows that when n ? 
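10 the second term is already negligible.

Equations (33a) and (33b) themselves are one-liners to evaluate. The sketch below (MATLAB, Statistics Toolbox assumed; variable names are ours) reproduces the ν = 5, λ = 1.6 row of Table 17 and confirms that the Overlap type II error of Eq. (33b) exceeds the exact value.

```matlab
alpha = 0.05;  nu = 5;  lam = 1.6;            % nu = n - 1, lam = sigma_x/sigma_y
Fu   = finv(1 - alpha/2, nu, nu);   Fl = finv(alpha/2, nu, nu);
beta = fcdf(Fu/lam^2, nu, nu) - fcdf(Fl/lam^2, nu, nu)        % 0.8547, as in Table 17
Cu   = chi2inv(1 - alpha/2, nu) / chi2inv(alpha/2, nu);       % C_{alpha/2,n-1}
betaprime = fcdf(Cu/lam^2, nu, nu) - fcdf(1/(Cu*lam^2), nu, nu)  % clearly exceeds beta
```

The same few lines, looped over n and λ, regenerate the rest of Table 17; at λ = 1.6 they show that once n reaches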
10, the 2 nd term is less 67 Table 16. The Relative Power of the Overlap to the Standard Method for Different df Combinations at ?=1.2 x ? y ? ? ?? ( )100% 1 ???? ?? x ? y ? ? ?? ( )100% 1 ???? ?? 10 10 0.91766 0.98481 81.55134 100 10 0.91207 0.96043 54.99654 10 20 0.89163 0.98041 81.92535 100 40 0.74692 0.91585 66.74959 10 30 0.87733 0.97509 79.69653 100 70 0.63397 0.87179 64.97326 10 40 0.86842 0.96990 77.12143 100 100 0.55858 0.83125 61.77049 10 50 0.86236 0.96505 74.61099 100 150 0.47954 0.75978 53.84462 20 10 0.91548 0.97991 76.22917 120 20 0.84992 0.94123 60.84118 20 20 0.87759 0.97518 79.72614 120 50 0.69250 0.89059 64.41798 20 30 0.85247 0.96921 79.12977 120 80 0.58356 0.84178 62.00594 20 40 0.83499 0.96306 77.61297 120 110 0.50857 0.79724 58.74004 20 50 0.82223 0.95707 75.85031 120 150 0.44078 0.74501 54.40270 40 20 0.86430 0.96531 74.43748 150 30 0.78694 0.91608 60.61311 40 30 0.82554 0.95716 75.44521 150 70 0.59415 0.83886 60.29563 40 40 0.79525 0.94876 74.97338 150 100 0.49836 0.78518 57.17595 40 50 0.77124 0.94041 73.95230 150 130 0.43027 0.73690 53.81970 40 60 0.75187 0.93229 72.71069 150 160 0.38045 0.69411 50.62713 70 20 0.85581 0.95410 68.16592 200 50 0.66561 0.85840 57.65517 70 40 0.76416 0.93059 70.56976 200 80 0.52932 0.79054 55.49954 70 60 0.69858 0.90724 69.22532 200 120 0.40562 0.70859 50.97174 70 80 0.65095 0.88509 67.07999 200 150 0.34220 0.65484 47.52817 70 100 0.61528 0.86454 64.78992 200 180 0.29514 0.60752 44.31852 Table 17. Type II Error for Different Degrees of Freedom (The Case of n x = n y = n ) ? ? Eq.(33a) 1 st term Eq.(33a) 2 nd term ?(?) ? ? Eq.(33a) 1 st term Eq.(33a) 2 nd term ?(?) 5 1 0.975 0.025 0.95 21 1.6 0.4451093 5.2063E-05 0.4450573 5 1.2 0.9482959 0.0115 0.9367963 22 1.6 0.4243906 4.2048E-05 0.4243486 5 1.4 0.9090074 0.005788 0.9032193 23 1.6 0.4043861 3.4049E-05 0.4043521 5 1.6 0.8578487 0.003139 0.8547093 24 1.6 0.3850938 2.7639E-05 0.3850662 5 1.8 0.7971565 0.001811 0.7953453 25 1.6 0.3665092 2.2487E-05 0.3664867 5 2 0.7301745 0.0011 0.7290745 30 1.6 0.2838905 8.2591E-06 0.2838822 5 2.1 0.6954029 0.000872 0.6945313 35 1.6 0.2172298 3.1597E-06 0.2172266 10 1.2 0.9246265 0.006969 0.9176572 40 1.6 0.1644551 1.2481E-06 0.1644538 10 1.4 0.8361705 0.002117 0.8340535 45 1.6 0.123329 5.0597E-07 0.1233284 10 1.6 0.7168104 0.000705 0.716105 50 1.6 0.0917083 2.0959E-07 0.0917081 15 1.2 0.9024968 0.004697 0.8977997 55 1.6 0.067677 8.8419E-08 0.0676769 15 1.4 0.7639183 0.000929 0.7629895 60 1.6 0.0495987 3.7894E-08 0.0495986 15 1.6 0.5840969 0.000201 0.5838961 65 1.6 0.0361208 1.6465E-08 0.0361207 20 1.1 0.9400273 0.009206 0.9308214 70 1.6 0.0261534 7.2419E-09 0.0261534 20 1.2 0.880935 0.003341 0.877594 75 1.6 0.0188356 3.2198E-09 0.0188356 20 1.3 0.7969205 0.001215 0.7957056 80 1.6 0.0134984 1.4455E-09 0.0134984 68 than 0.001 so that the 1 st term on the RHS of (33a) gives the approximate value of ?(?=1.6) to 3 decimal accuracy once n ? 10 and ? ?1.6, where ? = n ? 1 = degrees of freedom. Thus, for the xy nnn== case, if the acceptance criterion is based on overlapping of the two CIs, then Eq.(32b) will be changed to ()???= Pr[ 2 2 y x ? ? 2 1/2,1 2 /2, 1 n n ? ? ? ? ?? ? ?? 1, 1x y nn F ? ? ? 2 2 y x ? ? 2 /2, 1 2 1/2,1 n n ? ? ? ? ? ? ? ? | ? = x y ? ? ] = Pr[ 2 1 ? 2 1/2,1 2 /2, 1 n n ? ? ? ? ?? ? ? 1, 1x y nn F ? ? ? 2 1 ? 2 /2, 1 2 1/2,1 n n ? ? ? ? ? ? ? ] = cdfF( 2 /2, 1 / n C ? ? ? ) ? cdfF( 2 1/2,1 / ??n C ? ? ), where /2, 1n C ? ? = 2 /2, 1 2 1/2,1 n n ? ? ? ? ? ?? . (33b) As the degrees of freedom, ? (= n ? 1) or ? 
increases, the second term of the Eq.(33b), cdfF( 2 /2, 1 1/ n C ? ? ? ), becomes smaller. For example, if ? is fixed at 10, the cumulative probability of the second term on the RHS of Eq.(33b) will be less than 0.001 when? = 1.2. Conversely, if? is kept at 1.2, the 2 nd term is less than 0.001 if ? ?9 (see the illustration in Table 18). Based on the above discussion, Eqs.(33) can be approximated as ?(?) = cdfF( 2 /2,1,1 / nn F ? ? ?? ) and ??(? ) = cdfF( 2 /2, 1 / n C ? ? ? ) (33c) In Eqs.(26), 2 /2, 1 2 1/2,1 n n ? ? ? ? ? ?? = /2, 1?n C ? > ,1.1 2 nn F ? ? ? for all n and 2 /2, 1 / n C ? ? ? > 2 /2,1,1 / nn F ? ? ?? and as a result ( )? ?? = cdfF( 2 /2, 1 / n C ? ? ? ) > cdfF( 2 /2,1,1 / nn F ? ? ?? ) = ( )?? , i.e.,??is larger than? for all n. This conclusion is the same as that of testing the difference in population means. Thus, the disjoint confidence intervals always lead to less statistical power (1 1??? /2, t ? ? ? 1/ 1/ p xy Snn+ (36c) But, for the individual two t-CIs, the rejection condition is either L( x ? ) > U( y ? ) or L( y ? ) > U( x ? ). Using the definition of type I error Pr, bearing in mind that 2 t ? = F 1,? , leads to ??= Pr(reject H 0 | x y ??? = 0) = Pr[L( x ? ) >U( y ? )] + Pr[L( y ? ) >U( x ? )] = Pr[ /2, x x x S xt n ?? ? /2, y y y S yt n ?? >+ ] + Pr[ /2, y y y S yt n ?? ? > /2, x x x S xt n ?? + ] = Pr[ x y? > /2, x x x S t n ?? /2, y y y S t n ?? + ] + Pr[ x y? <- /2, x x x S t n ?? /2, y y y S t n ?? ? ] = Pr[||x y? > /2, / x x x tSn ?? /2, / y yy tSn ?? + ] (37a) 74 = Pr[ t ? > /2, /2, // 1/ 1/ xy x xyy px y tSntSn Snn ?? ?? + + ] = Pr[F 1,? > 2 /2, /2, 2 (/ /) (1 / 1 / ) xy xx yy px y tSntSn Sn n ?? ?? + + ] (37b) Without loss of generality, we name the sample with the larger variance as X and let 0 F = 2 x S/ 2 y S ? 1. Multiplying the argument on the RHS of Eq. (37b) by n x n y for both numerator and denominator and substituting 0 F = 2 x S/ 2 y S ? 1 into (37b) results in ? ??= Pr[F 1,? > 2 /2, 0 /2, 0 ()() xy yx yxxy tFntn Fnn ?? ?? ? ?? + ++ ] ? ??= Pr[F 1,? > 2 /2, 0 /2, 0 ()(1) xy n yx n tFRt FR ?? ?? ? ?? + ++ ] (37c) = Pr[ 1, F ? > 2 /2, /2, 0 ()(1) xy yx n tt FR k ?? ?? ? ?? ?+ ++ ] (37d) where ? = n x +n y ? 2, R n = n y /n x and k = n0 RF= (S x y n )/(S y x n ) is the sample se ratio. Eq.(37c) clearly shows that, besides ?, the value ??depends only on x n , y n and F 0 = 2 x S/ 2 y S and not on the specific values of S x and S y . For the pooled t-test, in the most common case of balanced design (i.e., n x = n y = n), Eq. (37c) reduces to ?? = Pr[F 1,? > 2 ,1, 1 0 0 (1 ) 1 n FF F ? ? + + ] (37e) where the pretest statistic 0 F = 22 / x y SS must range within the interval ( 0.90,n 1,n 1 F ?? , 0.10,n 1,n 1 F ?? ). The random function 2 00 (1 ) /(1 )FF++inside the argument of the RHS 75 of (37e) attains its maximum at 0 F = 1 and its minimum at 0.10,n 1,n 1 F ? ? or at 0.90,n 1,n 1 F ? ? . As a result the minimum value of ??occurs at F 0 =1 and its maximum occurs at either 0.10,n 1,n 1 F ?? or 0.90,n 1,n 1 F ?? . At the same F 0 , ?? in (37e) is a monotonically increasing function of n. Further, Matlab has verified that the limiting value of ?? in Eq. (37e), as n ? 7,819,286 lies in the interval [0.005574595835 (at F 0 = F 0.10 ), 0.005574597084 (at F 0 = 1)], both of which are very close to the known-Variance case of testing H 0 : ? x = ? y . Eq. (37e) for Overlap type I error Pr is different from 1 ? Pr(A) atop p. 549 of Payton et al. 
(2000) because theirs pertains to the general two-independent sample t-statistic, discussed in the next section, while (37e) pertains to the pooled t-test. Further, it will be shown that the denominator df of the F statistic for their general case will not equal n ?1 as stated by Payton et al. (2000). The next objective is to show that ?? < ? for all n x , n y , S x and S y for which xy 0.90,n 1,n 1 F ?? < 0 F = 2 x S / 2 y S < xy 0.10,n 1,n 1 F ? ? . First, comparing inequality (36c) with Eq.(37a), it follows that if /2, x x x S t n ?? /2, y y y S t n ?? + > /2, t ? ? ? 11 p x y S nn + (38a) then the same conclusion as the case of known and equal variances will be reached, i.e., ?? /2,2( 1)n t ? ? ? 22 ()/22/+? xy SS n ? Is /2, 1 () nxy tSS ? ? + > /2,2( 1)n t ? ? ? 22 x y SS+ ? (38b) 76 Substituting 22 0 /= x y FSS into Eq.(38b) results in /2, 1 0 /2,2( 1) 0 (1 ) 1 nn tFt F ???? +> +? (38c) It is clear that the inequality in (38c) easily holds because /2, 1n t ? ? > /2,2( 1)n t ? ? for all finite n and 00 (1 ) 1FF+>+ for all values of 0 F because 0 F is never negative. Therefore, the inequality (38a) is true for the case of equal sample sizes but it is not always so for the unequal sample sizes case. In the unbalanced case, the difficulty in inequality (38a) occurs when the larger sample size (which will be denoted by n x ) also has much larger variance for which inequality (38a) will not be true. For example, if n y = 20, 2 y S = 0.30, n x = 60 and 2 x S = 1.8, the LHS of inequality (38a) becomes 0.6029 and its RHS becomes 0.6157 so that the inequality is violated. However, in such a case the value of F-statistic is F 0 = 2 x S/ 2 y S = 6 whose P-value for pre-testing 0 : xy H ? ??== will equal to 0.00007736, i.e., this last hypothesis is easily rejected so that pooling is disallowed. Again to be on the conservative side, we allow pooling iff the P-value of the variance-pretest exceeds 20%. Otherwise, the two-independent sample t-statistic will be used for testing 0 : x y H ? ?= . This is consistent with Devore?s (2004, p. 377) assertion of ?using the two-sample t procedure unless there is compelling evidence for doing otherwise, particularly when the two sample sizes are different?. Further, unlike the case of balanced design, when n x > n y the value of ?? is an increasing function (but not monotonically) of F 0 but when n x < n y , the value of?? is almost always a decreasing function of F 0 . Thus, for fixed n x > n y , the maximum occurs at xy 0.10,n 1,n 1 F ?? , and when n x < n y the maximum occurs at xy 0.90,n 1,n 1 F ? ? . As n x and n y both increase at the same F 0 , 77 so does the value of ??in (37c). When one sample size is twice the other, the limiting value of ?? (in terms of n x and n y ) is 0.006286690. When one sample size is three times the other, the limiting value of ?? is 0.00733793. When one sample size is four times the other, the limiting value of ?? is 0.008390775. When one sample size is 5 times the other, the limiting value of ?? is 0.0093831123. When one sample size is 10 times the other limiting value of ?? is 0.01336332. Finally, as limit of R n = n y /n x ? ? or 0, the Overlap type I Pr approaches that of an exact ?-level test. Table 20 gives the exact ?? from Eq. (37c) for different n x and n y combinations. The values of F 0 in Table 20 are restricted such that P-value of the pretest 0 :H x y ? ?= exceeds 20%. 6.2 The Case of H 0 : xy ? ?= Rejected Leading to the Two-Independent Sample t-Test Assuming that X~N( 2 ,x x ? ? ) and Y~N( 2 ,yy ? ? 
), than X Y? is also N( x y ? ?? , 22 // x xyy nn??+ ), but now the null hypothesis of H 0 : x y ? ?= is rejected at the 20% level leading to the assumption that the F-statistic F 0 = 22 xy SS/ > 2 for all sample sizes 16 ? n x & n y . Note that for larger sample sizes such as n x & n y = 41, F 0 can be as small as 1.510 and H 0 : x y ? ?= can still be rejected at the 20% level because F 0.10,40,40 = 1.5056, while for n x & n y = 11, an F 0 as large as 2.323 is needed because F 0.10,10,10 = 2.3226. Note that an F 0 = 2 is significant at the level 20% once n x & n y ? 16 because F 0.10,15,15 = 1.9722. It has been shown in statistical theory that if the assumption x y ? ?= is not 78 Table 20. The Pooled ?? Values for Different n x , n y and F 0 Combinations at ? = 0.05 x n y n 0 F ?? x n y n 0 F ?? 20 40 0.8 0.0071355 20 40 1.4 0.0039812 20 60 0.8 0.0087831 20 60 1.4 0.0035984 20 80 0.8 0.0103096 20 80 1.4 0.0035130 30 10 0.8 0.0034279 30 10 1.4 0.0085406 30 20 0.8 0.0047292 30 20 1.4 0.0068050 30 30 0.8 0.0054425 30 30 1.4 0.0055284 40 20 0.8 0.0044541 40 20 1.4 0.0080709 40 40 0.8 0.0054943 40 40 1.4 0.0055823 40 80 0.8 0.0075641 40 80 1.4 0.0042383 40 100 1 0.0063762 40 100 1.4 0.0040584 20 40 1 0.0056135 20 40 1.5 0.0037263 20 60 1 0.0062056 20 60 1.5 0.0032137 20 80 1 0.0068563 20 80 1.5 0.0030428 30 10 1 0.0049815 30 10 1.5 0.0094908 30 20 1 0.0053938 30 20 1.5 0.0071647 30 30 1 0.0053753 30 30 1.5 0.0055980 40 20 1 0.0056135 40 10 1.5 0.0111374 40 40 1 0.0054254 40 20 1.5 0.0086996 40 80 1 0.0059601 40 30 1.5 0.0067838 40 100 1 0.0063762 40 40 1.5 0.0056536 20 40 1.2 0.0046424 1000 2000 F 0.90,?x, ?y 0.0067706 20 60 1.2 0.0046283 100000 200000 F 0.90,?x, ?y 0.0063436 20 80 1.2 0.0048067 10000000 20000000 F 0.90,?x, ?y 0.0063019 30 10 1.2 0.0067025 1000000000 2000000000 F 0.90,?x, ?y 0.0062871 30 30 1.2 0.0054202 1000 3000 F 0.90,?x, ?y 0.0081935 40 20 1.2 0.0068266 100000 300000 F 0.90,?x, ?y 0.0074960 40 40 1.2 0.0054714 10000000 30000000 F 0.90,?x, ?y 0.0074280 40 80 1.2 0.0049359 1000000000 3000000000 F 0.90,?x, ?y 0.0073386 20 40 1.3 0.0042824 1000 4000 F 0.90,?x, ?y 0.0095652 20 60 1.3 0.0040623 100000 400000 F 0.90,?x, ?y 0.0086485 20 80 1.3 0.0040901 10000000 40000000 F 0.90,?x, ?y 0.0085592 30 10 1.3 0.0076096 1000000000 4000000000 F 0.90,?x, ?y 0.0083917 30 30 1.3 0.0054683 1000 5000 F 0.90,?x, ?y 0.0108398 40 20 1.3 0.0074460 100000 500000 F 0.90,?x, ?y 0.0097352 40 40 1.3 0.0055207 10000000 50000000 F 0.90,?x, ?y 0.0096277 40 80 1.3 0.0045561 1000000000 5000000000 F 0.90,?x, ?y 0.0093842 tenable, the statistic 22 [( ) ( )]/ ( / ) ( / ) x yxxyy x ySnSn???? ? + has the approximate Student?s t-distribution with degrees of freedom 79 222 22 22 (/ /) (/) (/) 11 xx yy yy xx xy Sn Sn Sn Sn nn ? + = + ?? = 2 22 [() ()] (() (() xy Vx Vy Vx Vy ?? + + = 2 22 [() ()] (() (() xy yx Vx Vy Vx Vy ?? + + (39a) ? =? 2 0 2 0 (1) () + + xy n y nx FR FR ?? ? ? = 22 4 (1)+ + xy yx k k ?? ?? (39b) where 2 () / x x Vx S n= , R n = n y /n x , k = (/ )/(/ ) x xyy SnSn, and F 0 = 22 xy SS/ . Eq. (39b) shows that ? depends only on n x , n y , and the se ratio 0n FR . The formulas for degrees of freedom in (39a &b) rarely lead to an integer and ? is generally rounded down to make the test of H 0 : x y ??? = 0 conservative, i.e., rounding down ? increases the P-value of this last test. However, programs like Matlab and Minitab will provide the cdf and percentage points of the t-distribution for non-integer values of ? in Eqs. (39). It has been verified using a spreadsheet that Min(? x , ? y ) < ? < ? 
x + ? y is a certainty, and hence this t-test is less powerful than the pooled t-test. In fact, it is easy to algebraically prove that for the case of n x = n y = n, the value of ? always exceeds (n ? 1) and is always less than 2(n ? 1). It can also be verified that the maximum of ? in Eqs. (39) occurs when the larger sample also has much larger variance, but yet its value can never exceed the df, n x +n y ?2, of the pooled t-test, as illustrated in Table 21. When H 0 : x y ? ?= is rejected at the 20% level (i.e., P-value of the test < 0.20), the approximate (1??)?100% CI for x y ??? is given by x y? /2, t ? ? ?? 22 // x xyy Sn Sn+ ? x y ??? ? x y? + /2, t ? ? ? 22 // x xyy Sn Sn+ (40a) resulting in the approximate CIL is 2 /2, t ? ? ? 22 // x xyy Sn Sn+ , and H 0 : x y ? ?? = 0 can 80 Table 21. Verifying the Inequality that min( ? x , ? y ) < ? < ? x + ? y for Different x ? and y ? Combinations x ? y ? ()Var x ()Var y (at 0 F = 0. 90, , x y F ? ?= ) ? x ? y ? ()Var x ()Var y (at 0 F = 0. 90, , x y F ? ?= ) ? 1 11 20 6.201 1.106 11 1 20 0.331 11.992 6 16 20 9.181 8.371 16 6 20 6.987 18.725 11 21 20 10.550 17.483 21 11 20 9.449 30.068 16 26 20 11.451 27.422 26 16 20 10.779 40.883 26 31 20 12.364 49.013 31 26 20 12.174 56.686 36 41 20 13.220 69.455 41 36 20 13.110 76.486 46 51 20 13.830 89.823 51 46 20 13.758 96.317 66 71 20 14.664 130.370 71 66 20 14.626 136.052 86 91 20 15.222 170.752 91 86 20 15.198 175.855 106 111 20 15.631 211.034 111 106 20 15.615 215.702 126 131 20 15.947 251.252 131 126 20 15.935 255.579 176 181 20 16.505 351.633 181 176 20 16.498 355.351 226 231 20 16.878 451.882 231 226 20 16.874 455.192 326 331 20 17.361 652.198 331 326 20 17.359 654.979 426 431 20 17.669 852.395 431 426 20 17.668 854.840 526 531 20 17.889 1052.532 531 526 20 17.888 1054.739 1200 3000 20 18.811 2151.446 3000 1200 20 18.788 2275.082 1500 3500 20 18.921 2768.370 3500 1500 20 18.903 2912.635 2000 4000 20 19.037 3913.944 4000 2000 20 19.026 4090.453 2500 4500 20 19.120 5067.119 4500 2500 20 19.112 5263.906 2800 5000 20 19.166 5693.779 5000 2800 20 19.159 5901.618 be rejected at LOS =? if ||x y? > /2, t ? ? ? 22 // x xyy Sn Sn+ , i.e., ? ? Pr(||x y? > /2, t ? ? ? 22 // x xyy Sn Sn+ | x y ? ?? =? = 0) ? Pr(F 1,? > 2 /2, t ? ? | ? = 0) = Pr(F 1,? > ,1, F ? ? | ? = 0) (40b) As in the case of pooled t-test, for the individual two t-CIs, the rejection requirement is either L( x ? ) > U( y ? ) or L( y ? ) > U( x ? ) leading to the same condition as before in Eq. (37a). That is, ??= Pr(reject H 0 | ? = 0) = Pr[||x y? > /2, / x x x tSn ?? /2, / y yy tSn ?? + ] (37a) 81 It is impossible to studentize the argument of Eq. (37a) because when x y ? ?? , the expression for t ? = 2 //Z ? ? ? will show that ||x y? / 22 //+ x xyy Sn Sn is not central t distributed with n x +n y ? 2 df. In other words, there does not exist a central 2 ? ? rv that reduces t ? = 2 //Z ? ? ? to the form ||x y? / 22 //+ x xyy Sn Sn iff x y ? ?? . However, ||x y? / 22 //+ x xyy Sn Sn is approximately t distributed with df ? given in Eqs.(39). Therefore, Eq. (37a) can approximately be written as ??? Pr{| t ? | > ( /2, / x x x tSn ?? /2, / y yy tSn ?? + )/ 22 //+ x xyy Sn Sn } = Pr{F 1,? > ( /2, / x x x tSn ?? /2, / y yy tSn ?? + ) 2 /( 22 // x xyy Sn Sn+ )} Or ??? Pr{F 1,? >( /2, x x y tSn ?? /2, + y yx tSn ?? ) 2 /( 22 yx xy nS nS+ )} = Pr[F 1,? > ( /2, x kt ?? ?+ /2, y t ? ? ) 2 /( 2 1 k+ )] (41a) Let / nyx R nn= (or y nx nRn= ) and 22 0 /= x y FSS. Substituting n R = n y /n x and 0 F into Eq. (41a) results in ??? Pr[F 1,? 
> ( /2, 0 x n tRF ?? /2, y t ? ? + ) 2 /( 0 1 n RF+ )] (41b) ?? can be also represents as ??? Pr[F 1,? > ( /2, x kt ?? ? + /2, y t ? ? ) 2 /( 2 1k + )] , where k = (/ )/(/ ) x xyy SnSn. When n x = n y = n, R n = 1, and the above formula for ?? reduces to ??? Pr[F 1,? > F ?,1,n?1 ( 0 F 1+ ) 2 /( 0 1F + )] (41c) which is similar to (37e) but ? is given by Eqs. (39) instead of n x +n y ? 2 in the case of the pooled t-test, or instead of n ? 1 as stated by Payton et al. (2000). For equal sample sizes 82 Eq. (39b) simplify to ? = 22 00 (n 1)(1 F ) /(1 F )?+ + and not (n ?1) as reported by Payton et al. (2000). Note that this last formula for ? reduces to 2(n ? 1) at F 0 = 1, which is the df of the pooled t-test, as it should because the unlikely realization F 0 = 1 is in perfect agreement with H 0 : 22 = x y ? ? . Further, Eq. (41b) shows that ?? does not depend on the specific values of 22 x y S and S but only on their ratio k = 0 F n R . For Payton et al.?s (2000) reported example of n 1 = n 2 = 10, S 1 = 0.80 and S 2 = 1.60, Eq. (41c) shows that at n = 10 and F 0 = 0.25, ? = 13.2353 resulting in the value of ??? 0.00940573, which is different from 0.0149 reported by Payton et al. (2000, p. 549). The df used by them was 9 which caused the % relative error in their reported ??= 0.0149 to be equal 54.414%. Payton et al. (2000, p. 549) also make the following statement about 1/3 of the way from the top of their p. 549: ?If the samples are collected from the same normal population, the quantity 2 12 22 12 ()nY Y SS + + is F-distributed with 1 and n?1 degrees of freedom.? The statement should go as follows: If the samples are collected from the two normal populations with identical means and variances, [our Eq. (37e) shows that] the statistic 2 12 22 12 ()nY Y SS ? + is F-distributed with 1 and 2(n?1) degrees of freedom (not n ?1 as stated). Payton et al. (2000, p. 550) also make the following statement in the second paragraph leading to their Eq. [1]: ? If the researcher is willing to assume that S 1 and S 2 are estimating the same parameter value (i.e., homogenous variances), then the above equation simplifies to 0.95 = Pr[F 1,9 < 2F ?,1,9 ] [1]? 83 Their above quote should be stated as follows: In the unlikely event that F 0 = 22 12 /SSis realized to equal 1, then the above equation simplifies to 0.95 = Pr(F 1,13.2353 < 2F ?,1,9 ). Note that they are using (1 ? ?) also as the Overlap confidence level, and secondly just because two independent population variances are equal, it does not imply that the corresponding point estimates 22 12 SandS will be the same. Further, Payton et al. limit their sample means, 1 Y and 2 Y , originating from the same normal population on p. 548. Our work herein is not limited to the same normal population but to any two distinct normal populations. We now proceed to obtain the LUB (least upper bound) and the GLB (greatest lower bound) for ?? in Eq. (41b). The LUB occurs when the argument on the RHS of the Pr in Eq. (41b) is smallest. To this end, let ? 2 = Max(? x , ? y ) and thus ??? Pr{F 1,? > [ 2 /2, 0n tRF ?? 2 /2, t ? ? + ] 2 /( 0 1 n RF+ )} ? LUB(??) = Pr{F 1,? > 2 ,1, 0 ( n FRF ?? 1+ ) 2 /( 0 1 n RF+ )} Conversely, the greatest lower bound occurs when the argument on the RHS of (41b) is largest. Letting ? 1 = Min(? x , ? y ) in (41b) results in ??? Pr{F 1,? > [ 1 /2, 0n tRF ?? 1 /2, t ? ? + ] 2 /( 0 1 n RF+ )} ? GLB(??) = Pr{F 1,? > 1 ,1, 0 ( n FRF ?? 1+ ) 2 /( 0 1 n RF+ )} , or Pr{F 1,? > 1 ,1, 0 ( n FRF ?? 1+ ) 2 /( 0 1 n RF+ )} 2 ,1, 0 ( n FRF ?? 
1+ ) 2 /( 0 1 n RF+ )}, while the expression for exact type I Pr from (40b) is ? ? Pr(F 1,? > ,1, F ? ? | x y ??? = 0). The function ( 0n R F 1+ ) 2 /( 0 1 n RF+ ) clearly always exceeds 1 because 0n R F = 84 22 (/)(/) y xxy nn SS? = ()/ ()Vx Vy, which is also a se ratio, can never equal to zero and the function is bounded by 1 < ( 0n R F 1+ ) 2 /( 0 1 n RF+ ) ? 2, the maximum occurring when 0n R F = 1. Because we are seeking to establish that ?? in (41b) is always smaller that ? ? Pr(F 1,? > ,1, F ? ? | x y ? ?? = 0), we consider the very worst-case scenario where the smallest sample has the largest variance. For example, at ? x = 1, ? y =11 (so that R n = n y /n x = 12/2 = 6), 2 x S = 8.00 , 2 y S = 2.4805, F 0 = F 0.10,1,11 = 3.2252, 0.025,1 t = 12.7062, and 0.025,11 t = 2.2010 the value of ? = 1.105755. Substituting these into (41b) results in ??= Pr{F 1,1.105755 > 165.842616) = 0.0386 < 0.05. It has also been verified that 2 y S can be as small as 2 x S /F 0.0001,?x,?y and still?? < ?. Note that if xy 0.90,n 1,n 1 F ? ? < 0 F = 2 x S/ 2 y S < xy 0.10,n 1,n 1 F ?? , then we are recommending using the pooled t-test so that the value of 0 F must lie outside the interval ( xy 0.90,n 1,n 1 F ? ? , xy 0.10,n 1,n 1 F ? ? ) in order to apply the two- independent sample t-test. This is consistent with statistical literature (see J. L. Devore, p.377 ) that suggests not to use the pooled t-test unless there is compelling evidence in favor of H 0 : ? x = ? y . Keeping F 0 ? 0.10, 1, 1 x y nn F ?? fixed, ?? in Eq. (41a) attains its minimum at R n = n y /n x = 1, and the limit of ?? as R n ? 0 or ? is ?; similarly, if F 0 ? 0.90, 1, 1 x y nn F ?? is kept fixed, ?? is minimum at R n = 1 and its limit approaches ? as R n ? 0 or ?. As F 0 ? ?, ?? approaches the value of ?, i.e., the Overlap converges to an ?-level test; however, the farther R n is above 1, the faster is the limiting approach of ?? to ? as F 0 ? ?. As F 0 ? 0, 85 ?? also approaches the value of ?, and the farther R n is below 1, the faster is the limiting approach of ?? to ? as F 0 ? 0. For example, if n x = n y = 50 (i.e., R n =1), F 0 = 10 6 then ??= 0.04978024 (nearly 5%). If n x = 50, n y = 100, R n = 2, F 0 = 10 6 , then ??= 0.04984645882. However, If n x = 50, n y = 25, R n = 0.5, F 0 = 10 6 , ??= 0.0496811387 but if F 0 = 10 ?6 , then??= 0.0498546652. Further, if R n = 1, then the limiting value of ?? as n x ? ? is equal to 0.0055751 as long as 0.10 0 < (or decreasing 00.90 /2, 1n t ? ? ? d Sn/ , i.e., ? = Pr( xy? d nS/ > /2, 1n t ? ? ) = Pr(|t n?1 | ? /2, 1n t ? ? ). For the two separate CIs, the rejection requirement is either L( x ? ) > U( y ? ) or L( y ? ) > U( x ? ) leading to the same condition as in Eq. (37a). ?? = Pr[||x y? > /2, 1 / ?nx tSn ? /2, 1 / ? + ny tSn ? ] = Pr[| d| n > /2, 1 () nxy tSS ? ? + ] = Pr[| d| / d nS > /2, 1 () nxy tSS ? ? + /S d ] (44a) Because the null SMD of d / d nS is the Student?s t with (n ? 1) df, then (44a) can be written as ?? = Pr[|t n ?1 |> /2, 1 () nxy tSS ? ? + /S d ] = Pr[F 1,n-1 > 2 ,1, 1 () nxy FSS ? ? + / 2 d S ] = Pr[F 1,n-1 > 2 ,1, 1 0 0 0 (1)/(12) ? ++? n FF FrF ? ] (44b) We now proceed to show that (S x +S y ) 2 ? 2 d S = 22 ?2 x yxy SS ?+? for all values S x and S y . There are two possibilities: (1) r > 0 ? ? xy ? > 0, in which case it is obvious that (S x +S y ) 2 > 2 d S . (2) r < 0 ? ? xy ? < 0 and 2 d S attains its maximum when r = ?1 ? Max( 2 d S ) 88 = 22 ?2| | x yxy SS ?++ . In this worst-scenario case, it is clear that (S x +S y ) 2 = 22 2 x yxy SS SS++ ? 
22 ?2| | x yxy SS ?++ because ?||? xyxy SS? . Thus, as before,??< ? . How much smaller ?? is than ? depends both on the sign and magnitude of the sample correlation coefficient r . The glb occurs when (S x +S y ) 2 is largest relative to 2 d S , i.e., when 2 d S attains its minimum value. This minimum occurs when X and Y are highly positively correlated and in the limit ? xy ? ? S x S y . From Eq. (44b) we obtain ?? = Pr[F 1,n-1 > 2 ,1, 1 () nxy F SS ? ? + / 2 d S ] ? Pr[F 1,n-1 > 2 ,1, 1 () nxy F SS ? ? + /( 22 2 x yxy SS SS+? )] ? ?? > Pr[F 1,n-1 > 2 ,1, 1 () nxy FSS ? ? + / 2 () x y SS? ] ? GLB(??) = Pr[F 1,n-1 > 2 ,1, 1 () nxy F SS ? ? + / 2 () x y SS? ] ? GLB(??) = Pr[F 1,n-1 > 2 ,1, 1 0 (1) ? + n FF ? / 2 0 (1)?F ] (44c) There seems to exist a problem in the inequality (44c), i.e., when S x ? S y , the expression on the RHS of the Pr is not defined. However, this occurs iff the correction coefficient r = 1 which occurs only if the values of the rv Y is precisely a linear function of X, i.e., Y must equal to ax + b +? and the constant a > 0 and b can be any real number. Because a > 0, the variance of Y cannot equal to that of X. Secondly, the largest value of ?? occurs when 2 ,1, 1 () nxy F SS ? ? + / 2 d S attains its minimum value which in turn occurs when S d attains its maximum value of 22 x y SS++ ?2| | xy ? . Thus, ?? = Pr[F 1,n-1 > 2 ,1, 1 () nxy F SS ? ? + / 2 d S ] ? Pr[F 1,n- 1 > 2 ,1, 1 () nxy F SS ? ? + 22 ?/( 2 ) x yxy SS ?++ ] 89 The quantity 22 ?2| | x yxy SS ?++ attains its maximum when the sample correlation coefficient r = ?1, in which case | ? xy ? | = S x S y so the maximum value of ?? reduces to ?? ? Pr[F 1,n- 1 > 2 ,1, 1 () nxy F SS ? ? + 22 /( 2 ) x yxy SS SS++ ] = Pr[F 1,n- 1 > ,1, 1n F ? ? ) = ?. Hence, we have the result Pr[F 1,n-1 > 2 ,1, 1 () nxy FSS ? ? + / 2 () x y SS? ] ? ??? Pr[ n-1 T > /2, 1n t ? ? ] = ? For any r, the value of ??can never exceed ?. For example, for ? = 0.05, F 0 = 1, r = ? 0.25, the value of ?? ranges in the interval 0.01369406 (at n = 101) ???? 0.0321416 (at n = 3). For ? = 0.05, F 0 = 1.5, r = ? 0.25, it ranges in the interval 0.013975172 (at n = 101) ? ??? 0.032328946 (at n = 3); at F 0 = 2.0, 0.014512 (at n = 101) ???? 0.032681748 (at n =3). As F 0 ? ?, ??? ?. For ? = 0.05, F 0 =1, r = ? 0.5, the value of ?? ranges in the interval 0.02406825415 (at n = 101) ???? 0.03820582 (at n = 3). For ? = 0.05, F 0 =1.5, r = ?0.5, it ranges in the interval 0.02430291 (at n = 101) ???? 0.03832841 (at n = 3); at F 0 = 2.0, 0.0247473226 (at n = 101) ???? 0.038559305 (at n =3). As F 0 ? ?, ??? ?. For ? = 0.05, F 0 =1, r = ? 0.75, the value of ?? ranges in the interval 0.0363984432 (at n = 101) ???? 0.0441575 (at n = 3). For ? = 0.05, F 0 =1.5, r = ?0.75, it ranges in the interval 0.03653187 (at n = 101) ???? 0.04421765 (at n = 3); at F 0 = 2.0, 0.036783673 (at n = 101) ???? 0.04433101 (at n =3). As F 0 ? ?, ??? ?. Moreover, the effect of negative correlation is to increase?? toward ? as n ?1 goes toward 1. For example, when n ? 1 =1, F 0 = 2 and r = ? 0.90, then ?? = 0.0487767; at n ? 1 =1, F 0 = 2 and r = ? 0.95, then ?? = 0.0493921346 while at r = 90 ? 0.99 and F 0 = 2, ?? = 0.04987903. For fixed F 0 and ?1 ? r < 0, the limiting behavior of ?? as n ? ? is difficult to investigate because as n ? ?, then per force 0 F = 2 x S/ 2 y S ? 22 xy /??( an unknown parameter), and r ? ? xy (the population correlation coefficient which is another unknown parameter). Most importantly, Matlab loses accuracy in inverting F 1,n-1 once n?1 far exceeds 1,000,000. 
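Before turning to that numerical-accuracy illustration, the α′ values quoted in this subsection can be reproduced directly from Eq. (44b), which gives α′ = Pr[F(1, n−1) > F(α, 1, n−1)·(1 + √F0)² / (1 + F0 − 2r·√F0)]. The short Matlab sketch below is our own illustration (it is not one of the Appendix B functions, and the variable names are ours); with the inputs shown it returns approximately 0.0321, matching the value 0.0321416 quoted above for α = 0.05, F0 = 1, r = −0.25 and n = 3:

alpha = 0.05;  n = 3;  F0 = 1;  r = -0.25;       % illustrative values taken from the text
Fcrit = finv(1 - alpha, 1, n - 1);               % F(alpha,1,n-1) = t(alpha/2,n-1)^2
arg   = Fcrit*(1 + sqrt(F0))^2/(1 + F0 - 2*r*sqrt(F0));
alphaPrime = 1 - fcdf(arg, 1, n - 1)             % approximately 0.0321416

Because only F0 = Sx²/Sy², r, n and α enter the argument of the F cdf, these same four lines reproduce every α′ interval listed above simply by changing n, F0 and r.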
For example, Matlab gave ??(at n = 1,000,000, F 0 =1, r = ?0.50) = 0.0236254, ??(at n = 10,000,000, F 0 =1, r = ?0.50) = 0.0237475, but ??(at n = 100,000,000, F 0 =1, r = -0.50) = 0.0938949. We are fairly certain that ??(as n ? ? , F 0 = 1, r = ?0.50) = 0.0236254 is the correct answer and the last one is inaccurate because Matlab gave finv(0.95,1,100000000) = 2.10472249984741 instead of the correct value of 3.841458914. For negative correlation we have verified that as F 0 and n ? ?,??? ?. For ? = 0.05, F 0 = 1.00, and r = 0.25, the value of ?? ranges in the interval 0.0016243942 (at n =101) ???? 0.01966083 (at n = 3). For ? = 0.05, F 0 = 1.5, and r = 0.25, the value of ?? ranges in the interval 0.0017703 (at n = 101) ???? 0.0199853 (at n = 3). While, For ? = 0.05, F 0 = 2.00, and r = 0.25, the value of ?? ranges in the interval 0.00206727 (at n = 101) ???? 0.02059583 (at n = 3). For ? = 0.05, F 0 =1, r = 0.50, the value of ??lies in the interval 0.000136523 (at n = 101) ???? 0.0132366264 (at n = 3); at F 0 =1.5, ??lies in the interval 0.000169103 (at n = 101) ???? 0.013633621 (at n = 3); while at F 0 = 2.00, ??lies in the interval 0.0002456622 (at n = 101) ??? ? 0.01438047707496 (at n = 3). For ? = 0.05, F 0 =1, r = 0.75, the value of ??lies in the interval 0.0000001795166(at n = 101) ???? 0.00668445233 (at n = 3);at F 0 =1.5, ?? lies in the 91 interval 0.00000041140283 (at n = 101) ???? 0.00715685 (at n = 3); while at F 0 = 2.00, ?? lies in the interval 0.000001550553 (at n = 101) ? ?? ? 0.0080453 (at n = 3). The impact of positive correlation is to reduce ?? toward zero as n ? ?. For example, at ? = 0.05, F 0 = 1.5, r = 0.75 and n = 500, ?? reduces to 0.00000012177 from its value of 0.00000041140283 at n = 101, and at n = 101, r = 0.90, ?? reduces to 1.25244259407?10 ?12 . Because, we have coded Matlab functions (see Appendix B) to compute the?? values for all three above cases (pooled t-test, two-independent t-test, and the paired t- test), no extra tables are provided. 92 7.0 The Percent Overlap that Leads to the Rejection of H 0 :? x = ? y 7.1 The case of Unknown ? x = ? y = ? Throughout this section, it is understood that a pretest on H 0 :? x = ? y = ? has yielded a P-value > 0.20 so that the null hypothesis H 0 :? x = ? y = ? is tenable leading to a pooled t-test. As before, let ? represent the amount of overlap length between the two individual CIs on process means. Then ? will be 0 either L(? x ) > U(? y ) or L(? y ) > U(? x ), in which case H 0 :? x = ? y is rejected at the LOS < ?. Thus, the overlap amount ? is larger than 0 when U(? x ) > U(? y ) > L(? x ) or U(? y ) > U(? x ) > L(? y ). In these two cases, both U(? x ) > U(? y ) > L(? x ) and U(? y ) > U(? x ) > L(? y ) will lead to the same result. Therefore, only U(? x ) > U(? y ) > L(? x ) is discussed here so that we are making the assumption that x y?? 0 ? ? = U(? y ) ?L(? x ) = ( /2, / y y y y tSn ?? + ) ? ( /2, / x x x x tSn ?? ? ) = ( /2, /2, // xy x xyy tSntSn ?? ?? + ) ? ( x y? ) (45a) Further, the span of the two individual CIs is U(? x ) ? L(? y ) = ( /2, / x x x x tSn ?? +? ) ? ( /2, / y yy y tSn ?? ?? ) = ( /2, /2, // xy x xyy tSntSn ?? ?? + ) + ( x y? ) (45b) 93 From equations (45a &b) the % overlap is given by /2, /2, /2, /2, (/ /)() 100% (/ /)() xy xx yy xx yy tSntSnxy tSntSnxy ?? ?? ?? ?? ? +?? =? ++ (45c) As ||x y? increases, the P-value of the test decreases (i.e., H 0 : ? x = ? y must be rejected more strongly) and ? in Eq. (45c) decreases. Because H 0 : ? x = ? 
y must be rejected at the ?- level if ||x y? ? /2, t ? ? ? 1/ 1/ px y Snn+ , where ? = n x + n y ?2, then from (45c) H 0 must be barely rejected at ??100% level or less iff /2, /2, /2, /2, /2, /2, (/ /)( 1/1/) 100% (/ /)( 1/1/) xy xx yy p x y xx yy p x y tSntSnt Snn tSntSnt Snn ?? ?? ?? ?? ?? ?? ? +??+ ?? ++ (46a) Putting / nyx R nn= into Eq. (46a) leads to the result ? /2, /2, /2, /2, /2, /2, (/)(1/) 100% xy xy n pn n pn tStSRt S R tStSRt S R ?? ?? ?? ?? ?? ?? ? +??+ ?? ++ ? /2, /2, /2, /2, /2, /2, ()(1) 100% xy xn y pn xn y pn tSRtSt SR tSRtSt SR ?? ?? ?? ?? ?? ?? ? +??+ ++?+ (46b) As defined before, letting 22 0 / x y FSS= into Eq. (46b) and recalling ? = n x +n y ?2 results in 0 /2, /2, /2, 0 0/2,/2,/2, 0 (1 )( ) / 100% (1 )( ) / xy xy nnxy FR t t t R F FR t t t R F ?? ?? ?? ?? ?? ?? ??? ? ??? ?+? + + ?? ?++ + + (46c) where F 0 ?R n = ( 22 / x y SS)? y x n n = 2 2 / / x x y y Sn Sn = () () x y v v = (se ratio) 2 . Thus, the percent overlap at which H 0 should be rejected exactly at the ?-level is given by 94 0/2,/2,/2, 0 0/2,/2,/2, 0 (1 )( ) / 100% (1 )( ) / xy xy nnxy r FR t t t R F FR t t t R F ?? ?? ?? ?? ?? ?? ??? ? ??? ?+? + + =? ?++ + + (46d) = /2, /2, /2, 0 /2, /2, /2, 0 (1 )( ) / 100% (1 )( ) / xy xy nx y nx y kt t t R F kt t t R F ?? ?? ?? ?? ?? ?? ??? ??? ?+? + + ? ?++ + + (46e) Eq. (46d) shows that the % overlap at which H 0 must be rejected at the ?-level depends only on ?, n x , n y and F 0 and not on the specific values of S x and S y . For larger values of n x and n y > 30, the dependency on ? is negligible because /2, x t ? ? , /2, y t ? ? and /2, t ? ? are close in values and are almost equal once n x and n y > 60 . For the case of balanced completely randomized design (i.e., n = n x = n y ? R n =1), the inequality in (46d) reduces to /2, 1 0 /2,2( 1) 0 /2, 1 0 /2,2( 1) 0 (1 ) 1 100% (1 ) 1 nn r tFt F ?? ? ?? +? + =? ++ + (46e) We first discuss the limiting property of the Eq. (46e). Because this is the case of pooled t-test, F 0 = (S x /S y ) 2 (the ratio of the two sample variances) must lie within the acceptance interval ( 0.90, 1, 1nn F ?? , 0.10, 1, 1nn F ?? ); otherwise H 0 : ? x = ? y must be rejected at the 20% level. Further, the first derivative of ? r vanishes at F 0 = 1 so that the maximum of ? r occurs at F 0 = 1 and is equal to 0.613625686 at n = 2 and its maximum approaches (2 2)/(2 2)?+= 0.171573 as n ? ?. Note that as n ? ?, the value of F 0 that must lie within ( 0.90, 1, 1nn F ?? ? F 0 ? 0.10, 1, 1nn F ? ? ) must per force also go towards 1 because the limiting value of both 0.90, 1, 1nn F ?? and 0.10, 1, 1nn F ? ? is nearly 1. That is, for all F 0 values where H 0 : x y ? ?= cannot be rejected, the limiting value of ? r in terms of n cannot be less 95 than 0.171573. At n = 61, ? r = 0.17603 if F 0 = 1.2 so that the approach of ? r toward 0.171573 occurs fairly rapidly in terms of n as long as 0.90, 1, 1nn F ? ? ? F 0 ? 0.10, 1, 1nn F ?? . At n = 31 and F 0 =1.30, the value of ? r = 0.1806 so that H 0 : ? x = ? y must be rejected at 5% or less if the % overlap is less than or equal to 18.06%. In the unbalanced case if R n = 0.50 or 2, the limiting value of ? r at F 0 =1 is equal to (2 1 3)/(2 1 3)+? ++ = 0.164525. Further, as R n ? 1 deviates farther from 1, the limiting value of ? r decreases for a fixed F 0 as long as 0.90, 1, 1nn F ? ? ? F 0 ? 0.10, 1, 1nn F ?? . For example, at R n = 3 (or 1/3), the limiting value of ? 
r is 0.15470; at R n = 4 (or 0.25), its limiting value is 0.14590; at R n = 5 (or 0.20), its limiting value is 0.13835; at R n = 0.10 (or 10), the limiting value of ? r is 0.11307, while at Rn = 20 (or 0.05) the limiting value of ? r is equal to 0.088472. Clearly, as R n deviates farther from 1, the limiting value of ? r decreases, implying that the Overlap approaches an ?-level test. See the illustration in Table 22. Finally, it must be noted that as n x and n y become very large, the limit of Eq. (46d) becomes identical to r ? = 2 2 (1 1 ) (1 1 ) kk kk +? + ++ + 100% given in Eq. (12e). Next, what should each individual confidence level 1 ?? be so that the two independent CIs lead to the exact ??100%-level test on H 0 : ? x = ? y . The expressions for the two 1?? independent CIs are given by /2, / x x x x tSn ?? ?? ? ? x ? /2, / x x x x tSn ?? +? (47a) /2, / y yy yt S n ?? ?? ? ? y ? /2, / y yy yt S n ?? +? (47b) It is clear that H 0 : ? x = ? y must be rejected at the ?-level iff the amount of overlap 96 Table 22. The Value of r ? for Different F 0 and R n Combinations F 0 x ? y ? R n r ? F 0 x ? y ? R n r ? 0.5 1 1 1 60.90835 0.8 11 5 0.5 24.33944 0.5 11 11 1 19.33138 0.8 21 10 0.5 20.81321 0.5 21 21 1 17.90985 0.8 51 25 0.5 18.92389 0.5 31 31 1 17.42757 0.8 61 30 0.5 18.72412 0.5 41 41 1 17.18504 0.8 1001 500 0.5 17.81175 0.5 51 51 1 17.03911 0.8 11 23 2.0 17.54347 0.5 100 100 1 16.74930 0.8 31 63 2.0 15.87565 0.5 150 150 1 16.64982 0.8 51 103 2.0 15.53555 0.5 200 200 1 16.60028 0.8 1001 2003 2.0 15.04753 0.8 1 1 1 61.31421 1 11 5 0.5 22.76531 0.8 11 11 1 19.95417 1 21 10 0.5 19.37801 0.8 21 21 1 18.53612 1 41 20 0.5 17.85949 0.8 31 31 1 18.05497 1 81 40 0.5 17.14235 0.8 41 41 1 17.81299 1 151 75 0.5 16.81705 0.8 51 51 1 17.66738 1 501 250 0.5 16.56104 0.8 100 100 1 17.37823 1 1001 500 0.5 16.50667 0.8 500 500 1 17.14089 1 10001 5000 0.5 16.45788 0.8 1000 1000 1 17.11144 1 500 1001 2.0 16.50667 1 1 1 1 61.36257 1 2 5 2.0 35.76007 1 10 10 1 20.33789 1 10 21 2.0 19.37801 1 50 50 1 17.75437 1 20 41 2.0 17.85949 1 100 100 1 17.45340 1 50 101 2.0 17.00221 1 10000 10000 1 17.16022 1 100 201 2.0 16.72519 1 500000 500000 1 17.15735 1 10000 20001 2.0 16.45517 1 10000000 10000000 1 17.15729 1 10000000 20000001 2.0 16.45247 1.2 1 1 1 61.33025 1 100 302 3.0 15.77155 1.2 10 10 1 20.28821 1 10000 30002 3.0 15.47304 1.2 20 20 1 18.63655 1 1000000 3000002 3.0 15.47008 1.2 60 60 1 17.60330 1 100 403 4.0 14.91845 1.3 10 10 1 20.23527 1 10000 40003 4.0 14.59306 1.3 20 20 1 18.58326 1 1000000 4000003 4.0 14.58984 1.3 30 30 1 18.05974 1 10000 50004 5.0 13.83815 1.6 5 5 1 23.67919 1 1000000 5000004 5.0 13.83471 1.6 10 10 1 20.01202 1 10000 100009 10.0 11.31132 1.6 13 13 1 19.23368 1 10000000 100000009 10.0 11.30718 between (47a) and (47b) barely becomes zero or less. Without loss of generality, the x- sample will be denoted such that x y? ? 0. Therefore, we deduce from (47a & b) that 97 U'(? y ) ?L' (? x ) = ( /2, / y yy yt S n ?? +? ) ? ( /2, / x x x x tSn ?? ?? ) = /2, / y yy tSn ?? ? /2, / x x x tSn ?? +? ? ( x y? ) (48a) Because H 0 : ? x = ? y must be rejected at the ?-level as soon as the RHS of Eq. (48a) becomes 0 or smaller, we impose the borderline rejection criterion ||x y? = /2, t ? ? ? 1/ 1/ p xy Snn+ into Eq. (48a). In short, we are rejecting H 0 :? x = ? y as soon as the two independent CIs in (47a) and (47b) become disjoint. This leads to rejecting H 0 :? x = ? y iff /2, / y yy tSn ?? ? /2, / x x x tSn ?? +? ? ( x y? ) ? 0. (48b) At the borderline value, we set x y? = /2, t ? ? ? 
1/ 1/ px y Snn+ and set the LHS of inequality (48b) to 0 in order to solve for? . ? /2, / y yy tSn ?? ? /2, / x x x tSn ?? +? ? /2, t ? ? ? 1/ 1/ p xy Snn+ = 0 /2, ? y y tS ?? /2, +? x x tS ?? n R ? /2, t ? ? ? 1+ pn SR = 0. /2, y t ? ? /2, +? x t ?? 0 n FR ? /2, t ? ? ? 0 (1)( )/++ nxy RF? ?? = 0. ? /2, y t ? ? /2, +? x t ?? 0 n FR = /2, t ? ? ? 0 (1)( )/++ nxy RF? ?? or /2, y t ? ? /2, +? x kt ? ? = /2, t ? ? ? 0 (1)( )/++ nxy RF? ?? (49a) where 22 0 / x y FSS= , R n = n y /n x , ? = n x + n y ?2 and k= 0 n FR = se ratio of samples. Eq. (49a) clearly shows that the value of ? depends on the LOS ? of testing H 0 : ? x = ? y , also on 22 0 / x y FSS= , and the sample sizes n x , n y . For the case of balanced design (n x = n y = n), (49a) reduces to /2, 1ny tS ? ? ? /2, 1nx tS ? ? +?? /2,2( 1)n t ? ? ? 2 p S = 0 98 ? /2, 1 () nxy tSS ? ? + ? /2,2( 1)n t ? ? ? 22 x y SS+ = 0 ? /2, 1n t ? ? = /2,2( 1)n t ? ? ? 22 x y SS+ / () x y SS+ ? /2, 1n t ? ? = /2,2( 1)n t ? ? ? 0 1F + / 0 (1)F + ? ,1, 1n F ? ? = ,1,2( 1)n F ? ? ?( 0 1 F+ )/(1+ 0 F ) 2 (49b) For example, when ? = 0.05, n x & n y = 21, Eq. (49b) gives ? = 0.16807 so that the two independent CIs have to be set at the confidence level 1?? = 0.83193 in order for the Overlap to provide an exact 5% level test. The values of 1?? range from 0.2020062 at n?1 = 1 down to 0.16596 at n?1 = 100. In order to obtain the limiting value of ?, we let n? ? in (49b) resulting in Lim /2, 1n t ? ? ( n? ?) = 1.96? 11+ / (1 1)+ =1.96/ 2 = 1.38593 ? Limit ? (as n? ?) = Pr(|Z| ? 1.38593) = 0.16578, which is identical to the know-&-equal-variances case from Eq. (13) at K =1. 7.2 The Case of H 0 : xy ? ?= Rejected Leading to the Two-Independent Sample t-Test Assuming that X~N( 2 ,x x ? ? ) and Y~N( 2 ,yy ? ? ) and X Y? is N( x y ? ?? , 2 2 y x x y nn ? ? + ), where now the null hypothesis of H 0 : x y ? ?= is rejected at the 20% level (i.e., the P-value of the pre-test is less than 20%) leading to the assumption that the F- statistic F 0 = 22 xy SS/ is outside the interval ( xy 0.90,n 1,n 1 F ? ? , xy 0.10,n 1,n 1 F ? ? ), where without loss of generality the sample with the larger mean will be called X. It has been shown in 99 statistical theory that if the assumption x y ? ?= is not tenable, the statistic 22 [( ) ( )]/ ( / ) ( / ) x yxxyy x ySnSn???? ? + has approximately the Student?s t-distribution with degrees of freedom given by Eq. (39). 222 22 22 (/ /) (/) (/) 11 xx yy yy xx xy Sn Sn Sn Sn nn ? + = + ?? = 2 22 [() ()] (() (() xy Vx Vy Vx Vy ?? + + = 2 22 [() ()] (() (() xy yx Vx Vy Vx Vy ?? + + (39a) ? =? 2 0 2 0 (1) () + + xy n y nx FR FR ?? ? ? = 22 4 (1)+ + xy yx k k ?? ?? (39b) As before, the amount of overlap between the two individual CIs is given by ? = U(? y ) ? L(? x ) = ( /2, / y y y y tSn ?? + ) ? ( /2, / x x x x tSn ?? ? ) = ( /2, /2, // xy x xyy tSntSn ?? ?? + ) ? ( x y? ) (50a) Further, the span of the two individual CIs is U(? x ) ? L(? y ) = ( /2, / x x x x tSn ?? +? ) ? ( /2, / y yy yt S n ?? ?? ) = ( /2, /2, // xy x xyy tSntSn ?? ?? + ) + ( x y? ) (50b) From equations (50 a &b) the % overlap is given by /2, /2, /2, /2, (/ /)() 100% (/ /)() xy xx yy xx yy tSntSnxy tSntSnxy ?? ?? ?? ?? ? +?? =? ++ (50c) As ||x y? increases, the P-value of the test decreases (i.e., H 0 : ? x = ? y must rejected more strongly) and ? in Eq. (50c) decreases. Because H 0 : ? x = ? y must be barely rejected at the ?- level if ||x y? = /2, t ? ? ? 22 //+ x xyy Sn Sn, where ? is given in Eq. 
(39), then from (50c) 100 22 /2, /2, /2, 22 /2, /2, /2, (/ /) // 100% (/ /) // +??+ =? ++ xy xx yy xxyy r xx yy xxyy tSntSnt SnSn tSntSnt SnSn ?? ?? ?? ?? ?? ?? ? (51a) In order to simplify (51a), we multiply throughout by y n , divide throughout by S y and replace n y /n x by R n and 22 xy SS/ by F 0 resulting in /2, 0 /2, /2, 0 /2, 0 /2, /2, 0 ()1 100% +??+ =? ++?+ xy nn r tRFt t RF tRFt t RF ?? ?? ?? ?? ?? ?? ? = 2 /2, /2, /2, 2 /2, /2, /2, ()1 100% xy xy kt t t k kt t t k ?? ?? ?? ?? ?? ?? +??+ ? ++?+ (51b) where F 0 lies outside the 20% acceptance interval ( xy 0.90,n 1,n 1 F ? ? , xy 0.10,n 1,n 1 F ?? ). The % overlap in Eq. (51b) changes very little as ? changes, increasing a bit as ? decreases while other parameters n x , n y and F 0 are kept fixed. As F 0 increases, the value of ? r decreases such that as F 0 ? ?, ? r ? 0 so that the overlap becomes an exact ?-level test. The limiting (in terms of n x and n y ) values of ? r at R n = 2, 3, 4, 5, 10 and 20 are independent of ? (because for large n x and n y all 3 t inverse functions in (51b) are almost equal) and are almost identical to those of the pooled t-test, namely 0.164509, 0.154679, 0.1458744, 0.138322, 0.11305, and 0.08845, respectively. When the design is balanced (n x = n y = n), the % overlap in Eq.(51b) that still leads to the rejection of H 0 : ? x = ? y at the ?-level reduces to /2, 1 0 /2, 0 /2, 1 0 /2, 0 (1) 1 100% (1) 1 ? ? +? ? + =? ++ ? + n r n tFtF tFtF ??? ? (51c) 101 where in the balanced case ? = 222 44 (1)( )?+ + xy xy nSS SS = 2 0 2 0 (1)( 1) 1 nF F ?+ + . If the % overlap exceeds Eq.(51c), then H 0 : ? x = ? y can no longer be rejected at the ??100% level of significance. For values of F 0 outside the range ( xy 0.90,n 1,n 1 F, ? ? xy 0.10,n 1,n 1 F ?? ), the limiting value of ? r (at any ? ) as F 0 ? 1 from Eq. (51c) is, as before, equal to (2 2) / (2 2)?+ = 0.171573. Again, as F 0 ? ?, ? r ? 0, which is consistent with the results in Chapter 3 with known but unequal sample case of the SE ratio k ? ?. Now, what should each individual confidence level 1 ?? be so that the two independent CIs lead to the exact ??100%-level test on H 0 : ? x = ? y . As before, the expressions for the two 1?? independent CIs are given by /2, / x x x x tSn ?? ?? ? ? x ? /2, / x x x x tSn ?? +? (52a) /2, / y yy y tSn ?? ?? ? ? y ? /2, / y yy y tSn ?? +? (52b) It is clear that H 0 : ? x = ? y must be rejected at the ?-level iff the amount of overlap between (52a) and (52b) barely becomes zero or less. Without loss of generality, the x- sample will be denoted such that x y? ? 0. Therefore, we deduce from (52a &b) that U'(? y ) ?L' (? x ) = ( /2, / y yy y tSn ?? +? ) ? ( /2, / x x x x tSn ?? ?? ) = /2, / y yy tSn ?? ? /2, / x x x tSn ?? +? ? ( x y? ) (53a) Because H 0 : ? x = ? y must be rejected at the ?-level as soon as the RHS of (53a) becomes 0 or smaller, we impose the critical limit of rejection ||x y? = /2, t ? ? ? 22 //+ x xyy Sn Sn into Eq. (53a), where ? is given in Eq. (39). In short, we are rejecting 102 H 0 :? x = ? y as soon as the two independent CIs in Eq.(52a) and Eq.(52b) become disjoint. This leads to rejecting H 0 :? x = ? y iff /2, / y yy tSn ?? ? /2, / x x x tSn ?? +? ? ( x y? ) ? 0. (53b) At the borderline value, we set x y? = /2, t ? ? ? 22 //+ x xyy Sn Sn and set the LHS of inequality (53b) to 0 in order to solve for ? . ? /2, / y yy tSn ?? ? /2, / x x x tSn ?? +? ? /2, t ? ? ? 22 //+ x xyy Sn Sn = 0. ? /2, ? y y tS ?? /2, +? x x tS ?? n R ? /2, t ? ? ? 22 x ny SR S+ = 0. ? /2, y t ? ? /2, +? x t ?? 
0 n FR ? /2, t ? ? ? 0 1 n FR + = 0. Or: /2, y t ? ? /2, +? x t ?? 0 n FR = /2, t ? ? ? 0 1 n FR + ? /2, y t ? ? + k? /2, x t ? ? = /2, t ? ? ? 2 1+k (54a) where 22 0 / x y FSS= , R n = n y /n x and ? is given in Eq. (39). Eq. (54a) clearly shows that the value of ? depends on the LOS ? of testing H 0 :? x = ? y , F 0 , and the sample sizes n x and n y . For the case of balanced design (n x = n y = n), (54a) reduces to /2, 1n t ? ? = /2, t ? ? ? 0 1F + /(1+ 0 F ) or ,1, 1n F ? ? = ,1, F ? ? ?( 0 1 F+ )/(1+ 0 F ) 2 (54b) where ? = 2 0 2 0 (1)( 1) 1 nF F ?+ + . The limiting value of ? in Eq. (54b), as n? ?, can easily be obtained from /2 Z ? = /2 Z ? ? 0 1F + /(1+ 0 F ). The results will be the same for Eq. (54a). For example, using (54b) at ? = 0.05, n = 10, F 0 = 4.0, ? = 13.2353 resulting in ? = 103 0.1424. Payton et al. (2000) report this value as 0.1262 because the denominator df of ,1, F ? ? in the formula atop their page 550 is inaccurate. For n x & n y > 100, as F 0 ? ?, ? ? ? so that the Overlap approaches an ?-level test. See the illustration in Table 23. 7.3 Comparing the Paired t-CI with Two Independent t-CIs As before, let O represent the amount of overlap length between the two individual CIs. Then, O will be 0 either L(? x ) > U(? y ) or L(? y ) > U(? x ), in which case H 0 : ? x = ? y is rejected at the LOS < ?. Thus, ? is larger than 0 when U(? x ) >U(? y ) > L(? x ) or U(? y )>U(? x ) > L(? y ). In these two cases, both U(? x ) >U(? y ) > L(? x ) and U(? y ) > U(? x ) > L(? y ) will lead to the same result. Therefore, only U(? x ) >U(? y ) > L(? x ) is discussed here so that we are making the assumption that x y? ? 0. ????????? O = U(? y ) ? L(? x ) = ( /2, 1 / ny y tSn ? ? + ) ? ( /2, 1 / nx x tSn ? ? ? ) = ( /2, 1 /2, 1 // nx ny tSntSn ???? + ) ? ( x y? ) (55a) Further, the span of the two individual CIs is U(? x ) ? L(? y ) = ( /2, 1 / nx x tSn ? ? +? ) ? ( /2, 1 / ny y tSn ? ? ?? ) = ( /2, 1 /2, 1 // nx ny tSntSn ???? + ) + ( x y? ) (55b) From equations (55a &b) the % overlap is given by /2, 1 /2, 1 /2,1 /2,1 (/ /)() 100% (/ /)() nx ny nx ny tSntSnxy tSntSnxy ?? ? ?? +?? =? ++ (55c) As ||x y? increases, the P-value of the test decreases (i.e., H 0 : ? x = ? y must rejected more strongly) and ? in Eq. (55c) decreases. Because H 0 : ? x = ? y must be rejected at the 104 Table 23. The? Value for Different Combinations of x n , y n and n R at Either 0 F = 0.90, , x y F ? ? or 0 F = 0.05, , x y F ? ? x n y n n R 0 F = 0.90, , x y F ? ? ? 0 F = 0.05, , x y F ? ? ? 
5 5 1 0.24347 0.13970 4.10725 0.08485 20 20 1 0.54873 0.16301 1.82240 0.13999 60 60 1 0.71470 0.16510 1.39918 0.15333 500 500 1 0.89152 0.16571 1.12168 0.16206 1000 1000 1 0.92208 0.16574 1.08451 0.16320 100000 100000 1 0.99193 0.16762 1.00814 0.16553 5 6 1.2 0.24688 0.15190 3.52020 0.08565 20 24 1.2 0.55765 0.16613 1.75251 0.13711 60 72 1.2 0.72272 0.16635 1.37397 0.15122 500 600 1.2 0.89554 0.16580 1.11574 0.16098 1000 1200 1.2 0.92509 0.16568 1.08054 0.16231 100000 120000 1.2 0.99227 0.16537 1.00779 0.16505 10 15 1.5 0.42534 0.16860 2.12195 0.11484 20 30 1.5 0.56715 0.16794 1.68491 0.13282 80 120 1.5 0.76388 0.16603 1.29555 0.15011 500 750 1.5 0.89977 0.16466 1.10959 0.15858 1000 1500 1.5 0.92825 0.16437 1.07642 0.16011 100000 150000 1.5 0.99262 0.16371 1.00742 0.16329 5 10 2 0.25409 0.17390 2.69268 0.08157 20 40 2 0.57733 0.16722 1.61932 0.12632 80 160 2 0.77239 0.16349 1.27469 0.14452 500 1000 2 0.90426 0.16127 1.10318 0.15392 1000 2000 2 0.93160 0.16082 1.07212 0.15564 100000 200000 2 0.99300 0.15980 1.00704 0.15928 10 50 5 0.45063 0.15367 1.76252 0.08712 50 250 5 0.73697 0.14463 1.30352 0.11609 500 2500 5 0.91313 0.14027 1.09087 0.13122 1000 5000 5 0.93819 0.13961 1.06381 0.13320 100000 500000 5 0.99373 0.13810 1.00629 0.13746 10 100 10 0.45673 0.13019 1.69556 0.07319 50 500 10 0.74397 0.12427 1.28494 0.09796 500 5000 10 0.91637 0.12052 1.08649 0.11195 1000 10000 10 0.94059 0.11992 1.06084 0.11383 100000 1000000 10 0.99400 0.11851 1.00602 0.11790 100000 2000000 20 0.99413 0.10086 1.00588 0.10032 100000 3000000 30 0.99418 0.09215 1.00583 0.09166 100000 5000000 50 0.99422 0.08297 1.00579 0.08254 100000 10000000 100 0.99424 0.07341 1.00576 0.07305 100000 20000000 200 0.99426 0.06654 1.00575 0.06622 100000 50000000 500 0.99427 0.06042 1.00574 0.06015 105 ?- level if ||x y? ? /2, 1 / nd tSn ? ? ? , then from (55c) H 0 must be barely rejected at ??100% or less iff /2,1 /2,1 /2,1 /2,1 /2,1 /2,1 (/ /) / 100% (/ /) / nx ny n d nx ny n d tSntSnt Sn tSntSnt Sn ?? ? ? ?? ? +?? ?? ++ ? 100% xyd xyd SSS SSS ? +? ?? ++ ? 22 22 2 100% 2 xy xy xy xy xy xy SS SS rSS SS SS rSS ? +? +? ?? ++ +? ? 00 0 00 0 112 100% 112 FFrF FFrF ? +? +? ?? ++ +? (56a) Or 00 0 00 0 112 100% 112 r FFrF FFrF ? +? +? =? ++ +? (56b) where F 0 = 22 xy SS/ , 222 ?2 dxy xy SSS ?=+? and ? xy ? = n ii i=1 (x - x)(y y) n -1)/(? ? . Just like the case of known variances, the % overlap r ? in (56b) depends only on the correlation coefficient r and the ratio of the two sample variances, i.e., it does not depend on ? and specific values of S x and S y . It is interesting to note that when r = 0 (i.e., the two samples are independent) and F 0 = 1, then Eq. (56b) reduces to 22 22 ? + ?100% = 17.1573%, which was the % overlap for the case of independent samples and equal variances given in Eq. (3f). When r = 1 and F 0 ? 1, ? r in Eq. (56b) reduces to 1/ 0 F , while if r = 1 and F 0 ? 1, ? r in Eq. (51) reduces to 0 F . On the other hand, when r = ?1, as expected ? r in Eq. (56b) reduces to zero regardless of values of F 0 so that the Overlap becomes an exact ?-level test. 106 Finally, what should the individual confidence levels 1?? be so that the two independent CIs lead to the exact ?-level test on H 0 : ? x = ? y . As before, the expressions for the two 1?? independent CIs are given by /2, 1 / ? ? ?? nx x tSn? ? x ? /2, 1 / ? ? +? nx x tSn (52c) /2, 1 / ? ? ?? ny y tSn ? ? y ? /2, 1 / ? ? +? ny y tSn (52d) It is clear that H 0 : ? x = ? 
y must be rejected at the ?-level iff the amount of overlap between (52c) and (52d) barely becomes zero or less. Without loss of generality, the x- sample will be denoted such that x y? ? 0. Therefore, we deduce from (52c & d) that U'(? y ) ?L' (? x ) = ( /2, 1 / ? ? +? ny y tSn) ? ( /2, 1 / ? ? ?? nx x tSn ) = /2, 1 / ? ? ? ny tSn /2, 1 / ? ? +? nx tSn ? ( x y? ) (53c) Because H 0 : ? x = ? y must be rejected at the ?-level as soon as the RHS of (53d) becomes 0 or smaller, we impose the rejection criterion ||x y? ? /2, 1? ? ? n t / d Sn into Eq. (53c). This leads to rejecting H 0 :? x = ? y iff /2, 1 / ? ? ? ny tSn /2, 1 / ? ? +? nx tSn? /2, 1? ? ? n t / d Sn ? 0. (53d) At the borderline value, we set the LHS of inequality (53d) equal to 0 in order to solve for? . ? /2, 1? ? ? ny tS /2, 1? ? +? nx tS? /2, 1? ? ? n t d S = 0. ? /2, 1 () ? ? ?+ nyx tSS = /2, 1? ? ? n t d S ? /2, 1? ?n t = /2, 1? ? ? n t 22 2+? + x yxy xy SS rSS SS 107 ? /2, 1? ?n t = /2, 1? ? ? n t 00 0 12 1 +? + FrF F (54c) where 22 0 / x y FSS= . Eq. (54c) clearly shows that the value of ? depends on the LOS ? of testing H 0 : :? x = ? y , F 0 , and the sample size n. When r = ?1, Eq. (54c) shows that ? ?? so that the Overlap becomes an exact ?-level test; while if r = 1 the RHS attains its minimum value leading to maximum value for ?. When r = 0 (i.e., uncorrelated X & Y), Eq. (54c) shows that for very large or very small values of F 0 , the Overlap in the limit becomes an ?-level test. 108 8.0 The Impact of Overlap on Type II Error Probability for the Case of Unknown Process Variances 2 x ? , 2 y ? and Small to Moderate Sample Sizes Since the population variances 2 x ? and 2 y ? are unknown, then their point unbiased estimators 2 x S and 2 y S , respectively, must be used for the purpose of statistical inference. As mentioned in Chapter 6 the rv () / x x x x Sn ?? is not normally distributed but its sampling follows that of W. S. Gosset?s t-distribution with ( x n ?1) degrees of freedom. As a result, the acceptance interval of the test statistic 0 () / x x x x Sn ?? at the LOS ? is ( 1 /2, /2, 1 , ? ? ? xx nn tt ?? ), where /2, t ? ? > 0 for all 0 < ? < 0.50, and it also follows that Pr( /2, 1 / x nx x x tSn ? ? ? x ?? ? /2, 1 / x nx x x tSn ? ? + ) = 1 ?? (57a) Hence, the lower (1 ?? )% CI for x ? is L( x ? ) = /2, 1 / x nx x x tSn ? ? ? , the corresponding upper limit is U( x ? ) = /2, 1 / x nx x x tSn ? ? + , and the CIL( x ? ) =2? /2, 1 / x nx x tSn ? ? (57b) Similarly, L( y ? ) = /2, 1 / y ny y y tSn ? ? ? , U( y ? ) = /2, 1 / y ny y y tSn ? ? + and CIL ( y ? ) = 2? /2, 1 / y ny y tSn ? ? . (57c) 109 8.1 The Case of H 0 : == xy ? ?? Not Rejected Leading to the Pooled t-Test Assuming that X ~ N( 2 ,x ? ? ) and Y~N( 2 ,y ? ? ), then X Y? has the N( x y ? ?? , 22 // x y nn??+ ) distribution, where it is assumed that 2 ? is the common value of the unknown 222 xy ? ??==. With the above assumptions, x y? is an unbiased estimator of x y ? ?? with Var( x y? ) = 2 (1 / 1 / ) x y nn? + . In practice a pretest on H 0 : 222 xy ? ??== is required before deciding to use either the pooled t-test or the two-independent-sample t-test. If the assumption xy ? ??== is tenable and because statistical theory dictates that the total resources be allocated according to n x = /( )+ x xy N? ??= N/2 = n y , then the most common application of the pooled t-test occurs under equal sample sizes. Henceforth, the pooled t-test will be used iff the P-value of the pretest 0 : xy H ? 
??= = exceeds 20%, and for very small sample sizes n x & n y < 10, a P-value of at least 40% for the pretest is recommended. Since the common value of the process variances 2 ? is unknown, its unbiased estimators 2 x S and 2 y S should be pooled to obtain one unbiased estimator of 2 ? , which as before is given by their weighted average based on their degrees of freedom, i.e., 2 p S = 22 x xyy xy SS?? ?? + + = 22 (1) (1) 2 x xy y xy nSnS nn ?+? +? (35) Note that E( 2 p S ) = 2 ? . Therefore, the se( x y? ) = 1/ 1/ px y Snn+ and as a result the rv [( ) ( )]/( 1/ 1/ ) x yP x y x ySnn???? ? + has a central ?Student?s? sampling distribution with ? = 2 xy nn+?. Accordingly, the AI (acceptance interval) for a 5%- level test of H 0 : x y ? ?? = 0 is given by 110 0.025, ??t ? 1/ 1/ px y Snn+ ? x y? ? 0.025, ?t ? 1/ 1/ px y Snn+ (58a) where ? = ? x + ? y = 2 xy nn+?. Henceforth, in this section we let 0.025 t represent 0.025, 2+? xy nn t only for notational convenience. Thus the AI in (58a) reduces to 0.025 ??t 1/ 1/ p xy Snn+ ? x y? ? 0.025 ?t 1/ 1/ p xy Snn+ (58b) Under the null hypothesis H 0 : x y ? ?? = 0, the SMD of t 0 = ( x y? )/ 1/ 1/ p xy Snn+ is that of the central t with ? = ? x + ? y = 2 xy nn+ ? . Put differently, the null distribution of ( x y? )/ 1/ 1/ p xy Snn+ is 2 xy nn T + ? . Thus, the AI for in (58b) reduces to AI: 0.025 ?t ? t 0 ? 0.025 t (58c) where t 0 = ( x y? )/ 1/ 1/ p xy Snn+ and 0.025 t = 0.025, 2+ ? xy nn t . However, if H 0 : x y ? ?? = 0 is false (so that a type II error can occur), the SMD of ( x y? )/ 1/ 1/ px y Snn+ is no longer the central 2+? xy nn T . Thus, we next derive the SMD of the test statistic t 0 = ( x y? )/ 1/ 1/ px y Snn+ under the alternative H 1 : x y ??? = ? ? 0. From statistical theory, the SMD of the rv 2 U/ / ? ? ? is that of the central Student?s t with df equal to that of 2 ? ? , where U ~ N(0,1), i.e., U is a unit normal rv, and 2 ? ? is chi-squared distributed rv with ? df and independent of U. However, if E(U) ? 0, then 2 U/ / ? ?? is no longer central t distributed, but the rv 2 (Z ) / / ? +???, where Z ~ N(0,1), has the noncentral t distribution with ? df and noncentrality parameter ? and the distribution is almost universally denoted by t ( ) ? ? ? , i.e., 2 (Z ) / / ? +??? ~ t() ? ? ? . We 111 will now illustrate how the above noncentral t distribution is used to compute type II error Pr when testing equality of two normal means with unknown but equal process variances. (This result has already been known in statistical literature for over 35 years.) By definition ? = Pr (Accepting H 0 : x y ? ?? = 0 if H 0 is false) = Pr( 0.025 ?t ? t 0 ? 0.025 t | ? x ? ? y = ?) = Pr( 0.025 ?t ?t 0 = ( x y? )/ 1/ 1/ p xy Snn+ ? 0.025 t ?? x ? ? y = ?) (59) If H 0 is assumed false so that E( x y? ) = x y ??? ? 0, the SMD of ( x y? )/ 1/ 1/ px y Snn+ is no longer the t ( 0) ? ? ?= , which is the central t with ? = ? x + ? y = 2 xy nn+? degrees of freedom. Thus, we first standardize x y? in Eq.(59) as shown below, assuming ? x = ? y = ?. ? = Pr( 0.025 ?t ? 22 22 [( ) ( ) ( )] / / 1/ 1/ / / /?? ? + ? + ++ x yxy x y px y x y x ynn Snn n n ?? ?? ? ? ??/ ? 0.025 t ?? x ? ? y = ?) = Pr( 0.025 ?t ? 22 22 [( ) / / ] / / x yxy p Z nn S ?? ? ? ? +? + ? 0.025 t ?? x ? ? y = ?) = Pr( 0.025 ?t ? 22 22 ()// / [( 2) / ] / ( 2) []+? + +? +? xy x y xy p xy Znn nn S nn ?? ? ? ? ? 0.025 t ?? x ? ? y = ?) = Pr( 0.025 ?t ? 2 22 2 // / /( 2) [] +? ++ +? nn xy x y xy Z nn nn ?? ? ? ? 0.025 t ) 112 = Pr( 0.025 ?t ? 2 / +Z ? ? ? ? ? 0.025 t ) (60a) where 22 ()// /=? 
+ x yxy nn??? ? ? = 22 // /+ x y nn?? ? and ? = n x +n y ?2. However, as stated above, the SMD of 2 / +Z ? ? ? ? is the noncentral t with ? = n x +n y ?2 and noncentrality parameter ? = 22 // /+ x y nn?? ? = /( )? ?SE x y , i.e., 2 / +Z ? ? ? ? ~ xy nn2 xy t( ) 1/n 1/n +? ? ? ?+ = xy xy nn2 xy nn t( ) nn +? ? ? ?+ . Thus,? = Pr( 0.025 ?t ? 2 / +Z ? ? ? ? ? 0.025 t ) = Pr( 0.025 ?t ? xy xy nn2 xy nn t( ) nn +? ? ? ?+ ? 0.025 t ) (60b) Note that when ? = 0, the argument in (60b) becomes the central t and ? becomes equal to 1??. When the design is balanced, the SMD of the test statistic t 0 under H 1 reduces to 2(n 1) n t( ) 2 ? ? ? ? , i.e., when n x = n y = n ? ? = Pr( 0.025 ?t ? 2 / +Z ? ? ? ? ? 0.025 t ) = Pr( 0.025 ?t ? 2(n 1) n t( ) 2 ? ? ? ? ? 0.025 t ) (60c) As an example, suppose we draw a random sample n x = 7 from a N(? x , ? 2 ) and one of size n y = 11 from another N(? y , ? 2 ) with the objective of testing H 0 : x y ??? = 0 at the nominal significance level of ? = 5% versus the 2-sided alternative H 1 : x y ??? ? 0. 113 We wish to answer the question as to what is the Pr of accepting H 0 if the true mean difference x y ? ?? were not zero but were equal to 0.50?, i.e., we wish to compute the type II error Pr at ? = 0.50? . Then the corresponding value of the noncentrality parameter is equal to? = xy x y (/)nn/(n n)?? + = (0.50 / ) 77 /18?? = 1.03414 and type II error Pr from Eq. (60b) is equal to ? ? = Pr( 0.025,16 ?t ? 16 t (1.03414)? ? 0.025,16 t ) = Pr(?2.119905 ? 16 t (1.03414)? ? 2.119905) = cdf[of 16 t (1.03414)? at 2.119905] ? cdf[of 16 t (1.03414)? at (?2.119905)]. Fortunately, both Minitab and Matlab provide the cdf of the noncentral t distribution. Using Minitab, we obtain cdf[of 16 t (1.03414)? at 2.119905] = 0.838156 and cdf[of 16 t (1.03414)? at ?2.119905] = 0.0016652; thus, ? = 0.838156?0.0016652 = 0.836491 so that the power of the test at ? = 0.50? is equal to 1?? = 1?0.836491 = 0.163509. Clearly as ? = x y ? ?? departs further from zero, the power of the test must increase, which is illustrated next. Suppose now ? = 0.80?; then ? = (0.80 / ) 77 /18?? = 1.65462315 and (at? ? = 0.80?, 7, 11 xy nn==) = Pr(?2.119905 ? 16 t (1.65462315)? ? 2.119905) = cdf[of 16 t (1.65462315)? at 2.119905] ? cdf[of 16 t (1.65462315)? at ?2.119905] = 0.656987 ? 0.0002142 = 0.656733, and hence the power of the test increases from 0.163509 to 1?0.656773 = 0.343227. It is interesting to note that if the design is balanced, then the power of the test always increases for the same parameter values. For example, if n x = n y = 9 so that ? = 16 stays in tact, then at ? = 0.80?, the noncentrality parameter ? = (0.80 / ) 81/18?? = 114 1.6970563 and ? (at ? = 0.80?, 9, 9 xy nn==) = Pr(?2.119905 ? 16 t (1.6970563)? ? 2.119905) = cdf[of 16 t (1.6970563)? at 2.119905] ? cdf[of 16 t (1.6970563)? at ?2.119905] = 0.642235 ? 0.0001839 = 0.6420511, so that Power(at ? = 0.80?) = 0.357949, which exceeds the value of 0.343227 for the unbalanced case. The syntax for Matlab noncentral t cdf is nctcdf(t, ?, ?). As in the case of known variances, the type II error Pr from the Overlap is computed similar to Eqs. (7) shown below. ?? = Pr(Overlap?? > 0) = Pr{[ ( ) ( ) x y LU? ?? ]?[( ) ( ) yx LU? ?? ]| ?= xy ? ??} = Pr{[ x ? /2, x t ? ? / x x Sn ? y+ /2, y t ? ? / yy Sn]? [ y ? /2, y t ? ? / yy Sn? x+ /2, x t ? ? / x x Sn]|? } = Pr{[ x ?y ? /2, x t ? ? / x x Sn+ /2, y t ? ? / yy Sn]? [? /2, y t ? ? / yy Sn? /2, x t ? ? / x x Sn? x ?y]|? } = Pr{[? /2, y t ? ? / yy Sn? /2, x t ? ? / x x Sn? x ?y ? /2, x t ? ? 
/ x x Sn+ /2, y t ? ? / yy Sn]|? } = Pr{[?A? x ?y ? +A]|? } (61) where A = /2, x t ? ? / x x Sn + /2, y t ? ? / yy Sn. In order to apply the noncentral t-distribution to compute the Pr in Eq. (61), not available in statistical literature, we must first inside brackets divide throughout by 1/ 1/ px y Snn+ and then standardize x ?y as illustrated below. ??= Pr{[?A/ 1/ 1/ p xy Snn+ ?( x ?y)/ 1/ 1/ p xy Snn+ ? A/ 1/ 1/ p xy Snn+ ]| ?} 115 = Pr[?A p ? xy xy nn2 xy nn t( ) nn +? ? ? ?+ ? A p ] (62a) where A p = /2, /2, // 1/ 1/ xy x xyy px y tSntSn Snn ?? ?? + + . Note that if ? = 0, Eq. (62a) reduces to 1??? as was shown in Eq. (37a). In the case of balanced design, Eq. (62a) reduces to ??= Pr[? /2,n 1 x y 22 xy t(SS) SS ?? + + ? 2(n 1) n t( ) 2 ? ? ? ? ? /2,n 1 x y 22 xy t(SS) SS ?? + + ] = Pr[? /2,n 1 0 0 t(F1) F1 ?? + + ? 2(n 1) n t( ) 2 ? ? ? ? ? /2,n 1 0 0 t(F1) F1 ?? + + ] (62b) As an example, suppose samples of sizes n x = n y = 9 are drawn from two independent normal universes with unknown but equal variances. We wish to compute the Pr of accepting H 0 : x y ??? = 0 at ? = 0.05 if x y ??? = 0.80? and the sample statistics are S x = 0.65 and S y = 0.54. Note that it is sufficient to provide the ratio F 0 = 22 xy S / S instead of the specific values of S x and S y . Because 0.025,8/2,n 1 tt ?? = = 2.306004, ? = (0.80 / ) 81/18?? = 1.6970563, Eq. (62b) yields ??(at ? = x y ??? = 0.80?, 9, 9 xy nn==) = Pr[?3.247338? 16 t (1.6970563)? ? 3.247338] = 0.904239?0.0000078 = 0.904231. The above value of ?? is much larger than ? = 0.6420511 using the Standard method. It can easily be verified that the random function xy 22 xy SS SS + + lies within the interval 1 < xy 22 xy SS SS + + = 0 0 F1 F1 + + ? 2 . However, we are using the pooled t-test only if 116 0.90,n 1,n 1 F ?? ? 22 0 / x y FSS= ? 0.10,n 1,n 1 F ? ? and hence 0.90,n 1,n 1 0.90,n 1,n 1 F1 F1 ?? ?? + + = 0.10,n 1,n 1 0.10,n 1,n 1 F1 F1 ?? ?? + + ? xy 22 xy SS SS + + = 0 0 F1 F1 + + ? 2 . Note that the equality on the most LHS of this last equation follows from the fact that F 0.10,n?1,n?1 = 1/ F 0.90,n?1,n?1 for all n. Therefore, for a balanced design the GLB of ?? for a 5%-level test is given by GLB(??) = Pr[? 0.025,n 1 t ? 0.10 0.10 F1 F1 + + ? 2(n 1) n t( ) 2 ? ? ? ? ? 0.025,n 1 t ? 0.10 0.10 F1 F1 + + ] (63a) where F 0.10 = F 0.10,n?1,n?1 , and the LUB is given by LUB(??) = Pr[? 0.025,n 1 t2 ? ? 2(n 1) n t( ) 2 ? ? ? ? ? 0.025,n 1 t2 ? ] (63b) i.e., Pr[? 0.025,n 1 t ? 0.10 0.10 F1 F1 + + ? 2(n 1) n t( ) 2 ? ? ? ? ? 0.10 0.10 F1 F1 + + 0.025,n 1 t ? ] ? ?? ? Pr[? 0.025,n 1 t2 ? ? 2(n 1) n t( ) 2 ? ? ? ? ? 0.025,n 1 t2 ? ] (63c) Thus, for the example with n x = n y = 9, the GLB that the overlap type II error Pr can attain is given by Eq. (63a) and is computed below. GLB (?? at ? = x y ??? = 0.80?) = Pr[?2.306004 2.589349 1 2.589349 1 + + ? 16 t (1.6970563)? ? 3.175781] = Pr[?3.175781? 16 t (1.6970563)? ? 3.175781] = 0.89451811? 0.00000954 = 0.89450857. 117 Thus, the smallest % relative error for the power of the test from Overlap is [(0.357949? 0.105492)/ 0.357949]?100% = 70.53%. Furthermore, the LUB that the overlap type II error Pr can become is given by Eq.(63b) and is calculated as following: LUB (??at ? = x y ? ?? = 0.80?) = Pr[ 2.306004? 2?? 16 t (1.6970563)? ?3.26118232 ] = Pr [ 3.26118232?? 16 t (1.6970563)? ?3.26118232 ] = 0.90602841?0.00000756 = 0.90602085 Therefore, the worst % relative error for the power of the test from Overlap is [(0.357949? 0.093979)/ 0.357949]?100 %= 73.75%. 8.2 The Case of H 0 : xy ? 
?= Rejected Leading to the Two-Independent Sample t- Test (or the t-Prime Test) Assuming that X~N( 2 ,x x ? ? ) and Y~N( 2 ,y y ? ? ), then X Y? is N( x y ??? , 22 // x xyy nn??+ ), but now the null hypothesis of H 0 : x y ? ?= is rejected at the 20% level leading to the assumption that the F-statistic F 0 = 22 xy SS/ > 2 for all sample sizes 16 ? n x & n y . It has been shown in statistical theory that if the assumption x y ? ?= is not tenable, the statistic 22 [( ) ( )] / ( / ) ( / ) x yxxyy x ySnSn???? ? + has the approximate central Student?s t-distribution with degrees of freedom 118 ? = 2 22 [() ( )] (()) (( )) xy xy xy ?? + + vv vv = 2 22 [() ( )] ( ( )) ( ( )) xy yx xy x y ?? + + vv = 2 0 2 0 (1) () + + xy n y nx FR FR ?? ? ? (39) where 2 () /= x x x Snv and F 0 = 22 / x y SS. The formula for degrees of freedom in (39) rarely leads to an integer and is generally rounded down to make the test of 0 : x y H ? ?? = 0 conservative, i.e., the rounding down ? increases the P-value of the test. However, programs like Matlab and Minitab will provide probabilities of the t-distribution for non- integer values of ? in Eq. (39). It has been verified by the authors that ? in Eq. (39) attains its maximum when the larger sample also has much larger variance than the sample whose size is much smaller. Even then, it is for certain that Min( ? x , ? y ) < ? < ? x + ? y , and hence the two-sample t-test is less powerful than the pooled t-test. When 0 : x y H ? ?= is rejected at the 20% level (i.e., P-value < 0.20), the type II error Pr of a 5%-level test is given by ? = Pr (Accepting H 0 : x y ??? = 0 if H 0 is false) ? ?  Pr( 0.025, ?t ? ? t 0 ? 0.025, t ? | ? x ? ? y = ?) (64) where t 0 = 22 ()/(/)(/) x xyy x ySnSn?+ is approximately central t distributed when H 0 is true with df, ?, given in Eq. (39). Henceforth in this section we let t 0.025 represent 0.025, t ? only for notational convenience. When H 0 is false, the authors have also verified that the exact SMD of the statistic t 0 = 22 ()/(/)(/) x xyy x ySnSn?+ under the alternative H 1 : ? x ? ? y = ? ? 0, unlike the case of x y ? ?= , is intractable using central ? 2 . As far as we know, the exact power of the t-Prime (or the two-sample independent t-test) test has not yet been obtained in statistical literature. That is, the SMD of t 0 is not the noncentral t 119 with some noncentrality parameter ?. The development that follows, the results already existing in statistical literature, is only an approximation because there does not exist an exact solution for type II error Pr of testing H 0 : x y ? ?? = 0 when the variances are unknown and unequal. We first approximately studentize the expression for ? in Eq.(64). ?  Pr( 0.025, ?t ? ? t 0 ? 0.025, t ? | ? x ? ? y = ?) Pr( 0.025, ?t ? ? 22 ()/(/)(/) x xyy x ySnSn?+? 0.025, t ? | ? x ? ? y = ?)  Pr{ 0.025 ?t 22 (/)(/)/ x xyy Sn Sn??+ ? 22 [( ) ] ( / ) ( / )/ x xyy x ySnSn??? + ? 0.025 t 22 (/)(/)/ x xyy Sn Sn??+}  Pr{ 0.025 ?t 22 (/)(/)/ x xyy Sn Sn??+? t ? ? 0.025 t 22 (/)(/)/ x xyy Sn Sn??+}  Pr{ 0.025 ?t ?? ? t ? ? 0.025 t ??} (65) where the studentized mean difference?= 22 (/)(/)()/ x yxxyy Sn Sn?? +? = 22 (/)(/)/ x xyy Sn Sn? + . Unfortunately, the approximate expression for ? in Eq.(65) still depends on the sample se( ?x y ) = 22 (/)(/) x xyy Sn Sn+ , and therefore, the approximation in Eq.(65) can be carried out iff ? is specified in units of the se( ?x y ), or in units of ? x ? ? 
For example, suppose samples of sizes n_x = n_y = 9 are drawn from two independent normal populations with unknown but unequal variances. We wish to compute the Pr of accepting H0: μ_x − μ_y = 0 at α = 0.05 if μ_x − μ_y = δ = 0.40 and the sample statistics are S_x = 0.65 and S_y = 0.54. Eq. (39) gives ν = ν_x·ν_y·[v(x̄) + v(ȳ)]²/[ν_y·(v(x̄))² + ν_x·(v(ȳ))²] = 15.48, t_{0.025} = t_{0.025,15.48} = 2.1257, and Δ = δ/√(S_x²/n_x + S_y²/n_y) = 0.40/0.281681 = 1.420044, so that −t_{0.025} − Δ = −3.545744 and t_{0.025} − Δ = 2.1257 − 1.420044 = 0.7057, and

β(at δ = 0.40) ≅ Pr(−3.5457 ≤ t_{15.48} ≤ 0.7056) = 0.75454016 − 0.00140620 = 0.75313396.

If μ_x − μ_y = 0.60, similar calculations show that β(at μ_x − μ_y = 0.6) ≅ Pr(−4.25577 ≤ t_{15.48} ≤ −0.00437) = 0.4982845 − 0.0003233 = 0.4979612. Note that the above approximate type II error Prs are in exact agreement with what UCLA's Statistics Department Power Calculator lists on its website (www.stat.ucla.edu). If n_x = 7, S_x = 0.65, n_y = 11 and S_y = 0.54, the type II error Pr increases a bit from 0.7532 to β(at μ_x − μ_y = 0.4) = 0.79083377 − 0.0022143 = 0.7886.

Again, the type II error Pr from the Overlap is computed similarly to Eq. (7), just as in the case of the pooled t-test, as shown below:

β̄ = Pr(Overlap | δ > 0) = Pr{[L(x̄) ≤ U(ȳ)] ∩ [L(ȳ) ≤ U(x̄)] | μ_x − μ_y = δ}

Note that the event [L(x̄) ≤ U(ȳ)] ∩ [L(ȳ) ≤ U(x̄)] is equivalent to either L(x̄) ≤ U(ȳ) ≤ U(x̄) or L(ȳ) ≤ U(x̄) ≤ U(ȳ). Thus,

β̄ = Pr{[x̄ − t_{α/2,ν_x}·S_x/√n_x ≤ ȳ + t_{α/2,ν_y}·S_y/√n_y] ∩ [ȳ − t_{α/2,ν_y}·S_y/√n_y ≤ x̄ + t_{α/2,ν_x}·S_x/√n_x] | δ}
  = Pr{[x̄ − ȳ ≤ t_{α/2,ν_x}·S_x/√n_x + t_{α/2,ν_y}·S_y/√n_y] ∩ [−t_{α/2,ν_y}·S_y/√n_y − t_{α/2,ν_x}·S_x/√n_x ≤ x̄ − ȳ] | δ}
  = Pr{[−t_{α/2,ν_y}·S_y/√n_y − t_{α/2,ν_x}·S_x/√n_x ≤ x̄ − ȳ ≤ t_{α/2,ν_x}·S_x/√n_x + t_{α/2,ν_y}·S_y/√n_y] | δ}
  = Pr{[−A ≤ x̄ − ȳ ≤ +A] | δ}      (66)

where A = t_{α/2,ν_x}·S_x/√n_x + t_{α/2,ν_y}·S_y/√n_y. Studentizing inside the brackets of Eq. (66) results in

β̄ = Pr{[−A − (μ_x − μ_y) ≤ (x̄ − ȳ) − (μ_x − μ_y) ≤ +A − (μ_x − μ_y)] | δ}
  = Pr{[−A − (μ_x − μ_y)]/se(x̄ − ȳ) ≤ [(x̄ − ȳ) − (μ_x − μ_y)]/se(x̄ − ȳ) ≤ (A − δ)/se(x̄ − ȳ)}

where se(x̄ − ȳ) = √(S_x²/n_x + S_y²/n_y). Thus,

β̄ ≅ Pr{(−A − δ)/se(x̄ − ȳ) ≤ t_ν ≤ (A − δ)/se(x̄ − ȳ)}      (67)

For the example, if n_x = n_y = 9, S_x = 0.65 and S_y = 0.54, then A = 0.914715 and ν = 15.48 as before, and Eq. (67) now gives β̄(δ = 0.40) ≅ Pr[−4.66738 ≤ t_{15.48} ≤ 1.827295] = 0.95650 − 0.00014 = 0.9564, as compared to the Standard value β(at δ = 0.40) = 0.7532, for a % relative error in power, [(β̄ − β)/(1 − β)]×100%, equal to 82.33%.
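The Overlap probability of Eq. (67) for the same sample statistics can be obtained with a short continuation of the preceding sketch (again illustrative code, not one of the Appendix B functions).

% Sketch of Eq. (67) for nx = ny = 9, Sx = 0.65, Sy = 0.54, delta = 0.40.
a = 0.05; nx = 9; ny = 9; Sx = 0.65; Sy = 0.54; delta = 0.40;
vx = Sx^2/nx; vy = Sy^2/ny;
nu = (vx + vy)^2/(vx^2/(nx-1) + vy^2/(ny-1));                        % Eq. (39)
A  = tinv(1-a/2,nx-1)*Sx/sqrt(nx) + tinv(1-a/2,ny-1)*Sy/sqrt(ny);    % sum of CI half-widths
se = sqrt(vx + vy);
betaBarOverlap = tcdf((A - delta)/se, nu) - tcdf((-A - delta)/se, nu)   % 0.9564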
8.3 The Impact of Overlap on Type II Error Probability for the Paired t-Test (i.e., the Randomized Block Design) when Process Variances are Unknown

Consider the 5%-level test of H0: μ_x − μ_y = μ_d = 0 versus the two-sided alternative H1: μ_d ≠ 0, where the paired response (x, y) comes from a bivariate normal universe so that X and Y are correlated random variables with unknown correlation coefficient ρ. The appropriate test statistic for testing H0: μ_d = 0 is t_0 = d̄√n/S_d, where S_d = √(S_x² + S_y² − 2σ̂_xy) and σ̂_xy = Σ_{i=1}^{n}(x_i − x̄)(y_i − ȳ)/(n − 1). The decision rule is to reject H0 at the 0.05 level iff |d̄√n/S_d| > t_{0.025,n−1}. Thus, for a 5%-level test, by definition, β = Pr(accepting H0: μ_x − μ_y = 0 when H0 is false) = Pr(−t_{0.025,n−1} ≤ t_0 ≤ t_{0.025,n−1} | μ_d = δ). Fortunately, just as in the case of the pooled t-test, the exact SMD of t_0 under the alternative H1: μ_d ≠ 0 has been known for well over 35 years; that is, an exact expression for the OC curve of the paired t-test already exists in statistical literature, as illustrated below.

β = Pr(−t_{0.025,n−1} ≤ d̄√n/S_d ≤ t_{0.025,n−1} | μ_d = δ)      (68)

where, for notational convenience, we let t_{0.025} = t_{0.025,n−1} in this section. Standardizing d̄√n/S_d in Eq. (68) under the alternative H1: μ_d ≠ 0 leads to

β = Pr(−t_{0.025} ≤ d̄√n/S_d ≤ t_{0.025} | μ_d = δ)
  = Pr{−t_{0.025} ≤ [(d̄ − μ_d) + μ_d]√n/σ_d ÷ (S_d/σ_d) ≤ t_{0.025} | μ_d = δ}
  = Pr{−t_{0.025} ≤ (Z + δ√n/σ_d)/√(S_d²/σ_d²) ≤ t_{0.025} | μ_d = δ}
  = Pr{−t_{0.025} ≤ (Z + δ√n/σ_d)/√[(n − 1)S_d²/σ_d²/(n − 1)] ≤ t_{0.025} | μ_d = δ}
  = Pr{−t_{0.025} ≤ (Z + δ√n/σ_d)/√(χ²_{n−1}/(n − 1)) ≤ t_{0.025} | μ_d = δ}
  = Pr{−t_{0.025} ≤ t′_{n−1}(δ√n/σ_d) ≤ t_{0.025}}      (69)

Eq. (69) shows that the exact SMD of t_0 = d̄√n/S_d under the alternative H1: μ_d ≠ 0 is the noncentral t with noncentrality parameter δ√n/σ_d and ν = n − 1 df, while the null SMD of t_0 is the central t_{n−1} = t′_{n−1}(0).

For example, suppose we wish to compute the type II error Pr when testing H0: μ_x − μ_y = μ_d = 0 at the 5% level with a random sample of n = 10 blocks from a bivariate normal distribution, versus the alternative H1: μ_d = 0.50σ_d. From Eq. (69), β(at μ_d = 0.50σ_d) = Pr(−t_{0.025,9} ≤ t′_9(δ√n/σ_d) ≤ t_{0.025,9}), where δ√n/σ_d = 0.50σ_d·√10/σ_d = 1.581139. Consequently, using Matlab we obtain β(at μ_d = 0.50σ_d, n = 10) = Pr(−2.262157 ≤ t′_9(1.581139) ≤ 2.262157) = nctcdf(2.262157, 9, 1.581139) − nctcdf(−2.262157, 9, 1.581139) = 0.7071714 − 0.00034704 = 0.70682435, so that the power of the test is PWF(at 0.50σ_d, n = 10) = 0.29317565.

It is common knowledge in the field of statistics that the power of a test should increase with increasing sample size; a statistical test for which the limit of its PWF does not approach 1 as n → ∞ is said to be inconsistent. It is also estimated that, in order to double the power of a test, roughly more than twice the sample size is needed. To illustrate, consider this last example, where PWF(at 0.50σ_d) was equal to 0.29317565 with n = 10, but now set n = 20. Then β(at μ_d = 0.50σ_d, n = 20) = Pr(−t_{0.025,19} ≤ t′_19(δ√n/σ_d) ≤ t_{0.025,19}), where δ√n/σ_d = 0.50√20 = 2.236068 and t_{0.025,19} = 2.093024, so that β(at μ_d = 0.50σ_d, n = 20) = nctcdf(2.093024, 19, 2.236068) − nctcdf(−2.093024, 19, 2.236068) = 0.43551707475811 − 2.152176224301527×10^−5 = 0.43549555299587, and hence PWF(at 0.50σ_d, n = 20) = 0.564504447 < 2×PWF(at 0.50σ_d, n = 10). On the other hand, in order to attain the same value of PWF at μ_d = (1/2)×0.50σ_d, roughly four times the sample size is needed.
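The two paired-t computations above amount to a pair of nctcdf calls; the short Matlab sketch below (with illustrative variable names) codes Eq. (69) directly and reproduces them.

% Exact paired-t type II error of Eq. (69) at mu_d = 0.50*sigma_d.
a = 0.05; d = 0.50;                   % d = mu_d/sigma_d under H1
for n = [10 20]
    t025 = tinv(1 - a/2, n-1);
    nc   = d*sqrt(n);                 % noncentrality delta*sqrt(n)/sigma_d
    beta = nctcdf(t025, n-1, nc) - nctcdf(-t025, n-1, nc);
    fprintf('n = %2d   beta = %.8f   power = %.8f\n', n, beta, 1 - beta);
end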
As in Sections 8.1 and 8.2, the type II error Pr using the Overlap is given by

β̄ = Pr{[−t_{α/2,ν_y}·S_y/√n_y − t_{α/2,ν_x}·S_x/√n_x ≤ x̄ − ȳ ≤ t_{α/2,ν_x}·S_x/√n_x + t_{α/2,ν_y}·S_y/√n_y] | δ ≠ 0}
  = Pr{[−A ≤ d̄ ≤ +A] | δ ≠ 0}      (70a)

where, as before, A = t_{α/2,ν_x}·S_x/√n_x + t_{α/2,ν_y}·S_y/√n_y and d̄ = x̄ − ȳ. However, because this is a block design, perforce n_x = n_y = n, and as a result A = t_{α/2,n−1}(S_x + S_y)/√n. Following the exact same development that leads to Eq. (69), we obtain

β̄ = Pr[−t_{α/2,n−1}(S_x + S_y)/√n ≤ d̄ ≤ t_{α/2,n−1}(S_x + S_y)/√n]
  = Pr[−t_{α/2,n−1}(S_x + S_y)/S_d ≤ d̄√n/S_d ≤ t_{α/2,n−1}(S_x + S_y)/S_d]
  = Pr{−t_{α/2,n−1}(S_x + S_y)/S_d ≤ [(d̄ − μ_d) + μ_d]√n/σ_d ÷ √[(n − 1)S_d²/σ_d²/(n − 1)] ≤ t_{α/2,n−1}(S_x + S_y)/S_d}
  = Pr[−t_{α/2,n−1}(S_x + S_y)/S_d ≤ t′_{n−1}(δ√n/σ_d) ≤ t_{α/2,n−1}(S_x + S_y)/S_d]      (70b)

where δ√n/σ_d is the noncentrality parameter. Because S_d = √(S_x² + S_y² − 2rS_xS_y), it follows that S_d ≤ S_x + S_y, with equality occurring iff the sample correlation coefficient r = −1. On comparing the expression for β in Eq. (69) with that of β̄ in (70b), it is clear that β ≤ β̄ because t_{α/2,n−1}(S_x + S_y)/S_d ≥ t_{α/2,n−1}. Further, dividing the numerator and denominator of (S_x + S_y)/S_d by S_y, we obtain (S_x + S_y)/S_d = (√F_0 + 1)/√(F_0 + 1 − 2r√F_0). Substituting this last expression into (70b) results in

β̄ = Pr[−t_{α/2,n−1}·(√F_0 + 1)/√(F_0 + 1 − 2r√F_0) ≤ t′_{n−1}(δ√n/σ_d) ≤ t_{α/2,n−1}·(√F_0 + 1)/√(F_0 + 1 − 2r√F_0)]      (70c)

The final expression for the Overlap type II error Pr in Eq. (70c) clearly shows that the value of β̄ depends only on the sample size n, the noncentrality parameter δ√n/σ_d, the sample correlation coefficient r, and the sample variance ratio F_0 = S_x²/S_y², but not on the specific values of S_x² and S_y². Only when r = −1 does the value of β̄ equal β; otherwise β̄ > β. Further, the limit of β̄ as F_0 → 0 or as F_0 → ∞ is also equal to β. It can easily be shown using calculus that the function (√F_0 + 1)/√(F_0 + 1 − 2r√F_0) in the argument of β̄ in Eq. (70c) attains its maximum at F_0 = 1, with value √(2/(1 − r)). Thus, for a 5%-level test, the least upper bound for β̄ is given by

LUB(β̄) = Pr[−t_{0.025,n−1}·√(2/(1 − r)) ≤ t′_{n−1}(δ√n/σ_d) ≤ t_{0.025,n−1}·√(2/(1 − r))]      (71)

As r → −1, LUB(β̄) → β, but as r → +1, LUB(β̄) → 1. Thus the impact of negative correlation is to reduce the Overlap type II error Pr, while the impact of positive correlation is to increase β̄. As an example, for a random bivariate sample of n = 10 pairs, a 5%-level test, and r = −0.50, Eq. (71) at μ_d = 0.50σ_d yields

LUB(β̄) = Pr[−t_{0.025,9}·√(2/1.5) ≤ t′_9(1.581139) ≤ 2.262157·√(2/1.5)] = Pr[−2.612114 ≤ t′_9(1.581139) ≤ 2.612114] = 0.793446 − 0.0001628 = 0.79328344,

as compared to β = 0.70682435 from the Standard method. However, if r were equal to +0.50, then LUB(β̄) = Pr[−t_{0.025,9}·√(2/0.5) ≤ t′_9(1.581139) ≤ 2.262157×2] = Pr[−4.524314 ≤ t′_9(1.581139) ≤ 4.524314] = 0.976860.
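Eq. (70c), and its LUB in Eq. (71) at F_0 = 1, can be evaluated with the following Matlab sketch (illustrative code in the spirit of the Appendix B functions; the function handle g is our own naming):

% Overlap type II error of Eq. (70c) and its LUB, Eq. (71), for the paired design.
a = 0.05; n = 10; d = 0.50; r = -0.50;        % d = mu_d/sigma_d, r = sample correlation
nc = d*sqrt(n); t025 = tinv(1 - a/2, n-1);
g = @(F0) (sqrt(F0) + 1)./sqrt(F0 + 1 - 2*r*sqrt(F0));   % multiplier in Eq. (70c)
betaBar = @(F0) nctcdf(t025*g(F0), n-1, nc) - nctcdf(-t025*g(F0), n-1, nc);
betaBar(1)                                    % F0 = 1 gives the LUB of Eq. (71): 0.79328
betaBar(2)                                    % any other variance ratio gives a smaller value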
9.0 Conclusions and Future Research

Chapter 3 used normal theory with known variances to prove results that already existed in the Overlap literature, some of which had been obtained through simulation. It was proved that for a nominal significance level α = 0.05, the corresponding 95% overlapping CIs provide a much smaller LOS ᾱ = 0.0055746, which fully agrees with the value computed from Eq. (7) on p. 184 of Schenker and Gentleman (2001), who provide their results without any proof. Further, Chapter 3 proved that for a LOS of 0.01, the corresponding Overlap LOS is ᾱ = 0.0002697, while the literature provides results only for the nominal LOS of 5%; moreover, the smaller the LOS of the Standard method becomes, the larger the % relative error of the Overlap LOS. Although the Overlap literature has never considered the one-sided alternative, Chapter 3 showed that the Overlap LOS is one-half of that for the corresponding two-sided alternative (i.e., the Overlap procedure becomes even more conservative for a one-sided alternative). Second, a concept that has not been discussed in the Overlap literature is the maximum % overlap that the two independent CIs can have while H0: μ_x = μ_y can still be rejected at a pre-assigned LOS α. It was proven that this maximum % overlap depends only on the SE ratio [k = (σ_y/√n_y)/(σ_x/√n_x) or k = (σ_x/√n_x)/(σ_y/√n_y)], is equal to 17.1573% at k = 1, and diminishes to zero as k → ∞ or zero. At k = 10, it was shown that the maximum % overlap reduces to 4.5137%, so that the Overlap procedure converges to an exact α-level test for limiting values of k. Third, the chapter showed that the two independent CIs must each have a confidence level of 1 − γ = 1 − 2Φ(−Z_{α/2}/√2) in order to provide an exact α-level test. This last formula gives a confidence level of 0.931452 for both independent intervals at α = 0.01, and 1 − γ = 0.83422373 at α = 0.05. The latter value is in perfect agreement with the Overlap literature, while the former value of 1 − γ = 0.931452 has not been reported. Finally, the Overlap procedure leads to less statistical power than the Standard method; its RELEFF for small sample sizes is poor and depends heavily on δ/σ, but its asymptotic RELEFF is 100% as n → ∞. For the simplest case of σ_x = σ_y and n_x = n_y, an exact formula, Eq. (15e), was obtained for the RELEFF of Overlap relative to the Standard method.

Chapter 4 investigated the Bonferroni Overlap CIs against the Standard procedure and determined that the Bonferroni concept makes the Overlap even more conservative and loses even more statistical power.

Chapter 5 examined the overlapping CIs for two process variances against the Standard method that uses Fisher's F distribution; the Overlap literature has not investigated the Overlap procedure for variance ratios. As in the case of process means, the Overlap reduces the LOS of the test, and the limiting value of ᾱ at α = 0.05 and k = 1 is roughly 0.0055746, while as k → ∞ or zero the Overlap approaches an exact α-level test. Second, the limiting value of the maximum % overlap that does not reject H0: σ_x = σ_y is exactly 17.15726%, as was the case for two process means. Third, the individual confidence levels have to be set at the γ obtained from Eq. (31b), χ²_{γ/2,ν_x}/χ²_{1−γ/2,ν_y} = (ν_x/ν_y)·F_{α/2,ν_x,ν_y}, where the limiting value of γ is 0.165766 at k = 1, just as in the case of means; further, as k → ∞ or zero, γ → 0. Last, the power of the Overlap procedure is always less than that of the Standard method but approaches it as k → ∞ or zero; the asymptotic RELEFF of Overlap to the Standard method is 100% as n_x & n_y → ∞.

Chapter 6 examined the impact of Overlap on type I error Pr, in the normal case with unknown variances and sample sizes ≤ 50, using the pooled-t and two-independent-sample t statistics, and also the effect of positive and negative correlations on the Overlap procedure. Specific formulas for ᾱ of the pooled t-test, Eq. (37c), the two-independent-sample t-test, Eq. (41b), and the paired t-test, Eq. (44a), were derived and documented. The Overlap literature has not considered the pooled t-test.

Chapter 7 used the pooled t-statistic to derive an expression for the % overlap, ω_r, below which H0: μ_x = μ_y can still be rejected at the α level. Unlike the simple case of known variances, where ω_r depends only on the SE ratio k, when the process variances are unknown and the sample sizes are not large, ω_r depends on n_x, n_y, F_0 = S_x²/S_y², and α. For the case of the two-independent-sample t-statistic, ω_r depends on n_x, n_y, k, and α, while for the paired t-test it depends only on the correlation coefficient between X and Y and F_0 = S_x²/S_y².
For all three cases, Chapter 7 also derived expressions for the individual confidence levels, 1 − γ, that provide an α-level test by the Overlap method. In the case of the pooled t-test, γ depends only on n_x, n_y, and F_0; for the two-independent-sample t-test, γ depends on n_x, n_y, and F_0; while for the paired t-test, it depends only on n, r, and F_0.

Chapter 8 used the noncentral t-distribution to derive formulas for the OC curves (and also the power functions) for the case of an underlying normal distribution with unknown variances and moderate to small sample sizes n ≤ 50; these results have been available in statistical literature for more than 35 years. However, the chapter also derived formulas for the type II error Pr of Overlap (β̄) using the noncentral t, and the exact results obtained for this latter case have not been available in statistical literature.

As further research, one could consider the Overlap problem for other normal parameters, such as the coefficient of variation σ/μ [see Vangel (1996) and Payton (1996)] and the quantiles μ + Z_p·σ, 0 < p < 1. Further, we suspect that the SMD of S (the sample standard deviation) from a non-normal population approaches normality, toward N[σ, σ²/(2n)], but agonizingly slowly (n > 100). The exact SMD of S from a N(μ, σ²) was documented in statistical literature more than 50 years ago. For an underlying normal population, it is also widely known that an n > 75 is needed in order for S to be roughly normally distributed according to N[c_4·σ, (1 − c_4²)σ²], where c_4 = √(2/(n − 1))·Γ(n/2)/Γ[(n − 1)/2] < 1 is a well-known QC constant. Note that the approximate V(S) generally reported in statistical literature is σ²/(2n), but we know for a fact that σ²/[2(n − 0.745)] is a better approximation to the exact variance of S from a N(μ, σ²), which is given by V(S) = (1 − c_4²)σ². Unfortunately, the farther the skewness and kurtosis of an underlying non-normal distribution are from zero, the larger the sample size needed for the SMD of S to exhibit normality. Thus, if the underlying distribution is non-normal, only the limiting comparison of the Standard CI to Overlap may be accomplished based on CIs of μ_x and μ_y. Also, we have not yet seen the impact of overlapping CIs on parameters of other underlying distributions, such as the Uniform, Weibull, and Beta.

10.0 References

[1] Brownlee, K. A. (1965), Statistical Theory and Methodology in Science and Engineering, Wiley, NY.
[2] Cole, S. R. and Blair, R. C. (1999), Overlapping Confidence Intervals, Journal of the American Academy of Dermatology, 41(6), pp. 1051-1052.
[3] Devore, J. L. (2008), Probability and Statistics, Thomson Brooks/Cole, Canada.
[4] Djordjevic, M. V., Stellman, S. D. and Zang, E. (2000), Doses of Nicotine and Lung Carcinogens Delivered to Cigarette Smokers, Journal of the National Cancer Institute, 92(2), pp. 106-111.
[5] Goldstein, H. and Healy, M. J. R. (1995), The Graphical Presentation of a Collection of Means, Journal of the Royal Statistical Society A, 158, pp. 175-177.
[6] Hool, J. N. and Maghsoodloo, S. (1980), Normal Approximation to Linear Combinations of Independently Distributed Random Variables, AIIE Transactions, 12, pp. 140-144.
[7] Johnson, N. L., Kotz, S. and Balakrishnan, N. (1995), Continuous Univariate Distributions, 2nd edition, John Wiley & Sons, Inc.
[8] Kelton, W. D., Sadowski, R. P. and Sturrock, D. T. (2004), Simulation with Arena, pp. 265-268, McGraw-Hill Companies, Inc., NY.
[9] Kendall, M. G. and Stuart, A. (1963), The Advanced Theory of Statistics, Charles Griffin & Company Limited, London.
[10] Maghsoodloo, S. and Hool, J. N. (1981), On Normal Approximation of Simple Linear Combinations, The Journal of the Alabama Academy of Sciences, 52(4), pp. 207-219.
[11] Mancuso, C. A., Peterson, M. G. E. and Charlson, M. E. (2001), Comparing Discriminative Validity Between a Disease-Specific and a General Health Scale in Patients with Moderate Asthma, Journal of Clinical Epidemiology, 54, pp. 263-274.
[12] Montgomery, D. C. and Runger, G. C. (1994), Applied Statistics and Probability for Engineers, John Wiley & Sons, Inc., p. 411.
[13] Payton, M. E. (1996), Confidence Intervals for the Coefficient of Variation, Proceedings of the Kansas State University Conference on Applied Statistics in Agriculture, 8, pp. 82-87.
[14] Payton, M. E., Miller, A. E. and Raun, W. R. (2000), Testing Statistical Hypotheses Using Standard Error Bars and Confidence Intervals, Communications in Soil Science and Plant Analysis, 31, pp. 547-552.
[15] Payton, M. E., Greenstone, M. H. and Schenker, N. (2003), Overlapping Confidence Intervals or Standard Error Intervals: What Do They Mean in Terms of Statistical Significance?, The Journal of Insect Science, 3, pp. 34-39.
[16] Schenker, N. and Gentleman, J. F. (2001), On Judging the Significance of Differences by Examining the Overlap Between Confidence Intervals, The American Statistician, 55, pp. 182-186.
[17] Sont, W. N., Zielinski, J. M., Ashmore, J. P., Jiang, H., Krewski, D., Fair, M. E., Band, P. R. and Letourneau, E. G. (2001), First Analysis of Cancer Incidence and Occupational Radiation Exposure Based on the National Dose Registry of Canada, American Journal of Epidemiology, 153(4), pp. 309-318.
[18] Tersmette, A. C., Petersen, G. M., Offerhaus, G. J. A., Falatko, F. C., Brune, K. A., Goggins, M., Rozenblum, E., Wilentz, R. E., Yeo, C. J., Cameron, J. L., Kern, S. E. and Hruban, H. (2001), Increased Risk of Incident Pancreatic Cancer Among First-Degree Relatives of Patients with Familial Pancreatic Cancer, Clinical Cancer Research, 7, pp. 738-744.
[19] Vangel, M. G. (1996), Confidence Intervals for a Normal Coefficient of Variation, The American Statistician, 50, pp. 21-26.

APPENDICES

Appendix A: The Kurtosis of the Sum of n Independent Uniform, U(0, 1), Distributions
Appendix B: Matlab Functions

Appendix A: The Kurtosis of the Sum of n Independent Uniform, U(0, 1), Distributions

Suppose x_1, x_2, ..., x_n are independently and each uniformly distributed over the real interval [0, 1]. It is well known that the first four moments of each x_i are given by μ = E(x_i) = 1/2, μ_2 = V(x_i) = 1/12 = σ², μ_3 = 0 (by symmetry), and μ_4 = 1/80, so that β_4 = μ_4/σ⁴ = 144/80 = 1.80 and the kurtosis of each x_i is equal to α_4 = β_4 − 3 = −1.20. Now consider the sum Y_n = Σ_{i=1}^{n} x_i; our objective is to compute the first four moments of Y_n from the known moments of each x_i, i = 1, 2, ..., n. Clearly, the mean of Y_n is E(Y_n) = n/2, the variance is V(Y_n) = n·V(x_i) = n/12, μ_3(Y_n) = 0 by symmetry, and μ_4(Y_n) is computed below.

μ_4(Y_n) = E[Σ_{i=1}^{n} x_i − (n/2)]⁴ = E[Σ_{i=1}^{n} (x_i − 1/2)]⁴ = E[Σ_{i=1}^{n} (x_i − 1/2)⁴ + C(4,2)·Σ_{i=1}^{n−1} Σ_{j>i}^{n} (x_i − 1/2)²(x_j − 1/2)²] = n·μ_4(x_i) + 6·C(n,2)·V(x_i)·V(x_j)

Note that in the expansion of [Σ_{i=1}^{n} (x_i − 1/2)]⁴ the expectations of odd products, such as E[(x_1 − 1/2)(x_2 − 1/2)³], vanish due to the mutual independence of x_i and x_j for all i ≠ j. Hence, μ_4(Y_n) = n/80 + 3n(n − 1)σ⁴ = n/80 + 3n(n − 1)/144 = n/80 + n(n − 1)/48.
Thus,

β_4(Y_n) = μ_4(Y_n)/[V(Y_n)]² = [n/80 + n(n − 1)/48]/(n/12)² = [144/80 + 144(n − 1)/48]/n = [1.80 + 3(n − 1)]/n ⟹ β_4(Y_n) = 3 − 1.20/n ⟹ α_4(Y_n) = β_4(Y_n) − 3 = −1.20/n.

Thus, for a 2-fold convolution of U(0, 1) the kurtosis is −1.20/n = −0.60, while for a 6-fold convolution the kurtosis of Σ_{i=1}^{6} x_i is equal to −1.20/n = −0.20.

Appendix B: Matlab Functions

(a) The following three Matlab functions compute the Overlap significance level, ᾱ, for the pooled t-test, the two-independent-sample t-test, and the paired t-test, respectively, at a given significance level α = a, sample sizes n_x and n_y, and sample variance ratio F_0 = S_x²/S_y².

1. function y = aprP(a,nx,ny,F0)
% Overlap significance level for the pooled t-test.
tx = tinv(1-a/2,nx-1); ty = tinv(1-a/2,ny-1); nu = nx+ny-2;
RHS = nu*(tx*sqrt(F0*ny)+ty*sqrt(nx))^2/((ny-1+F0*(nx-1))*(nx+ny));
y = 1-fcdf(RHS,1,nu);

2. function y = apr(a,nx,ny,F0)
% Overlap significance level for the two-independent-sample t-test.
tx = tinv(1-a/2,nx-1); ty = tinv(1-a/2,ny-1); Rn = ny/nx;
nu = (nx-1)*(ny-1)*(1+F0*Rn)^2/(nx-1+(ny-1)*(F0*Rn)^2);
RHS = (tx*sqrt(F0*Rn)+ty)^2/(1+F0*Rn);
y = 1-fcdf(RHS,1,nu);

3. function y = aprc(a,n,F0,r)
% Overlap significance level for the paired t-test (n pairs, sample correlation r).
F1 = finv(1-a,1,n-1);
RHS = F1*(sqrt(F0)+1)^2/(1+F0-2*r*sqrt(F0));
y = 1-fcdf(RHS,1,n-1);

(b) The following three Matlab functions compute the overlap proportion for the pooled t-test, the two-independent-sample t-test, and the paired t-test, respectively, at a given significance level α = a, sample sizes n_x and n_y, and sample variance ratio F_0 = S_x²/S_y².

1. function y = OmegaP(a,nx,ny,F0)
% Overlap proportion for the pooled t-test.
Rn = ny/nx; nu = nx+ny-2; n1 = nx-1; n2 = ny-1;
NUM = tinv(1-a/2,n1)*sqrt(F0*Rn)+tinv(1-a/2,n2)-tinv(1-a/2,nu)*sqrt((1+Rn)*(n1*F0+n2)/nu);
DEN = tinv(1-a/2,n1)*sqrt(F0*Rn)+tinv(1-a/2,n2)+tinv(1-a/2,nu)*sqrt((1+Rn)*(n1*F0+n2)/nu);
y = NUM./DEN;

2. function y = Omega(a,nx,ny,F0)
% Overlap proportion for the two-independent-sample t-test.
Rn = ny/nx; n1 = nx-1; n2 = ny-1;
nu = (n1*n2*(F0*Rn+1)^2)/(n2*(Rn*F0)^2+n1);
NUM = tinv(1-a/2,n1)*sqrt(Rn*F0)+tinv(1-a/2,n2)-tinv(1-a/2,nu)*sqrt(F0*Rn+1);
DEN = tinv(1-a/2,n1)*sqrt(Rn*F0)+tinv(1-a/2,n2)+tinv(1-a/2,nu)*sqrt(F0*Rn+1);
y = NUM./DEN;

3. function y = OmegaC(F0,r)
% Overlap proportion for the paired design (depends only on F0 and r).
NUM = sqrt(F0)+1-sqrt(1+F0-2*r*sqrt(F0));
DEN = sqrt(F0)+1+sqrt(1+F0-2*r*sqrt(F0));
y = NUM./DEN;

(c) The following Matlab code computes the value of γ that provides an α-level test for the two-independent-sample t-test.

a = 0.05; nx = 4; ny = 8; F0 = 1.5;
Rn = ny/nx; n1 = nx-1; n2 = ny-1;
nu = (n1*n2*(F0*Rn+1)^2)/(n2*(Rn*F0)^2+n1);
RHS = tinv(1-a/2,nu)*sqrt(Rn*F0+1);
c(1) = a;
for i = 2:25
    c(i) = c(i-1)+0.005;
    LHS(i) = tinv(1-c(i)/2,n2)+tinv(1-c(i)/2,n1)*sqrt(Rn*F0);
end
for i = 2:25
    if RHS-0.005 <= LHS(i) && LHS(i) <= RHS+0.005
        g = c(i)          % gamma that provides an alpha-level test
        break
    end
end
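As a usage illustration (with arbitrary, hypothetical input values), the above functions can be called from the Matlab command window as follows; each call returns a scalar.

% Hypothetical calls to the Appendix B functions.
aprP(0.05, 9, 9, 0.65^2/0.54^2)    % Overlap alpha-bar, pooled t-test
apr(0.05, 7, 11, 1.5)              % Overlap alpha-bar, two-independent-sample t-test
aprc(0.05, 10, 1.2, -0.5)          % Overlap alpha-bar, paired t-test
OmegaP(0.05, 9, 9, 1.45)           % overlap proportion, pooled t-test
Omega(0.05, 7, 11, 1.5)            % overlap proportion, two-independent-sample t-test
OmegaC(1.2, -0.5)                  % overlap proportion, paired design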