Location-Scale Bivariate Weibull Distributions For Bivariate Lifetime Modeling Except where reference is made to the work of others, the work described in this thesis is my own or was done in collaboration with my advisory committee. This thesis does not include proprietary or classified information. Yi Han Certificate of Approval: Asheber Abebe Assistant Professor Mathematics and Statistics Mark Carpenter, Chair Associate Professor Mathematics and Statistics Nedret Billor Associate Professor Mathematics and Statistics Stephen L. McFarland Acting Dean Graduate School Location-Scale Bivariate Weibull Distributions For Bivariate Lifetime Modeling Yi Han A Thesis Submitted to the Graduate Faculty of Auburn University in Partial Fulfillment of the Requirements for the Degree of Master of Science Auburn, Alabama May 11, 2006 Location-Scale Bivariate Weibull Distributions For Bivariate Lifetime Modeling Yi Han Permission is granted to Auburn University to make copies of this thesis at its discretion, upon the request of individuals or institutions and at their expense. The author reserves all publication rights. Signature of Author Date of Graduation iii Vita Yi Han, son of Mulan Zhu and Yujian Han, was born in Qing Chuan, Si Chuan Province, China on twenty eighth of November, 1973. He completed his Bachelors in Engineering Economics and Masters in Management Engineering from Central- South University of Technology, Changsha, China, in July, 1996, and May, 1999, respectively. After graduation, he worked as a Financial Analyst in China Copper Lead Zinc Co. Ltd., Beijing, China. In May, 2003, he earned his Masters in Industrial and Systems Engineering from Auburn University, Alabama, U.S.A. iv Thesis Abstract Location-Scale Bivariate Weibull Distributions For Bivariate Lifetime Modeling Yi Han Master of Science, May 11, 2006 (M.I.S.E., Auburn University, 2003) (M.M., Central South University of Technology, 1999) (B.E., Central South University of Technology, 1996) 114 Typed Pages Directed by Mark Carpenter Much research has been conducted over the last thirty years in the development and characterization of bivariate survival distributions. Typically, the multivariate distribution is derived assuming that the marginal distributions are of some specified lifetime family. In this thesis, we examine various bivariate Weibull models. In addition, a location-scale bivariate Weibull model is proposed. Bivariate parameter estimation, withandwithoutcensoring, isdevelopedandappliedtorealandsimulated data. Examples are drawn from biomedical research. v Acknowledgments I wish to express sincere appreciation to my academic advisor, Dr. Mark Car- penter for his direction, assistance, advice and patience during my studies at Auburn University. My appreciation also goes to Dr. Asheber Abebe and Dr. Nedret Billor for serving on my advisory committee and their constructive comments. I appreciate the great help I received from Mr. Norou Diawara. I would like to extend my special thanks and great appreciation to my family and my fiance for their love, support, and constant encouragement. In loving memory of my father. vi StylemanualorjournalusedJournalofApproximationTheory(togetherwiththe style known as ?aums?). Bibliograpy follows van Leunen?s A Handbook for Scholars. Computer software used The document preparation package TEX (specifically LATEX) together with the departmental style-file aums.sty. vii Table of Contents List of Tables x 1 Introduction 1 1.1 Univariate Lifetime Distributions . . . . . . . . . . . . . . . . . . . . 1 1.1.1 Univariate Weibull Distribution . . . . . . . . . . . . . . . . . 2 1.1.2 Distributional Properties for Univariate Weibull Distribution 5 1.1.3 Univariate Weibull Distribution Parameter Estimation . . . . 8 1.1.4 Asymptotic Normality and Confidence Intervals of MLE Weibull Parameters . . . . . . . . . . . . . . . . . . . . . . . . 9 1.1.5 Information Matrix and Variance-Covariance Matrix of MLE Weibull Parameters . . . . . . . . . . . . . . . . . . . . . . . . 10 1.1.6 Bonferroni Simultaneous Confidence Intervals for the 2- Parameter Weibull Model . . . . . . . . . . . . . . . . . . . . 11 1.2 Multivariate Lifetime Distributions . . . . . . . . . . . . . . . . . . . 12 1.2.1 Multivariate Distribution Functions . . . . . . . . . . . . . . . 12 1.2.2 Dependence Structure and Types . . . . . . . . . . . . . . . . 14 2 Some Bivariate Failure Time Distributions 17 2.1 Linearly Associated Bivariate Failure Time Distributions . . . . . . . 18 2.1.1 Linearly Associated Bivariate Exponential (BVE) . . . . . . . 20 2.1.2 Linearly Associated Bivariate Weibull . . . . . . . . . . . . . . 30 2.2 Bivariate Lifetime Distributions Based on Random Hazards . . . . . . 37 2.2.1 The Bivariate Weibull Model (BVW) of Random Hazards . . 39 2.2.2 The Farlie-Gumbel-Morgenstern (FGM) Family . . . . . . . . 41 2.2.3 The Farlie-Gumbel-Morgenstern Family of BVWs . . . . . . . 43 3 Bivariate Location-Scale Weibull Lifetime Distributions 45 3.1 Bivariate Location-Scale Family Based on BVW with Random Hazards 48 3.2 Maximum Likelihood Estimates of the Bivariate Location-Scale Family Based on BVW with Random Hazards . . . . . . . . . . . . . . . . . 51 3.2.1 Likelihood Functions of Uncensored Lifetime Data . . . . . . . 51 3.2.2 Likelihood Functions for Right Censored Lifetime Data . . . . 52 3.3 Location-Scale Family of BVWs Based on the Farlie-Gumbel- Morgenstern Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 viii 3.4 Maximum Likelihood Estimates of the FGM BVWs . . . . . . . . . . 57 3.4.1 Likelihood Functions of Uncensored Lifetime Data . . . . . . . 57 3.4.2 Likelihood Functions for Right Censored Lifetime Data . . . . 58 3.4.3 Optimization Procedures for MLEs of the FGM BVW . . . . 59 3.5 Bivariate Location-Scale Lifetime Distribution Regression Models . . 60 4 Simulation Study 63 4.1 Linearly Associated Bivariate Exponential and Weibull Models . . . . 63 4.1.1 Bivariate Data Generation . . . . . . . . . . . . . . . . . . . . 63 4.1.2 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . 64 4.2 Bivariate Location-Scale Models . . . . . . . . . . . . . . . . . . . . . 65 4.2.1 Data Generation for BVW of Hougaard?s Model . . . . . . . . 65 4.2.2 Data Generation for BVW of FGM Model (Sequential Monte Carlo Simulation) . . . . . . . . . . . . . . . . . . . . . . . . . 67 4.2.3 Simulation Settings and Results for Bivariate Location-Scale Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 4.3 Example: DMBA-Induced Tumors . . . . . . . . . . . . . . . . . . . 70 Bibliography 75 Appendices 78 A SAS Simulation Codes 79 A.1 Data Generation and MLE for Linearly Associated BVE and BVW . 79 A.2 Data Generation and MLE for BVW of Hougaard?s Model . . . . . . 84 A.3 Data Generation and MLE for BVW of FGM Model . . . . . . . . . 92 A.4 Example: DMBA-Induced Tumors . . . . . . . . . . . . . . . . . . . 100 ix List of Tables 4.1 Simulation Study for BVE(?1 = ?2 = 1, n = 25) . . . . . . . . . . . . 66 4.2 Simulation Study BVW(?1 = ?2 = 1) . . . . . . . . . . . . . . . . . . 66 4.3 Joint and Working MLEs with ?1 = 0.5, ?2 = 2, ?1 = ?2 = 10 and varying ? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 4.4 Bias Improvements/Losses(%) Over the Working Estimates . . . . . . 71 4.5 Empirical MSE Improvements/Losses(%) Over the Working Estimates 72 4.6 Joint and Working MLEs with ?1 = 0.5, ?2 = 2, ?1 = ?2 = 10 and varying ? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 4.7 Bias Improvements/Losses(%) Over the Working Estimates . . . . . . 73 4.8 Empirical MSE Improvements/Losses(%) Over the Working Estimates 73 4.9 Bivariate Weibull Times to first and second tumor . . . . . . . . . . . 74 x Chapter 1 Introduction 1.1 Univariate Lifetime Distributions The term lifetime generally refers to the time to some events such as death or failure from a certain starting point. We define lifetime or survival analysis as the collection of statistical models and methodologies to analyze lifetime data of various types. In applications in engineering and biomedical sciences, failure time and survival time are often used synonymously as lifetime, thus we have terms failure time distributions and survival time distributions respectively. Lifetime distributions generally have positive support, i.e., lifetime data can take on only non-negative real values. For a nonnegative continuous random variable X, the cumulative distribution function (cdf) FX (?) and survivor distribution SX (?) are defined as FX(x) = P(X ?x) = integraldisplay x 0 f (t)dt, (1.1) SX(x) = P(X ?x) = integraldisplay ? x f (t)dt. (1.2) and the hazard function is given by 1 h(x) = ? limtrianglex?0 bracketleftBig S(x+trianglex)?S(x) S(x) bracketrightBig S(x) = f(x) S(x). (1.3) The hazard function gives the instantaneous rate of failure at time x, given that the individual survives up tox, and carries important information concerning the risk of failure versus time. It is often desirable to model lifetime distributions through the hazard function if factors affecting an individual?s lifetime are time-dependent, or vary over time. 1.1.1 Univariate Weibull Distribution In his 1951 paper, ?A Statistical Distribution Function of Wide Applicability?, the Swedish Professor Waloddi Weibull introduced the Weibull Distribution and stated ?Experience has shown that, in many cases, it fits the observations better than other known distribution functions?. Eventually, the Weibull Distribution be- came the most useful tool in reliability due to its unique characteristics and wide range of applicability, especially so when it pertains to describing the underlying dis- tribution of time to failure (TTF) of mechanical or electrical components or systems. Professor Weibull defined his original cumulative distribution function as F(x) = 1?exp bracketleftbigg ?(x?x?) m xm0 bracketrightbigg , (1.4) where xu, m, and x0 correspond to the more universal notations used herein of the location parameter ?, the shape parameter ?, and the scale parameter ? as in (1.5). 2 We define the three-parameter Weibull probability density function (pdf) as f(x) = ?? parenleftbiggx?? ? parenrightbigg??1 exp bracketleftBigg ? parenleftbiggx?? ? parenrightbigg?bracketrightBigg ,x?? ? 0, ?, ? > 0. (1.5) The two-parameter Weibull probability density function (pdf) is given as f(x) = ?? parenleftBigx ? parenrightBig??1 exp bracketleftbigg ? parenleftBigx ? parenrightBig?bracketrightbigg ,x? 0, ?, ? > 0. (1.6) Note that (1.6) is a special case of (1.5) where ? = 0. When ? = 1, the Weibull pdf becomes an exponential pdf. The corresponding cdfs of the two and three-parameter Weibull distribution are given as F(x) = 1?exp bracketleftbigg ? parenleftBigx ? parenrightBig?bracketrightbigg , (1.7) and F(x) = 1?exp bracketleftBigg ? parenleftbiggx?? ? parenrightbigg?bracketrightBigg . (1.8) Note that all Weibull cdf?s cross at one point where the cdf is valued at ap- proximately 0.63, and the corresponding x value is 1, which is the value of the scale parameter ?. We define such a value of x as the characteristic life of the Weibull distribution,which is the time at which the value of the cdf is exactly equal to 1?e?1. In other words, 63.212% of the population fails by the time of characteristic life no matter what the values of the other parameters are. For two parameter Weibull 3 distribution, the characteristic life is equal to ?, and ? +? for the three parameter Weibull. The survival functions, the probability that an individual survives at least time x, of the two and three-parameter Weibull distribution are given as S(x) = exp bracketleftbigg ? parenleftBigx ? parenrightBig?bracketrightbigg , (1.9) and S(x) = exp bracketleftBigg ? parenleftbiggx?? ? parenrightbigg?bracketrightBigg . (1.10) The hazard function of the three-parameter Weibull distribution is given as h(x) = f(x)S(x) = ?? parenleftbiggx?? ? parenrightbigg??1 ; (1.11) and the corresponding cumulative hazard function is given as H(x) = xintegraldisplay 0 h(t)dt = parenleftbiggx?? ? parenrightbigg? = ?ln[S(x)] (1.12) The Weibull distribution can have increasing, decreasing and constant hazard rates, which in term reflects the versatileness of the Weibull distribution in lifetime or survival analysis. 4 The above (1.5), (1.9) and (1.12) show the relationships among the pdf, the survivor function, the hazard function and the cumulative hazard function of the Weibull distribution, i.e., given any one of them, the others follow. 1.1.2 Distributional Properties for Univariate Weibull Distribution The Moments of Weibull Distribution The general noncentral moments of the two parameter Weibull Distribution is given by E(Xn) = ?n? parenleftbigg 1 + n? parenrightbigg , (1.13) for any integer n, and ?(cdot) is a gamma function defined as ?(k) = ?integraldisplay 0 xk?1 exp(?x)dx, k> 0. The general moments of the three parameter (location-scale) Weibull distribu- tion is more complicated, but can be derived from that of the two parameter Weibull. Let the continuous positive random variable X follow a two parameter Weibull dis- tribution as defined in (1.6), then Y = X +? follows the three parameter Weibull distribution as in (1.5). The general moment is thus given by E(Yn) = E[(X +?)n]. (1.14) 5 By applying the binomial theorem that for positive integers n, (x+a)n = nsummationdisplay k=0 parenleftbiggn k parenrightbigg xkan?k, (1.15) (1.14) becomes E(Yn) = E[(X +?)n] = E bracketleftBigg nsummationdisplay k=0 parenleftbiggn k parenrightbigg Xk?n?k bracketrightBigg (1.16) = nsummationdisplay k=0 E bracketleftbiggparenleftbiggn k parenrightbigg Xk?n?k bracketrightbigg = nsummationdisplay k=0 bracketleftbiggparenleftbiggn k parenrightbigg ?n?kEparenleftbigXkparenrightbig bracketrightbigg = nsummationdisplay k=0 bracketleftbiggparenleftbiggn k parenrightbigg ?k?n?k? parenleftbigg 1 + k? parenrightbiggbracketrightbigg The Mean, Median, and Variance of Weibull Distribution The mean and the variance of the Weibull distribution can be derived from the general moment. The means of the two and three parameter Weibull are given by E(X) = ?? parenleftbigg 1 + 1? parenrightbigg , and (1.17) E(Y) = ?+?? parenleftbigg 1 + 1? parenrightbigg . 6 The variance of the two parameter Weibull distribution is given by Var(X) = EparenleftbigX2parenrightbig?E2 (X) (1.18) = ?2? parenleftbigg 1 + 2? parenrightbigg ??2?2 parenleftbigg 1 + 1? parenrightbigg = ?2 bracketleftbigg ? parenleftbigg 1 + 2? parenrightbigg ??2 parenleftbigg 1 + 1? parenrightbiggbracketrightbigg , and the variance of the three parameter Weibull distribution is given by Var(X) = EparenleftbigX2parenrightbig?E2 (X) (1.19) = 2summationdisplay k=0 bracketleftbiggparenleftbigg2 k parenrightbigg ?k?2?k? parenleftbigg 1 + k? parenrightbiggbracketrightbigg ? bracketleftbigg ?? parenleftbigg 1 + 1? parenrightbigg +? bracketrightbigg2 = bracketleftbigg ?2 + 2??? parenleftbigg 1 + 1? parenrightbigg +?2? parenleftbigg 1 + 2? parenrightbiggbracketrightbigg ? bracketleftbigg ?2?2 parenleftbigg 1 + 1? parenrightbigg + 2??? parenleftbigg 1 + 1? parenrightbigg +?2 bracketrightbigg = ?2 bracketleftbigg ? parenleftbigg 1 + 2? parenrightbigg ??2 parenleftbigg 1 + 1? parenrightbiggbracketrightbigg , which is the same as the variance of the two parameter Weibull distribution. The medians of the two and three parameter Weibull can be solved from the cdfs or the survival functions, and are given as M (X) = ?(?ln0.5)1/?, and (1.20) M (X) = ?(?ln0.5)1/? +?. 7 1.1.3 Univariate Weibull Distribution Parameter Estimation There are several methods for estimating the parameters of the Weibull distribu- tion, i.e., probability plotting, hazard plotting, and maximum likelihood. The method of maximum likelihood (ML) is a commonly used procedure because it has very de- sirable properties that when the sample size n is large. Under certain regularity conditions, the maximum likelihood estimator of any parameter is almost unbiased and has a variance that is nearly as small as can be achieved by any estimator, and its sampling distribution (or pdf) approaches normality [9, Devore (2000)]. Let X1, X2,..., Xn be a random sample from a two-parameter Weibull distri- bution, and x1, x2,..., xn be the corresponding observed values, then the likelihood function (LF) is given by L(?,?) = nproductdisplay i=1 parenleftbigg? ??x ??1 i e ?(xi? )?dxi parenrightbigg (1.21) Since the logarithm transform is a monotone increasing one, maximizing the natural logarithm of the likelihood function is equivalent to maximizing the likelihood function itself. Taking the natural logarithm of the LF, and setting both derivatives to zero yields two sets of score equations that do not give closed-form solution for the maximum likelihood estimates (mle). Instead, for each sample set, the equations can be solved using an iterative numerical procedure which is quite tedious without the aid of computers. In most instances, however, a simple trial and error approach also works [7, Cohen (1965)]. 8 The corresponding loglikelihood function (LLF) for two parameter Weibull dis- tribution is given by l(?,?) = nsummationdisplay i=1 bracketleftbigg ln parenleftbigg? ? parenrightbigg + (??1)ln parenleftBigxi ? parenrightBig ? parenleftBigxi ? parenrightBig?bracketrightbigg 1.1.4 Asymptotic Normality and Confidence Intervals of MLE Weibull Parameters It is well known that the sampling distributions (SMD) of maximum likelihood estimators for Weibull parameters approach normality asymptotically. For example, [35, Miller (1984)] measured the degree of Normality for the MLE of ? using Chi- square goodness-of-fit. He found that when the sample size is around 170, the MLE of ? is approximately normally distributed. For small or medium sample sizes, distri- butions of parameters are clearly skewed. Moreover, [30, Liu (1997)] suggests that for 20 or less observed failures, two-parameter Weibull distribution should be a preferred choice for more stable and more conservative results. We can construct asymptotic confidence intervals for the Weibull parameters estimated by maximum likelihood method when the sample size is large. Because the characteristic life, ?, and the minimum life, ?, are the-larger-the-better (LTB) type of parameters, it is reasonable to construct lower one-sided confidence intervals for ? and ?, and two-sided confidence interval (CI) for ?. In order to calculate the asymptotic confidence intervals, we first need to estimate standard errors of the parameters. Information Matrix and Bootstrapping method 9 can be utilized to better estimate standard errors. The conservative Bonferroni confi- dence interval is also derived to address the problem of correlations between Weibull parameters. 1.1.5 InformationMatrixandVariance-CovarianceMatrixofMLEWeibull Parameters The information matrix I can be constructed from the logarithm of the likelihood function, where its ijth element is Iij = E bracketleftbigg ?? 2L(?;X) ??i??j bracketrightbigg (1.22) The inverse of the information matrix, I?1, is the variance-covariance matrix, where the diagonal elements are variances of parameters and elements elsewhere are covariances. However, applying the expectation operator to the above equations in or- der to obtain exact results is often too complicated to accomplish, though asymptotic information matrix and variance-covariance matrix can be constructed as the sample size increases. One option is to use simulation, such as parametric Bootstrapping as proposed by [10, Efron (1985)], with the MLEs of ? and ? as seeds. 10 1.1.6 Bonferroni Simultaneous Confidence Intervals for the 2-Parameter Weibull Model It is clearly shown from the information matrix that there are correlations be- tween Weibull parameters. The CIs obtained by Variance-Covariance and Bootstrap- ping ignore such factors. If the CIs are independent, then the joint confidence co- efficient for a joint CI would be the product of all the confidence coefficients of the parameter CIs, i.e. (1??joint) = mproductdisplay i=1 (1??i) (1.23) where m is the number of parameters. The intervals, however, are not independent for Weibull parameters. It can be shown that the overall error rate, ?joint, is no more than the summation of all the individual error rates, or, ?joint ? summationtextmi=1?i, which implies that when a joint confidence region is to be constructed with overall error rate ?joint, the individual error rates should be set at around ?jointm , or, if different individual error rates are desired, set them such that summationtextmi=1?i ??joint. So, in order to obtain a simultaneous rectangular Bonferroni (1??joint)CI region for Weibull parameters ? and ?, we should set the individual confidence coefficients for the CIs of ? and ? both at (1? ?joint2 ). Therefore, the Bonferroni (1??joint)CI region for ? and ? are as follows: The lower (1? ?joint2 )CI for ? ???Z?joint 2 ?se(??) (1.24) 11 The two-sided (1? ?joint2 )CI for ? ???Z?joint 4 ?se(??) (1.25) 1.2 Multivariate Lifetime Distributions Literature is abundant on multivariate lifetime data and distributions. [18, Hougaard (2000)] and [37, Murthy, etc. (2004)] provide comprehensive and updated literature reviews. However as mentioned in [26, Lawless (2002)], gaps exist in some areas. A related issue is the introduction of covariates in multivariate survivor anal- ysis, we analyze and investigate this for the bivariate case. 1.2.1 Multivariate Distribution Functions Multivariate lifetime data arise when multiple events occur for each subject in the study. The problem addressed hereby involves continuous nonnegative ran- dom variables of lifetime, X1, X2,..., Xn, with joint probability density function as fX1,X2,...,Xn(x1,x2,...,xn). A function fX1,X2,...,Xn(x1,x2,...,xn) is a bivariate pdf if 1. fX1,X2,...,Xn(x1,x2,...,xn) ? 0 ? xi, i = 1,2,...,n; 2. integraltextintegraltext ...integraltext Rfracturn fX1,X2,...,Xn(x1,x2,...,xn)dx1dx2...dxn = 1. 12 The multivariate distribution and survivor functions are defined as FX1,X2,...,Xn(x1,x2,...,xn) = P (Xi ?xi,?xi,i = 1,2,...,n) (1.26) SX1,X2,...,Xn(x1,x2,...,xn) = P (Xi ?xi,?xi,i = 1,2,...,n) (1.27) and the marginal and joint hazard functions are given by ?j (x) = ??SX1,X2,...,Xn(x1,x2,...,xn)/?xjS X1,X2,...,Xn(x1,x2,...,xn) (1.28) ?(x1,x2,...,xn) = fX1,X2,...,Xn(x1,x2,...,xn)S X1,X2,...,Xn(x1,x2,...,xn) The joint hazard function describes the instantenuous probability that all sub- jects experience an event given the subjects have survived up to a time x. [23, Joe (1997)] summaries the following properties of a multivariate distribution function. 1. limxj??S(x1,x2,...,xn) = 0 j = 1,2,...,n; 2. limxj???jF (x1,x2,...,xn) = 1, j = 1,2,...,n; 3. For all (a1,...,an), (b1,...,bn) with aj tj, i, j = 1,2, j negationslash= i 17 The lifetimes T1,T2 are not in general independent, such as in the case of life- times of a pair of twins. Literature is abundant with methods of modeling bivari- ate distributions. For models with specified continuous marginal distributions, the joint survivor function can be represented by a parametric family of copulas such as models considered by [5, Clayton (1978)]. Extensive work have been done on the construction of bivariate exponential models as in [15, Gumbel (1960)], [13, Freund (1961)] and [40, Sarkar (1987)]. [32, Marshall and Olkin (1967)] and [27, Lee (1979)] constructed bivariate Weibull models by power transformation of the marginal of a bivariate exponential. [33, Marshall and Olkin (1988)] derived general families of bi- variate distributions from mixture models by transformation. [17, Hougaard (1986)] discussed another common approach through random effects which will be introduced in following sections. 2.1 Linearly Associated Bivariate Failure Time Distributions Let X1 and X2 be fixed marginally as exponential random variables with hazard rates ?1 and ?2, respectively. Then by introducing a latent variable, Z, statistically independent of X1, a linear relationship is formed between X1 and X2 by setting X2 = aX1 +Z, (2.4) for a > 0. [21, Iyer, Manjunath and Manivasakan (2002)] and [22, Iyer and Manju- nanth (2004)] show through Laplace transforms, the distribution of the latent variable 18 Z can be completely and uniquely characterized as the product of a Bernoulli random variable with P(Z = 0) = a?2/?1 and a continuous random variable having the same distribution as X2. Therefore, Z is distributed as mixture of a point mass at zero and an exponential with hazard rate ?2. Note that when Z = 0 then X2 is proportional to X1 with proportionality constant a, which is fixed and known. For the special case of a = 1 in (2.4), there is a positive probability for simul- taneously events, i.e., P(X2 = X1) > 0. This phenomenon is often referred to as a ?fatal shock? in reference to the now famous bivariate exponential proposed by [32, Marshall and Olkin (1967)]. Most bivariate exponential and Weibull models proposed in the literature share this property, including, for example, the multivariate Weibull proposed by [16, Hanagal (1996)]. In system reliability theory, [39, Rausand and Hoyland (2004)] refers to this situation as ?common cause failures? or as ?cascading failures? when the failure of one component is initiated by the failure of another in a system. There are many realistic applications of this model in the physical and biological sciences, such as, in medical research where simultaneous failure can occur in pairs of organs (kidneys, livers and eyes), in engineering where a random shock to a system of components may cause simultaneous failures, or in animal chemopre- vention studies where several tumors may become palpable on the same day. Since a > 0, the model driven by (2.4) is less restrictive in that it includes the possibility for simultaneous failure (a = 1) and proportional failure times with proportionality constant a, i.e.,P(X2 = aX1) = P(Z = 0) = p> 0. 19 2.1.1 Linearly Associated Bivariate Exponential (BVE) Suppose the continuous random variable Xi has an exponential pdf with hazard ?i, fXi(x) = ?ie??ixI(x> 0), (2.5) i = 1,2 and I(?) is the indicator function. Based on the linear structure given in (2.4) and the fact that Z has a point mass at zero, we see that X2 = aX1 +Z = ? ?? ?? aX1 if Z = 0 aX1 +Z if Z negationslash= 0 , where P(Z = 0) = p, P(Z negationslash= 0) = 1 ?p, p = a?2/?1, and Z is independent of X1. Since Z is a mixture of discrete and continuous distributions with a point mass at 0, i.e., P(Z = 0) = p, we can use the Direc delta to express the distribution of Z as the following integrable density function, fZ(z) = p?(z) + (1?p)fX2(z)I(z > 0), (2.6) where I(?) is an indicator function and ?(?) is Dirac delta function. More details and applications of the ?-function can be found in [1, Au and Tam (1999)] and [25, Khuri 20 (2004)]. Here, we define the?-function through its following mathematical properties: ?(t) = 0, if tnegationslash= 0,and integraldisplay ? ?? ?(t)dt = 1 (2.7) If h(t) is a real function with simple roots t1,...,tn and is differentiable at each root with hprime(ti) negationslash= 0, i = 1,...,n, then ?(h(t)) = nsummationdisplay i=1 ?(t?ti) |hprime(ti)| ??(ct) = 1 |c|?(t),cnegationslash= 0,??(?t) = ?(t) (2.8) integraldisplay ? ?? g(t)?(t?t0)dt = g(t0) ? integraldisplay ? ?? ?(t?t0)dt = 1 (2.9) [25, Khuri (2004)] demonstrates how the?-function can be used to generalize dis- tribution theory and provides a unified approach in finding transformations, without regard to whether the transformation is one-to-one and without the computation of the Jacobian matrix. This property proves quite useful in the distribution derivations in this paper, since each is discontinuous on a line-transect, i.e., P(X2 = aX1) > 0. Since X1 is independent of Z, from (2.5) and (2.6), we can write the joint pdf as: fX1,Z(x,z) = fX1(x)fZ(z) = pfX1(x)?(z) + (1?p)fX1(x)fX2(z)I(z > 0). (2.10) Notice that (2.6) and (2.10) are stated for any positive support distributions rather than the specific exponentials given in 2.5. Theorem 1 similarly expresses the joint 21 density in terms of general positive support distributions. After the proof of Theorem 1, we give the resulting bivariate exponential. Theorem 2.1 Let X1 be a positive support random variable with marginal density fX1(x1). Let Z be a random variable with density function given in 2.6 and let X2 = aX1 +Z, a> 0. Denote the variance of Xi as ?2i,i = 1,2, assume ?2 >a?1 and let p = a?1/?2. Then the joint pdf of X1 and X2 is given by fX1X2(x1,x2) = pfX1(x1)?(x2?ax1)+(1?p)fX1(x1)fX2(x2?ax1)I(x2 >ax1),x2 ?ax1, (2.11) and the variance/covariance matrix, ?, and correlation matrix, ?, are given as ? = ? ???21 a?21 a?21 ?22 ? ?? and ? = ? ?? 1 a?1/?2 a?1/?2 1 ? ??, (2.12) and ? positive definite. Proof. Following [25, Khuri (2004)], we find the joint density, f(x1,x2) as f (x1,x2) = integraldisplay ? ?? fX1,Z (x1,z)?(ax1 +z?x2)dz = integraldisplay ? ?? {pfX1 (x1)?(z)?(ax1 +z?x2) +(1?p)fX1 (x1)fX2 (x2)?(ax1 +z?x2)}dz 22 from 2.10. Note that 2.7 implies ?(z)?(ax1 +z?x2) = 0 when z negationslash= 0 and ?(ax1 +z? x2) = ?(z?(x2 ?ax1)), so f(x1,x2) = integraldisplay ? ?? {pfX1(x1)?(z)?(x2 ?ax1) +(1?p)fX1(x1)fX2(z)?(z?(x2 ?ax1))}dz = pfX1(x1)?(x2 ?ax1) integraldisplay ? ?? ?(z)dz +(1?p)fX1(x1) integraldisplay ? ?? fX2(z)?(z?(x2 ?ax1))dz = pfX1(x1)?(x2 ?ax1) + (1?p)fX1(x1)fX2(x2 ?ax1)I(x2 >ax1), where the first integral evaluation is from 2.7 and the second from 2.9. The covariance can be found by noting that cov(X1,X2) = cov(X1,aX1 ?Z) = cov(X1,aX1) + cov(X1,Z) = a?21, using the fact that X1 and Z are independent. From Theorem 3, if we let Xi be an exponential given in 2.5, then 2.11 gives a bivariate exponential, henceforth referred to as the BVE(?1,?2,a), f(x1,x2) = p?1e??1x1?(x2 ?ax1) + (1?p)?1?2e??2x2e?(?1?a?2)x1I(x2 >ax1), (2.13) where p = a?1/?2 = a?2/?1 = corr(X1,X2). Note that cov(X1,X2) = a/?21. The density given in 2.13 differs from the one presented in [21, Iyer, Manjunath, and 23 Manivasakan (2002)] and [22, Iyer and Manjunath (2004)] due to an error in their derivation. The joint cumulative distribution function (JCDF) for the exponential given in 2.13 can be written as F(x1,x2) = p?1 integraldisplay x1 0 integraldisplay x2 0 e??1u?(v?au)dvdu +(1?p)?1?2 integraldisplay x1 0 integraldisplay x2 au e??2ve?(?1?a?2)udvdu = p?1 integraldisplay x1 0 braceleftbigg e??1u integraldisplay x2 0 ?(v?au)dv bracerightbigg du +(1?p)?1 integraldisplay x1 0 e?(?1?a?2)u(e?a?2u ?e??2x2)du = p?1 integraldisplay x1 0 e??1udu +(1?p)?1 integraldisplay x1 0 e??1udu?(1?p) ?1(? 1 ?a?2) e??2x2(1?e?(?1?a?2)x1). Recalling that p = a?2/?1, we get F(x1,x2) = (1?e??1x1) +e??2x2(e?(?1?a?2)x1 ?1),x2 ?ax1, (2.14) which we see is discontinuous at y = ax. Note that the expression in (11) is the exact expression given in [22, Iyer et al. (2004)]. Similarly, sinceS(x1,x2) = 1+F(x1,x2)? F(?,x2)?F(x1,?), the joint survival function (JSF) is given as S(x1,x2) = e??2x2e?(?1?a?2)x1. (2.15) 24 If we let X(1) = min{X1,X2}, then from 2.15 we see that P(X(1) >t) = P(X1 >t,X2 >t) = S(t,t) = e??2te?(?1?a?2)t = e?(?1?a?2+?2)t which is the survival function for an exponential(?1 ?a?2 +?2). Further, with minor adaptation, X1 and X2 are said to have a joint distribution with Weibull minimums after arbitrary scaling, as defined in [27, Lee (1979)]. In Theorem 4, below, we give the maximum likelihood estimators of ?1 and ?2 based on the joint likelihood expression. We refer to these estimators as ??1 and ??2 and these will be compared to the marginal MLE?s, denoted as ???1 and ???2, which we give immediately following the proof of Theorem 2. We define the marginal MLE?s as those estimators that maximize the univariate marginal likelihood functions sepa- rately for ?1 and ?2. [26, Lawless (2002)] refers to the analysis of the marginal MLE?s as assuming ?working independence?. As we will show later, assuming working inde- pendence comes at a cost in terms of mean-squared-error. Theorem 2.2 For a given random sample of size n, (x1i,x2i),i = 1,...,n,, from a bivariate exponential(?1,?2), the joint maximum likelihood estimators of (?1,?2) is (??1,??2), where ??1 = a ?x2 + (n?k) n?x1 and ??2 = 1 ?x2 (2.16) k = summationtextni=1I(x2 ?ax1 = 0), ?x1 = summationtextx1/n and ?x2 = summationtextx2/n. Also, ??prime = (??1,??2)prime is approximately bivariate normal with mean vector ? and variance/covariance matrix 25 ?, where ? = ? ???1 ?2 ? ?? and ? = 1 n ? ???1(?1 ?a?2) +a2?22 a?22 a?22 ?22 ? ?? (2.17) and Corr(??1,??2) = ((1?p)/p2 + 1)?1/2. Proof. For a given pair of bivariate exponential random variables, (x1i,x2i),i = 1,...,n, it is easy to see that maximum likelihood estimation under the likelihood function L(?1,?2|x1i,x2i) is equivalent to the maximum likelihood estimation under L(?1,?2|x1i,zi), since x2i = axi1+zi. From 2.13, the likelihood function for a random sample of size n of pairs (x1i,zi) for 1 ?i?n is given by L(?1,?2) = nproductdisplay i=1 parenleftbigp? 1e??1x1i parenrightbigriparenleftbig(1?p)? 1?2e??1x1i??2zi parenrightbig1?ri = nproductdisplay i=1 bracketleftbiga? 2e??1x1i bracketrightbigribracketleftbig(? 1 ?a?2)?2e??1x1ie??2zi bracketrightbig1?ri = (a?2) P irie??1 P irix1i bracketleftbig(? 1 ?a?2)?2 bracketrightbigPi(1?ri) ?e??1 P i(1?ri)x1ie??2 P i(1?ri)zi. 26 where ri = I(zi = 0). The log-likelihood is given as LL(?1,?2) = log(a?2) summationdisplay i ri ??1 summationdisplay i rix1i +logbracketleftbig(?1 ?a?2)?2bracketrightbig summationdisplay i (1?ri)??1 summationdisplay i (1?ri)x1i ??2 summationdisplay i (1?ri)zi = log(a) summationdisplay i ri +nlog(?2)??1 summationdisplay i rix1i +log(?1 ?a?2) summationdisplay i (1?ri)??1 summationdisplay i (1?ri)x1i ??2 summationdisplay i zi, since summationtext(1?ri)zi =summationtextzi. The partial derivative with respect to ?1 is given as ?LL ??1 = ? summationdisplay i rix1i + summationtext i(1?ri) ?1 ?a?2 ? summationdisplay i (1?ri)x1i = summationtext i(1?ri) ?1 ?a?2 ? summationdisplay i x1i. Setting ?LL/??1 = 0 gives the following likelihood equation ?1 ?a?2summationtext i(1?ri) = 1summationtext ix1i . (2.18) The partial derivative with respect to ?2 is given by ?LL ??2 = n ?2 ? asummationtexti(1?ri) ?1 ?a?2 ? summationdisplay i zi = n? 2 ?a summationdisplay i x1i ? summationdisplay i zi, by substituting the right hand side of 2.18 for the left hand side. 27 Setting ?LL/??2 = 0 we get the second likelihood equation n ?2 = a summationdisplay i x1i + summationdisplay i zi (2.19) Solving for ?2 in 2.19, we have ?2 = nasummationtext ix1i + summationtext izi = 1 a P ix1i n + 1 n summationtext izi = 1a?x 1 + ?z . Solving for ?1 in 2.18 and substituting the above value of ?2 gives ?1 = aa?x 1 + ?z + summationtext i(1?ri)summationtext ix1i = aa?x 1 + ?z + summationtext i(1?ri) n?x1 . Since ?x2 = a?x1 + ?z and k = summationtext(1 ?ri), it follows that the estimators in 2.16 are the solutions to the likelihood equations. It easily follows that the Hessian matrix is given by H(?1,?2) = parenleftbigg?2LL(? 1,?2) ??i??j parenrightbigg = ? ?? ?? ?? ?? ? ? P i(1?ri) (?1?a?2)2 a P i(1?ri) (?1?a?2)2 a P i(1?ri) (?1?a?2)2 ? n ?22 ?a 2 P i(1?ri) (?1?a?2)2 ? ?? ?? ?? ?? ? . and det(H(??1,??2)) = n(n?k)? ?22(??1 ?a??2) > 0, 28 since ??1 ?a??2 = (n?k)/(n?x1) > 0. Therefore, estimators in 2.16 are the maximum likelihood solutions. Sincekis distributed as a Binomial(n,1?p), i.e.,E(k) = n(1?p), where p = a?2/?1 and ?1?a?2 = ?1(1?a?2/?1), Fisher?s Information (see page 546, [26, Lawless (2002)]) is given as I(?1,?2) = E parenleftbigg ?? 2LL(?1,?2) ??i??j parenrightbigg = ? ?? ?? ?? ?? ? n ?1(?1?a?2) ? a?n ?1(?1?a?2) ? a?n?1(?1?a?2) n?2 2 + a2?n?1(?1?a?2) ? ?? ?? ?? ?? ? . Therefore, from [26, Lawless (2002)], ?? is approximately normal with mean ?prime = (?1,?2)prime and variance/covariance matrix ? = I?1(?1,?2). As mentioned previously, alternatives to the joint MLE?s given in 2.16 can by found by maximizing the marginal likelihood expressions separately for ?1 and ?2. These marginal MLE?s are well-known (see page 54 of [26, Lawless (2002)]) and are given as ???1 = 1 ?x1 and ???2 = 1 ?x2. (2.20) We observe that the marginal and joint MLE?s for ?2 are identical, i.e., ???2 ? ??2, but for ?1 the MLE?s are quite different for this model. 29 We can adapt the likelihood for censored data. Suppose (X1i,X2i), i = 1,...,n represents a random sample of size n from a population with joint survival func- tion given in (11). If the observations are subject to right censoring with poten- tial censoring times C1i and C2i, then the data will come in the form of (x1i,x2i), x1i = min(X1i,C1i) and x2i = min(X2i,C2i), i = 1,...,n. Then, following [26, Law- less (2002)], the likelihood function for a given observation can be expressed as L?(?1,?2) = L(?1,?2)?1i?2i bracketleftbigg??S ?x1i bracketrightbigg?1i(1??2i)bracketleftbigg??S ?x2i bracketrightbigg(1??1i)?2i ?S(1??1i)(1??2i), where S = S(x1i,x2i) is defined in 2.14, L(?1,?2) is defined in the proof of Theorem 2, and ?ji = 1 if the ijth data value is not censored and zero otherwise. 2.1.2 Linearly Associated Bivariate Weibull If Xi is exponential random variable with hazard rate ?i, i = 1,2 and pdf given in 2.5, then for a fixed ? > 0, it is well known that Yi = X1/?i is distributed as Weibull(?i,?) random variable with pdf given as, fYi(y) = ??iy??1e??iy?I(y> 0),i = 1,2 (2.21) FromthisonecouldderiveajointWeibull(?1,?2,?)withmariginals, Weibull(?i,?),i = 1,2, with the following linear relationship Y?2 = aY?1 +Z? (2.22) 30 Since the marginal are Weibull with pdf given in 2.21, we know that (see page 18, Lawless (2002)) EYri = ??r/?i ?(1+r/?) and ?2yi = 1 ?2/?i bracketleftbig?(1+ 2 ?)?? 2(1 + 1 ?) bracketrightbig, i = 1,2. (2.23) In Theorem 3, below, we give the derived bivariate Weibull, based on the struc- ture in 2.22, along with its covariance structure. The proof of this theorem involves the ?-function and its properties given in (2.7), (2.8) and (2.9). Theorem 2.3 Suppose (X1,X2) has the joint distribution given in 2.5, with expo- nential marginal given in 2.5. Let Yi = X1/?i ,i = 1,2, ? > 0. Then the joint density of (Y1,Y2) is given as f(y1,y1) = p?1?y??11 e??1y?1?(y1,y2)+(1?p)?1?2?2y??11 y??12 e??2y?2e?(?1?a?2)y?1I(y1,y2), (2.24) where ?(y1,y2) = ?(y2 ?a1/?y1) and I(y1,y2) = I(y2 >a1/?y1). The marginal distri- bution of Yi is given in 2.21, i = 1,2 and Cov(Y1,Y2) = a 1+ 1?? 2?(1+ 2/?) ?1+2/?1 + ?2 (?1 ?a?2)1? integraldisplay ? 0 IG(g(y2),1 + 1/?)fY2(y2)dy2 ?? 2(1 +1/?) (?1?2)1/? where g(y2) = (?1 ?a?2)y?2/a and IG(x,k) = integraltextx0 tk?1e?tdt as defined on page 25 of Lawless (2003). 31 Proof. To derive the density in 2.24, we follow the transformation approach given in [25, Khuri (2004)], by using the appropriate ?-functions as follows, f(y1,y2) = integraldisplay ? ?? integraldisplay ? ?? f(x1,x2)?(x1/?1 ?y1)?(x1/?2 ?y2)dx2dx1 = p integraldisplay ? ?? integraldisplay ? ?? fX1(x1)?(x2 ?ax1)?(x1/?1 ?y1)?(x1/?2 ?y2)dx2dx1 +(1?p) integraldisplay ? ?? integraldisplay ? ?? fX1(x1)fX2(x2 ?ax1)?(x1/?1 ?y1)?(x1/?2 ?y2)dx2dx1 = p Part 1+ (1?p) Part 2 Since ?(x2 ?ax1)?(x1/?2 ?y2) = ?(x2 ?ax1)?((ax1)1/? ?y2), we have Part 1 = integraldisplay ? ?? integraldisplay ? ?? fX1(x1)?(x1/?1 ?y1)?(x2 ?ax1)?((ax1)1/? ?y2)dx2dx1 = integraldisplay ? ?? braceleftbigg ?(x1/?1 ?y1)?(x2 ?ax1)?((ax1)1/? ?y2)fX1(x1) integraldisplay ? ?? ?(x2 ?ax1)dx2 bracerightbigg dx1 = integraldisplay ? ?? ?(x1/?1 ?y1)?((ax1)1/? ?y2)fX1(x1)dx1 = integraldisplay ? ?? ?(x1/?1 ?y1)?(a1/?y1 ?y2)fX1(x1)dx1 = ?(a1/?y1 ?y2) integraldisplay ? ?? ?y??11 fX1(x1)?(x1 ?y?1)dx1 = ?y??11 fX1(y?1)?(y2 ?a1/?y1). 32 Similarly, Part 2 = integraldisplay ? ?? integraldisplay ? ?? fX1(x1)fX2(x2 ?ax1)?2y??11 y??12 ?(x1 ?y?1)?(x2 ?y?2)dx2dx1 = ?2y??11 y??12 integraldisplay ? ?? fX1(x1)fX2(y?2 ?ax1)?(x1 ?y?1)dx1 = ?2y??11 y??12 fX1(y?1)fX2(y?2 ?ay?1). Putting together Parts 1 and 2 above we get f(y1,y2) = p?y??11 fX1(y?1)?(y2 ?a1/?y1) (2.25) +(1?p)?2y??11 y??12 fX1(y?1)fX2(y?2 ?ay?1)I(?) I(?) is an indicator function for all {(y1,y2) : y2 ? a1/?y1}. Since Xi, i = 1,2, have densities given in 2.5, the joint density given in 2.24 follows directly from 2.25 above. Given that X1 and X2 have the density given in 2.5, the result follows directly. Now, to derive the covariance expression we must find E(Y1 ?Y2) = integraldisplay ? 0 integraldisplay ? 0 y1y2fY1,Y2(y1,y2)dy2dy1 = Part A + Part B 33 where Part A = p integraldisplay ? 0 integraldisplay ? 0 ?1?y1y2y??11 e??1y?1?(y1,y2)dy2dy1 = integraldisplay ? 0 y1 ?(a1/?y1)?p??1y??11 e??1y?1dy1 (by ?-function property (6)) = p??1a1/? integraldisplay ? 0 y21y??11 e??1y?1dy1 = pa1/?E(Y22 ) = a1+1/???2/??11 ?2?(1+ 2/?) (by (20) and p = a?2/?1) and Part B = integraldisplay ? 0 integraldisplay ? 0 y1y2fY1,Y2(y1,y2)I(y?2 >ay?1)dy1dy2 = (1?p)??1?2 integraldisplay ? 0 y?2e??2y?2dy2 integraldisplay y2/a1/? 0 ?y?1e?(?1?a?2)y?1dy1 = (1?p)?1?2? 1 ?a?2 integraldisplay ? 0 ?y?2e??2y?2dy2 integraldisplay y2/a1/? 0 ?(?1 ?a?2)y1y??11 e?(?1?a?2)y?1dy1 = (1?p)?1?2? 1 ?a?2 integraldisplay ? 0 ?y?2e??2y?2dy2 1(? 1 ?a?2)1/? integraldisplay (?1?a?2) a y ? 2 0 t1/?e?tdt = (1?p)?1?2 (?1 ?a?2)1+ 1? integraldisplay ? 0 ?y?2e??2y?2dy2 integraldisplay (?1?a?2) a y ? 2 0 t1/?e?tdt. Substituting Parts A and B into the E(Y1 ? Y2) expression above and subtracting E(Y1)?E(Y2) from 2.23, the covariance expression follows. square Henceforth, we will refer to the bivariate Weibull density given in 2.23 as a BVW(?1,?2,?,a). The covariance structure given in Theorem 3 for the proposed BVW(?1,?2,?,a) is quite complicated and not in closed-form. Similarly complicated structures are found with the multivariate Weibull proposed by [17, Hougaard (1986)] 34 and [31, Lu and Bhattacharyya (1990)], as well as, the bivariate Weibull derived from [32, Marshall and Olkin (1967)] by taking the identical transformation. The correlation between Y1 and Y2, denoted ?(Y1,Y2) is found as ?(Y1,Y2) = Cov(Y1,Y2)radicalbig Var(Y 1) radicalbigVar(Y 2) = parenleftbigga? 2 ?1 parenrightbigg1+ 1 ? ?(1+ 2?) ?(1+ 2?)??2(1 + 1?) ? ?2(1 + 1?) ?(1+ 2?)??2(1 + 1?) + ?(1+ 1 ?)?2 ?(1+ 2?)??2(1 + 1?) bracketleftbigg ? 1?2 (?1 ?a?2) bracketrightbigg1 ? integraldisplay ? 0 ?y?2e??2y?2IG(g(y2),1 + 1/?)fY2(y2)dy2. Similar to the likelihood for the exponential, we see that the likelihood for a given pair (y1,y2) is given as L(?1,?2,?) =parenleftbiga?2?y??11 e??1y?1parenrightbigr parenleftBig (?1 ?a?2)?2?2y??11 y??12 e??2y?2e?(?1?a?2)y?1 parenrightBig1?r , (2.26) where r = I(y2 ?a1/?y1 = 0). The likelihood in (23) will lead to joint maximum likelihood estimators of (?1,?2,?) given as (??1,??2,??), where ??1 = a ?y?2 + (n?k) n?y?1 , ??2 = 1 ?y?2, and ?? (2.27) where ?y?j = 1nsummationtextni=1y??ji,i = 1,2,k =summationtextni=1I(y??2 ?ay??1 = 0). The estimator ?? represents the solution to the third likelihood equation found by differentiating the log-likelihood with respect to ?, plugging the above estimators ??1 and ??2 and numerically solving for ?. 35 The JCDF can be found directly from (11) as: F(y1,y2) = FX1,X2(y?1,y?2) = (1?e??1y?1 ) +e??2y?2 (e?(?1?a?2)y?1 ?1), (2.28) for y2 ? a?y1. From (25), it is easy to verify that the marginal have the Weibull distribution given in (18) by noting that: FY1(y1) = limy 2?? F(y1,y2) = (1?e??1y?1 ) FY2(y2) = limy 1?? F(y1,y2) = F(y2/a1/?,y2) = (1?e??2y?2 ) Similarly, the JSF is given as: S(y1,y2) = e??2y?2e?(?1?a?2)y?1 (2.29) Note that if we let Y(1) = min{Y1,Y2}, then (26) gives P(Y(1) >t) = P(Y1 >t,Y2 >t) = S(t,t) = e??2t?e?(?1?a?2)t? = e?(?1?a?2+?2)t? which is the survival function for an Weibull(?1?a?2+?2,?). It is easy to show that Y1 and Y2 have a joint distribution with Weibull minimums after arbitrary scaling, as defined in Lee (1979). 36 2.2 Bivariate Lifetime Distributions Based on Random Hazards [17, Hougaard (1986)] provides a general method of constructing bivariate failure timedistributionswherebothcomponentsinasystemareaffectedbyrandomhazards. Lemma 2.4 Let ?1(t) and ?2(t) be two arbitrary hazard functions, and ?1(t) and ?2(t) be the corresponding cumulative hazard functions, where ?j(t) = integraltextt0 ?j(x)dx. Let Tj, j = 1, 2, be conditionally independent lifetimes given a specific quantity Z. The marginal hazard of Tj is Z?j, and its cumulative hazard function is Z?j. Then the conditional bivariate survivor functionP (T1 ?t1,T2 ?t2|Z = z) = exp(?z?.), where ?. = ?1 (t)+?2 (t). Proof. The conditional joint survivor function of Tprimejs given Z is S(t1,t2|Z = z) = P (T1 >t1,T2 >t2|Z = z) = 2productdisplay j=1 S(tj|Z = z) = 2productdisplay j=1 exp bracketleftbigg ? integraldisplay t 0 z?j(x)dx bracketrightbigg = 2productdisplay j=1 exp bracketleftbigg ?z integraldisplay t 0 ?j(x)dx bracketrightbigg = 2productdisplay j=1 exp[?z?j (t)] = exp bracketleftBigg ?z 2summationdisplay j=1 ?j (t) bracketrightBigg = exp(?z?.) where ?. =summationtextnj=1 ?j (t) Definition 1 [18, Hougaard (2000)] defines the positive stable distribution as: Let Xi, i = 1,...n, be independent, identically distributed random variables with positive supports. If there exists a scale factor function c(n) having the form n1/?, 37 ?? 1, such that c(n)X =D X1 +...+Xn, where =Ddenotes having the same distribution as. The the distribution ofX is called a positive stable distribution with two parameters ? and ? given by the Laplace transform L(s) = exp(??s?/?). Theorem 2.5 Let Tj, j = 1, 2, be conditionally independent lifetimes given a specific quantity Z, which in term has a positive stable distribution with parameter ? given by the Laplace transform E{exp(??Z)} = exp(???), ?? (0,1], The marginal hazard of Tj is Z?j, and ?j(t) =integraltextt0 ?j(x)dx is the cumulative hazard. Then the unconditional bivariate survivor function P (T1 ?t1,T2 ?t2) = exp(??.?) 38 Proof. Assuming some regularity conditions and by lemma 1, we have the following result P (T1 ?t1,T2 ?t2) = ??integraldisplayintegraldisplay t1t2 f (x1,x2)dx1dx2 = ??integraldisplayintegraldisplay t1t2 ? ? ?integraldisplay ?? f (x1,x2,z)dz ? ?dx1dx2 = ??integraldisplayintegraldisplay t1t2 ? ? ?integraldisplay ?? f (x1,x2|z)f (z)dz ? ?dx1dx2 = ?integraldisplay ?? ? ? ??integraldisplayintegraldisplay t1t2 f (x1,x2|z)dx1dx2 ? ?f (z)dz = ?integraldisplay ?? P (T1 >t1,T2 >t2|Z = z)f (z)dz = E[P (T1 >t1,T2 >t2|Z = z)] = E[exp(?z?.)] = exp(??.?) 2.2.1 The Bivariate Weibull Model (BVW) of Random Hazards [17, Hougaard (2000)] derives a bivariate Weibull distribution with common shape parameter? such that the arbitrary hazard rate?j(x) = epsilon1j?t??1. The marginal distributions are also Weibull with common shape ??, 39 Using more conventional parameterizations, [31, Lu and Bhatacharyya (1990)] derive the same model without the assumption of conditional independence and com- mon shape parameter, and render the survivor function as S(t1,t2) = P (T1 ?t1,T2 ?t2) (2.30) = exp braceleftBigg ? bracketleftBiggparenleftbigg t1 ?1 parenrightbigg?1/? + parenleftbiggt 2 ?2 parenrightbigg?2/?bracketrightBigg?bracerightBigg , 0 0. Many of the widely used statistical distributions belong to such a family of distributions. Examples of distributions that belong to the location-scale family are normal distribution, exponential distribution, double exponential distri- bution, Cauchy distribution, logistic distribution, and uniform distribution, etc. [34, Meeker and Escobar (1998)] emphasizes the importance of the widely used location- scale family with respect to its adaptivity and simplicity. Definition 2 A group family of distributions is a family obtained by applying a suit- able family of transformations to a random variable with a fixed distribution [29, Lehmann and Casella (1998)]. 45 Definition 3 Let f (x) be a pdf. Then the family of pdfs 1?f(x??? ), ?? < ? < ?, ?> 0,is called the location-scale family with standard pdff (x); ?is called the location parameter and ? is called the scale parameter. [2, Casella and Berger (2002)] The following three transformations result in three families of distributions, i.e., location families, scale families, and location-scale families. Examples of distributions that belong to the location-scale family are normal distribution, exponential distribu- tion, double exponential distribution, Cauchy distribution, logistic distribution, and uniform distribution, etc. Let U be a random variable with a fixed distribution FU (u) with pdf fU (u) and let ? , the location parameter, and ? > 0 ,the scale parameter, be any given constants. Then the random variables X = ? + U, X = ?U, and X = ? + ?U have distributions FX(x??), FX(x/?), and FX(x??? ) with fX?s equal to fU(x??), 1 ?fU(x/?) and 1 ?fU( x?? ? ), which constitute a location family, a scale family, and a location-scale family, respectively. [29, Lehmann and Casella (1998)] states that the families of transformations for the above location-scale family distributions are closed under composition and inversion. Let the continuous random vector U =(U1,...,Un)prime have a fixed joint distribution FUprime (uprime). The random vector X = ?+?U, where ? is an n?1 constant vector and ? is an n?n diagonal matrix of constants with the diagonal entry being ?i, i=1, ..., n, and off-diagonal entries being zero. The marginal pdf of Xi is fXi (x) = 1?ifUi(xi??i?i ). 46 The joint pdf of the random vector Xprime=(X1,...,Xn) is given by fXprime (xprime) = 1producttextn i=1 (?i) fUprime parenleftbiggx 1 ??1 ?1 ,..., xn ??n ?n parenrightbigg . which defines a multivariate location-scale family of distributions. More specifically, when n=2, we have the bivariate case. The bivariate joint pdf of Xprime=(X1,X2) is given by fX1,X2 (x1,x2) = 1? 1?2 fU1,U2 parenleftbiggx 1 ??1 ?1 , x2 ??2 ?2 parenrightbigg . Since the bivariate location-scale transformations are one-to-one, the proof of the above proposition can be readily obtained using Jacobian of the transformation. Also it can be shown that the three properties of the bivariate distribution function reiterated in [23, Joe (1997)] are satisfied. 1. limui??SU1,U2 (u1,u2) = 0, i = 1,2; 2. limui???iFU1,U2 (u1,u2) = 1; and 3. If FU1,U2 has second-order derivatives, ?2FU1,U2/?u1?u2 ? 0 (the rectangle in- equality). A proof of the proposition for the specific case of BVW is given in the next section. 47 3.1 Bivariate Location-Scale Family Based on BVW with Random Haz- ards Recall the BVW distribution defined by (2.30), its joint pdf is given by fT1,T2 (t1,t2) = (?1) 2?2S T1,T2 (t1,t2) ?t1?t2 = ? 2 ?t1?t2 exp braceleftBigg ? bracketleftBiggparenleftbigg t1 ?1 parenrightbigg?1/? + parenleftbiggt 2 ?2 parenrightbigg?2/?bracketrightBigg?bracerightBigg = ?1? 1 ?2 ?2 parenleftbiggt 1 ?1 parenrightbigg?1 ? ?1 parenleftbiggt 2 ?2 parenrightbigg?2 ? ?1 bracketleftBiggparenleftbigg t1 ?1 parenrightbigg?1 ? + parenleftbiggt 2 ?2 parenrightbigg?2 ? bracketrightBigg??2 ? braceleftBiggbracketleftBiggparenleftbigg t1 ?1 parenrightbigg?1/? + parenleftbiggt 2 ?2 parenrightbigg?2/?bracketrightBigg? + 1??? bracerightBigg ?exp braceleftBigg ? bracketleftBiggparenleftbigg t1 ?1 parenrightbigg?1/? + parenleftbiggt 2 ?2 parenrightbigg?2/?bracketrightBigg?bracerightBigg 0 0 is the scale parameter, ?j is the corresponding regression coefficient vector, and the errors ?1, ?2 have a joint distribution specified by a copula function C? with an association parameter ?. [19, He and Lawless (2005)] shows that estimators of the regression coefficients ??1 and ??2 are consistent estimators and robust to misspecification of the marginal distributions of the errors. Also, they examine the relative efficiency for using the bivariate model to estimate?1 and?2 compared with using the marginal distributions. 61 In their simulation study, a specific bivariate location-scale regression model as proposed in [5, Clayton (1978)] is studied. The model is defined by the joint survivor distribution H?(?1,?2) = bracketleftBig S1 (?1)?1/? +S2 (?2)?1/? ?1 bracketrightBig?? , ?> 0 (3.19) where the survivor functions S1 (?1) and S2 (?2) define the location-scale marginal distributions of Yj = ?j0 +xprimej?j +?j?j. 62 Chapter 4 Simulation Study The simulation study is focused on maximum likelihood estimation of several bivariate models. Bivariate data generation, maximum likelihood estimation and statistical properties are studied, and validation examples are provided. 4.1 Linearly Associated Bivariate Exponential and Weibull Models 4.1.1 Bivariate Data Generation To generate the linearly associated bivariate data set, (Xi1,Xi2),i = 1,...,n, of BVE(?1,?2,a) as defined in section 2.1.1, we utilize the fact that X1 and Z are in- dependent of each other and generate the two random variables first. X1 ? exp(?1) is generated with the SAS exponential random number generator. Since Z is the product of of a Bernoulli random variable with P(Z = 0) = a?2/?1 and a continu- ous random variable having the same distribution as X2 ? exp(?2), it is generated using the SAS Bernoulli random number generator and the SAS exponential random number generator. By the linear association defined in 2.4, X2 is then generated as aX1 +Z. The bivariate data set of BVW(?1,?2,?,a) as defined in section 2.1.2 can be readily obtained by a power transform of the BVE data set, i.e., letting Yij = X1/?ij , j = 1, 2. 63 4.1.2 Simulation Results Table 4.1 summarizes the results of a simulation study where 10,000 simulated samples of size 25 pairs from a BVE(1,1,a) were generated. Several other simulations were conducted using various combinations of ?1, ?2 andaand similar results were found as given in Table 1. Note that since?1 = ?2 = 1, Corr(X1,X2) = a?2/?1 = a. The empirical MSE was computed for both ??1, the estimator based on the joint likelihood given in (25), and ???1, the usual maximum likelihood estimator based on the marginal distribution. Also computed, were the maximum likelihood estimators of the ?, given as ?? = a??2/??1 and ??? = a???2/???1. Percent MSE improvement was computed as (MSE(???1)?MSE(??1))/MSE(???1)?100%. The joint MLE estimator ? gave MSE improvement over the marginal MLE for all values of ?. Interestingly, percent improvement is a concave function of ?, with maximum occurring at ? = 0.5, giving over 25%. The joint MLE for the correlation coefficient gives monotonically increasing percent improvement over the estimator based on the marginal MLE?s, with 44% improvement when ? = 0.99.. Therefore, ignoring the multivariate relationship between X1 and X2 comes at a significant cost with respect to MSE. Similarly, Table 4.2 summarizes the results of a simulation study where 10,000 simulated samples of size 25 pairs from a BVW(1,1,?,a), ? = 0.5,1,1.5,2.0 were 64 generated. The empirical MSE was computed for both ??1, the estimator based on the joint likelihood given in (25), and ???1, the usual maximum likelihood estimator based on the marginal distribution. Similar patterns of MSE improvement emerge for the bivariate Weibull as in the bivariate exponential case, for all values of ?. 4.2 Bivariate Location-Scale Models 4.2.1 Data Generation for BVW of Hougaard?s Model [27, Lee (1979)] and [31, Lu and Bhattacharyya (1990)] show that (X1,X2) of BVW defined by 2.30 can be represented by two independent random variables (U, V) as X1 = U?/?1V1/?1?1, X2 = (1?U)?/?2 V1/?2?2, where U ? Uniform(0,1), and V is distributed as the mixture of a standard expo- nential and standard Gamma(2). The pdf of V is given by f (v) = ?vexp(?v) + (1??)exp(?v), v> 0. Sowestartbygenerating(U, V).U isgeneratedbySASuniformrandomnumber generator. V is obtained by generating other four standard uniform random variables u1,...u4, and using the logarithm transform as on page 248 of [2, Casella and Berger 65 Mean Squared Error (MSE) ? ???1 ??1 %-imp ???1 ??1 %-imp 0.01 0.04585 0.04621 0.782 0.00001 0.00001 0.529 0.05 0.04522 0.04723 4.251 0.00021 0.00022 2.990 0.10 0.04512 0.05005 9.849 0.00077 0.00082 6.410 0.20 0.04118 0.04877 15.570 0.00252 0.00284 11.243 0.30 0.03930 0.04937 20.406 0.00456 0.00546 16.563 0.40 0.03627 0.04749 23.615 0.00634 0.00815 22.176 0.50 0.03757 0.04834 22.293 0.00741 0.01011 26.710 0.60 0.03771 0.04982 24.307 0.00807 0.01169 31.014 0.70 0.03799 0.04713 19.384 0.00763 0.01131 32.517 0.80 0.04210 0.04913 14.311 0.00611 0.00966 36.764 0.90 0.04326 0.04675 7.465 0.00343 0.00574 40.287 0.99 0.04939 0.04978 0.796 0.00039 0.00067 42.560 Based on 10,000 simulated samples Table 4.1: Simulation Study for BVE(?1 = ?2 = 1, n = 25) Percent-Improvement in MSE a ? = 0.5 ? = 1 ? = 1.5 ? = 2 ? = 10 0.01 1.864 0.152 0.141 0.233 0.490 0.05 4.904 4.836 4.568 3.986 1.973 0.1 5.987 5.378 2.763 4.352 5.882 0.2 4.285 5.021 7.807 4.352 10.089 0.3 9.974 9.035 12.566 5.988 6.809 0.4 9.408 15.852 20.258 10.532 11.684 0.5 15.240 12.699 15.718 18.472 18.161 0.6 16.083 10.914 16.048 13.707 14.489 0.7 7.228 12.788 12.214 13.676 12.571 0.8 9.946 10.300 9.967 10.299 9.189 0.9 4.922 6.716 4.595 5.940 6.174 0.99 0.367 0.478 0.590 1.118 1.499 Based on 10,000 simulated samples of size n = 25. Table 4.2: Simulation Study BVW(?1 = ?2 = 1) 66 (2002)] such that V = ? ?? ?? ?lnu1 ?lnu2, if u4 ?? ?lnu4, if u4 >? where (?lnu1 ?lnu2) ?gamma(2), and ?lnu4 ? exp(1) 4.2.2 Data Generation for BVW of FGM Model (Sequential Monte Carlo Simulation) To generate bivariate data set, (Xi1,Xi2),i = 1,...,n, of the FGM BVW as defined in 2.34, we first generate X1 ?Weibull(?1,?1) by setting its marginal CDF, FX1, equal to a random number of standard Uniform distribution, then xi1 = ?1 [?log(1?ui1)]1/?1 , where ui1 ?Uniform(0,1). The censoring indicator ?i1 is then determined by com- paring xi1 with the censoring value ci1. X2 is generated sequentially by setting its conditional CDF, FX2|X1, qual to a random number of standard Uniform distribution. The conditional CDF is given by FX2|X1 = FX1,X2 (x1,x2)F X1 (x1) (4.1) = FX1 (x1)FX2 (x2)[1 +?(1?FX1 (x1))(1?FX2 (x2))]F X1 (x1) = FX2 (x2)[1 +?(1?FX1 (x1))(1?FX2 (x2))] = (1?V)(1 +?WV) 67 where V = exp parenleftbigg ? parenleftBig x2 ?2 parenrightBig?2parenrightbigg ,W = exp parenleftbigg ? parenleftBig x1 ?1 parenrightBig?1parenrightbigg . Setting (4.1) equal to a standard uniform random number, then we have a quadratic equation in terms of V ?WV2 + (1??W)V + (ui2 ?1) = 0, where ui2 ?Uniform(0,1) is independent of ui1. By Quadratic Formula and V ? 0, V = (?W ?1) + radicalBig (1??W)2 ?4?W (ui2 ?1) 2?W (4.2) = 12 + radicalBig (1 +?W)2 ?4?Wui2 ?1 2?W , and then X2 is given by xi2 = ?2 [?log(V)]1/?2 . The censoring indicator ?i2 is then determined by comparing yi2 with the censoring value ci2. 4.2.3 Simulation Settings and Results for Bivariate Location-Scale Mod- els Simulation Settings 1. Data sets are generated by methods listed in section 4.2.1 and 4.2.2; 68 2. Sample size, n, is set at 25; 3. Simulation iteration is set to 500; 4. True parameter values are ?1 = 0.5, ?2 = 2, ?1 = ?2 = 10.(equal scale parame- ters); 5. Depedence parameters are set as ? = 0.5, ? = 0.5 for the RE and FGM models, respectively. Both the joint MLEs and working independence MLEs are obtained and compared against each other. Percent improvements/losses in terms of absolute bias and empirical MSE (mean squared errors) are also calculated. We found mixed turnout of improvements and losses. The Random Effect(Hougaard) Model Simulation Results Table 4.3 summarizes maximum likelihood estimation results based on the joint and the working independence models. Tables 4.4 and 4.5 summarize the percent- age improvement/losses obtained by comparing joint mle?s against working mle?s in terms of their biasses and Empirical Mean Squared Errors(EMSE). Table 4.5 shows overall improvement in MSE for mle?s of the shape parameters ?1 and ?1. But for mle?s of the scale parameters, no such pattern is found. It is expected as stated in [24, Johnson, Evans and Green (1999)] that neither the sample correlation nor the 69 population correlation depends on the values of the underlying scale parameters of the marginal distributions. The FGM Model Simulation Results Table 4.6 summarizes maximum likelihood estimation results based on the joint and the working independence models. Tables 4.7 and 4.8 summarize the improve- ments/losses in biasses and EMSE obtained by comparing the joint mle?s against working mle?s. Table 8 shows overall improvement in MSE only for mle?s of ?1 and slightly loss in ?2. 4.3 Example: DMBA-Induced Tumors Table 4.9 contains the first and second tumor times for 30 control and 30 treated animals, simulatedasBVW(0.000009,0.000005,? = 3)andBVW(0.000001,0.0000005,? = 3), respectively, with a = 1 in both cases. We see that there were 18 and 15 simul- taneous tumors for the control and treated animals, respectively. We first estimated the shape parameter based on the marginal Weibull likelihoods as ?? = 2.93. Then we compute the scale parameter estimates for each population using the estimates given in (24). This yielded ?? = (0.00001287,0.00000776) and (0.00000179,0.00000082) for the controls and treated animals, respectively. 70 ? ??1 ??2 ??1 ??2 ???1 ???2 ???1 ???2 0.1 0.52768 2.11475 10.5071 9.9578 0.52775 2.11411 10.4546 9.9468 0.2 0.52601 2.10351 10.3663 9.9479 0.52588 2.10134 10.3499 9.9376 0.3 0.52724 2.11299 10.5591 9.9889 0.52688 2.10869 10.5284 9.9826 0.4 0.53299 2.11051 10.7050 9.9997 0.53218 2.11411 10.6547 9.9946 0.5 0.53413 2.16718 10.6873 10.0069 0.53587 2.16407 10.6663 9.9937 0.6 0.52750 2.11608 10.3724 9.9060 0.52631 2.11830 10.3399 9.8988 0.7 0.52602 2.09350 10.4076 9.9371 0.52619 2.09460 10.3778 9.9301 0.8 0.53550 2.11476 10.8448 10.0144 0.53444 2.11509 10.8130 10.0098 0.9 0.53219 2.12814 10.6916 9.9758 0.53207 2.12961 10.6822 9.9707 1.0 0.52696 2.10093 10.5949 9.8745 0.52768 2.10355 10.6092 9.8756 Where ??? and ??? are working independence MLEs. Table 4.3: Joint and Working MLEs with ?1 = 0.5, ?2 = 2, ?1 = ?2 = 10 and varying ? ? ?1 bias improv. ?2 bias improv. ?1 bias improv. ?2 bias improv. 0.1 -2477.84 0.24710 -0.56025 -11.5567 0.2 -1968.70 -0.49651 -2.13658 -4.6773 0.3 -9211.99 -1.34267 -3.95333 -5.8183 0.4 -328586.83 -2.51654 3.15616 -7.6807 0.5 -17276.79 4.84207 -1.89587 -3.1514 0.6 -1058.00 -4.51075 1.87806 -9.5496 0.7 -1560.73 0.63741 1.16725 -7.8866 0.8 -7062.51 -3.07633 0.28694 -3.9194 0.9 -4334.31 -0.37612 1.13056 -1.3817 1.0 -853.27 2.60031 2.52758 2.3545 Table 4.4: Bias Improvements/Losses(%) Over the Working Estimates 71 ? ?1 mse improv. ?2 mse improv. ?1 mse improv. ?2 mse improv. 0.1 90.8621 1.61977 1.35134 -1.40621 0.2 81.2788 1.28675 4.61830 -0.01806 0.3 70.9359 4.39391 -1.04580 0.26043 0.4 63.3886 5.83908 6.00907 -0.99596 0.5 58.1423 7.17802 3.45869 1.05814 0.6 44.2761 4.93833 0.76780 -0.44087 0.7 32.2627 4.65393 3.68219 -0.62015 0.8 22.5092 -0.46238 -0.63960 0.20701 0.9 18.3216 0.68599 -1.02801 -0.20352 1.0 17.0853 -0.25471 1.01207 -0.14121 Table 4.5: Empirical MSE Improvements/Losses(%) Over the Working Estimates ? ??1 ??2 ??1 ??2 ???1 ???2 ???1 ???2 0.1 0.53263 2.08660 10.3397 9.76154 0.53228 2.08518 10.3515 9.76080 0.2 0.52865 2.09570 10.4262 9.67449 0.52839 2.09407 10.4271 9.68024 0.3 0.53248 2.05551 10.5498 9.60398 0.53197 2.05479 10.5549 9.61140 0.4 0.52701 2.02371 10.1626 9.39615 0.52585 2.02132 10.1284 9.39784 0.5 0.53353 2.06266 10.8290 9.24950 0.53263 2.05816 10.7823 9.24247 0.6 0.53342 2.02963 10.5895 9.10675 0.53249 2.02805 10.5501 9.10546 0.7 0.52007 2.01353 10.4359 8.95880 0.51912 2.01013 10.3927 8.95503 0.8 0.53457 2.01028 10.7978 8.87893 0.53256 2.00488 10.6808 8.86241 0.9 0.53131 1.99020 10.7238 8.60068 0.52961 1.98629 10.6340 8.60127 1.0 0.52763 2.00859 10.8060 8.50162 0.52544 2.00157 10.7181 8.48742 Where ??? and ??? are working independence MLEs. Table 4.6: Joint and Working MLEs with ?1 = 0.5, ?2 = 2, ?1 = ?2 = 10 and varying ? 72 ? ?1 bias improv. ?2 bias improv. ?1 bias improv. ?2 bias improv. 0.1 -342.920 -1.10145 -1.660 3.3686 0.2 -288.550 -0.92829 -1.731 0.1915 0.3 -205.638 -1.59816 -1.311 0.9308 0.4 -134.917 -4.48150 -11.252 -26.6615 0.5 -120.447 -2.76162 -7.725 -5.9750 0.6 -118.480 -2.84038 -5.632 -7.1674 0.7 -94.343 -4.99189 -33.548 -10.9885 0.8 -104.619 -6.18119 -110.420 -17.1839 0.9 -103.403 -5.74152 28.534 -14.1625 1.0 -119.249 -8.62320 -447.628 -12.2393 Table 4.7: Bias Improvements/Losses(%) Over the Working Estimates ? ?1 mse improv. ?2 mse improv. ?1 mse improv. ?2 mse improv. 0.1 90.4678 -1.16397 0.13321 0.37230 0.2 84.3979 -0.35227 -1.69771 -0.51571 0.3 75.6162 -0.94049 -0.72841 -0.61527 0.4 71.8620 -2.32073 0.54219 -2.55347 0.5 69.7788 -1.72126 -2.74445 -2.78227 0.6 69.4358 -1.16677 -0.15066 -3.16534 0.7 65.3872 -1.84386 0.36968 -1.85708 0.8 65.0311 -1.49496 0.06637 -2.62671 0.9 68.4773 -1.58717 -0.34932 -4.46156 1.0 69.2833 -2.11464 -0.88495 -1.56760 Table 4.8: Empirical MSE Improvements/Losses(%) Over the Working Estimates 73 Control Animals (n = 30,a = 1,k = 18) (28,80), (53,53), (30,47), (25,63), (75,75), (21,21), (55,67), (55,66), (42,43), (79,79), (56,56), (56,56), (42,42), (64,64), (44,56), (39,57), (63,63), (41,41), (28,28), (49,49), (34,34), (10,82), (53,55), (26,26), (43,52), (16,16), (57,86), (56,56), (29,29), (19,19) Treated Animals (n = 30,a = 1,k = 15)) (50,50), (69,69), (53,53), (66,95), (77,77), (102,102), (114,142), (83,83), (63,63), (58,116), (80,137), (122,122), (90,90), (42,65), (106,106), (90,90), (114,114), (117,172), (98,98), (82,82), (22,99), (102,138), (123,147), (61,120), (80,138), (75,142), (78,146), (51,166), (12,180), (147,147), Table 4.9: Bivariate Weibull Times to first and second tumor 74 Bibliography [1] Au, C. and Tam, J. (1999), ?Transforming variables using the dirac generalized function,? The American Statistician, 53, pp. 270-273. [2] Casella, G and Berger, R. L. (2002), Statistical Inference, Duxbury Thomson Learning, Pacific Grove. [3] Csorgo, S. and Welsh, A.H. (1989), ?Testing for exponential and Marshall-Olkin distributions,? Journal of Statistical Planning and Inference, 23, pp. 287-300. [4] Dudewicz, E. J. and Mishra, S. (1988), Modern Mathematical Statistics, Wiley, New York. [5] Clayton, D. G. (1978). ?A model for association in bivariate lifetables and its application in epidemiological studies of familial tendency in chronic disease in- cidence, ? Biometrika, 65, pp. 141-151. [6] Conway, D. (1983). ?Farlie?Gumbel?Morgenstern distributions,? Encyclopedia of Statistical Sciences, 3, pp. 28?31. [7] Cohen, A. C. (1965). ?Maximum likelihood estimation in the weibull distribution based on complete and on censored samples, ? Technometrics, 7, pp. 579-588. [8] Cohen, A. C. (1975). ?Multi-Censored Sampling in the Three Parameter Weibull Distribution, ? Technometrics, 17, pp. 347-351. [9] Devore, J. L. (2000). Probability and Statistics for Engineering and the Sciences, Duxbury Thomson Learning, Pacific Grove. [10] Efron, B. (1985). ?Bootstrap confidence intervals for a class of parametric prob- lems, ? Biometrika, 72, pp. 45-58. [11] Eskow, E. and Schnabel, R. B. (1991). ?Algorithm 695: software for a new modified Cholesky factorization,? ACM Transactions on Mathematical Software, 17, pp. 306 -312. [12] Farlie, D. J. G. (1960). ?The performance of some correlation coefficients for a general bivariate distribution,? Biometrika, 47, pp. 307-323. 75 [13] Freund, J. E. (1961). ?A bivariate extension of the exponential distribution,? J. Am. Statist. Ass., 56, pp. 971-977. [14] Gould, A. and Lawless, J. F. (1988). ?Consistency and efficiency of regression coefficient estimates in location-scale models, ? Biometrika, 75, pp. 535-540. [15] Gumbel, E. J. (1960). ?Bivariate exponential distributions,? J. Am. Statist. Ass., 55, pp. 698-707. [16] Hanagal, D. D. (1996). ?Multivariate Weibull distribution,? Economic Quality Control, 11, pp. 193-200. [17] Hougaard, P. (1986). ?A class of multivariate failure time distributions.? Biometrika, 73, 3, pp. 671-678. [18] Hougaard, P. (2000). Analysis of Multivariate Survival Data, Springer-Verlag, New York. [19] He, W. and Lawless, J. F. (2005). ?Bivariate location-scale models for regression analysis, with applications to lifetime data, ? J. R. Statist. Soc., 67, pp. 63-78. [20] Huang, J. S. and Kotz, S. (1984). ?Correlation structure in iterated Farlie? Gumbel?Morgenstern distributions,? Biometrika, 71, pp. 633?636. [21] Iyer, S. K., Manjunath, D. and Manivasakan, R. (2002). ?Bivariate Exponential Distributions Using Linear Structures,? Sankhya, 64(A), pp. 156-166. [22] Iyer, S. K. and Manjunath, D. (2004). ?Correlated Bivariate Sequence for queue- ing and reliability applications,? Communications in Statistics, 33, pp 331-350. [23] Joe, H. (1997). Multivariate Models and Dependence Concepts, Chapman and Hall, New York. [24] Johnson, R. A., Evans, J. W. and Green, D. W.(1999). Some Bivariate Dis- tributions for Modeling the Strength Properties of Lumber, U.S. Department of Agriculture, Forest Service, Forest Products Laboratory, Madison. [25] Khuri, A. (2004). ?Applications of Dirac?s Delta Function in Statistics,? Int. J. Math. Educ. Sci. Technol, 35, 2, pp. [26] Lawless, J. F. (2002). Statistical Models and Methods for Lifetime Data, Wiley, New York. 76 [27] Lee, L. (1979), ?Multivariate Distributions having Weibull Properties,? Journal of Multivariate Analysis, 9, pp. 267-277. [28] Lehmann, E. L. ( 1983). Theory of Point Estimation, Wiley, New York. [29] Lehmann, E. L. and Casella, G. (1998). Theory of Point Estimation, Springer- Verlag, New York. [30] Liu, C. (1997), A Comparison Between the Weibull and Lognormal Models Used to Analyse Reliability Data, University of Nottingham Doctoral Thesis, UK. [31] Lu, J. C. and Bhattacharyya, G. K. (1990). ?Some New Constructions of Bivari- ate Weibull Models,? Ann. Inst. Statist. Math., 43(3), pp. 543-559. [32] Marshall, A. W. and Olkin, I. (1967). ?A Multivariate Exponential Distribution,? J. Amer. Stat. Assoc., 63, pp. 30-44. [33] Marshall, A. W. and Olkin, I. (1988). ?Families of multivariate distributions,? J. Amer. Stat. Assoc., 83, 834?841. [34] Meeker, W. Q. and Escobar, L. A. (1998). Statistical Methods for Reliability Data, Wiley, New York. [35] Miller, G. (1984). ?Inference on a future reliability parameter with the Weibull process model, ? Naval Research Logistics Quarterly, 31, pp. 91-96. [36] Morgenstern, D. (1956). ?Einfache beispiele zweidimensionaler verteilungen,? Mitt. Math. Statis., 8, pp. 234-235. [37] Murthy, D. N. P., Xie, M., Jiang, R. (2004). Weibull Models, Wiley, New York. [38] Rao. C. R. ( 1973). Linear Statistical Inference and Its Applications, Wiley, New York. [39] Rausand, M. and Hoyland, A. (1994). System Reliability Theory; Models, Statis- tical Methods and Applications, Wiley, New York. [40] Sarkar, S. K. (1987). ?A continuous bivariate exponential distribution,? J. Amer. Stat. Assoc., 82, pp. 667?675. [41] Zeller, A. (1962). ?An efficient method of estimating seemingly unrelated re- gressions and tests for aggregation bias,? J. Am. Statist. Ass., 57, pp. 400-405. 77 Appendices 78 Appendix A SAS Simulation Codes A.1 Data Generation and MLE for Linearly Associated BVE and BVW data null ; retain c 1; do i=.10, .2,.3,.4,.5,.6; call symput(?loop?||left(c),i); call symput(?step?, c); c+1; end; run; %MACRO sim(lambda1, lambda2,beta, n, iter,table); %do i= 1 %to &step; title?lambda1=&lambda1lambda2=&lambda2roh=&&loop&isamplesize=&n?; DATA data1; lambda1=&lambda1; lambda2=&lambda2; beta=β roh=&&loop&i; a=roh*(lambda1/lambda2); 79 p=a*lambda2/lambda1; DO iter=1 to &iter; DO i = 1 to &n; d=0; z=RANBIN(0,1,1-p)*RANEXP(0)/lambda2; X1=RANEXP(0)/lambda1; X2=a*X1 + z; y1=x1**(1/beta); y2=x2**(1/beta); if z=0 then d=1; ind=1-d; OUTPUT; end; END; run; PROC NLP tech=newrap DATA=data1 OUTEST=init noprint; MAX logf; PARMS Lam1=1, Lam2=1,b=1; logf=2*log(b) + log(Lam1)+ log(Lam2) +(b-1)*log(y1) + (b-1)*log(y2) -Lam1*y1**b - Lam2*y2*b; BY iter; RUN; 80 DATA init; SET init; IF TYPE =?PARMS?; KEEP iter lam1 lam2 b; RENAME b=bhat; RUN; DATA data2a; MERGE data1 init; by iter; y11=(y1**bhat); y22=(y2**bhat); RUN; proc means data=data2a noprint; var y11 y22 a d; by iter; output out=data2 mean(y11 y22 a bhat lam1 lam2)=y1bar y2bar a bhat lam1 lam2 sum(d)=k; run; data mle; set data2; lambda1=&lambda1; lambda2=&lambda2; roh=&&loop&i; a=roh*(lambda1/lambda2); 81 lambhat11=a/y2bar + (&n-k)/(&n*y1bar); lambhat12=1/y2bar; rohhatstar=a*lambhat12/lambhat11; lamb11bias=lambhat11-&lambda1; lamb12bias=lambhat12-&lambda2; rohhatstarbias=rohhatstar-&&loop&i; lamb11mse=lamb11bias**2; lamb12mse=lamb12bias**2; rohhatstarmse=rohhatstarbias**2; se lamb11=sqrt((lambhat11*(lambhat11-a*lambhat12)+a**2*lambhat12**2)/&n); LCL = lambhat11 - 2*se lamb11; UCL = lambhat11 + 2*se lamb11; conf=(lcl le lambda1 le ucl); se lam1=sqrt(lam1**2/&n); LCL1 = lam1 - 2*se lam1; UCL1 = lam1 + 2*se lam1; conf1=(lcl1 le lambda1 le ucl1); range=(ucl-lcl)/2; range1=(ucl1-lcl1)/2; rohhat=a*lam2/lam1; rohhatbias=rohhat-&&loop&i; lam1bias=lam1-&lambda1; 82 lam2bias=lam2-&lambda2; lam1MSE=lam1bias**2; lam2MSE=lam2bias**2; rohhatMSE=rohhatbias**2; run; PROC MEANS data=mle noprint; var roh conf lcl ucl conf1 lcl1 ucl1 range range1 lambhat11 lam1 lamb11bias lam1bias lamb11mse lam1mse lambhat12 lam2 lamb12bias lam2bias lamb12mse lam2mse rohhatstar rohhat rohhatstarbias rohhatbias rohhatstarmse rohhatmse; output out=stats mean=rho conf lcl ucl conf1 lcl1 ucl1 range range1 lambhat11 lam1 lamb11bias lam1bias lamb11mse lam1mse lambhat12 lam2 lamb12bias lam2bias lamb12mse lam2mse rohhatstar rohhat rohhatstarbias rohhatbias rohhatstarmse rohhatmse; run; PROC DATASETS nodetails nolist force; APPEND BASE=work.table&table DATA=stats; RUN; %end; DATA sim.table&table; SET table&table; 83 lbiasimp=(abs(lam1bias)-abs(lamb11bias))/lam1bias*100; lmseimp=(lam1mse-lamb11mse)/lam1mse*100; rhobiasimp=(abs(rohhatbias)-abs(rohhatstarbias))/rohhatbias*100; rhomseimp=(rohhatmse-rohhatstarmse)/rohhatmse*100; RUN; PROC EXPORT DATA= sim.TABLE&table OUTFILE=?C:\DocumentsandSettings\BB\MyDocuments\PAPER\Spring05\code\tables.xls? DBMS=EXCEL REPLACE; SHEET=?sheet&table?; RUN; %MEND sim; %sim(10,1,2, 50, 1000,118); /*%sim(100,1,50, 10000,6); %sim(1,100,50, 10000,7); %sim(100,100,50, 10000,8);*/ A.2 Data Generation and MLE for BVW of Hougaard?s Model %LET directory=C:\; LIBNAME sim ?&directory?; %let seed=0; %let beta1=0.5; %let beta2=2; 84 %let theta1=10; %let theta2=10; %let n=25; %macro hougsimu(iter); %do dl=1 %to 10; %let delta=&dl/10; %do k=1 %to &iter; DATA houg; DO i = 1 to &n; U=uniform(0); U2=uniform(0); U3=uniform(0); U4=uniform(0); U5=uniform(0); V=(-log(U2)-log(u3))*(u5 le &delta) -log(u4) *(U5 > &delta); X1=U**(&delta/&beta1)*v**(1/&beta1)*&theta1; x2=(1-U)**(&delta/&beta2)*v**(1/&beta2)*&theta2; delta=δ OUTPUT; END; RUN; PROC NLP tech=newrap outest=out1 DATA=houg noprint; MAX logf; PARMS bhat1 that1; logf=-log(that1)+log(bhat1)+(bhat1-1)*log(x1/that1) -(x1/that1)**bhat1; 85 by delta; RUN; data out1; set out1; KEEP type bhat1 that1 delta; if type =?PARMS?; RUN; PROC NLP tech=newrap outest=out2 DATA=houg noprint; MAX logf; PARMS bhat2 that2; logf=-log(that2)+log(bhat2)+(bhat2-1)*log(x2/that2) -(x2/that2)**bhat2; by delta; RUN; DATA out2; set out2; KEEP type bhat2 that2 delta; IF type =?PARMS?; RUN; *working independent mle; DATA seedwrk; merge out1 out2; RUN; 86 PROC NLP tech=congra DATA=houg inest=seedwrk outest=hougwrkest no- print; MAX logf; PARMS bhat1 that1 bhat2 that2; logf=log(bhat1)-log(that1)+log(bhat2)-log(that2) +(bhat1-1)*log(x1/that1)+(bhat2-1)*log(x2/that2) -log((x1/that1)**bhat1 + (x2/that2)**bhat2) +log((x1/that1)**bhat1 + (x2/that2)**bhat2) -(x1/that1)**bhat1 - (x2/that2)**bhat2; by delta; RUN; data simwrk; set hougwrkest; if type =?PARMS?; bhat1wrk=bhat1; bhat2wrk=bhat2; that1wrk=that1; that2wrk=that2; keep bhat1wrk bhat2wrk that1wrk that2wrk; run; *joint mle; DATA seed1; 87 set seedwrk; dhat=0.1; RUN; PROC NLP tech=congra DATA=houg inest=seed1 outest=seed2 noprint; MAX logf; PARMS dhat; bounds 0